# Final 2023 Spring

###### Created by Qihang Ma -- 2023.05.01

## Name I used to generate data

- python3.11 datageneration.py "Qihang Ma" "..." 

- Writing Random Values for Qihang Ma into ...

- Hashed Name:  7b8449b9dcbacd4fe5537e2a5211ca1569bf9f77


To run this code, first install my library, "RiskLib".

In [1]:
import warnings
warnings.filterwarnings("ignore")
from RiskLib import calculation, cov_matrix, linear_regression, optimal_portfolio, risk_parity, risk_attribution, Option, simulation, VaR
import pandas as pd
import numpy as np
import datetime as dt
from scipy.optimize import fsolve, minimize
import statsmodels.api as sm
from scipy.stats import t, norm, kurtosis, skew
import matplotlib.pyplot as plt

## Problem 1

Using the data in “problem1.csv”
- a. Calculate Log Returns (2pts)

- b. Calculate Pairwise Covariance (4pt)

- c. Is this Matrix PSD? If not, fix it with the “near_psd” method (2pt) 

- d. Discuss when you might see data like this in the real world. (2pt)

In [2]:
missing_data = pd.read_csv('problem1.csv')
missing_data

|   | Price1 | Price2 | Price3 | Date |
|---|---|---|---|---|
| 0 | 102.826412 | 94.650195 | 98.743159 | 2023-04-12 |
| 1 | NaN | 94.790948 | 100.022901 | 2023-04-13 |
| 2 | 102.785907 | NaN | NaN | 2023-04-14 |
| 3 | 102.847258 | 96.056428 | 98.541876 | 2023-04-15 |
| 4 | 102.818215 | 94.861366 | 97.983723 | 2023-04-16 |
| 5 | NaN | NaN | 98.458978 | 2023-04-17 |
| 6 | 102.829005 | 94.108024 | 98.650071 | 2023-04-18 |
| 7 | 102.920044 | 94.206004 | NaN | 2023-04-19 |
| 8 | 102.848208 | 94.611114 | 99.883936 | 2023-04-20 |
| 9 | 102.642653 | 96.23473 | 99.904116 | 2023-04-21 |


#### 1.1 Calculate the log returns of the prices

- If a price is missing, the return for that day is NaN, and so is the return for the next day, because each return needs both the current and the previous price.

In [3]:
missing_returns = calculation.return_calculate(missing_data, method = 'LOG').drop('Date',axis=1)
missing_returns

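|   | Price1 | Price2 | Price3 |
|---|---|---|---|
| 0 | NaN | 0.001486 | 0.012877 |
| 1 | NaN | NaN | NaN |
| 2 | 0.000597 | NaN | NaN |
| 3 | -0.000282 | -0.012519 | -0.00568 |
| 4 | NaN | NaN | 0.004839 |
| 5 | NaN | NaN | 0.001939 |
| 6 | 0.000885 | 0.001041 | NaN |
| 7 | -0.000698 | 0.004291 | NaN |
| 8 | -0.002001 | 0.017015 | 0.000202 |
| 9 | 0.001087 | -0.00841 | -0.019604 |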
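
The NaN propagation described in 1.1 can be reproduced with plain pandas. This is a minimal sketch that does not use RiskLib: the helper name `log_returns` is hypothetical, and the input is just the first four `Price1` values from "problem1.csv", where the second price is missing.

```python
import numpy as np
import pandas as pd

def log_returns(prices: pd.DataFrame) -> pd.DataFrame:
    """Log return r_t = ln(P_t / P_{t-1}); the first row has no predecessor and is dropped."""
    return np.log(prices / prices.shift(1)).iloc[1:]

# First four Price1 observations; the second one is missing
prices = pd.DataFrame({"Price1": [102.826412, np.nan, 102.785907, 102.847258]})
rets = log_returns(prices)

# One missing price produces two NaN returns: the day it is missing and the day after
print(rets)
```

Only the last return is defined, because it is the only one whose current and previous prices are both present.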


#### 1.2 Calculate the pairwise Covariance Matrix

- Since there are some missing values in the returns, calculate the covariance matrix with the pairwise method.

In [4]:
pairwise_cov = cov_matrix.missing_cov(missing_returns.values, skipMiss=False, fun=np.cov)
pd.DataFrame(pairwise_cov)

|   | 0 | 1 | 2 |
|---|---|---|---|
| 0 | 9.983249e-07 | 3e-06 | 3e-06 |
| 1 | 2.985551e-06 | 0.000501 | 0.000112 |
| 2 | 2.997105e-06 | 0.000112 | 0.000217 |
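
For readers without RiskLib, the usual pairwise-complete covariance can be sketched as follows. This is an assumption about the technique, not a copy of `cov_matrix.missing_cov`, whose internals are not shown here, so its numbers need not match this sketch. The function name `pairwise_cov` and the small return matrix are hypothetical.

```python
import numpy as np
import pandas as pd

def pairwise_cov(x: np.ndarray) -> np.ndarray:
    """Covariance where each entry (i, j) uses only rows in which both columns are observed."""
    n = x.shape[1]
    out = np.empty((n, n))
    for i in range(n):
        for j in range(i, n):
            ok = ~np.isnan(x[:, i]) & ~np.isnan(x[:, j])
            # np.cov returns a 2x2 matrix; element [0, 1] is the sample covariance (ddof=1)
            out[i, j] = out[j, i] = np.cov(x[ok, i], x[ok, j])[0, 1]
    return out

# Hypothetical returns with scattered missing values
x = np.array([
    [0.01, 0.02, np.nan],
    [0.00, np.nan, 0.01],
    [-0.01, 0.01, 0.00],
    [0.02, -0.01, 0.03],
    [np.nan, 0.00, -0.02],
    [0.01, 0.03, 0.01],
])
cov_pw = pairwise_cov(x)

# pandas applies the same pairwise-complete rule in DataFrame.cov()
cov_pd = pd.DataFrame(x).cov().to_numpy()
```

Because each entry is estimated from a different subset of rows, a pairwise matrix is not guaranteed to be PSD, which is why the check in 1.3 is needed.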


#### 1.3 Check whether the matrix is PSD

In [5]:
cov_matrix.is_psd(pairwise_cov)

True

So the matrix is PSD, and no “near_psd” fix is needed.
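
Had the check failed, the “near_psd” fix named in the question would apply. The sketch below is one common way to write both steps with NumPy only, assuming the Rebonato-Jackel eigenvalue-clipping form of near_psd. The function bodies are illustrative and are not taken from RiskLib, and the matrix `bad` is a hypothetical example.

```python
import numpy as np

def is_psd(a: np.ndarray, tol: float = 1e-8) -> bool:
    """A symmetric matrix is PSD when its smallest eigenvalue is non-negative (up to tol)."""
    eigvals = np.linalg.eigvalsh(a)
    return bool(eigvals.min() >= -tol * max(1.0, np.abs(eigvals).max()))

def near_psd(a: np.ndarray, epsilon: float = 0.0) -> np.ndarray:
    """Clip negative eigenvalues of the correlation matrix, then rescale to a unit diagonal."""
    std = np.sqrt(np.diag(a))
    corr = a / np.outer(std, std)
    vals, vecs = np.linalg.eigh(corr)
    vals = np.maximum(vals, epsilon)
    # Scaling that restores ones on the diagonal of the rebuilt correlation matrix
    scale = 1.0 / (vecs ** 2 @ vals)
    b = np.sqrt(scale)[:, None] * vecs * np.sqrt(vals)[None, :]
    # Convert back from correlation to covariance
    return (b @ b.T) * np.outer(std, std)

# Hypothetical inconsistent correlation matrix with one negative eigenvalue
bad = np.array([[1.0, 0.9, -0.9],
                [0.9, 1.0, 0.9],
                [-0.9, 0.9, 1.0]])
fixed = near_psd(bad)
print(is_psd(bad), is_psd(fixed))
```

The result keeps the original variances on the diagonal and changes only the correlations.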

#### 1.4 When we might see missing data like this

Not all markets are open at the same time on the same days. A holiday in one market is not necessarily a holiday in another, even within the same country, and markets in different countries keep different trading hours. Prices collected across such markets will therefore have gaps like these.

## Problem 2

“problem2.csv” contains data about a call option. Time to maturity is given in days. Assume 255 days in a year.

- a. Calculate the call price (1pt)

- b. Calculate Delta (1pt)

- c. Calculate Gamma (1pt)

- d. Calculate Vega (1pt)

- e. Calculate Rho (1pt)

Assume you are long 1 share of underlying and are short 1 call option. Using Monte Carlo assuming a Normal distribution of arithmetic returns where the implied volatility is the annual volatility and 0 mean

- f. Calculate VaR at 5% (2pt)

- g. Calculate ES at 5% (2pt)

- h. This portfolio’s payoff structure most closely resembles what? (1pt)

In [6]:
call_option = pd.read_csv('problem2.csv')
call_option

|   | Underlying | Strike | IV | TTM | RF | DivRate |
|---|---|---|---|---|---|---|
| 0 | 85.084564 | 74.575976 | 0.22 | 148 | 0.045 | 0.04477 |


#### 2.1 Calculate the value and Greeks of this option with the Black-Scholes model

In [7]:
S0 = call_option['Underlying'].values[0]
K = call_option['Strike'].values[0]
iv = call_option['IV'].values[0]
rf = call_option['RF'].values[0]
q = call_option['DivRate'].values[0]
ttm = call_option['TTM'].values[0]/255

call = Option.black_scholes_matrix(S0,K,ttm,rf,q,iv,'call')
pd.DataFrame(call.greeks(), index=['call'])

|   | Value | Delta | Gamma | Vega | Theta | Rho | Carry Rho |
|---|---|---|---|---|---|---|---|
| call | 11.844251 | 0.787432 | 0.018651 | 17.240393 | -2.749939 | 32.010991 | 38.885301 |
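
The value and the main Greeks can be cross-checked without RiskLib. The sketch below assumes the generalized Black-Scholes-Merton formulas with cost of carry b = rf - q. The function name `gbsm_call` is hypothetical and is not the RiskLib API. Theta and Carry Rho are left out to keep it short.

```python
import math
from scipy.stats import norm

def gbsm_call(s, k, ttm, rf, q, sigma):
    """Call value and Greeks under generalized Black-Scholes-Merton with carry b = rf - q."""
    b = rf - q
    sqrt_t = math.sqrt(ttm)
    d1 = (math.log(s / k) + (b + 0.5 * sigma ** 2) * ttm) / (sigma * sqrt_t)
    d2 = d1 - sigma * sqrt_t
    carry = math.exp((b - rf) * ttm)  # equals exp(-q * ttm)
    disc = math.exp(-rf * ttm)
    return {
        "Value": s * carry * norm.cdf(d1) - k * disc * norm.cdf(d2),
        "Delta": carry * norm.cdf(d1),
        "Gamma": carry * norm.pdf(d1) / (s * sigma * sqrt_t),
        "Vega": s * carry * norm.pdf(d1) * sqrt_t,
        "Rho": ttm * k * disc * norm.cdf(d2),
    }

# Inputs from problem2.csv, with 255 days in a year
greeks = gbsm_call(85.084564, 74.575976, 148 / 255, 0.045, 0.04477, 0.22)
for name, value in greeks.items():
    print(f"{name}: {value:.6f}")
```

Vega and Rho are quoted per unit change in volatility and rate (a change of 1.00, not of one percentage point), which is the scale used in the table above.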


#### 2.2 Simulate the returns and calculate the VaR and ES

In [8]:
ttm_1 = (call_option['TTM'].values[0]-1)/255
VaRs = []
ESs = []
dfs = []

for i in range(1000):

    np.random.seed(i)
    sim_r = np.random.normal(size=10000, loc=0, scale=iv/np.sqrt(255))
    sim_price = pd.DataFrame((1+sim_r) * S0, columns=['Price'])

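    # PnL of long 1 share and short 1 call, with the call repriced one day closer to expiry
    sim_price['PnL'] = (sim_price['Price'] - S0) - (Option.black_scholes(sim_price['Price'],K,ttm_1,rf,q,iv,'call') - call.price())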


    VaRs.append(VaR.calculate_var(sim_price['PnL']))
    ESs.append(VaR.calculate_ES(sim_price['PnL']))
    dfs.append(simulation.Fitting_t_MLE(sim_price['PnL'])[0])

VaRs = np.array(VaRs)
ESs = np.array(ESs)
dfs = np.array(dfs)

print("VaR Mean: {:.4f} -- 95% range [{:.4f}, {:.4f}].".format(VaRs.mean(), np.quantile(VaRs, 0.025), np.quantile(VaRs, 0.975)))
print("ES Mean: {:.4f} -- 95% range [{:.4f}, {:.4f}].\n".format(ESs.mean(), np.quantile(ESs, 0.025), np.quantile(ESs, 0.975)))

print("If fitting with t distribution, the degrees of freedom:")
print("df Mean: {:.4f} -- 95% range [{:.4f}, {:.4f}].".format(dfs.mean(), np.quantile(dfs, 0.025), np.quantile(dfs, 0.975)))

VaR Mean: 0.4337 -- 95% range [0.4216, 0.4461].
ES Mean: 0.5607 -- 95% range [0.5455, 0.5762].

If fitting with t distribution, the degrees of freedom:
df Mean: 52.0227 -- 95% range [25.3523, 121.9174].


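Because the fitted degrees of freedom are relatively high (mean about 52), the simulated PnL of this portfolio is close to normally distributed. In payoff terms, being long one share and short one call is a covered call, which by put-call parity has the same payoff shape as a short put.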
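
The simulation in 2.2 can also be sketched with NumPy and SciPy alone. This is a single run with one seed rather than the 1000 repeated runs above. `bs_call` and `var_es` are hypothetical helpers, and VaR and ES are taken here as the negated 5% quantile and the negated mean of the tail beyond it, which is an assumption about what `VaR.calculate_var` and `VaR.calculate_ES` do.

```python
import numpy as np
from scipy.stats import norm

def bs_call(s, k, ttm, rf, q, sigma):
    """Black-Scholes call with continuous dividend yield q; s may be a scalar or an array."""
    d1 = (np.log(s / k) + (rf - q + 0.5 * sigma ** 2) * ttm) / (sigma * np.sqrt(ttm))
    d2 = d1 - sigma * np.sqrt(ttm)
    return s * np.exp(-q * ttm) * norm.cdf(d1) - k * np.exp(-rf * ttm) * norm.cdf(d2)

def var_es(pnl, alpha=0.05):
    """VaR as the negated alpha-quantile of PnL; ES as the negated mean of PnL at or below it."""
    cutoff = np.quantile(pnl, alpha)
    return -cutoff, -pnl[pnl <= cutoff].mean()

S0, K, iv, rf, q, days = 85.084564, 74.575976, 0.22, 0.045, 0.04477, 148

rng = np.random.default_rng(0)
# One-day arithmetic returns: zero mean, daily volatility = annual IV / sqrt(255)
ret = rng.normal(0.0, iv / np.sqrt(255), size=100_000)
price = S0 * (1.0 + ret)

call_now = bs_call(S0, K, days / 255, rf, q, iv)
call_next = bs_call(price, K, (days - 1) / 255, rf, q, iv)

# Long 1 share gains (price - S0); short 1 call loses the change in the call value
pnl = (price - S0) - (call_next - call_now)
var_5, es_5 = var_es(pnl)
print(f"VaR(5%): {var_5:.4f}, ES(5%): {es_5:.4f}")
```

The short call offsets most of the share's delta, so the loss at the 5% level is only a fraction of the loss on the share alone.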

## Problem 3

Data in “problem3_cov.csv” is the covariance for 3 assets. “problem3_ER.csv” is the expected return for each asset as well as the risk free rate.

- a. Calculate the Maximum Sharpe Ratio Portfolio (4pt)

- b. Calculate the Risk Parity Portfolio (4pt)

- c. Compare the differences between the portfolio and explain why. (2pt)

In [9]:
covar_matrix = pd.read_csv('problem3_cov.csv')
origin_exp_return = pd.read_csv('problem3_ER.csv')

assets = covar_matrix.columns.to_list()

exp_return = origin_exp_return.values[0][1:]
rf = origin_exp_return['RF'].values[0]

#### 3.1 Calculate the Maximum Sharpe Ratio Portfolio

- With the constraint of positive weights

In [10]:
weights_sr, _ = optimal_portfolio.Optweight_sr(assets, exp_return, covar_matrix, rf)
weights_sr

|   | Stock | Weight | cEr |
|---|---|---|---|
| 0 | Asset1 | 0.398464 | 0.056812 |
| 1 | Asset2 | 0.28962 | 0.051914 |
| 2 | Asset3 | 0.311915 | 0.052918 |


#### 3.2 Calculate the Risk Parity Portfolio

In [11]:
weights_rp = risk_parity.vol_risk_parity(exp_return, covar_matrix)
weights_rp

|   | Weight | cEr | CSD |
|---|---|---|---|
| 0 | 0.398465 | 0.056812 | 0.064306 |
| 1 | 0.289625 | 0.051915 | 0.064306 |
| 2 | 0.31191 | 0.052918 | 0.064306 |


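#### 3.3 Comparison of the two portfolios

The two portfolios are essentially identical: the weights agree to within about 0.00001. The assets have equal pairwise correlations and equal Sharpe ratios, and in that case the risk parity portfolio is the maximum Sharpe ratio portfolio.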
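
The equivalence can be checked numerically on made-up inputs. The sketch below is an illustration under stated assumptions, not the RiskLib implementation. The volatilities, correlation, Sharpe ratio and risk-free rate are hypothetical, and risk parity is found by equalizing each asset's share of portfolio variance with SciPy's SLSQP solver.

```python
import numpy as np
from scipy.optimize import minimize

# Hypothetical inputs: one common correlation and one common Sharpe ratio
vols = np.array([0.10, 0.15, 0.20])
rho, sharpe, rf = 0.3, 0.5, 0.04
corr = np.full((3, 3), rho)
np.fill_diagonal(corr, 1.0)
cov = np.outer(vols, vols) * corr
mu = rf + sharpe * vols

# Maximum Sharpe ratio portfolio: weights proportional to inv(cov) @ (mu - rf)
raw = np.linalg.solve(cov, mu - rf)
w_sr = raw / raw.sum()

def risk_share_gap(w):
    """Squared distance between each asset's share of portfolio variance and 1/n."""
    share = w * (cov @ w) / (w @ cov @ w)
    return np.sum((share - 1.0 / len(w)) ** 2)

res = minimize(
    risk_share_gap,
    np.full(3, 1.0 / 3.0),
    method="SLSQP",
    bounds=[(0.0, 1.0)] * 3,
    constraints={"type": "eq", "fun": lambda w: w.sum() - 1.0},
    options={"ftol": 1e-14, "maxiter": 500},
)
w_rp = res.x

print("max Sharpe :", np.round(w_sr, 4))
print("risk parity:", np.round(w_rp, 4))
```

With a common correlation and a common Sharpe ratio, both sets of weights are proportional to 1 / volatility, which is why they coincide.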

## Problem 4 

Data in “problem4_returns.csv” is a series of returns for 3 assets.“problem4_startWeight.csv” is the starting weights of a portfolio of these assets as of the first day in the return series.

- a. Calculate the new weights for the start of each time period (2pt)

- b. Calculate the ex-post return attribution of the portfolio on each asset (4pt)

- c. Calculate the ex-post risk attribution of the portfolio on each asset (2pt)

In [12]:
returns = pd.read_csv('problem4_returns.csv').drop('Date', axis=1)
start_weights = pd.read_csv('problem4_startWeight.csv').values[0]

#### 4.1 Calculate the new weights at the start of each time period

In [13]:
stocks = list(returns.columns)
n = returns.shape[0]

weights = np.empty((n+1, len(start_weights)))
lastW = np.copy(start_weights)
matReturns = returns[stocks].values

for i in range(n):
    # Save Current Weights in Matrix
    weights[i,:] = lastW

    # Update Weights by return
    lastW = lastW * (1.0 + matReturns[i,:])

    # The sum of the updated weights is 1 + the portfolio return
    pR = np.sum(lastW)
    # Normalize the weights back so sum = 1
    lastW = lastW / pR

weights[n,:] = lastW

pd.DataFrame(weights, columns=stocks)


|   | Asset1 | Asset2 | Asset3 |
|---|---|---|---|
| 0 | 0.429088 | 0.284605 | 0.286308 |
| 1 | 0.442359 | 0.271454 | 0.286187 |
| 2 | 0.457152 | 0.273731 | 0.269117 |
| 3 | 0.432297 | 0.288847 | 0.278856 |
| 4 | 0.451087 | 0.264639 | 0.284274 |
| 5 | 0.471658 | 0.233571 | 0.294771 |
| 6 | 0.486315 | 0.228444 | 0.285241 |
| 7 | 0.507897 | 0.239593 | 0.25251 |
| 8 | 0.534966 | 0.233213 | 0.231821 |
| 9 | 0.512858 | 0.242326 | 0.244816 |


#### 4.2 & 4.3 Calculate the ex-post return & risk attribution

In [14]:
risk_attribution.expost_attribution(start_weights,returns)

|   | Value | Asset1 | Asset2 | Asset3 | Portfolio |
|---|---|---|---|---|---|
| 0 | TotalReturn | 0.629554 | -0.060875 | -0.210485 | 0.192545 |
| 1 | Return Attribution | 0.276461 | -0.019145 | -0.064771 | 0.192545 |
| 2 | Vol Attribution | 0.021414 | 0.008023 | 0.006717 | 0.036153 |
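
In the table above, both attribution rows add up to the portfolio column. One common construction with that property is Carino scaling for returns plus a regression beta for volatility. The sketch below assumes that approach and is not a copy of `risk_attribution.expost_attribution`. The function name `carino_attribution`, the random returns and the start weights are all hypothetical.

```python
import numpy as np

def carino_attribution(start_w, rets):
    """Ex-post return (Carino) and volatility attribution for a buy-and-hold portfolio."""
    n, m = rets.shape
    w = np.array(start_w, dtype=float)
    weights = np.empty((n, m))
    p_ret = np.empty(n)
    for i in range(n):
        weights[i] = w                    # weights at the start of period i
        grown = w * (1.0 + rets[i])
        p_ret[i] = grown.sum() - 1.0      # portfolio return for period i
        w = grown / grown.sum()           # renormalize so the weights sum to 1

    total = np.exp(np.log1p(p_ret).sum()) - 1.0
    k = np.log1p(total) / total           # Carino K for the whole window
    k_t = np.log1p(p_ret) / p_ret / k     # per-period scaling factors
    contrib = weights * rets              # weighted asset returns
    ret_attr = (contrib * k_t[:, None]).sum(axis=0)

    # Beta of each weighted return on the portfolio return; the betas sum to 1
    dev = p_ret - p_ret.mean()
    beta = (contrib - contrib.mean(axis=0)).T @ dev / (dev @ dev)
    vol_attr = beta * p_ret.std(ddof=1)
    return total, ret_attr, vol_attr, p_ret

# Hypothetical data: 20 periods, 3 assets
rng = np.random.default_rng(1)
demo_rets = rng.normal(0.001, 0.02, size=(20, 3))
total, ret_attr, vol_attr, p_ret = carino_attribution([0.4, 0.3, 0.3], demo_rets)

print("total return:", total, "sum of attributions:", ret_attr.sum())
print("portfolio vol:", p_ret.std(ddof=1), "sum of attributions:", vol_attr.sum())
```

Summing either attribution across assets recovers the portfolio figure, so the split is exact.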


## Problem 5

Input prices in “problem5.csv” are for a portfolio. You hold 1 share of each asset. Using arithmetic returns, fit a generalized T distribution to each asset return series. Using a Gaussian Copula:

- a. Calculate VaR (5%) for each asset (3pt)

- b. Calculate VaR (5%) for a portfolio of Asset 1 & 2 and a portfolio of Asset 3 & 4 (4pt)

- c. Calculate VaR (5%) for a portfolio of all 4 assets. (3pt)

In [15]:
dataset = pd.read_csv('problem5.csv')
all_returns = calculation.return_calculate(dataset).drop('Date',axis=1)

latest_prices = dataset.drop('Date',axis=1).tail(1).values[0]

In [16]:
def gaussian_copula(returns, fitting_model=None, n_sample=10000, seed=12345):
    stocks = returns.columns.tolist()
    n = len(stocks)

    if fitting_model is None:
        fitting_model = np.full(n, 't')


    # Fitting model for each stock
    parameters = []
    assets_returns_cdf = pd.DataFrame()
    for i, stock in enumerate(stocks):
        if fitting_model[i] == 't':
            # Student's t fit returns (df, loc, scale)
            params = t.fit(returns[stock])
            cdf = t.cdf(returns[stock], df=params[0], loc=params[1], scale=params[2])
        elif fitting_model[i] == 'n':
            # Normal fit returns (loc, scale)
            params = norm.fit(returns[stock])
            cdf = norm.cdf(returns[stock], loc=params[0], scale=params[1])
        parameters.append(params)
        assets_returns_cdf[stock] = cdf

    # Simulate N samples with spearman correlation matrix
    np.random.seed(seed)
    spearman_corr_matrix = assets_returns_cdf.corr(method='spearman')
    sim_sample = simulation.multivariate_normal_simulation(spearman_corr_matrix, n_sample, method='pca',seed=seed)
    sim_sample = pd.DataFrame(sim_sample, columns=stocks)

    # Convert simulation result with cdf of standard normal distribution
    sim_sample_cdf = pd.DataFrame()
    for stock in stocks:
        sim_sample_cdf[stock] = norm.cdf(sim_sample[stock],loc=0,scale=1)
            
    # Convert cdf matrix to return matrix with parameter
    sim_returns = pd.DataFrame()
    for i, stock in enumerate(stocks):
        if fitting_model[i] == 't':       
            sim_returns[stock] = t.ppf(sim_sample_cdf[stock], df=parameters[i][0], loc=parameters[i][1], scale = parameters[i][2])
        elif fitting_model[i] == 'n':
            sim_returns[stock] = norm.ppf(sim_sample_cdf[stock],  loc=parameters[i][0], scale = parameters[i][1])
    
    return sim_returns, pd.DataFrame(parameters,index=[stocks,fitting_model])

In [17]:
sim_returns, params = gaussian_copula(all_returns, seed=1)

#### 5.1 Calculate VaR for each asset

In [18]:
for i, asset in enumerate(sim_returns.columns):
    single_VaR = VaR.calculate_var(sim_returns[asset])
    single_dollar_VaR = VaR.calculate_var(sim_returns[asset] * latest_prices[i])

    print("For Asset {}, the VaR is {:.6f}, $VaR is {:.6f}.".format(i+1,single_VaR, single_dollar_VaR))

For Asset 1, the VaR is 0.000814, $VaR is 0.092485.
For Asset 2, the VaR is 0.000547, $VaR is 0.049353.
For Asset 3, the VaR is 0.000498, $VaR is 0.043831.
For Asset 4, the VaR is 0.000720, $VaR is 0.062007.


#### 5.2 Calculate VaR for a portfolio of Asset 1 & 2 and a portfolio of Asset 3 & 4

In [19]:
VaR_12 = VaR.calculate_var(sim_returns[['Price1','Price2']].dot(latest_prices[:2]))
print("For Portfolio of Asset 1 & 2, the VaR is {:.6f}, $VaR is {:.6f}.".format(VaR_12/sum(latest_prices[:2]),VaR_12))

For Portfolio of Asset 1 & 2, the VaR is 0.000670, $VaR is 0.136627.


In [20]:
VaR_34 = VaR.calculate_var(sim_returns[['Price3','Price4']].dot(latest_prices[2:]))
print("For Portfolio of Asset 3 & 4, the VaR is {:.6f}, $VaR is {:.6f}.".format(VaR_34/sum(latest_prices[2:]),VaR_34))

For Portfolio of Asset 3 & 4, the VaR is 0.000592, $VaR is 0.103059.


#### 5.3 Calculate VaR for the whole portfolio

In [21]:
VaR_all = VaR.calculate_var(sim_returns.dot(latest_prices))
print("For the whole Portfolio, the VaR is {:.6f}, $VaR is {:.6f}.".format(VaR_all/sum(latest_prices),VaR_all))

For the whole Portfolio, the VaR is 0.000618, $VaR is 0.233630.
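
For readers without RiskLib, the copula pipeline used in this problem can be condensed to SciPy and NumPy. The sketch below is an approximation under stated assumptions: it replaces `simulation.multivariate_normal_simulation` with NumPy's multivariate normal generator, takes VaR as the negated 5% quantile of dollar PnL, and runs on synthetic two-asset data. The function name `copula_var`, the prices and the data are hypothetical.

```python
import numpy as np
import pandas as pd
from scipy.stats import norm, t

def copula_var(returns, prices, alpha=0.05, n_sample=20_000, seed=0):
    """Gaussian copula with Student's t marginals; returns dollar VaR and simulated returns."""
    rng = np.random.default_rng(seed)
    cols = list(returns.columns)
    # Fit a t distribution (df, loc, scale) to each asset
    params = {c: t.fit(returns[c]) for c in cols}
    # Map observed returns to uniforms through the fitted CDFs
    u = pd.DataFrame({c: t.cdf(returns[c], *params[c]) for c in cols})
    corr = u.corr(method="spearman").to_numpy()
    # Draw correlated standard normals, then convert to uniforms
    z = rng.multivariate_normal(np.zeros(len(cols)), corr, size=n_sample)
    u_sim = norm.cdf(z)
    # Map the uniforms back to returns through each fitted quantile function
    sim = pd.DataFrame({c: t.ppf(u_sim[:, i], *params[c]) for i, c in enumerate(cols)})
    pnl = sim.to_numpy() @ np.asarray(prices, dtype=float)  # one share of each asset
    return -np.quantile(pnl, alpha), sim

# Synthetic fat-tailed returns sharing a common factor
data_rng = np.random.default_rng(42)
common = data_rng.standard_t(5, size=500)
demo = pd.DataFrame({
    "A": 0.01 * (0.7 * common + 0.7 * data_rng.standard_t(5, size=500)),
    "B": 0.01 * (0.7 * common + 0.7 * data_rng.standard_t(5, size=500)),
})
var_port, sim = copula_var(demo, prices=[100.0, 50.0])
print(f"Portfolio $VaR(5%): {var_port:.4f}")
```

Calling the function on a subset of columns gives the sub-portfolio VaR in the same way as 5.2, although refitting on a subset redraws the simulation instead of reusing one shared set of draws as the notebook does.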
