<left>FINM 36700 - Portfolio Theory and Risk Management</left>
<br>
<left>Fall 2022</left>

<h2><center> Final Exam </center></h2>

<center>Monday, December 5th 2022</center>

<h3>Ki Hyun</h3>

<h3>CNetID: kwhyun23</h3>

## 1. Short Answer Questions

### 1 - 1.

Narrowly has a higher return when broadly has lower volatility.
Yes, the data supported this conclusion

### 1 - 2.

LTCM focuses on long-term investment. Moreover, it anticipates investors' demand for higher returns in
"bad times". Furthermore, because of the long-term horizon, the haircut that LTCM receives is extremely favorable.
Therefore, the market exposure is non-linear.

This means that LTCM has a higher upside. However, empirically, since the investment horizon is longer, you may also
expect the downside to be larger.

### 1 - 3.

The Mean-Variance Optimization portfolio is extremely sensitive to estimation error.

### 1 - 4.

- The sample size would be 22 $\times$ 12 and the regression would be done 40 times for the time series
- The sample size would be 40 and the regression would be done 22 $\times$ 12 times

### 1 - 5.

GMO uses estimates of the dividend yield that investors are likely to require over the long run and of the expected
long-run dividend growth rate. They employ the "Gordon Growth Model", which posits that the long-run required return on
stocks is the sum of the fair dividend yield required by investors and the expected long-run dividend growth.

Yes
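
The Gordon Growth Model relation described above can be written as required return = dividend yield + long-run dividend growth. The following is a minimal sketch of that arithmetic; the function names and the input numbers are illustrative assumptions, not GMO's actual estimates.

```python
def gordon_required_return(dividend_yield: float, dividend_growth: float) -> float:
    """Long-run required return implied by the Gordon Growth Model: r = D/P + g."""
    return dividend_yield + dividend_growth


def gordon_fair_price(next_dividend: float, required_return: float,
                      dividend_growth: float) -> float:
    """Fair price P = D_1 / (r - g); only defined when r > g."""
    if required_return <= dividend_growth:
        raise ValueError("required_return must exceed dividend_growth")
    return next_dividend / (required_return - dividend_growth)


# Hypothetical inputs: 3% dividend yield and 2% long-run dividend growth.
r = gordon_required_return(0.03, 0.02)
price = gordon_fair_price(next_dividend=3.0, required_return=r, dividend_growth=0.02)
print(f"required return: {r:.2%}, fair price: {price:.2f}")
```

Rearranging the same identity shows why a lower required return or a higher growth assumption raises the fair price.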

### 1 - 6.

They group assets into classes and do a mean-variance analysis across the classes.
This would be computationally easier and therefore could be implemented more easily.

### 1 - 7.

The longer the horizon, the larger the volatility, and the investor's Sharpe ratio will continue to decrease.

### 1 - 8.

UIP implies CIP, since CIP only looks at the expected (or market-implied) exchange rate in the future, whereas UIP
specifically looks at the spot rate at the future date. However, CIP does not imply UIP.

### 1 - 9.

Managed funds' returns are consistently beating the market.
Managed funds' realized returns may be different from average returns.

### 1 - 10.

We would calculate the optimal hedge ratio using a linear regression.
This ratio would then be used to replicate the portfolio and hedge against losses.
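
A minimal sketch of a regression-based hedge ratio, using synthetic returns. The series, the seed, and the true ratio of 0.8 are assumptions chosen only for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic returns: the position loads 0.8 on the hedging instrument plus noise.
hedge_ret = rng.normal(0.0, 0.02, size=500)
position_ret = 0.8 * hedge_ret + rng.normal(0.0, 0.005, size=500)

# OLS with an intercept: position_ret = alpha + beta * hedge_ret + eps.
X = np.column_stack([np.ones_like(hedge_ret), hedge_ret])
alpha, beta = np.linalg.lstsq(X, position_ret, rcond=None)[0]

# Shorting beta units of the hedge per unit of the position removes the
# exposure that the regression can explain.
hedged_ret = position_ret - beta * hedge_ret

print(f"hedge ratio: {beta:.3f}")
print(f"volatility before/after: {position_ret.std():.4f} / {hedged_ret.std():.4f}")
```

The estimated slope is the hedge ratio; what remains after hedging is the intercept plus the regression residual.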

## Imports

In [1]:
import pandas as pd
import numpy as np
import seaborn as sns

from scipy.stats import kurtosis, skew
from scipy.stats import norm

import statsmodels.api as sm
from statsmodels.regression.rolling import RollingOLS

from sklearn.linear_model import LinearRegression
from sklearn.ensemble import RandomForestRegressor
from sklearn import tree
from sklearn.neural_network import MLPRegressor

from arch import arch_model
from arch.univariate import GARCH, EWMAVariance

import warnings
warnings.filterwarnings("ignore")

%matplotlib inline

import matplotlib.pyplot as plt

## Helper Functions

In [2]:
from Portfolio_Helper_Functions import *

## Data

In [3]:
# Setting the file and sheet names
filename = "../data/final_exam_data.xlsx"
futures_sheet = "futures (excess returns)"
factors_sheet = "factors (excess returns)"
GLD_sheet = "forecasting (weekly)"
FX_sheet = "fx (daily)"

In [4]:
# reading the futures data
futures = pd.read_excel(filename, sheet_name = futures_sheet, index_col = 'Date')

In [5]:
# reading the factors data
factors = pd.read_excel(filename, sheet_name = factors_sheet, index_col = 'Date')

In [6]:
# reading the GLD return data with interest rate signals
GLD = pd.read_excel(filename, sheet_name = GLD_sheet, index_col = 'Date')

In [7]:
# reading FX on GBP data
FX = pd.read_excel(filename, sheet_name = FX_sheet).rename(columns = {'DATE': 'Date'}).set_index('Date')

## 2. Value at Risk

### 2 - 1. 5th Percentile VaR and CVaR for GLD using empirical CDF

In [8]:
GLD_ret = GLD['GLD']
# getting the empirical 0.05 VaR
EVaR = GLD_ret.quantile(0.05)
ECVaR = GLD_ret[GLD_ret < EVaR].mean()
template = "Using empirical CDF, the 5th Percentile VaR is {} and the CVaR is {}"
template.format(EVaR.round(5), ECVaR.round(5))

'Using empirical CDF, the 5th Percentile VaR is -0.03332 and the CVaR is -0.04712'

### 2 - 2. 5th Percentile VaR for GLD using normal approximation

#### 2 - 2 - 1. Full Sample Volatility

In [9]:
# full-sample volatility; the 5% normal VaR (mean zero) is the z-score norm.ppf(0.05) times volatility
vol = GLD_ret.std()
(norm.ppf(0.05) * vol).round(5)

-0.03474

#### 2 - 2 - 2. Rolling 150-week window for Volatility

In [10]:
# volatility over the most recent 150-week window; the z-score is norm.ppf(0.05), not norm.cdf(0.05)
vol = GLD_ret.loc['2020-01-26':].std()
(norm.ppf(0.05) * vol).round(5)

-0.03799

### 2 - 3.

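The normal approximation with the rolling-window volatility performed best.
We judge each approach against the realized data by its violation rate: the share of periods in which the realized
return falls below the VaR estimate, which should be close to 5% for a well-calibrated 5% VaR.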
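
A minimal sketch of how a violation rate could be computed for a rolling-window normal VaR. It uses synthetic weekly returns rather than the GLD data, assumes a zero mean, and uses the standard library's `NormalDist().inv_cdf`, which plays the same role as `scipy.stats.norm.ppf`.

```python
import numpy as np
from statistics import NormalDist

rng = np.random.default_rng(1)
returns = rng.normal(0.0, 0.02, size=700)  # synthetic weekly returns

window = 150
z = NormalDist().inv_cdf(0.05)  # 5% quantile of N(0, 1), about -1.645

# The VaR for week t uses only the `window` returns before t (no look-ahead).
var_forecast = np.array([
    z * returns[t - window:t].std(ddof=1)
    for t in range(window, len(returns))
])
realized = returns[window:]

# A violation is a realized return below the VaR forecast made for that week.
violation_rate = (realized < var_forecast).mean()
print(f"violation rate: {violation_rate:.3f} (target 0.05)")
```

A violation rate far above 5% suggests the VaR is too optimistic; a rate far below 5% suggests it is too conservative.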

## 3. Pricing Models

### 3 - 1. Time series test

#### 3 - 1 - a. Summary Statistics of Model

In [11]:
portfolios = futures.columns
df_lst_q3= []
for port in portfolios:
    futures_ret = futures[port]
    reg = regression_based_performance(factors,futures_ret,0)
    beta_mkt = reg[0][0]
    beta_umd = reg[0][1]
    alpha = reg[3]
    r_squared = reg[4]
    df_lst_q3.append(pd.DataFrame([[beta_mkt,beta_umd,alpha,r_squared]],
                                   columns=['Market Beta', 'Momentum Beta', 'Alpha','R-Squared'],
                                   index = [port]))

reg_performance_q3 = pd.concat(df_lst_q3)
reg_performance_q3.T

| | NG1 | KC1 | CC1 | LB1 | CT1 | SB1 | LC1 | W1 | S1 | C1 | GC1 | SI1 | HG1 | PA1 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Market Beta | 0.354111 | 0.315122 | 0.207322 | 0.942075 | 0.504249 | 0.057967 | 0.183059 | 0.298899 | 0.399481 | 0.340399 | 0.131624 | 0.511838 | 0.690631 | 0.664541 |
| Momentum Beta | 0.38123 | -0.027467 | -0.035816 | -0.004789 | -0.178597 | -0.319205 | 0.066096 | 0.022426 | 0.02726 | 0.062038 | 0.148689 | 0.144376 | -0.138838 | 0.173333 |
| Alpha | 0.00933 | 0.001933 | 0.005899 | 0.005373 | 0.002076 | 0.007761 | 0.001285 | 0.004544 | 0.003545 | 0.005073 | 0.005874 | 0.005462 | 0.003984 | 0.006599 |
| R-Squared | 0.017306 | 0.025887 | 0.012031 | 0.13685 | 0.09902 | 0.03273 | 0.020046 | 0.021333 | 0.052917 | 0.028248 | 0.027364 | 0.058439 | 0.214665 | 0.075625 |


In [12]:
n = futures.shape[1]
alpha_sum = abs(reg_performance_q3['Alpha']).sum()
mae_q3 = alpha_sum/n
pd.DataFrame([[mae_q3]],columns=['Mean Absolute Error'],
             index = ['Time Series'])

| | Mean Absolute Error |
|---|---|
| Time Series | 0.00491 |


In [13]:
rsquared_avg_q3 = reg_performance_q3['R-Squared'].mean()
pd.DataFrame([[rsquared_avg_q3]],columns=['Average R-Squared'],
             index = ['Time Series'])

| | Average R-Squared |
|---|---|
| Time Series | 0.058747 |


#### 3 - 1 - b.

If the pricing model worked perfectly, we would expect a low MAE value, since we expect the $\alpha$ to be close to 0.
We would also expect a very high (close to 1) $R^2$ value.
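
A minimal sketch of the time-series test on synthetic data: one regression of each asset on the factors, followed by the mean absolute alpha. The data, the seed, and the variable names are assumptions, and the true alphas are set to zero by construction.

```python
import numpy as np

rng = np.random.default_rng(2)
T, n_assets = 240, 5

# Synthetic factor and asset excess returns; true alphas are zero by construction.
factor_ret = rng.normal(0.005, 0.04, size=(T, 2))
true_betas = rng.uniform(-0.5, 1.0, size=(2, n_assets))
asset_ret = factor_ret @ true_betas + rng.normal(0.0, 0.03, size=(T, n_assets))

# One time-series regression per asset, all solved in a single lstsq call.
X = np.column_stack([np.ones(T), factor_ret])
coefs = np.linalg.lstsq(X, asset_ret, rcond=None)[0]  # shape (3, n_assets)
alphas, betas = coefs[0], coefs[1:]

mae = np.abs(alphas).mean()  # mean absolute pricing error across assets
print(f"time-series MAE of alphas: {mae:.5f}")
```

Even with true alphas of zero, the estimated MAE is not exactly zero because each alpha carries sampling error.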

### 3 - 2. Cross-sectional test

#### 3 - 2 - a. Summary Statistics of Model

In [14]:
y = futures.mean()
X = reg_performance_q3.loc[:,['Market Beta', 'Momentum Beta']]
CS_q3 = regression_based_performance(X, y, 0)
resid_cs = sm.OLS(y,X).fit().resid
pd.DataFrame([[CS_q3[3],CS_q3[4], resid_cs.mean()]],
             columns=['Annualized Intercept','R-Squared', 'Annualized MAE'],
             index = ['Cross-sectional'])

| | Annualized Intercept | R-Squared | Annualized MAE |
|---|---|---|---|
| Cross-sectional | 0.005093 | 0.391371 | 0.0013 |


In [15]:
pd.DataFrame([CS_q3[0][0], CS_q3[0][1]],
             columns = ['Annualized Factor Premia'],
             index = ['MKT', 'UMD'])

| | Annualized Factor Premia |
|---|---|
| MKT | 0.005165 |
| UMD | 0.006125 |


#### 3 - 2 - b.

If the pricing model worked perfectly we would expect the annualized intercept to be precisely 0.
Moreover, we would expect the $R^2$ value to be 1 and the annualized MAE to be 0.
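
A minimal sketch of the cross-sectional test on synthetic data: mean returns are regressed on betas with an intercept, and the MAE is taken as the mean of the absolute residuals. With an intercept the raw residuals average to zero, so the absolute value matters. The betas, premia, and noise level are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(3)
n_assets = 14

# Hypothetical betas and factor premia; mean returns are priced up to small errors.
betas = rng.uniform(-0.5, 1.0, size=(n_assets, 2))
true_premia = np.array([0.006, 0.002])
mean_ret = betas @ true_premia + rng.normal(0.0, 0.0005, size=n_assets)

# Cross-sectional regression of mean returns on betas, with an intercept.
X = np.column_stack([np.ones(n_assets), betas])
coefs = np.linalg.lstsq(X, mean_ret, rcond=None)[0]
intercept, premia = coefs[0], coefs[1:]

resid = mean_ret - X @ coefs
mae = np.abs(resid).mean()                      # mean of absolute residuals
r_squared = 1 - resid.var() / mean_ret.var()    # share of cross-sectional variation explained

print(f"intercept: {intercept:.5f}, premia: {premia.round(5)}")
print(f"MAE: {mae:.5f}, R-squared: {r_squared:.3f}")
```

The estimated slopes are the cross-sectional factor premia, which can then be compared with the time-series factor means.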

### 3 - 3. Compare the factor premia

In [16]:
ts_premia = factors.mean() * 12
cs_premia = CS_q3[0]
pd.DataFrame([[ts_premia.iloc[0], cs_premia[0]], [ts_premia.iloc[1], cs_premia[1]]],
             columns = ['Time Series', 'Cross-sectional'],
             index = ['MKT', 'UMD'])

| | Time Series | Cross-sectional |
|---|---|---|
| MKT | 0.070633 | 0.005165 |
| UMD | 0.018405 | 0.006125 |


## 4. Forecasting

In [17]:
# looking at the data
GLD.head()

| Date | GLD | Tbill rate | Tbill change |
|---|---|---|---|
| 2009-04-19 | -0.012629 | 0.13 | -0.045 |
| 2009-04-26 | 0.052805 | 0.095 | -0.035 |
| 2009-05-03 | -0.030874 | 0.145 | 0.05 |
| 2009-05-10 | 0.034848 | 0.165 | 0.02 |
| 2009-05-17 | 0.017448 | 0.155 | -0.01 |


### 4 - 1. Lagged Regression using signal

In [18]:
X = GLD.shift(1).dropna().loc[:,['Tbill rate', 'Tbill change']]
X = sm.tools.add_constant(X)
GLD_ret = GLD[1:].loc[:, ['GLD']]

signal_model = sm.OLS(GLD_ret, X).fit()

In [19]:
pd.DataFrame([signal_model.params[0], signal_model.params[1], signal_model.params[2], signal_model.rsquared],
             columns = ['OLS Result'],
             index = [r'$\alpha$', r'$\beta_{\text{Tbill rate}}$', r'$\beta_{\text{Tbill change}}$', r'$R^2$'])

| | OLS Result |
|---|---|
| $\alpha$ | 0.000993 |
| $\beta_{\text{Tbill rate}}$ | 0.00025 |
| $\beta_{\text{Tbill change}}$ | 0.000456 |
| $R^2$ | 0.000107 |


### 4 - 2. Trading weight strategy

In [20]:
# building the trading weight based on strategy
GLD['GLD_pred'] = signal_model.params[0] + \
                  signal_model.params[1] * GLD['Tbill rate'] + \
                  signal_model.params[2] * GLD['Tbill change']
GLD['weight'] = 0.2 + GLD['GLD_pred'] * 80
GLD[r'$r^X$'] = GLD['weight'].shift(1) * GLD['GLD']

In [21]:
GLD[r'$r^X$'].dropna().iloc[0:5, ]

Date
2009-04-26    0.014805
2009-05-03   -0.008646
2009-05-10    0.009901
2009-05-17    0.004946
2009-05-24    0.008013
Name: $r^X$, dtype: float64

In [22]:
GLD[r'$r^X$'].dropna().iloc[-5:, ]

Date
2022-11-06    0.007823
2022-11-13    0.018681
2022-11-20   -0.003900
2022-11-27    0.000964
2022-12-04   -0.001094
Name: $r^X$, dtype: float64

### 4 - 3. Summary Stats

In [23]:
performance_summary(pd.DataFrame(GLD.loc[::, r'$r^X$']))[['Annualized Return', 'Annualized Volatility',
                                                          'Annualized Sharpe Ratio', 'Max Drawdown']]

| | Annualized Return | Annualized Volatility | Annualized Sharpe Ratio | Max Drawdown |
|---|---|---|---|---|
| $r^X$ | 0.004001 | 0.020932 | 0.191134 | -0.142573 |


In [24]:
performance_summary(pd.DataFrame(GLD.loc[::, 'GLD']))[['Annualized Return', 'Annualized Volatility',
                                                          'Annualized Sharpe Ratio', 'Max Drawdown']]

| | Annualized Return | Annualized Volatility | Annualized Sharpe Ratio | Max Drawdown |
|---|---|---|---|---|
| GLD | 0.013374 | 0.073157 | 0.18281 | -0.447446 |


### 4 - 4. LFD

In [25]:
GLD_factor = pd.DataFrame(GLD.loc['2009-04-26':, 'GLD'])
GLD_X = pd.DataFrame(GLD.loc[::, r'$r^X$']).dropna()
Q4_LFD = regression_based_performance(GLD_factor, GLD_X, 0)
pd.DataFrame([Q4_LFD[3], Q4_LFD[0][0], Q4_LFD[2]],
             columns = ['LFD Result'],
             index = ['Market Alpha', 'Market Beta', 'Information ratio'])

| | LFD Result |
|---|---|
| Market Alpha | 1e-05 |
| Market Beta | 0.285599 |
| Information ratio | 0.102559 |
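
A minimal sketch of how a linear factor decomposition yields an alpha, a beta, and an information ratio. This is not the `regression_based_performance` helper used above, whose internals are not shown here; the synthetic series, the seed, and the weekly annualization factor of 52 are assumptions.

```python
import numpy as np

rng = np.random.default_rng(4)
n_obs = 600

# Synthetic weekly returns: the strategy loads 0.3 on the factor plus noise.
factor = rng.normal(0.001, 0.02, size=n_obs)
strategy = 0.0002 + 0.3 * factor + rng.normal(0.0, 0.003, size=n_obs)

# Regress the strategy on the factor with an intercept.
X = np.column_stack([np.ones(n_obs), factor])
alpha, beta = np.linalg.lstsq(X, strategy, rcond=None)[0]
resid = strategy - X @ np.array([alpha, beta])

# Information ratio: alpha per unit of residual (tracking) volatility, annualized.
periods_per_year = 52
info_ratio = alpha / resid.std(ddof=2) * np.sqrt(periods_per_year)

print(f"alpha: {alpha:.5f}, beta: {beta:.3f}, information ratio: {info_ratio:.3f}")
```

The beta measures how much of the strategy is simply exposure to the factor, while the information ratio summarizes the return that the factor does not explain.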


### 4 - 5.

Tbill rate.
Tbill rate has a lower $\beta$


### 4 - 6. OOS performance

### 4 - 7.

### 4 - 8.

### 4 - 9.

## 5. FX Carry

### 5 - 1.

In [26]:
for col in FX.columns:
    if col == "GBP":
        # GBP is an exchange-rate level, so take the plain log
        FX['log_'+col] = np.log(FX[col])
    else:
        # SOFR and SONIA are interest rates, so take log(1 + rate)
        FX['log_'+col] = np.log(1+FX[col])

FX.head()

| Date | GBP | SOFR | SONIA | log_GBP | log_SOFR | log_SONIA |
|---|---|---|---|---|---|---|
| 2018-04-03 | 1.4068 | 0.0183 | 0.004652 | 0.341318 | 0.018135 | 0.004641 |
| 2018-04-04 | 1.4076 | 0.0174 | 0.004624 | 0.341886 | 0.01725 | 0.004613 |
| 2018-04-05 | 1.3991 | 0.0175 | 0.004653 | 0.335829 | 0.017349 | 0.004642 |
| 2018-04-06 | 1.4088 | 0.0175 | 0.004666 | 0.342738 | 0.017349 | 0.004655 |
| 2018-04-09 | 1.4136 | 0.0175 | 0.004651 | 0.34614 | 0.017349 | 0.00464 |


In [27]:
FX['log_GBP'].mean()

0.26008074222970146

In [28]:
FX['log_SOFR'].mean()

0.011354489235353989

In [29]:
FX['log_SONIA'].mean()

0.0053100837214078

### 5 - 2.

We would expect the return to GBP to be 0

### 5 - 3.

In [30]:
fx_hldg_excess_ret = FX['log_GBP'] - FX['log_GBP'].shift(1) + FX['log_SONIA'].shift(1) - FX['log_SOFR'].shift(1)

fx_hldg_summary = performance_summary(fx_hldg_excess_ret.to_frame().dropna())

fx_hldg_summary.loc[:,['Annualized Return','Annualized Volatility','Min','Max']]

| | Annualized Return | Annualized Volatility | Min | Max |
|---|---|---|---|---|
| 0 | -0.074109 | 0.032814 | -0.045285 | 0.032196 |


### 5 - 4.
The interest rate spread did not help: the mean excess return is negative.
The USD must have appreciated relative to GBP over the sample period.

### 5 - 5.

In [31]:
vol = fx_hldg_excess_ret.dropna().std()*np.sqrt(5*52)
mu = fx_hldg_excess_ret.mean()
norm.cdf((0 - mu)/vol).round(4)

0.5161
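
Under a normal model, the probability of a negative return is $\Phi(-\mu/\sigma)$, with $\mu$ and $\sigma$ measured over the same horizon. A minimal sketch, assuming i.i.d. daily log returns and 260 trading days per year; the inputs are hypothetical, not the exam data.

```python
from math import sqrt
from statistics import NormalDist


def prob_negative(mu_daily: float, vol_daily: float, horizon_days: int) -> float:
    """P(cumulative return < 0) for i.i.d. normal daily log returns."""
    mu_h = mu_daily * horizon_days          # mean scales with the horizon
    vol_h = vol_daily * sqrt(horizon_days)  # volatility scales with its square root
    return NormalDist().cdf((0.0 - mu_h) / vol_h)


# Hypothetical daily mean and volatility.
p_day = prob_negative(mu_daily=-0.0001, vol_daily=0.006, horizon_days=1)
p_year = prob_negative(mu_daily=-0.0001, vol_daily=0.006, horizon_days=260)
print(f"P(negative) over 1 day: {p_day:.4f}, over 260 days: {p_year:.4f}")
```

Because the mean grows linearly with the horizon and the volatility only with its square root, a negative mean makes a loss more likely at longer horizons.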

### 5 - 6.

In [32]:
y = (FX['log_GBP'].shift(1) - FX['log_GBP']).dropna()
X = (FX['log_SOFR'] - FX['log_SONIA']).loc['2018-04-04':,]
q5_model = regression_based_performance(X, y, 0)
pd.DataFrame([q5_model[3], q5_model[0][0], q5_model[4]],
             columns = ['OLS Result'],
             index = [r'$\alpha$', r'$\beta$', r'$R^2$'])

| | OLS Result |
|---|---|
| $\alpha$ | 0.000116 |
| $\beta$ | 0.002892 |
| $R^2$ | 1.2e-05 |


### 5 - 7.

If UIP holds, we would expect the true estimates of the parameters to be 0

### 5 - 8.

Since the sign of $\beta$ is positive, we would expect USD to get weaker

### 5 - 9.

No, we would expect the forward exchange rate to be lower than the spot exchange rate.

### 5 - 10.