# FINM 36700 Midterm I

### Ki Hyun

## Imports

In [105]:
import pandas as pd
import numpy as np
import scipy.stats as stat
import seaborn as sns
import matplotlib.pyplot as plt
from scipy.stats import kurtosis, skew
import statsmodels.api as sm
from statsmodels.regression.rolling import RollingOLS
from sklearn.linear_model import LinearRegression
from arch import arch_model
from arch.univariate import GARCH, EWMAVariance
import warnings
warnings.filterwarnings("ignore")

%matplotlib inline

## 1 Short Answer
### 1:
- No, that is not a good decision. Whether to add an asset class should depend on its covariance with the existing asset classes, not on its standalone Sharpe ratio. As we saw in HW1, the weights from a mean-variance optimization do not follow the ordering of the assets' Sharpe ratios, so looking only at cryptocurrency's Sharpe ratio does not settle the question.
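
As a rough illustration of the point above, the sketch below computes tangency weights for three made-up assets. All numbers are hypothetical and chosen only to show that an asset with a higher Sharpe ratio can receive a smaller (even negative) weight than one with a lower Sharpe ratio once correlations are taken into account.

```python
import numpy as np

# Hypothetical annualized mean excess returns and volatilities (not exam data).
mu = np.array([0.10, 0.08, 0.04])
vol = np.array([0.20, 0.20, 0.20])

# Assets 0 and 1 are highly correlated; asset 2 is uncorrelated with both.
corr = np.array([[1.0, 0.9, 0.0],
                 [0.9, 1.0, 0.0],
                 [0.0, 0.0, 1.0]])
cov = np.outer(vol, vol) * corr

sharpe = mu / vol
raw = np.linalg.solve(cov, mu)   # proportional to inv(cov) @ mu
weights = raw / raw.sum()        # rescale so the weights sum to one

print("Sharpe ratios:   ", sharpe.round(3))
print("Tangency weights:", weights.round(3))
```

Asset 1 has the second-highest Sharpe ratio here but ends up with a negative weight, because most of what it offers is already supplied by asset 0.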
### 2:
- True. As discussed during class, with two assets have similar
### 3:
- The approximation based on the normal distribution performed better than historical simulation on the actual data.
- Performance was judged by the violation rate. For a 0.05 VaR, for instance, we test the VaR against the realized returns, and returns should fall below the VaR about 5% of the time. The method whose violation rate is closer to 5% performed better.
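
The violation-rate test described above can be sketched as follows. This is a minimal illustration on synthetic fat-tailed returns, not the data used in class. The burn-in length, the zero-mean assumption for the normal VaR, and the use of `statistics.NormalDist` in place of `scipy.stats.norm` are all choices made here for the example.

```python
from statistics import NormalDist
import numpy as np

rng = np.random.default_rng(0)
# Synthetic fat-tailed "monthly returns"; placeholders, not the exam data.
returns = 0.03 * rng.standard_t(df=5, size=600)

q = 0.05
burn_in = 60                      # observations required before the first forecast
z_q = NormalDist().inv_cdf(q)     # about -1.645 for q = 0.05

hist_hits, norm_hits = [], []
for t in range(burn_in, len(returns)):
    past = returns[:t]                           # information available before time t
    var_hist = np.quantile(past, q)              # historical-simulation VaR
    var_norm = z_q * np.sqrt(np.mean(past**2))   # normal VaR, zero mean assumed
    hist_hits.append(returns[t] < var_hist)
    norm_hits.append(returns[t] < var_norm)

hist_rate = float(np.mean(hist_hits))
norm_rate = float(np.mean(norm_hits))
print(f"historical violation rate: {hist_rate:.3f}")
print(f"normal violation rate:     {norm_rate:.3f}")
```

Whichever rate lands closer to `q` would be judged the better-calibrated method on that sample.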
### 4:
- Harvard puts restrictions on the weights so that the portfolio does not take extreme positions.
- The problem with this approach is that the optimization does not explore all possible allocations. The solution may indeed be the best option within the set limits, but a significantly better portfolio could exist just outside the bounds on the weights.
### 5:
- Ridge regression and Lasso both aid in reducing the effective dimension of the mean-variance optimization problem.
- The classic MV solution requires estimating many means, variances, and covariances, so reducing the dimension of the optimal-weighting problem makes the process much more efficient.
### 6:
-

## Reading Data

In [73]:
df = pd.read_excel("../data/midterm_1.xlsx")
df = df.set_index('date')
df.head()

date,CL1,GC1,KC1,ES1,BP1
2009-01-31,-0.113532,0.048627,0.061111,-0.086109,-0.007831
2009-02-28,0.04468,0.015187,-0.079479,-0.107294,-0.008309
2009-03-31,0.087892,-0.021111,0.034402,0.087209,0.001745
2009-04-30,-0.013826,-0.036543,-0.006491,0.094679,0.032897
2009-05-31,0.287437,0.0983,0.185518,0.055172,0.088934


## Helper Functions

In [74]:
def performance_summary(return_data):
    """
        Returns the Performance Stats for given set of returns
        Inputs:
            return_data - DataFrame with Date index and Monthly Returns for different assets/strategies.
        Output:
            summary_stats - DataFrame with annualized mean return, vol, mean/vol ratio, skewness, excess kurtosis,
                            VaR (0.05), CVaR (0.05), min, max, and drawdown statistics based on monthly returns.
    """
    summary_stats = return_data.mean().to_frame('Annualized Return').apply(lambda x: x*12)
    summary_stats['Annualized Volatility'] = return_data.std().apply(lambda x: x*np.sqrt(12))
    summary_stats['Mean/Vol'] = summary_stats['Annualized Return']/summary_stats['Annualized Volatility']

    summary_stats['Skewness'] = return_data.skew()
    summary_stats['Excess Kurtosis'] = return_data.kurtosis()
    summary_stats['VaR (0.05)'] = return_data.quantile(.05, axis = 0)
    summary_stats['CVaR (0.05)'] = return_data[return_data <= return_data.quantile(.05, axis = 0)].mean()
    summary_stats['Min'] = return_data.min()
    summary_stats['Max'] = return_data.max()

    wealth_index = 1000*(1+return_data).cumprod()
    previous_peaks = wealth_index.cummax()
    drawdowns = (wealth_index - previous_peaks)/previous_peaks

    summary_stats['Max Drawdown'] = drawdowns.min()
    summary_stats['Peak'] = [previous_peaks[col][:drawdowns[col].idxmin()].idxmax() for col in previous_peaks.columns]
    summary_stats['Bottom'] = drawdowns.idxmin()

    recovery_date = []
    for col in wealth_index.columns:
        prev_max = previous_peaks[col][:drawdowns[col].idxmin()].max()
        recovery_wealth = pd.DataFrame([wealth_index[col][drawdowns[col].idxmin():]]).T
        recovery_date.append(recovery_wealth[recovery_wealth[col] >= prev_max].index.min())
    summary_stats['Recovery'] = recovery_date

    return summary_stats
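
The drawdown logic in `performance_summary` can be checked by hand on a tiny series. The sketch below uses five made-up monthly returns (hypothetical values) and follows the same steps: build a wealth index, track the running peak, and locate the peak, bottom, and recovery dates.

```python
import pandas as pd

dates = pd.to_datetime(["2020-01-31", "2020-02-29", "2020-03-31",
                        "2020-04-30", "2020-05-31"])
r = pd.Series([0.10, -0.20, -0.10, 0.30, 0.15], index=dates)  # made-up returns

wealth = (1 + r).cumprod()            # 1.1, 0.88, 0.792, 1.0296, 1.18404
running_peak = wealth.cummax()
drawdown = wealth / running_peak - 1  # 0, -0.20, -0.28, -0.064, 0

bottom = drawdown.idxmin()                    # date of the deepest drawdown
peak = wealth.loc[:bottom].idxmax()           # highest wealth up to the bottom
after_bottom = wealth.loc[bottom:]
recovered = after_bottom[after_bottom >= wealth.loc[peak]]
recovery = recovered.index.min() if not recovered.empty else pd.NaT

max_dd = drawdown.min()
print(max_dd, peak.date(), bottom.date(), recovery)
```

If the series never regains its previous peak, `recovery` comes back as `NaT`, which mirrors what the helper produces in that case.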

In [75]:
def tangency_portfolio_rfr(asset_return,cov_matrix, cov_diagnolize = False):
    """
        Returns the tangency portfolio weights in a (1 x n) vector
        Inputs:
            asset_return - return for each asset (n x 1) Vector
            cov_matrix = nxn covariance matrix for the assets
    """
    if cov_diagnolize:
        asset_cov = np.diag(np.diag(cov_matrix))
    else:
        asset_cov = np.array(cov_matrix)
    inverted_cov= np.linalg.inv(asset_cov)
    one_vector = np.ones(len(cov_matrix.index))

    den = (one_vector @ inverted_cov) @ (asset_return)
    num =  inverted_cov @ asset_return
    return (1/den) * num

In [76]:
def gmv_portfolio(asset_return,cov_matrix):
    """
        Returns the Global Minimum Variance portfolio weights in a (1 x n) vector
        Inputs:
            asset_return - return for each asset (n x 1) Vector
            cov_matrix = nxn covariance matrix for the assets
    """
    asset_cov = np.array(cov_matrix)
    inverted_cov= np.linalg.inv(asset_cov)
    one_vector = np.ones(len(cov_matrix.index))

    den = (one_vector @ inverted_cov) @ (one_vector)
    num =  inverted_cov @ one_vector
    return (1/den) * num

In [77]:
def mv_portfolio(asset_return,cov_matrix,target_ret,tangency_port):
    """
        Returns the Mean-Variance portfolio weights in a (1 x n) vector when no riskless asset is available
        Inputs:
            asset_return - total return for each asset (n x 1) Vector
            cov_matrix = nxn covariance matrix for the assets
            target_ret = Target Return (Not-Annualized)
            tangency_port = Tangency portfolio
    """
    omega_tan = tangency_portfolio_rfr(asset_return.mean(),cov_matrix)
    omega_gmv = gmv_portfolio(asset_return,cov_matrix)

    mu_tan = asset_return.mean() @ omega_tan
    mu_gmv = asset_return.mean() @ omega_gmv

    delta = (target_ret - mu_gmv)/(mu_tan - mu_gmv)
    mv_weights = delta * omega_tan + (1-delta)*omega_gmv
    return mv_weights
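
A quick sanity check of the two-fund construction used in `mv_portfolio`: the blended weights should sum to one and hit the target mean exactly. The sketch below uses a hypothetical three-asset mean vector and covariance matrix rather than the exam data.

```python
import numpy as np

# Hypothetical monthly means and covariance matrix (not the exam data).
mu = np.array([0.010, 0.006, 0.003])
cov = np.array([[0.0040, 0.0010, 0.0002],
                [0.0010, 0.0025, 0.0003],
                [0.0002, 0.0003, 0.0009]])
ones = np.ones(len(mu))

w_tan = np.linalg.solve(cov, mu)      # proportional to inv(cov) @ mu
w_tan = w_tan / w_tan.sum()
w_gmv = np.linalg.solve(cov, ones)    # proportional to inv(cov) @ 1
w_gmv = w_gmv / w_gmv.sum()

target = 0.008
delta = (target - mu @ w_gmv) / (mu @ w_tan - mu @ w_gmv)
w_mv = delta * w_tan + (1 - delta) * w_gmv

print("weights:", w_mv.round(4))
print("sum of weights:", w_mv.sum(), "mean:", mu @ w_mv)
```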

In [78]:
def mvo_performance_stats(asset_returns,cov_matrix,port_weights, port_type,period):
    """
        Returns the Annualized Performance Stats for given asset returns, portfolio weights and covariance matrix
        Inputs:
            asset_return - Excess return over the risk free rate for each asset (n x 1) Vector
            cov_matrix = nxn covariance matrix for the assets
            port_weights = weights of the assets in the portfolio (1 x n) Vector
            port_type = Type of Portfolio | Eg - Tangency or Mean-Variance Portfolio
            period = Monthly frequency
    """

    ret = np.dot(port_weights,asset_returns)
    vol = np.sqrt(port_weights @ cov_matrix @ port_weights.T)*np.sqrt(period)
    sharpe = ret/vol

    stats = pd.DataFrame([[ret,vol,sharpe]],columns= ["Annualized Return","Annualized Volatility","Annualized Sharpe Ratio"], index = [port_type])
    return stats

In [102]:
def rolling_regression_param(factor,fund_ret,roll_window = 60):
    """
        Returns the Rolling Regression parameters for given set of returns and factors
        Inputs:
            factor - Dataframe containing monthly returns of the regressors
            fund_ret - Dataframe containing monthly excess returns of the regressand fund
            roll_window = rolling window for regression
        Output:
            params - Dataframe with time-t as the index and constant and Betas as columns
    """
    X = sm.add_constant(factor)
    y= fund_ret
    rols = RollingOLS(y, X, window=roll_window)
    rres = rols.fit()
    params = rres.params.copy()
    params.index = np.arange(1, params.shape[0] + 1)
    return params

In [109]:
def regression_based_performance(factor,fund_ret,rf,constant = True):
    """
        Returns the Regression based performance Stats for given set of returns and factors
        Inputs:
            factor - Dataframe containing monthly returns of the regressors
            fund_ret - Dataframe containing monthly excess returns of the regressand fund
            rf - Monthly risk free rate of return
        Output:
            summary_stats - (Beta of regression, treynor ratio, information ratio, alpha).
    """
    if constant:
        X = sm.tools.add_constant(factor)
    else:
        X = factor
    y=fund_ret
    model = sm.OLS(y,X,missing='drop').fit()

    if constant:
        beta = model.params[1:]
        alpha = round(float(model.params['const']),6)

    else:
        beta = model.params
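    # Use explicit positional/label access; integer keys on a labeled Series are deprecated in recent pandas
    treynor_ratio = ((fund_ret.values-rf.values).mean()*12)/beta.iloc[0]
    tracking_error = (model.resid.std()*np.sqrt(12))
    if constant:
        information_ratio = model.params['const']*12/tracking_error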
    r_squared = model.rsquared
    if constant:
        return (beta,treynor_ratio,information_ratio,alpha,r_squared,tracking_error)
    else:
        return (beta,treynor_ratio,r_squared,tracking_error)

In [79]:
performance = performance_summary(df)
mean_total_ret =  np.array(performance['Annualized Return'])
tangency_port = tangency_portfolio_rfr(mean_total_ret,df.cov())

## 2 Allocation
### 1.

In [80]:
GMVPort_df = pd.DataFrame(gmv_portfolio(mean_total_ret, df.cov()),columns= ["GMV Portfolio Weight"],index=performance.index)
GMVPort_df

,GMV Portfolio Weight
CL1,-0.030674
GC1,0.179111
KC1,-0.010639
ES1,0.092154
BP1,0.770048


In [81]:
TangencyPort_df = pd.DataFrame(tangency_port, columns= ["Tangency Portfolio Weight"],index=performance.index)
TangencyPort_df

,Tangency Portfolio Weight
CL1,-0.128124
GC1,1.191087
KC1,0.097813
ES1,4.220019
BP1,-4.380796


### 2.

In [82]:
target_ret = 0.020
mv_port= mv_portfolio(df, df.cov(), target_ret, tangency_port)

MVPort_df = pd.DataFrame(mv_port,columns= ["Optimal Mean-Variance Portfolio Weight"],index=performance.index)
MVPort_df

,Optimal Mean-Variance Portfolio Weight
CL1,-0.065837
GC1,0.544268
KC1,0.028494
ES1,1.581636
BP1,-1.088561


In [83]:
optimal_port_stats = mvo_performance_stats(mean_total_ret,df.cov(),mv_port,'Mean-Variance Portfolio',12)
optimal_port_stats

,Annualized Return,Annualized Volatility,Annualized Sharpe Ratio
Mean-Variance Portfolio,0.24,0.224473,1.069171


### 3.
- Assuming that we can hold the risk-free asset in our portfolio, the target mean return of 0.020 per month (0.24 annualized) can be reached with lower portfolio volatility: the tangency portfolio can simply be mixed with the risk-free asset to attain the target at the highest available Sharpe ratio, which gives the lowest attainable volatility for that mean.
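
The comparison can be sketched numerically. The example below assumes the returns are excess returns over the risk-free rate and uses a hypothetical three-asset mean vector and covariance matrix. It compares the volatility needed to reach a 0.020 monthly target with a risk-free asset (scaling `inv(cov) @ mu`) against the fully invested two-fund solution.

```python
import numpy as np

# Hypothetical monthly mean excess returns and covariance (not the exam data).
mu = np.array([0.010, 0.006, 0.003])
cov = np.array([[0.0040, 0.0010, 0.0002],
                [0.0010, 0.0025, 0.0003],
                [0.0002, 0.0003, 0.0009]])
ones = np.ones(len(mu))
target = 0.020

# With a risk-free asset: scale inv(cov) @ mu until the mean hits the target.
# The weights need not sum to one; the remainder sits in the risk-free asset.
raw = np.linalg.solve(cov, mu)
w_rf = target / (mu @ raw) * raw
vol_rf = np.sqrt(w_rf @ cov @ w_rf)

# Without a risk-free asset: blend the tangency and GMV portfolios.
w_tan = raw / raw.sum()
g = np.linalg.solve(cov, ones)
w_gmv = g / g.sum()
delta = (target - mu @ w_gmv) / (mu @ w_tan - mu @ w_gmv)
w_risky = delta * w_tan + (1 - delta) * w_gmv
vol_risky = np.sqrt(w_risky @ cov @ w_risky)

print(f"vol with risk-free asset:    {vol_rf:.4f}")
print(f"vol without risk-free asset: {vol_risky:.4f}")
```

Dropping the full-investment constraint can only enlarge the feasible set, so the first volatility is never larger than the second.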

## 3. Hedging & Replication

In [124]:
params = rolling_regression_param(factor = df['BP1'], fund_ret=df['ES1'], roll_window=36)

### 1.

In [115]:
params

,const,BP1
36,0.010347,0.850830
37,0.013474,0.828043
38,0.017341,0.788871
39,0.015762,0.793073
40,0.013533,0.724540
...,...,...
158,0.013660,1.178892
159,0.014433,1.144168
160,0.012270,1.247557
161,0.013008,1.186588


- The last 5 rows above
### 2.

In [125]:
df['BP1'] * params['BP1'] + params['const']

2009-01-31 00:00:00   NaN
2009-02-28 00:00:00   NaN
2009-03-31 00:00:00   NaN
2009-04-30 00:00:00   NaN
2009-05-31 00:00:00   NaN
                       ..
158                   NaN
159                   NaN
160                   NaN
161                   NaN
162                   NaN
Length: 324, dtype: float64
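
The all-NaN result above, with 324 rows instead of 162, is what pandas produces when two Series share no index labels: `df['BP1']` is indexed by date, while `rolling_regression_param` relabels its output 1..N. The sketch below reproduces the situation on synthetic data and shows one possible way to keep the estimates aligned. It uses rolling moments rather than `RollingOLS`, which gives the same intercept and slope for a single regressor. The data, seed, and variable names are all placeholders.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(1)
dates = pd.date_range("2009-01-01", periods=60, freq="MS")   # synthetic monthly dates
x = pd.Series(rng.normal(0.0, 0.03, 60), index=dates, name="x")
y = 0.01 + 0.8 * x + pd.Series(rng.normal(0.0, 0.01, 60), index=dates)

window = 36
# Rolling OLS of y on x with an intercept, written with rolling moments.
beta = x.rolling(window).cov(y) / x.rolling(window).var()
alpha = y.rolling(window).mean() - beta * x.rolling(window).mean()

# Relabeling the estimates 1..N leaves no labels in common with the returns,
# so label-aligned arithmetic has nothing to match and yields NaN everywhere.
beta_relabelled = beta.copy()
beta_relabelled.index = np.arange(1, len(beta) + 1)
shared_labels = set(x.index) & set(beta_relabelled.index)
print("labels in common:", len(shared_labels))

# Keeping the date index lets pandas pair each estimate with its own month.
fitted = alpha + beta * x
print(fitted.dropna().tail())
```

In the notebook, the equivalent change would be to leave the index of `rres.params` as dates instead of overwriting it; whether to lag the estimates by one period depends on what the question asks for.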

### 3.

In [126]:
df['BP1'].mean() * params['BP1'].mean() + params['const'].mean()

0.01107550702690088

## 4. Modeling Risk
### 1.

In [86]:
df['Q4_port'] = 0.5 * df['ES1'] + 0.5 * df['GC1']
performance_summary(df[['ES1', 'GC1', 'Q4_port']])[['Annualized Return', 'Annualized Volatility', 'Mean/Vol', 'Max Drawdown']]

,Annualized Return,Annualized Volatility,Mean/Vol,Max Drawdown
ES1,0.129117,0.151067,0.854698,-0.203174
GC1,0.054008,0.160134,0.337267,-0.429597
Q4_port,0.091562,0.114079,0.802619,-0.120982


### 2.
- Volatility: The annualized volatility of the 50/50 portfolio (0.114) is lower than that of either individual asset (0.151 for ES1, 0.160 for GC1).
- Maximum drawdown: The maximum drawdown of the 50/50 portfolio (-0.121) is much less extreme than that of either individual asset (-0.203 for ES1, -0.430 for GC1).
- Mean-to-vol ratio: The portfolio's mean-to-vol ratio (0.803) is higher than that of GC1 (0.337) but lower than that of ES1 (0.855).
- The decrease in volatility and maximum drawdown is expected unless the two assets are very highly correlated. The mean-to-vol ratio, however, would rise above that of both assets if the right weights were used. Since it falls below that of ES1, 50/50 does not seem to be close to the optimal weights for the two assets.
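
The size of the diversification effect can be read off the table itself. Using the two-asset variance formula, the sketch below backs out the correlation between ES1 and GC1 implied by the three reported annualized volatilities. The inputs are the rounded table values, so the result is approximate.

```python
import math

# Annualized volatilities copied from the table above (rounded).
vol_es, vol_gc, vol_port = 0.151067, 0.160134, 0.114079
w = 0.5   # weight on ES1; the remainder is in GC1

# sigma_p^2 = w^2 s1^2 + (1-w)^2 s2^2 + 2 w (1-w) rho s1 s2, solved for rho
rho = (vol_port**2 - w**2 * vol_es**2 - (1 - w)**2 * vol_gc**2) / (
    2 * w * (1 - w) * vol_es * vol_gc
)

# Portfolio volatility if the two assets were perfectly correlated.
vol_if_perfect = w * vol_es + (1 - w) * vol_gc

print(f"implied correlation: {rho:.3f}")
print(f"vol with rho = 1:    {vol_if_perfect:.4f}")
print(f"vol at rho = 0:      {math.sqrt(w**2 * vol_es**2 + (1 - w)**2 * vol_gc**2):.4f}")
```

The implied correlation comes out at roughly 0.07, which is why the 50/50 volatility sits well below the 0.156 that perfect correlation would give.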

### 3.
#### (a)

In [94]:
sigma = df['Q4_port'].std() * np.sqrt(12)
(-stat.norm().pdf(.01)/stat.norm().cdf(.01)) * sigma

-0.09029723958289262
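
For comparison, the usual closed form for the CVaR of a normally distributed return evaluates the standard normal density at the quantile $z_q = \Phi^{-1}(q)$ and divides by $q$, whereas the cell above evaluates the density and distribution functions at 0.01 directly and uses an annualized volatility. The sketch below implements the closed form with `statistics.NormalDist` (chosen here to keep the example dependency-free; `scipy.stats.norm.ppf` and `.pdf` would play the same role). The function name and the sample `sigma` are placeholders.

```python
from statistics import NormalDist

def normal_cvar(sigma, q, mu=0.0):
    """Mean of a N(mu, sigma^2) return conditional on being below its q-quantile."""
    std_normal = NormalDist()
    z_q = std_normal.inv_cdf(q)                   # e.g. about -2.326 for q = 0.01
    return mu - sigma * std_normal.pdf(z_q) / q

sigma_monthly = 0.033   # placeholder monthly volatility, not a computed result
print(normal_cvar(sigma_monthly, 0.01))
```

With `sigma = 1` the function returns about -2.665 at `q = 0.01` and about -2.063 at `q = 0.05`, the standard textbook multipliers.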

#### (b)

In [100]:
stdev = np.sqrt((df['Q4_port']**2).shift(1).expanding(60).mean().to_frame('Expanding Window'))
stdev

date,Expanding Window
2009-01-31,
2009-02-28,
2009-03-31,
2009-04-30,
2009-05-31,
...,...
2022-02-28,0.033542
2022-03-31,0.033454
2022-04-30,0.033447
2022-05-31,0.033628


In [None]:
sigma = 0.033552
(-stat.norm().pdf(.01)/stat.norm().cdf(.01)) * sigma

### 4.

In [101]:
mean_ES1 = np.log(df['ES1'] + 1).mean()
mean_port = np.log(df['Q4_port'] + 1).mean()
mu_tilde = mean_ES1 - mean_port
sigma = np.log(df['ES1'] + 1).std()
h = 10
stat.norm().cdf(-np.sqrt(h)*mu_tilde/sigma)

0.4227413691260392

About 42.27 %
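
One variation worth checking is to take both the mean and the volatility from the spread of log returns, since the formula $\Phi(-\sqrt{h}\,\tilde\mu/\tilde\sigma)$ concerns the difference between the two series. The sketch below is a minimal version under the assumption of iid normal log-return spreads, with the horizon counted in the same periods as the input returns. The function name and sample inputs are hypothetical.

```python
from statistics import NormalDist
import numpy as np

def prob_underperform(log_ret_a, log_ret_b, horizon):
    """P(cumulative log return of a < that of b over `horizon` periods),
    assuming the per-period spread a - b is iid normal."""
    spread = np.asarray(log_ret_a, dtype=float) - np.asarray(log_ret_b, dtype=float)
    mu_tilde = spread.mean()
    sigma_tilde = spread.std(ddof=1)
    z = -np.sqrt(horizon) * mu_tilde / sigma_tilde
    return NormalDist().cdf(float(z))

# Made-up per-period log returns for illustration only.
a = [0.01, 0.03, -0.01, 0.02, 0.00]
b = [0.00, 0.00, 0.00, 0.00, 0.00]
p1 = prob_underperform(a, b, horizon=1)
p10 = prob_underperform(a, b, horizon=10)
print(p1, p10)
```

In the notebook this would correspond to passing `np.log(1 + df['ES1'])` and `np.log(1 + df['Q4_port'])`. With monthly inputs, `horizon=10` means ten months, so a ten-year horizon would need `horizon=120`.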