# Midterm 1

## FINM 36700 - 2023

### UChicago Financial Mathematics

* Mark Hendricks
* hendricks@uchicago.edu

---

Sections 1 & 4.

* Tobias Rodriguez del Pozo
* tobiasdelpozo@uchicago.edu

Section 2.

* Mani Sawhney
* msawhn2@uchicago.edu

Section 3.

* Younghun Lee
* hun@uchicago.edu

# Instructions

## Please note the following:

Points
* The exam is 100 points.
* You have 120 minutes to complete the exam.
* For every minute late you submit the exam, you will lose one point.

Submission
* You will upload your solution to the `Midterm 1` assignment on Canvas, where you downloaded this. (Be sure to **submit** on Canvas, not just **save** on Canvas.)
* Your submission should be readable (the graders must be able to understand your answers), and it should **include all code used in your analysis in a file format in which the code can be executed.**

Rules
* The exam is open-material, closed-communication.
* You do not need to cite material from the course github repo--you are welcome to use the code posted there without citation.

Advice
* If you find any question to be unclear, state your interpretation and proceed. We will only answer questions of interpretation if there is a typo, error, etc.
* The exam will be graded for partial credit.

## Data

**All data files are found in the class github repo, in the `data` folder.**

This exam makes use of the following data files:
* `midterm_1_data.xlsx`

This file has sheets for...
* `info` - names of each stock ticker
* `excess returns` - weekly excess returns on several stocks
* `SPY` - weekly excess returns on SPY

Note the data is **weekly** so any annualizations should use `52` weeks in a year.

#### If useful
here is code to load in the data.

## Imports

In [1]:
import pandas as pd
import numpy as np
import seaborn as sns
import statsmodels.api as sm
import scipy.stats as stats
import warnings
warnings.filterwarnings("ignore")

import sys
sys.path.append('../cmds/')
import TA_utils as ta

sns.set_theme()

## Helper Functions

### Performance Summary

In [2]:
def performance_summary(return_data, annualization = 12):
    """ 
        Returns the performance stats for a given set of returns.
        Inputs: 
            return_data - DataFrame with Date index and periodic returns for different assets/strategies.
            annualization - number of periods per year (12 for monthly data, 52 for weekly data).
        Output:
            summary_stats - DataFrame with annualized mean return, vol, and Sharpe ratio, plus skewness, excess kurtosis,
                            VaR (0.05), CVaR (0.05), and drawdown statistics, all based on the periodic returns.
    """
    summary_stats = return_data.mean().to_frame('Mean').apply(lambda x: x*annualization)
    summary_stats['Volatility'] = return_data.std().apply(lambda x: x*np.sqrt(annualization))
    summary_stats['Sharpe Ratio'] = summary_stats['Mean']/summary_stats['Volatility']
    
    summary_stats['Skewness'] = return_data.skew()
    summary_stats['Excess Kurtosis'] = return_data.kurtosis()
    summary_stats['VaR (0.05)'] = return_data.quantile(.05, axis = 0)
    summary_stats['CVaR (0.05)'] = return_data[return_data <= return_data.quantile(.05, axis = 0)].mean()
    summary_stats['Min'] = return_data.min()
    summary_stats['Max'] = return_data.max()
    
    wealth_index = 1000*(1+return_data).cumprod()
    previous_peaks = wealth_index.cummax()
    drawdowns = (wealth_index - previous_peaks)/previous_peaks

    summary_stats['Max Drawdown'] = drawdowns.min()
    summary_stats['Peak'] = [previous_peaks[col][:drawdowns[col].idxmin()].idxmax() for col in previous_peaks.columns]
    summary_stats['Bottom'] = drawdowns.idxmin()
    
    recovery_date = []
    for col in wealth_index.columns:
        prev_max = previous_peaks[col][:drawdowns[col].idxmin()].max()
        recovery_wealth = pd.DataFrame([wealth_index[col][drawdowns[col].idxmin():]]).T
        recovery_date.append(recovery_wealth[recovery_wealth[col] >= prev_max].index.min())
    summary_stats['Recovery'] = recovery_date
    
    return summary_stats

### Time-series Regression

In [3]:
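def time_series_regression(portfolio, factors, FF3F = False, resid = False, annualization = 12):
    """
        Regresses each column of `portfolio` on `factors` (with an intercept).
        Returns the regression residuals if resid is True; otherwise a report of annualized alpha,
        betas, information ratio, Treynor ratio, R-squared, and annualized tracking error.
        annualization - number of periods per year (12 for monthly data, 52 for weekly data).
    """
    ff_report = pd.DataFrame(index=portfolio.columns)
    bm_residuals = pd.DataFrame(columns=portfolio.columns)

    rhs = sm.add_constant(factors)

    for portf in portfolio.columns:
        lhs = portfolio[portf]
        res = sm.OLS(lhs, rhs, missing='drop').fit()
        ff_report.loc[portf, 'alpha_hat'] = res.params['const'] * annualization
        # Use positional access so the betas are picked up regardless of factor names
        ff_report.loc[portf, 'beta_mkt'] = res.params.iloc[1]
        if FF3F:
            ff_report.loc[portf, 'Size beta'] = res.params.iloc[2]
            ff_report.loc[portf, 'Value beta'] = res.params.iloc[3]

        ff_report.loc[portf, 'info_ratio'] = np.sqrt(annualization) * res.params['const'] / res.resid.std()
        ff_report.loc[portf, 'treynor_ratio'] = annualization * portfolio[portf].mean() / res.params.iloc[1]
        ff_report.loc[portf, 'R-squared'] = res.rsquared
        ff_report.loc[portf, 'Tracking Error'] = res.resid.std() * np.sqrt(annualization)

        if resid:
            bm_residuals[portf] = res.resid

    if resid:
        return bm_residuals

    return ff_report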

### Tangency Weights

In [4]:
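def tangency_weights(returns, cov_mat = 1):
    """
        Tangency weights, normalized to sum to one.
        Inputs:
            returns - DataFrame whose first column is the date (e.g. after reset_index());
                      the remaining columns are the asset excess returns.
            cov_mat - weight placed on the full covariance matrix; (1 - cov_mat) is placed on its diagonal.
                      cov_mat = 1 means no regularization.
    """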
    if cov_mat ==1:
        cov_inv = np.linalg.inv((returns.cov()*12))
    else:
        cov = returns.cov()
        covmat_diag = np.diag(np.diag((cov)))
        covmat = cov_mat * cov + (1-cov_mat) * covmat_diag
        cov_inv = np.linalg.inv((covmat*12))  
        
    ones = np.ones(returns.columns[1:].shape) 
    mu = returns.mean()*12
    scaling = 1/(np.transpose(ones) @ cov_inv @ mu)
    tangent_return = scaling*(cov_inv @ mu) 
    tangency_wts = pd.DataFrame(index = returns.columns[1:], data = tangent_return, columns = ['Tangent Weights'] )
        
    return tangency_wts

### Global-Minimum Variance Weights

In [5]:
def gmv_weights(tot_returns):
    
    ones = np.ones(tot_returns.columns[1:].shape)
    cov = tot_returns.cov()*12
    cov_inv = np.linalg.inv(cov)
    scaling = 1/(np.transpose(ones) @ cov_inv @ ones)
    gmv_tot = scaling * cov_inv @ ones
    gmv_wts = pd.DataFrame(index = tot_returns.columns[1:], data = gmv_tot, columns = ['GMV Weights'] )

    
    return gmv_wts


### Mean-Variance Portfolio

In [6]:
def mv_portfolio(target_ret, tot_returns):
    
    mu_tan = tot_returns.mean() @ tangency_weights(tot_returns, cov_mat = 1)
    mu_gmv = tot_returns.mean() @ gmv_weights(tot_returns)
    
    delta = (target_ret - mu_gmv[0])/(mu_tan[0] - mu_gmv[0])
    mv_weights = (delta * tangency_weights(tot_returns, cov_mat = 1)).values + ((1-delta)*gmv_weights(tot_returns)).values
    
    MV = pd.DataFrame(index = tot_returns.columns[1:], data = mv_weights, columns = ['MV Weights'] )
    MV['tangency weights'] =  tangency_weights(tot_returns, cov_mat = 1).values
    MV['GMV weights'] =   gmv_weights(tot_returns).values


    return MV


## Reading the data

In [7]:
FILEIN = '../data/midterm_1_data.xlsx'
sheet_exrets = 'excess returns'
sheet_spy = 'spy'

retsx = pd.read_excel(FILEIN, sheet_name=sheet_exrets).set_index('date')
spy = pd.read_excel(FILEIN, sheet_name=sheet_spy).set_index('date')

## Scoring

| Problem | Points |
|---------|--------|
| 1       | 20     |
| 2       | 35     |
| 3       | 30     |
| 4       | 15     |

### Each numbered question is worth 5 points.

### Notation
(Hidden LaTeX commands)

$$\newcommand{\mux}{\tilde{\boldsymbol{\mu}}}$$
$$\newcommand{\wtan}{\boldsymbol{\text{w}}^{\text{tan}}}$$
$$\newcommand{\wtarg}{\boldsymbol{\text{w}}^{\text{port}}}$$
$$\newcommand{\mutarg}{\tilde{\boldsymbol{\mu}}^{\text{port}}}$$
$$\newcommand{\wEW}{\boldsymbol{\text{w}}^{\text{EW}}}$$
$$\newcommand{\wRP}{\boldsymbol{\text{w}}^{\text{RP}}}$$
$$\newcommand{\wREG}{\boldsymbol{\text{w}}^{\text{REG}}}$$

# 1. Short Answer

### No Data Needed

These problems do not require any data file. Rather, analyze the situation conceptually, based on the information below.

## 1

In what sense was ProShares `HDG` successful in hedging the `HFRI`, and in what sense was it unsuccessful in tracking the `HFRI`?

<font color='red'>

HDG is successful in matching the return variation of HFRI, as evidenced by the high correlation/R^2 between the two. However, it is unsuccessful in delivering high returns to investors, as evidenced by its low Sharpe ratio compared to HFRI.

That is, it matches the variation well but not the mean return.

</font>

## 2

We discussed multiple ways of calculating Value-at-Risk (VaR). What are the tradeoffs between using the normal distribution formula versus a directly empirical approach?

<font color='red'>

Benefits of normal distribution:
- **Statistical power**; you can get a better estimate with less data, and you can get a better estimate of extreme events.
- Good way to compare and quote VaR.

But:
- **Bad for modeling tails as we already know that skewness and kurtosis from returns are very different from the normal distribution.**
- Assumes i.i.d. returns.

Benefits of empirical approach:
- No assumptions about the distribution of returns.
- "Data driven" approach.
- Ease of implementation.

But:
- Requires a lot of data to get a good estimate for extreme events (e.g. 0.1\% VaR).
- Also assumes i.i.d. returns.

</font>
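The tradeoff above can be illustrated with a minimal sketch. It assumes hypothetical fat-tailed weekly returns (a rescaled Student-t draw, not the exam data), and the helper names `normal_var` and `empirical_var` are made up for this illustration. It compares the mean-zero normal formula with the sample quantile at a moderate and at an extreme level.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

# Hypothetical weekly returns: Student-t with 4 degrees of freedom (fat tails),
# rescaled so the sample volatility is 4% per week.
n_obs = 5000
raw = rng.standard_t(df=4, size=n_obs)
returns = 0.04 * raw / raw.std()

def normal_var(r, q):
    """VaR from the normal formula with mean zero: z_q * sigma."""
    return stats.norm.ppf(q) * r.std()

def empirical_var(r, q):
    """VaR as the q-th sample quantile of returns."""
    return np.quantile(r, q)

for q in (0.05, 0.001):
    print(f"q={q}: normal={normal_var(returns, q):.4f}, "
          f"empirical={empirical_var(returns, q):.4f}")
```

With fat-tailed data the normal formula tends to understate losses far out in the tail, while the empirical quantile at a level such as 0.1% rests on only a handful of observations.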

## 3

Did we find that **TIPS** have been useful in expanding the mean-variance frontier in the past? Did we conclude they might be useful in the future? Explain.

<font color='red'>

TIPS have not been particularly useful in expanding the MV frontier in the past. Recall from C.1. -- [link](https://github.com/MarkHendricks/finm-portfolio-2023/blob/main/discussions/C.1.%20Harvard%20Endowment.ipynb) -- that by dropping TIPS from the investment set, we barely see an impact on the weights for the other assets, and that it has a negligible effect on portfolio performance. 

In the future, they may be useful, since as we also saw, adjusting the performance of TIPS upwards just by 1 standard error causes a big change in the allocations and performance of the portfolio. That means that if TIPS began to perform better, then they would be useful in expanding the MV frontier. However, when we read the case, we see that Harvard ***already*** has a treasury portfolio, and so I would argue that TIPS should just be added to that existing portfolio, and are not sufficiently different to be considered their own asset class.

</font>

## 4.

What aspect of the classic mean-variance optimization approach leads to extreme answers? How did regularization help with this issue?

<font color='red'>

The classic MV approach leads to extreme answers because of the instability of the inverted covariance matrix for correlated assets. That is, if the covariance matrix is nearly singular (caused by high correlations), then the inverse of the covariance matrix will be unstable, and the weights will be extreme.

Additionally, we saw that the weights are not only extreme in the base case, but they are also unstable. For example, we saw in C.1 -- [link](https://github.com/MarkHendricks/finm-portfolio-2023/blob/main/discussions/C.1.%20Harvard%20Endowment.ipynb) -- that just a 0.0012 change in monthly returns for TIPS leads to a large change in tangency portfolio weights.

Regularization helps with this issue as it makes the covariance matrix more stable, specifically by making it less singular. This is because we are (for example, in HW1) taking the average of the observed covariance matrix with the diagonalized one. This shrinks the off-diagonal elements, which makes the matrix less singular, and thus more stable when we invert it, leading to more stable tangency portfolio weights. Ridge and LASSO shrink the betas and achieve a similar goal.

</font>
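A rough way to see the "less singular" claim is to compare condition numbers before and after shrinking toward the diagonal. The sketch below assumes a hypothetical covariance matrix for three highly correlated assets; the volatilities and correlations are invented for illustration and are not taken from the course data.

```python
import numpy as np

# Hypothetical volatilities and correlations for three highly correlated assets
vols = np.array([0.20, 0.22, 0.25])
corr = np.array([[1.00, 0.95, 0.90],
                 [0.95, 1.00, 0.93],
                 [0.90, 0.93, 1.00]])
cov = np.outer(vols, vols) * corr

# Average the covariance matrix with its diagonal (variances only)
cov_diag = np.diag(np.diag(cov))
cov_reg = 0.5 * cov + 0.5 * cov_diag

# A large condition number signals a nearly singular, unstable-to-invert matrix
cond_raw = np.linalg.cond(cov)
cond_reg = np.linalg.cond(cov_reg)
print(f"condition number, sample covariance:      {cond_raw:.1f}")
print(f"condition number, regularized covariance: {cond_reg:.1f}")
```

The regularized matrix has the same variances but halved covariances, so its condition number is lower and its inverse reacts less violently to small changes in the inputs.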

***

# 2. Allocation


Consider a mean-variance optimization of **excess** returns provided in `midterm_1_data.xlsx`.

## 1. 

Report the following **annualized** statistics:
* mean
* volatility
* Sharpe ratio

Which assets have the highest / lowest Sharpe ratios?

In [8]:
summary_stats_retsx = performance_summary(retsx, 52)
summary_stats_retsx

|  | Mean | Volatility | Sharpe Ratio | Skewness | Excess Kurtosis | VaR (0.05) | CVaR (0.05) | Min | Max | Max Drawdown | Peak | Bottom | Recovery |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| AAPL | 0.319421 | 0.283883 | 1.125183 | -0.334342 | 2.672198 | -0.052313 | -0.085612 | -0.190566 | 0.143562 | -0.372094 | 2018-10-05 | 2019-01-04 | 2019-11-08 |
| MSFT | 0.288087 | 0.240206 | 1.199334 | -0.359175 | 1.737189 | -0.049366 | -0.071559 | -0.150492 | 0.104231 | -0.299537 | 2020-02-14 | 2020-03-20 | 2020-07-03 |
| AMZN | 0.239457 | 0.310389 | 0.771474 | -0.21063 | 1.746315 | -0.061868 | -0.096065 | -0.151901 | 0.156111 | -0.468127 | 2021-07-09 | 2023-01-06 | NaT |
| NVDA | 0.650658 | 0.468096 | 1.390011 | 0.425676 | 2.244417 | -0.083805 | -0.119446 | -0.210199 | 0.33258 | -0.592344 | 2021-11-19 | 2022-10-14 | 2023-05-19 |
| GOOGL | 0.193328 | 0.274217 | 0.70502 | 0.041986 | 1.143573 | -0.055729 | -0.078408 | -0.135524 | 0.149258 | -0.348297 | 2022-03-25 | 2023-01-06 | NaT |
| TSLA | 0.569728 | 0.607026 | 0.938556 | 0.441455 | 1.527376 | -0.122519 | -0.155313 | -0.284957 | 0.334897 | -0.682185 | 2021-11-05 | 2023-01-06 | NaT |
| XOM | 0.124196 | 0.311613 | 0.398557 | 0.097936 | 3.129459 | -0.061685 | -0.09734 | -0.175338 | 0.184173 | -0.671435 | 2016-12-16 | 2020-03-20 | 2022-03-11 |


In [9]:
print("The asset with the best Sharpe ratio is: ")
display(summary_stats_retsx[summary_stats_retsx['Sharpe Ratio'] == summary_stats_retsx['Sharpe Ratio'].max()][['Sharpe Ratio']])
print("The asset with the worst Sharpe ratio is: ")
display(summary_stats_retsx[summary_stats_retsx['Sharpe Ratio'] == summary_stats_retsx['Sharpe Ratio'].min()][['Sharpe Ratio']])

The asset with the best Sharpe ratio is: 


|  | Sharpe Ratio |
|---|---|
| NVDA | 1.390011 |


The asset with the worst Sharpe ratio is: 


|  | Sharpe Ratio |
|---|---|
| XOM | 0.398557 |


## 2.

Report the weights of the tangency portfolio.

Also report the Sharpe ratio achieved by the tangency portfolio over this sample.

In [10]:
w_t = tangency_weights(retsx.reset_index(), cov_mat = 1)
w_t

|  | Tangent Weights |
|---|---|
| AAPL | 0.322605 |
| MSFT | 0.787496 |
| AMZN | -0.228607 |
| NVDA | 0.495996 |
| GOOGL | -0.502721 |
| TSLA | 0.105975 |
| XOM | 0.019257 |


In [11]:
w_tan_summary_statistics = performance_summary(retsx @ w_t , 52)
w_tan_summary_statistics

|  | Mean | Volatility | Sharpe Ratio | Skewness | Excess Kurtosis | VaR (0.05) | CVaR (0.05) | Min | Max | Max Drawdown | Peak | Bottom | Recovery |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Tangent Weights | 0.563474 | 0.358351 | 1.572409 | 0.00438 | 1.899066 | -0.063723 | -0.096174 | -0.223741 | 0.193774 | -0.383571 | 2018-09-14 | 2019-01-04 | 2019-11-08 |


## 3.

* What weight is given to the asset with the lowest Sharpe ratio?
* What Sharpe ratio does the lowest (most negative) weight asset have?

Explain. Support your answer with evidence.

In [12]:
lowest_SR_asset = summary_stats_retsx[summary_stats_retsx['Sharpe Ratio'] == summary_stats_retsx['Sharpe Ratio'].min()][['Sharpe Ratio']].index[0]
negative_wt_asset = w_t.idxmin()[0]
SR_neg_wt = summary_stats_retsx.loc[negative_wt_asset]['Sharpe Ratio']

print("The weight given to the asset {} with the lowest Sharpe ratio is {}".format(lowest_SR_asset, w_t.loc[lowest_SR_asset][0]))
print("The Sharpe ratio assigned to {} - most negative weight is {}".format(negative_wt_asset,SR_neg_wt))

The weight given to the asset XOM with the lowest Sharpe ratio is 0.01925660795370571
The Sharpe ratio assigned to GOOGL - most negative weight is 0.7050198542160366


## 4.

Let's examine the out-of-sample performance.

Calculate and report the following three allocations using only data through the end of 2022:
* tangency portfolio
* equally weighted portfolio
* a regularized approach, with a new formula shown below

where
$$\wEW_i = \frac{1}{n}$$

$$\wREG \sim \widehat{\Sigma}^{-1}\mux$$

$$\widehat{\Sigma} = \frac{\Sigma + \boldsymbol{2}\,\Sigma_D}{\boldsymbol{3}}$$
where $\Sigma_D$ denotes a *diagonal* matrix of the security variances, with zeros in the off-diagonals.
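The regularized formula can be sketched directly. The function name `regularized_weights` and the demo data are hypothetical, and normalizing the weights to sum to one is an assumption, since the formula above only pins the weights down up to proportionality.

```python
import numpy as np
import pandas as pd

def regularized_weights(excess_returns: pd.DataFrame) -> pd.Series:
    """Weights proportional to Sigma_hat^{-1} mu, with Sigma_hat = (Sigma + 2 Sigma_D) / 3."""
    mu = excess_returns.mean().values
    sigma = excess_returns.cov().values
    sigma_d = np.diag(np.diag(sigma))          # variances on the diagonal, zeros elsewhere
    sigma_hat = (sigma + 2 * sigma_d) / 3
    raw = np.linalg.solve(sigma_hat, mu)       # Sigma_hat^{-1} mu without forming the inverse
    # Assumption: rescale so the weights sum to one
    return pd.Series(raw / raw.sum(), index=excess_returns.columns, name="regularized")

# Hypothetical data, purely for illustration
rng = np.random.default_rng(1)
demo = pd.DataFrame(rng.normal(0.01, 0.04, size=(200, 3)), columns=["A", "B", "C"])
w_reg = regularized_weights(demo)
print(w_reg)
```

Since $(\Sigma + 2\Sigma_D)/3 = \tfrac{1}{3}\Sigma + \tfrac{2}{3}\Sigma_D$, this matches calling the helper `tangency_weights` with `cov_mat = 1/3`.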

In [13]:
retsx_IS = retsx.loc[:'2022']
retsx_OOS = retsx.loc['2023':]


wts = pd.DataFrame(index = retsx_IS.columns, columns = ['tangency','equal weights',
                                                        'regularized'])


wts.loc[:,'tangency'] = tangency_weights(retsx_IS.reset_index(), cov_mat = 1).values
wts.loc[:,'equal weights'] = 1/len(retsx_IS.columns)
wts.loc[:,'regularized'] = tangency_weights(retsx_IS.reset_index(), cov_mat = (1/3)).values

wts

|  | tangency | equal weights | regularized |
|---|---|---|---|
| AAPL | 0.310565 | 0.142857 | 0.237267 |
| MSFT | 1.073114 | 0.142857 | 0.330835 |
| AMZN | -0.25908 | 0.142857 | 0.047178 |
| NVDA | 0.380133 | 0.142857 | 0.196774 |
| GOOGL | -0.751548 | 0.142857 | 0.011382 |
| TSLA | 0.101559 | 0.142857 | 0.09017 |
| XOM | 0.145257 | 0.142857 | 0.086393 |


In [14]:
wts_scaled = wts.copy()
wts_scaled *= (retsx_IS.mean()@wts_scaled)

wts_scaled

|  | tangency | equal weights | regularized |
|---|---|---|---|
| AAPL | 0.002818 | 0.000806 | 0.001486 |
| MSFT | 0.009739 | 0.000806 | 0.002071 |
| AMZN | -0.002351 | 0.000806 | 0.000295 |
| NVDA | 0.00345 | 0.000806 | 0.001232 |
| GOOGL | -0.00682 | 0.000806 | 7.1e-05 |
| TSLA | 0.000922 | 0.000806 | 0.000565 |
| XOM | 0.001318 | 0.000806 | 0.000541 |


In [15]:
performance_summary(retsx_IS @ wts, 52)

|  | Mean | Volatility | Sharpe Ratio | Skewness | Excess Kurtosis | VaR (0.05) | CVaR (0.05) | Min | Max | Max Drawdown | Peak | Bottom | Recovery |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| tangency | 0.471913 | 0.331179 | 1.424947 | -0.120732 | 3.033074 | -0.055151 | -0.092073 | -0.231313 | 0.195057 | -0.373375 | 2020-02-14 | 2020-03-20 | 2020-06-05 |
| equal weights | 0.293432 | 0.260804 | 1.125106 | -0.297489 | 1.712386 | -0.052046 | -0.075865 | -0.153175 | 0.112469 | -0.353207 | 2020-02-14 | 2020-03-20 | 2020-06-05 |
| regularized | 0.325593 | 0.260874 | 1.248085 | -0.366472 | 1.902271 | -0.049572 | -0.076368 | -0.161853 | 0.109635 | -0.342349 | 2020-02-14 | 2020-03-20 | 2020-06-05 |


In [16]:
performance_summary(retsx_IS @ wts_scaled, 52)

|  | Mean | Volatility | Sharpe Ratio | Skewness | Excess Kurtosis | VaR (0.05) | CVaR (0.05) | Min | Max | Max Drawdown | Peak | Bottom | Recovery |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| tangency | 0.004283 | 0.003006 | 1.424947 | -0.120732 | 3.033074 | -0.000501 | -0.000836 | -0.002099 | 0.00177 | -0.003842 | 2020-02-14 | 2020-03-20 | 2020-05-08 |
| equal weights | 0.001656 | 0.001472 | 1.125106 | -0.297489 | 1.712386 | -0.000294 | -0.000428 | -0.000864 | 0.000635 | -0.002299 | 2020-02-14 | 2020-03-20 | 2020-06-05 |
| regularized | 0.002039 | 0.001633 | 1.248085 | -0.366472 | 1.902271 | -0.00031 | -0.000478 | -0.001013 | 0.000686 | -0.002449 | 2020-02-14 | 2020-03-20 | 2020-06-05 |


## 5.

Report the out-of-sample (2023) performance of all three portfolios in terms of annualized mean, vol, and Sharpe.

In [17]:
performance_summary(retsx_OOS @ wts, 52)

|  | Mean | Volatility | Sharpe Ratio | Skewness | Excess Kurtosis | VaR (0.05) | CVaR (0.05) | Min | Max | Max Drawdown | Peak | Bottom | Recovery |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| tangency | 1.204709 | 0.443716 | 2.715043 | -0.122475 | 0.000357 | -0.080687 | -0.102946 | -0.110876 | 0.135754 | -0.110876 | 2023-05-05 | 2023-05-12 | 2023-05-26 |
| equal weights | 0.955133 | 0.246953 | 3.867668 | -0.181519 | 1.026404 | -0.03576 | -0.056245 | -0.069187 | 0.094587 | -0.069187 | 2023-03-03 | 2023-03-10 | 2023-03-31 |
| regularized | 1.013509 | 0.250254 | 4.049917 | -0.160608 | 0.400467 | -0.041306 | -0.05573 | -0.060332 | 0.088543 | -0.060332 | 2023-03-03 | 2023-03-10 | 2023-03-24 |


In [18]:
performance_summary(retsx_OOS @ wts_scaled, 52)

|  | Mean | Volatility | Sharpe Ratio | Skewness | Excess Kurtosis | VaR (0.05) | CVaR (0.05) | Min | Max | Max Drawdown | Peak | Bottom | Recovery |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| tangency | 0.010933 | 0.004027 | 2.715043 | -0.122475 | 0.000357 | -0.000732 | -0.000934 | -0.001006 | 0.001232 | -0.001006 | 2023-05-05 | 2023-05-12 | 2023-05-26 |
| equal weights | 0.00539 | 0.001394 | 3.867668 | -0.181519 | 1.026404 | -0.000202 | -0.000317 | -0.00039 | 0.000534 | -0.00039 | 2023-03-03 | 2023-03-10 | 2023-03-31 |
| regularized | 0.006346 | 0.001567 | 4.049917 | -0.160608 | 0.400467 | -0.000259 | -0.000349 | -0.000378 | 0.000554 | -0.000378 | 2023-03-03 | 2023-03-10 | 2023-03-24 |


## 6.

Imagine just for this problem that this data is for **total** returns, not excess returns.

Report the weights of the global-minimum-variance portfolio.

In [19]:
gmv_weights(retsx.reset_index())

|  | GMV Weights |
|---|---|
| AAPL | 0.206231 |
| MSFT | 0.49125 |
| AMZN | 0.160866 |
| NVDA | -0.119168 |
| GOOGL | 0.011378 |
| TSLA | -0.046927 |
| XOM | 0.296369 |


## 7.

To target a mean return of 0.005%, would you be long or short this global minimum variance portfolio?

In [20]:
w_tan_summary_statistics[['Mean']]

|  | Mean |
|---|---|
| Tangent Weights | 0.563474 |


<font color='red'>
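The target portfolio is a combination of the tangency and GMV portfolios, with weight $\delta$ on the tangency portfolio and $1-\delta$ on the GMV portfolio. Provided the tangency mean exceeds the GMV mean, a target mean above the tangency mean requires shorting the GMV, while a target mean below the tangency mean requires a long position in the GMV. Here the target of 0.005% is far below the tangency mean, so we are long the GMV.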
</font>
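The long/short decision follows from the two-fund split used in `mv_portfolio`. The sketch below uses a hypothetical helper `two_fund_split`; the GMV mean is an assumed value (it is not computed from the exam data), and treating the 0.005% target as a weekly figure is also an assumption.

```python
def two_fund_split(target_mean, mu_tan, mu_gmv):
    """Return (delta, 1 - delta): the weights on the tangency and GMV portfolios."""
    delta = (target_mean - mu_gmv) / (mu_tan - mu_gmv)
    return delta, 1 - delta

# Hypothetical weekly means, for illustration only
mu_tan_weekly = 0.0108    # roughly the annualized tangency mean of 0.563 divided by 52
mu_gmv_weekly = 0.0040    # assumed value, not computed from the exam data
target_weekly = 0.00005   # 0.005%, assumed to be a weekly target

delta, gmv_share = two_fund_split(target_weekly, mu_tan_weekly, mu_gmv_weekly)
print(f"weight on tangency: {delta:.3f}, weight on GMV: {gmv_share:.3f}")
```

Under these assumed inputs the weight on the GMV portfolio is positive (and above one), which is the long position described in the answer.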

***

# 3. Performance

## 1. 

Report the following performance metrics of excess returns for Tesla (`TSLA`).
* skewness
* kurtosis

You are not annualizing any of these stats.

What do these metrics indicate about the nature of the returns?

In [21]:
retsx['TSLA'].agg(['skew', 'kurtosis'])

skew        0.441455
kurtosis    1.527376
Name: TSLA, dtype: float64

<font color='red'>

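- The returns are positively skewed. This indicates a longer right tail: large positive returns are bigger or more frequent than large negative returns.
- The returns are leptokurtic (positive excess kurtosis), meaning the distribution has heavier tails than a normal distribution. Outsized returns are therefore more frequent than a normal distribution would imply.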

</font>
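When reading the kurtosis number, it helps to know that pandas reports *excess* kurtosis (Fisher's definition), so a normal distribution scores about zero rather than three. A quick check on simulated data (hypothetical samples, not the exam returns):

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(42)

# Simulated samples: one normal, one fat-tailed (Student-t with 5 degrees of freedom)
normal_sample = pd.Series(rng.normal(size=100_000))
fat_tail_sample = pd.Series(rng.standard_t(df=5, size=100_000))

# pandas uses Fisher's definition, so the normal benchmark is 0, not 3
print("normal:   ", normal_sample.kurtosis())
print("Student-t:", fat_tail_sample.kurtosis())
```

The `TSLA` value of about 1.53 should therefore be compared with a benchmark of zero.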

## 2. 

Report the maximum drawdown for `TSLA` over the sample.
* Ignore that your data is in excess returns rather than total returns.
* Simply proceed with the excess return data for this calculation.

In [22]:
wealth = (1 + retsx).cumprod()
wealth_max = wealth.cummax()
drawdown = wealth / wealth_max - 1
max_drawdown = drawdown.min()
max_drawdown['TSLA']

-0.6821852296331565

## 3.

For `TSLA`, calculate the following metrics, relative to `SPY`:
* market beta
* alpha
* sortino ratio

Annualize alpha and sortino ratio.

In [23]:
model = sm.OLS(retsx['TSLA'], sm.add_constant(spy), missing = 'drop').fit()
summary = model.params.to_frame('Summary').T
summary.columns = ['Alpha', 'Beta']
summary['Sortino Ratio'] = retsx['TSLA'].mean() / retsx['TSLA'][retsx['TSLA'] < 0].std() * np.sqrt(52)
summary['Information Ratio'] = summary['Alpha'] / model.resid.std() * np.sqrt(52)
summary['Treynor Ratio'] = retsx['TSLA'].mean() / summary['Beta'] * 52
summary['Alpha'] = summary['Alpha'] * 52
summary

|  | Alpha | Beta | Sortino Ratio | Information Ratio | Treynor Ratio |
|---|---|---|---|---|---|
| Summary | 0.30947 | 1.776825 | 1.642329 | 0.596055 | 0.320644 |


## 4.

Continuing with `TSLA`, calculate the full-sample, 5th-percentile CVaR.
* Use the `normal` formula, assuming mean returns are zero.
* Use the full-sample volatility.

Use the entire sample to calculate a single CVaR number. 

In [24]:
cvar = -stats.norm.pdf(1.65) / 0.05 * retsx['TSLA'].std()
cvar

-0.17217191106804958
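For reference, the calculation above is the closed form for a mean-zero normal, $\text{CVaR}_q = -\sigma\,\phi(z_q)/q$, with $z_{0.05}$ rounded to $-1.65$. The sketch below checks that closed form against a Monte Carlo estimate; the volatility is a hypothetical value, not the `TSLA` estimate.

```python
import numpy as np
from scipy import stats

q = 0.05
sigma = 0.08                      # hypothetical weekly volatility
z_q = stats.norm.ppf(q)           # approximately -1.645

# Closed form for a mean-zero normal: CVaR_q = -sigma * phi(z_q) / q
cvar_closed = -sigma * stats.norm.pdf(z_q) / q

# Monte Carlo check: average of simulated returns at or below the q-quantile
rng = np.random.default_rng(0)
draws = rng.normal(0.0, sigma, size=1_000_000)
cvar_sim = draws[draws <= np.quantile(draws, q)].mean()

print(f"closed form: {cvar_closed:.4f}, simulated: {cvar_sim:.4f}")
```

Using `stats.norm.ppf(0.05)` instead of the rounded 1.65 changes the answer only slightly.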

## 5.

Now calculate the 5th-percentile, one-period ahead, **VaR** for `TSLA`.

Here, calculate the running series of VaR estimates.

Again, 
* use the normal formula, with mean zero.

But now, use the rolling volatility, based on 
* rolling window of $m=52$ weeks.

Report the final 5 values of your calculated VaR series.

In [25]:
var = -1.65 * retsx['TSLA'].rolling(52).std().shift(1).dropna()
var.tail()

date
2023-06-16   -0.157886
2023-06-23   -0.157712
2023-06-30   -0.155459
2023-07-07   -0.153663
2023-07-14   -0.152046
Name: TSLA, dtype: float64

In [26]:
rol_vol_tsla = np.sqrt((retsx['TSLA']**2).rolling(52).mean().shift())
VaR_rol_tsla = -1.65 * rol_vol_tsla.dropna()
VaR_rol_tsla.tail(5)

date
2023-06-16   -0.156624
2023-06-23   -0.156740
2023-06-30   -0.154200
2023-07-07   -0.152694
2023-07-14   -0.150967
Name: TSLA, dtype: float64

## 6. 

Calculate the out-of-sample **hit ratio** for your VaR series reported in your previous answer.

In [27]:
(retsx['TSLA'].loc[var.index] < var).mean()

0.05588235294117647

In [28]:
(retsx['TSLA'].loc[VaR_rol_tsla.index] < VaR_rol_tsla).mean()

0.05588235294117647
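As a sanity check on the method, the same rolling VaR and hit-ratio calculation can be run on simulated data where the answer is roughly known. This is a sketch on hypothetical i.i.d. normal returns, not the exam data; for such data the hit ratio should land near the 5% target.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(7)
r = pd.Series(rng.normal(0.0, 0.05, size=2000))   # hypothetical i.i.d. weekly returns

window = 52
# Mean-zero volatility from the previous 52 observations only (shifted by one period)
rolling_vol = np.sqrt((r ** 2).rolling(window).mean().shift(1))
var_series = (-1.65 * rolling_vol).dropna()

# Hit ratio: share of periods in which the realized return falls below the forecast VaR
hits = r.loc[var_series.index] < var_series
hit_ratio = hits.mean()
print(f"hit ratio: {hit_ratio:.4f}")
```

The `shift(1)` is what makes the comparison out-of-sample: each VaR uses only returns observed before the period it is compared against.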

***

# 4. Hedging

## 1. 

Consider the following scenario: you are holding a \$100 million long position in `NVDA`. You wish to hedge the position using some combination of 
* `AAPL`
* `AMZN`
* `GOOGL`
* `MSFT`

Report the positions you would hold of those 4 securities for an optimal hedge.

Note:
* In the regression estimation, include an intercept.
* Use the full-sample regression. No need to worry about in-sample versus out-of-sample.

In [29]:
regr = ta.calc_multivariate_regression(retsx['NVDA'], retsx[['AAPL', 'AMZN', 'GOOGL', 'MSFT']], intercept=True, adj=52)

betas = regr.loc[:, regr.columns.str.contains('Beta')].T

exposure = betas * -100_000_000
exposure.index = exposure.index.str.replace('Beta', '')
exposure.loc['Total'] = exposure.sum()
exposure.style.format('${:,.0f}')

|  | NVDA |
|---|---|
| AAPL | \$-34,168,649 |
| AMZN | \$-41,725,986 |
| GOOGL | \$784,795 |
| MSFT | \$-58,789,673 |
| Total | \$-133,899,513 |


## 2.

How well does the hedge do? Cite a regression statistic to support your answer.

Also estimate the volatility of the basis (epsilon).

In [30]:
regr[['R-Squared', 'Tracking Error']].T

|  | NVDA |
|---|---|
| R-Squared | 0.458168 |
| Tracking Error | 0.344562 |


<font color='red'>

Not particularly well. The R-Squared is only ~46\%, meaning that we can only hedge 46\% of the variance of NVDA using the other tech stocks. Additionally, the (annualized) tracking error is quite high, at ~34\%. This means that the hedge is not very precise, and we would expect to see large deviations between the hedge and the actual NVDA returns.

There is also an additional problem, which is that we are hedging a \$100m long position by being net *short* \$134m, which is not particularly practical for most investors.

</font>

In [31]:
# Tracking error, not annualized
regr['Tracking Error'] /np.sqrt(52)

NVDA    0.047782
Name: Tracking Error, dtype: float64

## 3.

Report the annualized intercept. By including this intercept, what are you assuming about the nature of the returns of `NVDA` as well as the returns of the hedging instruments?

In [32]:
regr[['Alpha']]

|  | Alpha |
|---|---|
| NVDA | 0.273752 |


<font color='red'>

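The annualized intercept is around 27\%. This means that NVDA's mean excess return exceeds what its exposure to the other tech stocks explains by about 27 percentage points per year, so the hedged position would have kept a large mean return in sample. Note that the intercept measures this unexplained mean, not the quality of the hedge; hedge quality is measured by the R-squared and the basis volatility.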

By including the intercept, we are assuming that the sample averages are not good predictors of the future averages. Thus we are allowing an intercept in the hedging regression, to ensure differences in mean returns do not impact the betas, which are the hedge recommendations.

If we really believed these sample averages are predictive, we would want the hedge ratios to account for that, and thus exclude an intercept, forcing these averages to impact the betas.

</font>