# Homework 2

## FINM 36700 - 2023

### UChicago Financial Mathematics

* Mark Hendricks
* hendricks@uchicago.edu

## HBS Case

### *ProShares Hedge Replication ETF*

In [16]:
import pandas as pd 
import numpy as np 
import seaborn as sns

pd.options.display.float_format = "{:,.4f}".format

import warnings
warnings.filterwarnings("ignore")

In [None]:
# Helper Functions 

def mvo_performance_stats(asset_returns,cov_matrix,port_weights, port_type,period):
    """ 
        Returns the annualized performance stats for given asset returns, portfolio weights and covariance matrix
        Inputs: 
            asset_returns - Mean excess return over the risk-free rate for each asset (n x 1) Vector
            cov_matrix - n x n covariance matrix for the assets
            port_weights - Weights of the assets in the portfolio (1 x n) Vector
            port_type - Label for the portfolio | Eg - Tangency or Mean-Variance Portfolio
            period - Number of return periods per year (12 for monthly data)
    """
    
    ret = np.dot(port_weights,asset_returns)*period
    vol = np.sqrt(port_weights @ cov_matrix @ port_weights.T)*np.sqrt(period)
    sharpe = ret/vol

    stats = pd.DataFrame([[ret,vol,sharpe]],columns= ["Annualized Return","Annualized Volatility","Annualized Sharpe Ratio"], index = [port_type])
    return stats

In [20]:
def summary_statistics(data, period):
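    """ 
        Returns annualized summary stats for a set of return series
        Inputs: 
            data - DataFrame of returns with a Date index, one column per series.
            period - Number of return periods per year (12 for monthly data).
        Output:
            summary_stats - DataFrame with the annualized mean, volatility, and Sharpe ratio of each series.
    """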
    summary_stats = data.mean().to_frame('Annualized Mean').apply(lambda x: x*period)
    summary_stats['Annualized Volatility'] = data.std().apply(lambda x: x*np.sqrt(period))
    summary_stats['Annualized Sharpe Ratio'] = summary_stats['Annualized Mean']/summary_stats['Annualized Volatility']
   
    return summary_stats

In [35]:
def tail_statistics_summary(data):
    """ 
        Returns tail-risk stats for a set of return series
        Inputs: 
            data - DataFrame of returns with a Date index, one column per series.
        Output:
            tail_summary_stats - DataFrame with Skewness, Excess Kurtosis, VaR (0.05), CVaR (0.05), Max Drawdown
    """

    tail_summary_stats = data.skew().to_frame('Skewness')
    tail_summary_stats['Excess Kurtosis'] = data.kurtosis()
    tail_summary_stats['VaR (0.05)'] = data.quantile(.05, axis = 0)
    tail_summary_stats['CVaR (0.05)'] = data[data <= data.quantile(.05, axis = 0)].mean()

    wealth_index = 1000*(1+data).cumprod()
    previous_peaks = wealth_index.cummax()
    drawdowns = (wealth_index - previous_peaks)/previous_peaks

    tail_summary_stats['Max Drawdown'] = drawdowns.min()
 
    return tail_summary_stats


***

# 1. The ProShares ETF Product

**Section 1 is not graded**, and you do not need to submit your answers. But you are encouraged to think about them, and we will discuss them.

## 1. Alternative ETFs

Describe the two types of investments referenced by this term.

## 2. Hedge Funds

#### a. Using just the information in the case, what are two measures by which hedge funds are an attractive investment?

#### b. What are the main benefits of investing in hedge funds via an ETF instead of directly?

## 3. The Benchmarks

#### a. Explain as simply as possible how HFRI, MLFM, MLFM-ES, and HDG differ in their construction and purpose.

#### b. How well does the Merrill Lynch Factor Model (MLFM) track the HFRI?

#### c. In which factor does the MLFM have the largest loading? (See a slide in Exhibit 1.)

#### d. What are the main concerns you have for how the MLFM attempts to replicate the HFRI?

## 4. The HDG Product

#### a. What does the ProShares ETF, HDG, attempt to track? Is the tracking error small?

#### b. HDG is, by construction, delivering beta for investors. Isn't the point of hedge funds to generate alpha? Then why would HDG be valuable?

#### c. The fees of a typical hedge fund are 2% on total assets plus 20% of excess returns if positive. HDG's expense ratio is roughly 1% on total assets. What would their respective net Sharpe Ratios be, assuming both have gross excess returns of 10% and volatility of 20%?
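One way to set up the fee arithmetic is sketched below. It rests on simplifying assumptions that are not stated in the question: fees reduce the mean excess return but leave volatility at its gross level, the 10% is treated as a realized return, and the 20% incentive fee is charged on the gross excess return. Other conventions, such as charging the incentive fee after the management fee, give slightly different numbers.

```python
# Assumption: fees lower the mean return only; volatility stays at the gross level.
gross_excess = 0.10
vol = 0.20


def net_sharpe(gross, vol, mgmt_fee, incentive_fee=0.0):
    """Net Sharpe ratio after a management fee on assets and an
    incentive fee on the gross excess return, charged only if positive."""
    incentive = incentive_fee * max(gross, 0.0)
    return (gross - mgmt_fee - incentive) / vol


hedge_fund_sharpe = net_sharpe(gross_excess, vol, mgmt_fee=0.02, incentive_fee=0.20)
hdg_sharpe = net_sharpe(gross_excess, vol, mgmt_fee=0.01)

print(f"Hedge fund net Sharpe: {hedge_fund_sharpe:.2f}")  # (0.10 - 0.02 - 0.02) / 0.20
print(f"HDG net Sharpe:        {hdg_sharpe:.2f}")         # (0.10 - 0.01) / 0.20
```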

***

# 2.  Analyzing the Data

Use the data found on Canvas, in <b>'proshares analysis data.xlsx'</b>. 

It has monthly data on financial indexes and ETFs from `Aug 2011` through `Aug 2023`.

In [17]:
df = pd.read_excel('../data/proshares_analysis_data.xlsx', sheet_name='hedge_fund_series').rename(columns={'Unnamed: 0': 'date'}).set_index('date')
df.head(5)

| date | HFRIFWI Index | MLEIFCTR Index | MLEIFCTX Index | HDG US Equity | QAI US Equity |
|---|---|---|---|---|---|
| 2011-08-31 | -0.0321 | -0.0256 | -0.0257 | -0.027 | -0.0065 |
| 2011-09-30 | -0.0389 | -0.0324 | -0.0326 | -0.0325 | -0.0221 |
| 2011-10-31 | 0.0269 | 0.0436 | 0.0433 | 0.0505 | 0.0252 |
| 2011-11-30 | -0.0135 | -0.0121 | -0.0124 | -0.0286 | -0.008 |
| 2011-12-31 | -0.0045 | 0.0019 | 0.0018 | 0.0129 | 0.0018 |


## 1. 

For the series in the "hedge fund series" tab, report the following summary statistics:
* mean
* volatility
* Sharpe ratio

Annualize these statistics.

In [33]:
summary_stats = summary_statistics(df, 12)  # monthly data: 12 periods per year
summary_stats

| | Annualized Mean | Annualized Volatility | Annualized Sharpe Ratio |
|---|---|---|---|
| HFRIFWI Index | 0.0432 | 0.0602 | 0.7177 |
| MLEIFCTR Index | 0.0319 | 0.0570 | 0.5597 |
| MLEIFCTX Index | 0.0304 | 0.0568 | 0.5351 |
| HDG US Equity | 0.0205 | 0.0594 | 0.3455 |
| QAI US Equity | 0.0196 | 0.0501 | 0.3917 |


## 2.

For the series in the "hedge fund series" tab, calculate the following statistics related to tail-risk.
* Skewness
* Excess Kurtosis (in excess of 3)
* VaR (.05) - the 5th percentile (0.05 quantile) of historic returns
* CVaR (.05) - the mean of the returns at or below the fifth quantile
* Maximum drawdown - include the dates of the max/min/recovery within the max drawdown period.

There is no need to annualize any of these statistics.

In [36]:
tail_stats = tail_statistics_summary(df)
tail_stats

| | Skewness | Excess Kurtosis | VaR (0.05) | CVaR (0.05) | Max Drawdown |
|---|---|---|---|---|---|
| HFRIFWI Index | -0.9832 | 5.9183 | -0.0251 | -0.0375 | -0.1155 |
| MLEIFCTR Index | -0.2558 | 1.6643 | -0.0287 | -0.0359 | -0.1243 |
| MLEIFCTX Index | -0.2418 | 1.6316 | -0.0289 | -0.0358 | -0.1244 |
| HDG US Equity | -0.244 | 1.7801 | -0.0312 | -0.0376 | -0.1407 |
| QAI US Equity | -0.4584 | 1.7376 | -0.0201 | -0.0327 | -0.1377 |
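The question also asks for the dates of the peak, trough, and recovery around the maximum drawdown, which the helper above does not report. A minimal sketch of one way to get them follows. The function name `max_drawdown_dates` and the toy series are illustrative and not part of the assignment, and the sketch assumes the series starts at a peak (a loss in the very first month would be measured from an initial wealth that has no date in the index).

```python
import pandas as pd


def max_drawdown_dates(returns: pd.Series) -> pd.Series:
    """Max drawdown of one return series with its peak, trough, and recovery dates."""
    wealth = (1 + returns).cumprod()
    running_peak = wealth.cummax()
    drawdown = wealth / running_peak - 1

    trough = drawdown.idxmin()
    # The peak is the highest wealth level reached on or before the trough.
    peak = wealth.loc[:trough].idxmax()
    # Recovery is the first date after the trough with wealth back at the prior peak.
    after_trough = wealth.loc[trough:]
    recovered = after_trough[after_trough >= wealth.loc[peak]]
    recovery = recovered.index[0] if len(recovered) else pd.NaT

    return pd.Series({
        "Max Drawdown": drawdown.min(),
        "Peak": peak,
        "Trough": trough,
        "Recovery": recovery,
    })


# Toy monthly returns, for illustration only (not the homework data).
dates = pd.to_datetime(["2020-01-31", "2020-02-29", "2020-03-31",
                        "2020-04-30", "2020-05-31", "2020-06-30"])
toy = pd.Series([0.10, -0.20, -0.10, 0.15, 0.30, 0.05], index=dates)
result = max_drawdown_dates(toy)
print(result)
```

On the homework data, something like `df.apply(max_drawdown_dates).T` should give one row per series, assuming `df` is the return DataFrame loaded earlier.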


## 3. 

For the series in the "hedge fund series" tab, run a regression of each against SPY (found in the "merrill factors" tab.) Include an intercept. Report the following regression-based statistics:
* Market Beta
* Treynor Ratio
* Information ratio

Annualize these three statistics as appropriate.

Load the SPY returns from the "merrill factors" tab and join them to the hedge-fund series.

In [47]:
spy_data = pd.read_excel('../data/proshares_analysis_data.xlsx', index_col=0, usecols='A,B', sheet_name='merrill_factors')
df.join(spy_data).head(5)

| date | HFRIFWI Index | MLEIFCTR Index | MLEIFCTX Index | HDG US Equity | QAI US Equity | SPY US Equity |
|---|---|---|---|---|---|---|
| 2011-08-31 | -0.0321 | -0.0256 | -0.0257 | -0.027 | -0.0065 | -0.055 |
| 2011-09-30 | -0.0389 | -0.0324 | -0.0326 | -0.0325 | -0.0221 | -0.0694 |
| 2011-10-31 | 0.0269 | 0.0436 | 0.0433 | 0.0505 | 0.0252 | 0.1091 |
| 2011-11-30 | -0.0135 | -0.0121 | -0.0124 | -0.0286 | -0.008 | -0.0041 |
| 2011-12-31 | -0.0045 | 0.0019 | 0.0018 | 0.0129 | 0.0018 | 0.0104 |


Regression
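A minimal sketch of the three regression-based statistics follows. It uses `numpy.linalg.lstsq` rather than a regression package, and it assumes the returns are already excess returns, so the Treynor ratio is the annualized mean divided by beta and the information ratio is alpha divided by residual volatility, scaled by the square root of the number of periods. The function name `market_regression_stats` and the synthetic data are illustrative only.

```python
import numpy as np
import pandas as pd


def market_regression_stats(y, x, period=12):
    """Regress y on a constant and x; return beta, Treynor ratio, and information ratio."""
    y_arr = np.asarray(y, dtype=float)
    x_arr = np.asarray(x, dtype=float)
    X = np.column_stack([np.ones(len(x_arr)), x_arr])
    alpha, beta = np.linalg.lstsq(X, y_arr, rcond=None)[0]
    resid = y_arr - (alpha + beta * x_arr)
    return pd.Series({
        "Market Beta": beta,                            # not annualized
        "Treynor Ratio": period * y_arr.mean() / beta,  # annualized mean per unit of beta
        # Sample standard deviation of the residuals (ddof=1) is a choice, not a requirement.
        "Information Ratio": np.sqrt(period) * alpha / resid.std(ddof=1),
    })


# Synthetic monthly returns, for illustration only (not the homework data).
rng = np.random.default_rng(0)
spy = rng.normal(0.01, 0.04, 144)
fund = 0.001 + 0.4 * spy + rng.normal(0, 0.01, 144)
stats = market_regression_stats(fund, spy)
print(stats)
```

On the homework data, `df.apply(lambda col: market_regression_stats(col, spy_data['SPY US Equity'])).T` should produce one row per series, assuming the two frames share the same dates.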

## 4. 

Discuss the previous statistics, and what they tell us about...

* the differences between SPY and the hedge-fund series.
* which performs better between HDG and QAI.
* whether HDG and the ML series capture the most notable properties of HFRI.

## 5. 

Report the correlation matrix for these assets.
* Show the correlations as a heat map.
* Which series have the highest and lowest correlations?
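The sketch below shows one way to find the highest and lowest pairwise correlations and to draw the heat map. The helper names and toy data are illustrative. `seaborn` is imported inside the plotting helper only so that the pair search runs without a plotting backend.

```python
import numpy as np
import pandas as pd


def extreme_correlation_pairs(returns: pd.DataFrame):
    """Return the correlation matrix plus the most and least correlated pairs."""
    corr = returns.corr()
    # Keep the strict upper triangle so each pair appears once and the diagonal is dropped.
    mask = np.triu(np.ones(corr.shape, dtype=bool), k=1)
    pairs = corr.where(mask).stack().dropna()
    return corr, pairs.idxmax(), pairs.idxmin()


def plot_correlation_heatmap(corr: pd.DataFrame):
    """Draw an annotated heat map of a correlation matrix."""
    import seaborn as sns
    return sns.heatmap(corr, annot=True, fmt=".2f", cmap="coolwarm", vmin=-1, vmax=1)


# Toy data, for illustration only: B nearly equals A, and C is the mirror image of A.
a = np.array([0.01, 0.02, -0.01, 0.03, 0.00])
toy = pd.DataFrame({
    "A": a,
    "B": a + np.array([0.001, -0.001, -0.002, 0.001, 0.001]),
    "C": -a,
})
corr, highest, lowest = extreme_correlation_pairs(toy)
print("Highest:", highest, "Lowest:", lowest)
```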

## 6.

Replicate HFRI with the six factors listed on the "merrill factors" tab. Include a constant, and run the unrestricted regression,

$\newcommand{\hfri}{\text{hfri}}$
$\newcommand{\merr}{\text{merr}}$

$$\begin{align}
r^{\hfri}_{t} &= \alpha^{\merr} + x_{t}^{\merr}\beta^{\merr} + \epsilon_{t}^{\merr}\\[5pt]
\hat{r}^{\hfri}_{t} &= \hat{\alpha}^{\merr} + x_{t}^{\merr}\hat{\beta}^{\merr}
\end{align}$$

Note that the second equation is just our notation for the fitted replication.

#### a. Report the intercept and betas.
#### b. Are the betas realistic position sizes, or do they require huge long-short positions?
#### c. Report the R-squared.
#### d. Report the volatility of $\epsilon^{\merr}$, the tracking error.
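A minimal sketch of the full-sample replication regression follows, again using `numpy.linalg.lstsq` as a stand-in for whichever regression package you prefer. The function name `replication_regression`, the factor names, and the synthetic data are hypothetical. The tracking error is reported at the monthly frequency.

```python
import numpy as np
import pandas as pd


def replication_regression(y: pd.Series, X: pd.DataFrame) -> dict:
    """OLS of the target y on a constant and the factor columns of X."""
    A = np.column_stack([np.ones(len(X)), X.to_numpy()])
    coef = np.linalg.lstsq(A, y.to_numpy(), rcond=None)[0]
    resid = y.to_numpy() - A @ coef
    r_squared = 1 - (resid ** 2).sum() / ((y - y.mean()) ** 2).sum()
    return {
        "params": pd.Series(coef, index=["alpha"] + list(X.columns)),
        "r_squared": r_squared,
        "tracking_error": resid.std(ddof=1),  # monthly; scale by sqrt(12) to annualize
    }


# Synthetic factors and target, for illustration only (not the homework data).
rng = np.random.default_rng(1)
factors = pd.DataFrame(rng.normal(0.005, 0.03, size=(120, 3)), columns=["F1", "F2", "F3"])
true_beta = np.array([0.3, 0.2, -0.1])
target = 0.001 + factors @ true_beta + rng.normal(0, 0.002, 120)

result = replication_regression(target, factors)
print(result["params"])
print("R-squared:", result["r_squared"], "Tracking error:", result["tracking_error"])
```

Betas that sum to roughly one or less in absolute value correspond to position sizes that are feasible without large leverage, which is the check part (b) asks for.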

## 7.

Let's examine the replication out-of-sample (OOS).

Starting with month $t = 61$ of the sample, do the following:

* Use the previous 60 months of data to estimate the regression equation. 
This gives time-$t$ estimates of the regression parameters, $\tilde{\alpha}^{\merr}_{t}$ and $\tilde{\beta}^{\merr}_{t}$.

* Use the estimated regression parameters, along with the time-$t$ regressor values, $x^{\merr}_{t}$, to calculate the time-$t$ replication value. Because the parameters are estimated only on earlier data, this value is built "out-of-sample" (OOS) with respect to the regression estimate.

$$\hat{r}^{\hfri}_{t} \equiv \tilde{\alpha}^{\merr}_{t} + (x_{t}^{\merr})'\tilde{\beta}^{\merr}_{t}$$

* Step forward to $t = 62$, and now use $t = 2$ through $t = 61$ for the estimation. Re-run the steps above, and continue this process throughout the data series. Thus, we are running a rolling, 60-month regression for each point-in-time.

How well does the out-of-sample replication perform with respect to the target?
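The rolling procedure can be sketched as below. The function name `rolling_oos_replication` and the synthetic data are illustrative. With zero-based positions, position 60 is month $t = 61$, so each estimate uses the 60 observations strictly before the date being replicated.

```python
import numpy as np
import pandas as pd


def rolling_oos_replication(y: pd.Series, X: pd.DataFrame, window: int = 60) -> pd.Series:
    """Rolling-window OLS replication evaluated one step out-of-sample."""
    y_arr = y.to_numpy()
    A = np.column_stack([np.ones(len(X)), X.to_numpy()])
    oos = {}
    for t in range(window, len(y_arr)):
        # Estimate on the `window` observations strictly before t ...
        coef = np.linalg.lstsq(A[t - window:t], y_arr[t - window:t], rcond=None)[0]
        # ... then apply those estimates to the time-t regressors.
        oos[y.index[t]] = A[t] @ coef
    return pd.Series(oos, name="OOS replication")


# Synthetic factors and target, for illustration only (not the homework data).
rng = np.random.default_rng(3)
factors = pd.DataFrame(rng.normal(0.005, 0.03, size=(100, 3)), columns=["F1", "F2", "F3"])
true_beta = np.array([0.3, 0.2, -0.1])
target = 0.001 + factors @ true_beta + rng.normal(0, 0.002, 100)

oos = rolling_oos_replication(target, factors)
actual = target.loc[oos.index]
oos_corr = oos.corr(actual)
oos_r2 = 1 - ((actual - oos) ** 2).sum() / ((actual - actual.mean()) ** 2).sum()
print(f"OOS correlation: {oos_corr:.4f}, OOS R-squared: {oos_r2:.4f}")
```

The correlation and the out-of-sample R-squared against the realized target are two reasonable ways to judge the fit; the choice of metric here is an assumption, not something the assignment prescribes.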

## 8.

We estimated the replications using an intercept. Try the full-sample estimation, but this time without an intercept.

$$\begin{align}
r^{\hfri}_{t} &= x_{t}^{\merr}\beta^{\merr} + \epsilon_{t}^{\merr}\\[5pt]
\check{r}^{\hfri}_{t} &= x_{t}^{\merr}\check{\beta}^{\merr}
\end{align}$$

Report

* the regression beta. How does it compare to the estimated beta with an intercept, $\hat{\beta}^{\merr}$?

* the mean of the fitted value, $\check{r}^{\hfri}_{t}$. How does it compare to the mean of the HFRI?

* the correlation of the fitted values, $\check{r}^{\hfri}_{t}$, with the HFRI. How does it compare to that of the fitted values with an intercept, $\hat{r}^{\hfri}_{t}$?

Do you think Merrill and ProShares fit their replicators with an intercept or not?
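The comparison can be sketched on synthetic data as below (the data and the helper `ols_fit` are illustrative only). With an intercept, the fitted values match the mean of the target by construction. Without one, the betas must also absorb the mean return, so the fitted mean generally differs from the target mean.

```python
import numpy as np


def ols_fit(y, X, intercept):
    """Least-squares coefficients and fitted values, with or without a constant."""
    A = np.column_stack([np.ones(len(X)), X]) if intercept else np.asarray(X)
    coef = np.linalg.lstsq(A, y, rcond=None)[0]
    return coef, A @ coef


# Synthetic factors and target, for illustration only (not the homework data).
rng = np.random.default_rng(0)
X = rng.normal(0.005, 0.03, size=(120, 3))
y = 0.003 + X @ np.array([0.3, 0.2, -0.1]) + rng.normal(0, 0.004, 120)

coef_with, fit_with = ols_fit(y, X, intercept=True)
coef_without, fit_without = ols_fit(y, X, intercept=False)

corr_with = np.corrcoef(fit_with, y)[0, 1]
corr_without = np.corrcoef(fit_without, y)[0, 1]

print("Betas with intercept:   ", coef_with[1:])
print("Betas without intercept:", coef_without)
print("Target mean:", y.mean(), "Fitted means:", fit_with.mean(), fit_without.mean())
print("Correlations:", corr_with, corr_without)
```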

***

# 3.  Extensions
<i>This section is not graded, and you do not need to submit it. Still, we may discuss it in class, in which case, you would be expected to know it.</i>

## 1. 

Merrill constrains the weights of each asset in its replication regression of HFRI. Try constraining your weights by re-doing 2.6.

* Use Non-Negative Least Squares (NNLS) instead of OLS.
* Go further by using a Generalized Linear Model to put separate interval constraints on each beta, rather than simply constraining them to be non-negative.

#### Hints
* Try using LinearRegression in scikit-learn with the parameter `positive=True`. 
* Try using GLM in statsmodels.
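As an alternative to the two hints, the sketch below uses SciPy: `scipy.optimize.nnls` for the non-negative fit and `scipy.optimize.lsq_linear` for interval constraints on each beta. This swaps in bounded least squares in place of the GLM route, and the bounds and synthetic data are made up for illustration. Demeaning the target and the factors first is equivalent to fitting an unconstrained intercept, so only the betas are constrained.

```python
import numpy as np
from scipy.optimize import lsq_linear, nnls

# Synthetic factors and target, for illustration only (not the homework data).
rng = np.random.default_rng(2)
X = rng.normal(0.0, 0.03, size=(120, 3))
y = 0.001 + X @ np.array([0.5, -0.2, 0.1]) + rng.normal(0, 0.005, 120)

# Demean so the intercept is left free and recovered afterwards.
Xc = X - X.mean(axis=0)
yc = y - y.mean()

# Non-negative least squares: every beta is constrained to be >= 0.
beta_nnls, _ = nnls(Xc, yc)
alpha_nnls = y.mean() - X.mean(axis=0) @ beta_nnls

# Interval constraints: a separate (hypothetical) lower and upper bound per beta.
lower = np.array([0.0, -0.1, 0.0])
upper = np.array([0.4, 0.1, 1.0])
beta_bounded = lsq_linear(Xc, yc, bounds=(lower, upper)).x
alpha_bounded = y.mean() - X.mean(axis=0) @ beta_bounded

print("NNLS:   ", alpha_nnls, beta_nnls)
print("Bounded:", alpha_bounded, beta_bounded)
```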

## 2. 

Let's decompose a few other targets to see if they behave as their name suggests.

* Regress HEFA on the same style factors used to decompose HFRI. Does HEFA appear to be a currency-hedged version of EFA?

* Decompose TRVCI with the same style factors used to decompose HFRI. The TRVCI Index tracks venture capital funds--in terms of our styles, what best describes venture capital?

* TAIL is an ETF that tracks SPY, but that also buys put options to protect against market downturns. Calculate the statistics in questions 2.1-2.3 for TAIL. Does it seem to behave as indicated by this description? That is, does it have high correlation to SPY while delivering lower tail risk?

## 3. 

The ProShares case introduces Levered ETFs. ProShares made much of its name originally through levered, or "geared" ETFs.

Explain conceptually why Levered ETFs may track their index well for a given day but diverge over time. How is this exacerbated in volatile periods like 2008?
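A two-day numeric example makes the compounding effect concrete. The returns below are made up for illustration. A 3x fund that hits its target exactly on each day still ends far from three times the index's two-day return.

```python
import numpy as np


def cumulative_return(period_returns):
    """Compound a sequence of simple returns into a total return."""
    return np.prod(1 + np.asarray(period_returns)) - 1


index_returns = np.array([0.10, -0.10])  # up 10%, then down 10%
leverage = 3

index_total = cumulative_return(index_returns)               # (1.10)(0.90) - 1
levered_total = cumulative_return(leverage * index_returns)  # (1.30)(0.70) - 1
naive_total = leverage * index_total                         # 3x the index's total return

print(f"Index: {index_total:.2%}, 3x daily: {levered_total:.2%}, 3x of total: {naive_total:.2%}")
```

The gap between the compounded daily-levered return and the naive multiple grows with the size of the daily swings, which is why volatile periods widen the divergence.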

## 4.

Analyze SPXU and UPRO relative to SPY.
- SPXU is ProShares -3x SPX ETF.
- UPRO is ProShares +3x SPX ETF.

Questions:
* Analyze them with the statistics from 2.1-2.3. 

* Do these two ETFs seem to live up to their names?

* Plot the cumulative returns of both these ETFs along with SPY.

* What do you conclude about levered ETFs?

***