# Questions - Open Final Exam

## FINM 36700 - 2025

### UChicago Financial Mathematics

* Mark Hendricks
* hendricks@uchicago.edu

***


# Your Name

List your name and CNetID

* Name: Adith Srinivasan
* CNetID: adiths

## Citations

### AI

List any AI tools used in the exam. No need to list prompts, but rather just AI models or IDE integrations.

I expect most students will have something to list here.

* ChatGPT for code assistance

### Other resources

Please list any other resources **aside from course materials** that you used substantially. (No need to list every Google search; just materials you relied on substantially or used for specific, original content.)

I expect most students will not have anything to list here.

* 
* 
*

***

### Instructions

* For every minute late you submit the exam, you will lose one point.
* The exam is open-material, closed-communication.
* If you find any question to be unclear, state your interpretation and proceed. We will only answer questions of interpretation if there is a typo, error, etc.

### Answer Format

- **Conceptual answers**: Use **at most 2 sentences (≈40 words)** per conceptual prompt. Graders will only read the **first 2 sentences**.
- **Output format**: When asked for tables, return a small `DataFrame` with the **rows/columns exactly as specified**. When asked for a scalar, print it with a **clear label**.
- **Plots**: Plots/figures are **not required** and will not be graded; focus on numeric outputs and short text.
- **Code organization**: You may create additional code cells, but keep your final answers clearly associated with each numbered sub-problem.
- **Numeric answers**: Round all reported numbers to **6 decimal places** unless noted otherwise.
    - **NOTE**: The default `pandas` rounding is 6 decimal places, so if you display a DataFrame, it will likely already be rounded correctly.
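The note above concerns display precision rather than stored values. A minimal sketch of the difference, using a made-up statistic (`example_stat` is a hypothetical name, not an exam quantity):

```python
import pandas as pd

# Hypothetical value with more than 6 decimals.
s = pd.Series({"example_stat": 0.123456789})

# Display precision only changes how the value is printed ...
with pd.option_context("display.precision", 6):
    shown = repr(s)

# ... while .round(6) changes the stored value itself.
rounded = s.round(6)

print(shown)
print(f"Stored value after round(6): {rounded['example_stat']!r}")
```

Calling `.round(6)` before display, as the answer cells in this notebook do, keeps the reported numbers independent of the display setting.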


### Submission

#### Type

* You should submit a **single** Jupyter notebook (`.ipynb`) file containing all of your code and answers to Canvas.

* Note: If any other files are required to run your notebook, please include them **and only them** in a single `.zip` file.

#### Naming
Your submitted file (`.ipynb` or `.zip`) must be named in the format...
* `final-LASTNAME-FIRSTNAME.ipynb`
* `final-LASTNAME-FIRSTNAME.zip`


***

## Scoring

| Problem | Points |
|---------|--------|
| 1 | 40 |
| 2 | 45 |
| 3 | 15 |
| **Total** | **100** |

Numbered problems are worth **5 points**, unless specified otherwise.

## Data

**All data files are found at the course web-book.**

https://markhendricks.github.io/finm-portfolio/

The exam uses the following data files:
* `quality_factor_final_dataset.csv` - stock-level panel data with fundamentals
* `FFF.csv` - Fama-French factors (daily)

The stock data contains **daily** observations from `May 2017` to `October 2024` with:
* `price` - stock price
* `market_cap` - market capitalization (in millions)
* `debt_to_market_cap` - debt divided by market cap
* `return_on_investment` - return on investment (%)
* `price_to_earnings` - P/E ratio

The Fama-French data contains:
* `Mkt-RF` - market excess return (daily, in percentage points)
* `SMB` - small minus big factor
* `HML` - high minus low (value) factor
* `RF` - risk-free rate

**Annualization:** Use 252 trading days per year for daily data.
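As an illustration of this convention, the sketch below annualizes a made-up daily return series. The helper name `annualize` and the toy numbers are assumptions for illustration only; the mean scales by 252 and the volatility by $\sqrt{252}$.

```python
import numpy as np
import pandas as pd

TRADING_DAYS = 252  # convention stated above

def annualize(daily_returns: pd.Series) -> pd.Series:
    """Annualized mean, volatility, and Sharpe ratio from daily returns."""
    mean_ann = daily_returns.mean() * TRADING_DAYS
    vol_ann = daily_returns.std(ddof=1) * np.sqrt(TRADING_DAYS)
    return pd.Series({"mean": mean_ann, "vol": vol_ann, "sharpe": mean_ann / vol_ann})

# Made-up daily returns, for illustration only.
toy = pd.Series([0.001, -0.002, 0.003, 0.000, 0.002])
stats_toy = annualize(toy)
print(stats_toy.round(6))
```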


***


In [1]:
import os
import sys

# From root/notebooks -> go one level up to root
ROOT = os.path.abspath(os.path.join(os.getcwd(), ".."))
if ROOT not in sys.path:
    sys.path.insert(0, ROOT)

ROOT

'/Users/adithsrinivasan/Documents/GitHub/finm-portfolio-2025'

In [2]:
# Load packages
import warnings
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
import statsmodels.api as sm
from scipy.stats import norm, skew, kurtosis

warnings.filterwarnings("ignore")
pd.set_option('display.precision', 6)

ANN_FACTOR = 252  # daily data annualization


In [3]:
import cmds.portfolio_management_helper as pmh

In [4]:
# Load data
DATA_PATH = '/data/quality_factor_final_dataset.csv'
FF_PATH = '/data/FFF.csv'

df = pd.read_csv(ROOT+DATA_PATH).drop(columns=['Unnamed: 0'])
df['date'] = pd.to_datetime(df['date'])
display(df.tail())

FF = pd.read_csv(ROOT+FF_PATH).drop(columns=['Unnamed: 0'])
FF['date'] = pd.to_datetime(FF['date'])
FF.set_index('date', inplace=True)
FF = FF / 100  # Convert from percentage points to decimals
display(FF.tail())

| | date | ticker | price | market_cap | debt_to_market_cap | return_on_investment | price_to_earnings |
|---|---|---|---|---|---|---|---|
| 360179 | 2024-06-24 | ZTS | 171.0136 | 78630.9894 | 1.28 | 5.0716 | 130.5447 |
| 360180 | 2024-06-25 | ZTS | 167.1721 | 76864.7207 | 1.3095 | 5.1789 | 127.6123 |
| 360181 | 2024-06-26 | ZTS | 170.0781 | 78200.8618 | 1.2871 | 5.0973 | 129.8306 |
| 360182 | 2024-06-27 | ZTS | 175.6114 | 80745.0209 | 1.2465 | 4.9488 | 134.0545 |
| 360183 | 2024-06-28 | ZTS | 172.5263 | 79326.5149 | 1.2688 | 5.0305 | 131.6994 |


| date | Mkt-RF | SMB | HML | RF |
|---|---|---|---|---|
| 2024-06-24 | -0.0025 | 0.0062 | 0.0109 | 0.0002 |
| 2024-06-25 | 0.0031 | -0.0082 | -0.0122 | 0.0002 |
| 2024-06-26 | 0.0016 | -0.0001 | -0.0019 | 0.0002 |
| 2024-06-27 | 0.0014 | 0.0053 | -0.0039 | 0.0002 |
| 2024-06-28 | -0.0035 | 0.01 | 0.0128 | 0.0002 |


***


# 1. Quality Factor Construction & Factor Pricing

In this problem, you will construct a "quality" factor from fundamental data and test its pricing implications using factor models from chapters 4-5.


### 1.1 Cross-Sectional Z-Scores

For each date $t$, compute cross-sectional z-scores for each of the three subfactors:
* `debt_to_market_cap`
* `return_on_investment`
* `price_to_earnings`

The z-score for stock $i$ on date $t$ for variable $x$ is:
$$z_{i,t}^x = \frac{x_{i,t} - \bar{x}_t}{\sigma_t^x}$$

where $\bar{x}_t$ and $\sigma_t^x$ are the cross-sectional mean and standard deviation on date $t$.

Report the z-scores for ticker `AAPL` for each subfactor on the **last 5 dates** in the sample. Round to 6 decimal places.

In [5]:
factors = ['debt_to_market_cap', 'return_on_investment', 'price_to_earnings']

for f in factors:
    df[f + '_z'] = df.groupby('date')[f].transform(
        lambda x: (x - x.mean()) / x.std(ddof=1)
    )

aapl_z = (
    df[df['ticker'] == 'AAPL']
    .sort_values('date')
    .tail(5)[['date'] + [f + '_z' for f in factors]]
)

aapl_z = aapl_z.round(6)
aapl_z

| | date | debt_to_market_cap_z | return_on_investment_z | price_to_earnings_z |
|---|---|---|---|---|
| 1795 | 2024-06-24 | 0.0184 | 2.2377 | -0.2204 |
| 1796 | 2024-06-25 | 0.0167 | 2.2075 | -0.2204 |
| 1797 | 2024-06-26 | 0.0106 | 2.1337 | -0.2204 |
| 1798 | 2024-06-27 | 0.0097 | 2.1058 | -0.2204 |
| 1799 | 2024-06-28 | 0.0116 | 2.1437 | -0.2187 |


### 1.2 Quality Score Construction

Create a composite quality score by:

1. Applying economic direction so that "good is high":
   - `debt_to_market_cap` → multiply by −1 (lower debt is better)
   - `return_on_investment` → multiply by +1 (higher ROI is better)
   - `price_to_earnings` → multiply by −1 (lower P/E is better, i.e., "value")

2. Average the three signed z-scores to create `quality_raw`.

Report `quality_raw` for ticker `AAPL` on the **last 5 dates**. Round to 6 decimal places.


In [6]:
df['debt_to_market_cap_z_signed']    = -df['debt_to_market_cap_z']
df['return_on_investment_z_signed']  =  df['return_on_investment_z']
df['price_to_earnings_z_signed']     = -df['price_to_earnings_z']

df['quality_raw'] = df[
    ['debt_to_market_cap_z_signed',
     'return_on_investment_z_signed',
     'price_to_earnings_z_signed']
].mean(axis=1)

aapl_quality = (
    df[df['ticker'] == 'AAPL']
    .sort_values('date')
    .tail(5)[['date', 'quality_raw']]
)

aapl_quality = aapl_quality.round(6)
aapl_quality


| | date | quality_raw |
|---|---|---|
| 1795 | 2024-06-24 | 0.8132 |
| 1796 | 2024-06-25 | 0.8037 |
| 1797 | 2024-06-26 | 0.7812 |
| 1798 | 2024-06-27 | 0.7721 |
| 1799 | 2024-06-28 | 0.7836 |


### 1.3 Size Neutralization

Quality scores may be correlated with size. To obtain a "pure" quality signal, for each date $t$, regress `quality_raw` on the log of market cap:

$$\text{quality\_raw}_{i,t} = \alpha_t + \beta_t \cdot \ln(\text{market\_cap}_{i,t}) + \varepsilon_{i,t}$$

The residual $\varepsilon_{i,t}$ is the size-neutral quality signal, `quality_pure`.

Report `quality_pure` for ticker `AAPL` on the **last 5 dates**. Round to 6 decimal places.


In [7]:
df['log_mcap'] = np.log(df['market_cap'])

def size_neutral_resid(group):
    X = sm.add_constant(group['log_mcap'])
    y = group['quality_raw']
    model = sm.OLS(y, X, missing='drop').fit()
    return y - model.predict(X)

df['quality_pure'] = df.groupby('date', group_keys=False).apply(size_neutral_resid)

aapl_quality_pure = (
    df[df['ticker'] == 'AAPL']
    .sort_values('date')
    .tail(5)[['date', 'quality_pure']]
)

aapl_quality_pure = aapl_quality_pure.round(6)
aapl_quality_pure

| | date | quality_pure |
|---|---|---|
| 1795 | 2024-06-24 | 0.7941 |
| 1796 | 2024-06-25 | 0.7893 |
| 1797 | 2024-06-26 | 0.769 |
| 1798 | 2024-06-27 | 0.764 |
| 1799 | 2024-06-28 | 0.7749 |
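The size-neutralization step in 1.3 relies on a standard OLS property: with an intercept in the regression, the residuals have zero mean and zero sample correlation with the regressor. A minimal check on a synthetic single-date cross-section (the data below are simulated, not the exam data, and the slope of 0.3 is an arbitrary assumption):

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic single-date cross-section (not the exam data).
log_mcap = rng.normal(10.0, 1.5, size=200)
quality_raw = 0.3 * log_mcap + rng.normal(0.0, 1.0, size=200)

# OLS of quality_raw on [1, log_mcap].
X = np.column_stack([np.ones_like(log_mcap), log_mcap])
coef, *_ = np.linalg.lstsq(X, quality_raw, rcond=None)
resid = quality_raw - X @ coef

print(f"mean of residuals:         {resid.mean():.2e}")
print(f"corr(residuals, log_mcap): {np.corrcoef(resid, log_mcap)[0, 1]:.2e}")
```

Both printed values should be zero up to floating-point error. One consequence, assuming no missing observations on a given date, is that the residual signal is already mean-zero in the cross-section, so re-standardizing it only rescales by its standard deviation.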


### 1.4 Final Quality Score

Re-standardize `quality_pure` cross-sectionally by date to obtain `quality_score`:

$$\text{quality\_score}_{i,t} = \frac{\text{quality\_pure}_{i,t} - \bar{\text{quality\_pure}}_t}{\sigma_t^{\text{quality\_pure}}}$$

Report `quality_score` for ticker `AAPL` on the **last 5 dates**. Round to 6 decimal places.


In [8]:
df['quality_score'] = df.groupby('date')['quality_pure'].transform(
    lambda x: (x - x.mean()) / x.std(ddof=1)
)

aapl_quality_score = (
    df[df['ticker'] == 'AAPL']
    .sort_values('date')
    .tail(5)[['date', 'quality_score']]
)

aapl_quality_score = aapl_quality_score.round(6)
aapl_quality_score


| | date | quality_score |
|---|---|---|
| 1795 | 2024-06-24 | 1.1891 |
| 1796 | 2024-06-25 | 1.1821 |
| 1797 | 2024-06-26 | 1.1512 |
| 1798 | 2024-06-27 | 1.1441 |
| 1799 | 2024-06-28 | 1.1645 |


### 1.5. (10pts) Long-Short Factor Portfolio

Construct a long-short quality factor portfolio:

1. Compute daily returns for each stock from prices.
2. Each day, sort stocks into deciles based on `quality_pure` (using `pd.qcut`).
3. Using `quality_score` as of date $t$ as the signal for the return from $t$ to $t+1$, form portfolios:
   - **Long**: Equal-weight average return of Decile 10 (highest quality)
   - **Short**: Equal-weight average return of Decile 1 (lowest quality, "junk")
   - **Factor Return**: Long − Short

Report the **last 5 daily returns** of the Long, Short, and Factor portfolios. Round to 6 decimal places.


In [9]:
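# Daily simple returns per ticker
df = df.sort_values(['ticker', 'date'])
df['ret'] = df.groupby('ticker')['price'].pct_change()

# Forward return: the return realized from t to t+1, stored on row t
df['ret_fwd'] = df.groupby('ticker')['ret'].shift(-1)

# Decile rank (1 = lowest, 10 = highest) of quality_pure within each date
df['decile'] = df.groupby('date')['quality_pure'].transform(
    lambda x: pd.qcut(x, 10, labels=False, duplicates='drop') + 1
)

# Equal-weight forward return per decile; the index is the signal date t
decile_rets = (
    df.dropna(subset=['ret_fwd', 'decile'])
      .groupby(['date', 'decile'])['ret_fwd']
      .mean()
      .unstack('decile')
)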

# Long = Decile 10, Short = Decile 1, Factor = Long - Short
factor_port = pd.DataFrame({
    'Long': decile_rets[10],
    'Short': decile_rets[1]
})
factor_port['Factor'] = factor_port['Long'] - factor_port['Short']

# Last 5 daily returns, rounded to 6 decimals
factor_port_last5 = factor_port.tail(5).round(6)
factor_port_last5

| date | Long | Short | Factor |
|---|---|---|---|
| 2024-06-21 | 0.0046 | 0.0107 | -0.0061 |
| 2024-06-24 | -0.0081 | -0.0047 | -0.0033 |
| 2024-06-25 | -0.0056 | -0.0057 | 0.0001 |
| 2024-06-26 | -0.0034 | 0.0037 | -0.007 |
| 2024-06-27 | -0.0042 | -0.0011 | -0.0031 |


### 1.6 Factor Summary Statistics and Higher Moments

For the **Factor Return** (QMJ = Quality Minus Junk), report the following **annualized** statistics:
- Mean Return
- Volatility
- Sharpe Ratio

Also report the **skewness** and **excess kurtosis** of the Factor Return.

In [10]:
# Factor daily returns (QMJ)
factor_rets = factor_port['Factor'].dropna()

mean_ann = factor_rets.mean() * ANN_FACTOR
vol_ann  = factor_rets.std(ddof=1) * np.sqrt(ANN_FACTOR)
sharpe   = mean_ann / vol_ann

# Skewness and excess kurtosis (Fisher; 0 for normal)
skewness        = factor_rets.skew()
excess_kurtosis = factor_rets.kurt()

summary = pd.Series({
    'Mean_Annualized': mean_ann,
    'Vol_Annualized': vol_ann,
    'Sharpe_Ratio': sharpe,
    'Skewness': skewness,
    'Excess_Kurtosis': excess_kurtosis
}).round(6)

summary

Mean_Annualized    0.0655
Vol_Annualized     0.1431
Sharpe_Ratio       0.4577
Skewness          -1.5111
Excess_Kurtosis   21.8728
dtype: float64

### 1.7.

In **1–2 sentences**, comment on whether the QMJ factor appears to have fatter tails than a normal distribution, based on the higher moments.


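Yes: the QMJ factor's excess kurtosis of about 21.87 is far above the normal benchmark of 0, so its return distribution has much fatter tails than a normal distribution and extreme outcomes are more frequent. The negative skewness (about −1.51) indicates that these extremes fall disproportionately on the downside.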
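To illustrate the benchmark used in this answer, the sketch below compares excess kurtosis for simulated normal and Laplace draws. The Laplace distribution is chosen only as a convenient fat-tailed example (its theoretical excess kurtosis is 3); nothing here uses the exam data.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
n = 100_000

normal_draws = pd.Series(rng.normal(size=n))
laplace_draws = pd.Series(rng.laplace(size=n))  # fat-tailed comparison sample

# pandas .kurt() reports excess kurtosis, so a normal sample is close to 0.
kurt_normal = normal_draws.kurt()
kurt_laplace = laplace_draws.kurt()

print(f"normal:  {kurt_normal:.3f}")
print(f"laplace: {kurt_laplace:.3f}")
```

A value near 0 is what a normal sample produces, which is why a clearly positive reading is interpreted as fat tails.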

***


# 2. Forecasting Returns

In this problem, you will use fundamentals-based signals to forecast returns, following the approach from Chapter 7 (GMO-style forecasting).

Use the QMJ factor return (`Factor_Return`) constructed in Problem 1.5 for the forecasting analysis below.


### 2.1 Feature Engineering

Construct the following predictors for forecasting the quality factor return:

1. **$X_{\text{Vol}}$** (Fear): Rolling 21-day annualized volatility of Mkt-RF
2. **$X_{\text{Spread}}$** (Quality Gap): Daily spread in quality scores between Decile 10 and Decile 1: $\bar{\text{quality\_score}}_{\text{D10},t} - \bar{\text{quality\_score}}_{\text{D1},t}$

Report the **last 5 values** for both $X_{\text{Vol}}$ and $X_{\text{Spread}}$.


In [11]:
X_Vol = FF['Mkt-RF'].rolling(21).std(ddof=1) * np.sqrt(ANN_FACTOR)
X_Vol.name = 'X_Vol'

quality_deciles = (
    df.dropna(subset=['quality_score', 'decile'])
      .groupby(['date', 'decile'])['quality_score']
      .mean()
      .unstack('decile')   # columns 1..10
)

X_Spread = quality_deciles[10] - quality_deciles[1]
X_Spread.name = 'X_Spread'

X_Vol_last5 = X_Vol.dropna().tail(5).round(6)
X_Spread_last5 = X_Spread.dropna().tail(5).round(6)

# Combine into one DataFrame with both as columns
pred_last5 = pd.concat([X_Vol_last5, X_Spread_last5], axis=1)
pred_last5.columns = ['X_Vol', 'X_Spread']
pred_last5

| date | X_Vol | X_Spread |
|---|---|---|
| 2024-06-24 | 0.0859 | 3.6268 |
| 2024-06-25 | 0.0792 | 3.6268 |
| 2024-06-26 | 0.0763 | 3.6299 |
| 2024-06-27 | 0.0761 | 3.6302 |
| 2024-06-28 | 0.0712 | 3.6272 |


### 2.2 In-Sample Forecasting Regression

Estimate the following forecasting regression with predictors lagged one period:

$$r^{\text{QMJ}}_{t+1} = \alpha + \beta_1 X_{\text{Spread},t} + \beta_2 X_{\text{Vol},t} + \epsilon_{t+1}$$

Report:
- $R^2$ of the regression
- Estimated $\alpha$ (constant)
- Estimated $\beta_1$ (X_Spread coefficient)
- Estimated $\beta_2$ (X_Vol coefficient)

In [12]:
# QMJ factor return (daily)
qmj = factor_port['Factor'].rename('r_qmj')
reg_df = pd.concat([qmj, X_Spread, X_Vol], axis=1).dropna()

reg_df['r_qmj_lead'] = reg_df['r_qmj'].shift(-1)
reg_df = reg_df.dropna(subset=['r_qmj_lead'])

X = reg_df[['X_Spread', 'X_Vol']]
X = sm.add_constant(X)
y = reg_df['r_qmj_lead']

model = sm.OLS(y, X).fit()

R2      = model.rsquared
alpha   = model.params['const']
beta_1  = model.params['X_Spread']
beta_2  = model.params['X_Vol']

pd.Series(
    {'R2': R2, 'alpha': alpha, 'beta_1 (X_Spread)': beta_1, 'beta_2 (X_Vol)': beta_2}
).round(6)

R2                   0.0002
alpha                0.0002
beta_1 (X_Spread)   -0.0001
beta_2 (X_Vol)       0.0012
dtype: float64

### 2.3 Out-of-Sample R²

Compute out-of-sample (OOS) statistics using a **rolling window** approach, using the expanding mean (from time 0 to $t$) as the benchmark/null forecast at time $t+1$.

Start at $t =$ `126` (minimum window of `126` days).
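For reference, a common definition of the OOS $R^2$ against a mean benchmark, assumed here to be the intended one, is:

$$R^2_{\text{OOS}} = 1 - \frac{\sum_{t} \left(r^{\text{QMJ}}_{t+1} - \hat{r}^{\text{QMJ}}_{t+1}\right)^2}{\sum_{t} \left(r^{\text{QMJ}}_{t+1} - \bar{r}^{\text{QMJ}}_{t}\right)^2}$$

where $\hat{r}^{\text{QMJ}}_{t+1}$ is the model forecast formed with data through $t$, $\bar{r}^{\text{QMJ}}_{t}$ is the expanding mean of the factor return through $t$, and the sums run over the out-of-sample dates. A negative value means the model's squared forecast errors exceed those of the benchmark.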


In [13]:
oos_df = pd.concat(
    [
        qmj,
        X_Spread.shift(1).rename('X_Spread_lag'),
        X_Vol.shift(1).rename('X_Vol_lag')
    ],
    axis=1
).dropna()

y = oos_df['r_qmj'].to_numpy()
X = sm.add_constant(oos_df[['X_Spread_lag', 'X_Vol_lag']]).to_numpy()
dates = oos_df.index

n = len(oos_df)
start = 126

y_hat_model = np.full(n, np.nan)
y_hat_mean  = np.full(n, np.nan)

alphas = []
betas_1 = []
betas_2 = []
coef_dates = []

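# Expanding window: fit on rows [0, t), then forecast row t
for t in range(start, n):
    y_train = y[:t]
    X_train = X[:t, :]
    
    model_t = sm.OLS(y_train, X_train).fit()
    
    # Model forecast vs. null forecast (mean of the training returns)
    y_hat_model[t] = model_t.predict(X[t:t+1, :])[0]
    y_hat_mean[t]  = y_train.mean()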
    
    alphas.append(model_t.params[0])          # const
    betas_1.append(model_t.params[1])         # X_Spread_lag
    betas_2.append(model_t.params[2])         # X_Vol_lag
    coef_dates.append(dates[t])

mask = ~np.isnan(y_hat_model)
num = ((y[mask] - y_hat_model[mask]) ** 2).sum()
den = ((y[mask] - y_hat_mean[mask]) ** 2).sum()
R2_oos = 1 - num / den

# Coefficient time series
coef_df = pd.DataFrame(
    {
        'alpha': alphas,
        'beta_1 (X_Spread)': betas_1,
        'beta_2 (X_Vol)': betas_2
    },
    index=pd.Index(coef_dates, name='date')
)

# Last window's coefficients + OOS R^2
oos_stats_last = pd.Series(
    {
        'R2_oos': R2_oos,
        'alpha_last': coef_df['alpha'].iloc[-1],
        'beta_1_last': coef_df['beta_1 (X_Spread)'].iloc[-1],
        'beta_2_last': coef_df['beta_2 (X_Vol)'].iloc[-1],
    }
).round(6)

oos_stats_last

R2_oos        -0.0067
alpha_last     0.0002
beta_1_last   -0.0001
beta_2_last    0.0012
dtype: float64

### 2.4 Trading Strategy from Forecasts

Build a trading strategy using the OOS forecasts:

1. Set portfolio weight: $w_t = 10 \times \hat{r}^{\text{QMJ}}_{t+1}$ (scaled forecast)
2. Strategy return: $r^{\text{strat}}_{t+1} = w_t \times r^{\text{QMJ}}_{t+1}$

Report the following **annualized** statistics for the strategy:
- Mean Return
- Volatility  
- Sharpe Ratio

**Also report the same annualized statistics** for a buy-and-hold position in the QMJ factor.

Round your answers to 6 decimal places.

In [14]:
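forecast = pd.Series(y_hat_model, index=oos_df.index, name='forecast')
realized = oos_df['r_qmj']  # QMJ daily return

# Pair each forecast with the QMJ return on the following row;
# rows before the OOS start (NaN forecasts) are dropped
strat_df = pd.DataFrame({
    'forecast': forecast,
    'realized_next': realized.shift(-1)
}).dropna()

w_t = 10 * strat_df['forecast']  # scaled-forecast weight
strat_rets = w_t * strat_df['realized_next']

# Buy-and-hold QMJ over the same OOS rows
bh_rets = strat_df['realized_next']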

def ann_stats(r):
    mean_ann = r.mean() * ANN_FACTOR
    vol_ann  = r.std(ddof=1) * np.sqrt(ANN_FACTOR)
    sharpe   = mean_ann / vol_ann
    return mean_ann, vol_ann, sharpe

m_s, v_s, sh_s   = ann_stats(strat_rets)
m_bh, v_bh, sh_bh = ann_stats(bh_rets)

stats = pd.DataFrame(
    {
        'Strategy': [m_s, v_s, sh_s],
        'BuyHold_QMJ': [m_bh, v_bh, sh_bh]
    },
    index=['Mean_Annualized', 'Vol_Annualized', 'Sharpe_Ratio']
).round(6)

stats

| | Strategy | BuyHold_QMJ |
|---|---|---|
| Mean_Annualized | 0.0001 | 0.0566 |
| Vol_Annualized | 0.0019 | 0.1469 |
| Sharpe_Ratio | 0.0484 | 0.3854 |


### 2.5 Strategy Attribution

Regress the forecast strategy returns on Mkt-RF:

$$r^{\text{strat}}_{t} = \alpha + \beta \cdot r^{\text{Mkt-RF}}_{t} + \epsilon_t$$

Report:
- Alpha (annualized)
- Beta
- R-squared
- Annualized Information Ratio

Round your answers to 6 decimal places.

In [15]:
reg_df = pd.concat(
    [strat_rets.rename('r_strat'), FF['Mkt-RF']],
    axis=1
).dropna()

y = reg_df['r_strat']
X = sm.add_constant(reg_df['Mkt-RF'])

model = sm.OLS(y, X).fit()

alpha_daily = model.params['const']
beta        = model.params['Mkt-RF']
R2          = model.rsquared

# Annualized alpha
alpha_ann = alpha_daily * ANN_FACTOR

# Information ratio: annualized alpha / annualized residual vol
resid = model.resid
resid_vol_ann = resid.std(ddof=1) * np.sqrt(ANN_FACTOR)
IR = alpha_ann / resid_vol_ann

pd.Series(
    {
        'Alpha_Annualized': alpha_ann,
        'Beta': beta,
        'R2': R2,
        'Info_Ratio_Annualized': IR
    }
).round(6)

Alpha_Annualized         0.0001
Beta                    -0.0002
R2                       0.0004
Info_Ratio_Annualized    0.0611
dtype: float64
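The information ratio computed in the cell above corresponds to the following expression, where $\hat{\alpha}$ is the daily regression intercept and $\hat{\sigma}_{\epsilon}$ is the daily standard deviation of the regression residuals:

$$\text{IR} = \frac{252 \cdot \hat{\alpha}}{\sqrt{252} \cdot \hat{\sigma}_{\epsilon}} = \sqrt{252} \cdot \frac{\hat{\alpha}}{\hat{\sigma}_{\epsilon}}$$

So the annualized IR is the daily ratio $\hat{\alpha} / \hat{\sigma}_{\epsilon}$ scaled by $\sqrt{252}$, the same scaling that applies to a Sharpe ratio.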

### 2.6. (10pts) Interpretation for Questions 2.2–2.5

Answer the following in **1–2 sentences each** (total combined length still subject to the global 2-sentence guideline per part):

1. Is the in-sample $R^2$ from Question 2.2 meaningful for real-world forecasting? Why or why not?

While the in-sample $R^2$ can be informative, it is not very meaningful for real-world forecasting because it can be inflated by overfitting and does not reflect performance on unseen data. Ultimately, OOS performance is far more meaningful for trading strategy applications.

2. What does the OOS $R^2$ from Question 2.3 indicate about the predictability of quality factor returns?

The OOS $R^2$ measures how much the model improves forecast accuracy relative to the expanding-mean benchmark on new data. Here it is negative (−0.0067), so the model forecasts worse than the benchmark, indicating that quality factor returns are hard to predict with these signals.

3. In evaluating the strategy's performance, do you care more about the OOS $\alpha$ or the OOS $R^2$?
   
For evaluating a trading strategy, the OOS alpha is more important because it captures the economic value added (risk-adjusted excess return) that an investor can earn. The OOS $R^2$ helps assess statistical fit, but investors ultimately care about whether the strategy earns positive, significant alpha after costs and risk.

### 2.7 Strategy Risk

Compare the tail risk of the **Forecast Strategy** to the **Buy-and-Hold QMJ** using the OOS period data:

1. Compute the **Historical VaR (5%)** for both strategies
2. Compute the **Maximum Drawdown** for both strategies

Report all values. Round to 6 decimal places.

In [16]:
def hist_var_5(r):
    return np.percentile(r.dropna(), 5)

var_strat = hist_var_5(strat_rets)
var_bh    = hist_var_5(bh_rets)

def max_drawdown(r):
    # cumulative wealth index
    wealth = (1 + r.dropna()).cumprod()
    running_max = wealth.cummax()
    drawdowns = wealth / running_max - 1.0
    return drawdowns.min()

mdd_strat = max_drawdown(strat_rets)
mdd_bh    = max_drawdown(bh_rets)

risk_stats = pd.DataFrame(
    {
        'Forecast_Strategy': [var_strat, mdd_strat],
        'BuyHold_QMJ':       [var_bh, mdd_bh]
    },
    index=['VaR_5pct', 'Max_Drawdown']
).round(6)

risk_stats

| | Forecast_Strategy | BuyHold_QMJ |
|---|---|---|
| VaR_5pct | -0.0001 | -0.015 |
| Max_Drawdown | -0.0058 | -0.2951 |


### 2.8

In **1–2 sentences**, comment on whether the forecast-based strategy has better or worse tail risk than buy-and-hold.

The forecast-based strategy has much smaller tail risk than buy-and-hold: its 5% VaR is −0.0001 versus −0.015, and its maximum drawdown is −0.0058 versus −0.2951. Much of this gap reflects the strategy's very small positions ($w_t = 10 \times \hat{r}^{\text{QMJ}}_{t+1}$, giving annualized volatility of 0.0019 versus 0.1469) rather than forecasting skill.
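A small sketch of how position size alone affects these tail measures. It uses made-up returns and a constant weight of 1%, which is a simplifying assumption; the exam strategy's weights vary over time.

```python
import numpy as np
import pandas as pd

def hist_var(r: pd.Series, q: float = 5.0) -> float:
    """Historical VaR as the q-th percentile of returns."""
    return float(np.percentile(r.dropna(), q))

def max_drawdown(r: pd.Series) -> float:
    """Largest peak-to-trough decline of the cumulative wealth index."""
    wealth = (1 + r.dropna()).cumprod()
    return float((wealth / wealth.cummax() - 1).min())

# Made-up returns, for illustration only.
toy = pd.Series([0.10, -0.20, 0.05, -0.10, 0.30])
small = 0.01 * toy  # same bets at 1% of the size

print(f"VaR 5%: full={hist_var(toy):.6f}, scaled={hist_var(small):.6f}")
print(f"MDD:    full={max_drawdown(toy):.6f}, scaled={max_drawdown(small):.6f}")
```

With a constant positive weight, historical VaR scales exactly with the weight and the maximum drawdown shrinks roughly in proportion, so tail measures are easiest to compare across strategies run at similar volatility.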

***

# 3. Dynamic Hedging & Portable Alpha

In this problem, you will construct a dynamically hedged quality factor and build a "portable alpha" product, following concepts from Chapters 8-9 (LTCM, managed funds).


In [17]:
from statsmodels.regression.rolling import RollingOLS

### 3.1 Rolling Factor Regression

Using a **`126`-day rolling window**, regress the `QMJ` factor return on the Fama-French factors:

$$r^{\text{QMJ}}_t = \alpha_t + \beta^{\text{Mkt}}_t \cdot r^{\text{Mkt-RF}}_t + \beta^{\text{SMB}}_t \cdot r^{\text{SMB}}_t + \beta^{\text{HML}}_t \cdot r^{\text{HML}}_t + \epsilon_t$$

Report the rolling betas (`Mkt`, `SMB`, `HML`) for the **last 5 dates**. 

In [18]:
qmj = factor_port['Factor'].rename('QMJ')

# Align QMJ with Fama-French factors
reg_df = pd.concat(
    [qmj, FF[['Mkt-RF', 'SMB', 'HML']]],
    axis=1
).dropna()

y = reg_df['QMJ']
X = sm.add_constant(reg_df[['Mkt-RF', 'SMB', 'HML']])

window = 126
rols = RollingOLS(y, X, window=window).fit()

rolling_betas = rols.params[['Mkt-RF', 'SMB', 'HML']].rename(
    columns={'Mkt-RF': 'Mkt', 'SMB': 'SMB', 'HML': 'HML'}
)

# Last 5 dates' betas
rolling_betas_last5 = rolling_betas.tail(5).round(6)
rolling_betas_last5

| date | Mkt | SMB | HML |
|---|---|---|---|
| 2024-06-21 | 0.0835 | -0.0091 | -0.0259 |
| 2024-06-24 | 0.076 | -0.0089 | -0.0328 |
| 2024-06-25 | 0.0823 | -0.005 | -0.0318 |
| 2024-06-26 | 0.0815 | -0.0001 | -0.03 |
| 2024-06-27 | 0.0852 | -0.0005 | -0.0222 |


### 3.2 Hedged Returns

Construct hedged returns by using the **lagged** rolling betas as hedge ratios:

$$r^{\text{Hedged}}_t = r^{\text{QMJ}}_t - \left(\hat{\beta}^{\text{Mkt}}_{t-1} \cdot r^{\text{Mkt-RF}}_t + \hat{\beta}^{\text{SMB}}_{t-1} \cdot r^{\text{SMB}}_t + \hat{\beta}^{\text{HML}}_{t-1} \cdot r^{\text{HML}}_t\right)$$

Report the **annualized** performance statistics (Mean, Volatility, Sharpe) of the hedged return stream.

In [19]:
betas_lagged = rolling_betas.shift(1)

hedge_df = pd.concat(
    [
        reg_df[['QMJ', 'Mkt-RF', 'SMB', 'HML']],
        betas_lagged.add_prefix('beta_')
    ],
    axis=1
).dropna()

# Hedging term using lagged betas
hedge_term = (
    hedge_df['beta_Mkt'] * hedge_df['Mkt-RF'] +
    hedge_df['beta_SMB'] * hedge_df['SMB'] +
    hedge_df['beta_HML'] * hedge_df['HML']
)

# Hedged returns
r_hedged = hedge_df['QMJ'] - hedge_term

mean_ann = r_hedged.mean() * ANN_FACTOR 
vol_ann  = r_hedged.std(ddof=1) * np.sqrt(ANN_FACTOR)
sharpe   = mean_ann / vol_ann

pd.Series(
    {
        'Mean_Annualized': mean_ann,
        'Vol_Annualized': vol_ann,
        'Sharpe_Ratio': sharpe
    }
).round(6)

Mean_Annualized   0.0575
Vol_Annualized    0.1491
Sharpe_Ratio      0.3854
dtype: float64

### 3.3 Portable Alpha Product

Construct a **portable alpha product** that combines:
1. **Levered market exposure**: $r^{\text{LevMkt-RF}}_t = 1.5 \times r^{\text{Mkt-RF}}_t$
2. **Hedged QMJ overlay**: $r^{\text{Hedged}}_t$

$$r^{\text{Product}}_t = r^{\text{LevMkt-RF}}_t + r^{\text{Hedged}}_t$$

Regress the product return on the market factor. Report:
- Beta
- Alpha (annualized)

Rounded to 6 decimal places.

In [20]:
lev_mkt = (1.5 * FF['Mkt-RF']).rename('LevMkt_RF')
mkt = FF['Mkt-RF'].rename('Mkt_RF')

prod_df = pd.concat(
    [lev_mkt, r_hedged.rename('Hedged'), mkt],
    axis=1
).dropna()

prod_df['Product'] = prod_df['LevMkt_RF'] + prod_df['Hedged']

y = prod_df['Product']
X = sm.add_constant(prod_df['Mkt_RF'])

model_prod = sm.OLS(y, X).fit()

alpha_daily = model_prod.params['const']
beta        = model_prod.params['Mkt_RF']
alpha_ann   = alpha_daily * ANN_FACTOR

pd.Series(
    {'Beta': beta, 'Alpha_Annualized': alpha_ann}
).round(6)

Beta               1.5203
Alpha_Annualized   0.0549
dtype: float64
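The product regression in 3.3 can be read through the linearity of OLS. When both regressions are run on the same sample, the product's market beta and alpha decompose as:

$$\hat{\beta}^{\text{Product}} = \frac{\widehat{\text{Cov}}\left(1.5 \cdot r^{\text{Mkt-RF}}_t + r^{\text{Hedged}}_t,\; r^{\text{Mkt-RF}}_t\right)}{\widehat{\text{Var}}\left(r^{\text{Mkt-RF}}_t\right)} = 1.5 + \hat{\beta}^{\text{Hedged}}, \qquad \hat{\alpha}^{\text{Product}} = \hat{\alpha}^{\text{Hedged}}$$

Here $\hat{\beta}^{\text{Hedged}}$ and $\hat{\alpha}^{\text{Hedged}}$ denote the slope and intercept from regressing $r^{\text{Hedged}}_t$ on $r^{\text{Mkt-RF}}_t$ alone (notation introduced here for illustration). Under this reading, the reported beta of 1.5203 implies a residual market beta of roughly 0.0203 left in the hedged overlay, and the reported annualized alpha of 0.0549 comes entirely from the hedged QMJ component.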