# Asset Pricing — Week 2
## Classic Equilibrium Models: Where the SDF Comes From

---

Week 1 established a single equation as the foundation of asset pricing:
$$
p_t = E_t[M_{t+1} x_{t+1}].
$$
We proved that an SDF $M$ must exist under no-arbitrage, and that every pricing question reduces to a question about the form of $M$. What we did *not* answer is: where does $M$ come from? No-arbitrage alone does not tell us what $M$ looks like — only that it is positive.

This week, we derive $M$ from **equilibrium**. We write down a model of an economy, optimise the representative agent's problem, and read the SDF off the first-order conditions. This is the consumption-based approach to asset pricing. The central results are:

- **Part A — The Lucas Tree Economy**: A complete general equilibrium model yields $M_{t+1} = \beta u'(C_{t+1})/u'(C_t)$.
- **Part B — The CAPM**: A special case of the SDF framework under specific assumptions.
- **Part C — The Equity Premium Puzzle**: The consumption-based SDF is wildly inconsistent with historical data.
- **Part D — Early Responses**: Habit formation and Epstein-Zin preferences as attempts to resolve the puzzle.

**Prerequisites**: Week 1 material. Familiarity with constrained optimisation and the envelope theorem.

---

In [None]:
import numpy as np
import matplotlib.pyplot as plt
import matplotlib.gridspec as gridspec
from scipy.optimize import brentq, minimize_scalar, minimize
from scipy.stats import norm, lognorm
from scipy.special import logsumexp

plt.rcParams.update({
    'figure.dpi': 120,
    'axes.spines.top': False,
    'axes.spines.right': False,
    'font.size': 11,
})

rng = np.random.default_rng(seed=2024)

---

# Part A — The Lucas Tree Economy

---

## 1. Setting Up the Economy

The **Lucas (1978) exchange economy** is the workhorse model of consumption-based asset pricing. It is deliberately stripped of production — there are no investment decisions, no firms that choose inputs. This allows us to isolate the pricing of risk in its purest form.

### 1.1 The Physical Environment

Consider an infinite-horizon economy with:

- A single representative agent with preferences:
$$
E_0\left[\sum_{t=0}^{\infty} \beta^t u(C_t)\right], \quad \beta \in (0,1).
$$
- A single productive asset — the **Lucas tree** — which pays a stochastic dividend ("fruit") $D_t$ each period.
- The dividend process is exogenous. The agent cannot affect it.
- The agent can trade shares in the tree. Let $z_t$ be the number of shares held from $t$ to $t+1$, and $P_t$ be the price of one share at time $t$.

### 1.2 The Budget Constraint

At each date $t$, the agent receives dividend income $z_{t-1} D_t$ and the value of their existing shares $z_{t-1} P_t$, then chooses how much to consume and how many new shares to hold:
$$
C_t + z_t P_t = z_{t-1}(D_t + P_t).
$$
The right side is total wealth: the agent liquidates all their shares (at price $P_t$) plus receives the dividend.

### 1.3 The Equilibrium Condition

In equilibrium, the goods market clears: consumption must equal the dividend endowment at every date. Since there is a single representative agent and one share outstanding (normalise to $z_t = 1$ for all $t$), the agent must hold the whole tree and consume its entire dividend:
$$
C_t = D_t \quad \text{for all } t.
$$
This **endowment constraint** is the key simplification of the exchange economy. It ties consumption to dividends exogenously, so we only need to solve for the prices that clear markets, not for quantities.

---

## 2. Deriving the SDF from First Principles

### 2.1 The Agent's Optimisation Problem

The agent solves:
$$
\max_{\{z_t\}} E_0\left[\sum_{t=0}^{\infty} \beta^t u(C_t)\right]
\quad \text{subject to} \quad
C_t = z_{t-1}(D_t + P_t) - z_t P_t.
$$
This is an infinite-dimensional optimisation. We solve it using the **Bellman equation**. Define the value function:
$$
V(z, D) = \max_{z'} \left\{ u(z(D + P(D)) - z' P(D)) + \beta E[V(z', D') \mid D] \right\},
$$
where $D'$ is next period's dividend and $P(D)$ is the equilibrium price function to be determined.

### 2.2 The Euler Equation

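The first-order condition with respect to $z'$ (shares held going forward) is $u'(C_t) P_t = \beta E_t[V_z(z_t, D_{t+1})]$, and the envelope theorem gives $V_z(z, D) = u'(C)(D + P(D))$. Combining the two: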
$$
u'(C_t) P_t = \beta E_t[u'(C_{t+1})(P_{t+1} + D_{t+1})].
$$
This is the **Euler equation** of the Lucas model. Its interpretation is transparent:

- **Left side**: The marginal cost of buying one more share. To purchase one extra share at price $P_t$, the agent must give up $P_t$ units of current consumption, each worth $u'(C_t)$ in utility. So the utility cost is $u'(C_t) P_t$.
- **Right side**: The discounted expected marginal benefit. Next period, the extra share pays $D_{t+1}$ in dividends and can be liquidated at price $P_{t+1}$, for a total payoff of $D_{t+1} + P_{t+1}$. Each unit of payoff is worth $u'(C_{t+1})$ in utility, discounted by the factor $\beta$.

At an optimum, the marginal cost must equal the marginal benefit.
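The Euler equation can also be solved for the price function directly when dividends take finitely many values. The sketch below is illustrative only: it assumes CRRA utility and a three-state Markov chain for the dividend *level* (the states, transition matrix, and parameter values are hypothetical, and this differs from the i.i.d. growth specification used later). Writing $f_i = u'(d_i) P_i$ turns the Euler equation into a linear system.

```python
import numpy as np

def lucas_prices(d, Pi, beta, gamma):
    """Solve u'(d_i) P_i = beta * sum_j Pi_ij u'(d_j) (P_j + d_j) on a finite grid."""
    d = np.asarray(d, dtype=float)
    mu = d ** (-gamma)                      # CRRA marginal utility, using C = D
    n = len(d)
    # With f_i = u'(d_i) P_i the Euler equation reads f = beta * Pi @ (f + mu * d)
    f = np.linalg.solve(np.eye(n) - beta * Pi, beta * Pi @ (mu * d))
    return f / mu

# Hypothetical dividend states and transition probabilities (rows sum to one)
d_states = np.array([0.95, 1.00, 1.05])
Pi_ex = np.array([[0.6, 0.3, 0.1],
                  [0.2, 0.6, 0.2],
                  [0.1, 0.3, 0.6]])
beta_ex, gamma_ex = 0.95, 2.0

P_ex = lucas_prices(d_states, Pi_ex, beta_ex, gamma_ex)

# Euler residual, state by state: marginal cost minus discounted marginal benefit
mu_ex = d_states ** (-gamma_ex)
euler_residual = mu_ex * P_ex - beta_ex * Pi_ex @ (mu_ex * (P_ex + d_states))

# Log-utility benchmark (gamma = 1): the price-dividend ratio is beta / (1 - beta)
P_log = lucas_prices(d_states, Pi_ex, beta_ex, 1.0)

print("Prices by state:          ", np.round(P_ex, 3))
print("Max Euler residual:       ", np.abs(euler_residual).max())
print("Log-utility P/D by state: ", np.round(P_log / d_states, 3))
```

With log utility the income and substitution effects cancel, so the price-dividend ratio is the same in every state regardless of the transition matrix; this gives a convenient check on the solver.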

### 2.3 The Equilibrium SDF

Rearranging the Euler equation:
$$
P_t = E_t\left[\underbrace{\beta \frac{u'(C_{t+1})}{u'(C_t)}}_{\equiv M_{t+1}} (P_{t+1} + D_{t+1})\right] = E_t[M_{t+1} x_{t+1}],
$$
where $x_{t+1} = P_{t+1} + D_{t+1}$ is the payoff of one share held from $t$ to $t+1$.

We have derived the SDF from first principles:
$$
\boxed{M_{t+1} = \beta \frac{u'(C_{t+1})}{u'(C_t)}.}
$$
This is the **intertemporal marginal rate of substitution** (IMRS) of the representative agent. It is the ratio of marginal utility tomorrow to marginal utility today, discounted by $\beta$.

**Interpretation**: $M_{t+1}$ is high when $u'(C_{t+1})$ is high — that is, when consumption tomorrow is low (because $u'' < 0$). So the SDF is high in bad states of the world. Assets that pay poorly in bad states have high $-\text{Cov}(M, R)$ and thus command positive risk premia. This is the economic mechanism behind risk premia: compensation for bearing risk when marginal utility is high.
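A two-state example makes the mechanism concrete. The numbers below are hypothetical (they are not calibrated to data): two equally likely states, one with falling and one with rising consumption, and two assets with the same expected payoff that differ only in *when* they pay.

```python
import numpy as np

beta_ex, gamma_ex = 0.99, 2.0                  # hypothetical preference parameters
growth_states = np.array([0.97, 1.05])         # C_{t+1}/C_t in the bad and good state
probs = np.array([0.5, 0.5])

# CRRA SDF state by state: high in the bad state, low in the good state
M_states = beta_ex * growth_states ** (-gamma_ex)

payoff_insurance = np.array([1.0, 0.0])        # pays only in the bad state
payoff_procyclical = np.array([0.0, 1.0])      # pays only in the good state

def price(payoff):
    """p = E[M x] over the two states."""
    return float(np.sum(probs * M_states * payoff))

def expected_return(payoff):
    """Expected gross return E[x] / p."""
    return float(np.sum(probs * payoff)) / price(payoff)

Rf_ex = 1.0 / float(np.sum(probs * M_states))  # riskfree rate from 1 = E[M] R_f

print(f"SDF by state (bad, good): {np.round(M_states, 4)}")
print(f"Price, insurance asset:   {price(payoff_insurance):.4f}")
print(f"Price, procyclical asset: {price(payoff_procyclical):.4f}")
print(f"E[R] insurance:   {expected_return(payoff_insurance):.4f}")
print(f"R_f:              {Rf_ex:.4f}")
print(f"E[R] procyclical: {expected_return(payoff_procyclical):.4f}")
```

Both assets have an expected payoff of 0.5, yet the one that pays in the bad state is more expensive and earns less than the riskfree rate, while the procyclical asset earns a positive premium.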

In [None]:
# Illustrate the SDF as the IMRS for CRRA utility
# u(C) = C^{1-gamma} / (1-gamma),  u'(C) = C^{-gamma}
# M_{t+1} = beta * (C_{t+1}/C_t)^{-gamma}

beta = 0.99
gammas = [0.5, 1.0, 2.0, 5.0, 10.0]
colors = plt.cm.plasma(np.linspace(0.1, 0.9, len(gammas)))

# Consumption growth ratio C_{t+1}/C_t
growth = np.linspace(0.7, 1.5, 500)

fig, axes = plt.subplots(1, 2, figsize=(13, 4))

ax = axes[0]
for gamma, color in zip(gammas, colors):
    M = beta * growth**(-gamma)
    ax.plot(growth, M, lw=2, color=color, label=f'$\\gamma = {gamma}$')
ax.axvline(1.0, color='black', lw=0.8, ls='--', label='No growth')
ax.axhline(beta, color='grey', lw=0.8, ls=':', label=f'$\\beta = {beta}$')
ax.set_xlabel('Consumption growth $C_{t+1}/C_t$')
ax.set_ylabel('SDF $M_{t+1}$')
ax.set_title('SDF as a Function of Consumption Growth\n(CRRA utility)')
ax.legend(fontsize=9)

# Panel 2: simulate the SDF under a calibrated consumption process
# Consumption growth: log(C_{t+1}/C_t) ~ N(mu_c, sigma_c^2)
mu_c    = 0.018   # mean log consumption growth (US postwar)
sigma_c = 0.036   # std of log consumption growth (US postwar)
T_sim = 200

log_growth = rng.normal(mu_c, sigma_c, T_sim)
growth_sim = np.exp(log_growth)

ax2 = axes[1]
for gamma, color in zip([1.0, 5.0, 10.0], colors[1:4]):
    M_sim = beta * growth_sim**(-gamma)
    ax2.plot(M_sim, lw=1.2, alpha=0.8, color=color, label=f'$\\gamma={gamma}$')
ax2.set_xlabel('Time period $t$')
ax2.set_ylabel('Realised $M_{t+1}$')
ax2.set_title('Simulated SDF Under Calibrated US Consumption\n($\\mu_c=0.018$, $\\sigma_c=0.036$)')
ax2.legend()

plt.tight_layout()
plt.show()

print("Key observation: for low gamma, M barely varies across time.")
print("The SDF is nearly flat — it cannot generate large risk premia.")
print("This is the heart of the equity premium puzzle.")

---

## 3. Solving for the Equilibrium Price

### 3.1 The Price-Dividend Ratio Under IID Dividend Growth

Assume that log dividend growth $g_t = \ln(D_t/D_{t-1})$ is i.i.d. with $g_t \sim \mathcal{N}(\mu_g, \sigma_g^2)$. In equilibrium, $C_t = D_t$, so log consumption growth equals log dividend growth.

Use **power (CRRA) utility**: $u(C) = C^{1-\gamma}/(1-\gamma)$, so $u'(C) = C^{-\gamma}$ and:
$$
M_{t+1} = \beta \left(\frac{C_{t+1}}{C_t}\right)^{-\gamma} = \beta e^{-\gamma g_{t+1}}.
$$

**Guess**: The price-dividend ratio $P_t/D_t = \kappa$ is constant (this is consistent with the i.i.d. assumption). Then $P_t = \kappa D_t$ and $P_{t+1} + D_{t+1} = (1+\kappa) D_{t+1}$.

Substituting into the Euler equation:
$$
\kappa D_t = E_t\left[\beta e^{-\gamma g_{t+1}} (1 + \kappa) D_{t+1}\right] = (1+\kappa) \beta D_t \cdot E\left[e^{-\gamma g_{t+1} + g_{t+1}}\right]
= (1+\kappa) \beta D_t \cdot E\left[e^{(1-\gamma)g_{t+1}}\right].
$$
Since $g_{t+1} \sim \mathcal{N}(\mu_g, \sigma_g^2)$, the moment generating function of the normal distribution gives:
$$
E\left[e^{(1-\gamma)g}\right] = \exp\left[(1-\gamma)\mu_g + \frac{1}{2}(1-\gamma)^2\sigma_g^2\right].
$$
Define $A \equiv \beta \exp\left[(1-\gamma)\mu_g + \frac{1}{2}(1-\gamma)^2\sigma_g^2\right]$. Then $\kappa = (1+\kappa)A$, which solves to:
$$
\boxed{\kappa = \frac{A}{1-A} = \frac{\beta \exp\left[(1-\gamma)\mu_g + \frac{1}{2}(1-\gamma)^2 \sigma_g^2\right]}{1 - \beta \exp\left[(1-\gamma)\mu_g + \frac{1}{2}(1-\gamma)^2 \sigma_g^2\right]}.}
$$
For this to be positive and finite, we require $A < 1$: the present value of dividends must converge.
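As a sanity check, the closed form can be compared with the present-value sum it is meant to represent. Iterating the Euler equation forward under i.i.d. growth gives $P_t/D_t = \sum_{j \ge 1} A^j$, since each additional period contributes one more factor of $A$. The sketch below uses the calibration values from this section with $\gamma = 2$ chosen purely for illustration; `pd_ratio` is a helper name introduced here.

```python
import numpy as np

def pd_ratio(beta, gamma, mu_g, sigma_g):
    """Return (kappa, A) for the i.i.d. lognormal Lucas economy."""
    A = beta * np.exp((1 - gamma) * mu_g + 0.5 * (1 - gamma) ** 2 * sigma_g ** 2)
    if A >= 1.0:
        raise ValueError("A >= 1: the present value of dividends diverges")
    return A / (1 - A), A

kappa_closed, A_ex = pd_ratio(beta=0.99, gamma=2.0, mu_g=0.018, sigma_g=0.036)

# Truncated present-value sum: price of the dividend strip at each horizon j is A^j
horizons = np.arange(1, 5001)
kappa_series = np.sum(A_ex ** horizons)

print(f"A                   = {A_ex:.5f}")
print(f"kappa (closed form) = {kappa_closed:.4f}")
print(f"kappa (sum of A^j)  = {kappa_series:.4f}")
```

The truncation at 5,000 periods is arbitrary; because $A < 1$ the tail is negligible long before that.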

### 3.2 The Implied Equity Return

The gross return on the tree from $t$ to $t+1$ is:
$$
R_{t+1} = \frac{P_{t+1} + D_{t+1}}{P_t} = \frac{(1+\kappa) D_{t+1}}{\kappa D_t} = \frac{1+\kappa}{\kappa} e^{g_{t+1}}.
$$
The log return is:
$$
r_{t+1} \equiv \ln R_{t+1} = \ln\frac{1+\kappa}{\kappa} + g_{t+1}.
$$
Since $g_{t+1}$ is i.i.d. normal, log returns are i.i.d. normal — which is consistent with the guess of a constant price-dividend ratio.
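A quick way to confirm that this return is consistent with the SDF is to check $E[M_{t+1} R_{t+1}] = 1$ numerically. The sketch below uses Gauss-Hermite quadrature from `numpy.polynomial.hermite_e` to integrate over $g_{t+1}$; the choice of 40 nodes and of $\gamma = 2$ is an assumption made for illustration.

```python
import numpy as np
from numpy.polynomial.hermite_e import hermegauss

beta_ex, gamma_ex, mu_g_ex, sigma_g_ex = 0.99, 2.0, 0.018, 0.036

A_ex = beta_ex * np.exp((1 - gamma_ex) * mu_g_ex
                        + 0.5 * (1 - gamma_ex) ** 2 * sigma_g_ex ** 2)
kappa_ex = A_ex / (1 - A_ex)

# Nodes/weights for the weight function exp(-x^2 / 2); rescale to a N(0, 1) expectation
nodes, weights = hermegauss(40)
weights = weights / np.sqrt(2 * np.pi)
g = mu_g_ex + sigma_g_ex * nodes               # growth outcomes at the quadrature nodes

M = beta_ex * np.exp(-gamma_ex * g)            # SDF at each node
R = (1 + kappa_ex) / kappa_ex * np.exp(g)      # gross equity return at each node

euler_check = np.sum(weights * M * R)          # should equal 1
mean_log_return = np.sum(weights * np.log(R))  # should equal ln((1+kappa)/kappa) + mu_g

print(f"E[M R]           = {euler_check:.12f}")
print(f"E[log R]         = {mean_log_return:.6f}")
print(f"ln((1+k)/k) + mu = {np.log((1 + kappa_ex) / kappa_ex) + mu_g_ex:.6f}")
```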

### 3.3 The Riskfree Rate

Apply the Euler equation to the riskfree bond ($x_{t+1} = R_f$, $p_t = 1$):
$$
1 = E_t[M_{t+1}] R_f \implies R_f = \frac{1}{E[M_{t+1}]}.
$$
With $M_{t+1} = \beta e^{-\gamma g_{t+1}}$ and $g \sim \mathcal{N}(\mu_g, \sigma_g^2)$:
$$
E[M_{t+1}] = \beta E[e^{-\gamma g}] = \beta \exp\left(-\gamma \mu_g + \frac{\gamma^2 \sigma_g^2}{2}\right).
$$
Therefore the log riskfree rate is:
$$
\boxed{r_f = -\ln\beta + \gamma \mu_g - \frac{\gamma^2 \sigma_g^2}{2}.}
$$
This formula encodes three economic forces:

1. **Time preference** ($-\ln\beta > 0$): Impatient agents demand a high return for postponing consumption.
2. **Consumption smoothing** ($+\gamma\mu_g$): When consumption is expected to grow, agents want to borrow against the future, raising the riskfree rate.
3. **Precautionary savings** ($-\gamma^2\sigma_g^2/2$): Uncertainty about future consumption induces precautionary saving, pushing the riskfree rate *down*.
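The three terms can be tabulated separately to see which one dominates at a given $\gamma$. This is a small sketch using the calibration values from this section; the function name `riskfree_components` is introduced here for illustration.

```python
import numpy as np

def riskfree_components(beta, gamma, mu_g, sigma_g):
    """Split the log riskfree rate into its three additive terms."""
    return {
        "time_preference": -np.log(beta),
        "consumption_smoothing": gamma * mu_g,
        "precautionary_savings": -0.5 * gamma ** 2 * sigma_g ** 2,
    }

for gamma_ex in (1.0, 2.0, 10.0, 20.0):
    parts = riskfree_components(0.99, gamma_ex, 0.018, 0.036)
    total = sum(parts.values())
    print(f"gamma = {gamma_ex:>4}: "
          f"time pref = {parts['time_preference']*100:5.2f}%, "
          f"smoothing = {parts['consumption_smoothing']*100:5.2f}%, "
          f"precautionary = {parts['precautionary_savings']*100:6.2f}%, "
          f"r_f = {total*100:5.2f}%")

# Cross-check against r_f = -ln E[M] for gamma = 2
parts_2 = riskfree_components(0.99, 2.0, 0.018, 0.036)
rf_from_sdf = -np.log(0.99 * np.exp(-2.0 * 0.018 + 0.5 * 4.0 * 0.036 ** 2))
```

The smoothing term grows linearly in $\gamma$ while the precautionary term grows quadratically, so $r_f$ is hump-shaped in $\gamma$: it rises first and eventually falls.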

In [None]:
# Solve the Lucas tree equilibrium across a range of gamma values
# Calibration to US postwar data (annual)
beta_calib = 0.99
mu_g       = 0.018   # mean log consumption/dividend growth
sigma_g    = 0.036   # std of log consumption/dividend growth

gammas_range = np.linspace(0.5, 20, 200)

def lucas_moments(gamma, beta, mu_g, sigma_g):
    """Compute equilibrium riskfree rate, equity premium, and PD ratio."""
    # Log riskfree rate
    rf_log = -np.log(beta) + gamma * mu_g - 0.5 * gamma**2 * sigma_g**2

    # Price-dividend ratio
    A = beta * np.exp((1 - gamma) * mu_g + 0.5 * (1 - gamma)**2 * sigma_g**2)
    if A >= 1.0:
        return np.nan, np.nan, np.nan
    kappa = A / (1 - A)

    # Mean log equity return
    r_equity_log = np.log((1 + kappa) / kappa) + mu_g

    # Equity premium (log)
    eq_premium = r_equity_log - rf_log

    return rf_log * 100, eq_premium * 100, kappa

results = np.array([lucas_moments(g, beta_calib, mu_g, sigma_g) for g in gammas_range])
rf_vals, ep_vals, pd_vals = results[:, 0], results[:, 1], results[:, 2]

# Empirical targets (US postwar annual averages)
emp_rf = 0.86    # percent: real riskfree rate
emp_ep = 6.33    # percent: equity premium

fig, axes = plt.subplots(1, 3, figsize=(15, 4))

ax = axes[0]
ax.plot(gammas_range, rf_vals, 'steelblue', lw=2)
ax.axhline(emp_rf, color='tomato', ls='--', lw=1.5, label=f'US data: {emp_rf}%')
ax.set_xlabel('Risk aversion $\\gamma$')
ax.set_ylabel('Log riskfree rate (% p.a.)')
ax.set_title('Riskfree Rate\nvs Risk Aversion')
ax.legend()

ax2 = axes[1]
ax2.plot(gammas_range, ep_vals, 'steelblue', lw=2)
ax2.axhline(emp_ep, color='tomato', ls='--', lw=1.5, label=f'US data: {emp_ep}%')
ax2.set_xlabel('Risk aversion $\\gamma$')
ax2.set_ylabel('Equity premium (% p.a.)')
ax2.set_title('Equity Premium\nvs Risk Aversion')
ax2.set_ylim(-2, 15)
ax2.legend()

ax3 = axes[2]
mask = pd_vals > 0
ax3.plot(gammas_range[mask], pd_vals[mask], 'steelblue', lw=2)
ax3.axhline(30, color='tomato', ls='--', lw=1.5, label='US data: ~30')
ax3.set_xlabel('Risk aversion $\\gamma$')
ax3.set_ylabel('Price-dividend ratio $\\kappa$')
ax3.set_title('Price-Dividend Ratio\nvs Risk Aversion')
ax3.set_ylim(0, 80)
ax3.legend()

plt.tight_layout()
plt.show()

print("Key tension (the twin puzzles):")
print("- Over this range the model premium stays well below ~6% (about 2.5% at gamma=20).")
print("- Yet the riskfree rate is already ~11% at gamma=20 (peak ~13.5% near gamma=14), far above 1%.")
print("The CCAPM cannot simultaneously match both moments.")

---

# Part B — The Capital Asset Pricing Model

---

## 4. Deriving the CAPM from the SDF Framework

The CAPM is the most widely used model in finance. In this section, we derive it rigorously as a special case of the SDF framework, making its assumptions transparent.

### 4.1 Two Equivalent Derivations

The CAPM can be derived under two distinct sets of assumptions:

1. **Mean-variance preferences**: All investors care only about the mean and variance of their portfolio return. (This follows from quadratic utility or from elliptical return distributions.)
2. **CAPM as a linear SDF approximation**: The SDF is linear in the market return.

We pursue the second approach, which reveals the CAPM as a restriction on the form of $M$.

### 4.2 Derivation via a Linear SDF

**Suppose** the SDF takes the linear form:
$$
M = a - b R_m,
$$
where $R_m$ is the gross return on the market portfolio and $a, b > 0$ are constants.

Apply $E[MR^i] = 1$ to any asset $i$ and to the riskfree asset:
$$
E[(a - bR_m)R^i] = 1, \quad (a - b E[R_m]) R_f = 1.
$$
From the riskfree equation: $a = R_f^{-1} + b E[R_m]$. From the general equation:
$$
a E[R^i] - b E[R_m R^i] = 1.
$$
Using $E[R_m R^i] = E[R_m]E[R^i] + \text{Cov}(R_m, R^i)$ and $a - b E[R_m] = R_f^{-1}$ from the riskfree equation:
$$
E[R^i] - R_f = b R_f \,\text{Cov}(R_m, R^i).
$$
Applying the same equation to the market portfolio itself gives $E[R_m] - R_f = b R_f \,\text{Var}(R_m)$. Dividing the two equations eliminates $b R_f$, and with $\beta_i = \text{Cov}(R_m, R^i) / \text{Var}(R_m)$:
$$
\boxed{E[R^i] - R_f = \beta_i (E[R_m] - R_f).}
$$
This is the **Security Market Line (SML)** of the CAPM.
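The derivation can be checked numerically. The sketch below is an illustration under assumed inputs (simulated market returns, arbitrary payoffs, hypothetical parameter values): it picks $a$ and $b$ so that $M = a - bR_m$ prices the riskfree asset and the market in the sample, prices a handful of payoffs with $M$, and confirms that the resulting returns lie exactly on the SML when sample moments are used throughout.

```python
import numpy as np

rng_ex = np.random.default_rng(0)
n_obs, n_assets_ex = 10_000, 5
Rf_ex = 1.02                                             # gross riskfree rate (assumed)
R_m = 1.08 + 0.20 * rng_ex.standard_normal(n_obs)        # hypothetical gross market returns

# Choose a, b so that E[M] R_f = 1 and E[M R_m] = 1 hold in this sample
ER_m, var_m_ex = R_m.mean(), R_m.var()
b = (ER_m - Rf_ex) / (Rf_ex * var_m_ex)
a = 1.0 / Rf_ex + b * ER_m
M = a - b * R_m

# Arbitrary payoffs: market exposure plus noise, priced with p = E[M x]
loadings = rng_ex.uniform(0.2, 1.5, n_assets_ex)
noise = 0.1 * rng_ex.standard_normal((n_obs, n_assets_ex))
payoffs = 1.0 + loadings * (R_m[:, None] - 1.0) + noise
prices = (M[:, None] * payoffs).mean(axis=0)
R = payoffs / prices                                     # returns satisfy E[M R^i] = 1

# Sample betas and the gap between mean excess returns and the SML prediction
cov_im = ((R - R.mean(axis=0)) * (R_m - ER_m)[:, None]).mean(axis=0)
betas_ex = cov_im / var_m_ex
sml_gap = (R.mean(axis=0) - Rf_ex) - betas_ex * (ER_m - Rf_ex)

print("Betas:  ", np.round(betas_ex, 3))
print("SML gap:", sml_gap)
```

The gap is zero up to floating-point error even though the payoffs contain noise: pricing with a linear SDF is what places every return on the line, not any property of the payoffs themselves.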

### 4.3 Economic Interpretation

The CAPM has a simple and powerful message: **the only priced risk is systematic risk** — covariance with the market portfolio. Idiosyncratic risk is not compensated because it can be diversified away.

Let us make this precise. Write any return as:
$$
R^i = \alpha_i + \beta_i R_m + \varepsilon_i,
$$
where $\text{Cov}(R_m, \varepsilon_i) = 0$ (market model decomposition). Then:
$$
\text{Var}(R^i) = \beta_i^2 \text{Var}(R_m) + \text{Var}(\varepsilon_i).
$$
The total variance is split into **systematic** ($\beta_i^2 \text{Var}(R_m)$) and **idiosyncratic** ($\text{Var}(\varepsilon_i)$) components. The CAPM says only the systematic component is priced:
$$
E[R^i] - R_f = \beta_i \lambda_m, \quad \lambda_m \equiv E[R_m] - R_f.
$$
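The risk premium is proportional to $\beta_i$, the loading on market risk. Taking expectations of the market model, the CAPM pins the intercept at $\alpha_i = (1 - \beta_i) R_f$; equivalently, Jensen's alpha (the intercept when the regression is run in excess returns) is zero for every asset. Any non-zero Jensen's alpha is a deviation from the CAPM.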
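The variance decomposition holds exactly in sample when $\beta_i$ is estimated by OLS, because the residual is orthogonal to the regressor by construction. A minimal check on simulated data (all parameter values are hypothetical):

```python
import numpy as np

rng_ex = np.random.default_rng(1)
T_ex = 600
R_m = 0.08 + 0.20 * rng_ex.standard_normal(T_ex)                # hypothetical market returns
R_i = 0.01 + 1.3 * R_m + 0.15 * rng_ex.standard_normal(T_ex)    # asset with true beta 1.3

# OLS slope and intercept of R_i on R_m (population-style moments, ddof=0)
beta_hat = np.cov(R_i, R_m, ddof=0)[0, 1] / R_m.var()
alpha_hat = R_i.mean() - beta_hat * R_m.mean()
resid = R_i - alpha_hat - beta_hat * R_m

var_total_ex = R_i.var()
var_systematic_ex = beta_hat ** 2 * R_m.var()
var_idiosyncratic_ex = resid.var()

print(f"beta_hat               = {beta_hat:.3f}")
print(f"Total variance         = {var_total_ex:.5f}")
print(f"Systematic + idiosync. = {var_systematic_ex + var_idiosyncratic_ex:.5f}")
print(f"Share systematic (R^2) = {var_systematic_ex / var_total_ex:.3f}")
```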

In [None]:
# Illustrate the Security Market Line and beta decomposition
# Simulate 40 assets with different betas and compute expected returns

Rf_capm  = 0.02     # annual riskfree rate
E_Rm     = 0.08     # expected market return
lam_m    = E_Rm - Rf_capm  # market risk premium
var_m    = 0.04     # market variance (sigma_m = 20%)

# Simulate true betas across assets
n_assets = 40
betas_true = rng.uniform(0.0, 2.0, n_assets)

# CAPM-predicted expected returns
E_R_capm = Rf_capm + betas_true * lam_m

# Add a small alpha (deviation from CAPM) for illustration
alpha = rng.normal(0, 0.005, n_assets)
E_R_actual = E_R_capm + alpha

# SML grid
beta_grid = np.linspace(-0.2, 2.2, 200)
sml = Rf_capm + beta_grid * lam_m

fig, axes = plt.subplots(1, 2, figsize=(13, 5))

ax = axes[0]
ax.plot(beta_grid, sml * 100, 'k-', lw=2, label='Security Market Line')
ax.scatter(betas_true, E_R_actual * 100,
           c=alpha * 1000, cmap='RdYlGn', s=60, zorder=5,
           edgecolors='black', linewidths=0.5)
ax.axhline(Rf_capm * 100, color='grey', lw=0.8, ls='--', label=f'$R_f = {Rf_capm*100:.0f}\\%$')
ax.set_xlabel('Beta $\\beta_i$')
ax.set_ylabel('Expected return (% p.a.)')
ax.set_title('The Security Market Line\n(green = positive alpha, red = negative alpha)')
ax.legend()

# Panel 2: variance decomposition — systematic vs idiosyncratic
sigma_eps = rng.uniform(0.05, 0.30, n_assets)  # idiosyncratic vol
var_systematic = betas_true**2 * var_m
var_idiosyncratic = sigma_eps**2
var_total = var_systematic + var_idiosyncratic
r_squared = var_systematic / var_total  # proportion of variance explained by market

sort_idx = np.argsort(betas_true)
x_pos = np.arange(n_assets)

ax2 = axes[1]
ax2.bar(x_pos, var_systematic[sort_idx] * 100, label='Systematic variance', color='steelblue')
ax2.bar(x_pos, var_idiosyncratic[sort_idx] * 100, bottom=var_systematic[sort_idx] * 100,
        label='Idiosyncratic variance', color='#e74c3c', alpha=0.7)
ax2.set_xlabel('Asset (sorted by $\\beta$, low to high)')
ax2.set_ylabel('Variance (×100)')
ax2.set_title('Return Variance Decomposition\nOnly Systematic Variance is Priced by CAPM')
ax2.legend()
ax2.set_xticks([])

plt.tight_layout()
plt.show()

print(f"Market risk premium: {lam_m*100:.1f}% p.a.")
print(f"Average R-squared (systematic / total var): {r_squared.mean():.2f}")
print(f"For an asset with beta=1: E[R] = {(Rf_capm + lam_m)*100:.1f}% — same as the market.")

### 4.4 The Mean-Variance Frontier and the Sharpe Ratio

A key geometric property of the CAPM is that the **market portfolio lies on the mean-variance efficient frontier** — no other portfolio of the same variance has a higher expected return.

For any portfolio $p$ on the efficient frontier that includes the riskfree asset (the capital market line), with expected return $\mu_p$ and standard deviation $\sigma_p$:
$$
\frac{\mu_p - R_f}{\sigma_p} = \frac{E[R_m] - R_f}{\sigma_m} \equiv SR_m \quad (\text{Sharpe ratio of the market}).
$$
All efficient portfolios achieve the same Sharpe ratio as the market. Any portfolio off the frontier has a lower Sharpe ratio.
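When short sales are unrestricted, the maximum-Sharpe-ratio (tangency) portfolio has a closed form: weights proportional to $\Sigma^{-1}(\mu - R_f \mathbf{1})$, with maximal Sharpe ratio $\sqrt{(\mu - R_f\mathbf{1})^\top \Sigma^{-1} (\mu - R_f\mathbf{1})}$. The sketch below checks this on a hypothetical three-asset example; the inputs are made up for illustration.

```python
import numpy as np

mu_ex = np.array([0.06, 0.08, 0.10])                 # hypothetical expected returns
vols_ex = np.array([0.15, 0.18, 0.22])               # hypothetical volatilities
corr_ex = np.full((3, 3), 0.3) + 0.7 * np.eye(3)     # common pairwise correlation of 0.3
Sigma_ex = np.outer(vols_ex, vols_ex) * corr_ex
Rf_ex = 0.02

excess = mu_ex - Rf_ex
raw = np.linalg.solve(Sigma_ex, excess)              # Sigma^{-1} (mu - Rf)
w_tan = raw / raw.sum()                              # normalise so weights sum to one

def sharpe(w):
    """Sharpe ratio of a fully invested portfolio with weights w."""
    return (w @ mu_ex - Rf_ex) / np.sqrt(w @ Sigma_ex @ w)

max_sharpe = np.sqrt(excess @ raw)

# No randomly drawn long-only portfolio should beat the tangency portfolio
rng_ex = np.random.default_rng(2)
random_w = rng_ex.dirichlet(np.ones(3), size=1000)
random_sharpes = np.array([sharpe(w) for w in random_w])

print("Tangency weights:     ", np.round(w_tan, 3))
print(f"Tangency Sharpe ratio: {sharpe(w_tan):.4f}")
print(f"Closed-form maximum:   {max_sharpe:.4f}")
print(f"Best random portfolio: {random_sharpes.max():.4f}")
```

A numerical optimiser with position bounds can give a different answer from this formula whenever the bounds bind, since the closed form places no restriction on the weights.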

**Connection to the SDF**: We showed in Week 1 that $\text{std}(M)/E[M] \geq$ Sharpe ratio. When $M = a - bR_m$ with $b > 0$ is linear in the market, $\text{Corr}(M, R_m) = -1$ (perfect negative correlation), so the HJ bound holds with equality at the market's Sharpe ratio. The CAPM SDF is the minimum-variance SDF consistent with the market data.
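The equality can be verified directly. In the sketch below (simulated market returns with assumed mean, volatility, and riskfree rate), $a$ and $b$ are chosen so that the linear SDF prices both the riskfree asset and the market; the ratio $\text{std}(M)/E[M]$ then coincides with the market's Sharpe ratio.

```python
import numpy as np

rng_ex = np.random.default_rng(3)
Rf_ex = 1.02                                            # gross riskfree rate (assumed)
R_m = 1.08 + 0.20 * rng_ex.standard_normal(50_000)      # hypothetical gross market returns

# Linear SDF calibrated to price R_f and R_m in this sample
ER_m, sd_m = R_m.mean(), R_m.std()
b = (ER_m - Rf_ex) / (Rf_ex * sd_m ** 2)
a = 1.0 / Rf_ex + b * ER_m
M = a - b * R_m

sdf_ratio = M.std() / M.mean()                 # left side of the HJ bound
market_sharpe = (ER_m - Rf_ex) / sd_m          # right side, evaluated at the market
corr_M_Rm = np.corrcoef(M, R_m)[0, 1]

print(f"std(M)/E[M]         = {sdf_ratio:.6f}")
print(f"Market Sharpe ratio = {market_sharpe:.6f}")
print(f"Corr(M, R_m)        = {corr_M_Rm:.6f}")
```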

In [None]:
# Construct the mean-variance frontier from simulated assets
# Show that the market portfolio lies on the frontier

n_assets_mv = 6

# True expected returns and covariance matrix
mu_assets = np.array([0.06, 0.08, 0.10, 0.07, 0.12, 0.09])
# Correlation matrix (symmetric, positive definite)
corr = np.array([
    [1.00, 0.50, 0.30, 0.40, 0.20, 0.35],
    [0.50, 1.00, 0.45, 0.35, 0.25, 0.40],
    [0.30, 0.45, 1.00, 0.20, 0.55, 0.30],
    [0.40, 0.35, 0.20, 1.00, 0.15, 0.50],
    [0.20, 0.25, 0.55, 0.15, 1.00, 0.25],
    [0.35, 0.40, 0.30, 0.50, 0.25, 1.00],
])
vols = np.array([0.15, 0.18, 0.22, 0.16, 0.28, 0.20])
Sigma = np.outer(vols, vols) * corr

# Compute the efficient frontier by parametric optimisation
# For each target return mu_p, minimise portfolio variance
from scipy.optimize import minimize

def portfolio_variance(w, Sigma):
    return w @ Sigma @ w

def portfolio_return(w, mu):
    return w @ mu

Rf_mv = 0.02
target_returns = np.linspace(0.04, 0.15, 60)
frontier_vols = []

for mu_target in target_returns:
    constraints = [
        {'type': 'eq', 'fun': lambda w: w.sum() - 1},
        {'type': 'eq', 'fun': lambda w, m=mu_target: portfolio_return(w, mu_assets) - m}
    ]
    bounds = [(-0.5, 2.0)] * n_assets_mv
    w0 = np.ones(n_assets_mv) / n_assets_mv
    res = minimize(portfolio_variance, w0, args=(Sigma,),
                   method='SLSQP', bounds=bounds, constraints=constraints,
                   options={'ftol': 1e-12, 'maxiter': 1000})
    frontier_vols.append(np.sqrt(res.fun) if res.success else np.nan)

frontier_vols = np.array(frontier_vols)

# Tangency portfolio (maximum Sharpe ratio) — lies on Capital Market Line
def neg_sharpe(w, mu, Sigma, Rf):
    ret = w @ mu
    vol = np.sqrt(w @ Sigma @ w)
    return -(ret - Rf) / vol

constraints_tang = [{'type': 'eq', 'fun': lambda w: w.sum() - 1}]
bounds_tang = [(-0.5, 2.0)] * n_assets_mv
res_tang = minimize(neg_sharpe, np.ones(n_assets_mv)/n_assets_mv,
                    args=(mu_assets, Sigma, Rf_mv),
                    method='SLSQP', bounds=bounds_tang, constraints=constraints_tang)
w_tang = res_tang.x
mu_tang = w_tang @ mu_assets
sigma_tang = np.sqrt(w_tang @ Sigma @ w_tang)
SR_tang = (mu_tang - Rf_mv) / sigma_tang

# Capital market line
cml_vols = np.linspace(0, 0.35, 200)
cml_rets = Rf_mv + SR_tang * cml_vols

fig, ax = plt.subplots(figsize=(9, 6))
ax.plot(frontier_vols * 100, target_returns * 100, 'steelblue', lw=2.5, label='Mean-variance frontier')
ax.plot(cml_vols * 100, cml_rets * 100, 'k--', lw=1.5, label='Capital Market Line')
ax.scatter(vols * 100, mu_assets * 100, s=80, color='grey', zorder=5, label='Individual assets')
for i, (v, m) in enumerate(zip(vols, mu_assets)):
    ax.annotate(f'Asset {i+1}', (v*100, m*100), textcoords='offset points', xytext=(5, 2), fontsize=8)
ax.scatter(sigma_tang * 100, mu_tang * 100, s=150, color='#e74c3c', zorder=6,
           label=f'Tangency (market) portfolio\nSharpe = {SR_tang:.2f}')
ax.scatter(0, Rf_mv * 100, s=100, color='black', marker='*', zorder=6, label=f'Riskfree: {Rf_mv*100:.0f}%')
ax.set_xlabel('Portfolio standard deviation (%)')
ax.set_ylabel('Expected return (%)')
ax.set_title('Mean-Variance Frontier and the Capital Market Line')
ax.legend(fontsize=9)
plt.tight_layout()
plt.show()

print(f"Tangency portfolio Sharpe ratio: {SR_tang:.3f}")
print(f"Tangency weights: {np.round(w_tang, 3)}")

### 4.5 Empirical Implementation: Fama-MacBeth

The standard empirical procedure for testing the CAPM is the **Fama-MacBeth (1973) two-pass regression**:

**Pass 1 (time-series)**: For each asset $i$, run a time-series OLS regression:
$$
R^i_t - R_{f,t} = \alpha_i + \beta_i (R_{m,t} - R_{f,t}) + \varepsilon_{i,t}.
$$
Obtain the estimated betas $\hat{\beta}_i$.

**Pass 2 (cross-section)**: For each period $t$, run a cross-sectional OLS regression:
$$
R^i_t - R_{f,t} = \lambda_{0,t} + \lambda_{1,t} \hat{\beta}_i + \eta_{i,t}.
$$
Average the estimated coefficients $\hat{\lambda}_{0,t}$ and $\hat{\lambda}_{1,t}$ over time. The CAPM predicts:
- $E[\hat{\lambda}_0] = 0$ (intercept should be zero: no alpha for a zero-beta asset above $R_f$)
- $E[\hat{\lambda}_1] = E[R_m] - R_f > 0$ (the market risk premium)

Empirically, the CAPM fails both tests: $\hat{\lambda}_0 > 0$ (low-beta assets earn too much) and $\hat{\lambda}_1 < E[R_m] - R_f$ (the slope is too flat). This is the **flatness of the SML** — a persistent empirical failure of the CAPM documented since the 1970s.

In [None]:
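# Fama-MacBeth procedure: simulate panel data and implement both passes
# We simulate a world where the TRUE SDF includes a second factor (size),
# so the CAPM is misspecified. Size loadings are orthogonal to betas here, so the
# omitted premium shows up as scatter around the fitted line, not as a flatter slope.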

T_fm  = 480    # months (40 years)
N_fm  = 25     # assets (5x5 portfolios by beta and size)

# True parameters
lam_mkt_true  = 0.005    # true monthly market risk premium
lam_size_true = 0.003    # true monthly size premium (omitted from CAPM)
Rf_fm = 0.001            # monthly riskfree rate

sigma_m = 0.045          # monthly market vol
sigma_e = 0.06           # idiosyncratic vol

# True betas and size loadings (independent, uniformly spaced)
beta_grid_fm = np.linspace(0.3, 1.7, 5)
size_grid    = np.linspace(-0.5, 0.5, 5)
beta_true_fm = np.tile(beta_grid_fm, 5)      # 25 assets
size_true    = np.repeat(size_grid, 5)        # 25 assets

# Simulate market factor and size factor returns
f_mkt  = rng.normal(lam_mkt_true, sigma_m, T_fm)
f_size = rng.normal(lam_size_true, 0.03, T_fm)

# Simulate excess returns: e_it = alpha_i + beta_i * f_mkt + s_i * f_size + eps_it
eps = rng.normal(0, sigma_e, (T_fm, N_fm))
excess_returns = (beta_true_fm[None, :] * f_mkt[:, None]
                 + size_true[None, :] * f_size[:, None]
                 + eps)  # alpha_i = 0 by construction

# ---- PASS 1: time-series regressions to estimate betas ----
X_ts = np.column_stack([np.ones(T_fm), f_mkt])  # [1, f_mkt]
betas_estimated = np.empty(N_fm)
alphas_estimated = np.empty(N_fm)
for i in range(N_fm):
    coef = np.linalg.lstsq(X_ts, excess_returns[:, i], rcond=None)[0]
    alphas_estimated[i] = coef[0]
    betas_estimated[i]  = coef[1]

# ---- PASS 2: cross-sectional regressions each month ----
lam0_t = np.empty(T_fm)
lam1_t = np.empty(T_fm)
for t in range(T_fm):
    X_cs = np.column_stack([np.ones(N_fm), betas_estimated])
    coef_cs = np.linalg.lstsq(X_cs, excess_returns[t, :], rcond=None)[0]
    lam0_t[t] = coef_cs[0]
    lam1_t[t] = coef_cs[1]

lam0_mean = lam0_t.mean()
lam1_mean = lam1_t.mean()
lam0_se   = lam0_t.std(ddof=1) / np.sqrt(T_fm)   # Fama-MacBeth SE: sample std of the monthly estimates
lam1_se   = lam1_t.std(ddof=1) / np.sqrt(T_fm)

# Plot
fig, axes = plt.subplots(1, 2, figsize=(13, 5))

ax = axes[0]
ax.scatter(beta_true_fm, betas_estimated, c=size_true, cmap='RdYlGn', s=70, edgecolors='k', lw=0.4)
ax.plot([0.2, 1.8], [0.2, 1.8], 'k--', lw=1, label='45-degree line')
ax.set_xlabel('True $\\beta$')
ax.set_ylabel('Estimated $\\hat{\\beta}$')
ax.set_title('Pass 1: Estimated vs True Beta\n(color = size loading)')
ax.legend()

ax2 = axes[1]
beta_plot = np.linspace(0.2, 1.9, 100)
ax2.scatter(betas_estimated, excess_returns.mean(axis=0) * 1200,
            c=size_true, cmap='RdYlGn', s=70, edgecolors='k', lw=0.4,
            label='Average excess return (annualised)')
# True SML
ax2.plot(beta_plot, (lam_mkt_true * beta_plot) * 1200, 'k-', lw=2,
         label=f'True SML (slope={lam_mkt_true*1200:.2f}%)')
# Fama-MacBeth estimated SML
ax2.plot(beta_plot, (lam0_mean + lam1_mean * beta_plot) * 1200,
         color='tomato', lw=2, ls='--',
         label=f'FM SML: $\\hat{{\\lambda}}_0$={lam0_mean*1200:.2f}%, $\\hat{{\\lambda}}_1$={lam1_mean*1200:.2f}%')
ax2.set_xlabel('Estimated beta $\\hat{\\beta}$')
ax2.set_ylabel('Average excess return (% p.a.)')
ax2.set_title('Pass 2: Fama-MacBeth Cross-Section\nPricing Errors when CAPM is Misspecified')
ax2.legend(fontsize=9)

plt.tight_layout()
plt.show()

print(f"Fama-MacBeth results (annualised):")
print(f"  λ₀ (intercept): {lam0_mean*1200:.3f}% ± {lam0_se*1200:.3f}%   [CAPM predicts 0]")
print(f"  λ₁ (slope):     {lam1_mean*1200:.3f}% ± {lam1_se*1200:.3f}%   [CAPM predicts {lam_mkt_true*1200:.2f}%]")
print("\nSize loadings are orthogonal to beta here, so the omitted size premium appears as")
print("pricing errors around the fitted line rather than as a flatter slope.")

---

# Part C — The Equity Premium Puzzle

---

## 5. The Mehra-Prescott Puzzle

We established in Section 3 that the CCAPM links the equity premium to the covariance between equity returns and consumption growth. In 1985, Rajnish Mehra and Edward Prescott made this quantitative, and found a devastating result.

### 5.1 The Theoretical Equity Premium Under CRRA

With CRRA utility and i.i.d. lognormal consumption growth:
$$
M_{t+1} = \beta \left(\frac{C_{t+1}}{C_t}\right)^{-\gamma}.
$$
The risk premium equation gives:
$$
E[R] - R_f = -R_f \text{Cov}(M, R) \approx \gamma \cdot \text{Cov}\left(\frac{C_{t+1}}{C_t}, R_{t+1}\right).
$$
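More precisely, for lognormal returns and consumption growth:
$$
E[r_{t+1}] - r_f + \tfrac{1}{2}\text{Var}(r_{t+1}) = \gamma \cdot \text{Cov}(g_{t+1}, r_{t+1}),
$$
where $r = \ln R$ and $g = \ln(C_{t+1}/C_t)$. The Jensen term $\tfrac{1}{2}\text{Var}(r_{t+1})$ converts the mean log return into the log of the mean gross return. The back-of-the-envelope calculations that follow drop it, which understates the required $\gamma$.

**This is a critical equation**: the equity premium, adjusted for the Jensen term, equals the coefficient of relative risk aversion times the covariance between log consumption growth and log equity returns.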
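The Lucas economy of Section 3 offers a closed-form check on the lognormal pricing relation, because there $r_{t+1}$ equals a constant plus $g_{t+1}$, so $\text{Var}(r) = \text{Cov}(g, r) = \sigma_g^2$. The sketch below (with $\gamma = 5$ chosen arbitrarily) confirms that the log premium plus half the return variance equals $\gamma\,\text{Cov}(g, r)$.

```python
import numpy as np

beta_ex, gamma_ex, mu_g_ex, sigma_g_ex = 0.99, 5.0, 0.018, 0.036

# Price-dividend ratio from Section 3
A_ex = beta_ex * np.exp((1 - gamma_ex) * mu_g_ex
                        + 0.5 * (1 - gamma_ex) ** 2 * sigma_g_ex ** 2)
kappa_ex = A_ex / (1 - A_ex)

# Mean log equity return and log riskfree rate from the Section 3 formulas
mean_log_return = np.log((1 + kappa_ex) / kappa_ex) + mu_g_ex
log_rf = -np.log(beta_ex) + gamma_ex * mu_g_ex - 0.5 * gamma_ex ** 2 * sigma_g_ex ** 2

var_r_ex = sigma_g_ex ** 2       # r = const + g, so Var(r) = Var(g)
cov_g_r_ex = sigma_g_ex ** 2     # and Cov(g, r) = Var(g)

lhs = mean_log_return - log_rf + 0.5 * var_r_ex   # log premium plus Jensen term
rhs = gamma_ex * cov_g_r_ex

print(f"E[r] - r_f            = {(mean_log_return - log_rf)*100:.4f}%")
print(f"E[r] - r_f + Var(r)/2 = {lhs*100:.4f}%")
print(f"gamma * Cov(g, r)     = {rhs*100:.4f}%")
```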

### 5.2 The Empirical Magnitudes

Mehra and Prescott (1985) used U.S. data (1889–1978) and found:

| Moment | US Data |
|--------|---------|
| Mean log equity return | 7.0% p.a. |
| Mean log riskfree rate | 0.8% p.a. |
| **Equity premium** | **6.2% p.a.** |
| $\text{std}(g)$ — consumption growth volatility | 3.6% p.a. |
| $\text{Cov}(g, r_{\text{equity}})$ | $\approx 0.002$ |

The model requirement is:
$$
0.062 = \gamma \times 0.002 \implies \gamma \approx 31.
$$
A coefficient of relative risk aversion of 31 is extraordinarily large. To put it in perspective:

- At $\gamma = 31$, an agent would be indifferent between a certain income of roughly \$51,200 and a 50-50 gamble of \$50,000 or \$100,000.
- Arrow (1971) argued that plausible values of $\gamma$ are in the range of 1 to 2.
- Empirical studies of insurance and portfolio choice typically find $\gamma \leq 5$.
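The certainty-equivalent comparison in the list above can be reproduced directly from CRRA utility: the certainty equivalent solves $u(CE) = E[u(W)]$, so $CE = \left(E[W^{1-\gamma}]\right)^{1/(1-\gamma)}$, with the geometric mean as the $\gamma = 1$ limit. The helper name below is introduced for illustration.

```python
import numpy as np

def crra_certainty_equivalent(outcomes, probs, gamma):
    """Certainty equivalent of a discrete gamble under CRRA utility."""
    outcomes = np.asarray(outcomes, dtype=float)
    probs = np.asarray(probs, dtype=float)
    if np.isclose(gamma, 1.0):
        # Log utility: the certainty equivalent is the geometric mean
        return float(np.exp(np.sum(probs * np.log(outcomes))))
    scale = outcomes.min()                       # rescale to avoid very small powers
    expected_power = np.sum(probs * (outcomes / scale) ** (1.0 - gamma))
    return float(scale * expected_power ** (1.0 / (1.0 - gamma)))

gamble = [50_000, 100_000]
fifty_fifty = [0.5, 0.5]

for gamma_ex in (1, 2, 5, 10, 31):
    ce = crra_certainty_equivalent(gamble, fifty_fifty, gamma_ex)
    print(f"gamma = {gamma_ex:>2}: certainty equivalent = ${ce:,.0f}")

ce_31 = crra_certainty_equivalent(gamble, fifty_fifty, 31)
ce_log = crra_certainty_equivalent(gamble, fifty_fifty, 1)
```

As $\gamma$ rises the certainty equivalent collapses toward the worst outcome: a very risk-averse agent values the gamble at barely more than the \$50,000 floor, despite an expected value of \$75,000.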

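Furthermore, no such $\gamma$ is compatible with the observed riskfree rate of about 1%. Under the Section 3 calibration, the formula for $r_f$ gives between about 11% and 13.5% for $\gamma$ between 10 and 20, and at still higher $\gamma$ the precautionary term takes over and drives $r_f$ negative (about $-5\%$ at $\gamma = 31$). This is the **riskfree rate puzzle** (Weil, 1989).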

### 5.3 Why This Is a Genuine Puzzle

The equity premium puzzle is not a failure of a particular parameterisation. It is a structural result: the covariance between consumption growth and equity returns is simply too small, by more than an order of magnitude, to generate the observed equity premium at any reasonable level of risk aversion.

Why is this covariance so small? Because aggregate consumption growth is **smooth**. US per capita consumption growth has a standard deviation of only 3.6% per year. By contrast, stock returns have a standard deviation of roughly 17%. If consumption and stock returns were perfectly correlated, the premium would be $\gamma \sigma_g \sigma_r$, so we would still need $\gamma \approx \text{Sharpe ratio}/\sigma_g \approx 0.4/0.036 \approx 11$, which is still implausibly high. And in practice their correlation is only about 0.2, which raises the required $\gamma$ roughly fivefold.

The puzzle is therefore: **why is aggregate consumption so insulated from stock market risk?** This question has driven three decades of research.

In [None]:
# Quantify the equity premium puzzle precisely
# Simulate the CCAPM moments and compare to data

# US postwar data moments (approximate, annual)
mu_g_us     = 0.018   # mean log consumption growth
sigma_g_us  = 0.036   # std log consumption growth
mu_r_us     = 0.070   # mean log equity return
sigma_r_us  = 0.165   # std log equity return
corr_gc_r   = 0.20    # correlation between consumption growth and equity returns
rf_us_log   = 0.008   # mean log riskfree rate
ep_us       = mu_r_us - rf_us_log  # equity premium in logs

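# Cov(g, r_equity) ≈ 0.0012 with these inputs (the Section 5.2 table rounds to 0.002),
# so the required γ printed below is about 52 rather than 31
cov_g_r = corr_gc_r * sigma_g_us * sigma_r_us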

print("=" * 60)
print("US Postwar Data Moments (Annual)")
print("=" * 60)
print(f"  Mean consumption growth μ_g    = {mu_g_us*100:.1f}%")
print(f"  Std consumption growth  σ_g    = {sigma_g_us*100:.1f}%")
print(f"  Mean log equity return         = {mu_r_us*100:.1f}%")
print(f"  Std log equity return   σ_r    = {sigma_r_us*100:.1f}%")
print(f"  Corr(g, r_equity)              = {corr_gc_r:.2f}")
print(f"  Cov(g, r_equity)               = {cov_g_r:.5f}")
print(f"  Observed equity premium (log)  = {ep_us*100:.1f}%")
print(f"  Log riskfree rate              = {rf_us_log*100:.1f}%")
print()
print("CCAPM model implications: EP = γ × Cov(g, r)")
print("-" * 60)
print(f"  γ required to match EP: {ep_us / cov_g_r:.0f}  ← The Equity Premium Puzzle")

# Now show the twin-puzzle: riskfree rate implied at each gamma
gammas_puzzle = np.array([1, 2, 5, 10, 20, 31, 50])
beta_puzzle = 0.99

print()
print(f"{'γ':>5} | {'Impl. EP (%)':>12} | {'Impl. rf (%)':>12} | {'std(M)/E[M]':>12}")
print("-" * 50)
for gamma in gammas_puzzle:
    ep_model = gamma * cov_g_r * 100
    rf_model = (-np.log(beta_puzzle) + gamma * mu_g_us - 0.5 * gamma**2 * sigma_g_us**2) * 100
    # Coefficient of variation of M
    log_M_var = gamma**2 * sigma_g_us**2
    E_M = beta_puzzle * np.exp(-gamma * mu_g_us + 0.5 * gamma**2 * sigma_g_us**2)
    std_M = E_M * np.sqrt(np.exp(log_M_var) - 1)
    cv_M = std_M / E_M
    print(f"{gamma:>5} | {ep_model:>12.2f} | {rf_model:>12.2f} | {cv_M:>12.4f}")

print(f"\nData targets: EP ≈ {ep_us*100:.1f}%,  rf ≈ {rf_us_log*100:.1f}%")
print(f"HJ bound (min cv_M):  {ep_us/sigma_r_us:.3f}")
print("\nNo single γ can simultaneously match both targets.")

In [None]:
# Visualise the twin puzzles in a single diagram
gammas_plot = np.linspace(0.5, 40, 300)
beta_plot_twin = 0.99

ep_model_vec = gammas_plot * cov_g_r * 100
rf_model_vec = (-np.log(beta_plot_twin) + gammas_plot * mu_g_us
                - 0.5 * gammas_plot**2 * sigma_g_us**2) * 100

fig, axes = plt.subplots(1, 2, figsize=(13, 5))

ax = axes[0]
ax.plot(gammas_plot, ep_model_vec, 'steelblue', lw=2.5, label='CCAPM prediction')
ax.axhline(ep_us * 100, color='tomato', ls='--', lw=2,
           label=f'US data: {ep_us*100:.1f}%')
gamma_needed = ep_us / cov_g_r
ax.axvline(gamma_needed, color='grey', ls=':', lw=1.5, label=f'γ needed ≈ {gamma_needed:.0f}')
ax.set_xlabel('Risk aversion $\\gamma$')
ax.set_ylabel('Equity premium (% p.a.)')
ax.set_title('The Equity Premium Puzzle\nCCAPM cannot match without extreme γ')
ax.legend()
ax.set_ylim(0, 15)

ax2 = axes[1]
ax2.plot(gammas_plot, rf_model_vec, 'steelblue', lw=2.5, label='CCAPM prediction')
ax2.axhline(rf_us_log * 100, color='tomato', ls='--', lw=2,
            label=f'US data: {rf_us_log*100:.1f}%')
ax2.axvline(gamma_needed, color='grey', ls=':', lw=1.5, label=f'γ needed ≈ {gamma_needed:.0f}')
ax2.set_xlabel('Risk aversion $\\gamma$')
ax2.set_ylabel('Log riskfree rate (% p.a.)')
ax2.set_title('The Riskfree Rate Puzzle (Weil 1989)\nModerate γ implies implausibly high rf')
ax2.legend()
ax2.set_ylim(-5, 30)

plt.tight_layout()
plt.suptitle('The Twin Puzzles: A Single γ Cannot Match Both Moments',
             fontsize=13, y=1.02, fontweight='bold')
plt.show()
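
The claim that no single $\gamma$ matches both moments can be made concrete by solving for the two values separately. The sketch below is self-contained and copies the approximate moments from the data cell above; it assumes the same lognormal CRRA formula for the riskfree rate, and the names `rf_ccapm`, `gamma_ep` and `gamma_rf` are illustrative.

```python
import numpy as np
from scipy.optimize import brentq

# Moments copied from the data cell above (annual, approximate)
beta, mu_g, sigma_g = 0.99, 0.018, 0.036
sigma_r, corr, ep_target, rf_target = 0.165, 0.20, 0.062, 0.008

def rf_ccapm(gamma):
    """Log riskfree rate under CRRA with iid lognormal consumption growth."""
    return -np.log(beta) + gamma * mu_g - 0.5 * gamma**2 * sigma_g**2

# gamma that matches the equity premium: EP = gamma * Cov(g, r)
gamma_ep = ep_target / (corr * sigma_g * sigma_r)

# rf(gamma) is hump-shaped with its peak at mu_g / sigma_g**2, so look for the
# crossing with the data on the downward-sloping branch
gamma_peak = mu_g / sigma_g**2
gamma_rf = brentq(lambda g: rf_ccapm(g) - rf_target, gamma_peak, 100.0)

print(f"gamma matching the equity premium: {gamma_ep:.1f}")
print(f"gamma matching the riskfree rate:  {gamma_rf:.1f}")
print(f"model rf at gamma_ep:              {rf_ccapm(gamma_ep)*100:.1f}%")
```

Under these moments the two values of $\gamma$ are far apart, and at the premium-matching $\gamma$ the precautionary-saving term dominates the riskfree rate.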

---

## 6. Deeper Diagnostics: The Hansen-Jagannathan Distance

The equity premium and riskfree rate puzzles are specific moments. A more systematic approach is to ask: how far is the CCAPM SDF from the set of all valid SDFs consistent with the data?

The **Hansen-Jagannathan (HJ) distance** measures the smallest adjustment, in the mean-square norm, that must be made to a model's SDF for it to price every test asset exactly:
$$
\delta^2(M) = \min_{M^*: E[M^* R^i]=1} E[(M - M^*)^2].
$$
A model with smaller HJ distance is closer to explaining the cross-section of asset returns.

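With only two moments (the equity premium and the riskfree rate), the relevant diagnostic is the HJ volatility bound we derived in Week 1. For a general set of test assets, the HJ distance has a closed form as a quadratic form in the model's pricing errors:
$$
\delta^2 = (E[M \mathbf{R}] - \mathbf{1})' \, \Omega^{-1} \, (E[M \mathbf{R}] - \mathbf{1}),
$$
where $\mathbf{R}$ is the vector of gross test asset returns, $E[M \mathbf{R}] - \mathbf{1}$ is the vector of pricing errors, and $\Omega = E[\mathbf{R}\mathbf{R}']$ is the second-moment matrix. When the SDF depends on a parameter vector $b$, the parameters are chosen to minimise $\delta^2$.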

We will use this diagnostic extensively in the empirical sections of the course. For now, the key message is that the CCAPM generates an SDF with far too little volatility to explain the data.
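
As an illustration of the quadratic-form definition, the sketch below computes a sample HJ distance as pricing errors $E[M\mathbf{R}] - \mathbf{1}$ weighted by the inverse second-moment matrix of returns. The returns are simulated purely for illustration (their means and volatilities are made up), and `hj_distance` is a hypothetical helper, not a function used elsewhere in the course.

```python
import numpy as np

def hj_distance(M, R):
    """Sample HJ distance for SDF draws M (T,) and gross returns R (T, N)."""
    T = R.shape[0]
    pricing_errors = (M[:, None] * R).mean(axis=0) - 1.0   # E[M R] - 1
    Omega = R.T @ R / T                                     # E[R R']
    q = pricing_errors @ np.linalg.solve(Omega, pricing_errors)
    return np.sqrt(max(q, 0.0))                             # guard against rounding

rng_demo = np.random.default_rng(0)
T, N = 5000, 3
# Illustrative gross returns: a near-riskless asset and two risky assets
R = 1.0 + rng_demo.normal([0.01, 0.05, 0.08], [0.01, 0.15, 0.20], size=(T, N))

M_const = np.full(T, 0.99)                        # constant (risk-neutral) candidate
Omega = R.T @ R / T
M_star = R @ np.linalg.solve(Omega, np.ones(N))   # SDF in the span of R, prices R in sample

d_const = hj_distance(M_const, R)
d_star = hj_distance(M_star, R)
print(f"HJ distance, constant SDF:   {d_const:.4f}")
print(f"HJ distance, projection SDF: {d_star:.2e}")
```

The projection SDF prices the test assets exactly in sample by construction, so its distance is zero up to rounding; a constant SDF cannot price the risky assets and sits well away from the admissible set.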

---

# Part D — Responses to the Puzzle

---

## 7. Habit Formation

The equity premium puzzle arises because aggregate consumption is smooth. One response is to change the preferences in a way that makes the marginal utility of consumption *more volatile* even when consumption itself is not.

### 7.1 External Habit (Catching Up With the Joneses)

**Abel (1990)** proposed that utility depends on consumption relative to a habit level $X_t$:
$$
u(C_t, X_t) = \frac{(C_t/X_t)^{1-\gamma}}{1-\gamma}.
$$
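If habit is external (based on aggregate, not individual, past consumption), the agent takes $X_t$ as given, marginal utility is $\partial u/\partial C_t = (C_t/X_t)^{-\gamma}/X_t$, and the SDF is:
$$
M_{t+1} = \beta \left(\frac{C_{t+1}/X_{t+1}}{C_t/X_t}\right)^{-\gamma} \frac{X_t}{X_{t+1}}.
$$
If $X_t = C_{t-1}$ (habit is last period's aggregate consumption), then $C_t/X_t = C_t/C_{t-1}$ and the SDF becomes $\beta \, (C_{t+1}/C_t)^{-\gamma} (C_t/C_{t-1})^{\gamma-1}$. It depends on lagged as well as current consumption growth, which raises its unconditional volatility whenever $\gamma \neq 1$.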

### 7.2 The Campbell-Cochrane (1999) Model

The most influential habit formation model is by **Campbell and Cochrane (1999)**. They use an external habit that responds slowly to past aggregate consumption, summarised by a slowly varying surplus consumption ratio.

**Preferences**: $u(C_t, X_t) = \frac{(C_t - X_t)^{1-\gamma}}{1-\gamma}$, where $X_t$ is the habit level and the **surplus consumption ratio** is:
$$
S_t \equiv \frac{C_t - X_t}{C_t} \in (0, 1).
$$
When $S_t$ is small, the agent is close to the subsistence level (habit), and marginal utility $u'(C_t - X_t) = (C_t - X_t)^{-\gamma} = (S_t C_t)^{-\gamma}$ is very high.

**The SDF**:
$$
M_{t+1} = \beta \left(\frac{C_{t+1} - X_{t+1}}{C_t - X_t}\right)^{-\gamma} = \beta \left(\frac{S_{t+1} C_{t+1}}{S_t C_t}\right)^{-\gamma}.
$$
Even if consumption growth $C_{t+1}/C_t$ is smooth, if $S_{t+1}/S_t$ is volatile, the SDF is volatile — and risk premia can be large.

**Campbell-Cochrane specification**: The log surplus consumption ratio $s_t = \ln S_t$ follows:
$$
s_{t+1} = (1-\phi)\bar{s} + \phi s_t + \lambda(s_t)(g_{t+1} - \mu_g),
$$
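where $\phi$ is the persistence, $\bar{s}$ is the steady-state log surplus ratio, and $\lambda(s_t)$ is a **sensitivity function** chosen so that the riskfree rate is constant:
$$
\lambda(s_t) = \begin{cases} \dfrac{1}{\bar{S}} \sqrt{1 - 2(s_t - \bar{s})} - 1, & s_t \le s_{\max}, \\ 0, & s_t > s_{\max}, \end{cases} \qquad \bar{S} = \sigma_g \sqrt{\frac{\gamma}{1-\phi}},
$$
with $s_{\max} = \bar{s} + \tfrac{1}{2}(1 - \bar{S}^2)$ the value of $s_t$ at which the first branch reaches zero.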

**Why this works**: At the steady state, $\lambda(\bar{s}) = 1/\bar{S} - 1$. Since $\bar{S}$ is small (habit is close to consumption), $\lambda$ is large: a small consumption shock generates a large shock to the surplus ratio and hence to marginal utility. The local curvature of utility (the effective risk aversion) is $\gamma/S_t$, which is much larger than $\gamma$ alone when $S_t$ is small.
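
To put numbers on this amplification, the sketch below evaluates the steady-state quantities for the parameter values used in the simulation cell that follows ($\gamma = 2$, $\phi = 0.87$, $\sigma_g = 0.0109$). The helper `cc_steady_state` is illustrative. It relies on the fact that the innovation to $\ln M_{t+1}$ is $-\gamma(1 + \lambda(s_t))(g_{t+1} - \mu_g)$, so the conditional volatility of the log SDF at the steady state is $\gamma \sigma_g / \bar{S} = \sqrt{\gamma(1-\phi)}$.

```python
import numpy as np

def cc_steady_state(gamma, phi, sigma_g):
    """Steady-state quantities of the Campbell-Cochrane surplus-ratio process."""
    S_bar = sigma_g * np.sqrt(gamma / (1 - phi))   # steady-state surplus ratio
    lam_bar = 1 / S_bar - 1                        # sensitivity at s = s_bar
    eff_ra = gamma / S_bar                         # local curvature gamma / S_bar
    sdf_vol = gamma * (1 + lam_bar) * sigma_g      # conditional std of log M at s_bar
    return S_bar, lam_bar, eff_ra, sdf_vol

S_bar_demo, lam_bar, eff_ra, sdf_vol = cc_steady_state(gamma=2.0, phi=0.87, sigma_g=0.0109)

print(f"S_bar                 = {S_bar_demo:.4f}")
print(f"lambda(s_bar)         = {lam_bar:.1f}")
print(f"effective RA          = {eff_ra:.1f}")
print(f"std of log M at s_bar = {sdf_vol:.3f}  (CRRA with same gamma: {2.0 * 0.0109:.3f})")
```

Note that the steady-state SDF volatility depends only on $\gamma$ and $\phi$: the sensitivity function is built so that consumption volatility cancels out.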

In [None]:
# Simulate the Campbell-Cochrane (1999) model
# and show how it amplifies effective risk aversion

# Parameters (loosely based on the Campbell-Cochrane calibration)
gamma_cc = 2.0          # curvature parameter (deceptively low)
phi_cc   = 0.87         # persistence of log surplus ratio, applied per month here
                        # (CC report 0.87 at annual frequency, i.e. 0.87**(1/12) per month)
mu_g_cc  = 0.0015       # monthly mean consumption growth
sigma_g_cc = 0.0109     # monthly std of consumption growth
beta_cc  = 0.89**(1/12) # monthly discount factor

# Steady-state surplus ratio
S_bar = sigma_g_cc * np.sqrt(gamma_cc / (1 - phi_cc))
s_bar = np.log(S_bar)

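def sensitivity_fn(s, s_bar, S_bar, phi):
    """Campbell-Cochrane sensitivity function lambda(s).

    Equals (1/S_bar)*sqrt(1 - 2*(s - s_bar)) - 1 for s <= s_max and 0 above,
    where s_max = s_bar + 0.5*(1 - S_bar**2). `phi` is unused and kept only
    so that existing calls keep working.
    """
    arg = 1 - 2*(s - s_bar)
    if arg <= 0:
        return 0.0
    # Clamp at zero: the raw expression turns negative just above s_max
    return max((1/S_bar) * np.sqrt(arg) - 1, 0.0)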

T_cc = 1200   # 100 years monthly

# Simulate consumption growth shocks
g_shocks = rng.normal(mu_g_cc, sigma_g_cc, T_cc)

# Simulate surplus consumption ratio
s = np.empty(T_cc + 1)
s[0] = s_bar  # start at steady state
lam_arr = np.empty(T_cc)

for t in range(T_cc):
    lam = sensitivity_fn(s[t], s_bar, S_bar, phi_cc)
    lam_arr[t] = lam
    # Enforce upper bound s_{t+1} <= s_max = s_bar + 0.5*(1 - S_bar^2)
    s_next = (1 - phi_cc) * s_bar + phi_cc * s[t] + lam * (g_shocks[t] - mu_g_cc)
    s_max  = s_bar + 0.5 * (1 - S_bar**2)
    s[t+1] = min(s_next, s_max)

S = np.exp(s)  # surplus consumption ratio

# Effective risk aversion = gamma / S_t
eff_RA = gamma_cc / S

# Compute SDF
log_M_cc = np.log(beta_cc) + (-gamma_cc) * (
    np.log(S[1:]) - np.log(S[:-1]) + g_shocks
)
M_cc = np.exp(log_M_cc)

fig, axes = plt.subplots(2, 2, figsize=(14, 9))

years = np.arange(T_cc + 1) / 12

ax = axes[0, 0]
ax.plot(years, S * 100, color='steelblue', lw=1)
ax.axhline(S_bar * 100, color='tomato', ls='--', lw=1.5, label=f'Steady state $\\bar{{S}}={S_bar*100:.1f}\\%$')
ax.set_xlabel('Years')
ax.set_ylabel('Surplus ratio $S_t$ (%)')
ax.set_title('Surplus Consumption Ratio $S_t = (C_t - X_t)/C_t$')
ax.legend()

ax2 = axes[0, 1]
ax2.plot(years, eff_RA, color='#e74c3c', lw=1)
ax2.axhline(gamma_cc, color='black', ls='--', lw=1.5, label=f'Curvature γ={gamma_cc}')
ax2.set_xlabel('Years')
ax2.set_ylabel('Effective risk aversion $\\gamma / S_t$')
ax2.set_title('Time-Varying Effective Risk Aversion')
ax2.set_ylim(0, 100)
ax2.legend()

ax3 = axes[1, 0]
ax3.plot(years[1:], M_cc, color='#27ae60', lw=0.8, alpha=0.7)
ax3.set_xlabel('Years')
ax3.set_ylabel('SDF $M_{t+1}$')
ax3.set_title('Stochastic Discount Factor\n(Campbell-Cochrane model)')

ax4 = axes[1, 1]
# Compare std(M)/E[M] to HJ bound
# Monthly Sharpe ratio for equity
monthly_SR = 0.4 / np.sqrt(12)
cv_M_cc = M_cc.std() / M_cc.mean()
cv_crra = {}
for g in [2, 5, 10]:
    log_M_var_crra = g**2 * sigma_g_cc**2
    E_M_c = np.exp(-g * mu_g_cc + 0.5 * log_M_var_crra)
    std_M_c = E_M_c * np.sqrt(np.exp(log_M_var_crra) - 1)
    cv_crra[g] = std_M_c / E_M_c

models = ['HJ bound\n(data)', 'CRRA γ=2', 'CRRA γ=5', 'CRRA γ=10', 'Campbell-\nCochrane']
values = [monthly_SR, cv_crra[2], cv_crra[5], cv_crra[10], cv_M_cc]
colors_bar = ['tomato', '#95a5a6', '#7f8c8d', '#636e72', 'steelblue']
bars = ax4.bar(models, values, color=colors_bar, edgecolor='black', lw=0.5)
ax4.axhline(monthly_SR, color='tomato', ls='--', lw=1.5, alpha=0.6)
ax4.set_ylabel('std$(M)$ / $E[M]$')
ax4.set_title('HJ Bound Comparison\n(Monthly)')
for bar, val in zip(bars, values):
    ax4.text(bar.get_x() + bar.get_width()/2, bar.get_height() + 0.001,
             f'{val:.3f}', ha='center', va='bottom', fontsize=9)

plt.tight_layout()
plt.show()

print(f"Campbell-Cochrane: γ = {gamma_cc}, but effective RA averages {eff_RA.mean():.1f}")
print(f"SDF coefficient of variation: {cv_M_cc:.4f}")
print(f"Monthly HJ bound:             {monthly_SR:.4f}")
print(f"CC model {'PASSES' if cv_M_cc >= monthly_SR else 'FAILS'} the HJ bound.")

---

## 8. Epstein-Zin Preferences

The habit formation approach modifies preferences in the time domain. A different strategy is to modify the structure of preferences across risk and time — separating two parameters that CRRA utility conflates.

### 8.1 The Observational Equivalence Problem in CRRA

Under CRRA utility, a single parameter $\gamma$ governs **two distinct concepts**:

1. **Risk aversion**: The reluctance to bear risk across states of the world (for example, to accept a fair gamble).
2. **Intertemporal elasticity of substitution (IES)**: The willingness to substitute consumption over time (delay consumption in exchange for growth).

With CRRA, $\text{IES} = 1/\gamma$. So high risk aversion ($\gamma$ large) necessarily implies low IES: the agent is also reluctant to substitute intertemporally. This coupling generates the riskfree rate puzzle. High $\gamma$ means the agent strongly prefers smooth consumption over time, so with positive expected growth they want to borrow against the future, and the riskfree rate must rise to clear the market.

**Epstein and Zin (1989)** proposed a more general preference structure that separates these two parameters.

### 8.2 The Epstein-Zin Recursive Utility

The Epstein-Zin value function is defined recursively:
$$
V_t = \left[(1-\beta) C_t^{1-1/\psi} + \beta \left(E_t[V_{t+1}^{1-\gamma}]\right)^{\frac{1-1/\psi}{1-\gamma}}\right]^{\frac{1}{1-1/\psi}},
$$
where:
- $\gamma$ is the **coefficient of relative risk aversion** (governs attitude to risk across states).
- $\psi$ is the **elasticity of intertemporal substitution** (governs attitude to timing of consumption).
- When $\gamma = 1/\psi$, this reduces to standard CRRA utility.

### 8.3 The Epstein-Zin SDF

The SDF under Epstein-Zin preferences takes the form:
$$
\boxed{M_{t+1} = \beta^{\theta} \left(\frac{C_{t+1}}{C_t}\right)^{-\theta/\psi} R_{w,t+1}^{\theta-1},}
$$
where $\theta \equiv (1-\gamma)/(1-1/\psi)$ and $R_{w,t+1}$ is the gross return on the **wealth portfolio** (the claim to all future consumption).

**Key observation**: The SDF depends on *both* consumption growth *and* the return on wealth. This second factor — absent from CRRA — allows the model to generate large risk premia without requiring implausibly high risk aversion.

**Intuition**: With EZ preferences, agents have preferences over the *distribution* of continuation utilities, not just expected utility. News about future consumption growth (long-run news) affects utility today through the $R_w$ term, even if current consumption does not change. This channel generates an additional source of SDF volatility.
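
A short sketch of the boxed SDF in logs, $m_{t+1} = \theta \ln\beta - (\theta/\psi)\, g_{t+1} + (\theta - 1)\, r_{w,t+1}$, which makes the two loadings explicit and confirms numerically that the wealth-return term vanishes when $\gamma = 1/\psi$. The function names and the simulated inputs are illustrative; in particular, the wealth returns below are arbitrary draws and not the equilibrium return on the consumption claim.

```python
import numpy as np

def log_sdf_ez(g, r_w, beta, gamma, psi):
    """Log Epstein-Zin SDF from log consumption growth g and log wealth return r_w."""
    theta = (1 - gamma) / (1 - 1 / psi)      # requires psi != 1
    return theta * np.log(beta) - (theta / psi) * g + (theta - 1) * r_w

def log_sdf_crra(g, beta, gamma):
    """Log CRRA SDF: ln(beta) - gamma * g."""
    return np.log(beta) - gamma * g

rng_demo = np.random.default_rng(1)
g = rng_demo.normal(0.018, 0.036, 1000)
r_w = rng_demo.normal(0.06, 0.15, 1000)   # arbitrary wealth returns, illustration only

# Case 1: gamma = 1/psi gives theta = 1, so r_w drops out and EZ equals CRRA
m_ez = log_sdf_ez(g, r_w, beta=0.99, gamma=4.0, psi=0.25)
m_crra = log_sdf_crra(g, beta=0.99, gamma=4.0)
max_gap = np.max(np.abs(m_ez - m_crra))

# Case 2: gamma = 10, psi = 1.5 gives a large negative theta
theta = (1 - 10.0) / (1 - 1 / 1.5)
print(f"max |EZ - CRRA| when gamma = 1/psi: {max_gap:.1e}")
print(f"theta = {theta:.1f}, loading on g = {-theta / 1.5:.1f}, loading on r_w = {theta - 1:.1f}")
```

With $\gamma > 1$ and $\psi > 1$ the parameter $\theta$ is negative, so a low return on wealth raises the SDF: states with bad news about the future are states of high marginal utility.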

### 8.4 The Riskfree Rate Under Epstein-Zin

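With EZ preferences and i.i.d. lognormal consumption growth (so that the log return on wealth moves one-for-one with consumption growth):
$$
r_f = -\ln\beta + \frac{\mu_g}{\psi} - \frac{1}{2}\left[\gamma + \frac{\gamma - 1}{\psi}\right] \sigma_g^2.
$$
Setting $\psi = 1/\gamma$ recovers the CRRA formula. The key improvement over CRRA: the coefficient on $\mu_g$ is $1/\psi$ rather than $\gamma$, and the precautionary-saving term is linear rather than quadratic in $\gamma$. We can therefore choose $\psi$ to control the growth term independently of $\gamma$. Setting $\psi = 1$ (unit IES) and $\gamma$ large allows:
- High $\gamma$ → large equity premium.
- $\psi = 1$ → $r_f = -\ln\beta + \mu_g - (\gamma - \tfrac{1}{2})\sigma_g^2$ → a riskfree rate that falls only slowly as $\gamma$ rises.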

This is the key resolution offered by Epstein-Zin preferences in the long-run risks model of Bansal and Yaron (2004), which we will study in Week 3.
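
A numerical check of the unit-IES case may be useful here. The sketch assumes the standard result that with $\psi = 1$ and i.i.d. growth the value-to-consumption ratio is constant, so the SDF is $M_{t+1} = \beta e^{-g_{t+1}} \cdot e^{(1-\gamma) g_{t+1}} / E[e^{(1-\gamma) g_{t+1}}]$. It integrates this SDF by Gauss-Hermite quadrature and compares $-\ln E[M]$ with the closed form $-\ln\beta + \mu_g - (\gamma - \tfrac12)\sigma_g^2$ and with the CRRA riskfree rate at the same $\gamma$. The parameter values are the annual moments used elsewhere in this notebook.

```python
import numpy as np
from numpy.polynomial.hermite_e import hermegauss

beta, mu_g, sigma_g, gamma = 0.99, 0.018, 0.036, 10.0

# Gauss-Hermite nodes and weights, rescaled so that sum(w * f(z)) ~ E[f(Z)], Z ~ N(0, 1)
z, w = hermegauss(40)
w = w / np.sqrt(2 * np.pi)
g = mu_g + sigma_g * z                       # log consumption growth at each node

# psi = 1 with iid growth: the continuation-value term is exp((1 - gamma) * g),
# normalised by its expectation
tilt = np.exp((1 - gamma) * g)
M = beta * np.exp(-g) * tilt / np.sum(w * tilt)

rf_numeric = -np.log(np.sum(w * M))
rf_closed  = -np.log(beta) + mu_g - (gamma - 0.5) * sigma_g**2
rf_crra    = -np.log(beta) + gamma * mu_g - 0.5 * gamma**2 * sigma_g**2

print(f"EZ (psi = 1) rf, quadrature:  {rf_numeric*100:.3f}%")
print(f"EZ (psi = 1) rf, closed form: {rf_closed*100:.3f}%")
print(f"CRRA rf at the same gamma:    {rf_crra*100:.3f}%")
```

The comparison shows the sense in which EZ preferences help: the growth term no longer scales with $\gamma$, although the precautionary term still does.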

In [None]:
# Compare CRRA and Epstein-Zin: the decoupling of gamma and IES
# Under i.i.d. lognormal consumption growth, compute equity premium and riskfree rate
# for CRRA vs EZ with psi=1 (log utility IES)

mu_g_ez    = 0.018
sigma_g_ez = 0.036
beta_ez    = 0.99

# For CRRA: rf and EP as functions of gamma (IES = 1/gamma)
# For EZ with fixed psi: the growth term mu_g/psi no longer depends on gamma;
# only the precautionary-saving term does, and only linearly
# (closed forms under iid lognormal growth)

gammas_ez = np.linspace(0.5, 25, 300)

# CRRA
rf_crra = -np.log(beta_ez) + gammas_ez * mu_g_ez - 0.5 * gammas_ez**2 * sigma_g_ez**2
ep_crra = gammas_ez * cov_g_r  # approximate

def rf_ez_iid(gamma, psi):
    """Log riskfree rate under EZ with iid lognormal consumption growth."""
    return (-np.log(beta_ez) + mu_g_ez / psi
            - 0.5 * (gamma + (gamma - 1) / psi) * sigma_g_ez**2)

# EZ with psi = 1 (log IES)
psi_ez = 1.0
rf_ez_psi1_vec = rf_ez_iid(gammas_ez, psi_ez)

# EP under EZ: with iid growth the log SDF loads on consumption growth with
# coefficient -gamma, exactly as under CRRA, so the premium is unchanged
ep_ez_psi1 = gammas_ez * cov_g_r

# EZ with psi = 2 (IES above one, as in long-run risks calibrations)
psi_ez2 = 2.0
rf_ez_psi2_vec = rf_ez_iid(gammas_ez, psi_ez2)

fig, axes = plt.subplots(1, 2, figsize=(13, 5))

ax = axes[0]
ax.plot(gammas_ez, rf_crra * 100, 'steelblue', lw=2.5, label='CRRA (IES = 1/γ)')
ax.plot(gammas_ez, rf_ez_psi1_vec * 100, '#e74c3c', lw=2.5, ls='--', label='EZ: ψ = 1')
ax.plot(gammas_ez, rf_ez_psi2_vec * 100, '#27ae60', lw=2.5, ls=':', label='EZ: ψ = 2')
ax.axhline(rf_us_log * 100, color='black', lw=1, ls='-', alpha=0.4, label=f'US data: {rf_us_log*100:.1f}%')
ax.set_xlabel('Risk aversion $\\gamma$')
ax.set_ylabel('Log riskfree rate (% p.a.)')
ax.set_title('Riskfree Rate vs Risk Aversion\nEZ weakens the link between rf and γ')
ax.legend(fontsize=9)
ax.set_ylim(-5, 25)

ax2 = axes[1]
ax2.plot(gammas_ez, ep_crra * 100, 'steelblue', lw=2.5, label='CRRA')
ax2.plot(gammas_ez, ep_ez_psi1 * 100, '#e74c3c', lw=2.5, ls='--', label='EZ: ψ = 1 (same EP, better rf)')
ax2.axhline(ep_us * 100, color='black', lw=1, ls='-', alpha=0.4, label=f'US data: {ep_us*100:.1f}%')
ax2.axvline(gamma_needed, color='grey', ls=':', lw=1, label=f'γ needed ≈ {gamma_needed:.0f}')
ax2.set_xlabel('Risk aversion $\\gamma$')
ax2.set_ylabel('Equity premium (% p.a.)')
ax2.set_title('Equity Premium vs Risk Aversion\n(iid consumption growth)')
ax2.set_ylim(0, 12)
ax2.legend(fontsize=9)

plt.tight_layout()
plt.show()

gamma_ref = 10.0  # reference risk aversion for the printed comparison
rf_crra_ref = -np.log(beta_ez) + gamma_ref * mu_g_ez - 0.5 * gamma_ref**2 * sigma_g_ez**2

print("Under iid consumption growth, EZ helps the RISKFREE RATE PUZZLE but not the EQUITY PREMIUM puzzle.")
print("The full resolution requires non-iid consumption growth: the long-run risks channel (Week 3).")
print(f"\nAt γ={gamma_ref:.0f}: CRRA rf = {rf_crra_ref*100:.2f}% (data: {rf_us_log*100:.1f}%).")
print(f"With ψ=1: rf = {rf_ez_iid(gamma_ref, psi_ez)*100:.2f}%; "
      f"with ψ=2: rf = {rf_ez_iid(gamma_ref, psi_ez2)*100:.2f}%.")

---

## 9. Connecting Everything: A Summary Diagram

The models studied this week can be organised along two dimensions:

1. **What drives the SDF?** Consumption growth only (CCAPM), or additional state variables (habit, long-run risks).
2. **What governs preferences?** Standard expected utility (CRRA) or recursive utility (Epstein-Zin).

| | **Expected utility (CRRA)** | **Recursive utility (EZ)** |
|---|---|---|
| **$M$ driven by $\Delta c$ only** | CCAPM: fails equity premium and riskfree rate | EZ-CCAPM: fixes riskfree rate, not EP |
| **$M$ driven by $\Delta c$ + state variable** | Habit (CC 1999): fixes both via time-varying RA | Long-run risks (BY 2004): fixes both via IES > 1 |

**The fundamental tension**: To generate large equity premia with smooth consumption growth, we need the SDF to be more volatile than consumption growth alone can deliver. The strategies are:

- **Amplify**: Make marginal utility more sensitive to consumption (habit — sensitivity function $\lambda(s_t)$).
- **Add channel**: Introduce a second source of SDF variation (long-run risks through EZ preferences).
- **Separate parameters**: Decouple risk aversion from IES (EZ preferences).

In [None]:
# Summary: compare model moments across specifications
# Show equity premium, riskfree rate, and SDF vol for each model

# Parameters calibrated to be "best fit" for each model
model_params = {
    'CCAPM\n(γ=5)':       {'ep': 5 * cov_g_r * 100,
                             'rf': (-np.log(0.99) + 5*mu_g_us - 0.5*25*sigma_g_us**2)*100,
                             'cv_M': np.sqrt(np.exp(25*sigma_g_us**2)-1)},
    'CCAPM\n(γ=20)':      {'ep': 20 * cov_g_r * 100,
                             'rf': (-np.log(0.99) + 20*mu_g_us - 0.5*400*sigma_g_us**2)*100,
                             'cv_M': np.sqrt(np.exp(400*sigma_g_us**2)-1)},
    'EZ\n(γ=10,ψ=1.5)':  {'ep': 10 * cov_g_r * 100,  # approx, full solution requires LRR model
                             # iid closed form: rf = -ln(beta) + mu/psi - 0.5*(gamma + (gamma-1)/psi)*sigma^2
                             'rf': (-np.log(0.99) + mu_g_us/1.5 - 0.5*(10 + 9/1.5)*sigma_g_us**2)*100,
                             'cv_M': np.sqrt(np.exp(100*sigma_g_us**2)-1)},
    'CC Habit\n(γ=2)':    {'ep': cv_M_cc * (ep_us / (mu_r_us/sigma_r_us)) * 100,  # rough
                             'rf': 0.9,
                             'cv_M': cv_M_cc * np.sqrt(12)},  # annualise
    'US Data':             {'ep': ep_us * 100, 'rf': rf_us_log * 100,
                             'cv_M': ep_us / sigma_r_us},
}

labels = list(model_params.keys())
ep_vals_summary = [model_params[k]['ep'] for k in labels]
rf_vals_summary = [model_params[k]['rf'] for k in labels]
cv_M_summary    = [model_params[k]['cv_M'] for k in labels]

x = np.arange(len(labels))
width = 0.28

fig, ax = plt.subplots(figsize=(13, 5))
b1 = ax.bar(x - width, ep_vals_summary, width, label='Equity premium (%)', color='steelblue', alpha=0.85)
b2 = ax.bar(x, np.clip(rf_vals_summary, -5, 20), width, label='Riskfree rate (%)', color='#e74c3c', alpha=0.85)
b3 = ax.bar(x + width, np.clip(cv_M_summary, 0, 1.5), width, label='CV of SDF (clipped)', color='#27ae60', alpha=0.85)

ax.set_xticks(x)
ax.set_xticklabels(labels, fontsize=10)
ax.set_ylabel('Value (% or ratio)')
ax.set_title('Summary of Model Moments vs US Data\n(CC Habit and EZ moments are approximate)')
ax.legend(fontsize=10)
ax.axhline(0, color='black', lw=0.5)

# Annotate data bars
for bar in [b1[-1], b2[-1], b3[-1]]:
    ax.text(bar.get_x() + bar.get_width()/2,
            bar.get_height() + 0.1,
            f'{bar.get_height():.2f}',
            ha='center', va='bottom', fontsize=8, fontweight='bold')

plt.tight_layout()
plt.show()

---

## 10. Summary and Looking Ahead

This week answered the question left open by Week 1: where does the SDF come from?

**The Lucas exchange economy** provides a complete general equilibrium framework. In equilibrium, the SDF is the representative agent's intertemporal marginal rate of substitution:
$$
M_{t+1} = \beta \frac{u'(C_{t+1})}{u'(C_t)}.
$$
The Euler equation $u'(C_t) P_t = \beta E_t[u'(C_{t+1})(P_{t+1} + D_{t+1})]$ is the equilibrium pricing condition. The endowment constraint $C_t = D_t$ closes the model.

**The CAPM** is a special case where the SDF is linear in the market return — a restriction that holds under mean-variance preferences or elliptical return distributions. The Security Market Line says the only priced risk is covariance with the market. The Fama-MacBeth procedure is the standard test, which the CAPM fails empirically.

**The equity premium puzzle** (Mehra-Prescott, 1985) establishes that with CRRA utility and calibrated US consumption data, the CCAPM requires $\gamma \approx 31$ to match the equity premium, an order of magnitude larger than plausible. Furthermore, raising $\gamma$ drives the model riskfree rate far from the observed level of roughly 1%: well above it at moderate $\gamma$ (the riskfree rate puzzle, Weil 1989) and, under the lognormal formula used in this notebook, far below it once the precautionary-saving term dominates at very high $\gamma$. These are not failures of parameterisation. They are structural.

**Responses to the puzzle**:
- **Habit formation (Campbell-Cochrane, 1999)**: Makes the SDF volatile through time-varying effective risk aversion $\gamma/S_t$. Requires only $\gamma = 2$ in the curvature parameter, but generates extreme risk aversion during downturns.
- **Epstein-Zin preferences**: Separates risk aversion $\gamma$ from the IES $\psi$. Eases the riskfree rate puzzle because the growth term in $r_f$ is governed by $\psi$ rather than $\gamma$. Under i.i.d. consumption growth, does not resolve the equity premium puzzle; that requires the long-run risks channel.

**In Week 3**, we will study two of the most influential quantitative models: the **long-run risks model** of Bansal and Yaron (2004), which combines Epstein-Zin preferences with persistent consumption growth, and the **disaster risk model** of Barro (2006) and Rietz (1988), which introduces rare, extreme consumption events.

---

## Problem Set — Week 2

**Problem 1 — Lucas tree with log utility**: Consider a Lucas tree economy with $u(C) = \ln C$, $\beta \in (0,1)$, and constant (deterministic) dividend growth $D_{t+1}/D_t = G$. (a) Solve for the equilibrium price-dividend ratio $\kappa$ analytically. (b) Show that the equity return is constant and equals $G/\beta$. (c) How does the equilibrium change if $G > 1/\beta$? Interpret.

**Problem 2 — The Euler equation under CRRA**: Suppose $u(C) = C^{1-\gamma}/(1-\gamma)$ and log dividend growth $g_t \sim \mathcal{N}(\mu_g, \sigma_g^2)$ i.i.d. (a) Derive exact formulas for $E[M_{t+1}]$, $E[R_{m,t+1}]$, and $\text{Cov}(M_{t+1}, R_{m,t+1})$. (b) Verify that $E[M_{t+1}] \cdot E[R_{m,t+1}] + \text{Cov}(M_{t+1}, R_{m,t+1}) = 1$. (c) Show that the equity premium in logs equals $\gamma \sigma_g^2$ when consumption equals dividends.

**Problem 3 — The CAPM beta**: A portfolio $p$ has expected return 9%, beta 1.2, and idiosyncratic volatility 20%. The riskfree rate is 2% and the market risk premium is 5%. (a) What is the CAPM alpha of this portfolio? (b) If you build an equally weighted portfolio of 100 such assets (each with idiosyncratic vol 20% and the same beta), what is the variance of the resulting portfolio? What happens as $N \to \infty$? (c) Explain in words why idiosyncratic risk is not priced.

**Problem 4 — Equity premium puzzle calibration**: Using the formulas derived in this notebook, (a) find the value of $\gamma$ (with $\beta = 0.99$) that simultaneously minimises the sum of squared deviations of the model's equity premium and riskfree rate from the US data targets. (b) Compare this to the individual solutions. (c) At the solution $\gamma$, compute the implied HJ distance and the implied coefficient of variation of $M$.

**Problem 5 — Habit formation and time-varying risk aversion**: In the Campbell-Cochrane model, the effective risk aversion is $\gamma_{\text{eff}} = \gamma / S_t$. (a) If consumption falls to only 5% above the habit level ($C_t = 1.05\,X_t$; recall that $S_t > 0$ requires $C_t > X_t$), what is the surplus ratio $S_t$ and the effective risk aversion (use the calibration from the notebook)? (b) How does this compare to a recession scenario where consumption falls 1% below its mean? (c) Explain why time-varying risk aversion generates countercyclical risk premia, an empirically realistic feature.

**Problem 6 — Epstein-Zin SDF**: Suppose $\gamma = 10$, $\psi = 1.5$, $\beta = 0.99$, and consumption growth is i.i.d. lognormal with $\mu_g = 0.018$ and $\sigma_g = 0.036$. Assume the wealth portfolio return equals the equity return. (a) Write out the Epstein-Zin SDF explicitly. (b) Compute $\theta = (1-\gamma)/(1-1/\psi)$. (c) Compute the log riskfree rate using the EZ formula and compare to the CRRA formula with $\gamma = 10$. (d) Under what conditions does EZ utility reduce to CRRA?

---

## References

- Lucas, R. E. (1978). Asset prices in an exchange economy. *Econometrica*, 46(6), 1429–1445.
- Mehra, R. and Prescott, E. C. (1985). The equity premium: A puzzle. *Journal of Monetary Economics*, 15(2), 145–161.
- Weil, P. (1989). The equity premium puzzle and the risk-free rate puzzle. *Journal of Monetary Economics*, 24(3), 401–421.
- Fama, E. F. and MacBeth, J. D. (1973). Risk, return, and equilibrium: Empirical tests. *Journal of Political Economy*, 81(3), 607–636.
- Campbell, J. Y. and Cochrane, J. H. (1999). By force of habit: A consumption-based explanation of aggregate stock market behavior. *Journal of Political Economy*, 107(2), 205–251.
- Epstein, L. G. and Zin, S. E. (1989). Substitution, risk aversion, and the temporal behavior of consumption and asset returns: A theoretical framework. *Econometrica*, 57(4), 937–969.
- Cochrane, J. H. (2005). *Asset Pricing* (revised ed.). Princeton University Press. — Chapters 1, 2, 8, 9, 20, 21.