# Week 5: Portfolio Theory

---

## Table of Contents
1. Markowitz Mean-Variance Framework
2. The Efficient Frontier
3. Sharpe Ratio
4. Value at Risk (VaR)
5. Expected Shortfall (CVaR)

---

In [1]:
# Standard imports and data loading
import numpy as np
import pandas as pd
import yfinance as yf
from datetime import datetime, timedelta

# Standard 5 equities for analysis
tickers = ['AAPL', 'MSFT', 'GOOGL', 'JPM', 'GS']

# Fetch 5 years of data
end_date = datetime.now()
start_date = end_date - timedelta(days=5*365)

print("üì• Downloading market data...")
data = yf.download(tickers, start=start_date, end=end_date, progress=False, auto_adjust=True)
prices = data['Close'].dropna()
returns = prices.pct_change().dropna()
print(f"✅ Loaded {len(prices)} days of data for {len(tickers)} tickers")
print(f"📅 Date range: {prices.index[0].strftime('%Y-%m-%d')} to {prices.index[-1].strftime('%Y-%m-%d')}")
print(prices.tail())

📥 Downloading market data...
✅ Loaded 1255 days of data for 5 tickers
📅 Date range: 2021-01-25 to 2026-01-22
Ticker            AAPL       GOOGL          GS         JPM        MSFT
Date                                                                  
2026-01-15  258.209991  332.779999  975.859985  309.260010  456.660004
2026-01-16  255.529999  330.000000  962.000000  312.470001  459.859985
2026-01-20  246.699997  322.000000  943.369995  302.739990  454.519989
2026-01-21  247.649994  328.380005  953.010010  302.040009  444.109985
2026-01-22  249.764999  331.491913  966.520020  307.478394  449.959991


## 1. Markowitz Mean-Variance Framework

### The Key Insight

Harry Markowitz (1952): Don't just consider returns - consider **risk-adjusted** returns.

**Diversification**: Combining assets reduces risk if they're not perfectly correlated!

### Portfolio Return

For a portfolio with weights $w = [w_1, w_2, \ldots, w_n]$:

$$r_p = \sum_{i=1}^{n} w_i r_i = w^T r$$

**Expected portfolio return**:
$$E[r_p] = \sum_{i=1}^{n} w_i E[r_i] = w^T \mu$$

### Portfolio Variance (Risk)

$$\sigma_p^2 = \sum_{i=1}^{n}\sum_{j=1}^{n} w_i w_j \sigma_{ij} = w^T \Sigma w$$

Where:
- $\sigma_{ij} = Cov(r_i, r_j)$
- $\Sigma$ = covariance matrix
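
As a quick check of the matrix forms above, the sketch below evaluates $w^T\mu$ and $w^T\Sigma w$ with NumPy and confirms that the explicit double sum gives the same variance. The three-asset weights, returns, and covariances are illustrative assumptions, not estimates from market data.

```python
import numpy as np

# Illustrative inputs (assumed for this sketch)
w = np.array([0.5, 0.3, 0.2])        # portfolio weights, sum to 1
mu = np.array([0.12, 0.04, 0.06])    # expected returns
cov = np.array([
    [0.0400, 0.0040, 0.0000],
    [0.0040, 0.0025, -0.0010],
    [0.0000, -0.0010, 0.0144],
])

exp_return = w @ mu                  # w^T mu
var_matrix = w @ cov @ w             # w^T Sigma w

# Same variance via the explicit double sum over i and j
n = len(w)
var_double_sum = sum(w[i] * w[j] * cov[i, j] for i in range(n) for j in range(n))

print(f"Expected return: {exp_return:.2%}")
print(f"Volatility:      {np.sqrt(var_matrix):.2%}")
print(f"Forms agree:     {np.isclose(var_matrix, var_double_sum)}")
```

The matrix form is the one to use in practice: it is shorter and scales to many assets without nested loops.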

### Two-Asset Case

For two assets A and B:

$$\sigma_p^2 = w_A^2\sigma_A^2 + w_B^2\sigma_B^2 + 2w_Aw_B\sigma_A\sigma_B\rho_{AB}$$

Where $\rho_{AB}$ is the correlation between A and B.

In [2]:
import numpy as np
import pandas as pd

# Two-asset portfolio example
# Stock A: High return, high risk
# Stock B: Lower return, lower risk

mu_A, sigma_A = 0.12, 0.20   # 12% return, 20% volatility
mu_B, sigma_B = 0.06, 0.10   # 6% return, 10% volatility
rho = 0.3  # Correlation

print("Two-Asset Portfolio Analysis")
print("="*50)
print(f"\nAsset A: μ = {mu_A:.0%}, σ = {sigma_A:.0%}")
print(f"Asset B: μ = {mu_B:.0%}, σ = {sigma_B:.0%}")
print(f"Correlation: ρ = {rho}")

# Calculate portfolio stats for different weights
print("\n Weight A | Return | Risk   | Risk if ρ=1")
print("-"*45)

for w_A in [0.0, 0.25, 0.5, 0.75, 1.0]:
    w_B = 1 - w_A
    
    # Portfolio return
    port_return = w_A * mu_A + w_B * mu_B
    
    # Portfolio risk (variance formula)
    port_var = (w_A**2 * sigma_A**2 + 
                w_B**2 * sigma_B**2 + 
                2 * w_A * w_B * sigma_A * sigma_B * rho)
    port_risk = np.sqrt(port_var)
    
    # Risk if perfectly correlated (no diversification benefit)
    no_diversification = w_A * sigma_A + w_B * sigma_B
    
    print(f"  {w_A:5.0%}   | {port_return:5.1%}  | {port_risk:5.1%}  | {no_diversification:5.1%}")

print("\n✓ With ρ < 1, portfolio risk < weighted average risk!")

Two-Asset Portfolio Analysis

Asset A: μ = 12%, σ = 20%
Asset B: μ = 6%, σ = 10%
Correlation: ρ = 0.3

 Weight A | Return | Risk   | Risk if ρ=1
---------------------------------------------
     0%   |  6.0%  | 10.0%  | 10.0%
    25%   |  7.5%  | 10.2%  | 12.5%
    50%   |  9.0%  | 12.4%  | 15.0%
    75%   | 10.5%  | 15.9%  | 17.5%
   100%   | 12.0%  | 20.0%  | 20.0%

✓ With ρ < 1, portfolio risk < weighted average risk!


---

## 2. The Efficient Frontier

### Definition

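The **efficient frontier** is the set of portfolios that maximize expected return for a given level of risk. Equivalently, it is the upper branch of the minimum-variance frontier:
- Minimize risk for each target return, then
- Keep only the portfolios at or above the minimum variance portfolio's return (anything below is dominated: another portfolio offers more return for the same or less risk)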

### Optimization Problem

**Minimize Risk** (for target return $\mu^*$):

$$\min_w \quad w^T \Sigma w$$

Subject to:
- $w^T \mu = \mu^*$ (target return)
- $w^T \mathbf{1} = 1$ (weights sum to 1)
- $w_i \geq 0$ (optional: no short selling)

### Special Portfolios

**Minimum Variance Portfolio (MVP)**: Lowest possible risk among fully invested portfolios. With short selling allowed, it has a closed form:

$$w_{MVP} = \frac{\Sigma^{-1} \mathbf{1}}{\mathbf{1}^T \Sigma^{-1} \mathbf{1}}$$

**Maximum Sharpe Ratio Portfolio**: Best risk-adjusted return
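
The closed-form minimum variance weights above can be evaluated directly. The sketch below is a minimal illustration that assumes a three-asset covariance matrix chosen for the example; it solves $\Sigma x = \mathbf{1}$ rather than forming $\Sigma^{-1}$ explicitly, which is the usual numerically safer route.

```python
import numpy as np

# Illustrative covariance matrix (assumed for this sketch)
cov = np.array([
    [0.0400, 0.0040, 0.0000],
    [0.0040, 0.0025, -0.0010],
    [0.0000, -0.0010, 0.0144],
])
ones = np.ones(len(cov))

# Solve Sigma x = 1, then normalize so the weights sum to 1
x = np.linalg.solve(cov, ones)
w_mvp = x / (ones @ x)

vol_mvp = np.sqrt(w_mvp @ cov @ w_mvp)
print("MVP weights:", np.round(w_mvp, 4))
print(f"MVP volatility: {vol_mvp:.2%}")
```

Negative entries in `w_mvp` are short positions. Under a no-short-selling constraint the closed form no longer applies in general, and a numerical optimizer is needed instead.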

In [3]:
from scipy.optimize import minimize

# Three assets
np.random.seed(42)
n_assets = 3
asset_names = ['Tech', 'Bonds', 'Gold']

# Expected returns and covariance
mu = np.array([0.12, 0.04, 0.06])  # Expected returns
cov_matrix = np.array([
    [0.0400, 0.0040, 0.0000],   # Tech: 20% vol
    [0.0040, 0.0025, -0.0010],  # Bonds: 5% vol  
    [0.0000, -0.0010, 0.0144]   # Gold: 12% vol
])

def portfolio_stats(weights, mu, cov):
    """Calculate portfolio return and risk"""
    ret = weights @ mu
    risk = np.sqrt(weights @ cov @ weights)
    return ret, risk

def minimize_risk(target_return, mu, cov):
    """Find minimum risk portfolio for target return"""
    n = len(mu)
    
    # Objective: minimize variance
    def variance(w):
        return w @ cov @ w
    
    # Constraints
    constraints = [
        {'type': 'eq', 'fun': lambda w: np.sum(w) - 1},      # weights sum to 1
        {'type': 'eq', 'fun': lambda w: w @ mu - target_return}  # target return
    ]
    
    # Bounds (no short selling)
    bounds = [(0, 1) for _ in range(n)]
    
    result = minimize(variance, np.ones(n)/n, bounds=bounds, constraints=constraints)
    return result.x

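# Trace minimum-risk portfolios across target returns
# (targets below the minimum variance portfolio's return are feasible but not efficient)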
print("Efficient Frontier Portfolios")
print("="*60)
print(f"\n{'Target Ret':>10} | {'Risk':>7} | {'Tech':>6} | {'Bonds':>6} | {'Gold':>6}")
print("-"*60)

for target in [0.04, 0.06, 0.08, 0.10, 0.12]:
    weights = minimize_risk(target, mu, cov_matrix)
    ret, risk = portfolio_stats(weights, mu, cov_matrix)
    print(f"{target:>10.1%} | {risk:>6.2%} | {weights[0]:>6.1%} | {weights[1]:>6.1%} | {weights[2]:>6.1%}")

print("\n✓ Higher returns require more Tech (risky), less Bonds (safe)")

Efficient Frontier Portfolios

Target Ret |    Risk |   Tech |  Bonds |   Gold
------------------------------------------------------------
      4.0% |  5.00% |   0.0% | 100.0% |   0.0%
      6.0% |  6.02% |  17.8% |  53.3% |  28.9%
      8.0% |  9.64% |  40.1% |  20.3% |  39.6%
     10.0% | 13.92% |  66.7% |   0.0% |  33.3%
     12.0% | 20.00% | 100.0% |   0.0% |   0.0%

✓ Higher returns require more Tech (risky), less Bonds (safe)


---

## 3. Sharpe Ratio

### Definition

Measures **excess return per unit of risk**:

$$\text{Sharpe Ratio} = \frac{E[r_p] - r_f}{\sigma_p}$$

Where:
- $E[r_p]$ = expected portfolio return
- $r_f$ = risk-free rate
- $\sigma_p$ = portfolio standard deviation

### Interpretation

| Sharpe Ratio | Quality |
|--------------|--------|
| < 0 | Worse than risk-free |
| 0 - 1 | Poor to acceptable |
| 1 - 2 | Good |
| 2 - 3 | Very good |
| > 3 | Excellent (rare, verify!) |

### Annualization

If calculated from daily returns:

$$\text{Sharpe}_{annual} = \text{Sharpe}_{daily} \times \sqrt{252}$$
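
The $\sqrt{252}$ factor follows if daily excess returns are assumed independent and identically distributed over roughly 252 trading days per year (an assumption; autocorrelated returns scale differently). Under that assumption the mean grows linearly with time and the standard deviation with its square root, so with $\mu_{daily}$ and $\sigma_{daily}$ denoting the mean and standard deviation of daily excess returns:

$$\text{Sharpe}_{annual} = \frac{252\,\mu_{daily}}{\sqrt{252}\,\sigma_{daily}} = \sqrt{252}\,\frac{\mu_{daily}}{\sigma_{daily}} = \sqrt{252} \times \text{Sharpe}_{daily}$$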

In [4]:
# Sharpe Ratio calculation
def sharpe_ratio(returns, rf=0):
    """Calculate annualized Sharpe ratio from daily returns"""
    excess_returns = returns - rf/252  # Daily risk-free rate
    return np.sqrt(252) * excess_returns.mean() / excess_returns.std()

# Simulate different strategies
np.random.seed(42)
n_days = 252 * 3  # 3 years
rf = 0.02  # 2% risk-free rate

strategies = {
    'Index Fund': np.random.normal(0.0004, 0.012, n_days),     # Low alpha, low cost
    'Active Fund': np.random.normal(0.0005, 0.015, n_days),    # Some alpha, higher vol
    'Hedge Fund': np.random.normal(0.0006, 0.008, n_days),     # Good alpha, low vol
    'Momentum': np.random.normal(0.0003, 0.020, n_days),       # Volatile
}

print("Sharpe Ratio Comparison")
print("="*55)
print(f"\n{'Strategy':<15} | {'Ann. Return':>11} | {'Ann. Vol':>10} | {'Sharpe':>7}")
print("-"*55)

# Loop variable is named strat_returns so it does not overwrite the `returns` DataFrame from cell 1
for name, strat_returns in strategies.items():
    ann_ret = strat_returns.mean() * 252
    ann_vol = strat_returns.std() * np.sqrt(252)
    sr = sharpe_ratio(strat_returns, rf)
    
    quality = "🟢" if sr > 1 else "🟡" if sr > 0.5 else "🔴"
    print(f"{name:<15} | {ann_ret:>10.2%} | {ann_vol:>9.2%} | {sr:>6.2f} {quality}")

print("\n✓ Sharpe ratio rewards return while penalizing risk")

Sharpe Ratio Comparison

Strategy        | Ann. Return |   Ann. Vol |  Sharpe
-------------------------------------------------------
Index Fund      |      4.51% |    18.82% |   0.13 🔴
Active Fund     |     56.68% |    23.43% |   2.33 🟢
Hedge Fund      |     16.10% |    12.55% |   1.12 🟢
Momentum        |     15.42% |    31.42% |   0.43 🔴

✓ Sharpe ratio rewards return while penalizing risk


### Maximum Sharpe Ratio Portfolio

The **tangency portfolio** maximizes Sharpe ratio:

$$\max_w \frac{w^T \mu - r_f}{\sqrt{w^T \Sigma w}}$$

In [5]:
def max_sharpe_portfolio(mu, cov, rf):
    """Find portfolio that maximizes Sharpe ratio"""
    n = len(mu)
    
    def neg_sharpe(w):
        ret = w @ mu
        vol = np.sqrt(w @ cov @ w)
        return -(ret - rf) / vol
    
    constraints = [{'type': 'eq', 'fun': lambda w: np.sum(w) - 1}]
    bounds = [(0, 1) for _ in range(n)]
    
    result = minimize(neg_sharpe, np.ones(n)/n, bounds=bounds, constraints=constraints)
    return result.x

# Find maximum Sharpe portfolio
rf = 0.02
max_sharpe_weights = max_sharpe_portfolio(mu, cov_matrix, rf)
ret, risk = portfolio_stats(max_sharpe_weights, mu, cov_matrix)
sharpe = (ret - rf) / risk

print("Maximum Sharpe Ratio Portfolio")
print("="*50)
print(f"\nRisk-free rate: {rf:.1%}")
print(f"\nOptimal weights:")
for name, w in zip(asset_names, max_sharpe_weights):
    print(f"  {name}: {w:.1%}")
    
print(f"\nPortfolio stats:")
print(f"  Expected Return: {ret:.2%}")
print(f"  Volatility: {risk:.2%}")
print(f"  Sharpe Ratio: {sharpe:.2f}")

Maximum Sharpe Ratio Portfolio

Risk-free rate: 2.0%

Optimal weights:
  Tech: 16.4%
  Bonds: 55.3%
  Gold: 28.2%

Portfolio stats:
  Expected Return: 5.88%
  Volatility: 5.84%
  Sharpe Ratio: 0.66
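
When short selling is allowed, the tangency portfolio also has a closed form, $w \propto \Sigma^{-1}(\mu - r_f\mathbf{1})$, normalized so the weights sum to 1 (valid when the normalizing sum is positive). The sketch below re-declares the three-asset inputs so that it runs on its own, and can serve as a cross-check on the numerical optimizer: the two should agree whenever the long-only bounds are not binding.

```python
import numpy as np

# Same three-asset inputs as the example above, repeated so the block is self-contained
mu = np.array([0.12, 0.04, 0.06])
cov = np.array([
    [0.0400, 0.0040, 0.0000],
    [0.0040, 0.0025, -0.0010],
    [0.0000, -0.0010, 0.0144],
])
rf = 0.02

# Solve Sigma x = (mu - rf), then normalize to a fully invested portfolio
x = np.linalg.solve(cov, mu - rf)
w_tan = x / x.sum()

ret_tan = w_tan @ mu
vol_tan = np.sqrt(w_tan @ cov @ w_tan)
sharpe_tan = (ret_tan - rf) / vol_tan

print("Closed-form tangency weights:", np.round(w_tan, 3))
print(f"Sharpe ratio: {sharpe_tan:.2f}")
```

For these inputs all closed-form weights are positive, which is why the bounded optimizer lands on essentially the same portfolio.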


---

## 4. Value at Risk (VaR)

### Definition

VaR answers: "What loss level will **not be exceeded** at a given **confidence level** over a given **time horizon**?" It is a quantile of the loss distribution, not a worst-case loss.

$$P(\text{Loss} > \text{VaR}_\alpha) = 1 - \alpha$$

Example: "1-day 95% VaR = $1M" means:
- On 95% of days, losses will be less than $1M
- On 5% of days (1 in 20), losses are expected to exceed $1M
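
One way to make this interpretation concrete is to count exceedances. The sketch below draws simulated daily returns from a normal distribution (the portfolio size, mean, and volatility are illustrative assumptions), computes the 95% one-day VaR as a positive loss, and measures how often the simulated loss exceeds it. The fraction should come out close to 5%.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
portfolio_value = 1_000_000          # assumed portfolio size
mu_d, sigma_d = 0.0004, 0.015        # assumed daily mean and volatility of returns
alpha = 0.95

# 95% one-day VaR as a positive loss under the normal assumption
z = stats.norm.ppf(alpha)
var_95 = (z * sigma_d - mu_d) * portfolio_value

# Simulated daily losses (loss = -return * portfolio value)
losses = -rng.normal(mu_d, sigma_d, 100_000) * portfolio_value
exceed_rate = np.mean(losses > var_95)

print(f"95% VaR: ${var_95:,.0f}")
print(f"Share of days with loss above VaR: {exceed_rate:.2%}")
```

The same counting idea underlies VaR backtesting on real P&L, where an exceedance rate far from $1 - \alpha$ suggests the model is miscalibrated.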

### Calculation Methods

**1. Parametric (Normal) VaR**:

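$$\text{VaR}_\alpha = z_\alpha \sigma - \mu$$

Where $\mu$ and $\sigma$ are the mean and standard deviation of returns over the horizon, VaR is reported as a positive loss (multiply by the portfolio value for a dollar figure), and $z_\alpha$ is the standard normal quantile for confidence $\alpha$:
- 95% confidence: $z = 1.645$
- 99% confidence: $z = 2.326$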

**2. Historical VaR**: Use actual historical percentile

**3. Monte Carlo VaR**: Simulate many scenarios

In [6]:
from scipy import stats

# VaR calculation example
portfolio_value = 1_000_000  # $1M portfolio
daily_return = 0.0004        # 0.04% expected daily return
daily_vol = 0.015            # 1.5% daily volatility

confidence = 0.95
z_score = stats.norm.ppf(1 - confidence)  # Negative z for left tail

# Method 1: Parametric VaR
var_parametric = -(daily_return + z_score * daily_vol) * portfolio_value

# Method 2: Historical VaR (simulate historical returns)
np.random.seed(42)
historical_returns = np.random.normal(daily_return, daily_vol, 1000)
var_historical = -np.percentile(historical_returns, (1-confidence)*100) * portfolio_value

# Method 3: Monte Carlo VaR
simulated_returns = np.random.normal(daily_return, daily_vol, 10000)
var_montecarlo = -np.percentile(simulated_returns, (1-confidence)*100) * portfolio_value

print(f"Value at Risk Analysis")
print("="*50)
print(f"\nPortfolio: ${portfolio_value:,.0f}")
print(f"Confidence Level: {confidence:.0%}")
print(f"\n{'Method':<20} | {'1-Day VaR':>15}")
print("-"*40)
print(f"{'Parametric (Normal)':<20} | ${var_parametric:>14,.0f}")
print(f"{'Historical':<20} | ${var_historical:>14,.0f}")
print(f"{'Monte Carlo':<20} | ${var_montecarlo:>14,.0f}")

print(f"\n✓ Interpretation: On 5% of days, we could lose more than ~${var_parametric:,.0f}")

Value at Risk Analysis

Portfolio: $1,000,000
Confidence Level: 95%

Method               |       1-Day VaR
----------------------------------------
Parametric (Normal)  | $        24,273
Historical           | $        22,489
Monte Carlo          | $        24,539

✓ Interpretation: On 5% of days, we could lose more than ~$24,273


### VaR Limitations

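⚠️ **Limitations of VaR:**
- It says nothing about HOW BAD the loss could be beyond the VaR threshold
- It is not **sub-additive** in general: the VaR of a combined portfolio can exceed the sum of the individual VaRs, so diversification may appear to increase risk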
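
The failure of sub-additivity is easiest to see with a discrete example. The sketch below uses a hypothetical bond that loses 100 with probability 4% and nothing otherwise (numbers chosen purely for illustration), and computes the 95% VaR exactly as the smallest loss level whose cumulative probability reaches 95%.

```python
from itertools import product

def discrete_var(outcomes, alpha):
    """Smallest loss l with P(Loss <= l) >= alpha, given (loss, probability) pairs."""
    dist = {}
    for loss, p in outcomes:
        dist[loss] = dist.get(loss, 0.0) + p   # merge equal loss levels
    cum = 0.0
    for loss in sorted(dist):
        cum += dist[loss]
        if cum >= alpha:
            return loss
    return max(dist)

# Hypothetical bond: lose 100 with probability 4%, otherwise lose nothing
bond = [(0, 0.96), (100, 0.04)]

# Two independent copies of the bond held in one portfolio
combined = [(la + lb, pa * pb) for (la, pa), (lb, pb) in product(bond, bond)]

var_single = discrete_var(bond, 0.95)
var_combined = discrete_var(combined, 0.95)

print(f"95% VaR, one bond:  {var_single}")
print(f"95% VaR, two bonds: {var_combined}")
print(f"Sum of stand-alone VaRs: {2 * var_single}")
```

Each bond alone has a 95% VaR of 0 because its 4% default probability sits inside the 5% tail, while the pair has a 95% VaR of 100 because the chance of at least one default is about 7.8%. The combined VaR therefore exceeds the sum of the parts.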

This leads us to Expected Shortfall...

---

## 5. Expected Shortfall (CVaR)

### Definition

**Expected Shortfall** (also called **Conditional VaR** or **CVaR**): Average loss **given that** loss exceeds VaR.

$$ES_\alpha = E[\text{Loss} | \text{Loss} > \text{VaR}_\alpha]$$

### For Normal Distribution

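$$ES_\alpha = -\mu + \sigma \frac{\phi(z_\alpha)}{1-\alpha}$$

Where $\phi$ is the standard normal PDF, $\mu$ and $\sigma$ are the mean and standard deviation of returns, and ES is reported as a positive loss.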
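
The normal-distribution formula can be evaluated directly with `scipy.stats.norm`. The sketch below treats $\mu$ and $\sigma$ as the mean and standard deviation of daily returns and reports both VaR and ES as positive losses, so the mean enters with a minus sign (this sign convention, the zero mean, and the portfolio size are assumptions of the example).

```python
from scipy import stats

alpha = 0.95
mu_d, sigma_d = 0.0, 0.015       # assumed daily mean and volatility of returns
portfolio_value = 1_000_000      # assumed portfolio size

z = stats.norm.ppf(alpha)        # standard normal quantile at the confidence level

# Parametric VaR and ES, both expressed as positive dollar losses
var_normal = (z * sigma_d - mu_d) * portfolio_value
es_normal = (sigma_d * stats.norm.pdf(z) / (1 - alpha) - mu_d) * portfolio_value
ratio = es_normal / var_normal

print(f"95% VaR: ${var_normal:,.0f}")
print(f"95% ES:  ${es_normal:,.0f}")
print(f"ES / VaR: {ratio:.3f}")
```

With a zero mean the ratio depends only on the confidence level, not on $\sigma$, which makes it a convenient benchmark when judging how fat-tailed an empirical sample is.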

### Why CVaR?

| VaR | CVaR |
|-----|------|
| Tells you threshold | Tells you average loss in tail |
| Not coherent risk measure | Coherent (sub-additive) |
| Ignores tail shape | Captures tail risk |

In [7]:
# CVaR calculation
def calculate_cvar(returns, confidence):
    """Calculate CVaR (Expected Shortfall)"""
    var = np.percentile(returns, (1-confidence) * 100)
    cvar = returns[returns <= var].mean()
    return -cvar  # Return as positive loss

# Compare VaR and CVaR
np.random.seed(42)

# Normal returns
normal_returns = np.random.normal(0, 0.015, 10000)

# Fat-tailed returns (t-distribution)
fat_tail_returns = stats.t.rvs(df=4, loc=0, scale=0.012, size=10000)

print("VaR vs CVaR Comparison")
print("="*55)
print(f"\n{'Distribution':<15} | {'95% VaR':>10} | {'95% CVaR':>10} | {'Ratio':>8}")
print("-"*55)

# Loop variable is named sample so it does not overwrite the `returns` DataFrame from cell 1
for name, sample in [('Normal', normal_returns), ('Fat-Tailed', fat_tail_returns)]:
    var = -np.percentile(sample, 5) * portfolio_value
    cvar = calculate_cvar(sample, 0.95) * portfolio_value
    ratio = cvar / var
    
    print(f"{name:<15} | ${var:>9,.0f} | ${cvar:>9,.0f} | {ratio:>7.2f}x")

print("\n✓ Fat tails: CVaR much higher than VaR (tail losses are severe!)")
print("✓ Normal: CVaR ≈ 1.25x VaR (well-known result)")

VaR vs CVaR Comparison

Distribution    |    95% VaR |   95% CVaR |    Ratio
-------------------------------------------------------
Normal          | $   24,823 | $   31,131 |    1.25x
Fat-Tailed      | $   24,562 | $   37,626 |    1.53x

✓ Fat tails: CVaR much higher than VaR (tail losses are severe!)
✓ Normal: CVaR ≈ 1.25x VaR (well-known result)


---

## Summary: Week 5 Key Formulas

| Concept | Formula |
|---------|--------|
| Expected Portfolio Return | $E[r_p] = w^T \mu$ |
| Portfolio Variance | $\sigma_p^2 = w^T \Sigma w$ |
| Two-Asset Variance | $\sigma_p^2 = w_A^2\sigma_A^2 + w_B^2\sigma_B^2 + 2w_Aw_B\sigma_A\sigma_B\rho$ |
| Sharpe Ratio | $SR = \frac{E[r_p] - r_f}{\sigma_p}$ |
| Parametric VaR (positive loss) | $VaR_\alpha = z_\alpha \sigma - \mu$ |
| CVaR | $ES_\alpha = E[Loss \mid Loss > VaR_\alpha]$ |

### Key Takeaways

1. **Diversification works** when correlation < 1
2. **Efficient frontier** shows best risk-return tradeoffs
3. **Sharpe ratio** measures risk-adjusted performance
4. **VaR** tells threshold loss at confidence level
5. **CVaR** tells average loss in the tail (better for fat tails)

---

*Next Week: Factor Models*

## 🔴 PROS & CONS: THEORY

### ✅ PROS (Advantages)

| Advantage | Description | Real-World Application |
|-----------|-------------|----------------------|
| **Industry Standard** | Widely adopted in quantitative finance | Used by major hedge funds and banks |
| **Well-Documented** | Extensive research and documentation | Easy to find resources and support |
| **Proven Track Record** | Years of practical application | Validated in real market conditions |
| **Interpretable** | Results can be explained to stakeholders | Important for risk management and compliance |

### ❌ CONS (Limitations)

| Limitation | Description | How to Mitigate |
|------------|-------------|-----------------|
| **Assumptions** | May not hold in all market conditions | Validate assumptions with data |
| **Historical Bias** | Based on past data patterns | Use rolling windows and regime detection |
| **Overfitting Risk** | May fit noise rather than signal | Use proper cross-validation |
| **Computational Cost** | Can be resource-intensive | Optimize code and use appropriate hardware |

### 🎯 Real-World Usage

**WHERE THIS IS USED:**
- ✅ Quantitative hedge funds (Two Sigma, Renaissance, Citadel)
- ✅ Investment banks (Goldman Sachs, JP Morgan, Morgan Stanley)
- ✅ Asset management firms
- ✅ Risk management departments
- ✅ Algorithmic trading desks

**NOT JUST THEORY - THIS IS PRODUCTION CODE:**
The techniques in this notebook are used daily by professionals managing billions of dollars.