# Notebook 5: Optimal Portfolio Construction & Risk Attribution

### **Objective**
This final notebook completes the quantitative active management process. We take our two proprietary inputs, the **alpha signal ($\alpha$)** from Notebook 4 and the **multifactor risk model ($V$)** from Notebook 3, and combine them to construct the **optimal active portfolio**. We use the "Value Added" (`VA`) objective function from Grinold & Kahn to find the portfolio that maximizes expected alpha net of a penalty on active risk. Finally, we perform a full risk attribution on the resulting portfolio to identify the sources of its risk.

---

### **Methodology & Pipeline**

The process follows the strategic framework for active management laid out in Chapters 4 and 5 of "Active Portfolio Management."

*   **1. Assemble Inputs:** We load all the necessary components we have built:
    *   The benchmark-neutral alpha vector ($\alpha$).
    *   The full covariance matrix ($V = XFX^T + \Delta$) and its components ($X, F, \Delta$).
    *   The benchmark holdings ($h_B$) and the asset beta vector ($\beta$).

*   **2. Define the Active Manager's Objective Function:** We formalize the manager's goal. For a pure stock-picker who forgoes market timing, the objective is to maximize **Value Added (`VA`)**, defined as the trade-off between portfolio alpha and a penalty for the active risk taken.
    $$ \text{Maximize:} \quad VA = \alpha_p - \lambda_R \cdot \psi_p^2 $$
    Where:
    *   $\alpha_p = h_{PA}^T \alpha$ is the portfolio's active alpha.
    *   $\psi_p^2 = h_{PA}^T V h_{PA}$ is the portfolio's active variance (tracking error squared).
    *   $\lambda_R$ is the manager's aversion to active (residual) risk. We use $\lambda_R = 0.10$, which the code treats as a "moderate" value.

*   **3. Solve for the Optimal Active Portfolio ($h_{PA}^*$):** We solve this unconstrained optimization problem. The first-order condition provides a direct, analytical solution for the optimal active holdings:
    $$ h_{PA}^* = \frac{1}{2 \lambda_R} V^{-1} \alpha $$
    This formula gives us the precise overweights and underweights for each stock that will maximize our `VA` score.

*   **4. Construct and Analyze the Final Portfolio ($h_P^*$):** The final portfolio is the sum of the benchmark and our optimal active bets: $h_P^* = h_B + h_{PA}^*$. We then analyze the key properties of this portfolio: its total alpha, beta, and active risk.

*   **5. Perform Risk Attribution:** This is the crucial final step. We decompose the portfolio's active risk into its fundamental sources to understand the character of our bets. First, we calculate the portfolio's active factor exposures, $x_{PA}$, which represent the portfolio's factor tilts relative to the benchmark:
    $$ \underset{(K \times 1)}{x_{PA}} = \underset{(K \times N)}{X^T} \cdot \underset{(N \times 1)}{h_{PA}} $$
    We then separate the active variance into two orthogonal components:
    $$ \psi_p^2 = \underbrace{x_{PA}^T F x_{PA}}_{\text{Active Factor Risk}} + \underbrace{h_{PA}^T \Delta h_{PA}}_{\text{Active Specific Risk}} $$
    This decomposition reveals how much of our tracking error is coming from our systematic factor tilts (e.g., Value, Momentum) versus our idiosyncratic stock-selection bets.
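
The first-order condition behind step 3 can be written out explicitly. The following is a short derivation sketch using only the symbols defined above; it assumes $V$ is symmetric and positive definite, so that $V^{-1}$ exists and the stationary point is a maximum.

$$ \frac{\partial VA}{\partial h_{PA}} = \alpha - 2 \lambda_R V h_{PA} = 0 \quad \Longrightarrow \quad h_{PA}^* = \frac{1}{2 \lambda_R} V^{-1} \alpha $$

Substituting $h_{PA}^*$ back into the definitions of $\alpha_p$, $\psi_p^2$, and $VA$ gives:

$$ \alpha_p^* = \frac{\alpha^T V^{-1} \alpha}{2 \lambda_R}, \qquad \psi_p^{*2} = \frac{\alpha^T V^{-1} \alpha}{4 \lambda_R^2}, \qquad VA^* = \frac{\alpha^T V^{-1} \alpha}{4 \lambda_R} = \frac{\alpha_p^*}{2} $$

At the unconstrained optimum, the Value Added therefore equals half of the portfolio alpha, which is a convenient check on any numerical result.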

---

### **Key Concepts & Theoretical Justification**

#### **1. The Value Added (VA) Objective**

The traditional Markowitz objective $(f - \lambda_T \sigma_p^2)$ is flawed for institutional managers as it fails to distinguish between benchmark risk and active risk, often leading to portfolios with unacceptable tracking error. The $VA$ objective function correctly focuses the manager on the trade-off they actually control: the one between the active return they generate ($\alpha_p$) and the active risk they take ($\psi_p^2$). It is the appropriate objective for a manager whose skill is judged relative to a benchmark.

#### **2. The Optimal Unconstrained Active Portfolio**

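The solution $h_{PA}^* \propto V^{-1} \alpha$ is a central result. It is the **characteristic portfolio of the alphas** (Portfolio A), $h_A = V^{-1} \alpha / (\alpha^T V^{-1} \alpha)$, scaled by $IR^2 / (2 \lambda_R)$, where $IR = \sqrt{\alpha^T V^{-1} \alpha}$ is the maximum attainable information ratio. Because the information ratio $\alpha_p / \psi_p$ is unchanged by any positive rescaling of the active holdings, the portfolio that maximizes the $VA$ objective also attains the maximum Information Ratio; $\lambda_R$ only determines how aggressively that portfolio is held. In the absence of constraints, the solution is therefore an efficient active portfolio.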
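
The scaling argument above can be checked numerically. The sketch below is illustrative only: the covariance matrix and alphas are synthetic, and the names `information_ratio`, `h_moderate`, and `h_restrained` are hypothetical helpers introduced here, not part of the notebook's pipeline.

```python
import numpy as np

rng = np.random.default_rng(0)
N = 6  # hypothetical universe size

# Synthetic positive-definite covariance matrix and alpha vector (illustrative only)
A = rng.normal(size=(N, N))
V = A @ A.T / N + 0.05 * np.eye(N)
alpha = rng.normal(scale=0.02, size=N)


def information_ratio(h, alpha, V):
    """Portfolio alpha divided by active risk for active holdings h."""
    return (h @ alpha) / np.sqrt(h @ V @ h)


# Compute V^{-1} alpha with a linear solve rather than an explicit inverse
v_inv_alpha = np.linalg.solve(V, alpha)
max_ir = np.sqrt(alpha @ v_inv_alpha)

# Characteristic portfolio of the alphas: unit alpha exposure at minimum risk
h_A = v_inv_alpha / (alpha @ v_inv_alpha)

# VA-optimal active holdings for two different risk aversions
h_moderate = v_inv_alpha / (2 * 0.10)
h_restrained = v_inv_alpha / (2 * 0.15)
ir_moderate = information_ratio(h_moderate, alpha, V)
ir_restrained = information_ratio(h_restrained, alpha, V)

# An arbitrary active portfolio for comparison
h_random = rng.normal(size=N)
ir_random = information_ratio(h_random, alpha, V)

print(f"Maximum IR:            {max_ir:.4f}")
print(f"IR at lambda_R = 0.10: {ir_moderate:.4f}")
print(f"IR at lambda_R = 0.15: {ir_restrained:.4f}")
print(f"IR of random holdings: {ir_random:.4f}")
```

Both optimal portfolios are multiples of `h_A`, so they share the same information ratio; changing `lambda_R` only changes the size of the positions.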

#### **3. Risk Decomposition: Factor vs. Specific**

The final risk attribution is the ultimate diagnostic tool for a quantitative manager.
*   **Active Factor Risk** tells you the risk from your systematic bets. Is your portfolio's risk driven by a large, intentional bet on the "Value" factor, or by an unintended "incidental" bet on an industry?
*   **Active Specific Risk** tells you the risk from your stock selection. It is the risk that remains after accounting for all common factor exposures.

For a pure stock-picker, we expect the majority of the active risk to be in the "Specific Risk" bucket, confirming that the portfolio's risk is aligned with the manager's intended source of alpha.
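
The decomposition can be sketched on synthetic inputs as follows. All data here are randomly generated for illustration, and the per-factor split (`factor_contrib`) is one common allocation convention assumed for this example, not something the notebook itself computes.

```python
import numpy as np

rng = np.random.default_rng(1)
N, K = 8, 3  # hypothetical: 8 stocks, 3 factors

# Synthetic risk model pieces (illustrative only)
X = rng.normal(size=(N, K))                        # factor exposures, N x K
B = rng.normal(size=(K, K))
F = 0.01 * (B @ B.T) + 0.001 * np.eye(K)           # factor covariance, K x K
Delta = np.diag(rng.uniform(0.02, 0.10, size=N))   # diagonal specific variances
V = X @ F @ X.T + Delta

# Synthetic active holdings that sum to zero
h_PA = rng.normal(scale=0.02, size=N)
h_PA -= h_PA.mean()

# Active factor exposures: x_PA = X^T h_PA
x_PA = X.T @ h_PA

# Split active variance into factor and specific parts
factor_var = x_PA @ F @ x_PA
specific_var = h_PA @ Delta @ h_PA
total_var = h_PA @ V @ h_PA

factor_share = factor_var / total_var
specific_share = specific_var / total_var

# Assumed convention: attribute factor variance to factor k as x_k * (F x)_k,
# which sums to the total active factor variance
factor_contrib = x_PA * (F @ x_PA)

print(f"Tracking error:        {np.sqrt(total_var):.4%}")
print(f"Share from factors:    {factor_share:.2%}")
print(f"Share from specific:   {specific_share:.2%}")
print("Per-factor variance contributions:", factor_contrib)
```

Individual entries of `factor_contrib` can be negative when a factor tilt hedges another, so the per-factor figures are best read together rather than in isolation.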

---
**Output:** This notebook produces a final "Risk & Attribution Report" for our optimal portfolio. It provides a clear, quantitative summary of the portfolio's expected characteristics and a deep dive into the sources of its active risk, completing the end-to-end demonstration of the Grinold-Kahn framework.



In [23]:
import pandas as pd
import numpy as np
import os

print("Libraries imported successfully.")

# --- Load all the building blocks we've created ---
DATA_DIR = 'data'
EXPOSURES_FILE = os.path.join(DATA_DIR, 'factor_exposures.csv')
ALPHA_FILE = os.path.join(DATA_DIR, 'alpha_vector.csv')

X = pd.read_csv(EXPOSURES_FILE, index_col=0)

# Load the alpha vector and cast it to float64 so the matrix algebra below operates on numeric data
alpha_vector = pd.read_csv(ALPHA_FILE, index_col=0).squeeze("columns")
alpha_vector = alpha_vector.astype('float64')
alpha_vector.name = 'alpha'

# Align data to ensure we have a consistent universe
common_stocks = X.index.intersection(alpha_vector.index)
X = X.loc[common_stocks]
alpha_vector = alpha_vector.loc[common_stocks]

print("Factor exposures (X) and Alpha vector (alpha) loaded successfully.")
alpha_vector.info()  # info() prints its summary and returns None, so it needs no print() wrapper

Libraries imported successfully.
Factor exposures (X) and Alpha vector (alpha) loaded successfully.
<class 'pandas.core.series.Series'>
Index: 10 entries, AAPL to XOM
Series name: alpha
Non-Null Count  Dtype  
--------------  -----  
10 non-null     float64
dtypes: float64(1)
memory usage: 160.0+ bytes


In [24]:
print(alpha_vector)

AAPL    -0.110733
AMZN     0.172271
GOOGL    0.091524
JNJ     -0.262013
JPM     -0.074117
MSFT     0.201647
PG      -0.090584
TSLA    -0.342861
UNH     -0.124395
XOM     -0.255246
Name: alpha, dtype: float64


In [25]:
# --- Reconstruct the Risk Model from Notebook 3 ---

# Load the historical estimates we saved
factor_returns = pd.read_csv('data/factor_returns.csv', index_col=0, parse_dates=True)
specific_returns = pd.read_csv('data/specific_returns.csv', index_col=0, parse_dates=True)
market_caps = pd.read_csv('data/market_caps.csv', index_col=0, header=None).squeeze("columns")
market_caps.index.name = 'Ticker' # Clean up index name

# Align everything to our common stock universe
common_stocks = X.index
factor_returns = factor_returns.reindex(columns=X.columns.union(['const'])) # Ensure columns are aligned
specific_returns = specific_returns.reindex(columns=common_stocks)
market_caps = market_caps.reindex(index=common_stocks)

# 1. Calculate the Factor Covariance Matrix (F) - EXCLUDING the constant
factor_names = ['Size', 'Value', 'Momentum']
F = factor_returns[factor_names].cov() * 12

# 2. Calculate the Specific Risk Matrix (Delta)
specific_variances = specific_returns.var() * 12
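# Build Delta as a labeled DataFrame so it aligns with the systematic covariance by ticker.
# Note: missing specific variances are filled with 0, and a zero entry can make V singular.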
Delta = pd.DataFrame(np.diag(specific_variances.fillna(0)), 
                     index=common_stocks, 
                     columns=common_stocks)

# 3. Assemble the Total Covariance Matrix (V)
# Use the X matrix for factors only (already N x 3)
systematic_cov = X @ F @ X.T
V = systematic_cov + Delta

# 4. Calculate Benchmark Properties
h_B = market_caps / market_caps.sum()
sigma_B_sq = h_B.T @ V @ h_B

# 5. Calculate the Beta Vector
beta_vector = (V @ h_B) / sigma_B_sq
beta_vector.name = 'beta'

# 6. Calculate the Residual Covariance Matrix (V_R)
# Note: np.outer creates a NumPy array, so we wrap it in a DataFrame
beta_outer_prod = pd.DataFrame(np.outer(beta_vector, beta_vector), 
                               index=common_stocks, 
                               columns=common_stocks)
V_R = V - (sigma_B_sq * beta_outer_prod)

print("Risk model components (V, V_R, Beta) constructed successfully.")


Risk model components (V, V_R, Beta) constructed successfully.


In [26]:
# --- Set Up the Portfolio Optimization ---

# We will solve for the optimal active portfolio (h_PA) that maximizes Value Added.
# As per Footnote 7 in Chapter 5, we use the simplified objective for a pure stock-picker.
# Objective: Maximize VA = alpha_p - lambda_R * psi_p^2

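# Let's define our risk aversion. We'll be a "moderate" manager.
# Caveat: Grinold & Kahn quote 0.10 as "moderate" with alpha and risk measured in percent.
# If the inputs here are in decimals, the equivalent value would be 100x larger (lambda_R = 10).
lambda_R = 0.10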

# The objective function in terms of holdings is:
# Maximize: h_PA.T @ alpha - lambda_R * (h_PA.T @ V @ h_PA)
# Note: We use the full V matrix for active risk psi_p^2, not V_R.

print(f"Optimization parameters set with lambda_R = {lambda_R}")


Optimization parameters set with lambda_R = 0.1


In [27]:
# --- Solve for the Optimal Active Holdings ---

# The First-Order Condition from the VA maximization gives us the formula for the optimal unconstrained active portfolio.
# Formula: h_PA = (1 / (2 * lambda_R)) * V_inverse * alpha

# --- Step 1: Convert pandas objects to NumPy arrays for robust calculation ---
# This prevents potential data type errors during matrix operations.
V_numpy = V.to_numpy(dtype='float64')
alpha_numpy = alpha_vector.to_numpy(dtype='float64')  # Explicitly cast to float64

# --- Step 2: Perform the linear algebra ---

# Calculate the inverse of the total covariance matrix V
try:
    V_inv_numpy = np.linalg.inv(V_numpy)
except np.linalg.LinAlgError as e:
    print("CRITICAL ERROR: The Covariance Matrix (V) is singular and cannot be inverted.")
    print("This can happen if assets have zero specific variance or perfectly correlated returns in the sample.")
    print(f"Error details: {e}")
    # Stop execution if V is not invertible
    raise

# Calculate the optimal active holdings as a NumPy array
# This is the core calculation: h* = scale * V_inv * alpha
scalar_multiplier = 1 / (2 * lambda_R)
optimal_h_PA_numpy = scalar_multiplier * (V_inv_numpy @ alpha_numpy)

# --- Step 3: Convert the result back to a pandas Series for analysis ---
# Attaching the stock ticker index makes the output readable and easy to work with.
optimal_h_PA = pd.Series(optimal_h_PA_numpy, index=V.index)
optimal_h_PA.name = 'active_holding'

print("Optimal active holdings (h_PA) calculated successfully.")
print("\n--- Top 5 Overweight Positions ---")
print(optimal_h_PA.sort_values(ascending=False).head(5))
print("\n--- Top 5 Underweight Positions ---")
print(optimal_h_PA.sort_values(ascending=True).head(5))

# --- Sanity Check: The Active Beta ---
# Per Footnote 7 of Chapter 5: if we maximize alpha minus a penalty on TOTAL active
# risk (psi^2) and the alphas are benchmark-neutral (h_B.T @ alpha = 0), the optimal
# portfolio has zero active beta, because h_PA.T @ beta is proportional to alpha.T @ h_B.
active_beta_check = optimal_h_PA.T @ beta_vector
print(f"\nSanity Check: The active beta of this optimal portfolio is: {active_beta_check:.4f}")
print("(A value very close to zero confirms the theory).")


Optimal active holdings (h_PA) calculated successfully.

--- Top 5 Overweight Positions ---
MSFT     41.371068
PG       19.961549
AMZN     17.419180
GOOGL    13.118236
UNH      10.122593
Name: active_holding, dtype: float64

--- Top 5 Underweight Positions ---
TSLA   -41.985713
JNJ    -36.094302
AAPL   -30.375075
XOM    -26.436056
JPM      2.104268
Name: active_holding, dtype: float64

Sanity Check: The active beta of this optimal portfolio is: -0.0000
(A value very close to zero confirms the theory).


In [28]:
# --- Analyze the Optimal Portfolio's Properties ---

# Construct the full optimal portfolio
h_P_optimal = h_B + optimal_h_PA

# 1. Calculate Portfolio Alpha and Beta
alpha_p = h_P_optimal.T @ alpha_vector # Note: Since h_B.T @ alpha is 0, this is also optimal_h_PA.T @ alpha
beta_p = h_P_optimal.T @ beta_vector
active_beta = beta_p - 1

# 2. Calculate Active Risk (Tracking Error)
active_variance = optimal_h_PA.T @ V @ optimal_h_PA
psi_p = np.sqrt(active_variance)

# 3. Decompose the Active Risk
active_factor_variance = (optimal_h_PA.T @ X @ F @ X.T @ optimal_h_PA)
active_specific_variance = (optimal_h_PA.T @ Delta @ optimal_h_PA)

pct_from_factors = active_factor_variance / active_variance
pct_from_specific = active_specific_variance / active_variance

# 4. Calculate final VA score
value_added = alpha_p - lambda_R * active_variance

# --- Print the Risk Report ---
print("\n--- Optimal Portfolio Analysis ---")
print(f"Target Lambda_R: {lambda_R}")
print("-" * 35)
print(f"Portfolio Alpha: {alpha_p:.4f}")
print(f"Portfolio Beta: {beta_p:.2f} (Active Beta: {active_beta:.2f})")
print(f"Active Risk (Tracking Error): {psi_p:.4f}")
print(f"  - % from Factor Bets: {pct_from_factors:.2%}")
print(f"  - % from Stock Selection: {pct_from_specific:.2%}")
print("-" * 35)
print(f"Final Value Added Score: {value_added:.4f}")



--- Optimal Portfolio Analysis ---
Target Lambda_R: 0.1
-----------------------------------
Portfolio Alpha: 43.2841
Portfolio Beta: 1.00 (Active Beta: -0.00)
Active Risk (Tracking Error): 14.7112
  - % from Factor Bets: 14.52%
  - % from Stock Selection: 85.48%
-----------------------------------
Final Value Added Score: 21.6421
