# Comparing Hedge Ratio Estimation: OLS vs PCA



This notebook explains and compares two common methods for constructing market-neutral spreads from futures contracts (e.g., WTI F1 and F2): **OLS Regression** and **Principal Component Analysis (PCA)**.



## OLS Regression Approach



### Method



Run a linear regression:

$$
F1_t = \alpha + \beta \cdot F2_t + \varepsilon_t
$$

The **hedge ratio** is:

$$
\text{HR}_{\text{OLS}} = \beta
$$

Construct the synthetic spread as:

$$
\text{Spread}_{\text{OLS}} = F1_t - \beta \cdot F2_t
$$
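
The regression above can be sketched in a few lines of NumPy. This is a minimal illustration on synthetic prices, not real WTI data; the names `f1`, `f2`, `true_beta`, and the noise levels are assumptions chosen only for the example.

```python
import numpy as np

# Synthetic, illustrative prices (assumption: not real WTI data).
# F2 is a random walk; F1 tracks it with a known slope plus stationary noise.
rng = np.random.default_rng(0)
n = 500
f2 = 70.0 + np.cumsum(rng.normal(0.0, 0.5, n))
true_beta = 1.05
f1 = 2.0 + true_beta * f2 + rng.normal(0.0, 0.3, n)

# Design matrix [1, F2_t] so the fit includes the intercept alpha.
X = np.column_stack([np.ones(n), f2])
(alpha, beta), *_ = np.linalg.lstsq(X, f1, rcond=None)

# Spread_OLS = F1_t - beta * F2_t; its sample mean equals alpha.
spread_ols = f1 - beta * f2
print(f"alpha={alpha:.3f}, beta={beta:.3f}")
```

Because the intercept is not subtracted, the spread oscillates around `alpha` rather than around zero.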



### Interpretation
- Models F1 as being explained by F2.
- The spread equals the regression residual plus the constant $\alpha$, so it represents the **part of F1 not explained by F2**.
- If F1 and F2 are cointegrated, the spread is stationary and tends to be **mean-reverting**.
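
One rough way to look at mean reversion is to fit an AR(1) model to the demeaned spread and read off the persistence coefficient. The sketch below is only a crude diagnostic on synthetic data, not a formal cointegration test; the AR(1) noise with coefficient 0.9 and all variable names are assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(4)
n = 1000

# Synthetic F2: a random walk (assumption, not real data).
f2 = 70.0 + np.cumsum(rng.normal(0.0, 0.5, n))

# Stationary AR(1) deviation with persistence 0.9, so F1 and F2 are cointegrated.
shocks = rng.normal(0.0, 0.2, n)
eps = np.zeros(n)
for t in range(1, n):
    eps[t] = 0.9 * eps[t - 1] + shocks[t]
f1 = 2.0 + 1.05 * f2 + eps

# np.polyfit with degree 1 returns [slope, intercept].
beta, alpha = np.polyfit(f2, f1, 1)
spread = f1 - beta * f2

# AR(1) coefficient of the demeaned spread: values below 1 suggest mean reversion.
dev = spread - spread.mean()
phi = float(dev[:-1] @ dev[1:] / (dev[:-1] @ dev[:-1]))

# Number of periods for a deviation to decay by half under the AR(1) model.
half_life = np.log(0.5) / np.log(phi)
print(f"phi={phi:.3f}, half-life={half_life:.1f} periods")
```

A coefficient close to 1 would indicate a very persistent spread, in which case the mean-reversion assumption deserves a proper statistical test.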


### Pros
- Easy to interpret and implement.
- Works well for **pairs trading** with **cointegrated assets**.

### Cons
- Assumes **directionality** (F1 depends on F2).
- Sensitive to outliers or structural breaks.
- Does not extend naturally to more than 2 assets, because one asset must be singled out as the dependent variable.
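
The directionality issue can be made concrete by running the regression both ways. In the sketch below (synthetic data; `ols_slope` is a hypothetical helper written for this example), the product of the two slopes equals the squared correlation, so the hedge ratio implied by the reverse regression differs from the forward one whenever the fit is not perfect.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 500

# Synthetic prices (assumption, not real data).
f2 = 70.0 + np.cumsum(rng.normal(0.0, 0.5, n))
f1 = 2.0 + 1.05 * f2 + rng.normal(0.0, 1.0, n)

def ols_slope(y, x):
    """Slope b of the regression y = a + b * x + e."""
    xc = x - x.mean()
    return float(xc @ (y - y.mean()) / (xc @ xc))

beta_f1_on_f2 = ols_slope(f1, f2)  # F1 as the dependent variable
beta_f2_on_f1 = ols_slope(f2, f1)  # F2 as the dependent variable
r_squared = float(np.corrcoef(f1, f2)[0, 1] ** 2)

# Hedge ratio for F1 - h * F2 implied by the reverse regression.
implied_from_reverse = 1.0 / beta_f2_on_f1

print(f"forward beta       = {beta_f1_on_f2:.4f}")
print(f"1 / reverse beta   = {implied_from_reverse:.4f}")
print(f"product of slopes  = {beta_f1_on_f2 * beta_f2_on_f1:.4f} (r^2 = {r_squared:.4f})")
```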



## PCA Approach

### Method
Apply **Principal Component Analysis** to prices or returns of F1 and F2.

- PCA identifies **orthogonal components** (linear combinations) that explain the variance.
- The first principal component (PC1) captures the **common trend** (i.e., oil price level).
- The second principal component (PC2) captures **relative deviations**.

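The common market exposure is captured by PC1, with loadings $(v_1, v_2)$:

$$
PC1_t = v_1 \cdot F1_t + v_2 \cdot F2_t
$$

PC2 is the orthogonal combination, with loadings $(w_1, w_2)$:

$$
PC2_t = w_1 \cdot F1_t + w_2 \cdot F2_t
$$

Construct the synthetic spread by rescaling PC2 so that the weight on F1 is 1 (up to an additive constant when the inputs are mean-centered):

$$
\text{Spread}_{\text{PCA}} = \frac{PC2_t}{w_1} = F1_t - \gamma \cdot F2_t
$$

Where:

$$
\gamma = -\frac{w_2}{w_1}
$$

with $w_1, w_2$ being the loadings from the eigenvector of PC2. Because the two eigenvectors are orthogonal, this is equivalent to $\gamma = v_1 / v_2$ in terms of the PC1 loadings.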
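
A minimal NumPy sketch of this construction follows, using an eigendecomposition of the sample covariance of prices. It takes the sign convention that PC2 is written as $w_1 \cdot F1 + w_2 \cdot F2$, so the coefficient that makes $F1 - \gamma \cdot F2$ proportional to PC2 is $\gamma = -w_2 / w_1$. The data are synthetic and the variable names are assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 500

# Synthetic prices driven by one common factor (assumption, not real data).
common = 70.0 + np.cumsum(rng.normal(0.0, 0.5, n))
f1 = common + rng.normal(0.0, 0.3, n)
f2 = 0.95 * common + rng.normal(0.0, 0.3, n)

prices = np.column_stack([f1, f2])
cov = np.cov(prices, rowvar=False)

# eigh returns eigenvalues in ascending order:
# column 0 is the low-variance component (PC2), column 1 is PC1.
eigvals, eigvecs = np.linalg.eigh(cov)
w1, w2 = eigvecs[:, 0]  # PC2 loadings
v1, v2 = eigvecs[:, 1]  # PC1 loadings

# Rescale PC2 so the weight on F1 is 1; the ratio is unaffected by eigenvector sign flips.
gamma = -w2 / w1
spread_pca = f1 - gamma * f2

# In-sample, the spread is uncorrelated with the PC1 scores by construction.
pc1_scores = prices @ eigvecs[:, 1]
cov_with_pc1 = np.cov(spread_pca, pc1_scores)[0, 1]
print(f"gamma={gamma:.4f}, cov(spread, PC1)={cov_with_pc1:.2e}")
```

Running PCA on prices versus returns can give different loadings, so the choice of input should match how the spread will be traded.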



### Pros
- Symmetric treatment of F1 and F2.
- Removes exposure to PC1 (the **systematic market beta**) by construction, at least in-sample.
- Scales well to **more than 2 instruments**.

### Cons
- Less interpretable economically.
- Doesn't guarantee stationarity or cointegration.
- Eigenvectors may shift over time.
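
To see how the eigenvectors can drift, one option is to re-estimate the PCA hedge ratio over a rolling window. The sketch below uses synthetic data with a deliberate change in the F1/F2 relationship halfway through the sample; the helper names `pca_hedge_ratio` and `rolling_pca_hedge_ratio`, the window length, and the break itself are all assumptions made for illustration.

```python
import numpy as np

def pca_hedge_ratio(f1, f2):
    """Hedge ratio gamma = -w2 / w1 from the PC2 loadings of the price covariance."""
    cov = np.cov(np.column_stack([f1, f2]), rowvar=False)
    _, eigvecs = np.linalg.eigh(cov)  # ascending eigenvalues: column 0 is PC2
    w1, w2 = eigvecs[:, 0]
    return -w2 / w1

def rolling_pca_hedge_ratio(f1, f2, window):
    """Re-estimate gamma on each trailing window; entries before the first full window are NaN."""
    n = len(f1)
    out = np.full(n, np.nan)
    for t in range(window, n + 1):
        out[t - 1] = pca_hedge_ratio(f1[t - window:t], f2[t - window:t])
    return out

rng = np.random.default_rng(3)
n = 600
common = 70.0 + np.cumsum(rng.normal(0.0, 1.0, n))

# Synthetic structural break: F2's sensitivity to the common factor drops from 1.0 to 0.8.
slope = np.where(np.arange(n) < n // 2, 1.0, 0.8)
f1 = common + rng.normal(0.0, 0.2, n)
f2 = slope * common + rng.normal(0.0, 0.2, n)

gammas = rolling_pca_hedge_ratio(f1, f2, window=120)

pre_break = np.median(gammas[119:300])   # windows entirely before the break
post_break = np.median(gammas[419:])     # windows entirely after the break
print(f"median gamma before break: {pre_break:.3f}, after break: {post_break:.3f}")
```

Windows that straddle the break can produce erratic estimates, which is one reason a fixed full-sample hedge ratio may be unreliable.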