# **Beta Update Equation (Kalman Filter) for Pair Trading**

## **State-Space Model**

### **State Equation**
$$
\beta_t = \beta_{t-1} + w_t, \quad w_t \sim N(0, Q)
$$
- \(\beta_t\) (the hedge ratio) follows a random walk driven by process noise \(w_t\) with variance \(Q\).

### **Observation Equation**
$$
y_t = \beta_t x_t + v_t, \quad v_t \sim N(0, R)
$$
- \(y_t\) is the SPY price, \(x_t\) is the DJIA price, and \(v_t\) is the measurement noise with variance \(R\).
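
To make the two equations concrete, here is a minimal simulation sketch of the state-space model. The function name `simulate_pair`, the starting level of 100 for \(x_t\), and the parameter values are illustrative assumptions, not part of the SPY/DJIA data.

```python
import numpy as np

def simulate_pair(n, Q, R, beta0=1.0, seed=0):
    """Simulate beta_t = beta_{t-1} + w_t and y_t = beta_t * x_t + v_t."""
    rng = np.random.default_rng(seed)
    # Hypothetical regressor path: a random walk starting at 100, standing in for x_t
    x = 100.0 + np.cumsum(rng.normal(0.0, 1.0, size=n))
    w = rng.normal(0.0, np.sqrt(Q), size=n)  # process noise with variance Q
    v = rng.normal(0.0, np.sqrt(R), size=n)  # measurement noise with variance R
    beta = beta0 + np.cumsum(w)              # state equation (random walk)
    y = beta * x + v                         # observation equation
    return x, y, beta

x_sim, y_sim, beta_sim = simulate_pair(n=500, Q=1e-4, R=1.0)
print(beta_sim[:3], y_sim[:3])
```

Setting `Q=0` and `R=0` in this sketch collapses the model to a fixed hedge ratio with `y = beta0 * x`, which is a quick way to sanity-check a filter implementation.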

## **Kalman Filter Update Steps**

### **Prediction**
$$
\hat{\beta}_{t|t-1} = \hat{\beta}_{t-1}
$$
$$
P_{t|t-1} = P_{t-1} + Q
$$

### **Update**
$$
K_t = P_{t|t-1} x_t^T (x_t P_{t|t-1} x_t^T + R)^{-1}
$$
$$
\hat{\beta}_t = \hat{\beta}_{t|t-1} + K_t (y_t - x_t \hat{\beta}_{t|t-1})
$$
$$
P_t = (1 - K_t x_t) P_{t|t-1}
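$$

### **Final Beta Update Equation**

$$
\hat{\beta}_{t \mid t} = \hat{\beta}_{t \mid t-1} + \frac{S_{t, CA}}{S_{t, CA}^2 + \gamma^{-1}} \left( S_{t, AU} - \hat{\beta}_{t \mid t-1} S_{t, CA} \right)
$$

This matches the update step above if \(S_{t, CA}\) is read as \(x_t\), \(S_{t, AU}\) as \(y_t\), and \(\gamma^{-1} = R / P_{t|t-1}\): dividing the numerator and denominator of \(K_t\) by \(P_{t|t-1}\) gives \(K_t = x_t / (x_t^2 + R / P_{t|t-1})\).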
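
The fraction in the final equation is the scalar Kalman gain written in a different form. The sketch below checks this numerically, assuming \(\gamma = P_{t|t-1} / R\) and treating \(S_{t, CA}\) as \(x_t\) and \(S_{t, AU}\) as \(y_t\); the function names `update_standard` and `update_gamma` are hypothetical.

```python
import numpy as np

def update_standard(y, x, beta_pred, P_pred, R):
    """Scalar Kalman update with gain K = P x / (x^2 P + R)."""
    K = P_pred * x / (x * P_pred * x + R)
    return beta_pred + K * (y - x * beta_pred)

def update_gamma(y, x, beta_pred, gamma):
    """Same update written with gain x / (x^2 + 1/gamma)."""
    return beta_pred + x / (x**2 + 1.0 / gamma) * (y - beta_pred * x)

P_pred, R = 1.01, 1.0
gamma = P_pred / R  # assumed identification of gamma
a = update_standard(145.0, 113.0, 0.0, P_pred, R)
b = update_gamma(145.0, 113.0, 0.0, gamma)
print(a, b)
assert np.isclose(a, b)  # both forms give the same updated beta
```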


In [None]:
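import numpy as np

# Initial state estimate, its variance, and the noise variances
beta_prev = 0.0
P_prev = 1.0
Q = 0.01  # process noise variance
R = 1.0   # measurement noise variance

def kalman_beta_update(y_t, x_t, beta_prev, P_prev, Q, R):
    """Run one scalar Kalman predict/update step and return (beta, P)."""
    # Prediction: the state is a random walk, so the mean carries over and the variance grows by Q
    beta_pred = beta_prev
    P_pred = P_prev + Q
    # Update: Kalman gain, innovation correction, posterior variance
    K_t = P_pred * x_t / (x_t * P_pred * x_t + R)
    beta_updated = beta_pred + K_t * (y_t - x_t * beta_pred)
    P_updated = (1 - K_t * x_t) * P_pred
    return beta_updated, P_updated

# Example observation (y_t, x_t)
y_t = 145
x_t = 113

beta_new, P_new = kalman_beta_update(y_t, x_t, beta_prev, P_prev, Q, R)
print("Updated Beta:", beta_new)
print("Updated Covariance:", P_new)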
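
The single step above extends to a whole series by feeding each posterior back in as the next prior. A minimal sketch, assuming the same scalar model and the same default values of `Q`, `R`, and the initial state; the function name `kalman_beta_path` and the synthetic data are illustrative only.

```python
import numpy as np

def kalman_beta_path(y, x, beta0=0.0, P0=1.0, Q=0.01, R=1.0):
    """Apply the scalar predict/update step to every (y_t, x_t) pair in order."""
    y = np.asarray(y, dtype=float)
    x = np.asarray(x, dtype=float)
    betas = np.empty(len(y))
    variances = np.empty(len(y))
    beta, P = beta0, P0
    for t in range(len(y)):
        P_pred = P + Q                                  # prediction
        K = P_pred * x[t] / (x[t] * P_pred * x[t] + R)  # gain
        beta = beta + K * (y[t] - x[t] * beta)          # state update
        P = (1.0 - K * x[t]) * P_pred                   # variance update
        betas[t], variances[t] = beta, P
    return betas, variances

# Synthetic check: noiseless data with a constant hedge ratio of 1.3
x_demo = np.linspace(100.0, 120.0, 50)
y_demo = 1.3 * x_demo
betas_demo, vars_demo = kalman_beta_path(y_demo, x_demo)
print(betas_demo[-1])
```

On this noiseless series the estimate should settle at the true ratio, which makes it a convenient regression test before running the filter on real prices.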


**Question 2: Coding**

In [None]:

import pandas as pd
import numpy as np
import statsmodels.api as sm
import matplotlib.pyplot as plt
from pykalman import KalmanFilter

# Load the two price series
spy_df = pd.read_csv("spy.csv")
djia_df = pd.read_csv("djia.csv")

spy_df["Date"] = pd.to_datetime(spy_df["Date"])
djia_df["Date"] = pd.to_datetime(djia_df["Date"])

# Inner join on Date keeps only days present in both files; sort so both estimators see time order
merged_df = (
    pd.merge(spy_df, djia_df, on="Date", suffixes=("_SPY", "_DJIA"))
    .sort_values("Date")
    .reset_index(drop=True)
)

x = merged_df["PX_LAST_DJIA"].values  # regressor x_t
y = merged_df["PX_LAST_SPY"].values   # observation y_t


In [None]:
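# Rolling OLS of y on x (with intercept) over the previous `window` observations
betas = []
window = 60
for i in range(window, len(x)):
    X = sm.add_constant(x[i-window:i])
    model = sm.OLS(y[i-window:i], X).fit()
    betas.append(model.params[1])  # slope; params[0] is the intercept
# Pad the first `window` entries with NaN so the series aligns with x and y
rolling_beta = np.concatenate([np.full(window, np.nan), betas])

# Time-varying observation matrix: one row [x_t] per time step
observation_matrices = np.array([[xi] for xi in x])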
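
For a single regressor with an intercept, the OLS slope equals the sample covariance of \(x\) and \(y\) divided by the sample variance of \(x\), so the rolling beta can also be computed without fitting a model object at each step. A minimal NumPy-only sketch under that identity; the function name `rolling_ols_slope` and the synthetic data are illustrative assumptions.

```python
import numpy as np

def rolling_ols_slope(x, y, window):
    """Slope of y on x (with intercept) over the previous `window` points."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    out = np.full(len(x), np.nan)  # first `window` entries stay NaN
    for i in range(window, len(x)):
        xw, yw = x[i - window:i], y[i - window:i]
        xc = xw - xw.mean()
        # cov(x, y) / var(x); the 1/(n-1) factors cancel
        out[i] = np.dot(xc, yw - yw.mean()) / np.dot(xc, xc)
    return out

# Synthetic check: an exact linear relation y = 2 + 0.5 x
rng = np.random.default_rng(0)
x_lin = 100.0 + np.cumsum(rng.normal(size=200))
y_lin = 2.0 + 0.5 * x_lin
slopes = rolling_ols_slope(x_lin, y_lin, window=60)
print(slopes[60:63])
```

The window `[i - window, i)` excludes observation `i`, matching the loop above, so the estimate at time `i` uses only earlier data.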

In [None]:

# Scalar random-walk state (transition 1) observed through H_t = x_t
kf = KalmanFilter(
    transition_matrices=[1],
    observation_matrices=observation_matrices.reshape(-1, 1, 1),
    initial_state_mean=0,
    initial_state_covariance=1,
    observation_covariance=1,   # R
    transition_covariance=0.01  # Q
)
# filter() returns the filtered state means and covariances
state_means, _ = kf.filter(y)
kalman_beta = state_means.flatten()

merged_df["Rolling Beta"] = rolling_beta
merged_df["Kalman Beta"] = kalman_beta
# Spreads use the slope only; the rolling OLS intercept is not subtracted
merged_df["Rolling Spread"] = y - merged_df["Rolling Beta"] * x
merged_df["Kalman Spread"] = y - merged_df["Kalman Beta"] * x

In [None]:

plt.figure(figsize=(12, 5))
plt.plot(merged_df["Date"], merged_df["Rolling Beta"], label="Rolling Regression Beta", color="blue")
plt.plot(merged_df["Date"], merged_df["Kalman Beta"], label="Kalman Filter Beta", color="orange")
plt.xlabel("Date")
plt.ylabel("Hedge Ratio (Beta)")
plt.title("Rolling Regression vs. Kalman Filter Beta")
plt.legend()
plt.show()


**Conclusion**

Key Insights:

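The Kalman filter re-estimates the hedge ratio recursively at every new observation, so it adapts to changes in the SPY/DJIA relationship sooner than the rolling regression, whose estimate weights the last 60 observations equally and therefore tends to lag.


Both techniques are usable for pair trading. The Kalman filter is typically favored in high-frequency or real-time strategies because each update needs only the previous estimate and its variance rather than a refit over the whole window, and its responsiveness can be tuned through \(Q\) and \(R\).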
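
The lag of the rolling regression can be illustrated on a synthetic series in which the true hedge ratio jumps from 1.0 to 1.5. This is a sketch under stated assumptions (noiseless data, a sinusoidal \(x_t\), and the same `Q`, `R`, and 60-observation window as above); it says nothing about the SPY/DJIA results themselves. The helper names `kalman_path` and `rolling_slope` are hypothetical.

```python
import numpy as np

def kalman_path(y, x, Q=0.01, R=1.0, beta0=0.0, P0=1.0):
    """Scalar Kalman filter for beta over the whole series."""
    beta, P, out = beta0, P0, np.empty(len(y))
    for t in range(len(y)):
        P_pred = P + Q
        K = P_pred * x[t] / (x[t] ** 2 * P_pred + R)
        beta += K * (y[t] - x[t] * beta)
        P = (1.0 - K * x[t]) * P_pred
        out[t] = beta
    return out

def rolling_slope(y, x, window=60):
    """OLS slope (with intercept) over the previous `window` points."""
    out = np.full(len(y), np.nan)
    for i in range(window, len(y)):
        xc = x[i - window:i] - x[i - window:i].mean()
        yc = y[i - window:i] - y[i - window:i].mean()
        out[i] = (xc @ yc) / (xc @ xc)
    return out

t = np.arange(400)
x_syn = 100.0 + 10.0 * np.sin(t / 10.0)     # synthetic regressor
beta_true = np.where(t < 200, 1.0, 1.5)     # hedge ratio jumps at t = 200
y_syn = beta_true * x_syn                   # noiseless observations

kal = kalman_path(y_syn, x_syn)
rol = rolling_slope(y_syn, x_syn)

# Mean absolute error over the 30 steps starting at the jump
after = slice(200, 230)
kal_err = np.abs(kal[after] - 1.5).mean()
rol_err = np.abs(rol[after] - 1.5).mean()
print(kal_err, rol_err)
```

In this setup the rolling window still contains mostly pre-jump observations for many steps after the change, while the filter corrects its estimate from the first post-jump observation.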