# Cross-Currency Basis Quant Research Project

This notebook investigates the behavior of the USD/JPY cross-currency basis and develops two trading rules: a z-score mean-reversion strategy on the basis itself, and a strategy driven by the first principal component (PC1) of the USD/JPY and USD/EUR bases, used as a stress indicator. It covers data preparation, regression, PCA, signal generation, backtesting, and performance evaluation.

## Load and Prepare Data

In [1]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt

# Load merged dataset
df = pd.read_csv("merged_data.csv", parse_dates=["Date"])
df.head()

| Unnamed: 0 | Date | USDJPY_Basis | USDEUR_Basis | VIX | LIBOR | TBill | TED | Fed_Balance_Sheet |
|---|---|---|---|---|---|---|---|---|
| 0 | 2009-01-01 | -0.1275 | -0.0315 | 22.48 | 0.4705 | 0.2075 | 0.2631 | 2200000.0 |
| 1 | 2009-01-02 | -0.1459 | -0.0292 | 19.31 | 0.3951 | 0.2182 | 0.1769 | 2201260.0 |
| 2 | 2009-01-05 | -0.1482 | -0.0135 | 23.24 | 0.4009 | 0.2004 | 0.2006 | 2202519.0 |
| 3 | 2009-01-06 | -0.2692 | -0.0227 | 27.62 | 0.4354 | 0.0957 | 0.3397 | 2203779.0 |
| 4 | 2009-01-07 | -0.1127 | 0.0067 | 18.83 | 0.4117 | 0.208 | 0.2036 | 2205038.0 |
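
Before modelling, it can help to confirm that the merged file is internally consistent. The sketch below retypes the five rows shown above and runs a few basic checks. The helper `check_frame` and the tolerance are hypothetical, and the check that `TED` equals `LIBOR - TBill` assumes the column was built as the usual TED spread, which the notebook does not state.

```python
import pandas as pd

# The five rows from the df.head() output above, retyped by hand
# (only the columns needed for the checks).
sample = pd.DataFrame({
    "Date": pd.to_datetime(
        ["2009-01-01", "2009-01-02", "2009-01-05", "2009-01-06", "2009-01-07"]
    ),
    "LIBOR": [0.4705, 0.3951, 0.4009, 0.4354, 0.4117],
    "TBill": [0.2075, 0.2182, 0.2004, 0.0957, 0.2080],
    "TED": [0.2631, 0.1769, 0.2006, 0.3397, 0.2036],
})


def check_frame(frame, tol=5e-4):
    """Hypothetical helper: basic integrity checks on the merged data."""
    # Assumption: TED was constructed as LIBOR minus the T-bill rate.
    ted_gap = (frame["TED"] - (frame["LIBOR"] - frame["TBill"])).abs().max()
    return {
        "dates_sorted": bool(frame["Date"].is_monotonic_increasing),
        "dates_unique": bool(frame["Date"].is_unique),
        "missing_values": int(frame.isna().sum().sum()),
        "ted_matches_spread": bool(ted_gap <= tol),
    }


report = check_frame(sample)
print(report)
```

On the full dataset the same function could be called as `check_frame(df)`; a failed check would suggest revisiting the merge step before trusting the regression or the backtest.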


## Econometric Regression: Explaining the Basis

In [2]:
import statsmodels.api as sm

# Select variables
X = df[["VIX", "TED", "Fed_Balance_Sheet"]]
X = sm.add_constant(X)
y = df["USDJPY_Basis"]

# OLS Regression
model = sm.OLS(y, X, missing='drop')
results = model.fit()
print(results.summary())

ImportError: cannot import name '_lazywhere' from 'scipy._lib._util' (/Users/prescylliajaelle/Desktop/1Cross Currency/basis-env/lib/python3.11/site-packages/scipy/_lib/_util.py)
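
The `ImportError` above is raised while importing `statsmodels`, so the regression never runs and no summary is produced. Because `_lazywhere` is a private SciPy helper, an error of this form usually points to a mismatch between the installed `statsmodels` and SciPy releases, although that is an assumption until the versions are inspected. A minimal diagnostic sketch, using only the standard library so that it does not trigger the failing import (the helper name `installed_version` is hypothetical):

```python
from importlib import metadata


def installed_version(package):
    """Return the installed version string, or None if the package is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


# Report the versions of the two packages involved in the failing import.
versions = {name: installed_version(name) for name in ("scipy", "statsmodels")}
for name, version in versions.items():
    print(f"{name}: {version or 'not installed'}")
```

If the two versions turn out to be incompatible, aligning them inside `basis-env` (upgrading one or pinning the other) and re-running the cell is the usual remedy.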

## PCA on Cross-Currency Basis Series

In [None]:
from sklearn.decomposition import PCA

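# Keep only rows where both basis series are present
basis_data = df[["USDJPY_Basis", "USDEUR_Basis"]].dropna()

# PCA centers each column but does not rescale it, so the series with the
# larger variance dominates PC1
pca = PCA(n_components=2)
components = pca.fit_transform(basis_data)

# Write PC1 scores back to the original rows; dropped rows stay NaN
df.loc[basis_data.index, "PC1"] = components[:, 0]
print("Explained Variance Ratios:", pca.explained_variance_ratio_)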
pd.DataFrame(pca.components_, columns=["USDJPY_Basis", "USDEUR_Basis"], index=["PC1", "PC2"])
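
To make the PCA step concrete, the sketch below reproduces its two outputs (explained variance ratios and PC1 scores) with a plain eigendecomposition of the covariance matrix. The data are synthetic stand-ins generated for illustration, not the notebook's series. The sketch also shows that the sign of a principal component is only a convention: flipping the eigenvector gives an equally valid PC1 whose scores, and therefore any z-score built on them, change sign.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in for two correlated basis series (NOT the notebook's data).
common = rng.normal(size=500)
data = np.column_stack([
    -0.3 + 0.10 * common + 0.02 * rng.normal(size=500),
    -0.1 + 0.04 * common + 0.02 * rng.normal(size=500),
])

# Center the columns, then diagonalize the sample covariance matrix.
centered = data - data.mean(axis=0)
cov = np.cov(centered, rowvar=False)
eigvals, eigvecs = np.linalg.eigh(cov)  # eigenvalues in ascending order

# Reorder so that the largest-variance direction comes first.
order = np.argsort(eigvals)[::-1]
eigvals, eigvecs = eigvals[order], eigvecs[:, order]

explained_ratio = eigvals / eigvals.sum()
pc1_scores = centered @ eigvecs[:, 0]

# The opposite eigenvector is just as valid; its scores are the mirror image.
flipped_scores = centered @ (-eigvecs[:, 0])

print("Explained variance ratios:", explained_ratio)
print("PC1 loadings:", eigvecs[:, 0])
```

Since the PCA-based signal goes long on low PC1 z-scores and short on high ones, it is worth checking the PC1 loadings before interpreting the signal's direction.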

## Generate Mean Reversion and PCA-Based Signals

In [None]:
# Compute z-scores
window = 60
df["Basis_Mean"] = df["USDJPY_Basis"].rolling(window).mean()
df["Basis_Std"] = df["USDJPY_Basis"].rolling(window).std()
df["Z_Score_Basis"] = (df["USDJPY_Basis"] - df["Basis_Mean"]) / df["Basis_Std"]

df["PC1_Mean"] = df["PC1"].rolling(window).mean()
df["PC1_Std"] = df["PC1"].rolling(window).std()
df["Z_Score_PC1"] = (df["PC1"] - df["PC1_Mean"]) / df["PC1_Std"]

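# Signal function
def generate_signals(z, long_thresh=-1.5, short_thresh=1.5, exit_thresh=0.5):
    """Turn a z-score array into positions: 1 (long), -1 (short), 0 (flat).

    Enter long below long_thresh and short above short_thresh; exit once the
    z-score is back within exit_thresh of zero. NaN z-scores fail every
    comparison, so the current position is simply carried forward.
    """
    signal = np.zeros_like(z)
    position = 0
    for i in range(1, len(z)):
        if position == 0:
            # Flat: open a position only on an extreme z-score
            if z[i] < long_thresh:
                position = 1
            elif z[i] > short_thresh:
                position = -1
        elif position == 1 and z[i] > -exit_thresh:
            # Long: close once the z-score has recovered toward zero
            position = 0
        elif position == -1 and z[i] < exit_thresh:
            # Short: close once the z-score has fallen back toward zero
            position = 0
        signal[i] = position
    return signal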

df["Signal_MeanRev"] = generate_signals(df["Z_Score_Basis"].values)
df["Signal_PCA"] = generate_signals(df["Z_Score_PC1"].values)
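
A short trace helps to see how the entry and exit thresholds interact. The sketch below repeats `generate_signals` from the cell above so that it runs on its own, and applies it to a hand-made z-score path (illustrative values, not taken from the data).

```python
import numpy as np


def generate_signals(z, long_thresh=-1.5, short_thresh=1.5, exit_thresh=0.5):
    # Same logic as the notebook cell above, repeated to keep the example standalone.
    signal = np.zeros_like(z)
    position = 0
    for i in range(1, len(z)):
        if position == 0:
            if z[i] < long_thresh:
                position = 1
            elif z[i] > short_thresh:
                position = -1
        elif position == 1 and z[i] > -exit_thresh:
            position = 0
        elif position == -1 and z[i] < exit_thresh:
            position = 0
        signal[i] = position
    return signal


# Hand-made z-score path: a dip below -1.5, a recovery, then a spike above 1.5.
z_demo = np.array([0.0, -2.0, -1.0, -0.3, 0.2, 2.0, 1.0, 0.4, 0.0])
signal_demo = generate_signals(z_demo)
print(signal_demo.tolist())
# -> [0.0, 1.0, 1.0, 0.0, 0.0, -1.0, -1.0, 0.0, 0.0]
```

The long position opened at z = -2.0 is held through z = -1.0 and closed only at z = -0.3, because the exit band (±0.5) is narrower than the entry band (±1.5). This gap is what keeps the rule from flipping in and out around a single threshold.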

## Backtest Strategies

In [None]:
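# Daily change in the basis, in the basis's own units (not a percentage return)
df["Basis_Return"] = df["USDJPY_Basis"].diff()
# Lag each signal by one day so a position only earns moves that occur after it is set
df["PnL_MeanRev"] = df["Signal_MeanRev"].shift(1) * df["Basis_Return"]
df["PnL_PCA"] = df["Signal_PCA"].shift(1) * df["Basis_Return"]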

df["Cumulative_PnL_MeanRev"] = df["PnL_MeanRev"].cumsum()
df["Cumulative_PnL_PCA"] = df["PnL_PCA"].cumsum()
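
The `shift(1)` in the PnL lines is what prevents look-ahead: a signal computed from today's basis can only earn tomorrow's move. The toy example below (made-up numbers and a hypothetical `Signal` column) contrasts the lagged calculation with a same-day one.

```python
import numpy as np
import pandas as pd

# Made-up basis path and signal, for illustration only.
toy = pd.DataFrame({
    "USDJPY_Basis": [-0.10, -0.30, -0.25, -0.15, -0.12],
    "Signal": [0, 1, 1, 0, 0],  # long is triggered by the drop on day 1
})
toy["Basis_Return"] = toy["USDJPY_Basis"].diff()

# Lagged: the position set on day t earns the move from t to t+1.
toy["PnL_lagged"] = toy["Signal"].shift(1) * toy["Basis_Return"]

# Same-day: the position is credited with the very move that triggered it.
toy["PnL_same_day"] = toy["Signal"] * toy["Basis_Return"]

print(toy)
print("Lagged total:  ", round(toy["PnL_lagged"].sum(), 4))
print("Same-day total:", round(toy["PnL_same_day"].sum(), 4))
```

Here the two conventions give totals of opposite sign, which shows how sensitive a backtest can be to the alignment of signals and price changes.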

## Strategy Performance Metrics

In [None]:
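def evaluate_strategy(pnl_series):
    """Return (total PnL, annualized Sharpe, max drawdown, win rate in %)."""
    returns = pnl_series.dropna().values
    total_return = np.sum(returns)
    # Annualize with 252 trading days; report 0 when the PnL has no variation
    vol = np.std(returns)
    sharpe = np.mean(returns) / vol * np.sqrt(252) if vol > 0 else 0
    # Drawdown: distance of cumulative PnL below its running peak (always <= 0)
    cumulative = np.cumsum(returns)
    drawdown = cumulative - np.maximum.accumulate(cumulative)
    max_drawdown = np.min(drawdown)
    # Share of all days with positive PnL; days with no position count as non-wins
    win_rate = np.mean(returns > 0)
    return total_return, sharpe, max_drawdown, win_rate * 100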

print("Mean Reversion:", evaluate_strategy(df["PnL_MeanRev"]))
print("PCA-Based:", evaluate_strategy(df["PnL_PCA"]))
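
One point to keep in mind when reading these metrics: the win rate is computed over every day in the sample, so days with no open position (PnL of exactly zero) count against it. The sketch below, on made-up numbers, compares that figure with a win rate restricted to days on which a position was held. The variable names are hypothetical.

```python
import numpy as np

# Made-up example: position held over each day (already lagged) and the daily PnL.
held = np.array([0, 1, 1, 0, 0, -1, 0, 1])
pnl = np.array([0.0, 0.05, -0.02, 0.0, 0.0, 0.03, 0.0, -0.01])

# Win rate as computed in evaluate_strategy: flat days are counted as non-wins.
win_rate_all_days = np.mean(pnl > 0) * 100

# Alternative: only count days on which a position was actually held.
active = held != 0
win_rate_active = np.mean(pnl[active] > 0) * 100 if active.any() else float("nan")

print(f"All days:    {win_rate_all_days:.1f}%")  # -> All days:    25.0%
print(f"Active days: {win_rate_active:.1f}%")    # -> Active days: 50.0%
```

The same dilution affects the Sharpe ratio, since zero-PnL days lower both the mean and the standard deviation. Which convention is appropriate depends on whether the strategy is judged as a standalone book or as an overlay.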

## Visualize Strategy Performance

In [None]:
plt.figure(figsize=(12, 6))
plt.plot(df["Date"], df["Cumulative_PnL_MeanRev"], label="Mean Reversion")
plt.plot(df["Date"], df["Cumulative_PnL_PCA"], label="PCA-Based")
plt.title("Cumulative PnL")
plt.legend()
plt.grid(True)
plt.show()

# Drawdown
cum = df["Cumulative_PnL_MeanRev"]
drawdown = cum - cum.cummax()

plt.figure(figsize=(12, 4))
plt.plot(df["Date"], drawdown, color="red")
plt.title("Drawdown – Mean Reversion Strategy")
plt.grid(True)
plt.show()

## Conclusion

The mean reversion strategy on the USD/JPY basis produced strong risk-adjusted returns, while the PCA-based strategy underperformed. Possible improvements include better filtering of the PC1 signals, explicit transaction cost modeling, and accounting for macroeconomic regime changes.
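
As a starting point for the transaction cost improvement, the sketch below charges a fixed cost each time the held position changes. The helper `net_pnl` and the value of `cost_per_unit` are hypothetical and uncalibrated. It assumes costs are proportional to the size of the position change and are paid on the first day the new position is held.

```python
import numpy as np
import pandas as pd


def net_pnl(signal, basis_change, cost_per_unit=0.005):
    """Hypothetical helper: gross PnL minus a cost on every position change.

    cost_per_unit is expressed in the same units as the basis and is an
    illustrative placeholder, not a calibrated estimate.
    """
    held = signal.shift(1).fillna(0)             # position held over each day
    gross = held * basis_change.fillna(0)        # PnL before costs
    turnover = held.diff().abs().fillna(0)       # size of each position change
    return gross - cost_per_unit * turnover


# Made-up inputs, for illustration only.
basis = pd.Series([-0.10, -0.30, -0.25, -0.15, -0.12])
signal = pd.Series([0, 1, 1, 0, 0])

net = net_pnl(signal, basis.diff())
print(net.round(4).tolist())
print("Net total:", round(net.sum(), 4))
```

In this toy case the position is opened once and closed once, so two cost charges are deducted from the gross PnL. On the real data the same function could be applied to `df["Signal_MeanRev"]` and `df["Basis_Return"]` to see how much of the gross result survives a given cost assumption.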