# Financial Time Series Analysis and Trading Strategy Evaluation

This notebook explores historical financial data from the S&P 500 and other related assets to:

- Perform time series diagnostics (stationarity, mean reversion, etc.).
- Explore moving average and mean-reversion-based trading strategies.
- Evaluate the profitability of these strategies using cumulative returns and key metrics (Sharpe ratio, volatility, drawdown).
- Test cointegration between the S&P 500 and other ETFs to construct a potential pairs trading model.

In [None]:
import yfinance as yf
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
from statsmodels.tsa.stattools import adfuller
from arch.unitroot import VarianceRatio
from hurst import compute_Hc
from statsmodels.tsa.stattools import kpss
import statsmodels.api as sm

## S&P 500 Time Series Overview

We begin by downloading and visualizing S&P 500 historical data, including:
- Price trends
- Moving averages
- Daily returns
- Cumulative returns
- Rolling volatility and mean absolute deviation


In [None]:
# Download 5 years of S&P 500 price history

sp500 = yf.download('^GSPC', period='5y')
print(f"Downloaded {sp500.shape[0]} rows of S&P 500 data.")
sp500.head()

In [None]:
# Simple moving averages (50-day and 200-day) of the closing price

sp500['SMA_50'] = sp500['Close'].rolling(window=50).mean()
sp500['SMA_200'] = sp500['Close'].rolling(window=200).mean()

# Plotting the closing price and the moving averages
plt.figure(figsize=(14, 7))
plt.plot(sp500['Close'], label='S&P 500 Close Price', color='blue')
plt.plot(sp500['SMA_50'], label='50-Day SMA', color='orange')
plt.plot(sp500['SMA_200'], label='200-Day SMA', color='red')
plt.title('S&P 500 Closing Price and Moving Averages')
plt.xlabel('Date')
plt.ylabel('Price')
plt.legend()
plt.grid()
plt.show()


In [None]:
# compute the daily returns
sp500['Daily Return'] = sp500['Close'].pct_change()
print(sp500['Daily Return'].describe())
# Plotting the daily returns
plt.figure(figsize=(14, 7))
plt.plot(sp500['Daily Return'], label='Daily Return', color='green')
plt.title('S&P 500 Daily Returns')
plt.xlabel('Date')
plt.ylabel('Daily Return')
plt.legend()
plt.grid()
plt.show()

In [None]:
# compute cumulative returns
sp500['Cumulative Return'] = (1 + sp500['Daily Return']).cumprod()
print(sp500['Cumulative Return'].describe())
# Plotting the cumulative returns
plt.figure(figsize=(14, 7))
plt.plot(sp500['Cumulative Return'], label='Cumulative Return', color='purple')
plt.title('S&P 500 Cumulative Returns')
plt.xlabel('Date')
plt.ylabel('Cumulative Return')
plt.legend()
plt.grid()
plt.show()

In [None]:
# compute rolling volatility
sp500['Rolling Volatility'] = sp500['Daily Return'].rolling(window=21).std() * (252 ** 0.5)  # Annualized volatility
print(sp500['Rolling Volatility'].describe())

# Plotting the rolling volatility
plt.figure(figsize=(14, 7))
plt.plot(sp500['Rolling Volatility'], label='Rolling Volatility (Annualized)', color='brown')
plt.title('S&P 500 Rolling Volatility (Annualized)')
plt.xlabel('Date')
plt.ylabel('Volatility')
plt.legend()
plt.grid()
plt.show()  

In [None]:
# compute the 50-day rolling mean absolute deviation (MAD) of the closing price
sp500['MAD'] = sp500['Close'].rolling(window=50).apply(lambda x: (x - x.mean()).abs().mean())
print(sp500['MAD'].describe())
# Plotting the Mean Absolute Deviation (MAD)
plt.figure(figsize=(14, 7))
plt.plot(sp500['MAD'], label='Mean Absolute Deviation (MAD)', color='cyan')
plt.title('S&P 500 Mean Absolute Deviation (MAD)')
plt.xlabel('Date')
plt.ylabel('MAD')
plt.legend()
plt.grid()
plt.show()      

## Strategy 0: Moving Average Crossover with Persistent Position

This simple trading strategy generates buy/sell signals based on moving average crossovers:

- **Buy Signal**: When the 20-day MA crosses above the 50-day MA.
- **Sell Signal**: When the 20-day MA crosses below the 50-day MA.
- **Position Persistence**: Maintain position until an opposite signal appears.

We simulate performance by applying the position to subsequent daily returns and compare to a buy-and-hold benchmark.
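
The persistence rule can be illustrated on synthetic data. The sketch below is a minimal illustration rather than the notebook's implementation: `moving_average`, `crossover_positions`, and the V-shaped price path are hypothetical names and data introduced here, and NumPy is used instead of pandas to keep the example self-contained.

```python
import numpy as np

def moving_average(x, window):
    """Trailing simple moving average; the first window-1 entries are NaN."""
    x = np.asarray(x, dtype=float)
    out = np.full(len(x), np.nan)
    csum = np.cumsum(np.insert(x, 0, 0.0))
    out[window - 1:] = (csum[window:] - csum[:-window]) / window
    return out

def crossover_positions(prices, fast=20, slow=50):
    """Long (1) after the fast MA moves above the slow MA, flat (0) after it moves below.

    On days with no fresh signal (warm-up NaNs or exact ties) the previous
    position is carried forward, which is the persistence rule.
    """
    fast_ma = moving_average(prices, fast)
    slow_ma = moving_average(prices, slow)
    position = np.zeros(len(prices))
    current = 0.0  # start flat
    for i in range(len(prices)):
        if fast_ma[i] > slow_ma[i]:
            current = 1.0
        elif fast_ma[i] < slow_ma[i]:
            current = 0.0
        # comparisons with NaN are False, so `current` persists during warm-up
        position[i] = current
    return position

# Hypothetical price path: a decline followed by a rally
prices = np.concatenate([np.linspace(100, 80, 100), np.linspace(80, 120, 100)])
position = crossover_positions(prices)

# Trades happen only on the days where the position changes
trades = np.flatnonzero(np.diff(position) != 0) + 1
print("trade days:", trades, "final position:", position[-1])
```

The position stays flat through the decline and switches to long only after the 20-day average has moved above the 50-day average, which shows the lag inherent in this kind of signal.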



In [None]:
# compute 5-day, 21-day and 63-day returns
sp500['5-Day Return'] = sp500['Close'].pct_change(periods=5)
sp500['21-Day Return'] = sp500['Close'].pct_change(periods=21)
sp500['63-Day Return'] = sp500['Close'].pct_change(periods=63)

#compute 20-day and 50-day moving average
sp500['20-Day MA'] = sp500['Close'].rolling(window=20).mean()
sp500['50-Day MA'] = sp500['Close'].rolling(window=50).mean()

# Regime indicator: +1 while the 20-day MA is above the 50-day MA, -1 while it is below
sp500['Signal'] = 0
sp500.loc[sp500['20-Day MA'] > sp500['50-Day MA'], 'Signal'] = 1
sp500.loc[sp500['20-Day MA'] < sp500['50-Day MA'], 'Signal'] = -1

# Crossover days: the first day of each new regime (so markers are not drawn on every day)
buy_days = (sp500['Signal'] == 1) & (sp500['Signal'].shift(1) != 1)
sell_days = (sp500['Signal'] == -1) & (sp500['Signal'].shift(1) != -1)

# Plotting the 20-day and 50-day moving averages with buy/sell signals
plt.figure(figsize=(14, 7))
plt.plot(sp500['Close'], label='S&P 500 Close Price', color='blue')
plt.plot(sp500['20-Day MA'], label='20-Day MA', color='orange')
plt.plot(sp500['50-Day MA'], label='50-Day MA', color='red')
plt.scatter(sp500.index[buy_days], sp500['Close'][buy_days], label='Buy Signal', marker='^', color='green', s=100)
plt.scatter(sp500.index[sell_days], sp500['Close'][sell_days], label='Sell Signal', marker='v', color='red', s=100)
plt.title('S&P 500 Moving Averages with Buy/Sell Signals')
plt.xlabel('Date')
plt.ylabel('Price')
plt.legend()
plt.grid()
plt.show()  

In [None]:
"""
Simulate a simple moving average crossover trading strategy and evaluate its
performance relative to a passive buy-and-hold approach.
"""

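# Compute the position from the signals: long (1) after a buy signal,
# flat (0) after a sell signal, unchanged on days with no signal.
# Starting from NaN (rather than 0) is what lets ffill() carry the position forward.
sp500['Position'] = np.nan
sp500.loc[sp500['Signal'] == 1, 'Position'] = 1
sp500.loc[sp500['Signal'] == -1, 'Position'] = 0
sp500['Position'] = sp500['Position'].ffill().fillna(0)  # carry forward; start flat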

# compute cumulative returns for buy and hold strategy
sp500['Cumulative BuyHold'] = (1 + sp500['Daily Return']).cumprod()

# compute strategy returns based on the position
sp500['Strategy0_Return'] = sp500['Position'].shift(1) * sp500['Daily Return']
# compute cumulative strategy returns
sp500['Cumulative_Strategy0'] = (1 + sp500['Strategy0_Return']).cumprod()

# Plotting the strategy returns
plt.figure(figsize=(12, 6))
plt.plot(sp500['Cumulative_Strategy0'], label='Strategy 0 Returns', color='blue')
plt.plot(sp500['Cumulative BuyHold'], label='Buy & Hold Returns', color='orange')
plt.title('Cumulative Returns: Strategy 0 vs. Buy & Hold')
plt.xlabel('Date')
plt.ylabel('Cumulative Return')
plt.legend()
plt.grid()
plt.show()  




## Performance Evaluation: Strategy 0

We compute key KPIs to evaluate the strategy:

- Total Return
- Annualized Return
- Volatility
- Sharpe Ratio
- Maximum Drawdown

These are benchmarked against a passive buy-and-hold approach.
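
As a reference for how these KPIs relate to one another, the sketch below computes them from a series of daily returns. It is a minimal illustration under stated assumptions: `performance_summary` is a hypothetical helper, the risk-free rate is taken to be zero, a year is taken to have 252 trading days, and the three-day return series is toy data.

```python
import numpy as np

def performance_summary(daily_returns, periods_per_year=252):
    """Compute basic performance KPIs from simple daily returns."""
    r = np.asarray(daily_returns, dtype=float)
    equity = np.cumprod(1.0 + r)                      # growth of 1 unit of capital
    total_return = equity[-1] - 1.0
    # Geometric annualization: growth factor to the power (periods per year / periods), minus 1
    annualized_return = (1.0 + total_return) ** (periods_per_year / len(r)) - 1.0
    annualized_vol = r.std(ddof=1) * np.sqrt(periods_per_year)
    # Sharpe ratio with a zero risk-free rate (assumption)
    sharpe = annualized_return / annualized_vol if annualized_vol > 0 else np.nan
    running_peak = np.maximum.accumulate(equity)
    max_drawdown = (equity / running_peak - 1.0).min()
    return {
        "total_return": total_return,
        "annualized_return": annualized_return,
        "annualized_volatility": annualized_vol,
        "sharpe_ratio": sharpe,
        "max_drawdown": max_drawdown,
    }

# Toy data: +10%, -50%, +10%
summary = performance_summary([0.10, -0.50, 0.10])
for name, value in summary.items():
    print(f"{name}: {value:.4f}")
```

Note that the annualized return is a rate (growth factor minus one). Dividing the growth factor itself by volatility would inflate the Sharpe ratio and could make it positive even for a losing strategy.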



In [None]:
# function to compute metrics for the strategy
def strategy_metrics(cumulative_returns, cumulative_buy_hold):
    strategy_return = cumulative_returns.iloc[-1] - 1
    buy_hold_return = cumulative_buy_hold.iloc[-1] - 1
    # Geometric annualization; subtract 1 to turn the growth factor into a rate
    annualized_strategy_return = (1 + strategy_return) ** (252 / len(cumulative_returns)) - 1
    annualized_buy_hold_return = (1 + buy_hold_return) ** (252 / len(cumulative_buy_hold)) - 1
    annualized_volatility = cumulative_returns.pct_change().std() * (252 ** 0.5)
    # Sharpe ratio with the risk-free rate assumed to be zero
    sharpe_ratio = annualized_strategy_return / annualized_volatility
    max_drawdown = (cumulative_returns / cumulative_returns.cummax() - 1).min()
    
    return {
        'strategy_return': strategy_return,
        'buy_hold_return': buy_hold_return,
        'annualized_strategy_return': annualized_strategy_return,
        'annualized_buy_hold_return': annualized_buy_hold_return,
        'annualized_volatility': annualized_volatility,
        'sharpe_ratio': sharpe_ratio,
        'max_drawdown': max_drawdown
    }

# compute the performance metrics for strategy 0
metrics = strategy_metrics(sp500['Cumulative_Strategy0'], sp500['Cumulative BuyHold'])

# Print the performance metrics values
print(f"Strategy 0 Return: {metrics['strategy_return']:.2%}")
print(f"Buy & Hold Return: {metrics['buy_hold_return']:.2%}")
print(f"Annualized Strategy 0 Return: {metrics['annualized_strategy_return']:.2%}")
print(f"Annualized Buy & Hold Return: {metrics['annualized_buy_hold_return']:.2%}")
print(f"Annualized Volatility: {metrics['annualized_volatility']:.2%}")
print(f"Sharpe Ratio: {metrics['sharpe_ratio']:.2f}")
print(f"Max Drawdown: {metrics['max_drawdown']:.2%}")

## Strategy 0 Performance Discussion

Although simple and interpretable, Strategy 0 underperforms buy-and-hold. Potential reasons:

- Lags in reacting to market trends.
- Sensitivity to noisy signals.
- Strong S&P 500 uptrend over the evaluation period favors holding.

Despite the underperformance, this strategy offers:
- A testable framework.
- A basis for improving with filters (e.g. RSI, MACD).



In [None]:
# adding transaction costs
transaction_cost = 0.001  # 0.1% transaction cost
sp500['Transaction Cost'] = transaction_cost * sp500['Position'].diff().abs()
sp500['Strategy0_Return_With_Costs'] = sp500['Strategy0_Return'] - sp500['Transaction Cost']
sp500['Cumulative_Strategy0_With_Costs'] = (1 + sp500['Strategy0_Return_With_Costs']).cumprod()   

# Plotting the strategy returns with transaction costs
plt.figure(figsize=(12, 6))
plt.plot(sp500['Cumulative_Strategy0_With_Costs'], label='Strategy 0 Returns with Costs', color='purple')
plt.plot(sp500['Cumulative_Strategy0'], label='Strategy 0 Returns without Costs', color='blue')
plt.plot(sp500['Cumulative BuyHold'], label='Buy & Hold Returns', color='orange')
plt.title('Cumulative Returns: Strategy 0 vs. Buy & Hold (with Costs)')
plt.xlabel('Date')
plt.ylabel('Cumulative Return')
plt.legend()
plt.grid()
plt.show()

In [None]:
# ADF test for stationarity
print("==== Augmented Dickey-Fuller Test on Returns====")
adf_result = adfuller(sp500['Daily Return'].dropna())
adf_stat, adf_p = adf_result[0], adf_result[1]
print(f"ADF Statistic: {adf_stat:.4f}")
print(f"p-value: {adf_p:.4f}")

if adf_p < 0.05:
    print("Result: The time series is likely **stationary** (reject null hypothesis).")
else:
    print("Result: The time series appears **non-stationary** (fail to reject null hypothesis).")

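# Variance ratio test
# Note: VarianceRatio treats its input as a level series (e.g. log prices) and tests
# whether that series is a random walk; here it is applied to the return series.
print("\n==== Variance Ratio Test on Returns ====")
vr_test = VarianceRatio(sp500['Daily Return'].dropna(), lags=2)
# .vr is the variance ratio itself; .stat is the test statistic (negative when the ratio is below 1)
vr_ratio, vr_stat, vr_p = vr_test.vr, vr_test.stat, vr_test.pvalue
print(f"Variance Ratio: {vr_ratio:.4f}")
print(f"Test Statistic: {vr_stat:.4f}")
print(f"p-value: {vr_p:.4g}")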

if vr_p < 0.05:
    if vr_stat < 0:
        print("Result: The series shows **mean reversion** (reject random walk null hypothesis).")
    else:
        print("Result: The series shows **trend-following behavior** (reject random walk null hypothesis).")
else:
    print("Result: The series may follow a **random walk** (fail to reject null hypothesis).")

#Hurst exponent 
print("\n==== Hurst Exponent ====")
hurst_stat, _, _ = compute_Hc(sp500['Daily Return'].dropna(), kind='change', simplified=True)
print(f"Hurst Exponent: {hurst_stat:.4f}")  
if hurst_stat < 0.5:
    print("Result: The series shows **mean reversion** behavior.")
elif hurst_stat == 0.5:
    print("Result: The series follows a **random walk**.")
else:
    print("Result: The series shows **trending behavior**.")

# KPSS test for stationarity
print("\n==== KPSS Test on Returns ====")
kpss_result = kpss(sp500['Daily Return'].dropna(), regression='c')
kpss_stat, kpss_p = kpss_result[0], kpss_result[1]
print(f"KPSS Statistic: {kpss_stat:.4f}")
print(f"p-value: {kpss_p:.4f}") 



## Statistical Summary: Return Series Diagnostics

We apply standard statistical tests to the **daily returns** of the S&P 500 to assess:

- Stationarity
- Random walk behavior
- Memory and mean-reversion

### Results:
- **ADF Test**: Stationary
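- **Variance Ratio**: Mean-reverting. This is expected when the test is run on returns rather than price levels, because differencing an already roughly uncorrelated return series induces negative autocorrelation.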
- **Hurst Exponent**: Weak trend-following
- **KPSS**: Stationary
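
To see why a variance ratio test behaves differently on returns than on prices, the following sketch computes a plain two-period variance ratio on simulated data. It is a simplified illustration, not the `arch` implementation: `variance_ratio` is a hypothetical helper without bias or heteroskedasticity corrections, and the i.i.d. Gaussian returns are synthetic.

```python
import numpy as np

def variance_ratio(levels, q=2):
    """Variance of q-period differences divided by q times the variance of 1-period differences."""
    levels = np.asarray(levels, dtype=float)
    d1 = np.diff(levels)
    dq = levels[q:] - levels[:-q]
    return dq.var(ddof=1) / (q * d1.var(ddof=1))

rng = np.random.default_rng(0)
returns = rng.normal(0.0, 0.01, size=20_000)  # synthetic i.i.d. daily returns
log_prices = np.cumsum(returns)               # random-walk log price built from them

vr_levels = variance_ratio(log_prices)  # input is a random walk
vr_returns = variance_ratio(returns)    # input is white noise, differenced again by the test

print(f"variance ratio on log prices: {vr_levels:.3f}")
print(f"variance ratio on returns:    {vr_returns:.3f}")
```

For a random-walk price the ratio sits near 1, while the same statistic applied to the return series sits near 0.5. A "mean-reverting" verdict on returns therefore says little about whether prices mean-revert.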



In [None]:
# check if price data is stationary
# ADF test for stationarity on price data
print("==== Augmented Dickey-Fuller Test on Price ====")
adf_result = adfuller(sp500['Close'].dropna())
adf_stat, adf_p = adf_result[0], adf_result[1]
print(f"ADF Statistic: {adf_stat:.4f}")
print(f"p-value: {adf_p:.4f}")      
if adf_p < 0.05:
    print("Result: The price series is likely **stationary** (reject null hypothesis).")
else:
    print("Result: The price series appears **non-stationary** (fail to reject null hypothesis).")      
    
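# Variance ratio test on price data
print("\n==== Variance Ratio Test on Price ====")
vr_test = VarianceRatio(sp500['Close'].dropna(), lags=2)
# .vr is the variance ratio itself; .stat is the test statistic (negative when the ratio is below 1)
vr_ratio, vr_stat, vr_p = vr_test.vr, vr_test.stat, vr_test.pvalue
print(f"Variance Ratio: {vr_ratio:.4f}")
print(f"Test Statistic: {vr_stat:.4f}")
print(f"p-value: {vr_p:.4g}")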

if vr_p < 0.05:
    if vr_stat < 0:
        print("Result: The price series shows **mean reversion** (reject random walk null hypothesis).")
    else:
        print("Result: The price series shows **trend-following behavior** (reject random walk null hypothesis).")
else:
    print("Result: The price series may follow a **random walk** (fail to reject null hypothesis).")    
  
  
#Hurst exponent for price data
print("\n==== Hurst Exponent on Price ====")
H, c, data_reg = compute_Hc(sp500['Close'].dropna(), kind='price', simplified=True)
print(f"Hurst Exponent: {H:.4f}")
if H < 0.5:
    print("Result: The price series shows **mean reversion** behavior.")
elif H > 0.5:
    print("Result: The price series shows **trend-following** behavior.")
else:
    print("Result: The price series is likely a **random walk**.")  
    
# Hurst exponent on log prices using the full (non-simplified) R/S calculation
print("\n==== Hurst Exponent (full R/S) on log Price ====")
log_prices = np.log(sp500['Close'].dropna())
# Log prices have additive increments, so kind='random_walk' is used instead of kind='price'
H, c, data_reg = compute_Hc(log_prices, kind='random_walk', simplified=False)
print(f"Hurst Exponent (full R/S): {H:.4f}")
if H < 0.5:
    print("Result: The log price series shows **mean reversion** behavior.")
elif H > 0.5:
    print("Result: The log price series shows **trend-following** behavior.")
else:
    print("Result: The log price series is likely a **random walk**.")
        
# KPSS test for stationarity on price data
print("\n==== KPSS Test on Price ====")
kpss_result = kpss(sp500['Close'].dropna(), regression='c')
kpss_stat, kpss_p = kpss_result[0], kpss_result[1]
print(f"KPSS Statistic: {kpss_stat:.4f}")
print(f"p-value: {kpss_p:.4f}") 

## Statistical Summary: Price Series Diagnostics

We repeat similar tests on **raw price data** and **log-transformed prices**.

### Results:
- Price series is **non-stationary**.
- Displays **trend-following** behavior and persistence.
- Consistent with a financial asset following a random walk.



In [None]:
def halflife(series):
    """Estimate the mean-reversion half-life from the regression delta_t = a + b * y_{t-1} + e_t."""
    lagged_series = series.shift(1).dropna()
    delta = series.diff().dropna()

    # Align lengths
    lagged_series = lagged_series.loc[delta.index]

    # Regress delta on lagged series
    model = sm.OLS(delta, sm.add_constant(lagged_series)).fit()
    beta = model.params.iloc[1]  # Coefficient of the lagged series

    # The half-life is only defined for beta < 0 (mean reversion); otherwise return NaN
    return -np.log(2) / beta if beta < 0 else np.nan

halflife_value = halflife(sp500['Close'].dropna())
print(f"Halflife of Mean Reversion: {halflife_value:.2f} days")

## Half-Life of Mean Reversion

The half-life estimates how long it takes a series to revert halfway to its mean.

### Result: ~586.62 days

This long half-life suggests weak or near-absent mean-reversion in the raw S&P 500 price series, reinforcing the need to model returns or residuals instead.
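
For intuition about the estimator, the sketch below applies the same regression to a simulated mean-reverting series whose half-life is known. It is a minimal illustration under stated assumptions: `half_life` is a hypothetical NumPy-only helper (using `np.polyfit` instead of `statsmodels`), and the AR(1) process with coefficient `phi = 0.9` is synthetic.

```python
import numpy as np

def half_life(series):
    """Half-life from the regression delta_t = a + b * y_{t-1} + e_t, i.e. -ln(2) / b."""
    y = np.asarray(series, dtype=float)
    lagged = y[:-1]
    delta = np.diff(y)
    beta, intercept = np.polyfit(lagged, delta, 1)  # slope first, then intercept
    return -np.log(2) / beta if beta < 0 else np.nan

# Synthetic AR(1): y_t = phi * y_{t-1} + shock_t, so delta_t = (phi - 1) * y_{t-1} + shock_t
rng = np.random.default_rng(42)
phi = 0.9
n = 20_000
shocks = rng.normal(size=n)
y = np.zeros(n)
for t in range(1, n):
    y[t] = phi * y[t - 1] + shocks[t]

estimated = half_life(y)
theoretical = -np.log(2) / (phi - 1)  # value implied by the same linear approximation

print(f"estimated half-life:   {estimated:.2f} periods")
print(f"theoretical half-life: {theoretical:.2f} periods")
```

A strongly mean-reverting series yields a half-life of a few periods. A value in the hundreds of days, as found for the S&P 500 price, means the regression slope is barely below zero.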



## Strategy 1: Linear Mean Reversion on S&P 500

A simple mean-reversion strategy using z-score signals:

- Compute z-score of price deviations from a rolling mean.
- Trade in the opposite direction of z-score (expect reversion).
- Evaluate performance using cumulative returns and metrics.
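
The signal construction described above can be sketched on a handful of prices. This is an illustration only: `zscore_positions` is a hypothetical helper, the eight prices are made up, and the lookback of 5 is chosen purely to fit the toy data.

```python
import numpy as np

def zscore_positions(prices, lookback):
    """Return the rolling z-score and the lagged contrarian position (-z from the previous day)."""
    prices = np.asarray(prices, dtype=float)
    z = np.full(len(prices), np.nan)
    for i in range(lookback - 1, len(prices)):
        window = prices[i - lookback + 1 : i + 1]
        sd = window.std(ddof=1)
        if sd > 0:
            z[i] = (prices[i] - window.mean()) / sd
    position = np.full(len(prices), np.nan)
    position[1:] = -z[:-1]  # trade against the deviation, one day later (no lookahead)
    return z, position

# Made-up prices for illustration
prices = np.array([100, 101, 99, 100, 102, 98, 100, 105], dtype=float)
z, position = zscore_positions(prices, lookback=5)

# P&L on each day = position decided the previous day * that day's return
daily_return = np.diff(prices) / prices[:-1]
strategy_pnl = position[1:] * daily_return

print("z-score: ", np.round(z, 2))
print("position:", np.round(position, 2))
print("P&L:     ", np.round(strategy_pnl, 4))
```

Because the position is proportional to the z-score, exposure is unbounded unless the score is clipped, and a persistent trend keeps the strategy leaning against the market.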


In [None]:
# strategy 1: linear mean reversion

lookback = round(halflife_value)  # Use the halflife as the lookback period

# Step 1: Compute rolling mean and std over the halflife window
rolling_mean = sp500['Close'].rolling(window=lookback).mean()
rolling_std = sp500['Close'].rolling(window=lookback).std()

# Step 2: Compute the negated z-score (normalized deviation from the rolling mean)
z_score = -(sp500['Close'] - rolling_mean) / rolling_std

# Step 3: Market value (position) = negated z-score from the previous day
mkt_val = z_score.shift(1)  # lagging by 1 day (to avoid lookahead bias)

# Step 4: Daily returns of the price
daily_return = sp500['Close'].pct_change()

# Step 5: Strategy P&L = previous day's position * today's return
strategy_pnl = mkt_val * daily_return

# cumulative return
cumulative_return = (1 + strategy_pnl).cumprod()
cumulative_return = cumulative_return.squeeze()

# Plot the cumulative return and print the final stats

plt.figure(figsize=(10, 4))
cumulative_return.plot(title='Cumulative Return of Mean-Reversion Strategy')
plt.ylabel('Cumulative Return')
plt.grid(True)
plt.show()

metrics = strategy_metrics(cumulative_return, sp500['Cumulative BuyHold'])
print(f"Strategy 1 Return: {metrics['strategy_return']:.2%}")
print(f"Buy & Hold Return: {metrics['buy_hold_return']:.2%}")
print(f"Annualized Strategy 1 Return: {metrics['annualized_strategy_return']:.2%}")
print(f"Annualized Buy & Hold Return: {metrics['annualized_buy_hold_return']:.2%}")
print(f"Annualized Volatility: {metrics['annualized_volatility']:.2%}")
print(f"Sharpe Ratio: {metrics['sharpe_ratio']:.2f}")
print(f"Max Drawdown: {metrics['max_drawdown']:.2%}")   

## Mean-Reversion Strategy Performance Summary

### Cumulative Performance
- **Strategy Return**: -16.90%
- **Buy & Hold Return**: 98.39%

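### Annualized Metrics
- Strategy underperformed significantly despite early gains.
- **Sharpe Ratio**: 4.73. This value is not credible for a strategy with a negative total return. It arises because `strategy_metrics` divides the annualized growth factor (1 + annualized return), rather than the annualized return itself, by volatility.

### Risk
- **Max Drawdown**: -45.90%, a very risky profile.

## Interpretation
- The strategy failed to recover from strong directional trends.
- Mean-reversion signal lacked robustness on raw S&P 500.
- The high Sharpe ratio is a computational artifact and should not be read as evidence of signal quality.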



## Cointegration analysis: S&P 500 vs Other ETFs


In [None]:
import yfinance as yf

tickers = ['^GSPC', 'DIA', 'QQQ', 'VTI', 'XLF', 'TLT', 'EWA', 'EWC']
start_date = '2022-07-11'
end_date = '2025-07-11'

# Download all data
raw_data = yf.download(tickers, start=start_date, end=end_date)

# Extract only the 'Close' price from multi-level columns
close_prices = raw_data['Close'].dropna()

# Preview
print(close_prices.head())




In [None]:
# cointegrate sp500 with other assets
from statsmodels.tsa.stattools import coint     
# Function to perform cointegration test
def cointegration_test(series1, series2):
    score, p_value, _ = coint(series1, series2)
    return score, p_value   

# Perform cointegration test between S&P 500 and other assets
print("\n==== Cointegration Test Results ====\n")
close_prices = close_prices.dropna()  # Ensure no NaN values
data = close_prices['^GSPC']
for ticker in close_prices.columns:
    if ticker != '^GSPC':
        score, p_value = cointegration_test(data, close_prices[ticker])
        print(f"Cointegration test between S&P 500 and {ticker}: \n Score = {score:.4f}, p-value = {p_value:.4f}")
        if p_value < 0.05:
            print(f"{ticker} is cointegrated with S&P 500 (reject null hypothesis).\n \n")
        else:
            print(f"{ticker} is not cointegrated with S&P 500 (fail to reject null hypothesis). \n \n")   

In [None]:
# Plotting the close prices of S&P 500 and other assets
plt.figure(figsize=(14, 7))
# for ticker in close_prices.columns:
plt.plot(close_prices['VTI'], label='VTI', color='orange')
plt.plot(close_prices['^GSPC'], label='S&P 500', color='blue')
plt.title('Close Prices of S&P 500 and VTI')
plt.xlabel('Date')
plt.ylabel('Price')
plt.legend()
plt.grid()
plt.show()


# Normalize all prices to start at 1
plt.figure(figsize=(14, 7))
normalized_prices = close_prices / close_prices.iloc[0]
plt.plot(normalized_prices['VTI'], label='VTI', color='orange')
plt.plot(normalized_prices['^GSPC'], label='S&P 500', color='blue')
plt.title('Normalized Prices of S&P 500 and VTI')
plt.ylabel('Normalized Price (Start = 1)')
plt.xlabel('Date')
plt.legend()
plt.grid()
plt.show()

#scatter plot of VTI vs S&P 500
plt.figure(figsize=(14,7))
plt.scatter(close_prices['VTI'], close_prices['^GSPC'], alpha=0.5)
plt.title('Scatter Plot of VTI vs S&P 500')
plt.xlabel('VTI Close Price')
plt.ylabel('S&P 500 Close Price')
plt.grid()
plt.show()

In [None]:
# residual hedge ratio
import statsmodels.api as sm    
def residual_hedge_ratio(series1, series2):
    model = sm.OLS(series1, sm.add_constant(series2)).fit()
    hedge_ratio = model.params.iloc[1]  # Coefficient of series2 (positional access via .iloc)
    return hedge_ratio
hedge_ratio = residual_hedge_ratio(close_prices['^GSPC'], close_prices['VTI'])
print(f"Residual Hedge Ratio (S&P 500 to VTI): {hedge_ratio:.4f}")

# plot residuals
residuals = close_prices['^GSPC'] - hedge_ratio * close_prices['VTI']
plt.figure(figsize=(14, 7))
plt.plot(residuals, label='Residuals (S&P 500 - Hedge Ratio * VTI)', color='purple')
plt.title('Residuals of S&P 500 and VTI')
plt.xlabel('Date')
plt.ylabel('Residuals')
plt.legend()
plt.grid()
plt.show()

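# Perform ADF test on residuals
# Note: because the hedge ratio is estimated, standard ADF p-values are only approximate here;
# statsmodels' coint() applies the Engle-Granger critical values.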
adf_result = adfuller(residuals.dropna())
adf_stat, adf_p = adf_result[0], adf_result[1]
print(f"ADF Statistic on Residuals: {adf_stat:.4f}")
print(f"p-value on Residuals: {adf_p:.4f}")
if adf_p < 0.05:
    print("Result: The residuals are likely **stationary** (reject null hypothesis).")
else:
    print("Result: The residuals appear **non-stationary** (fail to reject null hypothesis).")  

## Cointegration Analysis: S&P 500 and VTI

We tested whether other ETFs are cointegrated with the S&P 500.

- Method: Engle-Granger (ADF on residuals)
- Only **VTI** showed statistically significant cointegration.
- This opens the door for **residual-based trading**.
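
The first stage of the Engle-Granger procedure can be sketched on a synthetic pair that is cointegrated by construction. This is a simplified illustration, not a substitute for `coint` or `adfuller`: the data, the true hedge ratio of 2, and the AR(1) coefficient used as a rough stationarity check are all assumptions introduced here.

```python
import numpy as np

rng = np.random.default_rng(7)
n = 2_000
x = 100 + np.cumsum(rng.normal(size=n))   # synthetic random-walk price
y = 5.0 + 2.0 * x + rng.normal(size=n)    # cointegrated with x by construction

# Step 1: hedge ratio from an OLS regression of y on x (with intercept)
hedge_ratio, intercept = np.polyfit(x, y, 1)

# Step 2: residuals (the spread) implied by the fitted relationship
residuals = y - (intercept + hedge_ratio * x)

# Rough check only: AR(1) coefficient of the residuals.
# A value well below 1 indicates fast mean reversion; a formal test would use ADF
# with Engle-Granger critical values.
phi = np.polyfit(residuals[:-1], residuals[1:], 1)[0]

print(f"estimated hedge ratio: {hedge_ratio:.3f}")
print(f"residual AR(1) coefficient: {phi:.3f}")
```

Both price series wander without bound, yet their fitted combination stays close to zero. That stationary spread is what a residual-based strategy trades.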

In [None]:
# apply the strategy 1 logic to the residuals (Strategy 2)

# Step 0: compute halflife of residuals
halflife_value = halflife(residuals.dropna())
lookback = round(halflife_value)  # residual half-life, not the Strategy 1 lookback

# Step 1: Compute rolling mean and std over the halflife window
rolling_mean = residuals.rolling(window=lookback).mean()
rolling_std = residuals.rolling(window=lookback).std()

# Step 2: Compute the negated z-score (normalized deviation from the rolling mean)
z_score = -(residuals - rolling_mean) / rolling_std
z_score = z_score.clip(-2, 2)  # Cap at reasonable limits

# Step 3: Market value (position) = negated z-score from the previous day
mkt_val = z_score.shift(1)  # lagging by 1 day (to avoid lookahead bias)
# Step 4: Daily returns of the price of S&P 500 and VTI (kept for reference; not used below)
close_prices = close_prices.dropna()  # Ensure no NaN values
daily_return_sp500 = close_prices['^GSPC'].pct_change()
daily_return_vti = close_prices['VTI'].pct_change()
# Step 5: Strategy P&L = previous day's position * today's percentage change in the spread
# Caution: pct_change() of a spread is unstable when the spread is near zero or changes sign
spread_return = residuals.pct_change()
strategy_pnl = mkt_val * spread_return  # mkt_val is already lagged by one day
# cumulative return
cumulative_return = (1 + strategy_pnl).cumprod()
cumulative_return = cumulative_return.squeeze() 
# Display final stats
plt.figure(figsize=(10, 4))
cumulative_return.plot(title='Cumulative Return of Mean-Reversion Strategy on Residuals')
plt.ylabel('Cumulative Return')
plt.grid(True)
plt.show()
metrics = strategy_metrics(cumulative_return, close_prices['^GSPC'].pct_change().add(1).cumprod())
print(f"Strategy 2 Return: {metrics['strategy_return']:.2%}")
print(f"Buy & Hold Return: {metrics['buy_hold_return']:.2%}")
print(f"Annualized Strategy 2 Return: {metrics['annualized_strategy_return']:.2%}")
print(f"Annualized Buy & Hold Return: {metrics['annualized_buy_hold_return']:.2%}")
print(f"Annualized Volatility: {metrics['annualized_volatility']:.2%}")
print(f"Sharpe Ratio: {metrics['sharpe_ratio']:.2f}")
print(f"Max Drawdown: {metrics['max_drawdown']:.2%}")
 

## Mean-Reversion Strategy on Residuals between S&P 500 and VTI: Summary

This strategy applies a mean-reversion signal on the residuals of the cointegration regression between S&P 500 and VTI.

### Steps:
1. Estimate hedge ratio via regression.
2. Calculate residuals and test for stationarity (ADF: **p=0.0064**).
3. Apply a z-score-based mean-reversion signal.

### Results:
| Metric                    | Value     |
|---------------------------|-----------|
| Strategy Return           | 141.26%   |
| Buy & Hold Return         | 62.94%    |
| Annualized Strategy Return| 134.28%   |
| Annualized Volatility     | 53.07%    |
| Sharpe Ratio              | 2.53      |
| Max Drawdown              | -24.50%   |
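
The annualized figures in the table can be sanity-checked from the total return. The short calculation below is a sketch under one assumption: that the sample contains about 753 daily observations (roughly three years of trading days), which is not stated in the table.

```python
total_return = 1.4126    # Strategy Return from the table (141.26%)
annualized_vol = 0.5307  # Annualized Volatility from the table (53.07%)
n_days = 753             # assumed number of daily observations (about 3 years)

# Geometric annualization of the total return
growth_factor = (1 + total_return) ** (252 / n_days)
annualized_return = growth_factor - 1

# Sharpe ratio with a zero risk-free rate, using the return and using the growth factor
sharpe_from_return = annualized_return / annualized_vol
sharpe_from_growth_factor = growth_factor / annualized_vol

print(f"annualized growth factor:      {growth_factor:.4f}")
print(f"annualized return:             {annualized_return:.2%}")
print(f"Sharpe from annualized return: {sharpe_from_return:.2f}")
print(f"Sharpe from growth factor:     {sharpe_from_growth_factor:.2f}")
```

Under this assumption the growth factor comes out at about 1.34, matching the 134.28% in the table. The tabulated "Annualized Strategy Return" is therefore a growth factor, corresponding to a rate of roughly 34% per year, and the Sharpe ratio based on that rate is about 0.65 rather than 2.53.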




## Interpretation

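- Strong absolute performance: the strategy return (141.26%) is more than double the buy-and-hold return (62.94%).
- The reported Sharpe ratio (2.53) is overstated. The table's annualized return (134.28%) is a growth factor, i.e. about 34% per year, which against 53.07% volatility implies a Sharpe ratio near 0.65.
- Residual-based strategy outperformed both prior strategies.

This suggests the potential of statistical arbitrage using cointegration-based signals.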

## Final Takeaways

- The S&P 500 shows both trend-following and weak mean-reverting behavior.
- Strategy 0 (MA Crossover) underperformed Buy & Hold, especially with costs.
- Strategy 1 (Mean Reversion on Price) failed due to lack of stationarity.
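- Strategy 2 (Residual-based Mean Reversion) produced the highest total return, although its reported Sharpe ratio is based on the annualized growth factor and overstates risk-adjusted performance.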
- This lays the groundwork for further statistical arbitrage modeling.

Future work:
- Improve filtering and smoothing.
- Add volatility-adjusted position sizing.
- Expand to multi-asset frameworks.