<a href="https://colab.research.google.com/github/akanugan/Pairs-trading/blob/main/Algorithmic_Pairs_Trading_Statistical_Arbitrage_in_Equity_Markets.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

In [16]:
!pip install yfinance pandas numpy statsmodels matplotlib plotly




In [18]:
import pandas as pd
import numpy as np
import yfinance as yf
import statsmodels.api as sm
import matplotlib.pyplot as plt
import plotly.graph_objects as go

# Function to fetch stock data
def get_stock_data(ticker, start="2020-01-01", end="2024-01-01"):
    data = yf.download(ticker, start=start, end=end)
    # Return the 'Adj Close' column as a Series, ensuring it's 1-dimensional
    # Squeeze to ensure it's a Series even if only one column is returned
    return data['Adj Close'].squeeze() if 'Adj Close' in data.columns else data['Close'].squeeze()

# Function to test cointegration
def test_cointegration(series1, series2):
    result = sm.tsa.stattools.coint(series1, series2)
    return result[1]  # p-value

# Function to calculate spread and Z-score
def calculate_spread_zscore(series1, series2):
    # Convert series to numpy arrays to avoid DataFrame operations
    series1_values = series1.values.astype(float)
    series2_values = series2.values.astype(float)
    model = sm.OLS(series1_values, sm.add_constant(series2_values)).fit()
    spread = series1_values - model.params[1] * series2_values
    zscore = (spread - spread.mean()) / spread.std()
    # Convert spread and zscore back to Series with original index
    spread = pd.Series(spread, index=series1.index)
    zscore = pd.Series(zscore, index=series1.index)
    return spread, zscore

# Select stocks
stock1, stock2 = "AAPL", "MSFT"
print(f"Fetching data for {stock1} and {stock2}...")

# Fetch data, ensuring they are Series
series1 = get_stock_data(stock1)
series2 = get_stock_data(stock2)

# Test cointegration
p_value = test_cointegration(series1, series2)
print(f"Cointegration Test p-value: {p_value:.4f}")

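# NOTE: this message is printed unconditionally; p_value is never compared
# against a significance level (e.g. 0.05), so it appears even when the test
# fails to reject the null hypothesis of no cointegration.
print("✅ Stocks are cointegrated! Proceeding with strategy...")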

# Calculate spread & Z-score
spread, zscore = calculate_spread_zscore(series1, series2)

# Interactive plots using Plotly
# Stock Prices
fig1 = go.Figure()
fig1.add_trace(go.Scatter(x=series1.index, y=series1, mode='lines', name=stock1))
fig1.add_trace(go.Scatter(x=series2.index, y=series2, mode='lines', name=stock2))
fig1.update_layout(title=f"{stock1} and {stock2} Stock Prices", xaxis_title="Date", yaxis_title="Price")
fig1.show()

# Spread Over Time
fig2 = go.Figure()
fig2.add_trace(go.Scatter(x=spread.index, y=spread, mode='lines', name="Spread"))
fig2.add_shape(go.layout.Shape(type="line", x0=spread.index.min(), x1=spread.index.max(), y0=spread.mean(), y1=spread.mean(), line=dict(color="red", dash="dash")))
fig2.update_layout(title="Spread Over Time", xaxis_title="Date", yaxis_title="Spread")
fig2.show()

# Z-score of Spread
fig3 = go.Figure()
fig3.add_trace(go.Scatter(x=zscore.index, y=zscore, mode='lines', name="Z-score"))
fig3.add_shape(go.layout.Shape(type="line", x0=zscore.index.min(), x1=zscore.index.max(), y0=0, y1=0, line=dict(color="black", dash="dash")))
fig3.add_shape(go.layout.Shape(type="line", x0=zscore.index.min(), x1=zscore.index.max(), y0=1, y1=1, line=dict(color="green", dash="dash")))
fig3.add_shape(go.layout.Shape(type="line", x0=zscore.index.min(), x1=zscore.index.max(), y0=-1, y1=-1, line=dict(color="red", dash="dash")))
fig3.update_layout(title="Z-score of Spread", xaxis_title="Date", yaxis_title="Z-score")
fig3.show()

print("\nTrading Signals:")
print("- 📈 **Go Long (Buy First, Sell Second)**: Z-score < -1")
print("- 📉 **Go Short (Sell First, Buy Second)**: Z-score > 1")

[*********************100%***********************]  1 of 1 completed
[*********************100%***********************]  1 of 1 completed

Fetching data for AAPL and MSFT...





Cointegration Test p-value: 0.2206
✅ Stocks are cointegrated! Proceeding with strategy...



Trading Signals:
- 📈 **Go Long (Buy First, Sell Second)**: Z-score < -1
- 📉 **Go Short (Sell First, Buy Second)**: Z-score > 1
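
The p-value printed above (0.2206) is larger than the conventional 0.05 significance level. The null hypothesis of the test behind `sm.tsa.stattools.coint` is "no cointegration", so for this sample the test does not reject it, even though the success message is still printed. A minimal sketch of gating the message on the p-value might look like this; the helper name `cointegration_message` and the default `alpha=0.05` are assumptions made for illustration and are not part of the notebook.

```python
def cointegration_message(p_value, alpha=0.05):
    """Return a status line that depends on the test's p-value.

    The null hypothesis of the test is "no cointegration", so only a
    p-value below alpha counts as evidence that the pair is cointegrated.
    """
    if p_value < alpha:
        return f"Cointegrated at the {alpha:.0%} level (p = {p_value:.4f})."
    return f"No evidence of cointegration at the {alpha:.0%} level (p = {p_value:.4f})."


# p-value reported in the output above
print(cointegration_message(0.2206))
```

Returning a string rather than printing inside the helper keeps the decision easy to reuse, for example to skip the spread calculation when the pair fails the test.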


### Summary of the Plot

The displayed plot represents the Z-score of the spread in a pairs trading strategy involving AAPL (Apple Inc.) and MSFT (Microsoft Corp.). The Z-score is a statistical measure of how far the spread deviates from its mean in terms of standard deviations.
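
Following `calculate_spread_zscore` in the code cell, the spread and its Z-score can be written as below. The symbols $P^{(1)}_t$ (AAPL price), $P^{(2)}_t$ (MSFT price), $\beta$, $\mu_S$, and $\sigma_S$ are notation introduced here for illustration; $\beta$ is the OLS slope from regressing the first price series on the second, and $\mu_S$ and $\sigma_S$ are the mean and standard deviation of the spread over the full sample.

$$
S_t = P^{(1)}_t - \beta \, P^{(2)}_t, \qquad Z_t = \frac{S_t - \mu_S}{\sigma_S}
$$

The regression intercept is not subtracted from $S_t$ in the code. This does not change $Z_t$, because a constant shifts $S_t$ and $\mu_S$ by the same amount.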

The purple line shows the Z-score of the spread over time. The black dashed line at $Z = 0$ marks the mean level of the spread. The green dashed line at $Z = +1$ and the red dashed line at $Z = -1$ serve as trading thresholds:

- Buy (go long the spread) when the Z-score falls below $-1$ (red line).
- Sell (go short the spread) when the Z-score rises above $+1$ (green line).
- Close the position when the Z-score returns toward zero.
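
The threshold rules above can be turned into a position series. The sketch below is one possible reading, not the notebook's own implementation: the function name `generate_positions` and the parameters `entry_z` and `exit_z` are hypothetical, and "returns toward zero" is interpreted here as the Z-score crossing back through zero.

```python
import pandas as pd


def generate_positions(zscore, entry_z=1.0, exit_z=0.0):
    """Map a Z-score series to positions in the spread.

    +1 = long the spread, -1 = short the spread, 0 = flat.
    entry_z and exit_z are illustrative assumptions.
    """
    position = 0
    positions = []
    for z in zscore:
        if position == 0:
            # Open a trade only when the Z-score is beyond a threshold
            if z < -entry_z:
                position = 1
            elif z > entry_z:
                position = -1
        elif position == 1 and z >= -exit_z:
            # Long trade: close once the Z-score has reverted to the exit level
            position = 0
        elif position == -1 and z <= exit_z:
            # Short trade: close once the Z-score has reverted to the exit level
            position = 0
        positions.append(position)
    return pd.Series(positions, index=zscore.index, name="position")


# Small hand-made example (not market data)
example = pd.Series([0.2, -1.3, -0.6, 0.1, 1.4, 0.7, -0.2])
print(generate_positions(example).tolist())
```

In the notebook, the same function could be applied to the `zscore` series returned by `calculate_spread_zscore`. The loop keeps state explicitly because a position stays open between the entry and exit thresholds, which a simple element-wise comparison would not capture.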

In [19]:
!pip freeze > requirements.txt