## Backtesting the Strategy
Here we look at how the strategy has performed over the last 6 months and use the results to decide whether to take it any further.

One addition to your previous code is a new column holding the beta between the two assets of each pair.

This beta is what we use as the hedge ratio.
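How the beta column is computed is not shown here. A minimal sketch, assuming beta is taken as the slope of an ordinary least squares fit of Asset2 prices on Asset1 prices (which lines up with the spread definition Asset2 − β × Asset1 used in the task list), could look like this. The function name `estimate_beta` and the synthetic price series are illustrative, not part of the original code.

```python
import numpy as np
import pandas as pd


def estimate_beta(asset1_prices: pd.Series, asset2_prices: pd.Series) -> float:
    """Hypothetical helper: OLS slope of asset2 on asset1 (with intercept)."""
    # Keep only the dates where both assets have a price
    aligned = pd.concat([asset1_prices, asset2_prices], axis=1).dropna()
    x = aligned.iloc[:, 0].to_numpy()
    y = aligned.iloc[:, 1].to_numpy()
    # np.polyfit with degree 1 returns [slope, intercept]
    slope, _intercept = np.polyfit(x, y, 1)
    return float(slope)


# Synthetic example: asset2 is built as 1.5 * asset1 plus a constant and noise
rng = np.random.default_rng(0)
asset1 = pd.Series(100 + np.cumsum(rng.normal(0, 1, 250)))
asset2 = 1.5 * asset1 + 10 + rng.normal(0, 0.5, 250)

beta = estimate_beta(asset1, asset2)
print(f"Estimated beta: {beta:.3f}")
```

The estimate should land close to the 1.5 used to build the synthetic data. On real prices, the value depends on the lookback window chosen for the fit.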

Here is a task list for those who want to write the code themselves. Otherwise, you can use mine as a reference.

## Pairs Trading Strategy with Beta as Hedge Ratio: Task List
### Data Preparation and Research
- **Asset Pairs Selection:** Identify and finalize the list of asset pairs for pairs trading. Record their hedge ratios (beta) based on historical data.
- **Data Collection:** Research and decide on the data source for real-time or historical price data for the selected asset pairs.

### Code Development
- **Fetch Asset Data:** Write code to fetch historical or real-time price data for the selected asset pairs.
- **Calculate Spread:** Implement logic to calculate the spread between each asset pair using the formula $\text{Spread} = \text{Asset2} - \beta \times \text{Asset1}$.
- **Z-Score Calculation:** Calculate the z-score of the spread to standardize it.
- **Signal Generation:** Generate trading signals based on the calculated z-score. Define upper and lower thresholds for entering and exiting trades.
- **Use Beta as Hedge Ratio:** Implement logic to use beta as a hedge ratio while placing trades to make the portfolio market-neutral.
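The spread and z-score steps from the task list can be sketched on a handful of made-up prices. This is only an illustration: the helper name `spread_zscore`, the toy prices, the beta of 2.0 and the ±1 thresholds are all assumptions chosen to keep the arithmetic easy to follow.

```python
import pandas as pd


def spread_zscore(asset1: pd.Series, asset2: pd.Series, beta: float) -> pd.DataFrame:
    """Hypothetical helper: spread = asset2 - beta * asset1 and its z-score."""
    spread = asset2 - beta * asset1
    # Population standard deviation (ddof=0), the default used by scipy.stats.zscore
    z = (spread - spread.mean()) / spread.std(ddof=0)
    return pd.DataFrame({"spread": spread, "zscore": z})


# Toy prices, not market data
prices1 = pd.Series([10.0, 11.0, 12.0, 11.0, 10.0])
prices2 = pd.Series([21.0, 22.0, 27.0, 22.0, 19.0])

out = spread_zscore(prices1, prices2, beta=2.0)

# Flag the bars where the z-score leaves an assumed +/-1 band
out["above_upper"] = out["zscore"] > 1.0
out["below_lower"] = out["zscore"] < -1.0
print(out)
```

Only the third bar (spread of 3) sits above the upper band and only the last bar (spread of -1) sits below the lower band; these are the bars where a signal would be considered.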

In [2]:
# Importing additional necessary libraries for quant metrics and plotting
from pyfolio.timeseries import perf_stats
import plotly.graph_objs as go
import os
import pandas as pd
import numpy as np
from datetime import datetime
from dateutil.relativedelta import relativedelta
from scipy.stats import zscore
import yfinance as yf
import warnings

# Ignore all warnings
warnings.filterwarnings("ignore")

In [3]:
def backtest(df_trading: pd.DataFrame, asset1: str, asset2: str, beta: float):
    metrics = {}

    # Spread between the two legs, standardized over the whole backtest window
    df_trading['spread'] = df_trading[asset2] - beta * df_trading[asset1]
    df_trading['zscore'] = zscore(df_trading['spread'])

    # Entry bands: one standard deviation on either side of the mean z-score
    UL = df_trading['zscore'].mean() + df_trading['zscore'].std()
    LL = df_trading['zscore'].mean() - df_trading['zscore'].std()

    # Build the signals in NumPy arrays and assign them to the DataFrame once.
    # Chained assignment (df['col'].iloc[i] = x) is unreliable and does not
    # write through to the DataFrame under pandas copy-on-write.
    z = df_trading['zscore'].to_numpy()
    asset1_signal = np.zeros(len(df_trading))
    asset2_signal = np.zeros(len(df_trading))
    holding_position = False  # Flag to indicate if we are holding a position

    for i in range(1, len(df_trading)):
        if not holding_position:
            # NOTE: this branch goes long the spread when it is above the upper band.
            # A conventional mean-reversion entry would take the opposite side
            # (short asset2, long beta * asset1). The signs are left unchanged so
            # that the results table further down stays reproducible.
            if z[i] > UL:
                asset1_signal[i] = -beta
                asset2_signal[i] = 1
                holding_position = True  # Now holding a position

            elif z[i] < LL:
                asset1_signal[i] = beta
                asset2_signal[i] = -1
                holding_position = True  # Now holding a position

        elif LL <= z[i] <= UL:
            # Closing the trade: reverse the previous bar's signal
            asset1_signal[i] = -asset1_signal[i - 1]
            asset2_signal[i] = -asset2_signal[i - 1]
            holding_position = False  # No longer holding a position

    # Shift by one bar so that a signal only earns the following day's return
    df_trading['asset1_signal'] = pd.Series(asset1_signal, index=df_trading.index).shift(1)
    df_trading['asset2_signal'] = pd.Series(asset2_signal, index=df_trading.index).shift(1)

    # Daily returns of each leg and of the combined portfolio
    df_trading['asset1_returns'] = df_trading[asset1].pct_change() * df_trading['asset1_signal']
    df_trading['asset2_returns'] = df_trading[asset2].pct_change() * df_trading['asset2_signal']
    df_trading['portfolio_returns'] = df_trading['asset1_returns'] + df_trading['asset2_returns']

    # Quantitative metrics
    stats = perf_stats(df_trading['portfolio_returns'].dropna())
    metrics['CAGR'] = stats['Annual return']
    metrics['Sharpe ratio'] = stats['Sharpe ratio']
    metrics['Max Drawdown'] = stats['Max drawdown']
    # Counts rows whose signal is not 0 (the NaN introduced by shift(1) is counted too)
    metrics['Number of Trades'] = df_trading['asset1_signal'].ne(0).sum()

    return metrics, df_trading
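To make the three reported metrics less of a black box, here is a rough sketch of how they can be computed by hand from a series of daily returns. It uses common textbook definitions and assumes 252 trading days per year and a zero risk-free rate; it is not claimed to match `perf_stats` digit for digit. The function name `summarize_returns` and the sample returns are made up for the example.

```python
import numpy as np
import pandas as pd

TRADING_DAYS = 252  # Assumption: daily data, 252 trading days per year


def summarize_returns(returns: pd.Series) -> dict:
    """Hypothetical helper: CAGR, annualized Sharpe ratio and max drawdown."""
    returns = returns.dropna()

    # Growth of 1 unit of capital
    equity = (1.0 + returns).cumprod()

    # Compound annual growth rate over the sample
    years = len(returns) / TRADING_DAYS
    cagr = equity.iloc[-1] ** (1.0 / years) - 1.0

    # Annualized Sharpe ratio with a zero risk-free rate
    sharpe = np.sqrt(TRADING_DAYS) * returns.mean() / returns.std(ddof=1)

    # Largest peak-to-trough loss; the peak never starts below the initial capital of 1
    running_peak = equity.cummax().clip(lower=1.0)
    max_drawdown = (equity / running_peak - 1.0).min()

    return {"CAGR": cagr, "Sharpe ratio": sharpe, "Max Drawdown": max_drawdown}


# Made-up daily returns, not strategy output
daily = pd.Series([0.01, -0.02, 0.015, 0.0, 0.005])
stats = summarize_returns(daily)
print(stats)
```

With these sample returns the only drawdown is the 2% loss on the second day, so the max drawdown comes out at about -0.02.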

In [4]:
# Function to save the trading signals graph
def save_plotly_graph(df, asset1, asset2):
    # Create directory if it doesn't exist
    os.makedirs('img/signals', exist_ok=True)

    # Create the figure
    fig = go.Figure()

    # Add z-score trace
    fig.add_trace(go.Scatter(x=df.index, y=df['zscore'], mode='lines', name='Z-Score'))

    # Add upper and lower limits as dashed lines
    UL = df['zscore'].mean() + df['zscore'].std()
    LL = df['zscore'].mean() - df['zscore'].std()

    fig.add_trace(go.Scatter(x=df.index, y=[UL]*len(df.index), mode='lines', name='Upper Limit', line=dict(dash='dash')))
    fig.add_trace(go.Scatter(x=df.index, y=[LL]*len(df.index), mode='lines', name='Lower Limit', line=dict(dash='dash')))

    # Add buy and sell signals
    fig.add_trace(go.Scatter(x=df.index, y=df['zscore'].where(df['asset1_signal'] > 0),
                             mode='markers', name='Buy Signal', marker=dict(color='green', symbol='triangle-up')))

    fig.add_trace(go.Scatter(x=df.index, y=df['zscore'].where(df['asset1_signal'] < 0),
                             mode='markers', name='Sell Signal', marker=dict(color='red', symbol='triangle-down')))

    # Layout options
    fig.update_layout(title=f'Trading Signals for {asset1} and {asset2}',
                      xaxis_title='Date',
                      yaxis_title='Z-Score')

    # Save the figure
    fig.write_html(f'img/signals/{asset1}_{asset2}.html')

In [5]:
# Main function
def main(csv_path: str):
    # Read the CSV file; it needs the columns Asset1, Asset2 and Beta
    asset_pairs = pd.read_csv(csv_path)

    # Collect one row of metrics per pair
    rows = []

    # Date range for backtesting (Last 6 months)
    end = datetime.now().date()
    start = (datetime.now() - relativedelta(months=6)).date()

    for index, row in asset_pairs.iterrows():
        print("Running Backtest for Asset Pair")
        asset1 = row['Asset1']
        asset2 = row['Asset2']
        beta = row['Beta']
        print("Downloading Data")
        # Download both legs in one call. With auto_adjust=True the 'Close' column
        # is already adjusted for splits and dividends (recent yfinance releases no
        # longer return an 'Adj Close' column by default).
        prices = yf.download([asset1, asset2], start=start, end=end,
                             progress=False, auto_adjust=True)['Close']

        # DataFrame for backtesting, limited to dates where both assets have a price
        df_trading = prices[[asset1, asset2]].dropna()
        print("Running Backtest metrics")
        # Backtest and get metrics and signals
        metrics, signals_df = backtest(df_trading, asset1, asset2, beta)

        rows.append({
            'Pair Name': f'{asset1}_{asset2}',
            'CAGR': metrics['CAGR'],
            'Sharpe Ratio': metrics['Sharpe ratio'],
            'Number of Trades': metrics['Number of Trades'],
            'Max Drawdown': metrics['Max Drawdown'],
        })

        # Save the Plotly graph
        save_plotly_graph(signals_df, asset1, asset2)

    # Save the metrics DataFrame
    metrics_df = pd.DataFrame(rows, columns=['Pair Name', 'CAGR', 'Sharpe Ratio', 'Number of Trades', 'Max Drawdown'])
    os.makedirs('data', exist_ok=True)
    metrics_df.to_csv('data/backtest.csv', index=False)
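`main` reads three columns from the pairs file: `Asset1`, `Asset2` and `Beta`. As a quick illustration of that layout, the sketch below builds a two-row file in memory and reads it back. The tickers are borrowed from the pair names in the results table, but the beta values are placeholders rather than fitted hedge ratios.

```python
import io

import pandas as pd

# Two example rows; the Beta values are placeholders, not estimated hedge ratios
pairs = pd.DataFrame({
    "Asset1": ["CSX", "EMR"],
    "Asset2": ["CP", "DE"],
    "Beta": [1.0, 1.0],
})

# Write to an in-memory buffer instead of disk to keep the example self-contained
buffer = io.StringIO()
pairs.to_csv(buffer, index=False)
buffer.seek(0)

loaded = pd.read_csv(buffer)

# Same naming scheme that main() uses for the 'Pair Name' column
pair_names = [f"{row['Asset1']}_{row['Asset2']}" for _, row in loaded.iterrows()]
print(pair_names)
```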

In [6]:
main("/Users/gabe/Desktop/FIM 500/Pairs Trading/Live_Pairs_Trading/research/manufacturing/data/final_pairs.csv")

Running Backtest for Asset Pair
Downloading Data
Running Backtest metrics
[... the same three lines are printed once per pair, 13 times in total ...]


In [7]:
backtest_df = pd.read_csv("data/backtest.csv")

In [8]:
backtest_df

| Unnamed: 0 | Pair Name | CAGR | Sharpe Ratio | Number of Trades | Max Drawdown |
|---|---|---|---|---|---|
| 0 | CSX_CP | -0.002166 | 0.009283 | 14 | -0.035479 |
| 1 | EMR_DE | 0.383708 | 2.347223 | 15 | -0.038874 |
| 2 | FAST_CP | -0.053387 | -1.587192 | 6 | -0.034318 |
| 3 | FDX_CP | 0.034424 | 0.821413 | 8 | -0.02159 |
| 4 | FDX_FAST | 0.048542 | 1.359948 | 8 | -0.014827 |
| 5 | MAS_CP | -0.060352 | -1.645778 | 8 | -0.034316 |
| 6 | NSC_LUV | -0.079429 | -1.043302 | 11 | -0.07028 |
| 7 | OTIS_CP | 0.002632 | 0.141068 | 7 | -0.009017 |
| 8 | RTX_LMT | -0.136589 | -1.560015 | 12 | -0.070801 |
| 9 | TXT_DE | 0.063396 | 0.618392 | 8 | -0.061907 |
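One way to turn these results into a go/no-go decision is to shortlist pairs by Sharpe ratio. The sketch below re-enters the pair names, CAGR and Sharpe ratio values shown above and keeps the pairs above a cutoff. The cutoff of 1.0 is an arbitrary assumption for illustration, not a rule used in the original analysis.

```python
import pandas as pd

# Values copied from the backtest results above
backtest_df = pd.DataFrame({
    "Pair Name": ["CSX_CP", "EMR_DE", "FAST_CP", "FDX_CP", "FDX_FAST",
                  "MAS_CP", "NSC_LUV", "OTIS_CP", "RTX_LMT", "TXT_DE"],
    "CAGR": [-0.002166, 0.383708, -0.053387, 0.034424, 0.048542,
             -0.060352, -0.079429, 0.002632, -0.136589, 0.063396],
    "Sharpe Ratio": [0.009283, 2.347223, -1.587192, 0.821413, 1.359948,
                     -1.645778, -1.043302, 0.141068, -1.560015, 0.618392],
})

MIN_SHARPE = 1.0  # Hypothetical cutoff, chosen only for this example

# Keep the pairs above the cutoff, best Sharpe ratio first
shortlist = (
    backtest_df[backtest_df["Sharpe Ratio"] > MIN_SHARPE]
    .sort_values("Sharpe Ratio", ascending=False)
    .reset_index(drop=True)
)
print(shortlist)
```

With this cutoff only EMR_DE and FDX_FAST remain. Six months of daily data is a short sample, so a filter like this is better read as a first screen than as a final selection.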
