<h2>
Backtesting a Pairs Trading Strategy
</h2>
<p>
This notebook is a sequel to the notebook <i>Exploratory Statistics of Pairs Trading</i>
(https://github.com/IanLKaplan/pairs_trading/blob/master/pairs_trading.ipynb). The previous notebook explores the algorithms for selecting
pairs and the statistics of pairs trading. This statistical exploration provides the foundation for the strategy that is backtested in
this notebook. For a discussion of pairs trading, the algorithms used to select pairs and the background for the strategy that is
backtested in this notebook, please see the previous notebook.
</p>
<h2>
Pairs Trading Strategy
</h2>
<h3>
Shorting Stocks in Pairs Trading
</h3>
<p>
This section discusses the mechanics of taking a short position in a stock, which is more complicated than taking a long position.
</p>
<p>
Pairs trading is a market neutral long/short strategy where the long and short positions for a pair have approximately equal
dollar values when the position is opened. A profit is realized when there is mean reversion in the prices of a pair.
</p>
<p>
When a stock is "shorted", the stock is borrowed and then immediately sold, realizing cash.  For example, if 100 shares at a current market
price of 10 are shorted, the brokerage account will be credited with 1000 (100 x 10).  At some point in the future, the borrowed stock must be
returned by buying the stock at the then-current market price.  A short position is profitable when the market price of the stock goes down.
For example, if the market price of the stock shorted at 10 goes down to 6, there is a profit of 4 per share (4 x 100 = 400).
</p>
<p>
Short positions can have unlimited loss when the stock price goes up.  For example, if the market price of the stock shorted at 10 rises to
14 per share, there is a 400 loss on the 100 share short position. If the stock price doubles to 20 there will be a loss of 10 per share
or 1000 for the 100 share short.
</p>
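<p>
The arithmetic in the examples above can be checked with a minimal sketch. The function name <code>short_position_pnl</code> is hypothetical,
and the calculation ignores commissions, borrow fees and dividends, which a real short position would incur.
</p>

```python
def short_position_pnl(shares: int, entry_price: float, exit_price: float) -> float:
    """Profit (positive) or loss (negative) on a short position.

    Assumption: no commissions, borrow fees or dividends are included.
    """
    return shares * (entry_price - exit_price)


# The three cases from the text: 100 shares shorted at a price of 10
gain_at_6 = short_position_pnl(100, 10.0, 6.0)    # price falls to 6
loss_at_14 = short_position_pnl(100, 10.0, 14.0)  # price rises to 14
loss_at_20 = short_position_pnl(100, 10.0, 20.0)  # price doubles to 20
print(gain_at_6, loss_at_14, loss_at_20)  # 400.0 -400.0 -1000.0
```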
<p>
Shorting stocks is often considered a risky investment strategy because of the potential of unlimited risk in short positions. Pairs
trading uses market neutral positions where a long position is opened that is, initially, approximately equal to the dollar value of the short position.
The pairs that are traded are chosen from the same industry sector and are highly correlated and cointegrated. If the market price of
the shorted stock rises, the value of the long member of the pair should rise as well. The profit in the long position will tend
to offset the loss in the short position. This makes the pairs trading strategy much less risky than a short only strategy.
</p>
<p>
When a stock is shorted the stock is borrowed. This is treated as a margin loan. The brokerage requires that the customer maintain a
balance of liquid assets equal to 150 percent of the amount borrowed. This consists of the proceeds of the short sale, plus 50 percent.
For example, if 100 shares of a 10 dollar stock are shorted, the account will be credited with 1000. The account must also
have an additional balance of 500. The margin requirement can be met with cash or highly tradable "blue chip" stocks (e.g., S&P 500 stocks).
</p>
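<p>
As a small illustration of the 150 percent rule described above, the following sketch computes the short sale proceeds, the additional
margin and the total required balance. The function name and the <code>initial_margin_rate</code> parameter are hypothetical names
chosen for this example.
</p>

```python
from typing import Tuple


def short_margin_requirement(shares: int, price: float,
                             initial_margin_rate: float = 0.5) -> Tuple[float, float, float]:
    """Return (short proceeds, additional margin, total required balance).

    Assumption: the requirement is the short proceeds plus 50 percent,
    as described in the text.
    """
    proceeds = shares * price
    additional = proceeds * initial_margin_rate
    return proceeds, additional, proceeds + additional


# 100 shares of a 10 dollar stock
print(short_margin_requirement(100, 10.0))  # (1000.0, 500.0, 1500.0)
```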
<p>
When the pairs spread crosses a threshold, a long-short position is opened in the pair. The dollar value of the long and short positions
will be approximately equal (they will usually not be exactly equal because we are trading whole share amounts).  This involves the following
steps:
</p>
<ol>
<li>
<p>
Open the short position. This will result in cash from the short sale.
</p>
<p>
Stock A has a price of 10. Shorting 100 shares results in 1000 in cash.
</p>
</li>
<li>
<p>
The proceeds from the short sale are used to pay for the long position. If the cash value of the short position is less than the value of
the long position, some additional cash will be needed to open the long position.
</p>
<p>
Stock B has a price of 20 per share. A long position is taken in 50 shares. The 1000 realized from the short is used
to pay for the long position.
</p>
</li>
<li>
<p>
The 1000 long position is used for the margin requirement. An additional 500 in cash or highly liquid stocks is required for
the margin requirement.
</p>
</li>
</ol>
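<p>
The steps above can be sketched as a position sizing function. This is an illustration under stated assumptions, not the notebook's
implementation: the function name, the <code>dollars_per_side</code> parameter and the dictionary keys are hypothetical, and share counts
are rounded down to whole shares.
</p>

```python
from math import floor
from typing import Dict


def size_pair_position(short_price: float, long_price: float,
                       dollars_per_side: float,
                       initial_margin_rate: float = 0.5) -> Dict[str, float]:
    """Size a long/short pair in whole shares with roughly equal dollar values."""
    # Whole share amounts, so the two sides are usually not exactly equal
    short_shares = floor(dollars_per_side / short_price)
    long_shares = floor(dollars_per_side / long_price)
    short_value = short_shares * short_price
    long_value = long_shares * long_price
    return {
        "short_shares": short_shares,
        "long_shares": long_shares,
        "short_value": short_value,
        "long_value": long_value,
        # Step 2: cash needed beyond the short proceeds to pay for the long side
        "extra_cash_for_long": max(0.0, long_value - short_value),
        # Step 3: additional margin on top of the long position
        "additional_margin": initial_margin_rate * short_value,
    }


# Stock A (short) at 10 and Stock B (long) at 20, with 1000 on each side
position = size_pair_position(short_price=10.0, long_price=20.0, dollars_per_side=1000.0)
print(position)
```

<p>
With the prices from the example, this gives a short position of 100 shares, a long position of 50 shares and an additional margin of 500.
</p>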
<p>
When positions are opened, there must be an additional 50% in the margin account (the initial margin set by Federal Reserve Regulation T).
After the positions are opened, a maintenance margin of at least 25% must be held as the prices of the stocks change. Interactive Brokers (IB)
calculates the margin requirements in real time and will liquidate account assets when the account falls below the required margin.
</p>
<p>
If there is a liquidity deficit, IB will liquidate the deficit amount times 4.
</p>
<p>
The pairs trading strategy will have a portfolio of short and long positions which are opened and closed as the pair spread moves.
At any time, the aggregate value of the short positions and the long positions, plus margin cash, must be within the margin
requirements.
</p>
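<p>
The portfolio margin check described above might be sketched as follows. This is a deliberately simplified model and not IB's actual
calculation: it assumes that the 25% requirement applies to the gross value of the long and short positions, and that
<code>equity</code> (a hypothetical parameter) is the account value available to meet the requirement. One way to read the factor of 4 is
that, at a 25% margin rate, closing 4 dollars of positions reduces the requirement by 1 dollar.
</p>

```python
from typing import Tuple


def margin_status(long_value: float, short_value: float, equity: float,
                  maintenance_rate: float = 0.25,
                  liquidation_multiple: float = 4.0) -> Tuple[float, float, float]:
    """Return (required margin, margin deficit, amount that would be liquidated).

    Assumption: the requirement is a flat percentage of gross position value.
    """
    gross_value = long_value + short_value
    required = maintenance_rate * gross_value
    deficit = max(0.0, required - equity)
    return required, deficit, liquidation_multiple * deficit


# 50,000 long, 50,000 short and 20,000 in equity: a deficit of 5,000
print(margin_status(50_000.0, 50_000.0, 20_000.0))  # (25000.0, 5000.0, 20000.0)
# With 30,000 in equity the requirement is met and nothing is liquidated
print(margin_status(50_000.0, 50_000.0, 30_000.0))  # (25000.0, 0.0, 0.0)
```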
<h4>
Margin references
</h4>
<ul>
<li>
<a href="https://www.interactivebrokers.com/en/general/education/pdfnotes/WN-UnderstandingMargin.php">Understanding Margin Webinar Notes</a>
</li>
</ul>
<h3>
In-sample and out-of-sample time periods
</h3>
<ul>
<li>
<p>
In-sample period: six months (126 trading days)
</p>
</li>
<li>
<p>
Out-of-sample (trading) period: three months (63 trading days)
</p>
</li>
</ul>
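<p>
The in-sample and out-of-sample periods can be laid out as rolling windows over the trading days. The sketch below assumes that the
window advances by the length of the out-of-sample period, so that consecutive trading periods do not overlap. The function name
<code>rolling_windows</code> is hypothetical.
</p>

```python
from typing import Iterator, Tuple


def rolling_windows(num_days: int, in_sample: int = 126,
                    out_of_sample: int = 63) -> Iterator[Tuple[range, range]]:
    """Yield (in-sample index range, out-of-sample index range) pairs."""
    start = 0
    while start + in_sample + out_of_sample <= num_days:
        in_range = range(start, start + in_sample)
        out_range = range(start + in_sample, start + in_sample + out_of_sample)
        yield in_range, out_range
        # Assumption: step forward by one trading period (one quarter)
        start += out_of_sample


# One year of data (252 trading days) yields two windows
windows = list(rolling_windows(252))
for in_range, out_range in windows:
    print(f"in-sample {in_range.start}-{in_range.stop - 1}, "
          f"out-of-sample {out_range.start}-{out_range.stop - 1}")
```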
<h4>
Strategy
</h4>
<p>
For each in-sample period:
</p>
<ol>
<li>
Get pairs for each S&P 500 industry sector
</li>
<li>
Select the pairs with close price series correlation greater than or equal to 0.75
</li>
<li>
Select the high correlation pairs that show Engle-Granger cointegration
</li>
<li>
Sort the spread time series for the selected pairs by volatility (high to low volatility). Pairs whose spread has
high volatility (standard deviation) are more likely to be profitable.
</li>
<li>
Select the top M volatile pairs.
</li>
<li>
Remove pairs that share a stock with another selected pair, so that each stock appears in only one pair
</li>
<li>
Select N pairs from the unique pair list
</li>
</ol>
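<p>
The selection steps above can be sketched as a filter over pre-computed pair statistics. This sketch assumes that the correlation, the
cointegration test result and the spread standard deviation have already been calculated for each candidate pair. The
<code>PairStats</code> type and <code>select_pairs</code> function are hypothetical, and taking the first N unique pairs is an assumption,
since the list above does not say how the N pairs are chosen.
</p>

```python
from typing import List, NamedTuple


class PairStats(NamedTuple):
    stock_a: str
    stock_b: str
    correlation: float   # close price correlation in the in-sample period
    cointegrated: bool   # result of the cointegration test
    spread_std: float    # standard deviation of the in-sample spread


def select_pairs(candidates: List[PairStats], top_m: int, n: int,
                 min_corr: float = 0.75) -> List[PairStats]:
    # Steps 2 and 3: keep high correlation pairs that are cointegrated
    filtered = [p for p in candidates if p.correlation >= min_corr and p.cointegrated]
    # Steps 4 and 5: sort by spread volatility, high to low, and keep the top M
    ranked = sorted(filtered, key=lambda p: p.spread_std, reverse=True)[:top_m]
    # Step 6: drop pairs that reuse a stock from an earlier pair
    used = set()
    unique = []
    for pair in ranked:
        if pair.stock_a in used or pair.stock_b in used:
            continue
        unique.append(pair)
        used.update((pair.stock_a, pair.stock_b))
    # Step 7: assumption, take the first N of the unique pairs
    return unique[:n]


# Made-up ticker symbols and statistics, for illustration only
candidates = [
    PairStats("AAA", "BBB", 0.90, True, 2.5),
    PairStats("AAA", "CCC", 0.85, True, 2.0),   # shares AAA with the first pair
    PairStats("DDD", "EEE", 0.80, True, 1.5),
    PairStats("FFF", "GGG", 0.60, True, 3.0),   # correlation too low
    PairStats("HHH", "III", 0.95, False, 2.8),  # not cointegrated
]
selected = select_pairs(candidates, top_m=3, n=2)
print([(p.stock_a, p.stock_b) for p in selected])  # [('AAA', 'BBB'), ('DDD', 'EEE')]
```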
<h4>
Out-of-sample trading period
</h4>
<p>
This is not an academic exercise.  The pairs trading backtest is intended to be as close to actual trading as possible.
This backtest is intended to help understand whether this strategy is worth pursuing for actual trading.
</p>
<p>
At the start date of the backtest, there is an investment of N dollars (e.g., 100,000). Of these funds, approximately 60,000 is used for
long and short positions. The remaining approximately 40,000 is used to satisfy the margin requirement.
</p>
<p>
When the spread crosses the mean, the long and short positions will be closed.
</p>
<p>
At the end of each trading period, any open positions will be closed. The resulting cash is used in the next trading period.
</p>
<p>
For each pair (in the N pair set) in the out-of-sample trading period:
</p>
<ol>
<li>
Calculate the spread value for the current trading day.
</li>
<li>
If there is an open position in the pair and the spread crosses the mean, the position will be closed. The profit and
loss amount will be updated.
</li>
<li>
If the spread value for a pair crosses the trading threshold (e.g., standard deviation times 0.75),
open a long/short position.
</li>
<li>
If the end of the trading period is reached, close all open positions and update the profit and loss.
</li>
</ol>
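<p>
The daily loop above can be sketched for a single pair. This is a simplified illustration, not the backtest itself: it assumes that the
spread mean and standard deviation come from the in-sample period, it measures profit and loss in spread units rather than dollars, and it
ignores share sizing and margin. The function name <code>backtest_spread</code> is hypothetical.
</p>

```python
from typing import List, Tuple


def backtest_spread(spread: List[float], mean: float, std: float,
                    threshold_factor: float = 0.75) -> List[Tuple[int, int, float]]:
    """Return a list of (open day, close day, P/L in spread units) trades."""
    threshold = threshold_factor * std
    position = 0        # +1 long the spread, -1 short the spread, 0 no position
    entry_day = -1
    entry_value = 0.0
    trades = []
    last_day = len(spread) - 1
    for day, value in enumerate(spread):
        if position != 0:
            # Close when the spread crosses the mean or the period ends
            crossed_mean = ((position == 1 and value >= mean) or
                            (position == -1 and value <= mean))
            if crossed_mean or day == last_day:
                trades.append((entry_day, day, position * (value - entry_value)))
                position = 0
                continue
        if position == 0 and day < last_day:
            if value > mean + threshold:
                # Spread is high: short the spread
                position, entry_day, entry_value = -1, day, value
            elif value < mean - threshold:
                # Spread is low: long the spread
                position, entry_day, entry_value = 1, day, value
    return trades


# Made-up spread values, for illustration only
spread = [0.0, 1.0, 0.5, -0.1, -1.0, -0.2, 0.3]
trades = backtest_spread(spread, mean=0.0, std=1.0)
for open_day, close_day, pnl in trades:
    print(open_day, close_day, round(pnl, 4))
```

<p>
In this example a short spread position is opened on day 1 and closed on day 3, and a long spread position is opened on day 4 and closed on
day 6, the last day of the period.
</p>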
<p>
Positions are opened for whole share values.
</p>
<p>
The results of the backtest should provide the following statistics:
</p>
<h4>
Trading Period Statistics
</h4>
<ol>
<li>
Running margin values, by day.
</li>
<li>
Positions for each pair and P/L for each trade.
</li>
<li>
Return for each pair in the trading period
</li>
<li>
Overall return for the trading period
</li>
<li>
Standard deviation for the trading period
</li>
<li>
Number of pairs that had a profit and number of pairs that had a loss
</li>
<li>
Maximum drawdown for the trading period
</li>
</ol>
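<p>
Several of the trading period statistics above can be computed from a daily series of portfolio values. The sketch below uses only the
Python standard library. The function names are hypothetical, the portfolio values are made up, and the standard deviation is taken over
simple daily returns, which is one of several reasonable choices.
</p>

```python
from statistics import stdev
from typing import List


def daily_returns(values: List[float]) -> List[float]:
    """Simple daily returns from a series of portfolio values."""
    return [values[i] / values[i - 1] - 1.0 for i in range(1, len(values))]


def period_return(values: List[float]) -> float:
    """Overall return from the first to the last value."""
    return values[-1] / values[0] - 1.0


def max_drawdown(values: List[float]) -> float:
    """Largest peak to trough decline, as a negative fraction."""
    peak = values[0]
    worst = 0.0
    for value in values:
        peak = max(peak, value)
        worst = min(worst, value / peak - 1.0)
    return worst


# Made-up portfolio values, for illustration only
values = [100.0, 110.0, 99.0, 104.5, 120.0]
print("return:", round(period_return(values), 4))
print("std dev:", round(stdev(daily_returns(values)), 4))
print("max drawdown:", round(max_drawdown(values), 4))
```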
<h4>
Yearly Results
</h4>
<ol>
<li>
Yearly return
</li>
<li>
Yearly standard deviation
</li>
<li>
Yearly maximum drawdown
</li>
<li>
Sharpe Ratio
</li>
<li>
VaR and CVaR
</li>
</ol>
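<p>
The yearly risk measures above could be computed along the following lines. This is a sketch under stated assumptions: the Sharpe ratio is
annualized from daily returns with 252 trading days, and VaR and CVaR use the historical (empirical quantile) method and are reported as
returns, so a loss is a negative number. The function names are hypothetical and other estimation methods are equally valid.
</p>

```python
from typing import Sequence, Tuple

import numpy as np


def sharpe_ratio(daily_returns: Sequence[float], risk_free_rate: float = 0.0,
                 periods: int = 252) -> float:
    """Annualized Sharpe ratio from daily returns."""
    excess = np.asarray(daily_returns, dtype=float) - risk_free_rate / periods
    return float(np.sqrt(periods) * excess.mean() / excess.std(ddof=1))


def historical_var_cvar(daily_returns: Sequence[float],
                        level: float = 0.05) -> Tuple[float, float]:
    """Historical VaR (the return quantile) and CVaR (mean return at or below VaR)."""
    returns = np.asarray(daily_returns, dtype=float)
    var = float(np.quantile(returns, level))
    cvar = float(returns[returns <= var].mean())
    return var, cvar


# Made-up daily returns, for illustration only
sample_returns = [0.01, -0.02, 0.015, -0.005, 0.02]
var_20, cvar_20 = historical_var_cvar(sample_returns, level=0.2)
print("Sharpe:", round(sharpe_ratio(sample_returns), 2))
print("VaR:", round(var_20, 4), "CVaR:", round(cvar_20, 4))
```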
<h4>
Data structures
</h4>
<ul>
<li>
Pairs list for the trading period
</li>
<li>
Current trading capital balance
</li>
<li>
Trade position and P/L for each trade in the trading period.
</li>
<li>
Quarterly and yearly statistics. Once the statistics are calculated the trade position data can be discarded.
</li>
</ul>
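<p>
One possible shape for the data structures listed above is a set of small dataclasses. The class and field names here are hypothetical
and are meant only to illustrate how the pieces relate. The notebook's actual structures may differ.
</p>

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List, Tuple


@dataclass
class Trade:
    """Position and P/L for one long/short trade in a pair."""
    pair: Tuple[str, str]
    open_date: date
    close_date: date
    long_shares: int
    short_shares: int
    pnl: float


@dataclass
class TradingPeriod:
    """Pairs list, trading capital and trades for one out-of-sample period."""
    pairs: List[Tuple[str, str]]
    capital: float
    trades: List[Trade] = field(default_factory=list)

    def record(self, trade: Trade) -> None:
        # Keep the trade and update the current trading capital balance
        self.trades.append(trade)
        self.capital += trade.pnl


@dataclass
class PeriodStats:
    """Summary kept after the trade data for a period is discarded."""
    period_return: float
    std_dev: float
    max_drawdown: float


# Made-up pair and trade, for illustration only
period = TradingPeriod(pairs=[("AAA", "BBB")], capital=100_000.0)
period.record(Trade(("AAA", "BBB"), date(2007, 7, 5), date(2007, 7, 19), 50, 100, 250.0))
print(period.capital)  # 100250.0
```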

In [None]:
import os
from datetime import datetime
from multiprocessing import Pool
from typing import List, Tuple, Dict

import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import seaborn as sns
import statsmodels.api as sm
from numpy import log
from statsmodels.tsa.stattools import adfuller
from statsmodels.tsa.vector_ar.vecm import coint_johansen
from tabulate import tabulate

#
# Local libraries
#
from coint_analysis.coint_analysis_result import CointAnalysisResult, CointInfo
from coint_data_io.coint_matrix_io import CointMatrixIO
from plot_ts.plot_time_series import plot_ts, plot_two_ts
from read_market_data.MarketData import MarketData
from s_and_p_filter import s_and_p_directory, s_and_p_stock_file

# Path to the S&P 500 stock file
s_and_p_file = os.path.join(s_and_p_directory, s_and_p_stock_file)

start_date_str = '2007-01-03'
start_date: datetime = datetime.fromisoformat(start_date_str)

# Trading day counts used for the in-sample (half year) and out-of-sample (quarter) periods
trading_days = 252
half_year = trading_days // 2
quarter = trading_days // 4
