![QuantConnect Logo](https://cdn.quantconnect.com/web/i/icon.png)
<hr>

### Pairs Trading Quantitative Trading

Pairs trading is a market-neutral strategy that belongs to the family of statistical arbitrage. The basic idea is to select two stocks whose prices move together and, when their prices diverge, to sell the relatively expensive stock and buy the relatively cheap one.

In [None]:
%matplotlib inline
# Imports
from clr import AddReference
from datetime import datetime, timedelta
import matplotlib.pyplot as plt
import pandas as pd
# Create an instance
qb = QuantBook()

In [None]:
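from math import floor
import matplotlib.cm as cm
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import statsmodels.api as sm
from sklearn import linear_model
# Newer matplotlib releases renamed the seaborn styles to 'seaborn-v0_8-*'; use whichever is installed.
new_style = 'seaborn-v0_8-whitegrid'
plt.style.use(new_style if new_style in plt.style.available else 'seaborn-whitegrid')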

In [None]:
syls = ["KO", "PEP"]
qb.AddEquity(syls[0])
qb.AddEquity(syls[1])
start = datetime(2017,1,1)
end = datetime(2022,1,1)
x = qb.History([syls[0]],start ,end, Resolution.Daily).loc[syls[0]]['close']
y = qb.History([syls[1]],start ,end, Resolution.Daily).loc[syls[1]]['close']

In [None]:
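syls = ["XOM", "CVX"]
qb.AddEquity(syls[0])
qb.AddEquity(syls[1])
start = datetime(2017, 1, 1)
end = datetime(2022, 1, 1)
x = qb.History([syls[0]], start, end, Resolution.Daily).loc[syls[0]]['close']
y = qb.History([syls[1]], start, end, Resolution.Daily).loc[syls[1]]['close']
# Combine both close series into one frame, one column per ticker.
# `price` and `lp` are used by the cells below but were never defined.
price = pd.concat([x, y], axis=1, keys=syls)
# Assumption: `lp` holds log prices, which is what the name suggests.
lp = np.log(price)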

In [None]:
price.plot(figsize = (15,10))

### Estimates
If two stocks, X and Y, are cointegrated in their price movements, then any divergence of the spread from 0 should be temporary and mean-reverting. In the next step, we estimate the spread series by regressing Y on X.
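Written out, and assuming the spread is taken as the residual of an ordinary least squares regression of Y on X (which is what the `reg` function in the next cell computes), the model can be sketched as:

$$
y_t = \alpha + \beta x_t + \varepsilon_t, \qquad \text{spread}_t = y_t - \beta x_t - \alpha
$$

Here $\beta$ is the regression slope and $\alpha$ the intercept. Because the intercept is fitted, the estimated spread has mean 0 over the sample, so deviations from 0 are the divergences of interest.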

In [None]:
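def reg(x, y):
    # Regress y on x by ordinary least squares. LinearRegression fits the
    # intercept itself, so no explicit constant column is needed.
    regr = linear_model.LinearRegression()
    regr.fit(x.to_frame(), y)
    beta = regr.coef_[0]      # slope of y on x
    alpha = regr.intercept_   # fitted intercept
    # The spread is the regression residual.
    spread = y - x*beta - alpha
    return spread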
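To see what this regression recovers, the sketch below runs the same kind of fit on a synthetic pair rather than on market data. It is only an illustration under stated assumptions: `estimate_spread` is a hypothetical helper, `np.polyfit` stands in for `LinearRegression` (both perform an ordinary least squares fit), and the parameters 0.5 and 0.8 are arbitrary.

```python
import numpy as np


def estimate_spread(x, y):
    """OLS fit of y on x with an intercept; returns (spread, beta, alpha)."""
    beta, alpha = np.polyfit(x, y, deg=1)  # polyfit returns slope first, then intercept
    spread = y - beta * x - alpha          # regression residual
    return spread, beta, alpha


# Synthetic data (an assumption for illustration, not real prices):
# x is a random walk, y follows x linearly plus stationary noise.
rng = np.random.default_rng(0)
n = 1000
x = 4.0 + np.cumsum(rng.normal(0.0, 0.01, n))
noise = rng.normal(0.0, 0.02, n)
y = 0.5 + 0.8 * x + noise

spread, beta, alpha = estimate_spread(x, y)
print(f"beta={beta:.3f}, alpha={alpha:.3f}, mean spread={spread.mean():.2e}")
```

The fitted slope should land near the value used to generate the data, and the spread averages to zero by construction of the least squares fit.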

In [None]:
x = lp['XOM']
y = lp['CVX']
spread = reg(x,y)
# plot the spread series
spread.plot(figsize =(15,10))
plt.ylabel('spread')

### Step 3: Check Stationarity
From the plot above, the spread series appears to be stationary and mean-reverting. To test this formally, we apply the Augmented Dickey-Fuller (ADF) test to the spread series.

In [None]:
# check if the spread is stationary 
adf = sm.tsa.stattools.adfuller(spread, maxlag=1)
print('ADF test statistic: %.02f' % adf[0])
for key, value in adf[4].items():
    print('\t%s: %.3f' % (key, value))
print('p-value: %.03f' % adf[1])
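The ADF null hypothesis is that the series has a unit root, so a test statistic below the critical value at a given level rejects the null in favor of stationarity. The sketch below turns that comparison into a small helper. `rejects_unit_root` is a hypothetical name, and the tuple passed to it uses made-up numbers that only mimic the layout `adfuller` returns (statistic, p-value, lags used, observations, critical values).

```python
def rejects_unit_root(adf_result, level="5%"):
    """Return True if the ADF statistic is below the critical value at `level`.

    `adf_result` is expected to follow the layout returned by adfuller:
    index 0 is the test statistic, index 4 the dict of critical values.
    """
    statistic = adf_result[0]
    critical_values = adf_result[4]
    return statistic < critical_values[level]


# Made-up numbers for illustration only, not output from the data above.
example = (-3.10, 0.026, 1, 1257, {"1%": -3.44, "5%": -2.86, "10%": -2.57})
print(rejects_unit_root(example, "5%"))
print(rejects_unit_root(example, "1%"))
```

With the notebook's own result, the call would be `rejects_unit_root(adf)`. A rejection at the chosen level supports treating the spread as stationary.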