# Pair strategy using daily data - Python

### Overview
The pair trading strategy is a standard mean-reversion model: two stocks that normally trade in the same direction become temporarily uncorrelated and are expected to eventually revert to the mean.

This sample assumes that the strategy trades pairs of equities from the same industry sector that are historically highly correlated.

### Indicator
Pairs trading is a market-neutral strategy. When we identify a deviation in the price relationship of the two instruments, we expect a mean reversion.
We buy the underperforming instrument and simultaneously sell the outperforming one.

To illustrate the price relationship between the pair instruments and to generate the trading signals described above, we use the following ratio:

$$ Ratio = \frac{Last_1}{Last_2} $$
Where:
- $Last_1$ is the last price of instrument 1
- $Last_2$ is the last price of instrument 2
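
As a minimal sketch of the formula, assuming two short series of made-up last prices (illustrative values, not market data), the ratio can be computed element-wise with *pandas*:

```python
import pandas as pd

# Hypothetical last prices for two instruments (illustrative values only)
last_1 = pd.Series([100.0, 102.0, 101.0])
last_2 = pd.Series([50.0, 50.0, 50.5])

# Ratio = Last_1 / Last_2, computed element-wise
example_ratio = last_1 / last_2
print(example_ratio.tolist())
```

A ratio that drifts away from its usual level is the deviation the strategy looks for.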

### Services used
This sample uses *gRPC requests* to retrieve bars from the dedicated hosted service. The endpoint queried in this script is:
* DailyBars: to directly retrieve bar objects from the server

### Modules required
1. Systemathics packages:
    * *systemathics.apis*
2. Open source packages
    * *googleapis-common-protos*
    * *protobuf*
    * *grpcio*
    * *pandas*
    * *matplotlib* as the display package

***

# Run Pair strategy using daily data

### Step 1: Install packages and import them

In [None]:
pip install googleapis-common-protos protobuf grpcio pandas matplotlib

In [None]:
pip install systemathics.apis

In [None]:
import os
import grpc
import pandas as pd
from datetime import datetime
import google.type.date_pb2 as date
import google.type.dayofweek_pb2 as dayofweek
import google.type.timeofday_pb2 as timeofday
import google.protobuf.duration_pb2 as duration
import systemathics.apis.type.shared.v1.identifier_pb2 as identifier
import systemathics.apis.services.daily.v1.daily_bars_pb2 as daily_bars
import systemathics.apis.services.daily.v1.daily_bars_pb2_grpc as daily_bars_service
import systemathics.apis.helpers.token_helpers as token_helpers
import systemathics.apis.helpers.channel_helpers as channel_helpers

### Step 2: Retrieve authentication token
The following code snippet sends an authentication request and prints the token to the console output. The token is needed to process the upcoming *gRPC queries*.

In [None]:
token = token_helpers.get_token()
display(token)

### Step 3: Retrieve prices

#### 3.1 Instrument selection

In [None]:
# set the instrument identifiers: tickers and exchange
exchange = "XNGS"
ticker_1 = "AAPL"
ticker_2 = "MSFT"

#### 3.2 Request creation
The following code snippets create the *gRPC client*, process the *daily bars* requests and retrieve the replies:

In [None]:
# create daily bars requests for the pair instruments
daily_request_1 = daily_bars.DailyBarsRequest(identifier = identifier.Identifier(exchange = exchange, ticker = ticker_1))
daily_request_2 = daily_bars.DailyBarsRequest(identifier = identifier.Identifier(exchange = exchange, ticker = ticker_2))

In [None]:
try:
    # open a gRPC channel, instantiate the daily bars service and get the reply for the 1st instrument
    with channel_helpers.get_grpc_channel() as channel:  
        daily_service = daily_bars_service.DailyBarsServiceStub(channel)
        response_1 = daily_service.DailyBars(request = daily_request_1, metadata = [('authorization', token)])
        
    print("Total bars retrieved: ",len(response_1.data))
except grpc.RpcError as e:
    display(e.code().name)
    display(e.details())

In [None]:
try:
    # open a gRPC channel, instantiate the daily bars service and get the reply for the 2nd instrument
    with channel_helpers.get_grpc_channel() as channel:  
        daily_service = daily_bars_service.DailyBarsServiceStub(channel)
        response_2 = daily_service.DailyBars(request = daily_request_2, metadata = [('authorization', token)])
        
    print("Total bars retrieved: ",len(response_2.data))
except grpc.RpcError as e:
    display(e.code().name)
    display(e.details())

#### 3.3 Store prices and timestamps
The following code snippet processes the replies of the requests and stores them in a *pandas* dataframe:

In [None]:
# create pandas dataframe to store close prices for the pair instruments
length = 500 # keep last 500 points
dates = [datetime(ts.date.year,ts.date.month, ts.date.day ) for ts in response_2.data[-length:]]
prices1 = [ts.close for ts in response_1.data[-length:]]
prices2 = [ts.close for ts in response_2.data[-length:]]
data = {'Date': dates, 'Price_1': prices1, 'Price_2': prices2}
df = pd.DataFrame(data=data)
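
The cell above takes the dates from the second reply only, which implicitly assumes that both instruments have bars on exactly the same days. As a hedged sketch of one way to make that alignment explicit, the following example joins two made-up close-price series on their dates. The series `closes_1` and `closes_2` are hypothetical stand-ins, not data returned by the service:

```python
import pandas as pd
from datetime import datetime

# Hypothetical close prices keyed by date; instrument 2 has no bar on 2021-01-05
closes_1 = pd.Series(
    [10.0, 11.0, 12.0],
    index=[datetime(2021, 1, 4), datetime(2021, 1, 5), datetime(2021, 1, 6)],
    name="Price_1",
)
closes_2 = pd.Series(
    [20.0, 22.0],
    index=[datetime(2021, 1, 4), datetime(2021, 1, 6)],
    name="Price_2",
)

# The inner join keeps only the dates present in both series
aligned = pd.concat([closes_1, closes_2], axis=1, join="inner")
aligned.index.name = "Date"
aligned = aligned.reset_index()
print(aligned)
```

With an inner join, a day missing for one instrument is dropped for both, so every row of the dataframe pairs prices from the same date.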

#### 3.4 Visualize retrieved prices

In [None]:
import matplotlib.pyplot as plt

In [None]:
plt.figure(figsize=(25, 10))
plt.plot( 'Date', 'Price_1', data=df, marker='', color='orange', linewidth=1, label="{}".format(ticker_1))
plt.plot( 'Date', 'Price_2', data=df, marker='', color='purple', linewidth=1, label="{}".format(ticker_2))
plt.xlabel("Date")
plt.ylabel("Price")
plt.title("{0} & {1} price over time".format(ticker_1, ticker_2))
plt.legend()

### Step 4: Generate buy/sell signals

#### 4.1 Compute ratio

In [None]:
# define the strategy ratio, equal to Price1/Price2
def get_ratio(p1,p2):
    if p2 == 0:
        return 0
    else:
        return p1/p2

In [None]:
# Compute ratio and add to the dataframe
ratios = [get_ratio(p1,p2) for p1,p2 in zip(prices1,prices2)]
df['Ratio'] = ratios
df

In [None]:
# display the ratio and its mean over time
ratio_mean = [df['Ratio'].mean()] * len(df)
plt.figure(figsize=(25, 10))
plt.plot('Date', 'Ratio', data=df, marker='', color='blue', linewidth=1, label="Ratio")
plt.plot(df['Date'], ratio_mean, marker='', color='black', linewidth=1, label="Average ratio")
plt.xlabel("Date")
plt.ylabel("Ratio")
plt.title("{0} & {1} ratio".format(ticker_1,ticker_2))
plt.legend()
plt.show()

#### 4.2 Compute ratio Z-score
A *z-score* is the number of standard deviations a data point lies from the mean. In the following code snippets, we compute the *z-score* of the strategy ratio.

In [None]:
# define a method to compute z-score
def get_zscore(value,std,mean):
    return (value - mean) / std

In [None]:
# compute the z-score for the strategy indicator (mean and std are computed once)
ratio_avg, ratio_std = df['Ratio'].mean(), df['Ratio'].std()
zscores = [get_zscore(i, ratio_std, ratio_avg) for i in ratios]
df['Zscore'] = zscores
df
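
The same computation can also be written in vectorized form. The following is a minimal sketch on made-up ratio values (the name `sample_ratio` is introduced here for illustration); it relies on the fact that *pandas* uses the sample standard deviation (`ddof=1`) by default:

```python
import pandas as pd

# Hypothetical ratio values (illustrative only)
sample_ratio = pd.Series([1.0, 2.0, 3.0, 4.0, 5.0])

# z-score = (value - mean) / standard deviation, applied to the whole series
sample_zscore = (sample_ratio - sample_ratio.mean()) / sample_ratio.std()
print(sample_zscore.round(3).tolist())
```

By construction, the resulting series has a mean of zero, and the value equal to the mean of the ratio maps to a z-score of zero.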

In the following code snippet, we plot the *z-score*. We notice that it tends to revert to the mean once it moves above or below the thresholds +1 and -1.

In [None]:
# display zscore and zscore_mean
zscore_means = [df['Zscore'].mean()] * len(df)
plt.figure(figsize=(25, 10))
plt.plot('Date', 'Zscore', data=df, marker='', color='blue', linewidth=1, label="Z-score")
plt.plot(df['Date'],zscore_means, marker='', color='black', linewidth=1, label="Average Z-score")
plt.axhline(1.0, color='red')
plt.axhline(-1.0, color='green')
plt.xlabel("Date")
plt.ylabel("Z-score")
plt.title("{0} & {1} Z-score".format(ticker_1,ticker_2))
plt.legend()

#### 4.3 Compute indicator moving averages Z-score

To generate **trading signals**, we track the indicator movements and identify the points where it reverts to the mean.

To that end, we compute a specific *z-score* from the following moving averages of the indicator:
* 60 day Moving Average of Indicator
* 5 day Moving Average of Indicator
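
In symbols, and as one reading of the code that follows (the notation $MA_n$ and $\sigma_n$ is introduced here for illustration), the moving-average *z-score* on day $t$ is:

$$ Z_t = \frac{MA_{5}(Ratio)_t - MA_{60}(Ratio)_t}{\sigma_{60}(Ratio)_t} $$
Where:
- $MA_n(Ratio)_t$ is the mean of the last $n$ ratio values up to day $t$
- $\sigma_n(Ratio)_t$ is the standard deviation of the same $n$ values

The first 59 values are undefined, because the 60-day rolling window is not yet full.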

In [None]:
# Set moving average windows
long_window = 60
short_window = 5

In [None]:
# compute long moving average
long_ma_rolling = df['Ratio'].rolling(window=long_window, center=False)
long_mas = long_ma_rolling.mean()
long_ma_std = long_ma_rolling.std()

# compute short moving average
short_ma_rolling = df['Ratio'].rolling(window=short_window, center=False)
short_mas = short_ma_rolling.mean()

# add the strategy indicator long and short moving averages
df['Ratio_long_ma'],df['Ratio_short_ma'] = long_mas, short_mas 

In [None]:
# compute z-score
zscore_mas = (short_mas -long_mas)/long_ma_std
df['Zscore_ma'] = zscore_mas
df

The following code snippet displays the strategy ratio and its long/short moving averages:

In [None]:
plt.figure(figsize=(25, 10))
plt.plot('Date', 'Ratio', data=df, marker='', color='blue', linewidth=1, alpha = 0.6, label="Ratio")
plt.plot('Date', 'Ratio_long_ma', data=df, marker='', color='red', linewidth=1, label="Ratio long-ma")
plt.plot('Date', 'Ratio_short_ma', data=df, marker='', color='green', linewidth=1, label="Ratio short-ma")
plt.legend(['Ratio', 'Ratio long-ma', 'Ratio short-ma'])
plt.ylabel('Ratio')
plt.xlabel('Date')
plt.title(' {0} & {1} Ratio with long/short moving averages: last {2} points'.format(ticker_1,ticker_2,length))
plt.show()

The following code snippet displays the ratio z-score previously computed from the long/short moving averages of the indicator.

In [None]:
plt.figure(figsize=(25, 10))
plt.plot('Date', 'Zscore_ma', data=df, marker='', color='blue', linewidth=1, label="Ratio z-score")
plt.axhline(0, color='black')
plt.axhline(1.0, color='red', linestyle='--')
plt.axhline(-1.0, color='green', linestyle='--')
plt.legend(['Ratio z-score', 'Mean', '+1', '-1'])
plt.ylabel('Z-score')
plt.title(' {0} & {1} ratio Z-Score from moving averages: last {2} points'.format(ticker_1,ticker_2,length))
plt.xlabel('Date')
plt.show()

#### 4.4 Generate trading signals

We now generate **buy/sell trading signals** based on *z-score* movements:
* if *z-score* <= -1 : we *buy* the ratio  
* if *z-score* >= 1 : we *sell* the ratio  

In [None]:
buys = [None] * length
sells = [None] * length

# customize sell and buy signals
for i in range(len(buys)):
    if zscore_mas[i] <= -1:
        buys[i] = ratios[i]
    if zscore_mas[i] >= 1:
        sells[i] = ratios[i]

df['Buy'],df['Sell'] = buys, sells
df
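
As an alternative to the explicit loop, the signals can be derived with `Series.where`. The following is a sketch on a small made-up dataframe (`demo` and its values are hypothetical); unlike the loop above, it marks the absence of a signal with `NaN` rather than `None`:

```python
import numpy as np
import pandas as pd

# Hypothetical ratio and moving-average z-score values (illustrative only)
demo = pd.DataFrame({
    "Ratio": [2.00, 2.10, 1.90, 2.05],
    "Zscore_ma": [np.nan, 1.3, -1.2, 0.4],
})

# Series.where keeps the ratio where the condition holds and NaN elsewhere;
# comparisons against NaN are False, so warm-up rows produce no signal
demo["Buy"] = demo["Ratio"].where(demo["Zscore_ma"] <= -1)
demo["Sell"] = demo["Ratio"].where(demo["Zscore_ma"] >= 1)
print(demo)
```

Points with a `NaN` value are simply skipped when plotted, so the markers appear only on the signal dates.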

### Step 5: Plot buy / sell signals

#### 5.1 Plot buy / sell signals on ratio

In [None]:
plt.figure(figsize=(25, 10))
plt.plot('Date', 'Ratio', data=df, marker='', color='blue', linewidth=1, label="Ratio")
plt.plot('Date', 'Buy', data=df, color='green', linestyle='None', marker='^')
plt.plot('Date', 'Sell', data=df, color='red', linestyle='None', marker='v')
plt.legend(['Ratio', 'Buy Signal', 'Sell Signal'])
plt.ylabel('Ratio')
plt.title('{0} & {1} Buy/sell signals on the ratio: last {2} points'.format(ticker_1,ticker_2,length))
plt.xlabel('Date')
plt.show()

#### 5.2 Plot buy & sell signals on respective instruments
We previously identified the trading signals based on the ratio. We now have to match signals in order to determine which instrument to buy/sell in each case.
Since the ratio was previously defined as Price1/Price2, the decision will be made following the rules below:
* When buying the ratio, you **buy** Instrument 1 and **sell** Instrument 2
* When selling the ratio, you **sell** Instrument 1 and **buy** Instrument 2
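
The two rules above can be sketched as a small lookup function. The function name and the `"buy"`/`"sell"` labels are hypothetical and introduced only for this illustration:

```python
def legs_for_ratio_signal(signal):
    """Map a ratio signal to (instrument 1 action, instrument 2 action).

    `signal` is a hypothetical label, "buy" or "sell", introduced for this sketch.
    """
    if signal == "buy":    # buying the ratio
        return ("buy", "sell")
    if signal == "sell":   # selling the ratio
        return ("sell", "buy")
    raise ValueError("signal must be 'buy' or 'sell'")

print(legs_for_ratio_signal("buy"))
print(legs_for_ratio_signal("sell"))
```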

In [None]:
# match b/s signals on the corresponding instruments
buy_signal = [None] * length
sell_signal= [None] * length

for i in range(length):
    if buys[i] is not None:   # buying the ratio
        buy_signal[i] = prices1[i]    # buy instrument 1
        sell_signal[i] = prices2[i]   # sell instrument 2
    if sells[i] is not None:  # selling the ratio
        sell_signal[i] = prices1[i]   # sell instrument 1
        buy_signal[i] = prices2[i]    # buy instrument 2

df['Buy_signal'], df['Sell_signal'] = buy_signal, sell_signal

In [None]:
plt.figure(figsize=(25, 10))
plt.plot('Date', 'Price_1', data=df, marker='', color='orange', linewidth=1, label="{}".format(ticker_1))
plt.plot('Date', 'Price_2', data=df, marker='', color='purple', linewidth=1, label="{}".format(ticker_2))
plt.plot('Date', 'Buy_signal', data=df, color='green', linestyle='None', marker='^', label="Buy signal")
plt.plot('Date', 'Sell_signal', data=df, color='red', linestyle='None', marker='v', label="Sell signal")
plt.title('{0} & {1} Buy/sell signals on the pair instruments: last {2} points'.format(ticker_1,ticker_2,length))
plt.xlabel('Date')
plt.legend()
plt.show()

### Step 6: Estimate profit
We now apply this strategy to a separate data sample to estimate its profit.

#### 6.1 Prepare the dataset
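Repeat the steps explained above on an earlier data sample. Note that the *z-score* in this section uses the opposite sign convention (long minus short moving average) and is scaled by the short-window standard deviation.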

In [None]:
# define our data sample: points -1000 to -500
length = 500
dates = [datetime(ts.date.year,ts.date.month, ts.date.day ) for ts in response_2.data[-2*length:-length]]
prices1 = [ts.close for ts in response_1.data[-2*length:-length]]
prices2 = [ts.close for ts in response_2.data[-2*length:-length]]
data = {'Date': dates, 'Price_1': prices1, 'Price_2': prices2}
df_test = pd.DataFrame(data=data)

In [None]:
plt.figure(figsize=(15, 8))
plt.plot( 'Date', 'Price_1', data=df_test, marker='', color='orange', linewidth=1, label="{}".format(ticker_1))
plt.plot( 'Date', 'Price_2', data=df_test, marker='', color='purple', linewidth=1, label="{}".format(ticker_2))
plt.xlabel("Date")
plt.ylabel("Price")
plt.title("{0} & {1} price over time".format(ticker_1, ticker_2))
plt.legend()
plt.show()

In [None]:
# compute ratio and add to the dataframe
ratios = [get_ratio(p1,p2) for p1,p2 in zip(prices1,prices2)]
df_test['Ratio'] = ratios

# compute long moving average
long_ma_rolling = df_test['Ratio'].rolling(window=long_window, center=False)
long_mas = long_ma_rolling.mean()

# compute short moving average
short_ma_rolling = df_test['Ratio'].rolling(window=short_window, center=False)
short_mas = short_ma_rolling.mean()
short_ma_std = short_ma_rolling.std()

# add the strategy ratio long and short moving averages
df_test['Ratio_long_ma'],df_test['Ratio_short_ma'] = long_mas, short_mas 

# compute z-score (long minus short moving average, scaled by the short-window std)
zscore_mas = (long_mas - short_mas)/short_ma_std
df_test['Zscore_ma'] = zscore_mas
df_test

#### 6.2 Run the strategy
The following code snippet runs the decision algorithm and exports the result of the strategy to a CSV file.

In [None]:
import csv

empty_entries = False # determines whether or not we show the empty entries in the csv export
filename = 'output/daily_export.csv'
os.makedirs('output', exist_ok=True)
with open(filename, mode='w', newline='') as export_file:
    writer = csv.writer(export_file, delimiter=',', quotechar='"', quoting=csv.QUOTE_MINIMAL)
    
    # write the header row
    writer.writerow(['Date', 'Ratio', 'Action' ,'Instr1_position', 'Instr2_position', 'Profit'])
    
    # initialize parameters:
    total, instr1_count, instr2_count = 0,0,0
    clear_threshold = 0.5

    # iterate the selected sample and apply the pair strategy algorithm
    for i in range(len(ratios)):
        ratio = ratios[i]

        if zscore_mas[i] <-1: 
            # we sell the ratio:
            #  -selling 1 instr1
            #  -buying `ratio` units of instr2
            total += prices1[i] - prices2[i] *ratio
            instr1_count -= 1
            instr2_count += ratio
            writer.writerow(['{0:%Y/%m/%d}'.format(dates[i]), '{0:.3g}'.format(ratio), 'Sold', instr1_count,'{0:.3g}'.format(instr2_count) , '{0:.3g}'.format(total) ])
        elif zscore_mas[i] >1: 
            # we buy the ratio:
            #  -buying 1 instr1
            #  -selling `ratio` units of instr2
            total +=  - prices1[i] + prices2[i] *ratio
            instr1_count += 1
            instr2_count -= ratio
            writer.writerow(['{0:%Y/%m/%d}'.format(dates[i]), '{0:.3g}'.format(ratio), 'Bought', instr1_count, '{0:.3g}'.format(instr2_count), '{0:.3g}'.format(total) ])
        elif abs(zscore_mas[i]) < clear_threshold:
            # clear our current position
            total +=  instr1_count * prices1[i] + instr2_count * prices2[i]
            instr1_count = 0
            instr2_count = 0
            writer.writerow(['{0:%Y/%m/%d}'.format(dates[i]), '{0:.3g}'.format(ratio), 'Cleared', instr1_count, '{0:.3g}'.format(instr2_count), '{0:.3g}'.format(total) ])
        else:
            if empty_entries:
                writer.writerow(['{0:%Y/%m/%d}'.format(dates[i]), '{0:.3g}'.format(ratio), '', instr1_count, '{0:.3g}'.format(instr2_count), '{0:.3g}'.format(total) ])
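
To make the accounting easier to follow, here is a hedged sketch that replays the same decision rules on a two-day, made-up example without the CSV export. The function name `run_pair_strategy` and the input values are hypothetical; the z-scores are assumed to follow the sign convention of step 6.1:

```python
def run_pair_strategy(prices1, prices2, zscores, clear_threshold=0.5):
    """Replay the decision rules of step 6.2 and return the final cash total."""
    total, instr1_count, instr2_count = 0.0, 0.0, 0.0
    for p1, p2, z in zip(prices1, prices2, zscores):
        ratio = p1 / p2
        if z < -1:
            # sell the ratio: sell 1 instr1, buy `ratio` units of instr2
            total += p1 - p2 * ratio
            instr1_count -= 1
            instr2_count += ratio
        elif z > 1:
            # buy the ratio: buy 1 instr1, sell `ratio` units of instr2
            total += -p1 + p2 * ratio
            instr1_count += 1
            instr2_count -= ratio
        elif abs(z) < clear_threshold:
            # close both legs at the current prices
            total += instr1_count * p1 + instr2_count * p2
            instr1_count, instr2_count = 0.0, 0.0
    return total

# Day 1: the ratio is sold at 100/50 = 2.0; day 2: the position is cleared
profit = run_pair_strategy([100.0, 96.0], [50.0, 50.0], [-1.5, 0.0])
print(profit)
```

Each entry is cash-neutral, since one unit of instrument 1 is exchanged for `ratio` units of instrument 2 of equal value, so the total only changes when the position is cleared. Here the ratio falls from 2.0 to 1.92 after being sold, and the sketch reports a profit of 4.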