# Pair strategy using tick data - Python

### Overview
The Pair Trading Strategy is a standard mean reversion model: two stocks that normally trade in the same direction become temporarily uncorrelated and are expected to eventually revert to the mean.

Assume that the strategy will trade pairs of equities of the same industrial sector that are historically highly correlated: 
- (A, B) : the pair
- A : the first instrument of the pair
- B : the second instrument of the pair

### Indicator
Pairs trading is a market-neutral strategy: when we identify a deviation in the price relationship of these instruments, we expect a mean reversion.
We buy the underperforming instrument and simultaneously sell the outperforming one.

To illustrate the price relationship between the pair instruments and to generate trading signals as explained above, we use the following indicator:

$$ Indicator = \frac{\text{Last A}}{\text{Last B}} $$
Where:
- *Last A* is the Last trade price of the stock A 
- *Last B* is the Last trade price of the stock B
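
As a minimal sketch of this definition, the ratio can be computed pairwise for a few last trade prices. The figures below are invented for illustration and are not market data; the names `last_a` and `last_b` are assumptions used only in this example.

```python
# Hypothetical last trade prices, for illustration only (not market data)
last_a = [120.0, 121.5, 119.0]
last_b = [240.0, 243.0, 236.0]

# Indicator = Last A / Last B, computed for each pair of observations
indicator = [a / b for a, b in zip(last_a, last_b)]
print(indicator)
```

A ratio that drifts away from its usual level is what the rest of this notebook tries to detect.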

### Services used
This sample uses *gRPC requests* in order to retrieve tick bars from the dedicated hosted service. The endpoint queried in this script is:
* *TickBarsService*: to directly retrieve tick bars objects from the server

### Modules required
1. Systemathics packages:
    * *systemathics.apis*
2. Open source packages
    * *googleapis-common-protos*
    * *protobuf*
    * *grpcio*
    * *pandas*
    * *matplotlib* as the display package

***

# Run Pair strategy using tick data

### Step 1: Install packages and import them

In [None]:
pip install googleapis-common-protos protobuf grpcio pandas matplotlib systemathics.apis

In [None]:
import os
import grpc
import pandas as pd
from datetime import datetime
from datetime import timedelta
import google.type.date_pb2 as date
import google.type.timeofday_pb2 as timeofday
import google.type.dayofweek_pb2 as dayofweek
import google.protobuf.duration_pb2 as duration
import systemathics.apis.type.shared.v1.identifier_pb2 as identifier
import systemathics.apis.type.shared.v1.constraints_pb2 as constraints
import systemathics.apis.type.shared.v1.date_interval_pb2 as dateinterval
import systemathics.apis.type.shared.v1.time_interval_pb2 as timeinterval
import systemathics.apis.services.tick_analytics.v1.tick_bars_pb2 as tick_bars
import systemathics.apis.services.tick_analytics.v1.tick_bars_pb2_grpc as tick_bars_service

### Step 2: Retrieve authentication token
The following code snippet reads the authentication token from the `AUTH0_TOKEN` environment variable and prints it to the console output, so that it can be attached to the upcoming *gRPC queries*.

In [None]:
token = f"Bearer {os.environ['AUTH0_TOKEN']}"
display(token)
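
If the environment variable is missing, the cell above fails with a bare `KeyError`. A possible variant, sketched below, raises a more readable error instead. The helper name `build_bearer_token` and its parameters are assumptions made for this example and are not part of the Systemathics packages.

```python
import os

def build_bearer_token(env=os.environ, name="AUTH0_TOKEN"):
    """Return the 'Bearer ...' header value, or raise a readable error if the variable is unset."""
    value = env.get(name)
    if not value:
        raise RuntimeError(f"Environment variable {name} is not set")
    return f"Bearer {value}"
```

With this helper, the token could be created with `token = build_bearer_token()`.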

### Step 3: Retrieve prices

#### 3.1 Instrument selection

In [None]:
# set the instruments: tickers and exchange
exchange = "BATS"
ticker_1 = "AAPL"
ticker_2 = "MSFT"

#### 3.2 Tick bars parameters
The following code snippet sets the tick bars request parameters used to retrieve tick prices:

In [None]:
# set the bar duration
sampling = 1 * 60

# set the bar calculation field
field = tick_bars.BAR_PRICE_TRADE 

#### 3.3 Time period selection

In [None]:
# create time intervals (we are using Google date format)
date_interval = dateinterval.DateInterval(
    start_date = date.Date(year = 2021, month = 3, day = 5), 
    end_date = date.Date(year = 2021, month = 3, day = 5)
)

# build the market data request time interval (we are using Google time format)
# UTC time zone
time_interval = timeinterval.TimeInterval(
    start_time = timeofday.TimeOfDay(hours = 0, minutes = 0, seconds = 0), 
    end_time = timeofday.TimeOfDay(hours = 21, minutes = 0, seconds = 0)
)

#### 3.4 Request creation
The following code snippets create the *gRPC client*, process the *tick bars* requests and stream the replies:

In [None]:
# generate constraints based on the previous time selection
# (stored as request_constraints so that the imported constraints module is not shadowed)
request_constraints = constraints.Constraints(
    date_intervals = [date_interval],
    time_intervals = [time_interval],
)

In [None]:
# create tick bars requests for the pair instruments
request_1 = tick_bars.TickBarsRequest(
    identifier = identifier.Identifier(exchange = exchange, ticker = ticker_1),
    constraints = request_constraints,
    sampling = duration.Duration(seconds = sampling),
    field = field
)

request_2 = tick_bars.TickBarsRequest(
    identifier = identifier.Identifier(exchange = exchange, ticker = ticker_2),
    constraints = request_constraints,
    sampling = duration.Duration(seconds = sampling),
    field = field
)

In [None]:
# open a gRPC channel, instantiate the tick bars service and get the reply for the 1st instrument
with open(os.environ['SSL_CERT_FILE'], 'rb') as f:
    credentials = grpc.ssl_channel_credentials(f.read())
with grpc.secure_channel(os.environ['GRPC_APIS'], credentials) as channel:
    service = tick_bars_service.TickBarsServiceStub(channel)
    bars_1 = []
    for bar in service.TickBars(request = request_1, metadata = [('authorization', token)]):
        bars_1.append(bar)

In [None]:
# open a gRPC channel, instantiate the tick bars service and get the reply for the 2nd instrument
with open(os.environ['SSL_CERT_FILE'], 'rb') as f:
    credentials = grpc.ssl_channel_credentials(f.read())
with grpc.secure_channel(os.environ['GRPC_APIS'], credentials) as channel:
    service = tick_bars_service.TickBarsServiceStub(channel)
    bars_2 = []
    for bar in service.TickBars(request = request_2, metadata = [('authorization', token)]):
        bars_2.append(bar)

In [None]:
print("Total bars retrieved: ",len(bars_1))

In [None]:
print("Total bars retrieved: ",len(bars_2))

#### 3.5 Store prices and timestamps
The following code snippet reprocesses the outputs of the requests and stores them in a *pandas* dataframe:

In [None]:
# create pandas dataframe to store close prices for the pair instruments
length = len(bars_1)
dates = [datetime.fromtimestamp(b.time_stamp.seconds) for b in bars_1]
prices1 = [b.close for b in bars_1]
prices2 = [b.close for b in bars_2]
data = {'Date': dates, 'Price_1': prices1, 'Price_2': prices2}
df = pd.DataFrame(data=data)
df
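
The cell above assumes that both requests return the same number of bars with matching timestamps; if the two lists differ in length, `pd.DataFrame` raises an error. Should that happen (for instance, if one instrument has no bar for a given interval), one option is to align the two series on their timestamps first. The sketch below uses invented `(epoch seconds, close)` pairs rather than real bars, and the names `series_1`, `series_2` and `aligned` are assumptions for this example.

```python
import pandas as pd

# Hypothetical (epoch seconds, close) pairs; the second series has no bar at 60 s
series_1 = [(0, 10.0), (60, 10.5), (120, 11.0)]
series_2 = [(0, 20.0), (120, 22.0)]

left = pd.DataFrame(series_1, columns=['Seconds', 'Price_1'])
right = pd.DataFrame(series_2, columns=['Seconds', 'Price_2'])

# an inner join keeps only the timestamps present in both series
aligned = left.merge(right, on='Seconds', how='inner')
aligned['Date'] = pd.to_datetime(aligned['Seconds'], unit='s', utc=True)
print(aligned)
```

An inner join drops unmatched bars; forward-filling the missing price would be an alternative, at the cost of comparing a fresh price with a stale one.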

#### 3.6 Visualize tick prices

In [None]:
import matplotlib.pyplot as plt

In [None]:
fig,ax = plt.subplots(1,1,figsize=(25,10))
ax.plot( 'Date', 'Price_1', data=df, marker='', color='orange', linewidth=1, label="{}".format(ticker_1))

# twin x-axis for two different y-axis
ax2=ax.twinx()
ax2.plot( 'Date', 'Price_2', data=df, marker='', color='purple', linewidth=1, label="{}".format(ticker_2))

# set graph title and axis label
ax.set_xlabel("Date",fontsize=14)
ax.set_ylabel("{}".format(ticker_1),color="orange",fontsize=14)
ax2.set_ylabel("{}".format(ticker_2),color="purple",fontsize=14)
plt.title("{0} & {1} tick close prices over time".format(ticker_1, ticker_2))
plt.legend()
plt.show()

### Step 4: Generate buy/sell signals

#### 4.1 Compute stock indicator

In [None]:
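import math
# define the strategy indicator, equals Price1/Price2
def get_indicator(p1,p2):
    # a zero denominator has no meaningful ratio: return NaN so that
    # pandas ignores the point in the mean and standard deviation
    if p2 == 0:
        return math.nan
    return p1/p2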

In [None]:
# Compute ratio and add to the dataframe
indicators = [get_indicator(p1,p2) for p1,p2 in zip(prices1,prices2)]
df['Indicator'] = indicators
df

In [None]:
# display the indicator and its mean over the time
indicator_mean = [df['Indicator'].mean()] * len(df)
plt.figure(figsize=(25, 10))
plt.plot('Date', 'Indicator', data=df, marker='', color='blue', linewidth=1, label="Indicator")
plt.plot(df['Date'], indicator_mean, marker='', color='black', linewidth=1, label="Average indicator")
plt.xlabel("Date")
plt.ylabel("Indicator")
plt.title("{0}/{1} indicator".format(ticker_1,ticker_2))
plt.legend()

#### 4.2 Compute indicator Z-score
A *z-score* is the number of standard deviations a datapoint is from the mean. In the following code snippets, we will compute the *z-score* for the strategy indicator. 

In [None]:
# define a method to compute z-score
def get_zscore(value,std,mean):
    return (value - mean) / std

In [None]:
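# compute the z-score for the strategy indicator (mean and standard deviation are computed once)
indicator_avg, indicator_std = df['Indicator'].mean(), df['Indicator'].std()
zscores = [get_zscore(i, indicator_std, indicator_avg) for i in indicators]
df['Zscore'] = zscores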
df
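
As a quick sanity check of the z-score definition, the sketch below applies the same formula to a toy series. The values are invented; note that *pandas* uses the sample standard deviation (`ddof=1`) by default.

```python
import pandas as pd

values = pd.Series([1.0, 2.0, 3.0])  # toy data, not market data

# z-score = (value - mean) / standard deviation, with pandas' default ddof=1
toy_zscores = (values - values.mean()) / values.std()
print(toy_zscores.tolist())
```

The middle value sits exactly on the mean, and the two others are one standard deviation below and above it.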

In the following code snippet, we plot the *z-score*. We notice that it tends to revert to the mean as soon as it moves above or below the thresholds: +1 and -1.

In [None]:
# display zscore and zscore_mean
zscore_means = [df['Zscore'].mean()] * len(df)
plt.figure(figsize=(25, 10))
plt.plot('Date', 'Zscore', data=df, marker='', color='blue', linewidth=1, label="Z-score")
plt.plot(df['Date'],zscore_means, marker='', color='black', linewidth=1, label="Average Z-score")
plt.axhline(1.0, color='red')
plt.axhline(-1.0, color='green')
plt.xlabel("Date")
plt.ylabel("Z-score")
plt.title("{0}/{1} Z-score".format(ticker_1,ticker_2))
plt.legend()

#### 4.3 Compute indicator moving averages Z-score

To generate **trading signals**, we will track indicator movements and identify points where it reverts to the mean.

To that end, we will compute a specific *z-score* from two moving averages of the indicator. Since each row is a 1-minute bar, the windows are counted in bars:
* 60-bar Moving Average of Indicator (long window)
* 5-bar Moving Average of Indicator (short window)

In [None]:
# compute long moving average
long_ma_rolling = df['Indicator'].rolling(window=60, center=False)
long_mas = long_ma_rolling.mean()
long_ma_std = long_ma_rolling.std()

# compute short moving average
short_ma_rolling = df['Indicator'].rolling(window=5, center=False)
short_mas = short_ma_rolling.mean()

In [None]:
# add the strategy indicator long and short moving averages
df['Indicator_long_ma'],df['Indicator_short_ma'] = long_mas, short_mas 

In [None]:
# compute the z-score from the short and long moving averages
zscore_mas = (short_mas - long_mas)/long_ma_std
df['Zscore_ma'] = zscore_mas
df
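
To see how the rolling windows behave, the sketch below applies the same computation to a toy series. The windows are shortened to 2 and 4 (an assumption made for readability, instead of 5 and 60), and the values are invented.

```python
import pandas as pd

ratio = pd.Series([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])  # toy values, not market data

short_window, long_window = 2, 4  # shortened windows, for illustration only
long_roll = ratio.rolling(window=long_window)
short_ma = ratio.rolling(window=short_window).mean()

# same formula as above: (short average - long average) / long rolling std
toy_zscore_ma = (short_ma - long_roll.mean()) / long_roll.std()

# the first long_window - 1 entries are NaN because the long window is not yet full
print(toy_zscore_ma)
```

With a 60-bar long window, the first 59 rows of `Zscore_ma` are therefore empty and cannot produce any signal.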

The following code snippet displays the strategy indicator and its long/short moving averages:

In [None]:
plt.figure(figsize=(25, 10))
plt.plot('Date', 'Indicator', data=df, marker='', color='blue', linewidth=1, alpha = 0.6, label="Indicator")
plt.plot('Date', 'Indicator_long_ma', data=df, marker='', color='red', linewidth=1, label="Indicator long-ma")
plt.plot('Date', 'Indicator_short_ma', data=df, marker='', color='green', linewidth=1, label="Indicator short-ma")
plt.legend(['Indicator', 'Indicator long-ma', 'Indicator short-ma'])
plt.ylabel('Indicator')
plt.xlabel('Date')
plt.title(' {0}/{1} Indicator with long/short moving averages'.format(ticker_1,ticker_2))
plt.show()

The following code snippet displays the strategy indicator z-score previously computed using the long/short indicator moving averages.

In [None]:
plt.figure(figsize=(25, 10))
plt.plot('Date', 'Zscore_ma', data=df, marker='', color='blue', linewidth=1, label="Indicator z-score")
plt.axhline(0, color='black')
plt.axhline(1.0, color='red', linestyle='--')
plt.axhline(-1.0, color='green', linestyle='--')
plt.legend(['Indicator z-score', 'Mean', '+1', '-1'])
plt.ylabel('Z-score')
plt.title('{0}/{1} indicator Z-Score from moving averages'.format(ticker_1,ticker_2))
plt.xlabel('Date')
plt.show()

#### 4.4 Generate trading signals

We now generate **buy/sell trading signals** based on *z-score* movements:
* if *z-score* ≤ -1 : we *buy* the ratio  
* if *z-score* ≥ 1 : we *sell* the ratio  

In [None]:
buys = [None] * length
sells = [None] * length

# Customize sell and buy signals
for i in range(len(buys)):
    if zscore_mas[i] <= -1:
        buys[i] = indicators[i]
    if zscore_mas[i] >= 1:
        sells[i] = indicators[i]

df['Buy'],df['Sell'] = buys, sells
df
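
The same rules can also be expressed without an explicit loop. The sketch below uses `Series.where` on toy data; the series names and values are assumptions for this example. Unlike the loop above, unmatched rows hold `NaN` rather than `None`.

```python
import pandas as pd

toy_indicator = pd.Series([0.50, 0.52, 0.48, 0.51])      # toy ratio values
toy_zscore = pd.Series([float('nan'), 1.2, -1.5, 0.3])   # toy z-scores

# keep the indicator value where the rule fires, NaN elsewhere
toy_buy = toy_indicator.where(toy_zscore <= -1)
toy_sell = toy_indicator.where(toy_zscore >= 1)
print(pd.DataFrame({'Buy': toy_buy, 'Sell': toy_sell}))
```

Comparisons against `NaN` are false, so rows where the z-score is not yet available never trigger a signal.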

### Step 5: Plot buy/sell signals

#### 5.1 Plot buy/sell signals on ratio

In [None]:
plt.figure(figsize=(25, 10))
plt.plot('Date', 'Indicator', data=df, marker='', color='blue', linewidth=1, label="Indicator")
plt.plot('Date', 'Buy', data=df, color='green', linestyle='None', marker='^')
# downward marker so that sell signals are distinguishable from buy signals
plt.plot('Date', 'Sell', data=df, color='red', linestyle='None', marker='v')
plt.legend(['Indicator', 'Buy Signal', 'Sell Signal'])
plt.ylabel('Indicator')
plt.title('{0}/{1} Buy/sell signals on the indicator'.format(ticker_1,ticker_2))
plt.xlabel('Date')
plt.show()

#### 5.2 Plot buy & sell signals on respective instruments
We previously identified the trading signals based on the ratio. We now have to match signals in order to determine which instrument to buy/sell in each case.
Since the ratio was previously defined as Price1/Price2, the decision will be made following the rules below:
* When buying the ratio, you **buy** Instrument 1 and **sell** Instrument 2
* When selling the ratio, you **sell** Instrument 1 and **buy** Instrument 2

In [None]:
# Match b/s signals on the corresponding instruments
buy_1, buy_2, sell_1, sell_2 = [None] * length, [None] * length, [None] * length, [None] * length

for i in range(length):
    if buys[i] is not None:  # buying the ratio
        buy_1[i] = prices1[i]    #buy instrument1
        sell_2[i] = prices2[i]   #sell instrument2
    if sells[i] is not None: # selling the ratio
        sell_1[i] = prices1[i]   #sell instrument1
        buy_2[i] = prices2[i]    #buy instrument2

df['Buy_1'], df['Sell_1'], df['Buy_2'], df['Sell_2'] = buy_1, sell_1, buy_2, sell_2
df
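
The mapping from a ratio signal to the two legs can be summarised in a small function. This is only a sketch of the rules stated above; the name `legs_for_signal` and the dictionary keys are assumptions for this example.

```python
def legs_for_signal(signal):
    """Map a ratio signal ('buy' or 'sell') to the action taken on each leg."""
    if signal == 'buy':    # buying Price1/Price2: long instrument 1, short instrument 2
        return {'instrument_1': 'buy', 'instrument_2': 'sell'}
    if signal == 'sell':   # selling Price1/Price2: short instrument 1, long instrument 2
        return {'instrument_1': 'sell', 'instrument_2': 'buy'}
    return {}              # no signal, no trade


print(legs_for_signal('buy'))
```

Both legs are always traded together, which is what keeps the position market-neutral.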

In [None]:
fig,ax = plt.subplots(1,1,figsize=(25,10))
ax.plot( 'Date', 'Price_1', data=df, marker='', color='orange', linewidth=1, label="{}".format(ticker_1))
ax.plot('Date', 'Buy_1', data=df, color='green', linestyle='None', marker='^', label="Buy {0}".format(ticker_1))
ax.plot('Date', 'Sell_1', data=df, color='red', linestyle='None', marker='^', label="Sell {0}".format(ticker_1))

# twin x-axis for two different y-axis
ax2=ax.twinx()
ax2.plot( 'Date', 'Price_2', data=df, marker='', color='purple', linewidth=1, label="{}".format(ticker_2))
ax2.plot('Date', 'Buy_2', data=df, color='green', linestyle='None', marker='^', label="Buy {0}".format(ticker_2))
ax2.plot('Date', 'Sell_2', data=df, color='red', linestyle='None', marker='^', label="Sell {0}".format(ticker_2))

# set graph title and axis label
ax.set_xlabel("Date",fontsize=14)
ax.set_ylabel("{}".format(ticker_1),color="orange",fontsize=14)
ax2.set_ylabel("{}".format(ticker_2),color="purple",fontsize=14)
plt.title("{0} & {1} tick close prices with buy/sell signals".format(ticker_1, ticker_2))
ax.legend(loc='upper left')
ax2.legend(loc='lower right')
plt.show()