# A benchmark Pairs Trading strategy 

This notebook explores a pairs trading strategy using Bollinger Bands. This project is being developed as part of a master's thesis for the degree in Electrical and Computer Engineering.

**Author:** Simão Moraes Sarmento <br /> 
**Contact:** simaosarmento@hotmail.com

## Dependencies

This notebook requires code from:

Python files:
- `class_SeriesAnalyser.py` - contains a set of functions to deal with time series analysis.
- `class_Trader.py` - contains a set of functions concerning trading strategies.
- `class_DataProcessor.py` - contains a set of functions for data preprocessing.

Pickle files:
- pickle file containing pairs to be traded (obtained from running `PairsTrading_CommodityETFs-Clustering.ipynb`)

As a good practice, this notebook only demonstrates how the different trading strategies are applied to different datasets, rather than implementing the strategies themselves. Please refer to the files mentioned above for details on how the functions are built.

### Import Libraries

In [1]:
import numpy as np
import pandas as pd
import pickle

import json

import statsmodels
import statsmodels.api as sm
from statsmodels.tsa.stattools import coint, adfuller

import matplotlib.pyplot as plt
import matplotlib.cm as cm

# Import Datetime and the Pandas DataReader
from datetime import datetime
from pandas_datareader import data, wb

# Import alpha vantage
from alpha_vantage.timeseries import TimeSeries

# Import scikit instruments
from sklearn.cluster import DBSCAN
from sklearn.decomposition import PCA
from sklearn.manifold import TSNE
from sklearn import preprocessing
from sklearn.metrics import silhouette_score

# just set the seed for the random number generator
np.random.seed(107)

### Import Configurations

In [2]:
config_file = 'config/config_commodities_2010_2019.json'

In [3]:
with open(config_file, 'r') as f:
    config = json.load(f)

In [4]:
with open(config['dataset']['ticker_segment_dict'], 'rb') as handle:
    ticker_segment_dict = pickle.load(handle)

### Import Classes

In [38]:
%load_ext autoreload
%aimport class_SeriesAnalyser, class_Trader, class_DataProcessor
%autoreload 1

The autoreload extension is already loaded. To reload it, use:
  %reload_ext autoreload


In [39]:
series_analyser = class_SeriesAnalyser.SeriesAnalyser()
trader = class_Trader.Trader()
data_processor = class_DataProcessor.DataProcessor()

# Retrieve prices data set

We start by loading the prices from a DataFrame saved in a pickle file, which was previously processed in the `PairsTrading_CommodityETFS_Datapreprocessing.ipynb` notebook.

In [40]:
# intraday
df_prices = pd.read_pickle('data/etfs/pickle/commodity_ETFs_from_2014_complete.pickle')

In [92]:
# split data in training and test
df_prices_train, df_prices_test = data_processor.split_data(df_prices,
                                                            ('01-01-2012',
                                                             '31-12-2014'),
                                                            ('01-01-2015',
                                                             '31-12-2015'),
                                                            remove_nan=True)
train_val_split = '01-01-2014'

Total of 116 tickers
Total of 95 tickers after removing tickers with Nan values
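
The split above can be illustrated with a minimal sketch. The helper `split_by_date` below is hypothetical and is not the implementation in `class_DataProcessor.py`; it assumes day-first date strings, a `DatetimeIndex`, and that `remove_nan=True` drops every ticker with a missing value in either period.

```python
import numpy as np
import pandas as pd

def split_by_date(df, train_range, test_range, remove_nan=True):
    """Hypothetical stand-in for a date-based train/test split."""
    fmt = "%d-%m-%Y"  # assumption: day-first strings such as '31-12-2014'

    def window(date_range):
        start = pd.to_datetime(date_range[0], format=fmt)
        # add one day so intraday rows on the end date are included
        end = pd.to_datetime(date_range[1], format=fmt) + pd.Timedelta(days=1)
        return df[(df.index >= start) & (df.index < end)]

    train, test = window(train_range), window(test_range)
    if remove_nan:
        # keep only tickers that are complete over both periods
        keep = pd.concat([train, test]).dropna(axis=1).columns
        train, test = train[keep], test[keep]
    return train, test

# toy data: ticker 'BBB' has a gap and is therefore removed
idx = pd.date_range("2014-12-30", "2015-01-02", freq="D")
toy = pd.DataFrame({"AAA": [1.0, 2.0, 3.0, 4.0],
                    "BBB": [1.0, np.nan, 3.0, 4.0]}, index=idx)
toy_train, toy_test = split_by_date(toy,
                                    ("01-01-2012", "31-12-2014"),
                                    ("01-01-2015", "31-12-2015"))
```

Filtering tickers on the union of both periods keeps the same columns in the training and test frames, so a pair formed in training can always be traded in testing.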


In [42]:
len(df_prices_train)+len(df_prices_test)

77916

# Load Pairs

In [43]:
# intra day
#with open('data/etfs/pickle/pairs_unfiltered_intraday.pickle', 'rb') as handle:
#    pairs = pickle.load(handle)

#with open('data/etfs/pickle/pairs_category_intraday.pickle', 'rb') as handle:
with open('data/etfs/pickle/2012-2016/pairs_category_intraday.pickle', 'rb') as handle:
    pairs = pickle.load(handle)

#with open('data/etfs/pickle/2014-2018/pairs_unsupervised_learning_intraday.pickle', 'rb') as handle:
#    pairs = pickle.load(handle)
    
# interday  
#with open('data/etfs/pickle/pairs_unfiltered_interday.pickle', 'rb') as handle:
#    pairs = pickle.load(handle)

#with open('data/etfs/pickle/pairs_category_interday.pickle', 'rb') as handle:
#    pairs = pickle.load(handle)

#with open('data/etfs/pickle/pairs_unsupervised_learning_interday.pickle', 'rb') as handle:
#    pairs = pickle.load(handle)

In [44]:
# lookback_multiplier = config['trading']['lookback_multiplier']
entry_multiplier = config['trading']['entry_multiplier']
exit_multiplier = config['trading']['exit_multiplier']
# obtain trading filter info
if config['trading_filter']['active'] == 1:
    trading_filter = config['trading_filter']
else:
    trading_filter = None

In [45]:
len(pairs)

59

In [46]:
# intraday
n_years_test = 1

1

## Applying Fixed Beta

In [81]:
train_results_without_costs, train_results_with_costs, performance_threshold_train = \
                 trader.apply_trading_strategy(pairs, 
                                                'fixed_beta',
                                                2,#entry_multiplier,
                                                0,#exit_multiplier,
                                                test_mode=False,
                                                train_val_split=train_val_split
                                               )

sharpe_results_threshold_train_nocosts, cum_returns_threshold_train_nocosts = train_results_without_costs
sharpe_results_threshold_train_w_costs, cum_returns_threshold_train_w_costs = train_results_with_costs

Pair: 59/59
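
For readers unfamiliar with the idea, here is a rough sketch of what a fixed-beta rule might look like. It assumes that `'fixed_beta'` means a single hedge ratio estimated once by ordinary least squares, with entries triggered when the standardized spread crosses the entry multiplier; the actual logic lives in `class_Trader.py` and may differ. The function name and the synthetic data are illustrative only.

```python
import numpy as np

def fixed_beta_signals(y, x, entry=2.0):
    """Estimate one constant hedge ratio and flag threshold crossings."""
    # OLS fit y = beta * x + alpha; np.polyfit returns [slope, intercept]
    beta, alpha = np.polyfit(x, y, 1)
    spread = y - beta * x
    # standardize with full-sample statistics (a simplification:
    # a real backtest would estimate them on the formation period only)
    z = (spread - spread.mean()) / spread.std()
    long_entry = z < -entry   # spread unusually low: buy y, sell x
    short_entry = z > entry   # spread unusually high: sell y, buy x
    return beta, z, long_entry, short_entry

# synthetic cointegrated pair with a true hedge ratio of 1.5
rng = np.random.default_rng(107)
x = 50 + np.cumsum(rng.normal(0, 1, 500))
y = 1.5 * x + rng.normal(0, 1, 500)
beta, z, long_entry, short_entry = fixed_beta_signals(y, x)
```

With an entry multiplier of 2 and an exit multiplier of 0, as passed in the cell above, a position would be opened at two standard deviations and closed when the spread returns to its mean.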

In [83]:
cum_returns_threshold_train_w_costs = np.asarray(cum_returns_threshold_train_w_costs)
profitable_pairs_indices = np.argwhere(cum_returns_threshold_train_w_costs > 0)
profitable_pairs = [pairs[i] for i in profitable_pairs_indices.flatten()]

results_without_costs, results_with_costs, performance_threshold_test = trader.apply_trading_strategy(
                                                                                             profitable_pairs,
                                                                                             'fixed_beta',
                                                                                             2,
                                                                                             0,
                                                                                             test_mode=True)
sharpe_results_threshold_test_nocosts, cum_returns_threshold_test_nocosts = results_without_costs
sharpe_results_threshold_test_w_costs, cum_returns_threshold_test_w_costs = results_with_costs

Pair: 41/41
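
The pair selection step in the cell above relies on `np.argwhere`, which returns a column of indices that must be flattened before indexing a Python list. A toy example with made-up pair names and returns:

```python
import numpy as np

toy_pairs = ["pair_a", "pair_b", "pair_c", "pair_d"]   # illustrative names
toy_cum_returns = np.asarray([3.2, -1.1, 0.0, 0.7])    # illustrative returns

# argwhere yields shape (n_selected, 1); flatten() turns it into a 1-D index
toy_indices = np.argwhere(toy_cum_returns > 0)
toy_selected = [toy_pairs[i] for i in toy_indices.flatten()]

# np.flatnonzero gives the same indices directly as a 1-D array
same_indices = np.flatnonzero(toy_cum_returns > 0)
```

Note that the strict inequality excludes pairs with a return of exactly zero, which here includes pairs that never traded during the training period.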

In [84]:
_, _, _, _ = trader.calculate_metrics(sharpe_results_threshold_test_w_costs, cum_returns_threshold_test_w_costs,
                                      n_years_test)

Average result:  0.7761166451995087
avg_annual_roi:  11.829184590791808
82.92682926829268 % of the pairs had positive returns
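
As a rough guide to reading these numbers, the sketch below computes comparable summary statistics. It is an assumption about what such a summary could contain, not the code of `trader.calculate_metrics`: it takes the first figure to be the mean Sharpe ratio, treats cumulative returns as percentages, and annualizes the mean return by compounding. The function name `summarize` is hypothetical.

```python
import numpy as np

def summarize(sharpe_ratios, cum_returns_pct, n_years):
    """Hypothetical portfolio summary over a list of pairs."""
    sharpe = np.asarray(sharpe_ratios, dtype=float)
    cum = np.asarray(cum_returns_pct, dtype=float)
    avg_sharpe = sharpe.mean()
    avg_total_roi = cum.mean()
    # assumption: annualize by compounding the average total return
    avg_annual_roi = ((1 + avg_total_roi / 100) ** (1 / n_years) - 1) * 100
    # share of pairs whose cumulative return is strictly positive
    pct_positive = 100 * (cum > 0).mean()
    return avg_sharpe, avg_annual_roi, pct_positive

toy_sharpe, toy_annual_roi, toy_pct_positive = summarize(
    [1.0, 0.0, -0.5, 1.5], [10.0, 0.0, -5.0, 15.0], n_years=1)
```

When `n_years` is 1, as in this notebook, the annualized figure equals the average total return.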


## Applying the Bollinger Bands strategy

In [102]:
train_results_without_costs, train_results_with_costs, performance_train = trader.apply_trading_strategy(pairs,
                                                                               'bollinger_bands', 
                                                                                2,
                                                                                0,
                                                                                test_mode=False,
                                                                                train_val_split=train_val_split)
sharpe_results_train_nocosts, cum_returns_train_nocosts = train_results_without_costs
sharpe_results_train_w_costs, cum_returns_train_w_costs = train_results_with_costs

Pair: 1/59  9.469617152412368
Pair: 2/59  2.7658509616273363
Pair: 3/59  0.0
Pair: 4/59  0.0
Pair: 5/59  0.0
Pair: 6/59  0.0
Pair: 7/59  2.3878462253870802
Pair: 8/59  0.0
Pair: 9/59  -0.39276116571623776
Pair: 10/59  0.0
Pair: 11/59  15.434249635605202
Pair: 12/59  16.77592200018627
Pair: 13/59  26.418818004831923
Pair: 14/59  0.0
Pair: 15/59  0.0
Pair: 16/59  0.0
Pair: 17/59  0.0
Pair: 18/59  0.0
Pair: 19/59  0.0
Pair: 20/59  0.0
Pair: 21/59  0.0
Pair: 22/59  10.722478248041734
Pair: 23/59  0.0
Pair: 24/59  2.905178047719259
Pair: 25/59  1.673424420699554
Pair: 26/59  8.920524070388659
Pair: 27/59  0.0
Pair: 28/59  3.5323465084066585
Pair: 29/59  0.0
Pair: 30/59  0.0
Pair: 31/59  0.5948415250237593
Pair: 32/59  0.0
Pair: 33/59  0.0
Pair: 34/59  0.0
Pair: 35/59  0.0
Pair: 36/59  -4.303261303710659
Pair: 37/59  -2.304652311049016
Pair: 38/59  0.0
Pair: 39/59  0.0
Pair: 40/59  1.8839616274230142
Pair: 41/59  0.0
Pair: 42/59  14.494311950538652
Pair: 43/59  0.0
Pair: 44/59  1.9858208124466925
Pair: 45/59  0.0
Pair: 46/59  0.0
Pair: 47/59  0.0
Pair: 48/59  3.329730135832465
Pair: 49

In [105]:
cum_returns_train_w_costs = np.asarray(cum_returns_train_w_costs)
profitable_pairs_indices = np.argwhere(cum_returns_train_w_costs > 0)
profitable_pairs = [pairs[i] for i in profitable_pairs_indices.flatten()]

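# NOTE: all `pairs` are evaluated below; `profitable_pairs` computed above is
# not used here, unlike in the fixed-beta section.
results_without_costs, results_with_costs, performance_test = trader.apply_trading_strategy(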
                                                                                             pairs,
                                                                                             'bollinger_bands',
                                                                                             2,
                                                                                             0,
                                                                                             test_mode=True)
sharpe_results_test_nocosts, cum_returns_test_nocosts = results_without_costs
sharpe_results_test_w_costs, cum_returns_test_w_costs = results_with_costs

Pair: 1/59  0.0
Pair: 2/59  0.0
Pair: 3/59  -2.125164750794084
Pair: 4/59  3.7662084822359443
Pair: 5/59  0.0
Pair: 6/59  -0.5311165312349431
Pair: 7/59  0.0
Pair: 8/59  -6.7867730036161085
Pair: 9/59  6.046364931995263
Pair: 10/59  0.0
Pair: 11/59  0.0
Pair: 12/59  0.0
Pair: 13/59  0.0
Pair: 14/59  0.0
Pair: 15/59  0.0
Pair: 16/59  0.0
Pair: 17/59  0.0
Pair: 18/59  0.0
Pair: 19/59  0.0
Pair: 20/59  0.0
Pair: 21/59  0.0
Pair: 22/59  0.0
Pair: 23/59  0.0
Pair: 24/59  0.0
Pair: 25/59  0.0
Pair: 26/59  0.0
Pair: 27/59  0.0
Pair: 28/59  -0.6909591163512308
Pair: 29/59  0.0
Pair: 30/59  0.0
Pair: 31/59  0.8677545643140938
Pair: 32/59  0.0
Pair: 33/59  0.0
Pair: 34/59  0.0
Pair: 35/59  0.0
Pair: 36/59  0.0
Pair: 37/59  0.0
Pair: 38/59  19.383006262558556
Pair: 39/59  0.0
Pair: 40/59  0.0
Pair: 41/59  0.0
Pair: 42/59  0.0
Pair: 43/59  0.0
Pair: 44/59  0.0
Pair: 45/59  0.0
Pair: 46/59  0.0
Pair: 47/59  0.0
Pair: 48/59  -2.352708412362836
Pair: 49/59  0.0
Pair: 50/59  0.0
Pair: 51/59  0.0
Pair:

In [106]:
_, _, _, _ = trader.calculate_metrics(sharpe_results_test_w_costs, cum_returns_test_w_costs,
                                      n_years_test)

Average result:  0.0
avg_annual_roi:  0.2977996168300301
8.474576271186441 % of the pairs had positive returns
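
To make the Bollinger Bands rule concrete, here is a minimal sketch under stated assumptions: the bands are a rolling mean plus or minus `entry` rolling standard deviations of the spread, a position is opened when the spread leaves the bands, and it is closed when the spread returns to the level set by `exit_` (the rolling mean when it is 0). The function name, the `lookback` parameter, and the toy spread are illustrative; the strategy in `class_Trader.py` may be built differently.

```python
import numpy as np
import pandas as pd

def bollinger_positions(spread, lookback=20, entry=2.0, exit_=0.0):
    """Return +1 (long spread), -1 (short spread) or 0 (flat) per step."""
    spread = pd.Series(spread, dtype=float)
    # rolling statistics include the current observation
    mean = spread.rolling(lookback).mean()
    std = spread.rolling(lookback).std()
    z = ((spread - mean) / std).to_numpy()

    positions = np.zeros(len(z))
    current = 0
    for t, zt in enumerate(z):
        # NaN during warm-up fails every comparison, so we stay flat
        if current == 0:
            if zt > entry:
                current = -1      # above upper band: short the spread
            elif zt < -entry:
                current = 1       # below lower band: long the spread
        elif current == 1 and zt >= -exit_:
            current = 0           # long closed once the spread reverts
        elif current == -1 and zt <= exit_:
            current = 0           # short closed once the spread reverts
        positions[t] = current
    return positions

# toy spread: a quiet oscillation, one spike, then a reversion
toy_spread = np.array([1.0, -1.0] * 15 + [10.0, -1.0])
toy_positions = bollinger_positions(toy_spread, lookback=10)
```

In the toy series the spike at index 30 opens a short position, which is closed one step later when the spread falls back below its rolling mean. A pair whose spread never leaves the bands produces no trades, which would be consistent with the many 0.0 returns printed above.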


## Applying the Kalman filter

In [97]:
train_results_without_costs, train_results_with_costs, performance_train = trader.apply_trading_strategy(pairs,
                                                                               'kalman_filter', 
                                                                                2,
                                                                                0,
                                                                                test_mode=False,
                                                                                train_val_split=train_val_split
                                                                                )
sharpe_results_train_nocosts, cum_returns_train_nocosts = train_results_without_costs
sharpe_results_train_w_costs, cum_returns_train_w_costs = train_results_with_costs

Pair: 59/59

In [98]:
results_without_costs, results_with_costs, performance_test = trader.apply_trading_strategy(pairs,
                                                                                    'kalman_filter',
                                                                                     2,
                                                                                     0,
                                                                                     test_mode=True)
sharpe_results_test_nocosts, cum_returns_test_nocosts = results_without_costs
sharpe_results_test_w_costs, cum_returns_test_w_costs = results_with_costs

Pair: 59/59

In [99]:
_, _, _, _ = trader.calculate_metrics(sharpe_results_test_w_costs, cum_returns_test_w_costs,
                                      n_years_test)

Average result:  0.0
avg_annual_roi:  -21.003326213586803
10.169491525423728 % of the pairs had positive returns
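
For context, a Kalman filter is commonly used in pairs trading to track a hedge ratio that drifts over time. The sketch below is a scalar version under simplifying assumptions: the hedge ratio follows a random walk, there is no intercept state, and the noise variances `q` and `r` are hand-picked. It is not the `'kalman_filter'` strategy from `class_Trader.py`, and all names are illustrative.

```python
import numpy as np

def kalman_beta(y, x, q=1e-5, r=1e-3, beta0=0.0, p0=1.0):
    """Track a time-varying hedge ratio in y_t = beta_t * x_t + noise."""
    betas = np.empty(len(y))
    beta, p = beta0, p0
    for t in range(len(y)):
        p = p + q                    # predict: random-walk state, variance grows
        e = y[t] - beta * x[t]       # innovation (prediction error)
        s = x[t] * p * x[t] + r      # innovation variance
        k = p * x[t] / s             # Kalman gain
        beta = beta + k * e          # update the state estimate
        p = (1 - k * x[t]) * p       # update the state variance
        betas[t] = beta
    return betas

# noise-free toy data with a constant true hedge ratio of 2
toy_x = np.linspace(1.0, 2.0, 200)
toy_y = 2.0 * toy_x
toy_betas = kalman_beta(toy_y, toy_x)
```

The innovation `e` plays the role of the spread and `s` gives its variance, so a trading rule could compare `e` against a multiple of `np.sqrt(s)` in the same way the other strategies compare the spread against the entry multiplier.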
