# Notebook Instructions
<i>You can run the notebook sequentially (one cell at a time) by pressing <b>Shift + Enter</b>. While a cell is running, [*] is displayed on the left. Once it has run, a number such as [8] is displayed, indicating the order in which the cell was run.</i>

<i>Enter edit mode by pressing <b>`Enter`</b> or using the mouse to click on a cell's editor area. Edit mode is indicated by a green cell border and a prompt showing in the editor area.</i>

# Assignment

In this assignment, you will learn to interpret the ADF test results and create a mean reversion strategy on triplets.

The steps are:
1. Import the libraries and the data
2. Find the hedge ratio
3. Create the spread
4. ADF Test
5. Mean reversion strategy
6. Plot the profit and loss (PnL)


## 1. Import the libraries and the data

In [None]:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
%matplotlib inline

The code below reads each CSV file with the pandas `read_csv` function and keeps only the `Adj Close` column.

Instruction:
1. Replace the three `...` placeholders with `GLD.csv`, `GDX.csv` and `USO.csv`, in that order, in the cell below.

In [None]:
x = pd.read_csv('...', index_col=0)['Adj Close']
y = pd.read_csv('...', index_col=0)['Adj Close']
z = pd.read_csv('...', index_col=0)['Adj Close']

df = pd.concat([x, y, z], axis=1)
df.columns = ['GLD', 'GDX', 'USO']
df.index = pd.to_datetime(df.index)

(df.pct_change()+1).cumprod().plot(figsize=(10, 5))
plt.ylabel("Cumulative Returns")
plt.show()

## 2. Find the hedge ratio

Instructions
1. The dependent variable y is GLD; the independent variables x1 and x2 are GDX and USO. Replace `..1..` with `GLD`, `..2..` with `GDX`, and `..3..` with `USO`.
2. To find the hedge ratio, we will use only the first 90 days of data. Replace `..4..` and `..5..` with `90`.
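
To illustrate what the regression in this step estimates, here is a minimal sketch on synthetic data. It uses NumPy's least-squares solver rather than statsmodels, and the series `x1`, `x2`, `y` and the coefficients 2.0 and 0.5 are made up for illustration; they are not the GLD/GDX/USO data. Like `OLS` without an added constant, the fit has no intercept.

```python
import numpy as np

# Hypothetical synthetic prices; not the GLD/GDX/USO data used in the notebook.
rng = np.random.default_rng(0)
x1 = 20 + rng.normal(size=200).cumsum()
x2 = 30 + rng.normal(size=200).cumsum()
noise = rng.normal(scale=0.05, size=200)
y = 2.0 * x1 + 0.5 * x2 + noise

# Fit on the first 90 observations only, with no intercept column.
window = 90
X = np.column_stack([x1, x2])
hedge_ratios, *_ = np.linalg.lstsq(X[:window], y[:window], rcond=None)
m1, m2 = hedge_ratios

print("Estimated hedge ratios:", m1, m2)
```

The estimates should land close to the coefficients used to build `y`, which is the sense in which the hedge ratios describe how many units of each independent series offset one unit of the dependent series.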

In [None]:
from statsmodels.api import OLS
model = OLS(df['..1..'].iloc[:..4..], df[['..2..', '..3..']].iloc[:..5..])
model = model.fit()
print('The hedge ratios for GDX and USO are')
model.params

## 3. Create the spread

The spread is formed as GLD - m1 * GDX - m2 * USO. The fitted coefficients are stored in `model.params`: the first entry is m1 (GDX) and the second entry is m2 (USO).

Instruction
1. Compute the spread

In [None]:
df['spread'] = ...........1.............

In [None]:
# Use .iloc for positional access; model.params is a Series labelled by column name
print('The spread is: GLD - %f * GDX - %f * USO' %
      (model.params.iloc[0], model.params.iloc[1]))
# Plot the spread
df.spread.plot(figsize=(10, 5))
plt.ylabel("Spread")
plt.show()

## 4. ADF Test

We use the `adfuller` function to test whether the spread is stationary, which indicates that the triplet is cointegrated.

Instruction

1. Run the ADF test on `df.spread` with `maxlag` set to `1`.

In [None]:
# To perform ADF Test
from statsmodels.tsa.stattools import adfuller
# Compute ADF test statistics
adf = .................1..................
adf[0]

In [None]:
adf[4]

If the t-statistic (`adf[0]`) is less than the critical value (from `adf[4]`) at the chosen confidence level, the spread is stationary and the triplet is cointegrated.
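
The decision rule above can be sketched as a small helper. The function name `rejects_unit_root` and the numbers below are hypothetical and chosen only for illustration; they are not results from this notebook's data. The dictionary keys mirror the `'1%'`, `'5%'` and `'10%'` keys that `adfuller` returns for its critical values.

```python
def rejects_unit_root(t_stat, critical_values, level="10%"):
    """Return True when the ADF statistic is below the critical value at `level`."""
    return t_stat < critical_values[level]


# Hypothetical numbers for illustration only; not output from the notebook's data.
example_critical_values = {"1%": -3.44, "5%": -2.87, "10%": -2.57}

print(rejects_unit_root(-3.10, example_critical_values))  # True
print(rejects_unit_root(-1.90, example_critical_values))  # False
```

A 90% confidence level corresponds to the `10%` key, a 95% level to `5%`, and so on. A more negative statistic is stronger evidence of stationarity.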

Instruction
2. Replace `..2..` with `True` if the spread is cointegrated and `False` if it is not. Assume a 90% confidence level, i.e. compare against the `10%` critical value.

In [None]:
is_triplet_cointegrated = ..2..

## 5. Mean reversion strategy

In [None]:
def stat_arb(df, lookback, std_dev):
    df['moving_average'] = df.spread.rolling(lookback).mean()
    df['moving_std_dev'] = df.spread.rolling(lookback).std()

    df['upper_band'] = df.moving_average + std_dev*df.moving_std_dev
    df['lower_band'] = df.moving_average - std_dev*df.moving_std_dev

    df['long_entry'] = df.spread < df.lower_band
    df['long_exit'] = df.spread >= df.moving_average
    df['positions_long'] = np.nan
    df.loc[df.long_entry, 'positions_long'] = 1
    df.loc[df.long_exit, 'positions_long'] = 0
    df.positions_long = df.positions_long.ffill()

    df['short_entry'] = df.spread > df.upper_band
    df['short_exit'] = df.spread <= df.moving_average
    df['positions_short'] = np.nan
    df.loc[df.short_entry, 'positions_short'] = -1
    df.loc[df.short_exit, 'positions_short'] = 0
    df.positions_short = df.positions_short.ffill()

    df['positions'] = df.positions_long + df.positions_short

    df['spread_difference'] = df.spread - df.spread.shift(1)
    df['pnl'] = df.positions.shift(1) * df.spread_difference
    df['cumpnl'] = df.pnl.cumsum()
    return df
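
To see how the entry and exit flags in `stat_arb` turn into a held position and a PnL, here is a minimal sketch of the long side only. The six-bar `signals` frame and its `spread` values are hypothetical and hand-picked for illustration; they are not derived from the notebook's data.

```python
import numpy as np
import pandas as pd

# Hypothetical flags and spread values for six bars; not from the notebook's data.
signals = pd.DataFrame({
    "spread":     [10.0, 8.0, 9.0, 10.5, 10.0, 7.0],
    "long_entry": [False, True, False, False, False, True],
    "long_exit":  [False, False, False, True, False, False],
})

# Mark entries with 1 and exits with 0, then carry the last known state forward.
signals["positions_long"] = np.nan
signals.loc[signals.long_entry, "positions_long"] = 1
signals.loc[signals.long_exit, "positions_long"] = 0
signals["positions_long"] = signals["positions_long"].ffill()

# The position held at the previous bar earns the current bar's change in the spread.
signals["pnl"] = signals.positions_long.shift(1) * signals.spread.diff()
signals["cumpnl"] = signals.pnl.cumsum()

positions = signals["positions_long"].tolist()
final_cumpnl = signals["cumpnl"].iloc[-1]
print(signals)
```

The position is undefined (NaN) until the first signal fires, stays at 1 between an entry and the next exit, and drops to 0 afterwards. Shifting the position by one bar before multiplying avoids using a signal to trade on the same bar that produced it.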

Instruction
1. Call the function `stat_arb` with `df`, a `lookback` of 15 and a `std_dev` of 1.

In [None]:
df = .............1......................

## 6. Plot the profit and loss (PnL)

Instruction
1. Plot the cumulative PnL

In [None]:
.........1..........



