# Comparison of Different Algorithmic Trading Strategies on Tesla Stock Price

### Tawfiq Jawhar
<b>Machine Learning (COMP-652 and ECSE-608)<br>
Fall 2018<br>
McGill University <br><br>
Instructors:<br>
Audrey Durand<br>
Riashat Islam <br></b>

---

## Outline
Before training on or analyzing any data related to our environment, we first need to extract that data from Zipline. <br>
The stock price is not constant over a day; it depends on the time of day. In our case, the environment makes one trade every day when the market opens, whereas historical data usually contains the price of the stock after the market closes on each day. If a model predicts future prices, training it on historical closing prices alone can cause problems, especially since testing happens in an environment that simulates real-time trading. <br>
Using Zipline, we can extract data every time the trading function is called. The price data available to our strategy at each trading time will look like the figure below.

![](images/historyDrawing.png)
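As a rough illustration of the window described above, the sketch below builds a price vector from past closing prices followed by the current day's open. The function name `window_at_open` and its `lookback` parameter are hypothetical and are not part of Zipline; the extraction code in this notebook does the equivalent by overwriting the last entry of `data.history(...)`.

```python
import numpy as np

def window_at_open(closes, open_price, lookback=50):
    """Return `lookback` prices: the most recent closes followed by today's open.

    `closes` holds the closing prices of the previous days, oldest first.
    This is a hypothetical helper used only for illustration.
    """
    closes = np.asarray(closes, dtype=float)
    n_closes = lookback - 1  # the last slot is reserved for today's open
    if len(closes) < n_closes:
        raise ValueError("not enough history for the requested lookback")
    previous = closes[len(closes) - n_closes:]
    return np.append(previous, float(open_price))

# Example: three-day window made of two past closes and today's open
window = window_at_open([1.0, 2.0, 3.0, 4.0], 9.0, lookback=3)
```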


## Extracting Data
We will be experimenting with different models and different strategies. We will extract the following data on the training interval:

-  The price of every day at market open.
-  A vector of logarithmic returns over a 50-day lookback window at every trading day.
-  Labels (Up, Down) for t-1 at every day t, depending on whether the closing price went up or down relative to the day before.
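The quantities in this list can be sketched with plain NumPy. The helpers `log_returns` and `up_down_labels` are illustrative names, not functions from Zipline, and a zero return is treated as Down here, matching the `else` branch of the extraction code in this notebook.

```python
import numpy as np

def log_returns(prices):
    """Log return between consecutive prices: log(p_t / p_{t-1})."""
    prices = np.asarray(prices, dtype=float)
    return np.log(prices[1:] / prices[:-1])

def up_down_labels(closes):
    """1 (Up) where the close rose relative to the previous close, else 0 (Down)."""
    return (log_returns(closes) > 0).astype(int)

# 51 prices give a vector of 50 log returns, as in the 50-day lookback
example_returns = log_returns(np.arange(1, 52))
example_labels = up_down_labels([100.0, 110.0, 99.0, 99.0])
```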

In [1]:
## load the zipline magic commands (which are not used in this notebook)
%load_ext zipline
## inline plot
%matplotlib inline

##import libraries
import zipline
import pandas as pd
import matplotlib.pyplot as plt
from matplotlib import style
import seaborn as sns
plt.style.use('seaborn-talk')
plt.style.use('bmh')

import pickle
import numpy as np
from zipline.api import (
    order_target_percent, 
    record, 
    symbol, 
    schedule_function, 
    sid,
    date_rules,
    time_rules,
    get_open_orders,
    order_percent,
    order,
    set_benchmark,
    get_datetime)
from zipline.finance import commission, slippage
from zipline import run_algorithm

## pyfolio was not installed in the docker image (the image needs updating),
## so install it on the fly if you want to use pyfolio to analyze the portfolio
try:
    import pyfolio as pf
except ImportError:
    !pip install pyfolio
    import pyfolio as pf
    
## Trading Environment
env = {
    'train': {
        'start_time': pd.to_datetime('2014-01-01').tz_localize('US/Eastern'),
        'end_time': pd.to_datetime('2015-12-31').tz_localize('US/Eastern')
    },
    'test': {
        'start_time': pd.to_datetime('2016-01-01').tz_localize('US/Eastern'),
        'end_time': pd.to_datetime('2017-12-31').tz_localize('US/Eastern')
    },
    'commision': None, #commission.PerShare(cost=.0075, min_trade_cost=1.0)
    'slippage': None, #slippage.VolumeShareSlippage()
    'capital': 100000, #USD
    'stock': 'TSLA',
    'date_rules': date_rules.every_day(),
    'time_rules': time_rules.market_open()
}

<b>Update: on some days (for example 2014-07-02) the `handle_data_daily` function is not called, and the price recorded for that day is the price of the day before. 3 days showed this behavior as well. I am therefore collecting the data myself in lists outside Zipline instead of using `record`. The total number of days is 500.</b>
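One way to spot such skipped days is to compare the timestamps collected in the `date` list against a list of sessions that were expected to trade. The sketch below is only an assumption about how this could be checked: `missing_sessions` is a hypothetical helper, and the expected session list has to come from elsewhere (for example a trading calendar).

```python
import pandas as pd

def missing_sessions(recorded, expected):
    """Return the expected calendar days that have no recorded timestamp.

    `recorded` and `expected` are iterables of anything pd.Timestamp accepts.
    Only the calendar date of each timestamp is compared.
    """
    recorded_days = {pd.Timestamp(ts).date() for ts in recorded}
    expected_days = [pd.Timestamp(day).date() for day in expected]
    return [day for day in expected_days if day not in recorded_days]

# Example with made-up timestamps: the middle session was never recorded
recorded = [pd.Timestamp("2014-07-01 13:31", tz="UTC"),
            pd.Timestamp("2014-07-03 13:31", tz="UTC")]
expected = ["2014-07-01", "2014-07-02", "2014-07-03"]
skipped = missing_sessions(recorded, expected)
```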

In [2]:
labels = list()
date = list()
X_price = list()
X_return = list()
TSLA_closing_price = list()

def initialize(context):
    context.asset = symbol(env['stock'])
    context.set_commission(env['commision'])
    context.set_slippage(env['slippage'])
    schedule_function(handle_data_daily, env['date_rules'], env['time_rules'])
    context.day = 0

def handle_data_daily(context, data):
    try:
        hist = data.history(
            context.asset, 
            fields='price', 
            bar_count=51, 
            frequency='1d'
        )
        ## get return on close price
        returns = np.log(hist/hist.shift(1)).dropna()
        ## replace the price of the last day with the market open price
        hist.iloc[-1] = data.current(context.asset, 'open')
    except Exception as e:
        ## hist and returns are unavailable, so skip this day
        print(e)
        return

    date.append(get_datetime())
    if context.day != 0:
        #return gain
        if returns.values[-1] > 0:
            labels.append(1)
        #return loss
        else:
            labels.append(0)
    context.day += 1
    TSLA_closing_price.append(data.current(context.asset, "price"))
    X_price.append(hist.values[-50:])
    X_return.append(returns.values)
    
    record(TSLA=data.current(context.asset, "price"),
           hist=hist.values[-50:], returns=returns.values)


In [3]:
results = run_algorithm(env['train']['start_time'], env['train']['end_time'],
                        initialize=initialize,capital_base=env['capital'])

In [4]:
labels.append(np.nan)

In [5]:
df = pd.DataFrame(
{
    'Date': date,
    'TSLA': TSLA_closing_price,
    'X_price': X_price,
    'X_return': X_return,
    'label': labels
}
)
df = df.set_index('Date')

In [6]:
df.head()

| Date | TSLA | X_price | X_return | label |
|---|---|---|---|---|
| 2014-01-02 21:00:00+00:00 | 150.1 | [171.54, 164.5, 173.15, 169.66, 162.86, 164.47... | [-0.006160303087052136, -0.0419059053568488, 0... | 0.0 |
| 2014-01-03 21:00:00+00:00 | 149.56 | [164.5, 173.15, 169.66, 162.86, 164.47, 159.22... | [-0.0419059053568488, 0.0512477006430045, -0.0... | 0.0 |
| 2014-01-06 21:00:00+00:00 | 147.0 | [173.15, 169.66, 162.86, 164.47, 159.22, 159.9... | [0.0512477006430045, -0.020361836468842355, -0... | 1.0 |
| 2014-01-07 21:00:00+00:00 | 149.36 | [169.66, 162.86, 164.47, 159.22, 159.94, 162.1... | [-0.020361836468842355, -0.040905498340603356,... | 1.0 |
| 2014-01-08 21:00:00+00:00 | 151.28 | [162.86, 164.47, 159.22, 159.94, 162.17, 175.2... | [-0.040905498340603356, 0.009837246714192383, ... | 0.0 |


In [7]:
df = df.dropna()

In [8]:
df['X_price'].iloc[0], df['label'].iloc[0]

(array([171.54 , 164.5  , 173.15 , 169.66 , 162.86 , 164.47 , 159.22 ,
        159.94 , 162.17 , 175.2  , 176.81 , 151.16 , 139.772, 137.95 ,
        144.698, 137.8  , 138.7  , 137.6  , 135.45 , 121.58 , 126.09 ,
        121.11 , 122.1  , 121.38 , 120.84 , 120.5  , 126.94 , 127.28 ,
        124.17 , 144.7  , 138.95 , 140.48 , 137.36 , 141.6  , 142.19 ,
        139.65 , 147.47 , 147.654, 147.94 , 152.46 , 147.98 , 140.72 ,
        143.24 , 143.55 , 151.41 , 155.5  , 151.12 , 152.44 , 150.429,
        149.8  ]), 0.0)

In [9]:
df.to_pickle('data/TSLAtraining.pickle')

Now we apply the same procedure to the testing interval.

In [10]:
labels = list()
date = list()
X_price = list()
X_return = list()
TSLA_closing_price = list()

def initialize(context):
    context.asset = symbol(env['stock'])
    context.set_commission(env['commision'])
    context.set_slippage(env['slippage'])
    schedule_function(handle_data_daily, env['date_rules'], env['time_rules'])
    context.day = 0

def handle_data_daily(context, data):
    try:
        hist = data.history(
            context.asset, 
            fields='price', 
            bar_count=51, 
            frequency='1d'
        )
        ## get return on close price
        returns = np.log(hist/hist.shift(1)).dropna()
        ## replace the price of the last day with the market open price
        hist.iloc[-1] = data.current(context.asset, 'open')
    except Exception as e:
        ## hist and returns are unavailable, so skip this day
        print(e)
        return

    date.append(get_datetime())
    if context.day != 0:
        #return gain
        if returns.values[-1] > 0:
            labels.append(1)
        #return loss
        else:
            labels.append(0)
    context.day += 1
    TSLA_closing_price.append(data.current(context.asset, "price"))
    X_price.append(hist.values[-50:])
    X_return.append(returns.values)
    
    record(TSLA=data.current(context.asset, "price"),
           hist=hist.values[-50:], returns=returns.values)


In [11]:
results = run_algorithm(env['test']['start_time'], env['test']['end_time'],
                        initialize=initialize,capital_base=env['capital'])
labels.append(np.nan)
df = pd.DataFrame(
{
    'Date': date,
    'TSLA': TSLA_closing_price,
    'X_price': X_price,
    'X_return': X_return,
    'label': labels
}
)
df = df.set_index('Date')
df = df.dropna()
df.to_pickle('data/TSLAtesting.pickle')
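
Each row of the saved frames stores a whole array in the `X_price` and `X_return` columns, so most models will need these stacked into a 2-D matrix first. The following is a minimal sketch of that step, assuming every row holds an array of the same length; `to_arrays` is a hypothetical helper and the small frame is made-up data standing in for `pd.read_pickle('data/TSLAtraining.pickle')`.

```python
import numpy as np
import pandas as pd

def to_arrays(df, feature_col="X_return", label_col="label"):
    """Stack the per-row feature arrays into X (n_days, n_features) and return labels y."""
    X = np.stack(df[feature_col].to_numpy())
    y = df[label_col].to_numpy().astype(int)
    return X, y

# Made-up stand-in for the pickled frame; real usage would be, for example:
# train = pd.read_pickle('data/TSLAtraining.pickle')
toy = pd.DataFrame({
    "X_return": [np.array([0.01, -0.02, 0.03]), np.array([-0.02, 0.03, 0.00])],
    "label": [1.0, 0.0],
})
X, y = to_arrays(toy)
```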