# Algorithmic Trading Model for Exponential Moving Average Crossover Grid Search Batch Mode Using Colab
### David Lowe
### June 25, 2020

NOTE: This script is for learning purposes only and does not constitute a recommendation for buying or selling any stock mentioned in this script.

SUMMARY: The purpose of this project is to construct and test an algorithmic trading model and document the end-to-end steps using a template.

INTRODUCTION: This algorithmic trading model examines a series of exponential moving average (EMA) crossover models via a grid search methodology. When the fast EMA curve crosses above the slow EMA curve, the strategy goes long (buys the stock). When the fast curve crosses back below the slow curve, the strategy exits the position.
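
As a rough illustration of the crossover rule described above, the sketch below computes two EMAs over a made-up price list and marks each day as long (1) or flat (0). It uses the recursive EMA form with alpha = 2 / (span + 1), which corresponds to pandas `ewm(span=n, adjust=False)`. The notebook itself calls `ewm(span=n)` with the default `adjust=True`, so the early values will differ slightly. The function names, spans, and prices are illustrative assumptions, not part of the original script.

```python
def ema(prices, span):
    """Recursive exponential moving average with alpha = 2 / (span + 1)."""
    alpha = 2.0 / (span + 1)
    out = [prices[0]]
    for price in prices[1:]:
        out.append(alpha * price + (1 - alpha) * out[-1])
    return out


def crossover_signal(prices, fast_span, slow_span):
    """Return 1 (long) where the fast EMA is above the slow EMA, else 0 (flat)."""
    fast = ema(prices, fast_span)
    slow = ema(prices, slow_span)
    return [1 if f > s else 0 for f, s in zip(fast, slow)]


# Made-up prices: a decline followed by a recovery
prices = [10, 9, 8, 7, 8, 9, 10, 11, 12]
signal = crossover_signal(prices, fast_span=2, slow_span=4)
print(signal)
```

The signal stays flat through the decline and turns long only after the fast curve has climbed back above the slow one, which is the lag inherent in any moving-average rule.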

The grid search covers every combination of fast and slow EMA spans in which the slow span exceeds the fast span by at least 5 days. The fast span ranges from 5 to 20 days and the slow span from 10 to 50 days, both in 5-day increments.
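
To make the size of the search space concrete, the following sketch enumerates the (fast, slow) pairs using the same ranges, increment, and minimum gap that appear in the parameter cell of Task 1. The list comprehension is a simplified stand-in for the nested loops in Task 3, not the notebook's own code.

```python
fast_ma_min, fast_ma_max = 5, 20
slow_ma_min, slow_ma_max = 10, 50
ma_increment = 5
min_ma_gap = 5

# Keep only pairs where the slow span is at least min_ma_gap days longer
grid = [
    (fast, slow)
    for slow in range(slow_ma_min, slow_ma_max + 1, ma_increment)
    for fast in range(fast_ma_min, fast_ma_max + 1, ma_increment)
    if slow - fast >= min_ma_gap
]
print(len(grid), 'combinations')
print(grid[:3])
```

With these settings the grid holds 30 pairs, so each stock in the batch is back-tested 30 times.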

ANALYSIS: This is the Google Colab version of the IPython notebook posted on June 17, 2020. The script saves the output for each stock to a text file on a Google Drive path. The Colab script contains an example of processing 100 different stocks in one batch.

CONCLUSION: Please refer to the individual output file for each stock.

Dataset ML Model: Time series analysis with numerical attributes

Dataset Used: Quandl (SHARADAR/SEP datatable)

An algorithmic trading modeling project generally can be broken down into about five major tasks:

1. Prepare Environment
2. Acquire and Pre-Process Data
3. Develop Strategy and Train Model
4. Back-test Model
5. Evaluate Performance

## Task 1. Prepare Environment

In [1]:
!pip install python-dotenv PyMySQL

Collecting python-dotenv
  Downloading https://files.pythonhosted.org/packages/cb/2a/07f87440444fdf2c5870a710b6770d766a1c7df9c827b0c90e807f1fb4c5/python_dotenv-0.13.0-py2.py3-none-any.whl
Collecting PyMySQL
  Downloading https://files.pythonhosted.org/packages/ed/39/15045ae46f2a123019aa968dfcba0396c161c20f855f11dea6796bcaae95/PyMySQL-0.9.3-py2.py3-none-any.whl (47kB)
     |████████████████████████████████| 51kB 1.4MB/s 
Installing collected packages: python-dotenv, PyMySQL
Successfully installed PyMySQL-0.9.3 python-dotenv-0.13.0


In [2]:
import os
import sys
import smtplib
import numpy as np
import pandas as pd
import requests
import json
import matplotlib.pyplot as plt
from datetime import datetime, timedelta
from dotenv import load_dotenv

In [3]:
# Set up the parent directory location for loading the dotenv files
useColab = True
if useColab:
    # Mount Google Drive locally for storing files
    from google.colab import drive
    drive.mount('/content/gdrive')
    gdrivePrefix = '/content/gdrive/My Drive/Colab_Downloads/'
    env_path = '/content/gdrive/My Drive/Colab Notebooks/'
    dotenv_path = env_path + "python_script.env"
    load_dotenv(dotenv_path=dotenv_path)

Mounted at /content/gdrive


In [4]:
# Check and see whether the API key is available
quandl_key = os.environ.get('QUANDL_API')
if quandl_key is None: sys.exit("API key for Quandl not available. Script Processing Aborted!!!")

In [5]:
# Begin the timer for the script processing
startTimeScript = datetime.now()

# Set up the verbose flag to print detailed messages for debugging (setting True will activate!)
verbose = False

# Set Pandas options
pd.set_option("display.max_rows", 120)
pd.set_option("display.width", 140)

In [6]:
# Specify the parameters for the trading strategy
initial_capital = 0
fast_ma_min = 5
fast_ma_max = 20
slow_ma_min = 10
slow_ma_max = 50
ma_increment = 5
min_ma_gap = 5

model_start_date = datetime(2018, 1, 1)
stock_start_date = model_start_date - timedelta(days=int(slow_ma_max*1.5)) # Need more pricing data to calculate moving averages

# model_end_date = datetime(2020, 6, 22)
model_end_date = datetime.now()
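
As a small worked example of the warm-up window set above, the sketch below repeats the date arithmetic with the notebook's values. Note that `timedelta(days=...)` counts calendar days, so the 75-day lead translates into roughly 50 trading sessions once weekends and holidays are excluded (an approximation, not a figure from the original).

```python
from datetime import datetime, timedelta

slow_ma_max = 50
model_start_date = datetime(2018, 1, 1)

# 50 * 1.5 = 75 calendar days of extra history before the modeling window
warmup_days = int(slow_ma_max * 1.5)
stock_start_date = model_start_date - timedelta(days=warmup_days)
print(warmup_days, stock_start_date.strftime('%Y-%m-%d'))
```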

## Task 2. Acquire and Pre-Process Data

In [7]:
def task2_acquire_process_data(stock_symbol):
    start_date_string = stock_start_date.strftime('%Y-%m-%d')
    end_date_string = model_end_date.strftime('%Y-%m-%d')
    quandl_url = "https://www.quandl.com/api/v3/datatables/SHARADAR/SEP.json"
    # Pass the query as params so that requests handles the URL encoding
    query = {'date.gte': start_date_string, 'date.lte': end_date_string, 'ticker': stock_symbol, 'api_key': quandl_key}
    print('Fetching equity data from', start_date_string, 'to', end_date_string)
    response = requests.get(quandl_url, params=query)
    response.raise_for_status()  # Stop on HTTP errors instead of failing later on a malformed payload
    quandl_dict = response.json()
    stock_quandl = pd.DataFrame(quandl_dict['datatable']['data'])
    print(len(stock_quandl), 'data points retrieved from the API call.')

    stock_quandl.columns = ['ticker', 'date', 'open', 'high', 'low', 'close', 'volume', 'dividend', 'closeunadj', 'lastupdated']
    # stock_quandl.set_index('date', inplace=True)
    stock_quandl.index = pd.to_datetime(stock_quandl.date)
    stock_quandl = stock_quandl.sort_index(ascending = True)

    return stock_quandl
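
The function above assigns column names by hand. As an alternative sketch, the names could be read from the payload itself, assuming the response carries a `datatable.columns` list of `{'name': ...}` entries alongside `datatable.data`. That layout is an assumption about the API response and should be checked against a real payload. The helper name `datatable_to_records` and the sample values are hypothetical.

```python
def datatable_to_records(payload):
    """Pair each data row with the column names carried in the same payload."""
    table = payload['datatable']
    names = [col['name'] for col in table['columns']]
    return [dict(zip(names, row)) for row in table['data']]


# Hand-written sample shaped like the assumed response (values are made up)
sample = {
    'datatable': {
        'data': [
            ['XYZ', '2018-01-03', 10.0, 10.5, 9.8, 10.2],
            ['XYZ', '2018-01-02', 9.9, 10.1, 9.7, 10.0],
        ],
        'columns': [
            {'name': 'ticker'}, {'name': 'date'}, {'name': 'open'},
            {'name': 'high'}, {'name': 'low'}, {'name': 'close'},
        ],
    }
}

# Sort by date, mirroring the ascending sort in the notebook
records = sorted(datatable_to_records(sample), key=lambda r: r['date'])
print(records[0]['date'], records[0]['close'])
```

A list of dictionaries like `records` can be handed directly to `pd.DataFrame(records)`, which avoids a silent mismatch if the table ever gains or reorders a column.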

## Task 3. Develop Strategy and Train Model

In [8]:
def task3_develop_strategy(stock_quandl):
    # Select the data source and pricing columns to use for modeling
    model_template = stock_quandl.loc[:, ['open','close']]

    # Set up the standard column name for modeling
    model_template.rename(columns={'open': 'open_price', 'close': 'close_price'}, inplace=True)
    if verbose: model_template.info(verbose=True)

    def trading_ma_crossover(model):
        # Resolve column positions once; a single .iloc[row, col] assignment avoids the
        # chained form model['col'].iloc[x] = ..., which can write to a temporary copy
        sig_col = model.columns.get_loc('trade_signal')
        chg_col = model.columns.get_loc('signal_change')
        ent_col = model.columns.get_loc('entry_exit')
        waitfor_first_entry = True
        for x in range(len(model)):
            # trade_signal = 1 means a long position; 0 means a flat position
            model.iloc[x, sig_col] = 1 if model['ma_change'].iloc[x] > 0 else 0
            if x != 0:
                model.iloc[x, chg_col] = model.iloc[x, sig_col] - model.iloc[x-1, sig_col]
                # Entries and exits lag the signal change by one day
                previous_change = model.iloc[x-1, chg_col]
                if waitfor_first_entry and (previous_change == 1):
                    model.iloc[x, ent_col] = previous_change
                    waitfor_first_entry = False
                elif (not waitfor_first_entry) and (previous_change != 0):
                    model.iloc[x, ent_col] = previous_change
    #     model['entry_exit'].iloc[-1] = -1  # the model assumes we will always exit at the end of the modeling period

    model_collection = {}
    serial_number = 1

    for slow_ma in range(slow_ma_min, slow_ma_max+1, ma_increment):
        for fast_ma in range(fast_ma_min, fast_ma_max+1, ma_increment):
            if (slow_ma - fast_ma) < min_ma_gap: break
            model_name = 'EMA_' + str(serial_number).zfill(3) + '_SlowMA_' + str(slow_ma).zfill(2) + '_FastMA_' + str(fast_ma).zfill(2)
            serial_number = serial_number + 1
            trading_model = model_template.copy()
            trading_model['fast_ma'] = trading_model['close_price'].ewm(span=fast_ma).mean()
            trading_model['slow_ma'] = trading_model['close_price'].ewm(span=slow_ma).mean()
            trading_model['ma_change'] = trading_model['fast_ma'] - trading_model['slow_ma']
            trading_model['trade_signal'] = np.zeros(len(trading_model))
            trading_model['signal_change'] = np.zeros(len(trading_model))
            trading_model['entry_exit'] = np.zeros(len(trading_model))
            # Take an explicit copy of the modeling window so later writes do not target a slice
            trading_model = trading_model.loc[model_start_date:model_end_date].copy()
            trading_ma_crossover(trading_model)
            model_collection[model_name] = trading_model.copy()

    print(serial_number - 1, 'models added to the trading model collection.')  # serial_number starts at 1
    print()

    # List the entry/exit points for each model
    for key in model_collection:
        if verbose:
            print('List the signal change and entry/exit points for', key)
            print(model_collection[key][(model_collection[key].signal_change != 0) | (model_collection[key].entry_exit != 0)])
            print()

    return model_collection
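
The entry/exit logic in `trading_ma_crossover` has two details that are easy to miss: a trade is marked on the day after the signal changes, and any exit signal that arrives before the first entry is ignored. The sketch below reproduces that logic on a plain list so the behavior can be checked by hand. The function name `entry_exit_points` is illustrative and not part of the notebook.

```python
def entry_exit_points(trade_signal):
    """Map a long/flat signal list to next-day entry (+1) and exit (-1) markers."""
    n = len(trade_signal)
    signal_change = [0] * n
    entry_exit = [0] * n
    waiting_for_first_entry = True
    for x in range(1, n):
        signal_change[x] = trade_signal[x] - trade_signal[x - 1]
        previous_change = signal_change[x - 1]
        if waiting_for_first_entry and previous_change == 1:
            entry_exit[x] = 1
            waiting_for_first_entry = False
        elif not waiting_for_first_entry and previous_change != 0:
            entry_exit[x] = previous_change
    return entry_exit


# The signal starts long, so the first drop to flat is not treated as an exit
signal = [1, 0, 0, 1, 1, 0, 0]
print(entry_exit_points(signal))
```

Here the signal turns long at index 3 and flat at index 5, so the entry and exit are marked at indexes 4 and 6 respectively.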

## Task 4. Back-test Model

In [9]:
def task4_back_test_model(model_collection):
    def trading_portfolio_generation(initial_fund, trading_model):
        # Construct a portfolio to track the transactions and returns
        portfolio = pd.DataFrame(index=trading_model.index, columns=['trade_action', 'qty_onhand', 'cost_basis', 'sold_transaction', 'gain_loss', 'cash_onhand', 'position_value', 'total_position', 'accumu_return'])
        portfolio.iloc[0]['trade_action'] = 0
        portfolio.iloc[0]['qty_onhand'] = 0
        portfolio.iloc[0]['cost_basis'] = 0.00
        portfolio.iloc[0]['sold_transaction'] = 0.00
        portfolio.iloc[0]['gain_loss'] = 0.00
        portfolio.iloc[0]['cash_onhand'] = initial_fund
        portfolio.iloc[0]['position_value'] = 0.00
        portfolio.iloc[0]['total_position'] = initial_fund
        portfolio.iloc[0]['accumu_return'] = portfolio.iloc[0]['total_position'] - initial_fund
        recent_cost = 0

        # The conditional parameters below determine how the trading strategy will be carried out
        for i in range(1, len(portfolio)):
            if (trading_model.iloc[i]['entry_exit'] == 1) and (portfolio.iloc[i-1]['qty_onhand'] == 0):
                portfolio.iloc[i]['trade_action'] = 1
                portfolio.iloc[i]['qty_onhand'] = portfolio.iloc[i-1]['qty_onhand'] + portfolio.iloc[i]['trade_action']
                portfolio.iloc[i]['cost_basis'] = trading_model.iloc[i]['open_price'] * portfolio.iloc[i]['trade_action']
                portfolio.iloc[i]['sold_transaction'] = 0.00
                portfolio.iloc[i]['gain_loss'] = 0.00
                portfolio.iloc[i]['cash_onhand'] = portfolio.iloc[i-1]['cash_onhand'] - portfolio.iloc[i]['cost_basis']
                recent_cost = trading_model.iloc[i]['open_price'] * portfolio.iloc[i]['trade_action']
                if verbose: print('BOUGHT QTY:', portfolio.iloc[i]['trade_action'], 'on', portfolio.index[i], 'at the price of', trading_model.iloc[i]['open_price'])
            elif (trading_model.iloc[i]['entry_exit'] == -1) and (portfolio.iloc[i-1]['qty_onhand'] > 0):
                portfolio.iloc[i]['trade_action'] = -1
                portfolio.iloc[i]['qty_onhand'] = portfolio.iloc[i-1]['qty_onhand'] + portfolio.iloc[i]['trade_action']
                portfolio.iloc[i]['cost_basis'] = 0.00
                portfolio.iloc[i]['sold_transaction'] = trading_model.iloc[i]['open_price'] * portfolio.iloc[i]['trade_action'] * -1
                portfolio.iloc[i]['gain_loss'] = (recent_cost + (trading_model.iloc[i]['open_price'] * portfolio.iloc[i]['trade_action'])) * -1
                portfolio.iloc[i]['cash_onhand'] = portfolio.iloc[i-1]['cash_onhand'] + portfolio.iloc[i]['sold_transaction']
                recent_cost = 0.00
                if verbose: print('SOLD QTY:', portfolio.iloc[i]['trade_action'], 'on', portfolio.index[i], 'at the price of', trading_model.iloc[i]['open_price'])
            else:
                portfolio.iloc[i]['trade_action'] = 0
                portfolio.iloc[i]['qty_onhand'] = portfolio.iloc[i-1]['qty_onhand']
                portfolio.iloc[i]['cost_basis'] = portfolio.iloc[i-1]['cost_basis']
                portfolio.iloc[i]['sold_transaction'] = 0.00
                portfolio.iloc[i]['gain_loss'] = 0.00
                portfolio.iloc[i]['cash_onhand'] = portfolio.iloc[i-1]['cash_onhand']
            portfolio.iloc[i]['position_value'] = trading_model.iloc[i]['close_price'] * portfolio.iloc[i]['qty_onhand']
            portfolio.iloc[i]['total_position'] = portfolio.iloc[i]['cash_onhand'] + portfolio.iloc[i]['position_value']
            portfolio.iloc[i]['accumu_return'] = portfolio.iloc[i]['total_position'] - initial_fund

        return portfolio


    portfolio_collection = {}

    # Collect one summary row per model, then build the performance dataframe in one step
    # (DataFrame.append was removed in pandas 2.0)
    summary_rows = []

    for key in model_collection:
        portfolio_collection[key] = trading_portfolio_generation(initial_capital, model_collection[key])
        final_return = portfolio_collection[key]['accumu_return'].iloc[-1]
        if initial_capital != 0:
            return_percentage = final_return / initial_capital * 100
        else:
            return_percentage = None
        summary_rows.append({'model_name': key, 'return_value': final_return, 'return_percent': return_percentage})
    performance_summary = pd.DataFrame(summary_rows, columns=['model_name','return_value','return_percent'])
    print()

    # Display the model performance summary
    performance_summary.sort_values(by=['return_value'], inplace=True, ascending=False)
    print(performance_summary)
    print()

    # Display the transactions from the top model
    top_model = performance_summary.iloc[0]['model_name']
    print('The transactions from the top model %s:' % (top_model))
    print(portfolio_collection[top_model][portfolio_collection[top_model].trade_action != 0])
    print()

    return portfolio_collection
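
The portfolio bookkeeping above reduces to simple arithmetic: buy one share at the open on an entry day, sell it at the open on an exit day, and mark any open position to the close. The sketch below is a simplified stand-in for `trading_portfolio_generation` that keeps only cash, quantity, and accumulated return. The prices are made up, and unlike the notebook's loop (which starts at the second row) it also evaluates the first day.

```python
def one_share_backtest(open_prices, close_prices, entry_exit, initial_fund=0.0):
    """Track cash and a single share; return the accumulated return per day."""
    cash = initial_fund
    qty = 0
    returns = []
    for open_px, close_px, action in zip(open_prices, close_prices, entry_exit):
        if action == 1 and qty == 0:
            cash -= open_px      # buy one share at the open
            qty = 1
        elif action == -1 and qty > 0:
            cash += open_px      # sell the share at the open
            qty = 0
        total_position = cash + qty * close_px
        returns.append(total_position - initial_fund)
    return returns


opens = [10.0, 11.0, 12.0, 13.0]
closes = [10.5, 11.5, 12.5, 13.5]
entry_exit = [0, 1, 0, -1]
print(one_share_backtest(opens, closes, entry_exit))
```

Buying at 11.0 and selling at 13.0 leaves a realized gain of 2.0 per share. Because the starting cash is zero, as in the notebook's `initial_capital = 0`, the cash balance goes negative while the position is open and the result is a dollar amount rather than a percentage.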

## Task 5. Evaluate Performance

In [10]:
def task5_evaluate_performance(stock_quandl, model_collection, portfolio_collection):
    # Calculate the stock's performance for a long-only model
    long_only_model = stock_quandl[model_start_date:model_end_date]
    long_only_return = long_only_model.iloc[-1]['close'] - long_only_model.iloc[0]['open']
    print('The performance of the long-only model from day one is: $%.2f' %(long_only_return))
    print()

    # Start from -inf so that a best model is still reported when every model loses money
    best_model = None
    best_return = float('-inf')
    for key in portfolio_collection:
        final_return = portfolio_collection[key]['accumu_return'].iloc[-1]
        if final_return > best_return:
            best_model = key
            best_return = final_return
    print('The best model found is:', best_model)
    print('The best profit/loss for the investing period is: $%.2f' % (best_return))
    if initial_capital != 0:
        print('The best return percentage for initial capital is: %.2f%%' % (best_return / initial_capital * 100))
    print()

    # Start from +inf so that a worst model is still reported when every model beats long-only
    worst_model = None
    worst_return = float('inf')
    for key in portfolio_collection:
        final_return = portfolio_collection[key]['accumu_return'].iloc[-1]
        if final_return < worst_return:
            worst_model = key
            worst_return = final_return
    print('The worst model found is:', worst_model)
    print('The worst profit/loss for the investing period is: $%.2f' % (worst_return))
    if initial_capital != 0:
        print('The worst return percentage for the initial capital is: %.2f%%' % (worst_return / initial_capital * 100))
    print()

    for key in model_collection:
        print('Processing portfolio for model:', key)
        trade_transactions = portfolio_collection[key][portfolio_collection[key].trade_action != 0]
        print(trade_transactions)
        final_return = portfolio_collection[key]['accumu_return'].iloc[-1]
        print('Accumulated profit/loss for one share of stock with initial capital of $%.0f at the end of modeling period: $%.2f' % (initial_capital, final_return))
        if initial_capital != 0:
            # Compute the percentage here; return_percentage from Task 4 is not in scope
            print('Accumulated return percentage based on the initial capital investment: %.2f%%' % (final_return / initial_capital * 100))
        if trade_transactions.empty:
            print('The current status of the model is:', 'Waiting to enter; no trades during the modeling period', '\n')
        elif trade_transactions.iloc[-1]['trade_action'] == 1:
            print('The current status of the model is:', 'Holding a position since', trade_transactions.index.tolist()[-1], '\n')
        else:
            print('The current status of the model is:', 'Waiting to enter since', trade_transactions.index.tolist()[-1], '\n')
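
One compact way to pick the best and worst models, regardless of whether the returns are positive or negative, is to use `max` and `min` over a dictionary of final returns. The sketch below uses hypothetical model names and dollar figures purely for illustration, and it also lists the models that finished ahead of a hypothetical long-only benchmark.

```python
# Hypothetical final returns per model, in dollars per share
final_returns = {'model_a': -3.20, 'model_b': -1.50, 'model_c': -0.40}
long_only_return = -2.00  # hypothetical buy-and-hold result for the same period

best_model = max(final_returns, key=final_returns.get)
worst_model = min(final_returns, key=final_returns.get)
beat_long_only = [name for name, value in final_returns.items() if value > long_only_return]

print('Best:', best_model, final_returns[best_model])
print('Worst:', worst_model, final_returns[worst_model])
print('Models ahead of long-only:', beat_long_only)
```

Even though every model in this example loses money, a best and a worst model are still identified, and the long-only comparison is kept as a separate question.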

## Task Execution and Output Management

In [11]:
dataset_path = 'https://www.dainesanalytics.com/datasets/cramer-covid19-index/Cramer_COVID-19_Index.csv'
stock_meta = pd.read_csv(dataset_path, sep=',')
stock_list = stock_meta['Symbol'].tolist()
print('Stocks to process:', stock_list)

Stocks to process: ['AAPL', 'ABBV', 'ABT', 'ADBE', 'AKAM', 'AMD', 'AMT', 'AMZN', 'ATVI', 'BAX', 'BNTX', 'BSX', 'BYND', 'CAG', 'CCI', 'CHGG', 'CHWY', 'CL', 'CLX', 'CMG', 'CNC', 'COR', 'COST', 'COUP', 'CPB', 'CRM', 'CRWD', 'CTXS', 'D', 'DDOG', 'DG', 'DHR', 'DOCU', 'DPZ', 'DXCM', 'EA', 'EBAY', 'EBS', 'EQIX', 'ETSY', 'EVBG', 'GILD', 'GIS', 'GOLD', 'GOOG', 'GSK', 'HD', 'HRL', 'JNJ', 'K', 'KR', 'LLY', 'LOGI', 'LVGO', 'MASI', 'MDLZ', 'MKC', 'MKTX', 'MRNA', 'MRVL', 'MSFT', 'NET', 'NFLX', 'NVDA', 'OKTA', 'PANW', 'PEP', 'PFE', 'PG', 'PLD', 'PRGO', 'PTON', 'PYPL', 'REGN', 'RMD', 'RNG', 'SHOP', 'SJM', 'SNY', 'SPGI', 'SPLK', 'SPOT', 'SQ', 'TDOC', 'TGT', 'TMO', 'TTD', 'TTWO', 'TW', 'TWLO', 'UNH', 'VEEV', 'VZ', 'WING', 'WIX', 'WMT', 'WORK', 'ZM', 'ZS', 'ZTS']


In [12]:
for stock_symbol in stock_list:
    # Begin the timer for the script processing
    startTimeModule = datetime.now()

    print('Processing', stock_symbol, 'from', model_start_date.strftime('%Y-%m-%d'), 'to', model_end_date.strftime('%Y-%m-%d'))

    # Set up the redirection of output to a file
    orig_stdout = sys.stdout
    filename = gdrivePrefix + "algotrading_" + stock_symbol + "_ema-crossover_" + model_end_date.strftime('%Y%m%d') + ".txt"
    with open(filename, 'w') as f:
        sys.stdout = f
        try:
            print('Processing the ticker symbol:', stock_symbol)
            print("Starting date for the model:", model_start_date)
            print("Ending date for the model:", model_end_date)
            print()

            stock_prices = task2_acquire_process_data(stock_symbol)
            stock_models = task3_develop_strategy(stock_prices)
            stock_portfolios = task4_back_test_model(stock_models)
            task5_evaluate_performance(stock_prices, stock_models, stock_portfolios)
        finally:
            # Restore stdout even if a task raises, so later messages are not lost in the file
            sys.stdout = orig_stdout

    print ('Total time for the script:',(datetime.now() - startTimeModule))
    print ('The output was stored in the file:', filename)

Processing AAPL from 2018-01-01 to 2020-06-23
Total time for the script: 0:00:50.990961
The output was stored in the file: /content/gdrive/My Drive/Colab_Downloads/algotrading_AAPL_ema-crossover_20200623.txt
Processing ABBV from 2018-01-01 to 2020-06-23
Total time for the script: 0:00:51.030658
The output was stored in the file: /content/gdrive/My Drive/Colab_Downloads/algotrading_ABBV_ema-crossover_20200623.txt
Processing ABT from 2018-01-01 to 2020-06-23
Total time for the script: 0:00:51.060265
The output was stored in the file: /content/gdrive/My Drive/Colab_Downloads/algotrading_ABT_ema-crossover_20200623.txt
Processing ADBE from 2018-01-01 to 2020-06-23
Total time for the script: 0:00:50.581080
The output was stored in the file: /content/gdrive/My Drive/Colab_Downloads/algotrading_ADBE_ema-crossover_20200623.txt
Processing AKAM from 2018-01-01 to 2020-06-23
Total time for the script: 0:00:50.966154
The output was stored in the file: /content/gdrive/My Drive/Colab_Downloads/algotr
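
The per-stock output capture in the batch loop can also be expressed with the standard library's `contextlib.redirect_stdout`, which restores `sys.stdout` automatically when the block exits. The sketch below is a minimal illustration that writes to a temporary directory. The names `run_with_output_file` and `demo_task` are hypothetical and stand in for the notebook's task functions.

```python
import contextlib
import os
import tempfile


def run_with_output_file(filename, task, *args):
    """Run task(*args) with its print output captured in filename."""
    with open(filename, 'w') as f, contextlib.redirect_stdout(f):
        return task(*args)


def demo_task(symbol):
    print('Processing the ticker symbol:', symbol)
    return symbol.lower()


out_path = os.path.join(tempfile.mkdtemp(), 'algotrading_demo.txt')
result = run_with_output_file(out_path, demo_task, 'XYZ')
with open(out_path) as f:
    captured = f.read()
print(repr(captured), result)
```

Because both the file and the redirection are managed by the `with` statement, an exception in one stock's processing would not leave later console messages pointed at a closed file.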

In [13]:
print ('Total time for the script:',(datetime.now() - startTimeScript))

Total time for the script: 1:19:16.032808
