# Trading Strategy Development
This notebook is for developing crypto trading strategies for the bot we will deploy. Use it for data mining, model building, and similar exploration. If you want to build more advanced models that don't rely strictly on technical analysis (TA), I recommend importing `scipy` (stats, optimization, signal processing), `scikit-learn` (classical machine learning), and `pytorch` or `TensorFlow` (deep learning).

### Import required modules
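-  I like to work in Python 2, hence the `print_function` import from `__future__`. Note that `df_reset()` relies on Python 2's `map()` returning a list, so the code needs changes to run under Python 3.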
-  The `binance` module is a wrapper found here: https://github.com/sammchardy/python-binance.
-  `myapi` simply contains your API key and secret.
-  We will almost always want to use `datetime`, `numpy`, and `pandas` for our data processing and computation.
-  `talib` is an entire library of TA indicators. The project can be found here: https://www.ta-lib.org/.
-  I went with `matplotlib` for quick plots but we should push over to `plotly` as it's more robust and interactive.

In [1]:
#!/usr/bin/env python

from __future__ import print_function
from binance.client import Client
import myapi
import datetime
import numpy as np
import pandas as pd
import talib as ta
import matplotlib.pyplot as plt
pd.options.mode.chained_assignment = None  # default='warn'

In [2]:
from plotly import __version__
from plotly.offline import download_plotlyjs, init_notebook_mode, plot, iplot
import plotly.graph_objs as go
init_notebook_mode(connected=True)

### User-defined functions
Here are some helper functions for transforming the data that the Binance API returns. Binance provides all timestamps in milliseconds, and the klines come back as a nested list, which we transform into a `pandas` DataFrame.
-  Use `milliseconds_to_datetime()` if you're looking at minute or hourly candles.
-  Use `milliseconds_to_date()` if you're looking at daily or weekly candles.
-  Use `df_reset()` to rebuild the DataFrame from the global klines list `dt` for each new strategy you want to build in this notebook. This is much easier than juggling different objects.
-  Use `backtest()` to run a backtest and plot the equity curve. Transaction fees are not included yet.
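
Since the conversion helpers in the next cell use the local time zone, here is a minimal sketch of a time-zone-independent alternative. `ms_to_utc_datetime` and `EPOCH` are hypothetical names, not part of this notebook, and the sample value is the close time of the first candle printed further down.

```python
import datetime

EPOCH = datetime.datetime(1970, 1, 1)  # Unix epoch as a naive UTC datetime

def ms_to_utc_datetime(milli):
    """Hypothetical helper: epoch milliseconds -> naive datetime in UTC."""
    # Adding a timedelta avoids fromtimestamp(), which applies the local time zone
    return EPOCH + datetime.timedelta(milliseconds=milli)

close_time = ms_to_utc_datetime(1510294499999)
print(close_time)  # 2017-11-10 06:14:59.999000
```

The result does not depend on where the notebook runs, which matters if the bot and the backtest are executed on machines in different time zones.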

In [3]:
def milliseconds_to_datetime(milli):
    '''
    Transforms milliseconds into datetime. Should probably be UTC...
    
    Args: milli = milliseconds as numeric data type
    '''
    return datetime.datetime.fromtimestamp(milli/1000.0)


def milliseconds_to_date(milli):
    '''
    Transforms milliseconds into date. Note that fromtimestamp() uses the
    local time zone, so near midnight the result can differ from the UTC date.
    
    Args:
        milli = milliseconds as numeric data type
    '''
    return datetime.date.fromtimestamp(milli/1000.0)

def df_reset():
    '''
    Builds a fresh DataFrame (OHLCV, returns, SMAs, RSI) from the global
    klines list `dt`.
    '''
    # Initialize DataFrame
    df = pd.DataFrame()

    # Extract candle close times (elem[6]) from klines list and transform to datetime
    df['dates'] = [elem[6] for elem in dt]
    df['dates'] = map(milliseconds_to_datetime, df['dates'])
    df.index = df.dates
    #del df['dates']
    # Extract open, high, low, close, and volume from klines list as floats
    df['open'] = [elem[1] for elem in dt]
    df['open'] = map(float, df['open'])
    df['high'] = [elem[2] for elem in dt]
    df['high'] = map(float, df['high'])
    df['low'] = [elem[3] for elem in dt]
    df['low'] = map(float, df['low'])
    df['close'] = [elem[4] for elem in dt]
    df['close'] = map(float, df['close'])
    df['volume'] = [elem[5] for elem in dt]
    df['volume'] = map(float, df['volume'])
    
    # Calculate 15-min returns
    df['simple_ret'] = df.close.pct_change()
    df['log_ret'] = np.log(df.close).diff()

    # Calculate SMA(5), SMA(7), SMA(10), SMA(14), SMA(20), SMA(50)
    df['sma5'] = ta.SMA(df.close.values, 5)
    df['sma7'] = ta.SMA(df.close.values, 7)
    df['sma10'] = ta.SMA(df.close.values, 10)
    df['sma14'] = ta.SMA(df.close.values, 14)
    df['sma20'] = ta.SMA(df.close.values, 20)
    df['sma50'] = ta.SMA(df.close.values, 50)

    # Calculate RSI(14)
    df['rsi14'] = ta.RSI(df.close.values, 14)

    # Clean up
    df = df.dropna()
    
    return df

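def backtest(df):
    '''
    Runs the currently defined rules_engine() on df, plots the equity
    curves, and prints cumulative returns. Transaction fees are not
    included yet.
    
    Args: df = pandas DataFrame as returned by df_reset()
    '''
    # Generate signals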
    df = rules_engine(df)

    # Calculate returns
    # trans_fee = .001
    df['strat_ret'] = df.signal * df.log_ret #- trans_fee * df.signal
    df['cum_log_ret'] = df.strat_ret.cumsum()
    df['cum_simple_ret'] = np.exp(df.cum_log_ret) - 1

    # Plot equity curve
    log_equity = go.Scatter(x=df.index, y=df.cum_log_ret, name='Cum. Log Ret')
    simple_equity = go.Scatter(x=df.index, y=df.cum_simple_ret, name='Cum. Simple Ret')
    trace_equity = [log_equity, simple_equity]
    iplot(trace_equity)

    # Print returns
    print('Cumulative Log Return = ' + str(df.cum_log_ret[-1:].values * 100).strip('[]') + '%')
    print('Cumulative Simple Return = ' + str(df.cum_simple_ret[-1:].values * 100).strip('[]') + '%')
    
    return
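
For readers without TA-Lib installed, the moving averages in `df_reset()` can be approximated with plain NumPy. This is a sketch under the assumption that `ta.SMA` returns the arithmetic mean of the last `timeperiod` values, with NaN for the first `timeperiod - 1` entries (which is why `dropna()` removes the warm-up rows). `sma` is a hypothetical helper, not a TA-Lib function.

```python
import numpy as np

def sma(values, timeperiod):
    """Hypothetical stand-in for a simple moving average with NaN warm-up."""
    values = np.asarray(values, dtype=float)
    out = np.full(len(values), np.nan)
    if len(values) >= timeperiod:
        # mode='valid' keeps only windows that are fully covered by data
        window = np.ones(timeperiod) / timeperiod
        out[timeperiod - 1:] = np.convolve(values, window, mode='valid')
    return out

print(sma([1, 2, 3, 4, 5], 3))  # approximately [nan nan 2. 3. 4.]
```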

## Initialize the Binance client
Create a `client` object and grab the server time just to ensure our connection works properly.

In [4]:
# Initialize client
client = Client(myapi.key, myapi.secret)
# Server time (in UTC milliseconds)
client.get_server_time()

{u'serverTime': 1517260429779L}

## Set crypto pair, OHLC candle window, and extract data
-  `tick` is your cryptocurrency pair (e.g. 'LINKBTC', 'ZCLBTC') as a string
-  `window` is your desired candle interval. Use `Client.KLINE_INTERVAL_` plus your desired interval (e.g. `Client.KLINE_INTERVAL_15MINUTE`).

Here is the setup for the `get_historical_klines()` function from `binance/client.py`:
> **Client.get_historical_klines(symbol, interval, start_str, end_str)**

Note: If you do not pass an argument to `end_str`, the data will span up until now.

In [5]:
# Set your ticker
tick = 'XMRBTC'
# Set your interval
window = Client.KLINE_INTERVAL_15MINUTE
# Extract Historical OHLCV
dt = client.get_historical_klines(tick, window, "1 Jan, 2017")
dt[0]

[1510293600000L,
 u'0.01250000',
 u'0.01250000',
 u'0.01250000',
 u'0.01250000',
 u'0.95300000',
 1510294499999L,
 u'0.01191250',
 1,
 u'0.00000000',
 u'0.00000000',
 u'2156.63682458']
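
Each kline is a positional list, so it can help to name the fields before building a DataFrame. The sketch below assumes the field order that Binance documents for klines (open time, open, high, low, close, volume, close time, quote asset volume, number of trades, taker buy base volume, taker buy quote volume, and an unused field). `KLINE_COLUMNS` and `klines_to_frame` are hypothetical names with my own column labels, and the single row is the sample output above.

```python
import pandas as pd

# Assumed field order of a Binance kline row (column labels are my own)
KLINE_COLUMNS = ['open_time', 'open', 'high', 'low', 'close', 'volume',
                 'close_time', 'quote_volume', 'n_trades',
                 'taker_buy_base', 'taker_buy_quote', 'ignore']

def klines_to_frame(klines):
    """Hypothetical helper: nested klines list -> OHLCV DataFrame indexed by close time."""
    df = pd.DataFrame(klines, columns=KLINE_COLUMNS)
    ohlcv = ['open', 'high', 'low', 'close', 'volume']
    # Prices and volumes arrive as strings, so cast each column to float
    for col in ohlcv:
        df[col] = df[col].astype(float)
    # Millisecond timestamps -> pandas datetimes (UTC wall-clock time)
    df.index = pd.to_datetime(df['close_time'], unit='ms')
    return df[ohlcv]

sample = [[1510293600000, '0.01250000', '0.01250000', '0.01250000',
           '0.01250000', '0.95300000', 1510294499999, '0.01191250', 1,
           '0.00000000', '0.00000000', '2156.63682458']]
frame = klines_to_frame(sample)
print(frame)
```

Naming the columns once keeps the index arithmetic (`elem[1]`, `elem[6]`, and so on) out of the strategy code.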

# STRATEGY 1
-  Buy XMRBTC if (sma10 - sma20) > Min{sma10 - sma20}
-  Sell XMRBTC if (sma10 - sma20) < Min{sma10 - sma20}
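
One property of these rules is worth checking before backtesting. No value can be smaller than the minimum of its own series, so the sell condition never triggers, and the strategy is long on every bar except the one where the spread equals its minimum. The minimum is also taken over the full sample, which uses information that would not be available at trade time. A small sketch with made-up spread values (not market data) illustrates the first point:

```python
import numpy as np

# Made-up values standing in for (sma10 - sma20)
spread = np.array([0.3, -0.1, 0.2, -0.4, 0.1])
X = spread.min()

signal = np.where(spread > X, 1, 0)
signal = np.where(spread < X, -1, signal)

print(signal)                 # [1 1 1 0 1]
print((signal == -1).any())   # False
```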

In [6]:
df = df_reset()
df.head()

dates (index),dates,open,high,low,close,volume,simple_ret,log_ret,sma5,sma7,sma10,sma14,sma20,sma50,rsi14
2017-11-10 18:29:59.999,2017-11-10 18:29:59.999,0.016101,0.016274,0.0161,0.016101,362.853,0.0,0.0,0.016211,0.016166,0.016113,0.01605,0.016024,0.016221,57.204209
2017-11-10 18:44:59.999,2017-11-10 18:44:59.999,0.0161,0.016275,0.0161,0.016165,237.425,0.003975,0.003967,0.016182,0.016186,0.016123,0.016068,0.01603,0.016295,58.894194
2017-11-10 18:59:59.999,2017-11-10 18:59:59.999,0.016101,0.016355,0.0156,0.0156,605.826,-0.034952,-0.035577,0.016047,0.016117,0.01608,0.016048,0.016014,0.016248,42.818594
2017-11-10 19:14:59.999,2017-11-10 19:14:59.999,0.015603,0.016186,0.0156,0.016185,159.605,0.0375,0.036814,0.01603,0.016099,0.016111,0.016071,0.016027,0.016239,56.161313
2017-11-10 19:29:59.999,2017-11-10 19:29:59.999,0.016185,0.016185,0.015636,0.015679,69.913,-0.031264,-0.031763,0.015946,0.016014,0.016077,0.016054,0.016007,0.016228,46.133898


In [7]:
def rules_engine(df):
    """
    Just a simple rules engine to generate your signal vector.
    Args: df = pandas DataFrame of pricing data and TA indicators
    """
    
    '''STRATEGY RULES GO BELOW HERE'''
    # Disabled crossover variant:
    # If SMA(10) crosses SMA(20) from below, then buy
    # If SMA(10) crosses SMA(20) from above, then sell
    #df['signal'] = np.where((df.sma10 > df.sma20) & (df.sma10.shift(1) < df.sma20.shift(1)), 1, 
    #                          np.where((df.sma10 < df.sma20) & (df.sma10.shift(1) > df.sma20.shift(1)), -1, 0))
    # Active rule: compare the SMA spread with its minimum over the whole sample
    X = (df.sma10 - df.sma20).min()
    df['signal'] = np.where((df.sma10 - df.sma20) > X, 1, 0)
    df['signal'] = np.where((df.sma10 - df.sma20) < X, -1, df['signal'])
    '''STRATEGY RULES GO ABOVE HERE'''
    
    
    # Lag the signal 1 row because we're setting our position for the next 15m period
    df['signal'] = df.signal.shift(1)
    df = df.dropna()
    
    return df

### Generate signals and backtest strategy
Now we can pass our DataFrame to `backtest()`, which calls `rules_engine()` and calculates the strategy's return for each period (this is simply your `signal` vector multiplied by the `log_ret` vector). From there, it calculates the strategy's cumulative return over time, plots the equity curve, and prints the cumulative log and simple returns. Performance statistics and drawdown are not implemented yet.
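
Because log returns add across periods, the cumulative log return is a running sum, and the cumulative simple return follows from exponentiating it. Here is a minimal sketch of that arithmetic, using made-up returns and signals rather than market data:

```python
import numpy as np

log_ret = np.array([0.01, -0.02, 0.03])   # made-up per-period log returns
signal = np.array([1, -1, 1])             # position held during each period

strat_ret = signal * log_ret              # per-period strategy log return
cum_log_ret = strat_ret.cumsum()          # log returns are additive
cum_simple_ret = np.exp(cum_log_ret) - 1  # convert back to a simple return

print(cum_log_ret[-1], cum_simple_ret[-1])

# The same conversion links the two figures printed for Strategy 1:
# a cumulative log return of 55.896% is roughly 74.886% in simple terms.
print(np.exp(0.5589631404) - 1)
```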

In [8]:
backtest(df)

Cumulative Log Return =  55.89631404%
Cumulative Simple Return =  74.88582395%


# STRATEGY 2
-  Buy XMRBTC if close > sma5 & rsi14 < 50
-  Sell XMRBTC if close < sma5 & rsi14 > 50

In [65]:
df = df_reset()
df.head()

dates (index),dates,open,high,low,close,volume,simple_ret,log_ret,sma5,sma7,sma10,sma14,sma20,sma50,rsi14
2017-11-10 18:29:59.999,2017-11-10 18:29:59.999,0.016101,0.016274,0.0161,0.016101,362.853,0.0,0.0,0.016211,0.016166,0.016113,0.01605,0.016024,0.016221,57.204209
2017-11-10 18:44:59.999,2017-11-10 18:44:59.999,0.0161,0.016275,0.0161,0.016165,237.425,0.003975,0.003967,0.016182,0.016186,0.016123,0.016068,0.01603,0.016295,58.894194
2017-11-10 18:59:59.999,2017-11-10 18:59:59.999,0.016101,0.016355,0.0156,0.0156,605.826,-0.034952,-0.035577,0.016047,0.016117,0.01608,0.016048,0.016014,0.016248,42.818594
2017-11-10 19:14:59.999,2017-11-10 19:14:59.999,0.015603,0.016186,0.0156,0.016185,159.605,0.0375,0.036814,0.01603,0.016099,0.016111,0.016071,0.016027,0.016239,56.161313
2017-11-10 19:29:59.999,2017-11-10 19:29:59.999,0.016185,0.016185,0.015636,0.015679,69.913,-0.031264,-0.031763,0.015946,0.016014,0.016077,0.016054,0.016007,0.016228,46.133898


In [10]:
def rules_engine(df):
    """
    Just a simple rules engine to generate your signal vector.
    Args: df = pandas DataFrame of pricing data and TA indicators
    """
    
    '''STRATEGY RULES GO BELOW HERE'''
    
    df['signal'] = np.where((df.close > df.sma5) & (df.rsi14 < 50), 1, 0)
    df['signal'] = np.where((df.close < df.sma5) & (df.rsi14 > 50), -1, df['signal'])
    
    '''STRATEGY RULES GO ABOVE HERE'''
    
    
    # Lag the signal 1 row because we're setting our position for the next 15m period
    df['signal'] = df.signal.shift(1)
    df = df.dropna()
    
    return df

In [11]:
backtest(df)

Cumulative Log Return =  11.73168912%
Cumulative Simple Return =  12.44757097%


# STRATEGY 3
-  Buy XMRBTC if sma10 < sma20
-  Sell XMRBTC if sma10 > sma20

In [66]:
df = df_reset()
df.head()

dates (index),dates,open,high,low,close,volume,simple_ret,log_ret,sma5,sma7,sma10,sma14,sma20,sma50,rsi14
2017-11-10 18:29:59.999,2017-11-10 18:29:59.999,0.016101,0.016274,0.0161,0.016101,362.853,0.0,0.0,0.016211,0.016166,0.016113,0.01605,0.016024,0.016221,57.204209
2017-11-10 18:44:59.999,2017-11-10 18:44:59.999,0.0161,0.016275,0.0161,0.016165,237.425,0.003975,0.003967,0.016182,0.016186,0.016123,0.016068,0.01603,0.016295,58.894194
2017-11-10 18:59:59.999,2017-11-10 18:59:59.999,0.016101,0.016355,0.0156,0.0156,605.826,-0.034952,-0.035577,0.016047,0.016117,0.01608,0.016048,0.016014,0.016248,42.818594
2017-11-10 19:14:59.999,2017-11-10 19:14:59.999,0.015603,0.016186,0.0156,0.016185,159.605,0.0375,0.036814,0.01603,0.016099,0.016111,0.016071,0.016027,0.016239,56.161313
2017-11-10 19:29:59.999,2017-11-10 19:29:59.999,0.016185,0.016185,0.015636,0.015679,69.913,-0.031264,-0.031763,0.015946,0.016014,0.016077,0.016054,0.016007,0.016228,46.133898


In [21]:
def rules_engine(df):
    """
    Just a simple rules engine to generate your signal vector.
    Args: df = pandas DataFrame of pricing data and TA indicators
    """
    
    '''STRATEGY RULES GO BELOW HERE'''
    
    # Buy logic
    df['signal'] = np.where(df.sma10 < df.sma20, 1, 0)
    # Sell logic
    df['signal'] = np.where(df.sma10 > df.sma20, -1, df['signal'])
    
    '''STRATEGY RULES GO ABOVE HERE'''
    
    
    # Lag the signal 1 row because we're setting our position for the next 15m period
    df['signal'] = df.signal.shift(1)
    df = df.dropna()
    
    return df

# Add the following to backtest():
-  Transaction fees = .1% = .001
-  Ratios (e.g. Sharpe, Calmar)
    -  Daily, Weekly, Monthly
-  CAGR
-  Max drawdown
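
As a starting point for these items, here is a minimal sketch, not a finished implementation. `performance_summary` and its parameters are hypothetical names. It assumes a zero risk-free rate, 15-minute bars traded around the clock (35,040 per year), a fee modeled as a flat deduction from the log return per unit of position changed, and a `position` vector that is already lagged like `signal` in `rules_engine()`.

```python
import numpy as np

def performance_summary(log_ret, position, fee=0.001, periods_per_year=35040):
    """Hypothetical helper: fee-adjusted return and risk statistics."""
    log_ret = np.asarray(log_ret, dtype=float)
    position = np.asarray(position, dtype=float)

    # Units traded per bar; the first bar opens the position from flat (0)
    turnover = np.abs(np.diff(np.concatenate(([0.0], position))))
    # Approximation: subtract the fee rate from the log return on each trade
    strat_ret = position * log_ret - fee * turnover

    equity = np.exp(np.cumsum(strat_ret))  # growth of 1 unit of capital
    # Running peak includes the starting capital of 1
    peak = np.maximum.accumulate(np.concatenate(([1.0], equity)))[1:]
    max_drawdown = (equity / peak - 1.0).min()

    years = len(strat_ret) / float(periods_per_year)
    cagr = equity[-1] ** (1.0 / years) - 1.0

    std = strat_ret.std(ddof=1)
    sharpe = np.sqrt(periods_per_year) * strat_ret.mean() / std if std > 0 else np.nan
    calmar = cagr / abs(max_drawdown) if max_drawdown < 0 else np.nan

    return {'cum_log_ret': strat_ret.sum(), 'cagr': cagr, 'sharpe': sharpe,
            'max_drawdown': max_drawdown, 'calmar': calmar}

# Made-up inputs: long, long, then flip to short (two units traded)
stats = performance_summary([0.01, -0.02, 0.03], [1, 1, -1])
print(stats)
```

This only produces annualized figures. Daily, weekly, or monthly ratios would need the per-bar returns aggregated to that frequency first.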

In [67]:
df['signal'] = np.where(df.sma10 < df.sma20, 1, 0)
df['signal'] = np.where(df.sma10 > df.sma20, -1, df['signal'])
df['changes'] = np.nan
df.head(20)

dates (index),dates,open,high,low,close,volume,simple_ret,log_ret,sma5,sma7,sma10,sma14,sma20,sma50,rsi14,signal,changes
2017-11-10 18:29:59.999,2017-11-10 18:29:59.999,0.016101,0.016274,0.0161,0.016101,362.853,0.0,0.0,0.016211,0.016166,0.016113,0.01605,0.016024,0.016221,57.204209,-1,
2017-11-10 18:44:59.999,2017-11-10 18:44:59.999,0.0161,0.016275,0.0161,0.016165,237.425,0.003975,0.003967,0.016182,0.016186,0.016123,0.016068,0.01603,0.016295,58.894194,-1,
2017-11-10 18:59:59.999,2017-11-10 18:59:59.999,0.016101,0.016355,0.0156,0.0156,605.826,-0.034952,-0.035577,0.016047,0.016117,0.01608,0.016048,0.016014,0.016248,42.818594,-1,
2017-11-10 19:14:59.999,2017-11-10 19:14:59.999,0.015603,0.016186,0.0156,0.016185,159.605,0.0375,0.036814,0.01603,0.016099,0.016111,0.016071,0.016027,0.016239,56.161313,-1,
2017-11-10 19:29:59.999,2017-11-10 19:29:59.999,0.016185,0.016185,0.015636,0.015679,69.913,-0.031264,-0.031763,0.015946,0.016014,0.016077,0.016054,0.016007,0.016228,46.133898,-1,
2017-11-10 19:44:59.999,2017-11-10 19:44:59.999,0.016177,0.016177,0.015722,0.015773,57.063,0.005995,0.005977,0.01588,0.015943,0.016046,0.016033,0.016,0.016219,47.991646,-1,
2017-11-10 19:59:59.999,2017-11-10 19:59:59.999,0.015836,0.015838,0.015801,0.015822,116.116,0.003107,0.003102,0.015812,0.015904,0.015997,0.016019,0.015996,0.016206,48.979447,-1,
2017-11-10 20:14:59.999,2017-11-10 20:14:59.999,0.015822,0.016071,0.01582,0.01582,78.18,-0.000126,-0.000126,0.015856,0.015863,0.015951,0.016015,0.015992,0.016196,48.93859,1,
2017-11-10 20:29:59.999,2017-11-10 20:29:59.999,0.015821,0.016148,0.01582,0.015824,232.862,0.000253,0.000253,0.015784,0.015815,0.015907,0.016,0.015989,0.016187,49.030166,1,
2017-11-10 20:44:59.999,2017-11-10 20:44:59.999,0.015821,0.016143,0.015821,0.015823,251.858,-6.3e-05,-6.3e-05,0.015812,0.015847,0.015879,0.015982,0.015987,0.016178,49.006503,1,


In [68]:
'''
changes = pd.Series()
if (df.signal[0] == 1):
    changes.append(1)
    current += 1
elif (df.signal[0] == -1):
    changes.append(-1)
    current -= 1
else:
    changes.append(0)
    current = 0

for x in df.signal[1:]:
    # No change = 0
    if (np.diff(df.signal) == 0):
        changes.append(0)
    # Nothing to buy = 1
    elif (np.diff(df.signal) == 1):
        changes.append(1)
    # Nothing to sell = -1
    elif (np.diff(df.signal) == -1):
        changes.append(-1)
    # Short to long = 1
    elif (np.diff(df.signal) == 2):
        changes.append(1)
    # Long to short = -1
    elif (np.diff(df.signal) == -2):
        changes.append(-1)
    else:
        changes.append(None)
'''
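# Mark rows where the signal flips between long and not-long (.iloc avoids chained assignment)
flips = np.where(np.diff(df.signal > 0) != 0)[0] + 1
df.iloc[flips, df.columns.get_loc('changes')] = 1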
df.head(20)

dates (index),dates,open,high,low,close,volume,simple_ret,log_ret,sma5,sma7,sma10,sma14,sma20,sma50,rsi14,signal,changes
2017-11-10 18:29:59.999,2017-11-10 18:29:59.999,0.016101,0.016274,0.0161,0.016101,362.853,0.0,0.0,0.016211,0.016166,0.016113,0.01605,0.016024,0.016221,57.204209,-1,
2017-11-10 18:44:59.999,2017-11-10 18:44:59.999,0.0161,0.016275,0.0161,0.016165,237.425,0.003975,0.003967,0.016182,0.016186,0.016123,0.016068,0.01603,0.016295,58.894194,-1,
2017-11-10 18:59:59.999,2017-11-10 18:59:59.999,0.016101,0.016355,0.0156,0.0156,605.826,-0.034952,-0.035577,0.016047,0.016117,0.01608,0.016048,0.016014,0.016248,42.818594,-1,
2017-11-10 19:14:59.999,2017-11-10 19:14:59.999,0.015603,0.016186,0.0156,0.016185,159.605,0.0375,0.036814,0.01603,0.016099,0.016111,0.016071,0.016027,0.016239,56.161313,-1,
2017-11-10 19:29:59.999,2017-11-10 19:29:59.999,0.016185,0.016185,0.015636,0.015679,69.913,-0.031264,-0.031763,0.015946,0.016014,0.016077,0.016054,0.016007,0.016228,46.133898,-1,
2017-11-10 19:44:59.999,2017-11-10 19:44:59.999,0.016177,0.016177,0.015722,0.015773,57.063,0.005995,0.005977,0.01588,0.015943,0.016046,0.016033,0.016,0.016219,47.991646,-1,
2017-11-10 19:59:59.999,2017-11-10 19:59:59.999,0.015836,0.015838,0.015801,0.015822,116.116,0.003107,0.003102,0.015812,0.015904,0.015997,0.016019,0.015996,0.016206,48.979447,-1,
2017-11-10 20:14:59.999,2017-11-10 20:14:59.999,0.015822,0.016071,0.01582,0.01582,78.18,-0.000126,-0.000126,0.015856,0.015863,0.015951,0.016015,0.015992,0.016196,48.93859,1,1.0
2017-11-10 20:29:59.999,2017-11-10 20:29:59.999,0.015821,0.016148,0.01582,0.015824,232.862,0.000253,0.000253,0.015784,0.015815,0.015907,0.016,0.015989,0.016187,49.030166,1,
2017-11-10 20:44:59.999,2017-11-10 20:44:59.999,0.015821,0.016143,0.015821,0.015823,251.858,-6.3e-05,-6.3e-05,0.015812,0.015847,0.015879,0.015982,0.015987,0.016178,49.006503,1,
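
The disabled draft in the cell above appears to aim for a signed trade marker (1 when the position moves up, -1 when it moves down, 0 otherwise), whereas the active line only flags long/not-long flips with 1. Under that reading, which is my assumption, the signed version can be sketched without a loop. The signal values here are made up:

```python
import numpy as np

signal = np.array([-1, -1, 1, 1, 0, -1])  # made-up positions

# Prepend a flat (0) position so the first bar registers as an entry,
# then keep only the direction of each change
changes = np.sign(np.diff(np.concatenate(([0], signal))))

print(changes)  # [-1  0  1  0 -1 -1]
```

The absolute size of each change, before `np.sign` is applied, is the number of units traded, which is what a per-trade transaction fee would be charged on.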


In [22]:
backtest(df)

Cumulative Log Return =  145.40122602%
Cumulative Simple Return =  328.02536001%
