# Trading Strategy Development
This notebook is for developing crypto trading strategies for the bot we will deploy. Use it for data mining, building models, etc. If you want to build more advanced models that don't rely strictly on TA, I would recommend importing `scipy` (stats, optimization, signal processing), `scikit-learn` (classical machine learning), and `pytorch` or `TensorFlow` (deep learning).

# Scroll down to rules_engine() section to view: indicators, signals, and trading rules

# THIS STRATEGY IS TRADING XMRBTC VIA BINANCE

### Import required modules
-  I like to work in python2 hence the `print_function` from `__future__`.
-  The `binance` module is a wrapper found here: https://github.com/sammchardy/python-binance.
-  `myapi` simply contains your API key and secret.
-  We will almost always want to use `datetime`, `numpy`, and `pandas` for our data processing and computation.
-  `talib` is an entire library of TA indicators. The project can be found here: https://www.ta-lib.org/.
-  I went with `matplotlib` for quick plots but we should push over to `plotly` as it's more robust and interactive.

In [1]:
#!/usr/bin/env python

from __future__ import print_function
from binance.client import Client
import myapi
import datetime
import numpy as np
import pandas as pd
import talib as ta
import matplotlib.pyplot as plt

### User-defined functions
Here are some functions I wrote to assist with transforming the data that the Binance API is returning. Binance provides all datetimes in milliseconds and the data comes back as a nested list which we can transform into a `pandas` dataframe.
-  Use `milliseconds_to_datetime()` if you're looking at minute or hourly candles
-  Use `milliseconds_to_date()` if you're looking at daily or weekly candles
-  I haven't needed to use `kline_to_pd()` yet, however, if you want to make a strategy using OHLCV and not just Close, this is your function to go from the Binance nested list to a `pandas` dataframe

In [2]:
def milliseconds_to_datetime(milli):
    '''
    Transforms milliseconds into a UTC datetime (Binance timestamps are UTC).
    
    Args: milli = milliseconds as numeric data type
    '''
    return datetime.datetime.utcfromtimestamp(milli/1000.0)


def milliseconds_to_date(milli):
    '''
    Transforms milliseconds into a UTC date. We convert in UTC so that
    daily candles opening at 00:00 UTC don't slip to the previous day
    in timezones behind UTC.
    
    Args: milli = milliseconds as numeric data type
    '''
    return datetime.datetime.utcfromtimestamp(milli/1000.0).date()


def kline_to_pd(kline_list):
    '''
    Transform OHLC data returned from Client.get_klines() into a pandas
    DataFrame object so we can calculate statistics and build models.
    
    Args: kline_list = nested list object returned by Client.get_klines()
    '''
    # Column names for the fields Binance returns per kline
    header = [
        'OpenTime',
        'Open',
        'High',
        'Low',
        'Close',
        'Volume',
        'CloseTime',
        'QuoteAssetVolume',
        'NumTrades',
        'TakerBuyBaseVolume',
        'TakerBuyQuoteVolume',
        'IGNORE'
    ]
    # Transform to DataFrame
    df = pd.DataFrame(kline_list, columns=header)
    # Remove the 'IGNORE' column
    del df['IGNORE']
    
    return df
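
As a hedged illustration of how the `kline_to_pd()` layout might be typed for analysis, the sketch below runs a single hard-coded kline (the first row returned later in this notebook, with the Python 2 `u`/`L` markers dropped) through the same column names, then casts the string prices to floats and the millisecond timestamps to UTC datetimes. `KLINE_COLUMNS` and `typed_klines()` are hypothetical names for illustration, not part of `python-binance`.

```python
import pandas as pd

# Same column layout as kline_to_pd() (name is illustrative)
KLINE_COLUMNS = ['OpenTime', 'Open', 'High', 'Low', 'Close', 'Volume',
                 'CloseTime', 'QuoteAssetVolume', 'NumTrades',
                 'TakerBuyBaseVolume', 'TakerBuyQuoteVolume', 'IGNORE']


def typed_klines(kline_list):
    '''Hypothetical helper: build the DataFrame and cast columns to useful dtypes.'''
    df = pd.DataFrame(kline_list, columns=KLINE_COLUMNS).drop(columns='IGNORE')
    # Millisecond epochs -> naive UTC timestamps
    for col in ['OpenTime', 'CloseTime']:
        df[col] = pd.to_datetime(df[col], unit='ms')
    # Prices and volumes arrive as strings, so cast them to floats
    num_cols = ['Open', 'High', 'Low', 'Close', 'Volume', 'QuoteAssetVolume',
                'TakerBuyBaseVolume', 'TakerBuyQuoteVolume']
    df[num_cols] = df[num_cols].astype(float)
    return df


# One sample kline copied from the XMRBTC output shown in this notebook
sample = [[1510293600000, '0.01250000', '0.01250000', '0.01250000',
           '0.01250000', '0.95300000', 1510294499999, '0.01191250', 1,
           '0.00000000', '0.00000000', '2156.63682458']]

klines = typed_klines(sample)
print(klines[['OpenTime', 'Close', 'Volume']])
```

Casting up front means downstream calls such as `ta.SMA()` receive floats rather than strings.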

### Initialize the Binance client
Create a `client` object and grab the server time just to ensure our connection works properly.

In [3]:
# Initialize client
client = Client(myapi.key, myapi.secret)
# Server time (in UTC milliseconds)
client.get_server_time()

{u'serverTime': 1516916794778L}

### Set crypto pair, OHLC candle window, and extract data
-  `tick` is your cryptocurrency pair (e.g. 'LINKBTC', 'ZCLBTC') as a string
-  `window` is your desired candle interval; use `Client.KLINE_INTERVAL_` + your desired interval (e.g. `Client.KLINE_INTERVAL_15MINUTE`)

Here is the setup for the `get_historical_klines()` function from `binance/client.py`:
> **Client.get_historical_klines(symbol, interval, start_str, end_str)**

Note: If you do not pass an argument to `end_str`, the data will span up until now.

In [4]:
# Set your ticker
tick = 'XMRBTC'
# Set your interval
window = Client.KLINE_INTERVAL_15MINUTE
# Extract Historical OHLCV
dt = client.get_historical_klines(tick, window, "1 Jan, 2017")
dt[0]

[1510293600000L,
 u'0.01250000',
 u'0.01250000',
 u'0.01250000',
 u'0.01250000',
 u'0.95300000',
 1510294499999L,
 u'0.01191250',
 1,
 u'0.00000000',
 u'0.00000000',
 u'2156.63682458']

### Build the dataframe
Now we take the nested list, `dt`, and transform it into a `pandas` dataframe. We could use the UDF `kline_to_pd()`, however, let's just work with the candle close times and the OHLCV columns.

In [35]:
# Initialize DataFrame
df = pd.DataFrame()

# Extract candle close times (element 6) and transform to datetime
df['dates'] = [milliseconds_to_datetime(elem[6]) for elem in dt]
df.index = df.dates
#del df['dates']
# Extract OHLCV from klines list (Binance returns these as strings)
df['open'] = [float(elem[1]) for elem in dt]
df['high'] = [float(elem[2]) for elem in dt]
df['low'] = [float(elem[3]) for elem in dt]
df['close'] = [float(elem[4]) for elem in dt]
df['volume'] = [float(elem[5]) for elem in dt]

### Calculate returns and TA indicators
We will calculate simple and logarithmic returns; however, we will use log returns as they're better suited for quantitative finance due to log-normality, time-additivity, and more.

We will also calculate some simple TA indicators here.
-  SMA(n) = Simple Moving Average over n periods
-  EMA(n) = Exponential Moving Average over n periods
-  RSI(n) = Relative Strength Index over n periods, 14 is the default

Calculating returns will force the earliest date to be NA as no return can be calculated at that point. This same concept applies when calculating TA indicators so you'll have to drop roughly the first $N$ rows of your dataframe, where $N = \max\left( n \;\; \forall \;\; \text{TA indicators} \right)$.
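
To make the time-additivity point concrete, here is a minimal sketch on a made-up four-point price series (the prices are invented for illustration). It checks that log returns sum to the log of the total price ratio, that exponentiating that sum recovers the compounded simple return, and that a rolling mean leaves leading NaNs. `pandas` `rolling().mean()` is used here as a stand-in for `ta.SMA()`.

```python
import numpy as np
import pandas as pd

# Made-up closing prices, for illustration only
close = pd.Series([100.0, 110.0, 99.0, 108.9])

simple_ret = close.pct_change()
log_ret = np.log(close).diff()

# Log returns add across time: their sum equals log(last / first)
total_log = log_ret.sum()
print(np.isclose(total_log, np.log(close.iloc[-1] / close.iloc[0])))

# Simple returns compound instead; exp(sum of log returns) - 1 matches
total_simple = (1 + simple_ret.dropna()).prod() - 1
print(np.isclose(np.exp(total_log) - 1, total_simple))

# Warm-up: a rolling mean over n periods leaves n - 1 leading NaNs
n = 3
sma = close.rolling(n).mean()
warmup_rows = int(sma.isna().sum())
print(warmup_rows)
```

This is why the backtest below can use `cumsum()` on log returns and convert to a simple return only at the end.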

In [36]:
df.shape

(7349, 6)

In [37]:
# Calculate 15-min returns
df['simple_ret'] = df.close.pct_change()
df['log_ret'] = np.log(df.close).diff()

# Calculate SMA(5), SMA(7), SMA(10), SMA(14), SMA(20), SMA(50)
df['sma5'] = ta.SMA(df.close.values, 5)
df['sma7'] = ta.SMA(df.close.values, 7)
df['sma10'] = ta.SMA(df.close.values, 10)
df['sma14'] = ta.SMA(df.close.values, 14)
df['sma20'] = ta.SMA(df.close.values, 20)
df['sma50'] = ta.SMA(df.close.values, 50)

# Calculate RSI(14)
df['rsi14'] = ta.RSI(df.close.values, 14)

# Clean up data
df = df.dropna()
df.head()

dates (index),dates,open,high,low,close,volume,simple_ret,log_ret,sma5,sma7,sma10,sma14,sma20,sma50,rsi14
2017-11-10 18:29:59.999,2017-11-10 18:29:59.999,0.016101,0.016274,0.0161,0.016101,362.853,0.0,0.0,0.016211,0.016166,0.016113,0.01605,0.016024,0.016221,57.204209
2017-11-10 18:44:59.999,2017-11-10 18:44:59.999,0.0161,0.016275,0.0161,0.016165,237.425,0.003975,0.003967,0.016182,0.016186,0.016123,0.016068,0.01603,0.016295,58.894194
2017-11-10 18:59:59.999,2017-11-10 18:59:59.999,0.016101,0.016355,0.0156,0.0156,605.826,-0.034952,-0.035577,0.016047,0.016117,0.01608,0.016048,0.016014,0.016248,42.818594
2017-11-10 19:14:59.999,2017-11-10 19:14:59.999,0.015603,0.016186,0.0156,0.016185,159.605,0.0375,0.036814,0.01603,0.016099,0.016111,0.016071,0.016027,0.016239,56.161313
2017-11-10 19:29:59.999,2017-11-10 19:29:59.999,0.016185,0.016185,0.015636,0.015679,69.913,-0.031264,-0.031763,0.015946,0.016014,0.016077,0.016054,0.016007,0.016228,46.133898


### Explore these series w/ Plotly
Create plot with SMA overlays, RSI below, volume, etc.

In [54]:
from plotly import __version__
from plotly.offline import download_plotlyjs, init_notebook_mode, plot, iplot
import plotly.graph_objs as go
init_notebook_mode(connected=True)

'''
import cufflinks as cf
cf.set_config_file(offline=True, world_readable=True)
subdf = df[['dates', 'close', 'sma5', 'sma10', 'sma20', 'sma50']]
subdf.iplot(kind='scatter')
'''

ohlc = go.Ohlc(x=df.index, open=df.open, high=df.high, low=df.low, close=df.close)
close = go.Scatter(x=df.index, y=df.close, name='close')
sma5 = go.Scatter(x=df.index, y=df.sma5, name='sma5')
sma10 = go.Scatter(x=df.index, y=df.sma10, name='sma10')
sma20 = go.Scatter(x=df.index, y=df.sma20, name='sma20')
sma50 = go.Scatter(x=df.index, y=df.sma50, name='sma50')
#crosses = go.Scatter(x=df.index, y=np.where(df.sma10 == df.sma20, df.sma10, 0), name='crossover', mode='markers', style=dict(w))
trace_sma = [close, sma10, sma20]#, crosses]
iplot(trace_sma)

### Rules engine
Here we will define our buy and sell logic for the trading strategy. This function will return a `signal` vector that consists of 1, 0, and -1 depending on which trade signal fires. 0 implies there was no signal while 1 implies a buy signal and -1 implies a sell signal.

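Here I'm looking at SMA(10) and SMA(20) crossovers. I have set a threshold, `X` = MIN(SMA(10) - SMA(20)), to help avoid false signals. We're looking for the SMA(10) to cross the SMA(20) and exceed that threshold, then go long. When SMA(10) - SMA(20) drops below `X`, then go short. Note that because `X` is the minimum of the spread over the whole sample, the spread is never below `X`, so as written the short branch never fires and the signal is long on every row except where the spread hits its minimum.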

In [82]:
def rules_engine(df):
    """
    Just a simple rules engine to generate your signal vector.
    Args: df = pandas DataFrame of pricing data and TA indicators
    """
    
    '''STRATEGY RULES GO BELOW HERE'''
    # If SMA(10) crosses SMA(20) from below, then buy
    # If SMA(10) crosses SMA(20) from above, then sell
    #df['signal'] = np.where((df.sma10 > df.sma20) & (df.sma10.shift(1) < df.sma20.shift(1)), 1,
    #                          np.where((df.sma10 < df.sma20) & (df.sma10.shift(1) > df.sma20.shift(1)), -1, 0))
    X = (df.sma10 - df.sma20).min()
    df['signal'] = np.where((df.sma10 - df.sma20) > X, 1, 0)
    df['signal'] = np.where((df.sma10 - df.sma20) < X, -1, df['signal'])
    '''STRATEGY RULES GO ABOVE HERE'''
    
    
    # Lag the signal 1 row because we're setting our position for the next 15m period
    df['signal'] = df.signal.shift(1)
    df = df.dropna()
    
    return df
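
The commented-out crossover rule in `rules_engine()` can be sketched on its own as follows. This is a minimal illustration on made-up series, not the rule used for the results in this notebook. `crossover_signal()` is a hypothetical helper. It marks the bar where the fast average crosses the slow one, then carries that event forward as a held position and lags it one bar, mirroring the `shift(1)` in the function above.

```python
import numpy as np
import pandas as pd


def crossover_signal(fast, slow):
    '''Hypothetical helper: 1 on a cross from below, -1 on a cross from above, else 0.'''
    spread = fast - slow
    prev = spread.shift(1)
    cross_up = (spread > 0) & (prev <= 0)
    cross_down = (spread < 0) & (prev >= 0)
    return pd.Series(np.where(cross_up, 1, np.where(cross_down, -1, 0)),
                     index=fast.index)


# Made-up "fast" and "slow" averages, for illustration only
fast = pd.Series([1.0, 2.0, 3.0, 2.0, 1.0])
slow = pd.Series([2.0, 2.0, 2.0, 2.0, 2.0])

events = crossover_signal(fast, slow)

# Hold the last event until the next one, then lag one bar so that the
# position for a period only uses information from the previous period
position = events.where(events != 0).ffill().fillna(0).shift(1).fillna(0)

print(events.tolist())
print(position.tolist())
```

Multiplying `position` (rather than the raw `events`) by the per-period log return gives a strategy return, since a crossover event is non-zero only on the bar where it happens.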

### Generate signals and backtest strategy
Now we can run our dataframe into the `rules_engine()` function and calculate the strategy's return for each period (this is simply your `signal` vector multiplied by the `log_ret` vector). From there, we can calculate the strategy's cumulative return over time, get some performance statistics, and plot our equity curve and drawdown.

In [83]:
# Generate signals
df = rules_engine(df)

# Calculate returns
df['strat_ret'] = df.signal * df.log_ret
df['cum_log_ret'] = df.strat_ret.cumsum()
df['cum_simple_ret'] = np.exp(df.cum_log_ret) - 1
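
The drawdown mentioned in this section's intro could be computed along these lines. This is a minimal sketch on made-up log returns, assuming the equity curve starts at 1.0. The variable names are illustrative and do not refer to columns in `df`.

```python
import numpy as np
import pandas as pd

# Made-up per-period strategy log returns, for illustration only
strat_ret = pd.Series([0.10, -0.05, -0.10, 0.20])

# Equity curve implied by cumulative log returns (starting equity = 1.0)
equity = np.exp(strat_ret.cumsum())

# Running peak, floored at the starting equity of 1.0
running_max = equity.cummax().clip(lower=1.0)

# Drawdown = fractional distance below the running peak (0 at new highs)
drawdown = equity / running_max - 1.0
max_drawdown = drawdown.min()

print(drawdown.round(4).tolist())
print(round(max_drawdown, 4))
```

Flooring the running peak at 1.0 means a loss in the very first period still registers as a drawdown from the starting equity.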

##### Plot the equity curve and view cumulative return:
Note that the x-axis is the date and the y-axis is the cumulative return as a fraction (e.g. 0.5 = 50%).

In [89]:
# Plot equity curve
log_equity = go.Scatter(x=df.index, y=df.cum_log_ret, name='Cum. Log Ret')
simple_equity = go.Scatter(x=df.index, y=df.cum_simple_ret, name='Cum. Simple Ret')
trace_equity = [log_equity, simple_equity]
iplot(trace_equity)


print('Cumulative Log Return = ' + str(df.cum_log_ret[-1:].values * 100).strip('[]') + '%')
print('Cumulative Simple Return = ' + str(df.cum_simple_ret[-1:].values * 100).strip('[]') + '%')

Cumulative Log Return =  58.14394148%
Cumulative Simple Return =  78.86111321%


### Grab order book from binance
First, just pull the order book consisting of bids/asks prices and quantities. We can grab this data via `client.get_order_book()`.

In [63]:
depth = client.get_order_book(symbol=tick)

Extract the bids / asks prices and quantities. Luckily, it's in a simple dictionary.

In [64]:
bids_price = [elem[0] for elem in depth['bids']]
bids_qty = [elem[1] for elem in depth['bids']]
asks_price = [elem[0] for elem in depth['asks']]
asks_qty = [elem[1] for elem in depth['asks']]

Transform the extracted information into a `pandas` dataframe. Set `numpy` precision to 8 (the number of digits after the decimal for crypto) and map the values to `np.float64` just to ensure we aren't losing any precision.

In [65]:
np.set_printoptions(precision=8)
orderbook = pd.DataFrame(
    {
    'bids_price' : map(np.float64, bids_price),
    'bids_qty' : map(np.float64, bids_qty),
    'asks_price' : map(np.float64, asks_price),
    'asks_qty' : map(np.float64, asks_qty)
    }
)
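
The single-dataframe construction above relies on the bid and ask lists having the same length. As a hedged alternative, the sketch below builds one frame per side from a hypothetical payload shaped like `depth` (the prices and quantities are borrowed from the output in this notebook, and the string formatting is an assumption). `side_to_frame()` is an illustrative helper, not a `python-binance` function.

```python
import pandas as pd

# Hypothetical payload shaped like the `depth` dict (unequal side lengths)
sample_depth = {
    'bids': [['0.02844500', '0.02500000'],
             ['0.02844200', '17.31900000']],
    'asks': [['0.02849200', '9.11100000'],
             ['0.02849300', '3.34500000'],
             ['0.02849900', '10.50800000']],
}


def side_to_frame(levels):
    '''Illustrative helper: one side of the book as a float64 DataFrame.'''
    # Keep only price and quantity in case a level carries extra fields
    frame = pd.DataFrame([level[:2] for level in levels],
                         columns=['price', 'qty'])
    return frame.astype('float64')


bids_df = side_to_frame(sample_depth['bids'])
asks_df = side_to_frame(sample_depth['asks'])

print(bids_df)
print(asks_df)
```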

In [66]:
orderbook.head()

index,asks_price,asks_qty,bids_price,bids_qty
0,0.028492,9.111,0.028445,0.025
1,0.028493,3.345,0.028442,17.319
2,0.028499,10.508,0.028441,45.509
3,0.028501,0.38,0.028423,1.98
4,0.028507,0.325,0.028418,0.067


### Plotting order book via plotly
Create a trace for the bids and a trace for the asks. With plotly, we can easily join them together in the same line plot. We can't just use the `plotly` library from a Jupyter Notebook so we need to pull from `plotly.offline` and initialize for notebook mode.

In [67]:
from plotly import __version__
from plotly.offline import download_plotlyjs, init_notebook_mode, plot, iplot
import plotly.graph_objs as go
init_notebook_mode(connected=True)

bids = go.Scatter(
    x = orderbook.bids_price,
    y = orderbook.bids_qty,
    name = 'Bids'
)

asks = go.Scatter(
    x = orderbook.asks_price,
    y = orderbook.asks_qty,
    name = 'Asks'
)

depth_plot = [bids, asks]

fig = go.Figure(data=depth_plot)
iplot(fig)

Let's check on that massive buy order to ensure our data is correct. By hovering over the plot, we can see that the bid price on that order is $0.02785000$. Grab the row(s) where this is the case and ensure the quantity matches.

In [79]:
# For XMR 20180125
orderbook[orderbook.bids_qty == np.amax(orderbook.bids_qty)]

index,asks_price,asks_qty,bids_price,bids_qty,cum_bids,cum_asks
54,0.02895,1.57,0.02785,1189.735,1625.121,174.378


### Better visualizing market depth
While the plot above works, let's run the cumulative sum of the bids/asks quantities. This will allow us to better visualize buy/sell walls and is also the view you'll find for basically every exchange's order book.

In [73]:
orderbook['cum_bids'] = orderbook.bids_qty.cumsum()
orderbook['cum_asks'] = orderbook.asks_qty.cumsum()
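
As a small worked example of the cumulative sums, the sketch below rebuilds the five levels shown by `orderbook.head()` and accumulates the quantities. It assumes bids are listed from the highest price down and asks from the lowest price up, so each running total is the quantity available between the best price and that level. The frame name `book` is illustrative.

```python
import numpy as np
import pandas as pd

# The five levels printed by orderbook.head() earlier in this notebook
book = pd.DataFrame({
    'bids_price': [0.028445, 0.028442, 0.028441, 0.028423, 0.028418],
    'bids_qty':   [0.025, 17.319, 45.509, 1.98, 0.067],
    'asks_price': [0.028492, 0.028493, 0.028499, 0.028501, 0.028507],
    'asks_qty':   [9.111, 3.345, 10.508, 0.38, 0.325],
})

# Running totals moving away from the best bid / best ask
book['cum_bids'] = book.bids_qty.cumsum()
book['cum_asks'] = book.asks_qty.cumsum()

print(book[['bids_price', 'cum_bids', 'asks_price', 'cum_asks']])
```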

Now that we've got our cumulative quantities for bids/asks, let's re-plot the order book using the same methodology as above. The depth of the order book is much more visible and intuitive now that we can see the small buy walls and monster sell wall.

In [74]:
bids = go.Scatter(
    x = orderbook.bids_price,
    y = orderbook.cum_bids,
    name = 'Bids'
)

asks = go.Scatter(
    x = orderbook.asks_price,
    y = orderbook.cum_asks,
    name = 'Asks'
)

depth_plot = [bids, asks]
fig = go.Figure(data=depth_plot)
iplot(fig)

### Wash trading function
Wash trading = buying/selling security at the same price to artifically inflate volume and get others to hop on board. To wash trade effectively, it'd be best to accumulate a lot of the crypto at a low price and then begin to wash it. This will drive volume up and make it a more attractive crypto for traders. At the same time, we can begin to layer bids in an attempt to make it seem like there is buying pressure & high volume which should result in driving the price up. If we can't get in low enough, I'd either ignore the crypto and takeover a different one OR we can setup a large sell wall (like the one above) and try to push fear sellers out and then buy at a discount.

In [75]:
max(bids_price)

u'0.02844500'

In [76]:
min(asks_price)

u'0.02849200'

In [77]:
wash_price = np.median(map(np.float64, (max(bids_price), min(asks_price))))
wash_price

0.028468500000000001

In [78]:
def washer(ticker, balance, bids, asks):
    """
    Function to place orders for wash trading.
    Args: ticker = your crypto pair as a string (e.g. 'LINKBTC')
          balance = the size of the trades you want to make while washing (e.g. 100 = buy/sell 100 LINKBTC)
          bids = the bid prices from order book
          asks = the ask prices from order book
    """
    # Price to wash at, currently just using:
    #     median(max(bid), min(ask))
    wash_price = np.median(map(np.float64, (max(bids), min(ask))))
    # Buy order
    client.order_limit_buy(
        symbol = ticker,
        quantity = balance,
        price = str(wash_price)
    )
    # Sell order
    client.order_limit_sell(
        symbol = ticker,
        quantity = balance,
        price = str(wash_price)
    )
    
    return

### Conclusion 2018.01.19
I think this function is ready to wash trade... Just need to loop it and have it fire off using some PRNG. Could also use a PRNG to calculate the `balance` argument that you want to trade so that we're not just buying/selling the same quantity with every trade. I have no idea if Binance (or others) are cracking down on washing... With that being said, it's advantageous for them to allow washing to happen (they're collecting transaction fees) so I assume they're not sniffing out washers and shutting down their accounts.