<h1>Table of Contents<span class="tocSkip"></span></h1>
<div class="toc"><ul class="toc-item"><li><span><a href="#'bt'-backtesting-library-on-1-year-of-PPAs" data-toc-modified-id="'bt'-backtesting-library-on-1-year-of-PPAs-1"><span class="toc-item-num">1&nbsp;&nbsp;</span>'bt' backtesting library on 1 year of PPAs</a></span><ul class="toc-item"><li><span><a href="#data-passing" data-toc-modified-id="data-passing-1.1"><span class="toc-item-num">1.1&nbsp;&nbsp;</span>data passing</a></span></li></ul></li><li><span><a href="#'backtesting'-library" data-toc-modified-id="'backtesting'-library-2"><span class="toc-item-num">2&nbsp;&nbsp;</span>'backtesting' library</a></span></li><li><span><a href="#from-scratch" data-toc-modified-id="from-scratch-3"><span class="toc-item-num">3&nbsp;&nbsp;</span>from scratch</a></span></li></ul></div>

In [1]:
import pandas as pd
import numpy as np
from tqdm import tqdm
import bt

# 'bt' backtesting library on 1 year of PPAs
tricky bit will be defining new algos based on PPAs

In [None]:
s = bt.Strategy('s1', [bt.algos.RunMonthly(),
                       bt.algos.SelectAll(),
                       bt.algos.WeighEqually(),
                       bt.algos.Rebalance()])

- bt runs a strategy on data. i have the data, time to build a strategy.
- a strategy is just a name and a list of algos.
    - algos:
        - an algo is a callable that takes the strategy (`target`) and returns a boolean
    - list of algos: "AlgoStack"
        - chained so that they execute consecutively; the first False stops execution part-way
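
The short-circuit behaviour in the notes above can be sketched in plain Python. This is only an illustration of the idea, not bt's actual implementation; `run_stack`, `only_on_even_days`, `record_run` and the dict used as `target` are hypothetical names made up for the example.

```python
# Plain-Python sketch of an "algo stack": NOT bt's real implementation.
# Each algo is a callable that takes a target and returns a bool.

def run_stack(algos, target):
    """Run algos in order; stop at the first one that returns False."""
    for algo in algos:
        if not algo(target):
            return False  # later algos are skipped
    return True


def only_on_even_days(target):
    # hypothetical gate: lets the chain continue only on even-numbered days
    return target["day"] % 2 == 0


def record_run(target):
    # hypothetical action: note that the chain reached this step
    target["log"].append(target["day"])
    return True


target = {"day": 0, "log": []}
for day in range(1, 5):
    target["day"] = day
    run_stack([only_on_even_days, record_run], target)

print(target["log"])  # days on which the whole chain ran: [2, 4]
```

The gate plays the role of something like `RunMonthly`: when it returns False, nothing after it in the list runs for that step.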

In [2]:
class MyAlgo(bt.Algo):

    def __init__(self, arg1, arg2):
        self.arg1 = arg1
        self.arg2 = arg2

    def __call__(self, target):
        # my logic goes here

        # accessing/storing variables through target.temp['key']

        # remember to return a bool - True in most cases
        return True

## data format
the data format doesn't seem to be that complex. the algo examples that use target['spy'] are probably referring to row['col_name'].
- on every row, it runs the algos and decides what to do based on True/False. how does it determine the buy/sell?

## data passing
In order to pass data between different Algos, the Strategy has two properties: temp and perm. They are both dictionaries and are used for storing data generated by Algos. Temporary data is refreshed on each data change whereas permanent data is not altered.

Algos usually set and/or require values in the temp or perm objects. For example, the bt.algos.WeighEqually Algo sets the ‘weights’ key in temp, and it requires the ‘selected’ key in temp.

For example, let’s take a simple select -> weight -> allocate logic chain. We would break this strategy up into 3 Algos:

- selection: Which securities do I want to allocate capital to out of the entire universe of investable assets?
- weighting: How much weight should each of the selected securities have in the target portfolio?
- allocate: Close out positions that are no longer needed and allocate capital to those that were selected and given target weights.

In this case, the selection Algo could set the ‘selected’ key in the strategy’s temp dict, and the weighting Algo could read those values and in turn set the ‘weights’ key in the temp dict. The allocation Algo would then read the ‘weights’ and act accordingly.
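
A minimal sketch of that select -> weight -> allocate hand-off through `temp`, written without bt so it runs on its own. `FakeTarget`, `select_with_prefix`, `weigh_equally` and `allocate` are hypothetical stand-ins invented for this example, not bt classes, and the "starts with g" selection rule is arbitrary.

```python
# Sketch of select -> weigh -> allocate passing data through `temp`.
# FakeTarget and the three algos are hypothetical stand-ins, not bt classes.

class FakeTarget:
    def __init__(self, universe):
        self.universe = universe  # every asset we could invest in
        self.temp = {}            # scratch space shared by the algos
        self.allocations = {}     # final target weights


def select_with_prefix(prefix):
    # selection: writes temp['selected']
    def algo(target):
        target.temp["selected"] = [a for a in target.universe if a.startswith(prefix)]
        return True
    return algo


def weigh_equally(target):
    # weighting: reads temp['selected'], writes temp['weights']
    selected = target.temp["selected"]
    n = len(selected)
    target.temp["weights"] = {a: 1.0 / n for a in selected} if n else {}
    return True


def allocate(target):
    # allocation: reads temp['weights'] and acts on it
    target.allocations = dict(target.temp["weights"])
    return True


target = FakeTarget(["aapl", "msft", "c", "gs", "ge"])
for algo in (select_with_prefix("g"), weigh_equally, allocate):
    if not algo(target):
        break

print(target.allocations)  # {'gs': 0.5, 'ge': 0.5}
```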

In [None]:
class WeighEqually(bt.Algo): # copied from bt.algos; `bt` is imported in the first cell
    """
    Sets temp['weights'] by calculating equal weights for all items in
    selected.

    Equal weight Algo. Sets the 'weights' to 1/n for each item in 'selected'.
    
    Sets:
        * weights
    Requires:
        * selected
    """
    def __init__(self):
        super(WeighEqually, self).__init__()

    def __call__(self, target):
        selected = target.temp['selected']
        n = len(selected)

        if n == 0:
            target.temp['weights'] = {}
        else:
            w = 1.0 / n
            target.temp['weights'] = {x: w for x in selected}

        return True


In [3]:
data = bt.get('aapl,msft,c,gs,ge', start='2010-01-01')

In [5]:
data.head(10)

Date,aapl,msft,c,gs,ge
2010-01-04,6.604801,24.168472,30.533976,147.920776,10.733057
2010-01-05,6.616219,24.176279,31.701447,150.535919,10.788632
2010-01-06,6.51098,24.027906,32.68932,148.929138,10.733057
2010-01-07,6.498945,23.778025,32.779129,151.843475,11.28881
2010-01-08,6.54215,23.942017,32.240284,148.971909,11.531955
2010-01-11,6.484439,23.637472,32.599525,146.621704,11.643106
2010-01-12,6.410679,23.481291,31.611656,143.425323,11.650054
2010-01-13,6.501104,23.69994,31.432041,144.493607,11.691734
2010-01-14,6.463451,24.176279,31.521839,144.03212,11.60142
2010-01-15,6.355436,24.098186,30.713583,141.194717,11.420806


# 'backtesting' library

In [6]:
import backtesting



In [7]:
# Example OHLC daily data for Google Inc.
from backtesting.test import GOOG

GOOG.tail()

Date,Open,High,Low,Close,Volume
2013-02-25,802.3,808.41,790.49,790.77,2303900
2013-02-26,795.0,795.95,784.4,790.13,2202500
2013-02-27,794.8,804.75,791.11,799.78,2026100
2013-02-28,801.1,806.99,801.03,801.2,2265800
2013-03-01,797.8,807.14,796.15,806.19,2175400


In [8]:
# simple moving-average + crossover strategy
def SMA(values, n):
    """
    Return simple moving average of `values`, at
    each step taking into account `n` previous values.
    """
    return pd.Series(values).rolling(n).mean()

A new strategy needs to extend Strategy class and override its two abstract methods: init() and next().

Method init() is invoked before the strategy is run. Within it, one ideally precomputes in efficient, vectorized manner whatever indicators and signals the strategy depends on.

Method next() is then iteratively called by the Backtest instance, once for each data point (data frame row), simulating the incremental availability of each new full candlestick bar.

Note, backtesting.py cannot make decisions / trades within candlesticks — any new orders are executed on the next candle's open (or the current candle's close if trade_on_close=True). If you find yourself wishing to trade within candlesticks (e.g. daytrading), you instead need to begin with more fine-grained (e.g. hourly) data.
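
The execution rule in the note above, sketched in plain Python. `fill_price` is a hypothetical helper written for this example, not a backtesting.py function; the two bars are the last two GOOG rows from the sample output. Returning `None` when there is no next bar is a choice made for this sketch.

```python
# Sketch of when an order placed during bar i gets filled.
# `fill_price` is a hypothetical helper, not part of backtesting.py.

def fill_price(bars, i, trade_on_close=False):
    """Price at which an order placed while processing bar i would fill."""
    if trade_on_close:
        return bars[i]["Close"]   # filled at the current bar's close
    if i + 1 >= len(bars):
        return None               # no next bar in this sketch, so no fill
    return bars[i + 1]["Open"]    # filled at the next bar's open


# last two rows of the GOOG sample
bars = [
    {"Open": 801.1, "Close": 801.2},   # 2013-02-28
    {"Open": 797.8, "Close": 806.19},  # 2013-03-01
]

print(fill_price(bars, 0))                       # 797.8 (next bar's open)
print(fill_price(bars, 0, trade_on_close=True))  # 801.2 (current bar's close)
```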

In [None]:
from backtesting import Strategy
from backtesting.lib import crossover

class SmaCross(Strategy):
    # Define the two MA lags as *class variables*
    # for later optimization
    n1 = 10
    n2 = 20
    
    def init(self):
        # Precompute the two moving averages
        self.sma1 = self.I(SMA, self.data.Close, self.n1)
        self.sma2 = self.I(SMA, self.data.Close, self.n2)
    
    def next(self): # these are the methods i want for "if ppa".
        # If sma1 crosses above sma2, close any existing
        # short trades, and buy the asset
        if crossover(self.sma1, self.sma2):
            self.position.close()
            self.buy()

        # Else, if sma1 crosses below sma2, close any existing
        # long trades, and sell the asset
        elif crossover(self.sma2, self.sma1):
            self.position.close()
            self.sell()
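
The `crossover` check used in `next()` boils down to comparing the last two values of each series. A plain-Python sketch of that idea, assuming "crossed" means below on the previous bar and above on the latest one; `crossed_above` is a hypothetical name and this is not the library's source.

```python
# Sketch of a "just crossed above" test on two price-like sequences.
# `crossed_above` is a hypothetical helper, not backtesting.lib.crossover.

def crossed_above(a, b):
    """True if a was below b on the previous bar and is above b on the latest bar."""
    if len(a) < 2 or len(b) < 2:
        return False  # need two bars to see a cross
    return a[-2] < b[-2] and a[-1] > b[-1]


fast = [9.0, 10.0, 12.0]
slow = [10.0, 11.0, 11.0]

print(crossed_above(fast, slow))  # True: fast went from below slow to above it
print(crossed_above(slow, fast))  # False
```

Swapping the arguments, as the `elif` branch does, tests for the cross in the other direction.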

In [10]:
df = pd.read_csv('historical_crypto_btc.csv')

In [12]:
for col in list(df.columns):
    if 'alert' in col:
        print(col)

5min_long_alert
5min_short_alert
30min_long_alert
30min_short_alert
60min_long_alert
60min_short_alert


In [13]:
df.columns

Index(['asset_class', 'asset_name', 'timestamp', 'low', 'high', 'open',
       'close', '5min_long_alert', '5min_long_10pct_conf',
       '5min_long_25pct_conf', '5min_long_50pct_conf', '5min_long_75pct_conf',
       '5min_short_alert', '5min_short_10pct_conf', '5min_short_25pct_conf',
       '5min_short_50pct_conf', '5min_short_75pct_conf', '30min_long_alert',
       '30min_long_10pct_conf', '30min_long_25pct_conf',
       '30min_long_50pct_conf', '30min_long_75pct_conf', '30min_short_alert',
       '30min_short_10pct_conf', '30min_short_25pct_conf',
       '30min_short_50pct_conf', '30min_short_75pct_conf', '60min_long_alert',
       '60min_long_10pct_conf', '60min_long_25pct_conf',
       '60min_long_50pct_conf', '60min_long_75pct_conf', '60min_short_alert',
       '60min_short_10pct_conf', '60min_short_25pct_conf',
       '60min_short_50pct_conf', '60min_short_75pct_conf'],
      dtype='object')

In [17]:
st = '60min_short_50pct_conf'
ind = st.index("min") # 2
s = st[:ind]
s

'60'
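
The same slicing idea can pull every part out of a column name in one go. A sketch using a regular expression; `parse_signal_column` is a hypothetical helper, and the pattern is inferred only from the column names listed in `df.columns`.

```python
import re

# Sketch: split a name such as '60min_short_50pct_conf' into its parts.
# The pattern is an assumption based on the column names seen in df.columns.
_PATTERN = re.compile(r"^(\d+)min_(long|short)_(alert|(\d+)pct_conf)$")


def parse_signal_column(name):
    """Return (minutes, direction, confidence_pct), or None if the name
    does not look like a signal column. confidence_pct is None for
    '*_alert' columns."""
    m = _PATTERN.match(name)
    if m is None:
        return None
    minutes, direction, _, pct = m.groups()
    return int(minutes), direction, (int(pct) if pct is not None else None)


print(parse_signal_column("60min_short_50pct_conf"))  # (60, 'short', 50)
print(parse_signal_column("5min_long_alert"))         # (5, 'long', None)
print(parse_signal_column("close"))                   # None
```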

- setting up logic is tricky here
- if ppa does not exist:
    - do nothing
- else: # ppa exists
    - if direction=='long':
        - buy, sell when

In [None]:
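def ppa(row, alert_names):
    # checks row to see if a ppa is present or not.
    # returns None if no alert fires on this row, otherwise a dict with the
    # direction, length and confidence values of the first alert that fires
    for alert in alert_names:
        if row[alert] == 1: # if ppa begins
            ind = alert.index('min') # 1 or 2, for 5,30,60
            length = int(alert[:ind]) # 5, 30 or 60 minutes
            # direction
            direction = 'long' if 'long' in alert else 'short'
            # store the four confidence values for this alert,
            # e.g. '5min_long_alert' -> '5min_long_10pct_conf', ...
            prefix = alert[:-len('alert')]
            conf = {pct: row[f'{prefix}{pct}pct_conf'] for pct in (10, 25, 50, 75)}
            return {'direction': direction, 'length': length, 'conf': conf}
    return None # no alert column was set on this row

# alert columns present in df (the six names printed earlier)
alert_names = [c for c in df.columns if c.endswith('_alert')]

class ppa_thresh(Strategy): # self.data accesses dataframe
    bars = 15
    bar_length = 1

    def init(self):
        # no precomputation, since our df has signals baked-in
        pass

    def next(self):
        # self.data.df is assumed here to be the data revealed so far,
        # so its last row is the current bar
        signal = ppa(self.data.df.iloc[-1], alert_names)
        if signal is None:
            return # no ppa: do nothing
        # TODO: entry/exit rules for signal['direction'] are still undecided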

# from scratch
maybe if i try writing it myself, i'll learn the critical aspects of coding it in these backtesting libraries.

it'll be far from a finished product, without all the bells and whistles (graphs, metrics) built into the libraries, but it's a good start for logic.

In [19]:
df.head()

,asset_class,asset_name,timestamp,low,high,open,close,5min_long_alert,5min_long_10pct_conf,5min_long_25pct_conf,...,60min_long_alert,60min_long_10pct_conf,60min_long_25pct_conf,60min_long_50pct_conf,60min_long_75pct_conf,60min_short_alert,60min_short_10pct_conf,60min_short_25pct_conf,60min_short_50pct_conf,60min_short_75pct_conf
0,crypto,btc,2017-01-01 0:00,973.35,973.4,973.37,973.39,0,,,...,0.0,,,,,0.0,,,,
1,crypto,btc,2017-01-01 0:05,970.95,973.39,973.35,972.62,0,,,...,,,,,,,,,,
2,crypto,btc,2017-01-01 0:10,970.42,971.94,971.94,970.42,0,,,...,,,,,,,,,,
3,crypto,btc,2017-01-01 0:15,969.9,971.18,970.42,969.9,0,,,...,,,,,,,,,,
4,crypto,btc,2017-01-01 0:20,969.94,970.77,970.09,970.77,0,,,...,,,,,,,,,,


In [26]:
class portfolio:
    # store a current_balance of BTC and USD + bitcoin_price (relative USD)
    def __init__(self, usd, btc, price):
        self.usd = usd
        self.btc = btc
        self.price = price
    
    # buy/sell equations. the mirrored n/ n* is sort of elegant, didn't plan that one
    def buy(self, n): # convert $n USD to BTC
        self.usd -= n # spend $n
        self.btc += (n/self.price) # to purchase BTC
        
    def sell(self, n): # convert n BTC to USD (both are stored in their own units)
        self.usd += (n*self.price) # purchase USD
        self.btc -= n # by spending n BTC
    
p = portfolio(2000, 0, 1000) # $2000 USD starting, no BTC. 1btc / $1000 USD
p.buy(1000)
p.btc

1.0

- if BTC_price is 1000, and you buy \$100 USD worth of BTC
    - you get 1/10th of a bitcoin. how do we math this?
    - purchased_BTC = spent_USD / BTC_price = 100 / 1000 = 1/10
- if BTC_price is 1000, and you have one bitcoin and sell half of it, you get \$500 USD
    - purchased_USD = spent_BTC * BTC_price
    - 500 = 0.5 (BTC) * 1000
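
The two formulas from the notes above, checked numerically. `usd_to_btc` and `btc_to_usd` are hypothetical helper names used only for this check; fees and slippage are ignored.

```python
# Worked check of the two conversion formulas.
# usd_to_btc / btc_to_usd are hypothetical helper names.

def usd_to_btc(spent_usd, btc_price):
    return spent_usd / btc_price   # purchased_BTC = spent_USD / BTC_price


def btc_to_usd(spent_btc, btc_price):
    return spent_btc * btc_price   # purchased_USD = spent_BTC * BTC_price


price = 1000
print(usd_to_btc(100, price))  # 0.1 BTC for $100
print(btc_to_usd(0.5, price))  # 500.0 USD for half a bitcoin

# at a fixed price the two conversions undo each other (up to float rounding)
round_trip = btc_to_usd(usd_to_btc(100, price), price)
print(abs(round_trip - 100) < 1e-9)  # True
```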

In [24]:
p = portfolio(1000, 0, df.iloc[0]['close']) # usd, btc, price
p.price

973.39