# Hackathon Starter Guide

This notebook is a launchpad and guide for building a first, basic algorithmic trading strategy. You are not limited to this framework, but you are encouraged to use it as a basis for getting accustomed to the components that make up a trading strategy.

In order to develop an algorithmic trading strategy the following high-level components need to be considered:

 - *Data* - Acquire data on the financial asset under test
 - *Strategy* - Develop a profitable strategy
 - *Backtesting* - Backtest your strategy using a framework such as Backtrader or Backtesting.py
 - *Performance Metrics* - Validate the feasibility of the strategy against metrics such as CAGR, drawdown, etc.
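
To make the last item concrete, the sketch below computes CAGR and maximum drawdown from a made-up equity curve. The function names `cagr` and `max_drawdown` and the sample values are illustrative assumptions, and Backtesting.py's own statistics may annualize differently.

```python
def cagr(start_value, end_value, years):
    """Compound annual growth rate as a fraction (0.10 means 10% per year)."""
    return (end_value / start_value) ** (1 / years) - 1


def max_drawdown(equity):
    """Largest peak-to-trough decline as a negative fraction."""
    peak = equity[0]
    worst = 0.0
    for value in equity:
        peak = max(peak, value)                    # highest equity seen so far
        worst = min(worst, (value - peak) / peak)  # decline from that peak
    return worst


# Hypothetical equity curve, one value per year (assumption for illustration)
equity = [100.0, 120.0, 90.0, 110.0, 150.0]

print(f"CAGR: {cagr(equity[0], equity[-1], years=4):.2%}")
print(f"Max drawdown: {max_drawdown(equity):.2%}")
```

A strategy with a high CAGR but a deep drawdown may still be impractical to trade, which is why the two metrics are usually read together.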


As a reference, we will follow a [simple example](https://kernc.github.io/backtesting.py/doc/examples/Quick%20Start%20User%20Guide.html) using Backtesting.py.


Make sure you have installed the Backtesting.py package (preferably in a clean Python 3.9 virtual environment) by running the following:

`pip install backtesting`

## Data
Let's get some data for our strategy. Backtesting.py ships with some sample data we can use for testing.

You can bring your own data on various financial instruments (stocks, forex, crypto, etc.) as a `pandas.DataFrame` with columns `Open`, `High`, `Low`, `Close` and (optionally) `Volume`. The DataFrame should ideally have a datetime index (convert it with `pd.to_datetime()`).
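
As a rough sketch of that preparation step, the example below loads a small inline CSV, renames its columns to the capitalized names Backtesting.py expects, and builds a datetime index. The lowercase column names, the `timestamp` column and the sample rows are assumptions made for illustration; adapt them to your own data source.

```python
import io

import pandas as pd

# Hypothetical raw data; in practice this would come from your own file or API
raw_csv = io.StringIO(
    "timestamp,open,high,low,close,volume\n"
    "2023-01-03,1.0670,1.0683,1.0520,1.0548,1000\n"
    "2023-01-02,1.0702,1.0710,1.0650,1.0667,1200\n"
    "2023-01-04,1.0548,1.0635,1.0540,1.0604,900\n"
)

df = pd.read_csv(raw_csv)

# Rename to the column names expected by Backtesting.py
df = df.rename(columns={
    "open": "Open", "high": "High", "low": "Low",
    "close": "Close", "volume": "Volume",
})

# Parse the timestamps and use them as a chronologically sorted index
df["timestamp"] = pd.to_datetime(df["timestamp"])
df = df.set_index("timestamp").sort_index()

print(df)
```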

For a more substantial test, you can download and use forex data provided by the Spatialedge team; reach out to one of the mentors to obtain it. See also the helper function at the bottom of this notebook for reading Parquet data.

In [7]:
# Example OHLC daily data for Google Inc.
from backtesting.test import GOOG

GOOG.tail()

| | Open | High | Low | Close | Volume |
|---|---|---|---|---|---|
| 2013-02-25 | 802.3 | 808.41 | 790.49 | 790.77 | 2303900 |
| 2013-02-26 | 795.0 | 795.95 | 784.4 | 790.13 | 2202500 |
| 2013-02-27 | 794.8 | 804.75 | 791.11 | 799.78 | 2026100 |
| 2013-02-28 | 801.1 | 806.99 | 801.03 | 801.2 | 2265800 |
| 2013-03-01 | 797.8 | 807.14 | 796.15 | 806.19 | 2175400 |


## Strategy

We can now devise a basic strategy by using a simple moving average cross-over.

A new strategy needs to extend the `Strategy` class and override its two abstract methods: `init()` and `next()`.

 - Method `init()` is invoked before the strategy is run. Within it, one ideally precomputes, in an efficient, vectorized manner, whatever indicators and signals the strategy depends on.

 - Method `next()` is then called iteratively by the `Backtest` instance, once for each data point (data frame row), simulating the incremental availability of each new full candlestick bar.

Note that backtesting.py cannot make decisions or trades within candlesticks: any new orders are executed on the next candle's open (or the current candle's close if `trade_on_close=True`). If you wish to trade within candlesticks (e.g. day trading), you need to begin with more fine-grained (e.g. hourly) data.
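
The execution rule above can be illustrated with a toy function. This is a simplified sketch, not Backtesting.py's implementation: `fill_price` is a hypothetical helper, it ignores commission, and it uses the last five GOOG rows shown earlier.

```python
def fill_price(opens, closes, signal_bar, trade_on_close=False):
    """Price at which an order created on `signal_bar` would be filled.

    Returns None when there is no next candle to fill on.
    """
    if trade_on_close:
        return closes[signal_bar]   # filled at the current candle's close
    next_bar = signal_bar + 1
    if next_bar >= len(opens):
        return None                 # data ended before the order could fill
    return opens[next_bar]          # filled at the next candle's open


# Open and Close columns from GOOG.tail()
opens = [802.3, 795.0, 794.8, 801.1, 797.8]
closes = [790.77, 790.13, 799.78, 801.2, 806.19]

# A signal seen on 2013-02-27 (index 2)
print(fill_price(opens, closes, 2))                       # 801.1
print(fill_price(opens, closes, 2, trade_on_close=True))  # 799.78
```

The gap between the two prices shows why results can differ depending on the `trade_on_close` setting, especially on daily data.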

In [8]:
import pandas as pd


def SMA(values, n):
    """
    Return simple moving average of `values`, at
    each step taking into account `n` previous values.
    """
    return pd.Series(values).rolling(n).mean()
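
Note that a rolling mean has a warm-up period: the first `n - 1` values are `NaN` because a full window is not yet available. A small check, repeating the `SMA` definition above so the snippet is self-contained:

```python
import pandas as pd


def SMA(values, n):
    """Simple moving average over a window of `n` values."""
    return pd.Series(values).rolling(n).mean()


sma = SMA([1, 2, 3, 4, 5], 3)
print(sma.tolist())  # [nan, nan, 2.0, 3.0, 4.0]
```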

In [9]:
from backtesting import Strategy
from backtesting.lib import crossover


class SmaCross(Strategy):
    # Define the two MA lags as *class variables*
    # for later optimization
    n1 = 10
    n2 = 20
    
    def init(self):
        # Precompute the two moving averages
        self.sma1 = self.I(SMA, self.data.Close, self.n1)
        self.sma2 = self.I(SMA, self.data.Close, self.n2)
    
    def next(self):
        # If sma1 crosses above sma2, close any existing
        # short trades, and buy the asset
        if crossover(self.sma1, self.sma2):
            self.position.close()
            self.buy()

        # Else, if sma1 crosses below sma2, close any existing
        # long trades, and sell the asset
        elif crossover(self.sma2, self.sma1):
            self.position.close()
            self.sell()
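
To make the `crossover` condition concrete, here is a plain-Python sketch of the idea: the first series was below the second on the previous bar and is above it on the current bar. `crossed_above` is a hypothetical stand-in written for illustration, not the library function itself.

```python
def crossed_above(series1, series2):
    """True if `series1` just crossed above `series2` on the latest bar."""
    if len(series1) < 2 or len(series2) < 2:
        return False  # need both a previous and a current bar
    return series1[-2] < series2[-2] and series1[-1] > series2[-1]


fast = [9.0, 9.5, 10.5]    # e.g. a short moving average
slow = [10.0, 10.0, 10.0]  # e.g. a long moving average

print(crossed_above(fast, slow))  # True: fast moved from below to above
print(crossed_above(slow, fast))  # False: slow did not cross above fast
```

Swapping the arguments, as `SmaCross.next()` does in its `elif` branch, checks for a cross in the opposite direction.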

## Backtesting

Let's see how our strategy performs on historical Google data. The `Backtest` instance is initialized with OHLC data and a strategy class (see the API reference for additional options). We begin with 10,000 units of cash and set the broker's commission to a realistic 0.2%.

The `Backtest.run()` method returns a pandas Series of simulation results and statistics for our strategy. In the output below, this simple strategy returns roughly 580% over about 8.5 years (3116 days), which is less than the 703% buy-and-hold return, with a maximum drawdown of 33% and a longest drawdown period of almost two years (688 days).

The `Backtest.plot()` method provides the same insights in a more visual form.

In [10]:
from backtesting import Backtest

bt = Backtest(GOOG, SmaCross, cash=10_000, commission=.002)
stats = bt.run()
stats

Start                     2004-08-19 00:00:00
End                       2013-03-01 00:00:00
Duration                   3116 days 00:00:00
Exposure Time [%]                   97.067039
Equity Final [$]                  68221.96986
Equity Peak [$]                   68991.21986
Return [%]                         582.219699
Buy & Hold Return [%]              703.458242
Return (Ann.) [%]                   25.266427
Volatility (Ann.) [%]               38.383008
Sharpe Ratio                         0.658271
Sortino Ratio                        1.288779
Calmar Ratio                         0.763748
Max. Drawdown [%]                  -33.082172
Avg. Drawdown [%]                   -5.581506
Max. Drawdown Duration      688 days 00:00:00
Avg. Drawdown Duration       41 days 00:00:00
# Trades                                   94
Win Rate [%]                        54.255319
Best Trade [%]                       57.11931
Worst Trade [%]                    -16.629898
Avg. Trade [%]                    

Next we will plot our backtest results.

Note: if you encounter the error `TypeError: bokeh.models.tools.Toolbar() got multiple values for keyword argument 'logo'`, downgrade `bokeh` by running `pip install bokeh==3.2.1`.

In [11]:
bt.plot()



## ML strategy

For a more sophisticated strategy that leverages machine learning, follow [this tutorial](https://kernc.github.io/backtesting.py/doc/examples/Trading%20with%20Machine%20Learning.html).



## Helper functions

You can find supplementary forex data at the following link: <TBC>

In [12]:
# Helper function to read parquet data into a DataFrame
import pandas as pd
import pyarrow.parquet as pq


def read_and_process_parquet(data_path, from_date, to_date, symbol, timeframe):
    # Build pyarrow filters as (column, operator, value) tuples so that only
    # the requested symbol, timeframe and date range are read
    partition = ['symbol', 'timeframe', 'date', 'date']
    operator = ['=', '=', '>=', '<=']
    params = [symbol, timeframe, from_date, to_date]

    dataset = pq.ParquetDataset(data_path, filters=list(zip(partition, operator, params)))
    table = dataset.read()
    df = table.to_pandas()

    # Combine the separate date and time columns into a single datetime index
    df['date'] = df['date'].astype(str)
    df['time'] = df['time'].astype(str)

    df['datetime'] = df['date'] + ' ' + df['time']
    df['datetime'] = pd.to_datetime(df['datetime'], format='%Y%m%d %H:%M:%S')
    df.set_index('datetime', inplace=True)

    # Drop the helper columns, sort chronologically and forward-fill gaps
    df.drop(['time', 'symbol', 'timeframe', 'date'], axis=1, inplace=True)
    df = df.sort_index()
    df = df.ffill()  # fillna(method='ffill') is deprecated in recent pandas

    return df


DATA_PATH='/path/to/local/data/' 
FROM_DATE='20200101'
TO_DATE='20221231'
SYMBOL = 'EURUSD'
TIMEFRAME = 'H1'

df = read_and_process_parquet(DATA_PATH, FROM_DATE, TO_DATE, SYMBOL, TIMEFRAME)

FileNotFoundError: /path/to/local/data
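
For reference, the `filters` argument built inside the helper is a list of `(column, operator, value)` tuples. The sketch below reproduces just that step with the example parameters so you can inspect it without any data on disk; it assumes the dataset is partitioned by `symbol`, `timeframe` and `date`, as the helper implies.

```python
# Same construction as in read_and_process_parquet, using the example values
partition = ['symbol', 'timeframe', 'date', 'date']
operator = ['=', '=', '>=', '<=']
params = ['EURUSD', 'H1', '20200101', '20221231']

filters = list(zip(partition, operator, params))

for column, op, value in filters:
    print(column, op, value)
```

pyarrow combines the tuples in a flat list like this with AND, so only rows matching the symbol, the timeframe and both date bounds are read.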