# MLBot Tutorial

This tutorial demonstrates how to create a trading bot using machine learning.
We'll use GMO Coin's BTC/JPY leveraged trading data for this example.

In [None]:
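import joblib
import pandas as pd
# Assumed import path for GmoFetcher (the crypto_data_fetcher package); adjust if yours differs
from crypto_data_fetcher.gmo import GmoFetcher

# On-disk cache passed to the fetcher so repeated runs can reuse downloaded data
memory = joblib.Memory('/tmp/gmo_fetcher_cache', verbose=0)
fetcher = GmoFetcher(memory=memory)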

# Fetch GMO Coin's BTC/JPY leveraged trading data (https://api.coin.z.com/data/trades/BTC_JPY/)
# First download may take some time
df = fetcher.fetch_ohlcv(
    market='BTC_JPY', # Specify market symbol
    interval_sec=15 * 60, # Specify interval in seconds. In this case, 15-minute intervals
)

# Limit the data period for our experiment
df = df[df.index < pd.to_datetime('2021-04-01 00:00:00Z')]

display(df)
df.to_pickle('df_ohlcv.pkl')

## Adding Maker Fee Column

The following code adds a maker fee column (`fee`).
GMO Coin has changed its fees several times,
so accurate backtesting requires the fee in effect at each timestamp.
This tutorial places only limit (maker) orders, so we add only the maker fee.

We manually collected the timing and value of each fee change from GMO Coin's past news announcements.
The fee changes appear to have taken effect after regular maintenance,
which runs from 15:00 to 16:00 JST (06:00 to 07:00 UTC).
The `changed_at` values in the fee history use 06:00 UTC, the start of that window.
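
As a quick check of the JST to UTC conversion above, here is a minimal sketch using pandas time zones. The variable names are illustrative only, and the date is just one of the fee-change dates used in this tutorial.

```python
import pandas as pd

# Regular maintenance window in Japan Standard Time (from the text above)
maintenance_start_jst = pd.Timestamp('2020-08-05 15:00:00', tz='Asia/Tokyo')
maintenance_end_jst = pd.Timestamp('2020-08-05 16:00:00', tz='Asia/Tokyo')

# JST is UTC+9 with no daylight saving time, so 15:00 JST is 06:00 UTC
maintenance_start_utc = maintenance_start_jst.tz_convert('UTC')
maintenance_end_utc = maintenance_end_jst.tz_convert('UTC')

print(maintenance_start_utc)  # 2020-08-05 06:00:00+00:00
print(maintenance_end_utc)    # 2020-08-05 07:00:00+00:00
```

Keeping every timestamp in UTC, as the OHLCV index and the `changed_at` strings do, avoids mixing time zones when comparing against `df.index`.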

In [None]:
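import matplotlib.pyplot as plt
import pandas as pd

# Maker fee changes in chronological order (a negative fee is a rebate paid to the maker)
maker_fee_history = [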
    {
        # https://coin.z.com/jp/news/2020/08/6482/
        # Time not specified in announcement, assuming after regular maintenance
        'changed_at': '2020/08/05 06:00:00Z',
        'maker_fee': -0.00035
    },
    {
        # https://coin.z.com/jp/news/2020/08/6541/
        'changed_at': '2020/09/09 06:00:00Z',
        'maker_fee': -0.00025
    },
    {
        # https://coin.z.com/jp/news/2020/10/6686/
        'changed_at': '2020/11/04 06:00:00Z',
        'maker_fee': 0.0
    },
]

df = pd.read_pickle('df_ohlcv.pkl')

# Initial fee
# https://web.archive.org/web/20180930223704/https://coin.z.com/jp/corp/guide/fees/
df['fee'] = 0.0

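# Apply each change to every row at or after its effective time.
# Entries are in chronological order, so later changes overwrite earlier ones.
for config in maker_fee_history:
    df.loc[pd.to_datetime(config['changed_at']) <= df.index, 'fee'] = config['maker_fee']

df['fee'].plot()
plt.title('Maker Fee History')
plt.show()

display(df)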
df.to_pickle('df_ohlcv_with_fee.pkl')

## Feature Engineering

The following code uses [TA-Lib](https://mrjbq7.github.io/ta-lib/) to compute technical indicators as features.
We have not selected these features for their meaning;
we simply add all available features from TA-Lib.
Even so, a few points need care:

### Important Considerations for Features

#### Avoid Using Future Information

Future information is not available in live trading.
A feature that includes it (look-ahead bias) often improves prediction accuracy dramatically,
which produces backtest results that cannot be reproduced live.

For example, if we use a simple moving average (SMA), we need to be careful about the calculation method.
Some implementations calculate the average of n data points centered on the current time,
which means they use future data points. We must use only past data points for our calculations.
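
The difference between the two SMA variants can be sketched with pandas. This is a toy illustration, not the feature code used in this tutorial: the price series is made up, and `rolling(..., center=True)` stands in for any implementation that centers the window on the current time.

```python
import pandas as pd

# Hypothetical toy price series, for illustration only
close = pd.Series([1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0])

# Trailing SMA: the value at position t averages t-2, t-1 and t (past and present only)
sma_trailing = close.rolling(3).mean()

# Centered SMA: the value at position t averages t-1, t and t+1 (uses one future bar)
sma_centered = close.rolling(3, center=True).mean()

# Leak test: changing a later price must not change a feature at an earlier position
close_changed = close.copy()
close_changed.iloc[4] = 100.0  # modify the bar at position 4

trailing_changed = close_changed.rolling(3).mean()
centered_changed = close_changed.rolling(3, center=True).mean()

# The trailing SMA at position 3 is unaffected by the change at position 4
print(sma_trailing.iloc[3] == trailing_changed.iloc[3])  # True

# The centered SMA at position 3 changes, so it depends on future data
print(sma_centered.iloc[3] == centered_changed.iloc[3])  # False
```

The same perturbation test can be applied to any feature: modify the data after some time t and check that feature values up to t stay the same.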

#### Handle Missing Values

Technical indicators often create missing values (NaN).
For example, when calculating a 20-period moving average,
the first 19 values will be NaN because there isn't enough historical data.

We need to handle these missing values appropriately.
In this tutorial, we'll simply remove rows with missing values.
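
A minimal sketch of that approach on made-up data is shown here. The frame `df_toy` and the column name `sma20` are hypothetical, and pandas' rolling mean is used in place of a TA-Lib indicator.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data: 25 bars with closing prices 1.0, 2.0, ..., 25.0
df_toy = pd.DataFrame({'close': np.arange(1.0, 26.0)})

# A 20-period moving average has no value until 20 bars are available
df_toy['sma20'] = df_toy['close'].rolling(20).mean()

# Count the missing values, then drop every row that contains a NaN
n_missing = int(df_toy['sma20'].isna().sum())
df_clean = df_toy.dropna()

print(n_missing)      # 19
print(len(df_clean))  # 6
```

Dropping rows costs only the warm-up period at the start of the data, which is small compared with the full history.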

#### Feature Scaling

Different features can have vastly different scales.
For example, the BTC price is in the millions of JPY,
while RSI values range from 0 to 100.

Many machine learning algorithms (for example, linear models and neural networks) perform better when features are on similar scales.
Therefore, we'll normalize our features.
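
One possible way to normalize is a rolling z-score, sketched below on synthetic data. The text does not specify a method, so this choice is an assumption. The helper `rolling_zscore`, the window of 100 bars, and the random data are all hypothetical. A rolling window is used so that each scaled value depends only on current and past bars, in line with the rule about future information.

```python
import numpy as np
import pandas as pd

# Synthetic features on very different scales (illustration only)
rng = np.random.default_rng(0)
n = 500
features = pd.DataFrame({
    'close': 5_000_000 + rng.normal(0, 50_000, n).cumsum(),  # price-like, millions of JPY
    'rsi': rng.uniform(0, 100, n),                           # oscillator-like, 0 to 100
})


def rolling_zscore(df, window):
    """Scale each column by the mean and std of its trailing window (no future data)."""
    mean = df.rolling(window).mean()
    std = df.rolling(window).std()
    return (df - mean) / std


# The first window - 1 rows have no full window yet, so they are NaN and get dropped
scaled = rolling_zscore(features, window=100).dropna()

print(features.describe().loc[['mean', 'std']])
print(scaled.describe().loc[['mean', 'std']])
```

Standardizing with the mean and standard deviation of the whole dataset would be simpler, but those statistics include data from after each row's timestamp.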