# Notebook Instructions

1. If you are new to Jupyter notebooks, please go through this introductory manual <a href='https://quantra.quantinsti.com/quantra-notebook' target="_blank">here</a>.
1. Any changes made in this notebook will be lost after you close the browser window. **You can download the notebook to save your work on your PC.**
1. Before running this notebook on your local PC:<br>
i.  You need to set up a Python environment and the relevant packages on your local PC. To do so, go through the section on "**Run Codes Locally on Your Machine**" in the course.<br>
ii. You need to **download the zip file available in the last unit** of this course. The zip file contains the data files and/or Python modules that might be required to run this notebook.

# Backtesting the ML Model To Predict Strategy

In the previous notebook, we predicted the strategy labels using the LSTM model. In this notebook, we will backtest those predicted labels and check the results.

The notebook is structured as follows:
1. [Import the Data](#import)
2. [Strategy Parameters](#parameters)
3. [Entry Condition](#entry)
4. [Exit Condition](#exit)
5. [Backtesting](#backtesting)

## Import Libraries

In [1]:
# For data manipulation
import numpy as np
import pandas as pd

# Datetime manipulation
from datetime import timedelta

# For plotting
import matplotlib.pyplot as plt

# To suppress warnings
import warnings 
warnings.filterwarnings('ignore')

<a id='import'></a>
## Import the Data

Import the following files:
1. `spx_eom_expiry_options_2010_2022.bz2` as `options_data`
2. `underlying_data_model_mlo.csv` as `underlying_data`
3. `predicted_labels_lstm_mlo.csv` as `pred_labels`
4. `strategies_combinations_mlo.csv` as `strategies`

These files are available in the zip file of the unit 'Python Codes and Data' in the 'Course Summary' section.

In [2]:
# Import options chain data and set the index
options_data = pd.read_pickle(
    "../data_modules/spx_eom_expiry_options_2010_2022.bz2")
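# Strip the literal square brackets from the column names
# (regex=False so that "[" is not parsed as a regular expression)
options_data.columns = options_data.columns.str.replace(
    "[", "", regex=False).str.replace("]", "", regex=False).str.strip()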
options_data.index = pd.to_datetime(options_data.index)
options_data['date'] = options_data.index

# Import the predicted labels of test data
pred_labels = pd.read_csv(
    '../data_modules/predicted_labels_lstm_mlo.csv', index_col=0)['0'].values

# Import the underlying data with option chain data and target variable
underlying_data = pd.read_csv(
    '../data_modules/underlying_data_model_mlo.csv', index_col='quote_date')

# Backtesting on test data
test_size = 484
underlying_asset_data = underlying_data[-test_size:]
underlying_asset_data = underlying_asset_data.dropna()

# Create a column 'pred_strategy' with predicted labels
underlying_asset_data.loc[:, 'pred_strategy'] = pred_labels
underlying_asset_data.index = pd.to_datetime(underlying_asset_data.index)
underlying_asset_data['index'] = underlying_asset_data.index

# Import the list of strategies
strategies = pd.read_csv(
    '../data_modules/strategies_combinations_mlo.csv', index_col='Unnamed: 0')

<a id='parameters'></a>
## Strategy Parameters

We will set both the stop-loss (SL) and the take-profit (TP) to 40% of the net entry premium. You can change these values to see how they affect the backtest results. If the SL is too tight, it will be hit too frequently; if it is too wide, it may never be hit at all.

The `days_to_exit_after_entry` is set to 3 since we want to hold positions for only a few days. Note that the backtest loop compares this value with the calendar days elapsed since entry, `(i-last_trade_date).days`, so weekends and holidays count towards the limit, and the exit triggers only once more than 3 days have elapsed.

In [3]:
config = {

    'stop_loss_percentage': 40,
    'take_profit_percentage': 40,
    'days_to_exit_after_entry': 3
}
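
To make the percentages concrete, here is a minimal sketch of how the SL and TP levels follow from the net entry premium. It uses the same formulas as the backtest loop further down; the helper names `sign` and `sl_tp_levels` and the two net premiums are illustrative assumptions, not part of the course code.

```python
# Same parameters as the notebook's config cell
config = {
    'stop_loss_percentage': 40,
    'take_profit_percentage': 40,
    'days_to_exit_after_entry': 3
}


def sign(x):
    """Return 1, -1 or 0 for a scalar (stand-in for np.sign)."""
    return (x > 0) - (x < 0)


def sl_tp_levels(net_premium, config):
    """Hypothetical helper: SL and TP levels for a given net entry premium."""
    premium_sign = sign(net_premium)
    sl = net_premium * (1 - config['stop_loss_percentage'] * premium_sign / 100)
    tp = net_premium * (1 + config['take_profit_percentage'] * premium_sign / 100)
    return sl, tp


# Made-up net premiums, one positive and one negative
for net_premium in (10.0, -10.0):
    sl, tp = sl_tp_levels(net_premium, config)
    print(f"Net premium: {net_premium} | SL: {sl:.1f} | TP: {tp:.1f}")
```

Multiplying the percentage by the sign of the net premium keeps the SL level below the entry value and the TP level above it, whether the net premium is positive or negative.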

<a id='entry'></a>
## Entry Condition

We will enter the trade as per the strategy labels predicted by the model, which are stored in the `pred_strategy` column of `underlying_asset_data`. For each predicted strategy, let's create a column named `<strategy>_signal` in `underlying_asset_data` that is 1 on the dates when the model predicts that strategy and 0 otherwise.

In [4]:
# Creating signal column for each strategy predicted by the model
for strategy_ in underlying_asset_data.pred_strategy.unique():
    underlying_asset_data[strategy_+'_signal'] = (
        underlying_asset_data.pred_strategy == strategy_).astype(int)
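
The effect of the loop is easier to see on a handful of labels. The sketch below applies the same idea to a plain Python list so that it runs without the course data. The label `strategy_6` appears in the backtest output, while `strategy_2` and the sequence itself are made up for illustration.

```python
# Toy sequence of predicted labels (illustrative values only)
pred_strategy = ['strategy_6', 'strategy_2', 'strategy_6',
                 'strategy_2', 'strategy_2']

# One 0/1 signal list per unique label, in first-seen order
signals = {}
for strategy_ in dict.fromkeys(pred_strategy):
    signals[strategy_ + '_signal'] = [
        1 if k == strategy_ else 0 for k in pred_strategy]

for name, values in signals.items():
    print(name, values)
```

Each date carries exactly one predicted label, so exactly one of the signal columns is 1 on any given date.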

<a id='exit'></a>
## Exit Condition

The trade will be exited if any one of the following conditions is met. The backtest checks them in this order:
1. The signal for the strategy is no longer 1
2. The net premium falls below the stop-loss level
3. The net premium rises above the take-profit level
4. The trade has been held for more than `days_to_exit_after_entry` days
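
These checks can be sketched as a single function. The version below follows the order of the `if`/`elif` chain in the backtest loop later in this notebook (signal, then stop-loss, then take-profit, then holding period). The function name `exit_reason` and the sample values are assumptions made for illustration.

```python
def exit_reason(signal, current_position, net_premium, sl, tp,
                days_held, max_days):
    """Hypothetical helper: return the exit type, or None to stay in the trade."""
    if signal != current_position:
        return 'Expiry or Signal Based'
    if net_premium < sl:
        return 'SL'
    if net_premium > tp:
        return 'TP'
    if days_held > max_days:
        return 'days'
    return None


# Made-up levels: entry net premium of 10 with 40% SL and TP
sl, tp = 6.0, 14.0

print(exit_reason(1, 1, 9.0, sl, tp, days_held=1, max_days=3))   # None
print(exit_reason(1, 1, 5.0, sl, tp, days_held=1, max_days=3))   # SL
print(exit_reason(1, 1, 15.0, sl, tp, days_held=1, max_days=3))  # TP
print(exit_reason(1, 1, 9.0, sl, tp, days_held=4, max_days=3))   # days
print(exit_reason(0, 1, 5.0, sl, tp, days_held=4, max_days=3))   # Expiry or Signal Based
```

Because the signal check comes first, a day on which the signal changes is reported as a signal-based exit even if the stop-loss or take-profit level was also breached on that day.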

<a id='backtesting'></a>
## Backtesting and Performance Analysis

We will loop over the dates in the data for each strategy in the list of predicted strategies. On each date, we set up the strategy when the entry conditions are met, exit when an exit condition is met, and record the completed trade in `round_trips_details`. The `mark_to_market` dataframe stores the premiums of the strategy legs on each date between the entry date and the exit date.

We will backtest the strategies using the following steps:

**Step-1**: Create dataframes `round_trips_details`, `trades` and `mark_to_market` for storing round trips, trades and mtm.

**Step-2**: Define a function `add_to_mtm` which stores daily mark_to_market values for the strategy. It takes the existing `mark_to_market` dataframe, `option_strategy` and `trading_date` as inputs.

**Step-3**: Define a function `get_premium` to get the premium for the multiple positions as per strategy. It takes `options_strategy` and `options_data` as inputs.

**Step-4**: Define a function `strategy_generator` to set up the strategies. It takes `options_data`, `strategies` and `underlying_asset_data` as inputs.

**Step-5**: Initialise `current_position`, `trade_num` (the number of trades) and `cum_pnl` to 0, and set the `exit_flag` to `False`.

**Step-6**: We also set the `start_date` for backtesting.

In [5]:
# Create dataframes for round trips, storing trades, and mtm
round_trips_details = pd.DataFrame()
trades = pd.DataFrame()
mark_to_market = pd.DataFrame()

# Function for calculating mtm
def add_to_mtm(mark_to_market, option_strategy, trading_date):
    option_strategy['Date'] = trading_date
    mark_to_market = pd.concat([mark_to_market, option_strategy])
    return mark_to_market

# Function for fetching premium
def get_premium(options_strategy, options_data):

    strike = options_strategy['Strike Price']
    option_type = options_strategy['Option Type']
    if option_type == 'call':
        return options_data[options_data['STRIKE'] == strike].C_LAST
    if option_type == 'put':
        return options_data[options_data['STRIKE'] == strike].P_LAST
    if option_type == 'stock':
        return options_data[options_data['STRIKE'] == strike].UNDERLYING_LAST
    return 0

# Function to create strategy
def strategy_generator(options_data_, strategies, underlying_asset_data):
    strategies = strategies.reset_index().drop(['index'], axis=1)
    for str_ in range(0, len(strategies)):

        n_pos = len([x for x in strategies.loc[str_].values[:3] if x != 0])
        strategy = pd.DataFrame(
            columns=['Option Type', 'Strike Price', 'position', 'premium'])
        atm_strike_price = underlying_asset_data.loc[options_data_.index[0]
                                                     ].atm_strike_price
        for leg in range(0, n_pos):
            contract = strategies.loc[str_][:3][strategies.loc[str_]
                                                [:3] != 0].index[leg]
            position = strategies.loc[str_][:3][strategies.loc[str_]
                                                [:3] != 0][leg]
            strategy.loc[leg] = [contract, atm_strike_price, position, np.nan]
            strategy['premium'] = strategy.apply(
                lambda r: get_premium(r, options_data_), axis=1)
        return strategy


# Initialise current position, the number of trades and cumulative pnl to 0
current_position = 0
trade_num = 0
cum_pnl = 0

# Set exit flag to False
exit_flag = False

# Creating dataframe backtest_data for further calculations
backtest_data = underlying_asset_data
backtest_data.index = backtest_data['index']
backtest_data = backtest_data.drop(['index'], axis=1)

# Set a start date for backtesting
start_date = backtest_data.index[0]

Perform the following steps iteratively for the dates in the backtest period for every strategy in the predicted labels.

**Step-7**: For a given date, if there is no open position and entry conditions are met, we will set up the strategy.

**Step-8**: For a given date, if there is an open position, we exit the trade if the signal changes, if the stop-loss/take-profit gets hit, or if the trade was held for more than `days_to_exit_after_entry` days, and then update the round trips.

**Step-9**: Finally, we calculate the pnl for the trade and also the cumulative pnl.
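
The holding-period check in Step-8 subtracts the entry date from the current date and reads the `.days` attribute, which counts calendar days. The short sketch below reproduces that arithmetic with the standard library's `date` type (the notebook itself works with pandas timestamps). The entry date and the second comparison date are taken from Trade No. 81 in the output below; the first comparison date is added only for contrast.

```python
from datetime import date

days_to_exit_after_entry = 3
entry_date = date(2021, 7, 2)

for current_date in (date(2021, 7, 5), date(2021, 7, 6)):
    # Calendar days elapsed since entry, as in (i-last_trade_date).days
    days_held = (current_date - entry_date).days
    exit_on_days = days_held > days_to_exit_after_entry
    print(current_date, days_held, exit_on_days)
```

Three elapsed days do not trigger the exit because the comparison is strictly greater than; the fourth day does.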

In [7]:
for strategy_name in backtest_data.pred_strategy.unique():
    current_position = 0
    print('*********', strategy_name)
    backtest_data['signal'] = backtest_data[strategy_name+'_signal']

    for i in backtest_data.loc[start_date:].index:

        if (current_position == 0) and (backtest_data.loc[i, 'signal'] != 0):

            # Step-7: Set up strategy (skip dates with no options data)
            try:
                options_data_daily = options_data.loc[i]
            except KeyError:
                continue

            # If signal is 1 we will set up the strategy
            if backtest_data.loc[i, 'signal'] == 1:
                strategy_cal = strategy_generator(
                    options_data_daily, strategies[strategies.strategy == strategy_name], backtest_data)
            else:
                continue

            # Populate the trades dataframe
            trades = strategy_cal.copy()
            trades['entry_date'] = i
            trades.rename(columns={'premium': 'entry_price'}, inplace=True)

            # Calculate net premium
            net_premium = round(
                (strategy_cal.position * strategy_cal.premium).sum(), 1)

            # Compute SL and TP for the trade
            premium_sign = np.sign(net_premium)
            sl = net_premium * \
                (1 - config['stop_loss_percentage']*premium_sign/100)
            tp = net_premium * \
                (1 + config['take_profit_percentage']*premium_sign/100)

            # Update current position
            current_position = backtest_data.loc[i, 'signal']
            last_trade_date = i

            # Update mark_to_market dataframe
            mark_to_market = add_to_mtm(mark_to_market, strategy_cal, i)

            # Increase number of trades by 1
            trade_num += 1
            print("-"*30)

            # Print trade details
            print(
                f"Trade No: {trade_num} | Entry | Date: {i} | Premium: {net_premium*-1} | Position: {current_position}")

        # Step-8: Manage the open position
        elif current_position != 0:

            # Update net premium
            try:
                options_data_daily = options_data.loc[i]
                strategy_cal['premium'] = strategy_cal.apply(
                    lambda r: get_premium(r, options_data_daily), axis=1)
            except:
                continue
            net_premium = (strategy_cal.position * strategy_cal.premium).sum()

            # Update mark_to_market dataframe
            mark_to_market = add_to_mtm(mark_to_market, strategy_cal, i)

            # Exit the trade if any of the exit condition is met
            if backtest_data.loc[i, 'signal'] != current_position:
                exit_type = 'Expiry or Signal Based'
                exit_flag = True

            elif net_premium < sl:
                exit_type = 'SL'
                exit_flag = True

            elif net_premium > tp:
                exit_type = 'TP'
                exit_flag = True

            elif (i-last_trade_date).days > config['days_to_exit_after_entry']:
                exit_type = 'days'
                exit_flag = True

            if exit_flag:

                # Check that the data is present for all strike prices on the exit date
                if strategy_cal.premium.isna().sum() > 0:
                    print(
                        f"Data missing for the required strike prices on {i}, Not adding to trade logs.")
                    current_position = 0
                    continue

                # Append the trades dataframe
                trades['exit_date'] = i
                trades['exit_type'] = exit_type
                trades['exit_price'] = strategy_cal.premium

                # Add the trade logs to round trip details
                round_trips_details = pd.concat([round_trips_details, trades])

                # Calculate net premium at exit
                net_premium = round(
                    (strategy_cal.position * strategy_cal.premium).sum(), 1)

                # Calculate net premium on entry
                entry_net_premium = (
                    trades.position * trades.entry_price).sum()

                # Step-9: Calculate pnl for the trade
                trade_pnl = round(net_premium - entry_net_premium, 1)

                # Calculate cumulative pnl
                cum_pnl += trade_pnl
                cum_pnl = round(cum_pnl, 1)

                # Print trade details
                print(
                    f"Trade No: {trade_num} | Exit Type: {exit_type} | Date: {i} | Premium: {net_premium} | PnL: {trade_pnl} | Cum PnL: {cum_pnl}")

                # Update current position to 0
                current_position = 0

                # Set exit flag to false
                exit_flag = False

********* strategy_6
------------------------------
Trade No: 1 | Entry | Date: 2020-10-08 00:00:00 | Premium: -3.6 | Position: 1
Trade No: 1 | Exit Type: Expiry or Signal Based | Date: 2020-10-09 00:00:00 | Premium: -27.5 | PnL: -31.1 | Cum PnL: -31.1
------------------------------
Trade No: 2 | Entry | Date: 2020-10-19 00:00:00 | Premium: -1.0 | Position: 1
Trade No: 2 | Exit Type: Expiry or Signal Based | Date: 2020-10-20 00:00:00 | Premium: -18.8 | PnL: -19.8 | Cum PnL: -50.9
------------------------------
Trade No: 3 | Entry | Date: 2020-12-28 00:00:00 | Premium: 1.2 | Position: 1
Trade No: 3 | Exit Type: Expiry or Signal Based | Date: 2020-12-29 00:00:00 | Premium: 6.4 | PnL: 7.6 | Cum PnL: -43.3
------------------------------
Trade No: 4 | Entry | Date: 2021-01-05 00:00:00 | Premium: 3.7 | Position: 1
Trade No: 4 | Exit Type: Expiry or Signal Based | Date: 2021-01-06 00:00:00 | Premium: -43.5 | PnL: -39.8 | Cum PnL: -83.1
------------------------------
Trade No: 5 | Entry | Date

Trade No: 78 | Exit Type: Expiry or Signal Based | Date: 2021-05-20 00:00:00 | Premium: -73.0 | PnL: -32.8 | Cum PnL: 306.3
------------------------------
Trade No: 79 | Entry | Date: 2021-06-10 00:00:00 | Premium: 40.9 | Position: 1
Trade No: 79 | Exit Type: Expiry or Signal Based | Date: 2021-06-11 00:00:00 | Premium: -38.5 | PnL: 2.4 | Cum PnL: 308.7
------------------------------
Trade No: 80 | Entry | Date: 2021-06-22 00:00:00 | Premium: 24.7 | Position: 1
Trade No: 80 | Exit Type: Expiry or Signal Based | Date: 2021-06-23 00:00:00 | Premium: -20.2 | PnL: 4.5 | Cum PnL: 313.2
------------------------------
Trade No: 81 | Entry | Date: 2021-07-02 00:00:00 | Premium: 50.2 | Position: 1
Trade No: 81 | Exit Type: days | Date: 2021-07-06 00:00:00 | Premium: -44.9 | PnL: 5.3 | Cum PnL: 318.5
------------------------------
Trade No: 82 | Entry | Date: 2021-07-08 00:00:00 | Premium: 55.7 | Position: 1
Trade No: 82 | Exit Type: SL | Date: 2021-07-12 00:00:00 | Premium: -88.9 | PnL: -33.2 |

------------------------------
Trade No: 115 | Entry | Date: 2022-07-22 00:00:00 | Premium: 46.9 | Position: 1
Trade No: 115 | Exit Type: Expiry or Signal Based | Date: 2022-07-26 00:00:00 | Premium: -25.0 | PnL: 21.9 | Cum PnL: 13.1
------------------------------
Trade No: 116 | Entry | Date: 2022-07-29 00:00:00 | Premium: 1.5 | Position: 1
Trade No: 116 | Exit Type: Expiry or Signal Based | Date: 2022-08-01 00:00:00 | Premium: -95.9 | PnL: -94.4 | Cum PnL: -81.3
------------------------------
Trade No: 117 | Entry | Date: 2022-08-15 00:00:00 | Premium: 64.1 | Position: 1
Trade No: 117 | Exit Type: Expiry or Signal Based | Date: 2022-08-16 00:00:00 | Premium: -61.4 | PnL: 2.7 | Cum PnL: -78.6
------------------------------
Trade No: 118 | Entry | Date: 2022-08-19 00:00:00 | Premium: 51.4 | Position: 1
Trade No: 118 | Exit Type: Expiry or Signal Based | Date: 2022-08-22 00:00:00 | Premium: -17.1 | PnL: 34.4 | Cum PnL: -44.2
------------------------------
Trade No: 119 | Entry | Date: 2

Trade No: 152 | Exit Type: TP | Date: 2021-01-07 00:00:00 | Premium: 58.6 | PnL: 60.0 | Cum PnL: 97.6
------------------------------
Trade No: 153 | Entry | Date: 2021-02-03 00:00:00 | Premium: -0.8 | Position: 1
Trade No: 153 | Exit Type: TP | Date: 2021-02-04 00:00:00 | Premium: 29.2 | PnL: 28.4 | Cum PnL: 126.0
------------------------------
Trade No: 154 | Entry | Date: 2021-02-05 00:00:00 | Premium: 4.8 | Position: 1
Trade No: 154 | Exit Type: TP | Date: 2021-02-08 00:00:00 | Premium: 14.5 | PnL: 19.2 | Cum PnL: 145.2
------------------------------
Trade No: 155 | Entry | Date: 2021-02-09 00:00:00 | Premium: 2.6 | Position: 1
Trade No: 155 | Exit Type: Expiry or Signal Based | Date: 2021-02-10 00:00:00 | Premium: -8.4 | PnL: -5.8 | Cum PnL: 139.4
------------------------------
Trade No: 156 | Entry | Date: 2021-02-11 00:00:00 | Premium: 6.9 | Position: 1
Trade No: 156 | Exit Type: TP | Date: 2021-02-12 00:00:00 | Premium: 13.9 | PnL: 20.8 | Cum PnL: 160.2
-------------------------

Trade No: 194 | Exit Type: TP | Date: 2021-10-19 00:00:00 | Premium: 27.9 | PnL: 28.9 | Cum PnL: 507.2
------------------------------
Trade No: 195 | Entry | Date: 2021-10-25 00:00:00 | Premium: 0.3 | Position: 1
Trade No: 195 | Exit Type: TP | Date: 2021-10-26 00:00:00 | Premium: 10.4 | PnL: 10.7 | Cum PnL: 517.9
------------------------------
Trade No: 196 | Entry | Date: 2021-10-27 00:00:00 | Premium: -3.9 | Position: 1
Trade No: 196 | Exit Type: TP | Date: 2021-10-28 00:00:00 | Premium: 44.9 | PnL: 41.0 | Cum PnL: 558.9
------------------------------
Trade No: 197 | Entry | Date: 2021-10-29 00:00:00 | Premium: -1.6 | Position: 1
Trade No: 197 | Exit Type: SL | Date: 2021-11-01 00:00:00 | Premium: -3.1 | PnL: -4.7 | Cum PnL: 554.2
------------------------------
Trade No: 198 | Entry | Date: 2021-11-15 00:00:00 | Premium: 6.5 | Position: 1
Trade No: 198 | Exit Type: TP | Date: 2021-11-16 00:00:00 | Premium: 22.2 | PnL: 28.7 | Cum PnL: 582.9
------------------------------
Trade No: 19

------------------------------
Trade No: 233 | Entry | Date: 2022-08-18 00:00:00 | Premium: 5.9 | Position: 1
Trade No: 233 | Exit Type: Expiry or Signal Based | Date: 2022-08-19 00:00:00 | Premium: -54.4 | PnL: -48.5 | Cum PnL: 462.8
------------------------------
Trade No: 234 | Entry | Date: 2022-08-22 00:00:00 | Premium: 3.1 | Position: 1
Trade No: 234 | Exit Type: SL | Date: 2022-08-23 00:00:00 | Premium: -5.0 | PnL: -1.9 | Cum PnL: 460.9
------------------------------
Trade No: 235 | Entry | Date: 2022-08-24 00:00:00 | Premium: -5.3 | Position: 1
Trade No: 235 | Exit Type: TP | Date: 2022-08-25 00:00:00 | Premium: 28.6 | PnL: 23.3 | Cum PnL: 484.2
------------------------------
Trade No: 236 | Entry | Date: 2022-08-26 00:00:00 | Premium: -15.9 | Position: 1
Trade No: 236 | Exit Type: Expiry or Signal Based | Date: 2022-08-29 00:00:00 | Premium: -25.5 | PnL: -41.4 | Cum PnL: 442.8
------------------------------
Trade No: 237 | Entry | Date: 2022-09-23 00:00:00 | Premium: 8.9 | Pos

------------------------------
Trade No: 269 | Entry | Date: 2021-09-13 00:00:00 | Premium: 56.0 | Position: 1
Trade No: 269 | Exit Type: Expiry or Signal Based | Date: 2021-09-14 00:00:00 | Premium: -63.8 | PnL: -7.8 | Cum PnL: 474.7
------------------------------
Trade No: 270 | Entry | Date: 2021-09-27 00:00:00 | Premium: 19.4 | Position: 1
Trade No: 270 | Exit Type: Expiry or Signal Based | Date: 2021-09-28 00:00:00 | Premium: -90.0 | PnL: -70.6 | Cum PnL: 404.1
------------------------------
Trade No: 271 | Entry | Date: 2021-09-29 00:00:00 | Premium: 20.0 | Position: 1
Trade No: 271 | Exit Type: Expiry or Signal Based | Date: 2021-09-30 00:00:00 | Premium: -44.4 | PnL: -24.4 | Cum PnL: 379.7
------------------------------
Trade No: 272 | Entry | Date: 2021-10-13 00:00:00 | Premium: 55.6 | Position: 1
Trade No: 272 | Exit Type: TP | Date: 2021-10-14 00:00:00 | Premium: -26.9 | PnL: 28.6 | Cum PnL: 408.3
------------------------------
Trade No: 273 | Entry | Date: 2021-10-15 00:00:

In [8]:
# Round trip details
round_trips_details.head()

| | Option Type | Strike Price | position | entry_price | entry_date | exit_date | exit_type | exit_price |
|---|---|---|---|---|---|---|---|---|
| 0 | call | 3445.0 | -1 | 66.45 | 2020-10-08 | 2020-10-09 | Expiry or Signal Based | 77.44 |
| 1 | put | 3445.0 | 1 | 70.09 | 2020-10-08 | 2020-10-09 | Expiry or Signal Based | 49.9 |
| 2 | underlying | 3445.0 | -1 | 0.0 | 2020-10-08 | 2020-10-09 | Expiry or Signal Based | 0.0 |
| 0 | call | 3425.0 | -1 | 52.5 | 2020-10-19 | 2020-10-20 | Expiry or Signal Based | 60.6 |
| 1 | put | 3425.0 | 1 | 53.45 | 2020-10-19 | 2020-10-20 | Expiry or Signal Based | 41.8 |
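
As a sanity check, the PnL of the first trade can be recomputed by hand from the first three rows of `round_trips_details` shown above. The sketch below repeats the arithmetic of the backtest loop on plain Python tuples; only the variable names and the tuple layout are assumptions.

```python
# (option type, position, entry price, exit price) for the first round trip
legs = [
    ('call', -1, 66.45, 77.44),
    ('put', 1, 70.09, 49.9),
    ('underlying', -1, 0.0, 0.0),
]

# Net premium = sum of position * premium over all legs
entry_net_premium = sum(pos * entry for _, pos, entry, _ in legs)
exit_net_premium = round(sum(pos * exit_ for _, pos, _, exit_ in legs), 1)

# PnL as computed in the loop: exit net premium minus entry net premium
trade_pnl = round(exit_net_premium - entry_net_premium, 1)

print(round(entry_net_premium, 1), exit_net_premium, trade_pnl)
```

The result agrees with the log line for Trade No. 1, which reports an exit premium of -27.5 and a PnL of -31.1. The entry line in the log shows -3.6 rather than 3.6 because it prints the net premium multiplied by -1.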


In [9]:
# MTM details
mark_to_market.head()

| | Option Type | Strike Price | position | premium | Date |
|---|---|---|---|---|---|
| 0 | call | 3445.0 | -1 | 66.45 | 2020-10-08 |
| 1 | put | 3445.0 | 1 | 70.09 | 2020-10-08 |
| 2 | underlying | 3445.0 | -1 | 0.0 | 2020-10-08 |
| 0 | call | 3445.0 | -1 | 77.44 | 2020-10-09 |
| 1 | put | 3445.0 | 1 | 49.9 | 2020-10-09 |


## Conclusion
In this notebook, we have backtested the strategies predicted by the LSTM model. Using the dataframes `round_trips_details` and `mark_to_market`, we will analyse the trades in the next notebook.