## Backtesting of Trading Strategies

The aim of this project is to determine whether a technical trading method offers an advantage over a typical Buy-and-Hold (B&H) strategy. The technical trading strategy used in this project is the **50/200 Simple Moving Average (SMA) Crossover**.

This strategy generates the following signals:
1. Buy: the 50-period SMA crosses above the 200-period SMA at time = T, having been below it at time < T.
2. Sell: the 50-period SMA crosses below the 200-period SMA at time = T, having been above it at time < T.
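
The two rules above can be sketched in vectorised form as follows. This is a minimal illustration, not the implementation used later in this notebook: the function name `crossover_signals` is hypothetical, "time < T" is read as the previous row, and a row where the two SMAs are equal is treated as "not above". Unlike `extract_period` below, it does not force a buy on the first day or a sell on the last day.

```python
import pandas as pd

def crossover_signals(short_sma: pd.Series, long_sma: pd.Series) -> pd.Series:
    """Label each row 'B' (buy), 'S' (sell) or 'N' (no action).

    Hypothetical helper: both inputs are assumed to share the same index.
    """
    above = short_sma > long_sma                    # short SMA above long SMA today
    prev_above = above.shift(1, fill_value=False)   # same test on the previous row

    # Only compare rows where both SMAs exist today and on the previous row
    valid = short_sma.notna() & long_sma.notna()
    prev_valid = valid.shift(1, fill_value=False)
    usable = valid & prev_valid

    signals = pd.Series('N', index=short_sma.index)
    signals[usable & above & ~prev_above] = 'B'     # crossed above
    signals[usable & ~above & prev_above] = 'S'     # crossed below
    return signals

# Toy data: the short SMA rises above the long SMA, then falls back below it
short = pd.Series([1.0, 1.0, 3.0, 3.0, 1.0])
long_ = pd.Series([2.0, 2.0, 2.0, 2.0, 2.0])
toy_signals = crossover_signals(short, long_)
print(toy_signals.tolist())
```

On the toy data, the buy lands on the row where the short SMA first exceeds the long SMA, and the sell lands on the row where it first drops back.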

The stock price used in this backtesting is the **daily closing price**. We are considering the performance of each strategy on a **per-share basis** and in a **no short-selling** environment.

In [1]:
%matplotlib notebook

import matplotlib.pyplot as plt
import matplotlib.dates as mDates
import pandas as pd
import numpy as np
from matplotlib.widgets import SpanSelector

# First, we import the relevant stock price dataset into a DataFrame (Yahoo Finance, 2020)
df = pd.read_csv('AAPL (Yahoo Finance).csv', header = 0, index_col = 0)
df.index = pd.to_datetime(df.index)
cols_to_keep = ['Close'] # Specify the columns to keep by column header. In this case, we are using the closing prices only.
df = df[cols_to_keep].ffill(axis = 0) # Fill any missing value with the previous day's value.
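
To illustrate what the forward fill does, here is a small sketch on made-up prices (the dates and values are invented for illustration and are not taken from the AAPL dataset). Each missing value takes the most recent earlier value, while a missing value at the very start has nothing to copy from and stays missing.

```python
import numpy as np
import pandas as pd

# Hypothetical closing prices with two gaps in the middle
close = pd.Series(
    [100.0, np.nan, np.nan, 103.0],
    index=pd.to_datetime(['2020-01-02', '2020-01-03', '2020-01-06', '2020-01-07']),
)
filled = close.ffill()
print(filled.tolist())

# A gap on the first row cannot be forward-filled
leading_gap = pd.Series([np.nan, 50.0]).ffill()
print(leading_gap.isna().tolist())
```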

Next, we calculate the $n$-period SMA and the $m$-period SMA in the imported DataFrame. This notebook uses $n = 50$ and $m = 200$, but you can freely change both $n$ and $m$ as long as $n < m$.

Note: In this notebook, an error will be raised if $n > m$.

In [2]:
n = 50
m = 200

if n > m:
    raise ValueError("n = periods of short-term SMA, while m = periods of long-term SMA. Hence, n must be less than m.")

In [3]:
# Calculate n-period short-term SMA & m-period long-term SMA
df['nSMA'] = df['Close'].rolling(window = n, win_type = None).mean()
df['mSMA'] = df['Close'].rolling(window = m, win_type = None).mean()
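
As a quick check of what `rolling(...).mean()` produces, the sketch below applies a 3-period window to a made-up price series (the values are illustrative only). With the default settings, the first `window - 1` rows are NaN because a full window is not yet available, so the 200-period SMA is undefined for the first 199 rows of the dataset.

```python
import pandas as pd

# Hypothetical prices, chosen so the averages are easy to verify by hand
prices = pd.Series([10.0, 11.0, 12.0, 13.0, 14.0])
sma3 = prices.rolling(window=3).mean()

# Rows 0 and 1 are NaN; row 2 is the mean of 10, 11 and 12, and so on
print(sma3.tolist())
```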

Then, we define a function that extracts the data for a given period and calculates the trading signals.

In [4]:
def extract_period(d1, d2):
    # Process the date input first. 
    if d1 < d2:
        start = d1
        end = d2
    else:
        start = d2
        end = d1
        
    period = df[(df.index >= start) & (df.index <= end)].copy()
    
    # `flag` records whether nSMA has been above mSMA at any point since we entered.
    # If we enter while nSMA < mSMA, the SMAs usually stay that way for some time, so we
    # must not sell on the very next day. Instead, we hold the opening position until
    # nSMA first cuts above mSMA and then cuts below it again (the next sell signal).
    if period['nSMA'].iloc[0] < period['mSMA'].iloc[0]:
        flag = 0 # Entered while nSMA < mSMA: suppress sell signals until nSMA rises above mSMA
    else:
        flag = 1 # nSMA is not below mSMA on the start date
        
    # Compute the signals after entering date, accounting for initial flag
    buy_triggered = 1 # First transaction always a buy
    sell_triggered = 0
    for i in range(len(period)):
        if i != 0:
            if period.loc[period.index[i], 'nSMA'] > period.loc[period.index[i], 'mSMA']:
                if buy_triggered == 0:
                    period.loc[period.index[i], 'Signal'] = 'B'
                    buy_triggered = 1
                    sell_triggered = 0
                else:
                    period.loc[period.index[i], 'Signal'] = 'N'
                    if flag == 0:
                        flag = 1
            else:
                if flag == 1:
                    if sell_triggered == 0:
                        period.loc[period.index[i], 'Signal'] = 'S'
                        buy_triggered = 0
                        sell_triggered = 1
                    else:
                        period.loc[period.index[i], 'Signal'] = 'N'
                else:
                    period.loc[period.index[i], 'Signal'] = 'N'
        else:
            period.loc[period.index[i], 'Signal'] = 'B'
            
    # Compute the returns according to the buy-in and sell-out prices
    unit = 0
    buy_in = 0
    sell_out = 0
    for i in range(len(period)):
        if period.loc[period.index[i], 'Signal'] == 'B':
            if unit == 0:
                unit = 1
                buy_in = period.loc[period.index[i], 'Close']
        elif period.loc[period.index[i], 'Signal'] == 'S':
            if unit == 1:
                unit = 0
                sell_out = period.loc[period.index[i], 'Close']
                period.loc[period.index[i], 'Profit'] = sell_out - buy_in
        
        if i == (len(period) - 1):
            if unit == 1:
                unit = 0
                period.loc[period.index[i], 'Signal'] = 'S'
                period.loc[period.index[i], 'Profit'] = period.loc[period.index[i], 'Close'] - buy_in
    return period

For example, suppose we want to see the profit from each trading strategy between 1 Jan 2010 and 31 Dec 2019.

In [5]:
start_date = pd.to_datetime('1 Jan 2010')
end_date = pd.to_datetime('31 Dec 2019')

period = extract_period(start_date, end_date)

We then compare the profits of each strategy, assuming that all positions are liquidated on the end_date regardless of signals. We also count the number of transactions made in the SMA strategy, because **transaction fees** exist in the real world. If the SMA strategy's profit is not superior after transaction costs are taken into account, one should consider the B&H strategy instead.

In [6]:
sma_total = period['Profit'].sum()
bnh_total = period['Close'].iloc[-1] - period['Close'].iloc[0]

SMA_Profit = '${:,.2f}'.format(sma_total)
BnH_Profit = '${:,.2f}'.format(bnh_total)
diff = '${:,.2f}'.format(sma_total - bnh_total)

# Count every buy/sell signal; the printed figure subtracts 2 from this count
count = period[period['Signal'] != 'N'].count()['Signal']
print('The profit/loss generated through the {}/{} SMA strategy is {}, while the B&H strategy generated {}. The difference between the two strategies is {}.'.format(n, m, SMA_Profit, BnH_Profit, diff),
      'The number of transaction(s) made in the SMA strategy is {}.'.format(count - 2))

The profit/loss generated through the 50/200 SMA strategy is $227.66, while the B&H strategy generated $263.08. The difference between the two strategies is $-35.42. The number of transaction(s) made in the SMA strategy is 6.
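
To make the transaction-cost point concrete, the sketch below deducts a flat fee per trade from each strategy's profit. The fee of $1.00 per trade is purely hypothetical, as is the helper name `net_profit`. The sketch also assumes that the 6 transactions reported above exclude the opening buy and the closing sell (the notebook subtracts 2 from the signal count), so the SMA strategy is charged for 8 trades and B&H for 2.

```python
def net_profit(gross_profit: float, n_trades: int, fee_per_trade: float) -> float:
    """Per-share profit after a flat fee on every trade (hypothetical cost model)."""
    return gross_profit - n_trades * fee_per_trade

# Gross per-share profits reported by the run above
sma_gross = 227.66
bnh_gross = 263.08

fee = 1.00            # hypothetical flat fee per trade, in USD
sma_trades = 6 + 2    # assumption: 6 crossover trades plus opening buy and closing sell
bnh_trades = 2        # one buy at the start, one sell at the end

sma_net = net_profit(sma_gross, sma_trades, fee)
bnh_net = net_profit(bnh_gross, bnh_trades, fee)
print('SMA net: ${:,.2f}, B&H net: ${:,.2f}'.format(sma_net, bnh_net))
```

Because the SMA strategy trades more often, any positive fee widens its gap to B&H over this period.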


### Visualisation of Trading Strategy

In this section, we will visualise how the 50/200 SMA trading strategy works. We will plot the stock price, 50-period SMA, and 200-period SMA on the same chart.

In [7]:
fig = plt.figure()
ax = plt.gca()
x_pos = period.index

# First, we plot the stock price, 50-period SMA and 200-period SMA
ax.plot(x_pos, period['Close'], zorder = 1, c = 'teal')
ax.plot(x_pos, period['nSMA'], zorder = 1, c = 'lightcoral')
ax.plot(x_pos, period['mSMA'], zorder = 1, c= 'chocolate')

# Next, we overlay a scatter when a signal is triggered
buy_sign = period[period['Signal'] == 'B']   # Rows where a buy was triggered
sell_sign = period[period['Signal'] == 'S']  # Rows where a sell was triggered

ax.scatter(buy_sign.index, buy_sign['Close'], color = 'yellowgreen', zorder = 2)
ax.scatter(sell_sign.index, sell_sign['Close'], color = 'red', zorder = 2)
plt.legend(['AAPL', '50SMA', '200SMA', 'BUY', 'SELL'])
ax.set_title('Applying 50/200 SMA trading strategy on AAPL from {} to {}'.format(start_date.strftime('%Y-%m-%d'), 
                                                                                 end_date.strftime('%Y-%m-%d')))
ax.set_ylabel('USD ($)')
fig.set_size_inches(9.5, 5)


### Interactive Backtesting

Now that we have seen how the strategy worked for a given timeframe, we can add interactivity to the chart to generate the profits and signals dynamically. For ease of change, we redefine n and m and calculate the rolling moving averages again.

In [8]:
n = 50
m = 200

if n > m:
    raise ValueError("n = periods of short-term SMA, while m = periods of long-term SMA. Hence, n must be less than m.")
    
df['nSMA'] = df['Close'].rolling(window = n, win_type = None).mean()
df['mSMA'] = df['Close'].rolling(window = m, win_type = None).mean()

Next, we define a callback function, `onselect`, that `SpanSelector` calls with the lower and upper x-values (`xmin`, `xmax`) of the span selected on the canvas (graph). We convert these values to dates and use them in our calculations.

In [9]:
fig = plt.figure(figsize=(9.5, 7))
ax = fig.add_subplot(211)
ax.plot(df.index, df['Close'], '-')
ax.set_title('Left-click and drag the testing period')
ax.set_ylabel('USD ($)')

ax2 = fig.add_subplot(212)
ax2.set_title('Select a testing period above first!')
fig.tight_layout(pad = 3)

def onselect(xmin, xmax):
    start_date = pd.to_datetime(mDates.num2date(xmin, tz = None).replace(tzinfo = None))
    end_date = pd.to_datetime(mDates.num2date(xmax, tz = None).replace(tzinfo = None))
    period = extract_period(start_date, end_date)
    ax2.cla()
    ax2.set_title('Comparison of strategies from {} to {}'.format(start_date.strftime('%Y-%m-%d'), end_date.strftime('%Y-%m-%d')))
    x_pos = period.index
    ax2.plot(x_pos, period['Close'], zorder = 1, c = 'teal')
    ax2.plot(x_pos, period['nSMA'], zorder = 1, c = 'lightcoral')
    ax2.plot(x_pos, period['mSMA'], zorder = 1, c= 'chocolate')
    buy_sign = period[period['Signal'] == 'B']   # Rows where a buy was triggered
    sell_sign = period[period['Signal'] == 'S']  # Rows where a sell was triggered

    ax2.scatter(buy_sign.index, buy_sign['Close'], color = 'yellowgreen', zorder = 2)
    ax2.scatter(sell_sign.index, sell_sign['Close'], color = 'red', zorder = 2)
    
    sma_total = period['Profit'].sum()
    bnh_total = period['Close'].iloc[-1] - period['Close'].iloc[0]
    SMA_Profit = '${:,.2f}'.format(sma_total)
    BnH_Profit = '${:,.2f}'.format(bnh_total)

    ax2.annotate('Profit per share\nSMA: {}\nB&H: {}'.format(SMA_Profit, BnH_Profit), xy = (105, 255), xycoords = 'figure pixels')
    ax2.set_ylabel('USD ($)') # ax2.cla() cleared the lower chart, so set its label again
    
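# Keep a reference to the selector so that it is not garbage-collected.
# Note: Matplotlib 3.5 and later rename the `rectprops` argument to `props`.
span = SpanSelector(ax, onselect, 'horizontal', useblit=True, rectprops=dict(alpha=0.75, facecolor='lightgrey'))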
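
The date handling inside `onselect` can be looked at in isolation. On a date axis, the selector reports the span as Matplotlib date numbers (floats), which `matplotlib.dates.num2date` converts back to timezone-aware datetimes. The sketch below is an illustration under that assumption; the helper name `span_to_dates` is hypothetical and is not part of the notebook.

```python
from datetime import datetime

import matplotlib.dates as mdates

def span_to_dates(xmin: float, xmax: float):
    """Convert a selected span (Matplotlib date numbers) to naive datetimes.

    Hypothetical helper: returns (start, end) in chronological order.
    """
    # num2date returns timezone-aware datetimes; drop the tzinfo so that the
    # values can be compared with a timezone-naive DatetimeIndex
    first = mdates.num2date(xmin).replace(tzinfo=None)
    second = mdates.num2date(xmax).replace(tzinfo=None)
    return (first, second) if first <= second else (second, first)

# Round trip: dates -> date numbers (as the selector would report) -> dates
x_start = mdates.date2num(datetime(2010, 1, 1))
x_end = mdates.date2num(datetime(2019, 12, 31))
selected = span_to_dates(x_end, x_start)  # deliberately passed in reverse order
print(selected)
```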
