# Steps to Backtest a Strategy/Algorithm

Backtesting a trading strategy or algorithm involves simulating its performance against historical data to assess its effectiveness and profitability. Below are the steps involved in conducting a backtest:

1. **Set Up Initial Investment:** Determine the initial capital that will be used to execute the trading strategy.

2. **Set Up Volume of Shares:** Decide on the number of shares or volume that the strategy will trade with for each transaction.

3. **Create Position Column:** Create a column in your dataset to track the position of the stock. This column will contain the number of shares that the strategy is holding at any given point in time.

4. **Identify Entry/Exit Points:** Determine specific points in time when the strategy will buy or sell shares. This can be based on indicators, signals, or conditions defined by the strategy.

5. **Calculate Portfolio Holdings:** Calculate the dollar value of the shares held in the portfolio at each timestamp. This is also known as the portfolio holdings.

6. **Calculate Portfolio Cash:** Determine the amount of cash available in the portfolio after each transaction. This accounts for buying and selling shares and any associated transaction costs.

7. **Track Portfolio Value:** Calculate the total value of the portfolio on a daily basis. This includes the combined value of the held stock and the available cash.

8. **Calculate Portfolio Daily Returns:** Compute the daily returns of the portfolio based on changes in its value over time.

9. **Calculate Cumulative Returns:** Aggregate the daily returns to calculate the cumulative returns of the portfolio. This provides an overall measure of the strategy's performance over the backtesting period.
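
10. **Visualize Results:** Plot the total portfolio value over time and overlay the entry and exit points to see how each trade affected the portfolio.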
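Steps 1 through 9 can be sketched end to end on a tiny made-up price series before working with real data. The function name `run_backtest`, the prices, and the signal below are illustrative assumptions, not part of the notebook. One deliberate difference: the sketch fills the first row of the trade column, whereas the notebook cells leave it as `NaN`.

```python
import pandas as pd


def run_backtest(close, signal, initial_capital=100_000.0, share_size=500):
    """Apply steps 1-9 to a close-price Series and a 0/1 signal Series."""
    df = pd.DataFrame({"close": close, "Signal": signal})
    # Step 3: number of shares held at each timestamp
    df["Position"] = share_size * df["Signal"]
    # Step 4: shares bought (+) or sold (-); the first row has no previous row,
    # so the opening position itself is treated as the first trade
    df["Entry/Exit Position"] = df["Position"].diff().fillna(df["Position"])
    # Step 5: dollar value of the shares held
    df["Portfolio Holdings"] = df["close"] * df["Position"]
    # Step 6: cash left after every trade so far (transaction cost assumed $0)
    df["Portfolio Cash"] = initial_capital - (df["close"] * df["Entry/Exit Position"]).cumsum()
    # Step 7: total value = cash + holdings
    df["Portfolio Total"] = df["Portfolio Cash"] + df["Portfolio Holdings"]
    # Step 8: day-over-day percentage change of the total value
    df["Portfolio Daily Returns"] = df["Portfolio Total"].pct_change()
    # Step 9: compound the daily returns
    df["Portfolio Cumulative Returns"] = (1 + df["Portfolio Daily Returns"]).cumprod() - 1
    return df


# Made-up prices and signal, for illustration only
dates = pd.date_range("2020-01-01", periods=5, freq="D")
close = pd.Series([10.0, 11.0, 12.0, 11.0, 13.0], index=dates)
signal = pd.Series([0.0, 1.0, 1.0, 0.0, 0.0], index=dates)

result = run_backtest(close, signal, initial_capital=1_000.0, share_size=10)
print(result[["Position", "Portfolio Cash", "Portfolio Total"]])
```

In this toy run the strategy buys 10 shares at 11, holds while the price moves to 12, and sells at 11, so the portfolio total goes 1000, 1000, 1010, 1000, 1000 and the cumulative return ends at zero.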


In [1]:
# Import the required libraries and dependencies
import numpy as np
import pandas as pd
import hvplot.pandas
from pathlib import Path
import warnings
warnings.filterwarnings('ignore')

# Read the stock file to a dataframe
# Set the date column as the DatetimeIndex
aapl_df = pd.read_csv(
    Path("../Resources/aapl.csv"),
    index_col="date",
    parse_dates=True)

# Review the DataFrame
aapl_df.head()


date,close,volume,open,high,low
2014-09-22,101.06,52421660,101.8,102.14,100.58
2014-09-23,102.64,63255860,100.6,102.94,100.54
2014-09-24,101.75,59974260,102.16,102.85,101.2
2014-09-25,97.87,99689300,100.51,100.71,97.72
2014-09-26,100.75,62276770,98.53,100.75,98.4


In [2]:
# Slice to just the `close` column
signals_df = aapl_df.loc[:,["close"]]
signals_df

date,close
2014-09-22,101.06
2014-09-23,102.64
2014-09-24,101.75
2014-09-25,97.87
2014-09-26,100.75
...,...
2019-09-13,218.75
2019-09-16,219.90
2019-09-17,220.70
2019-09-18,222.77


In [3]:
# Set the short window and long windows

short_window = 100
long_window = 250

In [4]:
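# Generate the short and long moving averages (100 and 250 days, per the windows set above)
# Note: the column names SMA50 and SMA100 are labels only and do not match the window lengths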
signals_df["SMA50"] = signals_df["close"].rolling(window=short_window).mean()
signals_df["SMA100"] = signals_df["close"].rolling(window=long_window).mean()

# Review the DataFrame
display(signals_df.head())
display(signals_df.tail())
# Prepopulate the `Signal` for trading
signals_df['Signal'] = 0.0

date,close,SMA50,SMA100
2014-09-22,101.06,,
2014-09-23,102.64,,
2014-09-24,101.75,,
2014-09-25,97.87,,
2014-09-26,100.75,,


date,close,SMA50,SMA100
2019-09-13,218.75,200.50975,192.04806
2019-09-16,219.9,200.63715,192.05614
2019-09-17,220.7,200.79135,192.06598
2019-09-18,222.77,200.97605,192.08358
2019-09-19,220.96,201.13955,192.0873
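The `NaN` values at the head of the table are the rolling window's warm-up period: `rolling(window=n).mean()` produces a value only once `n` observations are available, so the first `n - 1` rows are empty. A minimal sketch on a made-up six-value series with a 3-row window (both are assumptions for illustration):

```python
import pandas as pd

# Made-up prices, for illustration only
prices = pd.Series([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])

# A 3-row simple moving average: the first 2 rows cannot be computed
sma3 = prices.rolling(window=3).mean()

print(sma3.tolist())       # two NaN values, then 2.0, 3.0, 4.0, 5.0
print(sma3.isna().sum())   # 2, i.e. window - 1
```

With the windows used in this notebook (100 and 250 rows), the same rule leaves the first 99 rows of the short average and the first 249 rows of the long average empty.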


In [5]:
# Generate the trading signal 0 or 1,
# where 1 is when the short-window average (SMA50) is greater than the long-window average (SMA100)
# and 0 otherwise. Comparisons against NaN evaluate to False, so the warm-up rows stay 0.
# Assign the whole column at once to avoid chained-indexing assignment.
signals_df["Signal"] = np.where(signals_df["SMA50"] > signals_df["SMA100"], 1.0, 0.0)

# Review the DataFrame
signals_df.tail(10)


date,close,SMA50,SMA100,Signal
2019-09-06,213.26,199.7293,192.11688,1.0
2019-09-09,214.17,199.8785,192.10024,1.0
2019-09-10,216.7,200.0142,192.07164,1.0
2019-09-11,223.59,200.2115,192.08172,1.0
2019-09-12,223.085,200.39705,192.06842,1.0
2019-09-13,218.75,200.50975,192.04806,1.0
2019-09-16,219.9,200.63715,192.05614,1.0
2019-09-17,220.7,200.79135,192.06598,1.0
2019-09-18,222.77,200.97605,192.08358,1.0
2019-09-19,220.96,201.13955,192.0873,1.0


In [6]:
# Calculate the points in time when the Signal value changes
# Identify trade entry (1) and exit (-1) points
signals_df["Entry/Exit"] = signals_df["Signal"].diff()

# Review the DataFrame
signals_df.loc["2015-02-09":"2015-02-17"]

date,close,SMA50,SMA100,Signal,Entry/Exit
2015-02-09,119.72,,,0.0,0.0
2015-02-10,122.02,,,0.0,0.0
2015-02-11,124.88,,,0.0,0.0
2015-02-12,126.46,109.71095,,0.0,0.0
2015-02-13,127.08,109.97115,,0.0,0.0
2015-02-17,127.83,110.22305,,0.0,0.0
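Every row in the slice above shows `0.0` because the long moving average is still `NaN` there, so the signal never switches on. To see how `diff()` turns a 0/1 signal into entry (`1`) and exit (`-1`) markers, here is a small sketch on made-up prices with 2-row and 4-row windows (all values and window sizes are assumptions chosen so that one crossover occurs in each direction):

```python
import numpy as np
import pandas as pd

# Made-up prices: a dip, a rally, then a decline
close = pd.Series([10, 9, 8, 9, 11, 13, 12, 9, 7, 6], dtype=float)

fast = close.rolling(window=2).mean()   # short moving average
slow = close.rolling(window=4).mean()   # long moving average

# 1.0 while the fast average is above the slow one, 0.0 otherwise
# (comparisons against NaN are False, so warm-up rows are 0.0)
signal = pd.Series(np.where(fast > slow, 1.0, 0.0), index=close.index)

# The change in the signal marks the trade: +1 = entry, -1 = exit
entry_exit = signal.diff()

print(entry_exit[entry_exit == 1.0].index.tolist())    # [4]
print(entry_exit[entry_exit == -1.0].index.tolist())   # [7]
```

The signal is on for rows 4 through 6, so the only entry is at row 4 and the only exit is at row 7.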


In [7]:
# Visualize exit position relative to close price
exit = signals_df[signals_df["Entry/Exit"] == -1.0]["close"].hvplot.scatter(
    color="yellow",
    marker="v",
    size=200,
    legend=False,
    ylabel="Price in $",
    width=1000,
    height=400)

# Show the plot
exit

# Visualize entry position relative to close price
entry = signals_df[signals_df["Entry/Exit"] == 1.0]["close"].hvplot.scatter(
    color="purple",
    marker="^",
    size=200,
    legend=False,
    ylabel="Price in $",
    width=1000,
    height=400)

# Show the plot
entry

# Visualize close price for the investment
security_close = signals_df[["close"]].hvplot(
    line_color="lightgray",
    ylabel="Price in $",
    width=1000,
    height=400)

# Show the plot
security_close


# Visualize moving averages
moving_avgs = signals_df[["SMA50", "SMA100"]].hvplot(
    ylabel="Price in $",
    width=1000,
    height=400)

# Show the plot
moving_avgs


# Create the overlay plot
entry_exit_plot = security_close * moving_avgs * entry * exit

# Show the plot
entry_exit_plot.opts(
    title="Apple - SMA50, SMA100, Entry and Exit Points"
)





#### 1. **Set Up Initial Investment:** Determine the initial capital that will be used to execute the trading strategy.
#### 2. **Set Up Volume of Shares:** Decide on the number of shares or volume that the strategy will trade with for each transaction.



In [8]:
# Set initial capital
initial_capital = float(100000)

# Set the share size
share_size = 500

#### 3. **Create Position Column:** Create a column in your dataset to track the position of the stock. This column will contain the number of shares that the strategy is holding at any given point in time.


In [9]:
# Buy a 500 share position when the dual moving average crossover Signal equals 1
# Otherwise, `Position` should be zero (sell)
signals_df['Position'] = share_size*signals_df.Signal
signals_df.tail()

date,close,SMA50,SMA100,Signal,Entry/Exit,Position
2019-09-13,218.75,200.50975,192.04806,1.0,0.0,500.0
2019-09-16,219.9,200.63715,192.05614,1.0,0.0,500.0
2019-09-17,220.7,200.79135,192.06598,1.0,0.0,500.0
2019-09-18,222.77,200.97605,192.08358,1.0,0.0,500.0
2019-09-19,220.96,201.13955,192.0873,1.0,0.0,500.0


#### 4. **Identify Entry/Exit Points:** Determine specific points in time when the strategy will buy or sell shares. This can be based on indicators, signals, or conditions defined by the strategy.


In [10]:
# Determine the points in time where a 500 share position is bought or sold
signals_df['Entry/Exit Position'] = signals_df.Position.diff()
signals_df.tail(50)

date,close,SMA50,SMA100,Signal,Entry/Exit,Position,Entry/Exit Position
2019-07-11,201.75,191.3014,192.32776,0.0,0.0,0.0,0.0
2019-07-12,203.3,191.6251,192.37564,0.0,0.0,0.0,0.0
2019-07-15,205.21,191.9569,192.43284,0.0,0.0,0.0,0.0
2019-07-16,204.5,192.2913,192.48504,0.0,0.0,0.0,0.0
2019-07-17,203.35,192.5951,192.53684,1.0,1.0,500.0,500.0
2019-07-18,205.66,192.9094,192.59196,1.0,0.0,500.0,0.0
2019-07-19,202.59,193.192,192.63656,1.0,0.0,500.0,0.0
2019-07-22,207.22,193.5155,192.699,1.0,0.0,500.0,0.0
2019-07-23,208.84,193.8724,192.76236,1.0,0.0,500.0,0.0
2019-07-24,208.67,194.2094,192.81776,1.0,0.0,500.0,0.0


#### 5. **Calculate Portfolio Holdings:** Calculate the dollar value of the shares held in the portfolio at each timestamp. This is also known as the portfolio holdings.


In [11]:
# Multiply the close price by the number of shares held, or the Position
signals_df['Portfolio Holdings'] = signals_df['close'] * signals_df.Position
signals_df.tail(20)

date,close,SMA50,SMA100,Signal,Entry/Exit,Position,Entry/Exit Position,Portfolio Holdings
2019-08-22,212.46,198.7426,192.72116,1.0,0.0,500.0,0.0,106230.0
2019-08-23,202.64,198.8288,192.66708,1.0,0.0,500.0,0.0,101320.0
2019-08-26,206.49,198.9402,192.62128,1.0,0.0,500.0,0.0,103245.0
2019-08-27,204.16,199.0249,192.55912,1.0,0.0,500.0,0.0,102080.0
2019-08-28,205.53,199.1102,192.48932,1.0,0.0,500.0,0.0,102765.0
2019-08-29,209.01,199.1993,192.42524,1.0,0.0,500.0,0.0,104505.0
2019-08-30,208.74,199.2917,192.34968,1.0,0.0,500.0,0.0,104370.0
2019-09-03,205.7,199.3425,192.25904,1.0,0.0,500.0,0.0,102850.0
2019-09-04,209.19,199.4449,192.18832,1.0,0.0,500.0,0.0,104595.0
2019-09-05,213.28,199.589,192.14904,1.0,0.0,500.0,0.0,106640.0


#### 6. **Calculate Portfolio Cash:** Determine the amount of cash available in the portfolio after each transaction. This accounts for buying and selling shares and any associated transaction costs.


In [12]:
# Subtract the amount of either the cost or proceeds of the trade from the initial capital invested
# Assume for this case that transaction cost = $0

signals_df['Portfolio Cash'] = initial_capital - (signals_df['close'] * signals_df['Entry/Exit Position']).cumsum()

signals_df.tail(50)

date,close,SMA50,SMA100,Signal,Entry/Exit,Position,Entry/Exit Position,Portfolio Holdings,Portfolio Cash
2019-07-11,201.75,191.3014,192.32776,0.0,0.0,0.0,0.0,0.0,125470.0
2019-07-12,203.3,191.6251,192.37564,0.0,0.0,0.0,0.0,0.0,125470.0
2019-07-15,205.21,191.9569,192.43284,0.0,0.0,0.0,0.0,0.0,125470.0
2019-07-16,204.5,192.2913,192.48504,0.0,0.0,0.0,0.0,0.0,125470.0
2019-07-17,203.35,192.5951,192.53684,1.0,1.0,500.0,500.0,101675.0,23795.0
2019-07-18,205.66,192.9094,192.59196,1.0,0.0,500.0,0.0,102830.0,23795.0
2019-07-19,202.59,193.192,192.63656,1.0,0.0,500.0,0.0,101295.0,23795.0
2019-07-22,207.22,193.5155,192.699,1.0,0.0,500.0,0.0,103610.0,23795.0
2019-07-23,208.84,193.8724,192.76236,1.0,0.0,500.0,0.0,104420.0,23795.0
2019-07-24,208.67,194.2094,192.81776,1.0,0.0,500.0,0.0,104335.0,23795.0
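The cell above assumes a transaction cost of $0. One possible way to include costs is to charge a fee on every row where a trade happens and subtract it along with the trade value. The flat `fee_per_trade` below is a hypothetical figure, and the prices and trades are made up; this is a sketch of the idea, not the notebook's method.

```python
import pandas as pd

initial_capital = 1_000.0
fee_per_trade = 5.0  # hypothetical flat commission, not taken from the notebook

# Made-up prices and trades: shares bought (+) or sold (-) on each row
close = pd.Series([10.0, 11.0, 12.0, 11.0, 13.0])
trades = pd.Series([0.0, 10.0, 0.0, -10.0, 0.0])

# Cash spent (+) or received (-) by each trade
trade_value = close * trades

# Charge the fee only on rows where shares actually change hands
fees = (trades != 0).astype(float) * fee_per_trade

# Running cash balance after trade value and fees
cash = initial_capital - (trade_value + fees).cumsum()

print(cash.tolist())  # [1000.0, 885.0, 885.0, 990.0, 990.0]
```

Buying and selling at the same price now leaves the account $10 short, which is the two fees.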


#### 7. **Track Portfolio Value:** Calculate the total value of the portfolio on a daily basis. This includes the combined value of the held stock and the available cash.


In [13]:
# Calculate the total portfolio value by adding the portfolio cash to the portfolio holdings (or investments)
signals_df['Portfolio Total'] = signals_df['Portfolio Cash'] + signals_df['Portfolio Holdings']
signals_df.tail()


date,close,SMA50,SMA100,Signal,Entry/Exit,Position,Entry/Exit Position,Portfolio Holdings,Portfolio Cash,Portfolio Total
2019-09-13,218.75,200.50975,192.04806,1.0,0.0,500.0,0.0,109375.0,23795.0,133170.0
2019-09-16,219.9,200.63715,192.05614,1.0,0.0,500.0,0.0,109950.0,23795.0,133745.0
2019-09-17,220.7,200.79135,192.06598,1.0,0.0,500.0,0.0,110350.0,23795.0,134145.0
2019-09-18,222.77,200.97605,192.08358,1.0,0.0,500.0,0.0,111385.0,23795.0,135180.0
2019-09-19,220.96,201.13955,192.0873,1.0,0.0,500.0,0.0,110480.0,23795.0,134275.0


#### 8. **Calculate Portfolio Daily Returns:** Compute the daily returns of the portfolio based on changes in its value over time.


In [14]:
# Calculate the portfolio daily returns
signals_df['Porfolio Daily Returns'] = signals_df['Portfolio Total'].pct_change()
signals_df.tail()

date,close,SMA50,SMA100,Signal,Entry/Exit,Position,Entry/Exit Position,Portfolio Holdings,Portfolio Cash,Portfolio Total,Porfolio Daily Returns
2019-09-13,218.75,200.50975,192.04806,1.0,0.0,500.0,0.0,109375.0,23795.0,133170.0,-0.016016
2019-09-16,219.9,200.63715,192.05614,1.0,0.0,500.0,0.0,109950.0,23795.0,133745.0,0.004318
2019-09-17,220.7,200.79135,192.06598,1.0,0.0,500.0,0.0,110350.0,23795.0,134145.0,0.002991
2019-09-18,222.77,200.97605,192.08358,1.0,0.0,500.0,0.0,111385.0,23795.0,135180.0,0.007716
2019-09-19,220.96,201.13955,192.0873,1.0,0.0,500.0,0.0,110480.0,23795.0,134275.0,-0.006695
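As a check on what `pct_change()` computes, the daily return is the current total divided by the previous total, minus 1. The sketch below reuses the five `Portfolio Total` values from the table above and compares `pct_change()` with the same calculation written out by hand; the variable names are illustrative.

```python
import pandas as pd

# Portfolio Total values copied from the last five rows of the table above
totals = pd.Series(
    [133170.0, 133745.0, 134145.0, 135180.0, 134275.0],
    index=pd.to_datetime(
        ["2019-09-13", "2019-09-16", "2019-09-17", "2019-09-18", "2019-09-19"]
    ),
)

# Built-in percentage change
by_pct_change = totals.pct_change()

# The same calculation by hand: V_t / V_(t-1) - 1
by_hand = totals / totals.shift(1) - 1

print(by_pct_change.round(6).tolist())
```

For 2019-09-16 this gives 133745 / 133170 - 1, about 0.004318, matching the table. The first row of this five-row series is `NaN` only because the sketch has no earlier total to compare against.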


#### 9. **Calculate Cumulative Returns:** Aggregate the daily returns to calculate the cumulative returns of the portfolio. This provides an overall measure of the strategy's performance over the backtesting period.


In [15]:
# Calculate the portfolio cumulative returns
signals_df['Portfolio Cumulative Returns'] = (1 + signals_df['Porfolio Daily Returns']).cumprod() - 1
signals_df.tail()

date,close,SMA50,SMA100,Signal,Entry/Exit,Position,Entry/Exit Position,Portfolio Holdings,Portfolio Cash,Portfolio Total,Porfolio Daily Returns,Portfolio Cumulative Returns
2019-09-13,218.75,200.50975,192.04806,1.0,0.0,500.0,0.0,109375.0,23795.0,133170.0,-0.016016,0.3317
2019-09-16,219.9,200.63715,192.05614,1.0,0.0,500.0,0.0,109950.0,23795.0,133745.0,0.004318,0.33745
2019-09-17,220.7,200.79135,192.06598,1.0,0.0,500.0,0.0,110350.0,23795.0,134145.0,0.002991,0.34145
2019-09-18,222.77,200.97605,192.08358,1.0,0.0,500.0,0.0,111385.0,23795.0,135180.0,0.007716,0.3518
2019-09-19,220.96,201.13955,192.0873,1.0,0.0,500.0,0.0,110480.0,23795.0,134275.0,-0.006695,0.34275
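Compounding the daily returns with `cumprod()` is equivalent to dividing each total by the first total and subtracting 1, because the intermediate values cancel out. A short sketch on made-up totals (the numbers are assumptions for illustration):

```python
import pandas as pd

# Made-up portfolio totals, for illustration only
totals = pd.Series([100_000.0, 100_500.0, 99_800.0, 101_200.0])

# Compound the daily returns, as in the cell above
daily = totals.pct_change()
cumulative = (1 + daily).cumprod() - 1

# Direct calculation against the starting value
direct = totals / totals.iloc[0] - 1

print(cumulative.iloc[-1])  # approximately 0.012
print(direct.iloc[-1])      # approximately 0.012
```

The same identity explains the last row of the table above: 134275 / 100000 - 1 = 0.34275, the final total measured against the initial capital.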


In [16]:
# Print the DataFrame
signals_df.tail()

date,close,SMA50,SMA100,Signal,Entry/Exit,Position,Entry/Exit Position,Portfolio Holdings,Portfolio Cash,Portfolio Total,Porfolio Daily Returns,Portfolio Cumulative Returns
2019-09-13,218.75,200.50975,192.04806,1.0,0.0,500.0,0.0,109375.0,23795.0,133170.0,-0.016016,0.3317
2019-09-16,219.9,200.63715,192.05614,1.0,0.0,500.0,0.0,109950.0,23795.0,133745.0,0.004318,0.33745
2019-09-17,220.7,200.79135,192.06598,1.0,0.0,500.0,0.0,110350.0,23795.0,134145.0,0.002991,0.34145
2019-09-18,222.77,200.97605,192.08358,1.0,0.0,500.0,0.0,111385.0,23795.0,135180.0,0.007716,0.3518
2019-09-19,220.96,201.13955,192.0873,1.0,0.0,500.0,0.0,110480.0,23795.0,134275.0,-0.006695,0.34275


In [17]:
# Visualize exit position relative to total portfolio value
exit = signals_df[signals_df['Entry/Exit'] == -1.0]['Portfolio Total'].hvplot.scatter(
    color='yellow',
    marker='v',
    legend=False,
    ylabel='Total Portfolio Value',
    width=1000,
    height=400
)

# Visualize entry position relative to total portfolio value
entry = signals_df[signals_df['Entry/Exit'] == 1.0]['Portfolio Total'].hvplot.scatter(
    color='purple',
    marker='^',
    ylabel='Total Portfolio Value',
    width=1000,
    height=400
)

# Visualize the value of the total portfolio
total_portfolio_value = signals_df[['Portfolio Total']].hvplot(
    line_color='lightgray',
    ylabel='Total Portfolio Value',
    xlabel='Date',
    width=1000,
    height=400
)

# Overlay the plots
portfolio_entry_exit_plot = total_portfolio_value * entry * exit
portfolio_entry_exit_plot.opts(
    title="Apple Algorithm - Total Portfolio Value",
    yformatter='%.0f'
)