# Steps to Backtest a Strategy/Algorithm

Backtesting a trading strategy or algorithm involves simulating its performance against historical data to assess its effectiveness and profitability. Below are the steps involved in conducting a backtest:

1. **Set Up Initial Investment:** Determine the initial capital that will be used to execute the trading strategy.

2. **Set Up Volume of Shares:** Decide on the number of shares or volume that the strategy will trade with for each transaction.

3. **Create Position Column:** Create a column in your dataset to track the position of the stock. This column will contain the number of shares that the strategy is holding at any given point in time.

4. **Identify Entry/Exit Points:** Determine specific points in time when the strategy will buy or sell shares. This can be based on indicators, signals, or conditions defined by the strategy.

5. **Calculate Portfolio Holdings:** Calculate the dollar value of the shares held in the portfolio at each timestamp. This is also known as the portfolio holdings.

6. **Calculate Portfolio Cash:** Determine the amount of cash available in the portfolio after each transaction. This accounts for buying and selling shares and any associated transaction costs.

7. **Track Portfolio Value:** Calculate the total value of the portfolio on a daily basis. This includes the combined value of the held stock and the available cash.

8. **Calculate Portfolio Daily Returns:** Compute the daily returns of the portfolio based on changes in its value over time.

9. **Calculate Cumulative Returns:** Aggregate the daily returns to calculate the cumulative returns of the portfolio. This provides an overall measure of the strategy's performance over the backtesting period.
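
As a compact illustration of how the nine steps fit together, the sketch below applies them to a tiny, made-up price series. The helper name `run_backtest`, the five prices, and the 0/1 signal are assumptions chosen for readability; they are not part of the Apple dataset used in the rest of this notebook.

```python
import pandas as pd

def run_backtest(close, signal, initial_capital=100_000.0, share_size=500):
    """Hypothetical helper: apply steps 1-9 to a price series and a 0/1 signal."""
    df = pd.DataFrame({"close": close, "Signal": signal})
    # Step 3: number of shares held at each timestamp
    df["Position"] = share_size * df["Signal"]
    # Step 4: shares bought (+) or sold (-) at each timestamp
    df["Entry/Exit Position"] = df["Position"].diff()
    # Step 5: dollar value of the shares held
    df["Portfolio Holdings"] = df["close"] * df["Position"]
    # Step 6: cash left after the running total of purchases and sales
    df["Portfolio Cash"] = initial_capital - (df["close"] * df["Entry/Exit Position"]).cumsum()
    # Step 7: cash plus holdings
    df["Portfolio Total"] = df["Portfolio Cash"] + df["Portfolio Holdings"]
    # Step 8: day-over-day percentage change in total value
    df["Portfolio Daily Returns"] = df["Portfolio Total"].pct_change()
    # Step 9: chain the daily returns together
    df["Portfolio Cumulative Returns"] = (1 + df["Portfolio Daily Returns"]).cumprod() - 1
    return df

# Made-up data: buy 100 shares at 11.0, sell them at 12.5
prices = pd.Series([10.0, 11.0, 12.0, 12.5, 13.0])
signal = pd.Series([0.0, 1.0, 1.0, 0.0, 0.0])
result = run_backtest(prices, signal, initial_capital=10_000.0, share_size=100)
print(result[["Portfolio Cash", "Portfolio Holdings", "Portfolio Total"]])
```

The first row of the cash and total columns is `NaN` because `diff()` has no previous row to compare against; the same effect appears in the first row of the full DataFrame later in this notebook.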


In [1]:
# Import the required libraries and dependencies
import numpy as np
import pandas as pd
import hvplot.pandas
from pathlib import Path

# Allow for reviewing more of the DataFrames
pd.set_option('display.max_rows', 2000)
pd.set_option('display.max_columns', 2000)
pd.set_option('display.width', 1000)

# Read the stock file to a DataFrame
# Set the date column as the DatetimeIndex
# (`infer_datetime_format` is deprecated in pandas 2.x, so it is omitted)
aapl_df = pd.read_csv(
    Path("Resources/aapl.csv"),
    index_col="date",
    parse_dates=True
)

# Review the DataFrame
aapl_df.head()




date,close,volume,open,high,low
2014-09-22,101.06,52421660,101.8,102.14,100.58
2014-09-23,102.64,63255860,100.6,102.94,100.54
2014-09-24,101.75,59974260,102.16,102.85,101.2
2014-09-25,97.87,99689300,100.51,100.71,97.72
2014-09-26,100.75,62276770,98.53,100.75,98.4


In [2]:
# Slice to just the `close` column
signals_df = aapl_df.loc[:,["close"]]

In [3]:
# Set the short window and long windows
short_window = 50
long_window = 100

In [4]:
# Generate the short and long moving averages (50 and 100 days, respectively)
signals_df['SMA50'] = signals_df['close'].rolling(window=short_window).mean()
signals_df['SMA100'] = signals_df['close'].rolling(window=long_window).mean()

# Prepopulate the `Signal` for trading
signals_df['Signal'] = 0.0

In [6]:
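# Generate the trading signal 0 or 1,
# where 1 is when the short-window average (SMA50) is greater than the long-window average (SMA100)
# and 0 otherwise. Assign through `.iloc` to avoid chained assignment,
# which raises a warning and may not modify `signals_df` at all
signal_col = signals_df.columns.get_loc('Signal')
signals_df.iloc[short_window:, signal_col] = np.where(
    signals_df['SMA50'].iloc[short_window:] > signals_df['SMA100'].iloc[short_window:], 1.0, 0.0
)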

In [7]:
signals_df.head(100)

date,close,SMA50,SMA100,Signal
2014-09-22,101.06,,,0.0
2014-09-23,102.64,,,0.0
2014-09-24,101.75,,,0.0
2014-09-25,97.87,,,0.0
2014-09-26,100.75,,,0.0
2014-09-29,100.11,,,0.0
2014-09-30,100.75,,,0.0
2014-10-01,99.18,,,0.0
2014-10-02,99.9,,,0.0
2014-10-03,99.62,,,0.0
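
The empty `SMA50` and `SMA100` cells above are `NaN` values: a rolling mean with `window=n` has no result until `n` observations are available, so the first `n - 1` rows are missing. Any comparison against `NaN` evaluates to `False`, which is why `Signal` stays at 0 until both averages exist. A minimal sketch with a made-up five-value series and a window of 3:

```python
import pandas as pd

# Made-up prices, used only to show the warm-up behaviour of a rolling mean
close = pd.Series([1.0, 2.0, 3.0, 4.0, 5.0])
sma3 = close.rolling(window=3).mean()

# The first window - 1 = 2 rows have no value yet
first_valid = sma3.first_valid_index()
print(sma3)
print("First row with a value:", first_valid)
```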


In [8]:
# Calculate the points in time when the Signal value changes
# Identify trade entry (1) and exit (-1) points
signals_df['Entry/Exit'] = signals_df['Signal'].diff()

# Review the DataFrame
signals_df.tail(10)

date,close,SMA50,SMA100,Signal,Entry/Exit
2019-09-06,213.26,205.2584,199.7293,1.0,0.0
2019-09-09,214.17,205.547,199.8785,1.0,0.0
2019-09-10,216.7,205.9226,200.0142,1.0,0.0
2019-09-11,223.59,206.3634,200.2115,1.0,0.0
2019-09-12,223.085,206.7705,200.39705,1.0,0.0
2019-09-13,218.75,207.0573,200.50975,1.0,0.0
2019-09-16,219.9,207.3707,200.63715,1.0,0.0
2019-09-17,220.7,207.7843,200.79135,1.0,0.0
2019-09-18,222.77,208.2149,200.97605,1.0,0.0
2019-09-19,220.96,208.5695,201.13955,1.0,0.0
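
To see why `diff()` marks entries and exits, consider a made-up signal that switches on, off, and on again. The difference between consecutive rows is 1 where the signal turns on (entry), -1 where it turns off (exit), and 0 elsewhere. The series below is illustrative and not taken from the Apple data.

```python
import pandas as pd

# Made-up 0/1 signal with an integer index for readability
signal = pd.Series([0.0, 0.0, 1.0, 1.0, 0.0, 0.0, 1.0])
entry_exit = signal.diff()

# Rows where the signal switched on (1.0) or off (-1.0)
entries = entry_exit[entry_exit == 1.0].index.tolist()
exits = entry_exit[entry_exit == -1.0].index.tolist()
print("Entries at rows:", entries)
print("Exits at rows:", exits)
```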


In [9]:
# Visualize exit position relative to close price
exit = signals_df[signals_df['Entry/Exit'] == -1.0]['close'].hvplot.scatter(
    color='yellow',
    marker='v',
    size=200,
    legend=False,
    ylabel='Price in $',
    width=1000,
    height=400
)

# Visualize entry position relative to close price
entry = signals_df[signals_df['Entry/Exit'] == 1.0]['close'].hvplot.scatter(
    color='purple',
    marker='^',
    size=200,
    legend=False,
    ylabel='Price in $',
    width=1000,
    height=400
)

# Visualize close price for the investment
security_close = signals_df[['close']].hvplot(
    line_color='lightgray',
    ylabel='Price in $',
    width=1000,
    height=400
)

# Visualize moving averages
moving_avgs = signals_df[['SMA50', 'SMA100']].hvplot(
    ylabel='Price in $',
    width=1000,
    height=400
)

# Create the overlay plot
entry_exit_plot = security_close * moving_avgs * entry * exit

# Show the plot with a title
entry_exit_plot.opts(
    title="Apple - SMA50, SMA100, Entry and Exit Points"
)

#### 1. **Set Up Initial Investment:** Determine the initial capital that will be used to execute the trading strategy.
#### 2. **Set Up Volume of Shares:** Decide on the number of shares or volume that the strategy will trade with for each transaction.

In [10]:
# Set initial capital
initial_capital = float(100000)

# Set the share size
share_size = 500

#### 3. **Create Position Column:** Create a column in your dataset to track the position of the stock. This column will contain the number of shares that the strategy is holding at any given point in time.


In [11]:
# Buy a 500 share position when the dual moving average crossover Signal equals 1
# Otherwise, `Position` should be zero (sell)
signals_df['Position'] = share_size * signals_df['Signal']
signals_df.tail()

date,close,SMA50,SMA100,Signal,Entry/Exit,Position
2019-09-13,218.75,207.0573,200.50975,1.0,0.0,500.0
2019-09-16,219.9,207.3707,200.63715,1.0,0.0,500.0
2019-09-17,220.7,207.7843,200.79135,1.0,0.0,500.0
2019-09-18,222.77,208.2149,200.97605,1.0,0.0,500.0
2019-09-19,220.96,208.5695,201.13955,1.0,0.0,500.0


#### 4. **Identify Entry/Exit Points:** Determine specific points in time when the strategy will buy or sell shares. This can be based on indicators, signals, or conditions defined by the strategy.


In [12]:
# Determine the points in time where a 500 share position is bought or sold
signals_df['Entry/Exit Position'] = signals_df['Position'].diff()
signals_df.tail()

date,close,SMA50,SMA100,Signal,Entry/Exit,Position,Entry/Exit Position
2019-09-13,218.75,207.0573,200.50975,1.0,0.0,500.0,0.0
2019-09-16,219.9,207.3707,200.63715,1.0,0.0,500.0,0.0
2019-09-17,220.7,207.7843,200.79135,1.0,0.0,500.0,0.0
2019-09-18,222.77,208.2149,200.97605,1.0,0.0,500.0,0.0
2019-09-19,220.96,208.5695,201.13955,1.0,0.0,500.0,0.0


#### 5. **Calculate Portfolio Holdings:** Calculate the dollar value of the shares held in the portfolio at each timestamp. This is also known as the portfolio holdings.


In [13]:
# Multiply the close price by the number of shares held, or the Position
signals_df['Portfolio Holdings'] = signals_df['close'] * signals_df['Position']
signals_df.tail()

date,close,SMA50,SMA100,Signal,Entry/Exit,Position,Entry/Exit Position,Portfolio Holdings
2019-09-13,218.75,207.0573,200.50975,1.0,0.0,500.0,0.0,109375.0
2019-09-16,219.9,207.3707,200.63715,1.0,0.0,500.0,0.0,109950.0
2019-09-17,220.7,207.7843,200.79135,1.0,0.0,500.0,0.0,110350.0
2019-09-18,222.77,208.2149,200.97605,1.0,0.0,500.0,0.0,111385.0
2019-09-19,220.96,208.5695,201.13955,1.0,0.0,500.0,0.0,110480.0


#### 6. **Calculate Portfolio Cash:** Determine the amount of cash available in the portfolio after each transaction. This accounts for buying and selling shares; transaction costs are not modeled in this calculation.


In [14]:
# Subtract the running total of trade cash flows from the initial capital:
# purchases (positive `Entry/Exit Position`) reduce cash, sales (negative) add it back
signals_df['Portfolio Cash'] = initial_capital - (signals_df['close'] * signals_df['Entry/Exit Position']).cumsum()
signals_df.tail()

date,close,SMA50,SMA100,Signal,Entry/Exit,Position,Entry/Exit Position,Portfolio Holdings,Portfolio Cash
2019-09-13,218.75,207.0573,200.50975,1.0,0.0,500.0,0.0,109375.0,22495.0
2019-09-16,219.9,207.3707,200.63715,1.0,0.0,500.0,0.0,109950.0,22495.0
2019-09-17,220.7,207.7843,200.79135,1.0,0.0,500.0,0.0,110350.0,22495.0
2019-09-18,222.77,208.2149,200.97605,1.0,0.0,500.0,0.0,111385.0,22495.0
2019-09-19,220.96,208.5695,201.13955,1.0,0.0,500.0,0.0,110480.0,22495.0
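
The cash calculation can be extended to include transaction costs. The sketch below is one possible approach, assuming a flat commission per trade; `fee_per_trade`, the prices, and the trade sizes are hypothetical values chosen for illustration and do not come from this notebook's data.

```python
import pandas as pd

initial_capital = 10_000.0
fee_per_trade = 5.0  # hypothetical flat commission charged on every buy or sell

# Made-up close prices and shares traded (the role of `Entry/Exit Position`)
close = pd.Series([10.0, 11.0, 12.0, 12.5, 13.0])
shares_traded = pd.Series([0.0, 100.0, 0.0, -100.0, 0.0])

# Cash spent on purchases (positive) or received from sales (negative)
trade_cash_flow = close * shares_traded
# Charge the fee only on rows where a trade occurred
fees = fee_per_trade * (shares_traded != 0)

cash = initial_capital - (trade_cash_flow + fees).cumsum()
print(cash)
```

With these numbers the round trip earns 150 before costs and 140 after the two 5-dollar fees. A percentage-based commission or slippage model could be substituted for the flat fee in the same place.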


#### 7. **Track Portfolio Value:** Calculate the total value of the portfolio on a daily basis. This includes the combined value of the held stock and the available cash.


In [15]:
# Calculate the total portfolio value by adding the portfolio cash to the portfolio holdings (or investments)
signals_df['Portfolio Total'] = signals_df['Portfolio Cash'] + signals_df['Portfolio Holdings']
signals_df.tail()

date,close,SMA50,SMA100,Signal,Entry/Exit,Position,Entry/Exit Position,Portfolio Holdings,Portfolio Cash,Portfolio Total
2019-09-13,218.75,207.0573,200.50975,1.0,0.0,500.0,0.0,109375.0,22495.0,131870.0
2019-09-16,219.9,207.3707,200.63715,1.0,0.0,500.0,0.0,109950.0,22495.0,132445.0
2019-09-17,220.7,207.7843,200.79135,1.0,0.0,500.0,0.0,110350.0,22495.0,132845.0
2019-09-18,222.77,208.2149,200.97605,1.0,0.0,500.0,0.0,111385.0,22495.0,133880.0
2019-09-19,220.96,208.5695,201.13955,1.0,0.0,500.0,0.0,110480.0,22495.0,132975.0


#### 8. **Calculate Portfolio Daily Returns:** Compute the daily returns of the portfolio based on changes in its value over time.


In [16]:
# Calculate the portfolio daily returns
signals_df['Portfolio Daily Returns'] = signals_df['Portfolio Total'].pct_change()
signals_df.tail()

date,close,SMA50,SMA100,Signal,Entry/Exit,Position,Entry/Exit Position,Portfolio Holdings,Portfolio Cash,Portfolio Total,Portfolio Daily Returns
2019-09-13,218.75,207.0573,200.50975,1.0,0.0,500.0,0.0,109375.0,22495.0,131870.0,-0.016171
2019-09-16,219.9,207.3707,200.63715,1.0,0.0,500.0,0.0,109950.0,22495.0,132445.0,0.00436
2019-09-17,220.7,207.7843,200.79135,1.0,0.0,500.0,0.0,110350.0,22495.0,132845.0,0.00302
2019-09-18,222.77,208.2149,200.97605,1.0,0.0,500.0,0.0,111385.0,22495.0,133880.0,0.007791
2019-09-19,220.96,208.5695,201.13955,1.0,0.0,500.0,0.0,110480.0,22495.0,132975.0,-0.00676
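
As a quick check of what `pct_change()` computes, the sketch below rebuilds the last five `Portfolio Total` values from the table above as a standalone Series and recomputes the returns. Each return is the current value divided by the previous value, minus 1; for example, 132445 / 131870 - 1 is approximately 0.00436.

```python
import pandas as pd

# Last five `Portfolio Total` values copied from the table above
totals = pd.Series(
    [131870.0, 132445.0, 132845.0, 133880.0, 132975.0],
    index=pd.to_datetime(
        ["2019-09-13", "2019-09-16", "2019-09-17", "2019-09-18", "2019-09-19"]
    ),
)

# Equivalent to totals / totals.shift(1) - 1
daily_returns = totals.pct_change()
print(daily_returns.round(6))
```

The first value is `NaN` here only because this standalone Series has no row before 2019-09-13; in the full DataFrame that row has a predecessor.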


#### 9. **Calculate Cumulative Returns:** Aggregate the daily returns to calculate the cumulative returns of the portfolio. This provides an overall measure of the strategy's performance over the backtesting period.


In [17]:
# Calculate the portfolio cumulative returns
signals_df['Portfolio Cumulative Returns'] = (1 + signals_df['Portfolio Daily Returns']).cumprod() - 1
signals_df.tail()

date,close,SMA50,SMA100,Signal,Entry/Exit,Position,Entry/Exit Position,Portfolio Holdings,Portfolio Cash,Portfolio Total,Portfolio Daily Returns,Portfolio Cumulative Returns
2019-09-13,218.75,207.0573,200.50975,1.0,0.0,500.0,0.0,109375.0,22495.0,131870.0,-0.016171,0.3187
2019-09-16,219.9,207.3707,200.63715,1.0,0.0,500.0,0.0,109950.0,22495.0,132445.0,0.00436,0.32445
2019-09-17,220.7,207.7843,200.79135,1.0,0.0,500.0,0.0,110350.0,22495.0,132845.0,0.00302,0.32845
2019-09-18,222.77,208.2149,200.97605,1.0,0.0,500.0,0.0,111385.0,22495.0,133880.0,0.007791,0.3388
2019-09-19,220.96,208.5695,201.13955,1.0,0.0,500.0,0.0,110480.0,22495.0,132975.0,-0.00676,0.32975
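
Chaining daily returns with `cumprod()` gives the same result as dividing each portfolio value by the starting value and subtracting 1. In the table above, the final cumulative return of 0.32975 equals 132975 / 100000 - 1. The sketch below demonstrates the identity on a short series; the intermediate values are made up, and only the start (100000) and end (132975) match this notebook.

```python
import numpy as np
import pandas as pd

# Illustrative portfolio totals; the middle values are invented for the example
totals = pd.Series([100000.0, 100000.0, 101500.0, 99800.0, 132975.0])

daily = totals.pct_change()
cumulative = (1 + daily).cumprod() - 1

# Direct calculation relative to the starting value
direct = totals / totals.iloc[0] - 1

# The two agree wherever the chained version is defined (all rows after the first)
print(np.allclose(cumulative.iloc[1:], direct.iloc[1:]))
print(cumulative.iloc[-1])
```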


In [18]:
# Review the first rows of the DataFrame
signals_df.head(150)

date,close,SMA50,SMA100,Signal,Entry/Exit,Position,Entry/Exit Position,Portfolio Holdings,Portfolio Cash,Portfolio Total,Portfolio Daily Returns,Portfolio Cumulative Returns
2014-09-22,101.06,,,0.0,,0.0,,0.0,,,,
2014-09-23,102.64,,,0.0,0.0,0.0,0.0,0.0,100000.0,100000.0,,
2014-09-24,101.75,,,0.0,0.0,0.0,0.0,0.0,100000.0,100000.0,0.0,0.0
2014-09-25,97.87,,,0.0,0.0,0.0,0.0,0.0,100000.0,100000.0,0.0,0.0
2014-09-26,100.75,,,0.0,0.0,0.0,0.0,0.0,100000.0,100000.0,0.0,0.0
2014-09-29,100.11,,,0.0,0.0,0.0,0.0,0.0,100000.0,100000.0,0.0,0.0
2014-09-30,100.75,,,0.0,0.0,0.0,0.0,0.0,100000.0,100000.0,0.0,0.0
2014-10-01,99.18,,,0.0,0.0,0.0,0.0,0.0,100000.0,100000.0,0.0,0.0
2014-10-02,99.9,,,0.0,0.0,0.0,0.0,0.0,100000.0,100000.0,0.0,0.0
2014-10-03,99.62,,,0.0,0.0,0.0,0.0,0.0,100000.0,100000.0,0.0,0.0


In [19]:
# Visualize exit position relative to total portfolio value
exit = signals_df[signals_df['Entry/Exit'] == -1.0]['Portfolio Total'].hvplot.scatter(
    color='yellow',
    marker='v',
    legend=False,
    ylabel='Total Portfolio Value',
    width=1000,
    height=400
)

# Visualize entry position relative to total portfolio value
entry = signals_df[signals_df['Entry/Exit'] == 1.0]['Portfolio Total'].hvplot.scatter(
    color='purple',
    marker='^',
    ylabel='Total Portfolio Value',
    width=1000,
    height=400
)

# Visualize the value of the total portfolio
total_portfolio_value = signals_df[['Portfolio Total']].hvplot(
    line_color='lightgray',
    ylabel='Total Portfolio Value',
    xlabel='Date',
    width=1000,
    height=400
)

# Overlay the plots
portfolio_entry_exit_plot = total_portfolio_value * entry * exit
portfolio_entry_exit_plot.opts(
    title="Apple Algorithm - Total Portfolio Value",
    yformatter='%.0f'
)