# Backtesting Analysis
This project demonstrates how to use backtesting to assess stock performance and narrow a shortlist down to the strongest stock candidates.

## What is Backtesting?
Backtesting with Simple Moving Averages (SMAs) in Python is a good step in evaluating the effectiveness of an investment strategy before risking real money in the financial markets. This approach simulates trades on historical price data to assess how a strategy would have performed in the past.

By applying the strategy's rules to historical price data, key performance metrics can be measured, such as returns, drawdowns, and risk-adjusted ratios. These results provide valuable insight into the strategy's historical profitability and risk profile, helping traders make informed decisions. To summarize, backtesting rests on the working assumption that "if a stock performed well in the past, it is more likely to perform well in the future."

This analysis can be done with a library called `Backtesting.py`, which simplifies the process: the historical prices of the chosen stock, the amount of cash to invest, and the strategy to be used are passed as arguments.

In [1]:
# import libraries
from datetime import datetime
from pandas_datareader import data as pdr
import yfinance as yf
import pandas as pd
import time
from selenium import webdriver
from bs4 import BeautifulSoup as bs
import random
import warnings
from selenium.webdriver.common.keys import Keys
from selenium.webdriver.common.by import By
from selenium.common.exceptions import NoSuchElementException

pd.set_option('display.max_columns', None)
pd.set_option('display.max_colwidth', None)
pd.set_option('display.max_rows', None)

In [2]:
df = pd.read_csv('exports/filtered_class_a_b.csv')

## Part I: Singular Ticker Testing
### Retrieving the Candidate's Historical Prices (yfinance)
Backtesting.py takes a "bring your own data" approach; the data can be retrieved easily from Yahoo Finance through the yfinance library. The historical prices retrieved this way are the input to the analysis.

To demonstrate, one of the candidates that passed the earlier filtering on dividends and foundational health (class A or B) is used for this backtest. A two-year period is simulated for this demonstration.
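As a rough illustration of what "bring your own data" means in practice: Backtesting.py expects a pandas DataFrame indexed by date with `Open`, `High`, `Low`, and `Close` columns (`Volume` is optional). The sketch below uses a small hand-made frame with illustrative prices and the extra `Dividends` and `Stock Splits` columns that yfinance's `history()` typically returns. `prepare_ohlcv` is a hypothetical helper introduced here, not part of either library.

```python
import pandas as pd

# Hypothetical frame mimicking the shape of a yfinance history() result;
# the values are illustrative only
raw = pd.DataFrame(
    {
        "Open": [1.41, 1.43, 1.43],
        "High": [1.43, 1.43, 1.43],
        "Low": [1.36, 1.37, 1.40],
        "Close": [1.43, 1.43, 1.40],
        "Volume": [1681400, 1471400, 945200],
        "Dividends": [0.0, 0.0, 0.0],
        "Stock Splits": [0.0, 0.0, 0.0],
    },
    index=pd.to_datetime(["2021-11-22", "2021-11-23", "2021-11-24"]),
)


def prepare_ohlcv(frame):
    """Keep the OHLCV columns, drop incomplete rows, and sort by date."""
    required = ["Open", "High", "Low", "Close"]
    missing = [c for c in required if c not in frame.columns]
    if missing:
        raise ValueError(f"missing columns: {missing}")
    columns = required + (["Volume"] if "Volume" in frame.columns else [])
    return frame[columns].dropna().sort_index()


ohlcv = prepare_ohlcv(raw)
print(ohlcv)
```

Passing the raw yfinance frame directly also works in the cells that follow; trimming it first mainly guards against empty or incomplete downloads.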

In [3]:
# Configure Backtesting.py plotting and pick the candidate ticker
import backtesting
from backtesting.test import GOOG  # sample Google OHLC data bundled with Backtesting.py (not used below)
backtesting.set_bokeh_output(notebook=False)

ticker = yf.Ticker("5248.KL")



In [30]:
# Get historical data for the ticker
hist = ticker.history(period="2y")
# start_date = datetime(2021, 6, 1)
# end_date = datetime(2023, 10, 26)
# hist = yf.Ticker("5248.KL").history(period="2y", start=start_date, end=end_date)
print(hist)

                               Open      High       Low     Close    Volume  \
Date                                                                          
2021-11-22 00:00:00+08:00  1.413673  1.431019  1.361636  1.431019   1681400   
2021-11-23 00:00:00+08:00  1.431019  1.431019  1.370309  1.431019   1471400   
2021-11-24 00:00:00+08:00  1.431019  1.431019  1.396328  1.396328    945200   
2021-11-25 00:00:00+08:00  1.387655  1.387655  1.370309  1.370309   1270800   
2021-11-26 00:00:00+08:00  1.370309  1.378982  1.344291  1.370309   2027500   
2021-11-29 00:00:00+08:00  1.352963  1.352963  1.318272  1.326945   1726800   
2021-11-30 00:00:00+08:00  1.326945  1.344291  1.274908  1.292254   2753500   
2021-12-01 00:00:00+08:00  1.300926  1.309599  1.283581  1.283581   1621000   
2021-12-02 00:00:00+08:00  1.283581  1.318272  1.283581  1.283581   2563200   
2021-12-06 00:00:00+08:00  1.283581  1.309599  1.283581  1.292254    746700   
2021-12-07 00:00:00+08:00  1.318272  1.326945  1.300

In [33]:
start_date = datetime(2021, 11, 21)
end_date = datetime(2023, 10, 26)
yf.download("5248.KL", start=start_date, end=end_date, auto_adjust=True)

[*********************100%%**********************]  1 of 1 completed


| Date | Open | High | Low | Close | Volume |
|---|---|---|---|---|---|
| 2021-11-22 | 1.413673 | 1.431019 | 1.361636 | 1.431019 | 1681400 |
| 2021-11-23 | 1.431019 | 1.431019 | 1.370309 | 1.431019 | 1471400 |
| 2021-11-24 | 1.431019 | 1.431019 | 1.396328 | 1.396328 | 945200 |
| 2021-11-25 | 1.387655 | 1.387655 | 1.370309 | 1.370309 | 1270800 |
| 2021-11-26 | 1.370309 | 1.378982 | 1.344291 | 1.370309 | 2027500 |
| 2021-11-29 | 1.352963 | 1.352963 | 1.318272 | 1.326945 | 1726800 |
| 2021-11-30 | 1.326945 | 1.344291 | 1.274908 | 1.292254 | 2753500 |
| 2021-12-01 | 1.300926 | 1.309599 | 1.283581 | 1.283581 | 1621000 |
| 2021-12-02 | 1.283581 | 1.318272 | 1.283581 | 1.283581 | 2563200 |
| 2021-12-06 | 1.283581 | 1.309599 | 1.283581 | 1.292254 | 746700 |


In [15]:
stock_data = yf.download("5248.KL", start='2021-06-01', end=end_date)
print(stock_data)

[*********************100%%**********************]  1 of 1 completed
            Open  High   Low  Close  Adj Close    Volume
Date                                                    
2021-06-01  1.34  1.41  1.34   1.40   1.196738   3854600
2021-06-02  1.39  1.43  1.38   1.42   1.213834    903000
2021-06-03  1.42  1.43  1.40   1.40   1.196738    459200
2021-06-04  1.41  1.43  1.40   1.42   1.213834   2763800
2021-06-08  1.42  1.45  1.41   1.45   1.239479   1450100
2021-06-09  1.45  1.45  1.43   1.43   1.222382   1751600
2021-06-10  1.44  1.49  1.44   1.48   1.265123   3378900
2021-06-11  1.48  1.50  1.46   1.50   1.282219   1664800
2021-06-14  1.49  1.49  1.47   1.49   1.273671   1552500
2021-06-15  1.49  1.49  1.46   1.49   1.273671   1944100
2021-06-16  1.49  1.49  1.47   1.48   1.265123   3052500
2021-06-17  1.47  1.49  1.45   1.46   1.248027   1885400
2021-06-18  1.46  1.48  1.44   1.48   1.265123   1948900
2021-06-21  1.52  1.55  1.50   1.54   1.316411   5481900
2021-06-22  1.55  1

## Defining Backtesting Strategy with Simple Moving Average

The code in this section implements a simple moving average (SMA) crossover trading strategy using the Backtesting.py library. In this strategy, two SMAs are defined: a shorter-term SMA with window `n1` and a longer-term SMA with window `n2`. The key idea is to use these SMAs to identify potential buy and sell signals based on their crossovers.


- When the shorter-term SMA (sma1) crosses above the longer-term SMA (sma2), it triggers a buy signal. Any existing positions (long or short) are closed, and a long position (buy) is initiated.
- Conversely, when sma1 crosses below sma2, it triggers a sell signal. Again, existing positions are closed, and a short position (sell) is initiated.

This strategy aims to capture potential trends in the asset's price movements by leveraging SMA crossovers. Backtesting is then used to evaluate the strategy's historical performance based on these rules, providing insights into its profitability and effectiveness.
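To make the crossover rule concrete, here is a minimal sketch on a toy price series. The prices and the window lengths (2 and 4) are made up for illustration and are far shorter than anything used in the actual strategy. A buy signal is flagged on the bar where the fast SMA moves from below the slow SMA to above it, and a sell signal on the mirror-image move.

```python
import pandas as pd


def sma(values, n):
    """Simple moving average over the last `n` values (NaN until `n` values exist)."""
    return pd.Series(values).rolling(n).mean()


# Hypothetical closing prices, made up for illustration only
close = [10, 9, 8, 9, 11, 13, 12, 10, 8, 7]

fast = sma(close, 2)  # shorter-term SMA (plays the role of n1)
slow = sma(close, 4)  # longer-term SMA (plays the role of n2)

# Compare the previous bar with the current bar to detect a crossing.
# Comparisons against NaN are False, so no signal fires during warm-up.
buy_signal = (fast.shift(1) < slow.shift(1)) & (fast > slow)
sell_signal = (fast.shift(1) > slow.shift(1)) & (fast < slow)

buy_bars = buy_signal[buy_signal].index.tolist()
sell_bars = sell_signal[sell_signal].index.tolist()
print("buy at bars:", buy_bars)    # → buy at bars: [4]
print("sell at bars:", sell_bars)  # → sell at bars: [7]
```

The price bottoms at bar 2, but the buy signal only appears at bar 4: moving averages lag the price, which is the trade-off for filtering out short-lived swings.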

In [16]:
import pandas as pd

def SMA(values, n):
    """
    Return simple moving average of `values`, at
    each step taking into account `n` previous values.
    """
    return pd.Series(values).rolling(n).mean()

In [17]:
from backtesting import Strategy
from backtesting.lib import crossover

class SmaCross(Strategy):
    # Define the two MA lags as *class variables*
    # for later optimization
    n1 = 20
    n2 = 50

    def init(self):
        # Precompute the two moving averages
        self.sma1 = self.I(SMA, self.data.Close, self.n1)
        self.sma2 = self.I(SMA, self.data.Close, self.n2)
    
    def next(self):
        # If sma1 crosses above sma2, close any existing
        # short trades, and buy the asset
        if crossover(self.sma1, self.sma2):
            self.position.close()
            self.buy()

        # Else, if sma1 crosses below sma2, close any existing
        # long trades, and sell the asset
        elif crossover(self.sma2, self.sma1):
            self.position.close()
            self.sell()

In [22]:
class SingleSma(Strategy):
    # Define the moving average lag as a *class variable*
    n = 100
    
    def init(self):
        # Precompute the moving average
        self.sma = self.I(SMA, self.data.Close, self.n)
    
    def next(self):
        # If the close price crosses above the n-day moving average
        # (n = 100 by default), close any existing short trades, and buy the asset
        if crossover(self.data.Close, self.sma):
            self.position.close()
            self.buy()

        # Else, if the close price crosses below the n-day moving average,
        # close any existing long trades, and sell the asset
        elif crossover(self.sma, self.data.Close):
            self.position.close()
            self.sell()

## Running the Backtest

In [23]:
from backtesting import Backtest

bt = Backtest(hist, SingleSma, cash=10000, commission=0)
stats = bt.run()
stats

Start                     2021-11-22 00:00...
End                       2023-11-21 00:00...
Duration                    729 days 00:00:00
Exposure Time [%]                   71.106557
Equity Final [$]                  12806.38569
Equity Peak [$]                  13067.735441
Return [%]                          28.063857
Buy & Hold Return [%]               71.206681
Return (Ann.) [%]                   13.625127
Volatility (Ann.) [%]               24.158814
Sharpe Ratio                         0.563982
Sortino Ratio                        1.033985
Calmar Ratio                         0.943933
Max. Drawdown [%]                  -14.434419
Avg. Drawdown [%]                   -4.298791
Max. Drawdown Duration      177 days 00:00:00
Avg. Drawdown Duration       32 days 00:00:00
# Trades                                   14
Win Rate [%]                        35.714286
Best Trade [%]                      27.077684
Worst Trade [%]                     -2.739723
Avg. Trade [%]                    

## Analyzing the Results

- Return [%]: This is the total return on your investment as a percentage. It shows how much your initial capital has grown or shrunk.

- Buy & Hold Return [%]: This is the return you would have achieved if you simply bought and held the asset without any trading. It provides a benchmark for comparison.

- Volatility (Ann.) [%]: This represents the annualized volatility of your strategy. It measures the variation in returns over time. Lower volatility is generally preferred.

- Sharpe Ratio: The Sharpe Ratio measures the risk-adjusted return of your strategy. A higher Sharpe Ratio indicates better risk-adjusted performance. It's a common metric for assessing investment strategies.

- Sortino Ratio: Similar to the Sharpe Ratio, the Sortino Ratio measures risk-adjusted return but focuses on downside risk (volatility of negative returns). A higher Sortino Ratio indicates better risk-adjusted performance, especially in strategies that aim to minimize downside risk.

- Calmar Ratio: The Calmar Ratio measures the risk-adjusted return relative to the maximum drawdown. A higher Calmar Ratio is generally better, as it indicates better returns relative to the risk of significant losses.

- Win Rate [%]: This is the percentage of profitable trades out of the total number of trades. A higher win rate is generally preferred.

- Avg. Trade [%]: This is the average percentage return of all trades in your strategy. It provides a measure of the typical trade performance.

- Avg. Trade Duration: This is the average duration (in days) of all trades.

- Profit Factor: The Profit Factor measures the ratio of gross profit to gross loss. A higher profit factor is generally desirable.

- Expectancy [%]: Expectancy represents the average amount you can expect to win (or lose) per trade.
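The trade-level metrics in this list can be worked through by hand. The sketch below uses made-up per-trade returns and the plain textbook definitions; Backtesting.py may apply slightly different conventions internally, so treat this as an illustration of the ideas, not a reproduction of its calculations.

```python
# Hypothetical per-trade returns in percent, made up for illustration only
trade_returns = [5.0, -2.0, 3.0, -1.0, -2.0]

wins = [r for r in trade_returns if r > 0]
losses = [r for r in trade_returns if r < 0]

# Win Rate [%]: share of trades that ended with a profit
win_rate = 100 * len(wins) / len(trade_returns)

# Profit Factor: gross profit divided by gross loss (taken as a positive number)
profit_factor = sum(wins) / abs(sum(losses))

# Expectancy [%]: arithmetic mean return per trade
expectancy = sum(trade_returns) / len(trade_returns)

print(f"Win Rate [%]:   {win_rate:.1f}")       # → 40.0
print(f"Profit Factor:  {profit_factor:.2f}")  # → 1.60
print(f"Expectancy [%]: {expectancy:.2f}")     # → 0.60
```

Only two of the five trades win, yet the profit factor is above 1 and the expectancy is positive because the winners are larger than the losers. A low win rate on its own does not make a strategy unprofitable.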

## Interpreting the Results

Ideal ranges for backtesting metrics vary with the specific trading strategy, risk tolerance, and investment goals. The guidelines below are ranges that traders often consider ideal or acceptable for key metrics in a 12-month backtest. They are not set in stone, and what counts as ideal can differ from one trader to another.

1. **Return [%]:**
   - Ideal Range: Positive returns are typically desirable. Aiming for a return that outperforms a benchmark (e.g., Buy & Hold) is a common goal.

2. **Sharpe Ratio:**
   - Ideal Range: A Sharpe Ratio greater than 1 is often considered good; any value above 0 already indicates a positive risk-adjusted return. Higher values are better.

3. **Sortino Ratio:**
   - Ideal Range: A Sortino Ratio greater than 1 is generally seen as positive. It focuses on minimizing downside risk, so a higher Sortino Ratio is preferred.

4. **Calmar Ratio:**
   - Ideal Range: A Calmar Ratio greater than 1 is typically desired. It assesses risk-adjusted return relative to maximum drawdown. Higher values are better.

5. **Max. Drawdown [%]:**
   - Ideal Range: A lower maximum drawdown is better. Ideally, it should be less than 20% for most traders, but this can vary based on risk tolerance.

6. **Volatility (Ann.) [%]:**
   - Ideal Range: Lower volatility is generally preferred. Volatility should be consistent with your risk tolerance and investment goals.

7. **Win Rate [%]:**
   - Ideal Range: A win rate above 50% is typically desirable. A higher win rate indicates more winning trades.

8. **Profit Factor:**
   - Ideal Range: A profit factor greater than 1 is usually considered good. It indicates that the strategy's gross profit outweighs gross losses.

9. **Expectancy [%]:**
   - Ideal Range: A positive expectancy indicates that, on average, each trade contributes positively to the strategy's performance.

Note that these ranges are general guidelines and may vary based on the specific trading strategy and risk tolerance. Some traders may be willing to accept higher drawdowns in exchange for potentially higher returns, while others prioritize lower risk and stability.

Additionally, backtesting is just one part of the evaluation process. Real-world trading involves factors like slippage, transaction costs, and market conditions that can impact performance. It's essential to consider these factors and conduct thorough analysis before implementing a trading strategy in live markets.
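As a quick way to apply these guidelines, the sketch below checks the metrics reported in the `SingleSma` run from the "Running the Backtest" section against the rule-of-thumb thresholds listed here. The values are copied from that output. The thresholds are the general guidelines from this section, not hard limits, and the `checks` dictionary is just one possible way to organize them.

```python
# Values copied from the SingleSma backtest output shown earlier
stats = {
    "Return [%]": 28.063857,
    "Buy & Hold Return [%]": 71.206681,
    "Sharpe Ratio": 0.563982,
    "Sortino Ratio": 1.033985,
    "Calmar Ratio": 0.943933,
    "Max. Drawdown [%]": -14.434419,
    "Win Rate [%]": 35.714286,
}

# Rule-of-thumb checks taken from the guidelines in this section
checks = {
    "Return beats Buy & Hold": stats["Return [%]"] > stats["Buy & Hold Return [%]"],
    "Sharpe Ratio > 1": stats["Sharpe Ratio"] > 1,
    "Sortino Ratio > 1": stats["Sortino Ratio"] > 1,
    "Calmar Ratio > 1": stats["Calmar Ratio"] > 1,
    "Max. Drawdown within 20%": abs(stats["Max. Drawdown [%]"]) < 20,
    "Win Rate > 50%": stats["Win Rate [%]"] > 50,
}

for name, passed in checks.items():
    print(f"{'PASS' if passed else 'FAIL'}  {name}")
```

On these numbers only the Sortino Ratio and the maximum drawdown pass. The strategy made money, but simply holding the stock over the same period would have returned considerably more.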

In [25]:
# Plot the results of the last backtest run held by the `bt` Backtest object
bt.plot()



## Part II: Bulk Ticker Testing & Export to CSV

In [2]:
df = pd.read_csv('exports/filtered_class_a_b.csv')

In [10]:

# Metrics to keep from each backtest result (bt.run() returns a pandas Series)
selected_metrics = ['Return [%]', 'Buy & Hold Return [%]', 'Return (Ann.) [%]',
                    'Volatility (Ann.) [%]', 'Avg. Drawdown [%]', 'Win Rate [%]',
                    'Avg. Trade [%]', 'Profit Factor', 'Expectancy [%]']

results = []

# Loop through each ticker symbol in the CSV file
for ticker_symbol in df['Ticker']:
    # Get one year of historical data for the ticker
    hist = yf.Ticker(ticker_symbol).history(period="1y")

    # Skip tickers for which Yahoo Finance returned no data
    if hist.empty:
        continue

    # Run the SMA crossover backtest with a 0.2% commission per trade
    bt = Backtest(hist, SmaCross, cash=10000, commission=.002)
    stats = bt.run()

    # Store the ticker symbol alongside the selected metrics
    results.append({'Ticker': ticker_symbol, **stats[selected_metrics].to_dict()})

# Convert the results list to a DataFrame with one row per ticker
results_df = pd.DataFrame(results)


In [11]:
results_df

| | Ticker | Return [%] | Buy & Hold Return [%] | Return (Ann.) [%] | Volatility (Ann.) [%] | Avg. Drawdown [%] | Win Rate [%] | Avg. Trade [%] | Profit Factor | Expectancy [%] |
|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 7090.KL | 6.416438 | 12.325211 | 6.633644 | 21.922503 | -3.843551 | 54.545455 | 0.567139 | 1.533441 | 0.658037 |
| 1 | 5248.KL | -13.155241 | 41.809927 | -13.607735 | 19.111162 | -14.233281 | 15.384615 | -1.06399 | 0.698382 | -0.842703 |
| 2 | 2062.KL | -25.095244 | 9.485007 | -25.892597 | 19.281929 | -8.672751 | 35.714286 | -2.042905 | 0.226853 | -1.939413 |
| 3 | 6139.KL | 10.929217 | 12.539249 | 11.356178 | 16.295912 | -2.677627 | 75.0 | 1.305223 | 9.370113 | 1.317938 |
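The export step named in this part's heading is not shown in the notebook. A minimal sketch of it, assuming the results should be ranked by strategy return before saving, could look like the following. The small `results_df` here is a stand-in built from three columns of the table above, and the output file name and location are hypothetical.

```python
import os
import tempfile

import pandas as pd

# Stand-in for results_df, using a subset of the columns from the results table
results_df = pd.DataFrame({
    "Ticker": ["7090.KL", "5248.KL", "2062.KL", "6139.KL"],
    "Return [%]": [6.416438, -13.155241, -25.095244, 10.929217],
    "Buy & Hold Return [%]": [12.325211, 41.809927, 9.485007, 12.539249],
})

# Hypothetical output location; adjust to the project's own export folder
output_path = os.path.join(tempfile.gettempdir(), "backtest_results.csv")

# Rank the tickers by strategy return, best first, and write them out.
# index=False keeps the DataFrame's row index out of the file.
ranked = results_df.sort_values("Return [%]", ascending=False)
ranked.to_csv(output_path, index=False)

# Read the file back to confirm the round trip
reloaded = pd.read_csv(output_path)
print(reloaded)
```

Writing with `index=False` means the reloaded file contains only the named columns, with no extra unnamed index column.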
