# Backtesting Candlestick and Chart Patterns with Machine Learning
In this notebook, we'll test how well candlestick and chart patterns predict stock price movements. We combine traditional technical analysis with machine learning: the detected patterns become features for an XGBoost classifier, and we backtest its predictions on selected assets over a fixed date range.

In [4]:
# Building TA-Lib from source requires the TA-Lib C library to be installed first
%pip install numpy pandas xgboost scikit-learn yfinance backtrader TA-Lib

Collecting TA-Lib
  Using cached TA-Lib-0.4.28.tar.gz (357 kB)
  Installing build dependencies: started
  Installing build dependencies: finished with status 'done'
  Getting requirements to build wheel: started
  Getting requirements to build wheel: finished with status 'done'
  Installing backend dependencies: started
  Installing backend dependencies: finished with status 'done'
  Preparing metadata (pyproject.toml): started
  Preparing metadata (pyproject.toml): finished with status 'done'
Building wheels for collected packages: TA-Lib
  Building wheel for TA-Lib (pyproject.toml): started
  Building wheel for TA-Lib (pyproject.toml): finished with status 'error'
Failed to build TA-Lib
Note: you may need to restart the kernel to use updated packages.


  error: subprocess-exited-with-error
  
  Building wheel for TA-Lib (pyproject.toml) did not run successfully.
  exit code: 1
  
  [20 lines of output]
  running bdist_wheel
  running build
  running build_py
  creating build
  creating build\lib.win-amd64-cpython-311
  creating build\lib.win-amd64-cpython-311\talib
  copying talib\abstract.py -> build\lib.win-amd64-cpython-311\talib
  copying talib\deprecated.py -> build\lib.win-amd64-cpython-311\talib
  copying talib\stream.py -> build\lib.win-amd64-cpython-311\talib
  copying talib\__init__.py -> build\lib.win-amd64-cpython-311\talib
  running build_ext
  building 'talib._ta_lib' extension
  creating build\temp.win-amd64-cpython-311
  creating build\temp.win-amd64-cpython-311\Release
  creating build\temp.win-amd64-cpython-311\Release\talib
  "C:\Program Files (x86)\Microsoft Visual Studio\2022\BuildTools\VC\Tools\MSVC\14.36.32532\bin\HostX86\x64\cl.exe" /c /nologo /O2 /W3 /GL /DNDEBUG /MD -Ic:\ta-lib\c\include -IC:\Users\Administr

In [3]:
# Import necessary libraries
import numpy as np
import pandas as pd
import xgboost as xgb
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score
# TA-Lib supplies the candlestick pattern functions (CDL*); this import fails if the package did not install
import talib
import yfinance as yf
# For backtesting, you can use Backtrader or another backtesting library
import backtrader as bt

ModuleNotFoundError: No module named 'talib'

# Data Acquisition
We will fetch historical data for the assets using the yfinance library.

In [None]:
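def fetch_data(asset, start, end):
    data = yf.download(asset, start=start, end=end)
    # Newer yfinance versions may return MultiIndex columns; keep only the field names
    if isinstance(data.columns, pd.MultiIndex):
        data.columns = data.columns.get_level_values(0)
    return data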
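
Before running pattern detection, it can help to confirm that the download actually produced usable data, since an unrecognized ticker typically yields an empty frame. The sketch below is a minimal check under that assumption; `check_ohlc` and `REQUIRED_COLUMNS` are hypothetical names, not part of yfinance.

```python
import pandas as pd

# Columns the pattern functions in this notebook rely on (assumed naming)
REQUIRED_COLUMNS = ["Open", "High", "Low", "Close"]

def check_ohlc(data):
    """Raise ValueError if the frame is unusable for pattern detection (hypothetical helper)."""
    missing = [col for col in REQUIRED_COLUMNS if col not in data.columns]
    if missing:
        raise ValueError(f"missing columns: {missing}")
    if data.empty:
        raise ValueError("no rows returned; check the ticker symbol and date range")
    if not data.index.is_monotonic_increasing:
        raise ValueError("index is not sorted in ascending order")
    return data

# Tiny hand-made frame standing in for a real download
sample = pd.DataFrame({"Open": [1.0], "High": [2.0], "Low": [0.5], "Close": [1.5]})
checked = check_ohlc(sample)
```

Calling `check_ohlc(fetch_data(...))` would fail early with a readable message instead of a harder-to-trace error further down the pipeline.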


# Pattern Recognition
Using TA-Lib, we will identify various candlestick patterns in the data.

In [None]:
def add_candlestick_patterns(data):
    # Example: Adding a few candlestick patterns
    data['Hammer'] = talib.CDLHAMMER(data['Open'], data['High'], data['Low'], data['Close'])
    data['Engulfing'] = talib.CDLENGULFING(data['Open'], data['High'], data['Low'], data['Close'])
    # Add more patterns as needed
    return data
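
To make the idea of a candlestick pattern concrete, and to have something to experiment with when TA-Lib is not installed, here is a simplified engulfing detector in plain pandas. It is a sketch under stated assumptions: the rule is a textbook-style approximation and not TA-Lib's exact definition, and `simple_engulfing` is a hypothetical helper. It follows TA-Lib's output convention of 100 for bullish, -100 for bearish, and 0 for no pattern.

```python
import numpy as np
import pandas as pd

def simple_engulfing(data):
    """Simplified engulfing detector (hypothetical helper, not TA-Lib's exact rule)."""
    o, c = data["Open"], data["Close"]
    prev_o, prev_c = o.shift(1), c.shift(1)
    # Bullish: a down candle followed by an up candle whose body covers the previous body
    bullish = (prev_c < prev_o) & (c > o) & (o <= prev_c) & (c >= prev_o)
    # Bearish: an up candle followed by a down candle whose body covers the previous body
    bearish = (prev_c > prev_o) & (c < o) & (o >= prev_c) & (c <= prev_o)
    signal = np.where(bullish, 100, np.where(bearish, -100, 0))
    return pd.Series(signal, index=data.index, name="Engulfing")

# Four made-up candles; only the second one engulfs the candle before it
candles = pd.DataFrame({
    "Open":  [10.0, 9.0, 11.5, 11.0],
    "Close": [9.5, 10.5, 11.0, 11.2],
})
engulfing = simple_engulfing(candles)
```

The first row can never signal because it has no previous candle to compare against.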

# Data Preparation
This step involves preparing our dataset for the machine learning model, including feature creation and labeling.

In [None]:
def prepare_data(data):
    # Work on a copy so the caller's DataFrame does not gain the label columns
    data = data.copy()
    # Label: 1 if the next day's close is higher than today's, otherwise 0
    data['Next_Close'] = data['Close'].shift(-1)
    data['Return'] = (data['Next_Close'] - data['Close']) / data['Close']
    data['Target'] = np.where(data['Return'] > 0, 1, 0)
    # Drop rows with NaN values (this removes the last row, which has no next close)
    data = data.dropna()
    # Exclude the label and the forward-looking columns from the features
    X = data.drop(['Target', 'Return', 'Next_Close'], axis=1)
    y = data['Target']
    return X, y
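
The labeling logic is easiest to follow on a handful of rows. The example below applies the same shift, return, and threshold steps to four made-up closing prices; the numbers are illustrative only.

```python
import numpy as np
import pandas as pd

# Made-up closing prices for four consecutive days
close = pd.Series([100.0, 102.0, 101.0, 101.0], name="Close")

next_close = close.shift(-1)              # tomorrow's close, aligned to today
ret = (next_close - close) / close        # next-day return
target = np.where(ret > 0, 1, 0)          # 1 = up day ahead, 0 = flat or down

labeled = pd.DataFrame({"Close": close, "Return": ret, "Target": target}).dropna()
```

Two details are worth noticing: the last day is dropped because it has no next close, and a flat day (the third row) is labeled 0 because the comparison is strictly greater than zero.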

# Model Training
Here, we'll train an XGBoost model to predict future price movements based on identified patterns.

In [None]:
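def train_model(X, y):
    # Split chronologically (no shuffling) so the model is never tested on dates earlier than its training data
    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, shuffle=False)
    model = xgb.XGBClassifier()
    model.fit(X_train, y_train)
    # Evaluate on the most recent 20% of rows
    predictions = model.predict(X_test)
    accuracy = accuracy_score(y_test, predictions)
    print(f"Model Accuracy: {accuracy:.3f}")
    return model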
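
An accuracy figure is hard to judge on its own, because markets drift and one class is usually more common than the other. A simple reference point is the accuracy of always predicting the most frequent training label. The helper below is a hypothetical sketch of that baseline, with made-up labels for illustration.

```python
import numpy as np

def majority_baseline_accuracy(y_train, y_test):
    """Accuracy from always predicting the most common training label (hypothetical helper)."""
    y_train = np.asarray(y_train)
    y_test = np.asarray(y_test)
    values, counts = np.unique(y_train, return_counts=True)
    majority = values[np.argmax(counts)]
    return float(np.mean(y_test == majority))

# Made-up labels: 1 = next day up, 0 = next day flat or down
y_train = [1, 1, 0, 1, 0, 1]
y_test = [1, 0, 1, 1]
baseline = majority_baseline_accuracy(y_train, y_test)
```

If the model's accuracy does not clearly exceed this baseline on the test rows, the pattern features are probably not adding predictive value.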

# Backtesting Strategy
Using backtrader, we will backtest the predictions made by our model to evaluate its effectiveness.

In [None]:
def backtest_strategy(data, model):
    # Implement backtesting logic here
    # This is a placeholder function
    pass
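
The placeholder above leaves the backtrader logic open. As a rough stand-in, and explicitly not a backtrader implementation, the sketch below runs a vectorized long-only backtest in pandas: hold the asset for one day whenever the signal is 1, otherwise stay in cash. It assumes no transaction costs or slippage, and `simple_long_only_backtest` is a hypothetical helper; the prices and signals are made up.

```python
import pandas as pd

def simple_long_only_backtest(close, signal):
    """Equity curve for holding one day after each signal of 1 (hypothetical helper).

    Assumes trades fill at the close and ignores costs and slippage.
    """
    # Return earned from today's close to tomorrow's close
    next_ret = close.shift(-1) / close - 1
    # Earn that return only on days with a long signal; drop the final day (no next close)
    strategy_ret = (next_ret * signal).dropna()
    # Compound daily returns into an equity curve starting from 1.0
    return (1 + strategy_ret).cumprod()

close = pd.Series([100.0, 110.0, 99.0, 108.9])
signal = pd.Series([1, 0, 1, 0])  # e.g. model predictions aligned to each day
equity = simple_long_only_backtest(close, signal)
```

In this toy series the strategy captures both 10% up moves and sits out the 10% drop, so the final equity is about 1.21. With real predictions, `signal` would come from `model.predict` on rows the model was not trained on.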


# Main Execution
The main function orchestrates the process from data fetching to backtesting for each asset.

In [None]:
# Main function
def main():
    assets = ['SPY', 'TQQQ', 'SPSX']
    start = '2020-01-01'
    end = '2021-01-01'

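    for asset in assets:
        print(f"Processing {asset}")
        data = fetch_data(asset, start, end)
        # Skip tickers for which no data came back (e.g. an unrecognized symbol)
        if data.empty:
            print(f"No data returned for {asset}, skipping")
            continue
        data_with_patterns = add_candlestick_patterns(data)
        X, y = prepare_data(data_with_patterns)
        model = train_model(X, y)
        backtest_strategy(data_with_patterns, model)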

if __name__ == "__main__":
    main()