### **BITCOIN TRADING STRATEGY**

**Project goal**: Train classification algorithms on a bitcoin trading strategy problem, then improve the performance of one machine learning model by tuning its hyperparameters with grid search and Bayesian optimization.

- The idea is to predict when to buy or sell bitcoin. We define a trading signal and encode it as 1 (buy) or 0 (sell). The signal comes from comparing the short-term and long-term price trends: when the short-term moving average is greater than the long-term moving average, we buy bitcoin; otherwise, we sell. This is therefore a classification problem in which we are interested in getting the direction of the Bitcoin price movement right.

**Loading Helper Packages and Data**

In [3]:
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import seaborn as sns

sns.set()

In [8]:
from time import time 

import joblib 

# display all columns 
pd.set_option("display.max_columns", None)
pd.set_option("display.max_rows", None)

from bayes_opt import BayesianOptimization


# scikit-learn classification models
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.dummy import DummyClassifier
from sklearn.ensemble import (
    AdaBoostClassifier, 
    GradientBoostingClassifier,
    RandomForestClassifier
)
from sklearn.linear_model import LogisticRegression

# Evaluation metrics
from sklearn.metrics import (
    accuracy_score,
    classification_report,
    confusion_matrix,
    f1_score,
    precision_score,
    recall_score,
)
from sklearn.model_selection import (
    GridSearchCV,
    KFold,
    StratifiedGroupKFold,
    cross_val_score,
    cross_validate,
    train_test_split,
)
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.neural_network import MLPClassifier
from sklearn.preprocessing import StandardScaler
from sklearn.tree import DecisionTreeClassifier

In [12]:
import yfinance as yf

BTC_Ticker = yf.Ticker("BTC-USD")
BTC_Data = BTC_Ticker.history(period="5y")

In [13]:
BTC_Data.head()

Date,Open,High,Low,Close,Volume,Dividends,Stock Splits
2020-01-14 00:00:00+00:00,8140.933105,8879.511719,8140.933105,8827.764648,44841784107,0.0,0.0
2020-01-15 00:00:00+00:00,8825.34375,8890.117188,8657.1875,8807.010742,40102834650,0.0,0.0
2020-01-16 00:00:00+00:00,8812.481445,8846.460938,8612.095703,8723.786133,31313981931,0.0,0.0
2020-01-17 00:00:00+00:00,8725.209961,8958.12207,8677.316406,8929.038086,36372139320,0.0,0.0
2020-01-18 00:00:00+00:00,8927.211914,9012.198242,8827.332031,8942.808594,32337772627,0.0,0.0


**Exploratory Data Analysis**

In [14]:
BTC_Data.shape

(1828, 7)

In [15]:
BTC_Data.info()

<class 'pandas.core.frame.DataFrame'>
DatetimeIndex: 1828 entries, 2020-01-14 00:00:00+00:00 to 2025-01-14 00:00:00+00:00
Data columns (total 7 columns):
 #   Column        Non-Null Count  Dtype  
---  ------        --------------  -----  
 0   Open          1828 non-null   float64
 1   High          1828 non-null   float64
 2   Low           1828 non-null   float64
 3   Close         1828 non-null   float64
 4   Volume        1828 non-null   int64  
 5   Dividends     1828 non-null   float64
 6   Stock Splits  1828 non-null   float64
dtypes: float64(6), int64(1)
memory usage: 114.2 KB


In [16]:
BTC_Data.describe()

Statistic,Open,High,Low,Close,Volume,Dividends,Stock Splits
count,1828.0,1828.0,1828.0,1828.0,1828.0,1828.0,1828.0
mean,36933.638399,37741.569888,36093.541273,36978.606822,33335280000.0,0.0,0.0
std,21580.702666,22047.754495,21097.788762,21610.566836,19597760000.0,0.0,0.0
min,5002.578125,5331.833984,4106.980957,4970.788086,5331173000.0,0.0,0.0
25%,19982.574219,20355.200195,19612.807129,19986.950195,20617940000.0,0.0,0.0
50%,33120.023438,34329.384766,31758.96582,33310.972656,30120050000.0,0.0,0.0
75%,51732.856445,52534.219727,50526.316406,51755.625977,40435030000.0,0.0,0.0
max,106147.296875,108268.445312,105291.734375,106140.601562,350967900000.0,0.0,0.0


In [18]:
# Check for any null values
print("Null Values =", BTC_Data.isnull().values.any())

Null Values = False


**Data Preparation**

- We create the target variable, a buy or sell signal, which constitutes our trading strategy. When the short-term moving average rises above the long-term moving average, it is an indicator to buy; when it falls below, it is an indicator to sell.

In [19]:
# Create short simple moving average over the short window (10 daily closes)
BTC_Data["short_moving_avg"] = (
    BTC_Data["Close"].rolling(window=10, min_periods=1, center=False).mean()
)

# Create long simple moving average over the long window
BTC_Data["long_moving_avg"] = (
    BTC_Data["Close"].rolling(window=60, min_periods=1, center=False).mean()
)

# Create signals: 1.0 (buy) when the short average is above the long average, else 0.0 (sell)
BTC_Data["signal"] = np.where(
    BTC_Data["short_moving_avg"] > BTC_Data["long_moving_avg"], 1.0, 0.0
)

In [20]:
BTC_Data.head()

Date,Open,High,Low,Close,Volume,Dividends,Stock Splits,short_moving_avg,long_moving_avg,signal
2020-01-14 00:00:00+00:00,8140.933105,8879.511719,8140.933105,8827.764648,44841784107,0.0,0.0,8827.764648,8827.764648,0.0
2020-01-15 00:00:00+00:00,8825.34375,8890.117188,8657.1875,8807.010742,40102834650,0.0,0.0,8817.387695,8817.387695,0.0
2020-01-16 00:00:00+00:00,8812.481445,8846.460938,8612.095703,8723.786133,31313981931,0.0,0.0,8786.187174,8786.187174,0.0
2020-01-17 00:00:00+00:00,8725.209961,8958.12207,8677.316406,8929.038086,36372139320,0.0,0.0,8821.899902,8821.899902,0.0
2020-01-18 00:00:00+00:00,8927.211914,9012.198242,8827.332031,8942.808594,32337772627,0.0,0.0,8846.081641,8846.081641,0.0
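
In the output above, the short and long averages are identical and the signal is 0.0 on every row. This follows from `min_periods=1`: until more than 10 observations exist, both rolling windows average exactly the same prices, so the short average can never exceed the long one. The sketch below reproduces the effect on a synthetic, steadily rising series; the `prices` values are made up for illustration and are not BTC data.

```python
import numpy as np
import pandas as pd

# Synthetic, steadily rising prices (illustrative only, not real BTC data)
prices = pd.Series(np.arange(100.0, 180.0))  # 80 observations

# Same window sizes and min_periods as the notebook cell above
short_ma = prices.rolling(window=10, min_periods=1).mean()
long_ma = prices.rolling(window=60, min_periods=1).mean()
signal = np.where(short_ma > long_ma, 1.0, 0.0)

# During the first 10 rows both windows cover the same prices
warmup_identical = bool((short_ma.iloc[:10] == long_ma.iloc[:10]).all())
print("Averages identical during warm-up:", warmup_identical)
print("Signal, rows 0-9:  ", signal[:10])
print("Signal, rows 10-14:", signal[10:15])
```

On a rising series the signal switches to 1.0 as soon as the short window starts dropping old prices (row 10 onward). One option, not used in this notebook, is to leave `min_periods` at its default so the warm-up rows become `NaN` and can be dropped before training.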


**Feature Engineering**

We create additional features to help improve model performance:

- Exponential Moving Average (EMA): Captures the price trend while weighting recent prices more heavily.
- Relative Strength Index (RSI): Measures the magnitude of recent price gains relative to recent losses on a scale from 0 to 100.
- Rate of Change (ROC): Measures the percentage change between the current price and the price a given number of periods earlier.
- Stochastic Oscillator: Compares the current closing price with the high-low price range over a recent period.
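
The four indicators listed above can be sketched with plain pandas operations. This is a minimal illustration under stated assumptions, not necessarily the implementation this notebook uses: the helper name `add_indicators`, the window `n=10`, the Wilder-style smoothing in the RSI, the 3-period %D average, and the synthetic price data are all assumptions made for the example.

```python
import numpy as np
import pandas as pd


def add_indicators(df: pd.DataFrame, n: int = 10) -> pd.DataFrame:
    """Hypothetical helper: append EMA, RSI, ROC and stochastic %K/%D columns."""
    out = df.copy()
    close, high, low = out["Close"], out["High"], out["Low"]

    # Exponential moving average of the closing price
    out[f"EMA{n}"] = close.ewm(span=n, adjust=False).mean()

    # RSI: smoothed average gain relative to smoothed average loss
    # (Wilder-style smoothing via alpha = 1/n is one common variant)
    delta = close.diff()
    avg_gain = delta.clip(lower=0).ewm(alpha=1 / n, adjust=False, min_periods=n).mean()
    avg_loss = (-delta.clip(upper=0)).ewm(alpha=1 / n, adjust=False, min_periods=n).mean()
    out[f"RSI{n}"] = 100 - 100 / (1 + avg_gain / avg_loss)

    # Rate of change: percentage change versus the close n periods earlier
    prev_close = close.shift(n)
    out[f"ROC{n}"] = (close - prev_close) / prev_close * 100

    # Stochastic oscillator: position of the close within the n-period range
    lowest_low = low.rolling(n).min()
    highest_high = high.rolling(n).max()
    out[f"STOCH_K{n}"] = 100 * (close - lowest_low) / (highest_high - lowest_low)
    out[f"STOCH_D{n}"] = out[f"STOCH_K{n}"].rolling(3).mean()
    return out


# Synthetic OHLC-like data (illustrative only, not real BTC prices)
rng = np.random.default_rng(0)
demo_close = pd.Series(1000 + rng.normal(0, 1, 200).cumsum())
demo_spread = rng.uniform(0.1, 1.0, 200)
demo = pd.DataFrame(
    {"Close": demo_close, "High": demo_close + demo_spread, "Low": demo_close - demo_spread}
)

feats = add_indicators(demo, n=10)
print(feats.tail(3).round(2))
```

The same helper could be applied to `BTC_Data`, since it only relies on the `Close`, `High`, and `Low` columns. The first rows of each indicator are `NaN` until its window fills, so they would need to be dropped before model training.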