Rayyanafie/Bitcoin-Prediction-Model

🪙 Bitcoin Price Prediction — Hybrid HMM-LSTM Model

A hybrid forecasting model for next-day Bitcoin closing price prediction, combining Hidden Markov Models (HMM) for market regime detection with LSTM networks for sequential learning. Benchmarked against a standalone LSTM baseline.


📌 Overview

Bitcoin's price is highly non-linear and behaves differently across market conditions: bull runs, bear markets, and sideways consolidation each follow distinct patterns. A standard LSTM has no explicit notion of market regime, so it can miss these structural shifts.

This project tackles that with a two-stage hybrid approach:

  1. HMM detects hidden market regimes from historical price features and assigns a latent state label to each time step.
  2. HMM-LSTM feeds those regime labels alongside raw price data into an LSTM, giving the model structural market context when learning price sequences.

The hybrid model is compared against a standalone LSTM baseline to quantify the benefit of regime-aware inputs.


📊 Results

Model       MAE        RMSE       MAPE
LSTM        1471.92    1830.40    3.04%
HMM-LSTM    1074.29    1585.33    2.03%

The hybrid HMM-LSTM model outperforms the standalone LSTM on all three metrics: MAE falls by about 27% (1471.92 to 1074.29), RMSE by about 13% (1830.40 to 1585.33), and MAPE by about one percentage point (3.04% to 2.03%). This indicates that incorporating hidden market regime information meaningfully improves prediction accuracy.
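For readers who want to check the comparison, the sketch below shows the usual definitions of the three metrics and recomputes the relative improvements from the reported figures. The function names are illustrative and are not taken from the notebooks, which may compute the metrics differently (for example with scikit-learn).

```python
import math

def mae(y_true, y_pred):
    """Mean absolute error."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

def rmse(y_true, y_pred):
    """Root mean squared error."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))

def mape(y_true, y_pred):
    """Mean absolute percentage error, in percent (y_true must be non-zero)."""
    return 100.0 * sum(abs((t - p) / t) for t, p in zip(y_true, y_pred)) / len(y_true)

def relative_reduction(baseline, hybrid):
    """Percentage by which the hybrid score is lower than the baseline score."""
    return 100.0 * (baseline - hybrid) / baseline

# Figures copied from the results table above.
mae_drop = relative_reduction(1471.92, 1074.29)    # relative MAE reduction, %
rmse_drop = relative_reduction(1830.40, 1585.33)   # relative RMSE reduction, %
mape_point_drop = 3.04 - 2.03                      # percentage points
mape_relative_drop = relative_reduction(3.04, 2.03)

print(f"MAE  reduction: {mae_drop:.1f}%")
print(f"RMSE reduction: {rmse_drop:.1f}%")
print(f"MAPE reduction: {mape_point_drop:.2f} points ({mape_relative_drop:.1f}% relative)")
```

Note that the MAPE change can be read two ways: roughly one percentage point in absolute terms, or roughly a third in relative terms.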


🗂️ Project Structure

Bitcoin-Prediction-Model/
│
├── 1_DATA.ipynb              # Data collection, cleaning, and preprocessing
├── 2_HYPERTUNEHMM.ipynb      # HMM hyperparameter tuning (hidden states, covariance)
├── 3_HMMLSTM.ipynb           # Hybrid HMM + LSTM model training and evaluation
├── 4_LSTM.ipynb              # Standalone LSTM baseline for benchmarking
│
├── Dataset_Raw.csv           # Original raw BTC price data
├── Cleaned_Data.csv          # Cleaned and preprocessed data
├── Dataset_Ready.csv         # Final dataset with HMM states, ready for modelling
│
└── Image/                    # Output plots and visualisations

🧠 Methodology

1. Data (1_DATA.ipynb)

  • Source: Investing.com — daily BTC/USD historical data
  • Date range: 2019 – 2024
  • Input feature: Daily closing price
  • Preprocessing includes cleaning, normalisation, and sequence windowing
    (lookback window: 30 days)
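
The normalisation and windowing step can be sketched as follows. This is a minimal illustration, not the notebook's code: it assumes min-max scaling (the README does not say which normalisation is used) and uses a synthetic price list in place of the real dataset. The helper names `min_max_scale` and `make_windows` are hypothetical.

```python
def min_max_scale(values):
    """Scale a list of numbers to the range [0, 1] (assumed normalisation)."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]

def make_windows(series, lookback=30):
    """Turn a series into (window, next value) pairs.

    Each input window holds `lookback` consecutive values and the target
    is the value that immediately follows the window.
    """
    X, y = [], []
    for end in range(lookback, len(series)):
        X.append(series[end - lookback:end])
        y.append(series[end])
    return X, y

# Synthetic stand-in for 40 daily closing prices.
prices = [float(100 + i) for i in range(40)]
scaled = min_max_scale(prices)
X, y = make_windows(scaled, lookback=30)

print(len(X), len(X[0]))  # number of windows, window length
```

In practice the scaling parameters would normally be fitted on the training portion only, so that information from the validation period does not leak into training.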

2. HMM — Market Regime Detection (2_HYPERTUNEHMM.ipynb)

A Gaussian HMM is trained on the BTC price series to identify distinct latent market regimes (e.g. bullish, bearish, ranging, volatile).

Hyperparameter tuning was performed over the number of hidden states (2–4), with model selection based on the Bayesian Information Criterion (BIC):

Hidden States    Covariance    Iterations    BIC
2                tied          1000          39208.95
3                tied          1000          37963.15
4                tied          1000          37149.49

The 4-state model yielded the lowest BIC and was selected as the final configuration.
Each time step in the dataset is labelled with its corresponding HMM state (0–3) and passed as an additional input feature to the LSTM.
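
As a rough illustration of the selection step, the sketch below defines BIC as k·ln(n) − 2·ln(L), counts the free parameters of a Gaussian HMM with a tied covariance matrix, and picks the state count with the lowest reported BIC. The parameter count is the standard textbook one and is an assumption about how the notebook computes it. The log-likelihoods and sample size behind the reported values are not given in this README, so they are not reproduced here.

```python
import math

def n_free_params(n_states, n_features):
    """Free parameters of a Gaussian HMM with one tied covariance matrix."""
    start = n_states - 1                            # initial state probabilities
    trans = n_states * (n_states - 1)               # transition matrix rows
    means = n_states * n_features                   # one mean vector per state
    tied_cov = n_features * (n_features + 1) // 2   # one shared symmetric matrix
    return start + trans + means + tied_cov

def bic(log_likelihood, n_params, n_samples):
    """Bayesian Information Criterion; lower is better."""
    return n_params * math.log(n_samples) - 2.0 * log_likelihood

# BIC values copied from the tuning table above.
reported_bic = {2: 39208.95, 3: 37963.15, 4: 37149.49}
best_n_states = min(reported_bic, key=reported_bic.get)

print(best_n_states)         # state count with the lowest reported BIC
print(n_free_params(4, 1))   # parameters for 4 states and a single feature
```

BIC penalises each extra state through the k·ln(n) term, so a lower value at 4 states means the likelihood gain outweighed the added parameters within the range that was searched.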


3. HMM-LSTM — Hybrid Model (3_HMMLSTM.ipynb)

The LSTM receives a sequence of 30 time steps, where each step includes the closing price and its HMM regime label. This enriched input allows the model to distinguish between structurally different market phases.

Architecture:

Input(shape=(30, features))
→ LSTM(200 units, return_sequences=True)
→ LSTM(200 units, return_sequences=False)
→ Dense(50, activation='relu')
→ Dense(1, activation='linear')
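
To give a sense of the model's size, the sketch below counts trainable parameters for the layer stack above. It assumes the standard LSTM formulation with four gates and a bias per gate, and assumes `features` is 2 for the hybrid model (closing price plus a single integer regime label) and 1 for the baseline. If the regime label were one-hot encoded instead, the first layer would be larger.

```python
def lstm_params(input_dim, units):
    """Four gates, each with input weights, recurrent weights and a bias."""
    return 4 * (units * (input_dim + units) + units)

def dense_params(input_dim, units):
    """Weight matrix plus one bias per output unit."""
    return input_dim * units + units

def total_params(n_features):
    """Parameter count for LSTM(200) -> LSTM(200) -> Dense(50) -> Dense(1)."""
    return (
        lstm_params(n_features, 200)   # first LSTM sees the raw features
        + lstm_params(200, 200)        # second LSTM sees 200 hidden units
        + dense_params(200, 50)
        + dense_params(50, 1)
    )

hybrid_total = total_params(2)     # price + regime label (assumed encoding)
baseline_total = total_params(1)   # price only

print(hybrid_total, baseline_total, hybrid_total - baseline_total)
```

Under these assumptions the regime input adds only 800 weights to a network of roughly half a million parameters, so the two models have almost the same capacity.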

Training configuration:

Parameter          Value
Epochs             50
Batch size         32
Train/Val split    70% / 30%
Optimiser          Adam
Loss               Mean Squared Error
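
A sketch of how the split and batch size translate into training steps is shown below. It assumes the 70/30 split is chronological (earliest windows for training, latest for validation), which is common for time series but is not stated in this README. The figure of 1000 windows is a placeholder, not the real dataset size.

```python
import math

def chronological_split(samples, train_percent=70):
    """Split without shuffling: the earliest samples go to training."""
    cut = len(samples) * train_percent // 100
    return samples[:cut], samples[cut:]

def steps_per_epoch(n_train, batch_size=32):
    """Number of gradient updates in one pass over the training set."""
    return math.ceil(n_train / batch_size)

# Placeholder: 1000 windows identified by their position in time.
windows = list(range(1000))
train, val = chronological_split(windows)

print(len(train), len(val))          # sizes of the two partitions
print(steps_per_epoch(len(train)))   # updates per epoch at batch size 32
```

Keeping the validation windows strictly later than the training windows avoids evaluating the model on dates it has effectively already seen.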

LSTM hyperparameters (layers, units, batch size, train split, dropout) were tuned experimentally. The configuration above represents the best-performing setup and is the version uploaded to this repository.


4. Baseline LSTM (4_LSTM.ipynb)

The baseline uses an identical LSTM architecture trained on closing price only, without HMM state inputs. It serves as a direct benchmark to isolate the contribution of regime detection.


🛠️ Tech Stack

  • Python — core language
  • Jupyter Notebook — development environment
  • hmmlearn — Hidden Markov Model
  • TensorFlow / Keras — LSTM network
  • Pandas / NumPy — data manipulation
  • Scikit-learn — preprocessing and evaluation
  • Matplotlib / Seaborn — visualisation

🚀 Getting Started

Install dependencies

pip install numpy pandas matplotlib seaborn scikit-learn hmmlearn tensorflow

Run notebooks in order

1_DATA.ipynb           → Prepare and clean the dataset
2_HYPERTUNEHMM.ipynb   → Tune and fit the HMM, generate regime labels
3_HMMLSTM.ipynb        → Train and evaluate the hybrid HMM-LSTM model
4_LSTM.ipynb           → Train and evaluate the standalone LSTM baseline

⚠️ Disclaimer

This project was developed as a final thesis/academic project and is intended purely for research and educational purposes.

This is NOT financial advice (NFA). Nothing in this repository should be interpreted as a recommendation to buy, sell, or trade Bitcoin or any other asset. Cryptocurrency markets are highly volatile and unpredictable.

Always Do Your Own Research (DYOR) before making any financial decisions. The authors take no responsibility for any financial losses incurred from the use of this model.

📄 License

This project is open-source and available under the MIT License.
