This project is an advanced financial forecasting system that uses Ensemble Deep Learning to predict price movements of high-volatility assets (Bitcoin) and market indices (S&P 500).
- Model Type: "Committee of 5" Ensemble.
- Algorithm: Long Short-Term Memory (LSTM) Recurrent Neural Networks.
- Strategy: Five independent neural networks with different random initializations vote on the final price. This "Wisdom of the Crowd" approach reduces variance and overfitting compared to single-model systems.
- Backend: Python, TensorFlow 2.x (Keras), FastAPI (Async Server).
- Frontend: Vanilla JavaScript (ES6+), Lightweight Charts (Canvas Rendering).
- Data Pipeline: Real-time fetching via yfinance, normalized using MinMaxScaler, and processed into 30-day sliding-window sequences.
- Multi-Signal Engineering: The model does not just look at price. It devours 9 distinct market signals, including:
  - RSI (Relative Strength Index) - identifies overbought/oversold conditions.
  - MACD (Moving Average Convergence Divergence) - trend momentum.
  - Bollinger Bands (Upper/Lower) - volatility breakouts.
  - ATR (Average True Range) - market noise measurement.
- Live Dashboard: A glassmorphism-styled web interface providing real-time "BUY/SELL/HOLD" signals based on the ensemble's consensus.
- Recursive Forecasting: Capability to project prices up to 365 days into the future (long-term horizon).
- Accuracy (MAPE): ~2.05% on S&P 500, ~2.75% on Bitcoin.
- Inference Latency: <50ms (Pre-loaded models).
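The preprocessing step above can be sketched roughly as follows (the function and variable names are illustrative, not the project's actual code):

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler

WINDOW = 30  # 30-day sliding window, as described above

def make_sequences(prices: np.ndarray, window: int = WINDOW):
    """Scale prices to [0, 1] and cut them into sliding windows,
    each labelled with the next day's (scaled) price."""
    scaler = MinMaxScaler(feature_range=(0, 1))
    scaled = scaler.fit_transform(prices.reshape(-1, 1))
    X, y = [], []
    for i in range(window, len(scaled)):
        X.append(scaled[i - window:i])  # 30 days of history
        y.append(scaled[i, 0])          # next-day target
    return np.array(X), np.array(y), scaler

prices = np.linspace(100.0, 200.0, 200)  # stand-in for yfinance data
X, y, scaler = make_sequences(prices)
# X has shape (samples, 30, 1) -- ready to feed an LSTM layer.
```

The scaler is returned so predictions can later be inverse-transformed back into real price units.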
We rely on Recurrent Neural Networks (RNNs), specifically the LSTM variant.
- Why not a standard network? Feedforward neural networks (like those used for images) assume inputs are independent. Stock prices are sequential: today's price depends on yesterday's.
- The "Memory" Mechanism: LSTMs have a unique internal structure called the Cell State, which runs alongside the main data flow. They use three "Gates" to regulate this memory:
- Forget Gate: Decides what information (e.g., an old trend) to throw away.
- Input Gate: Decides what new information (e.g., a sudden crash) to store.
- Output Gate: Decides what to output based on the current cell state.
- Benefit: This allows the model to "remember" a trend from 30 days ago while ignoring random noise from yesterday.
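As a toy illustration of the gate mechanics described above (the weights are random stand-ins, not trained values):

```python
import numpy as np

def sigmoid(x):
    return 1 / (1 + np.exp(-x))

rng = np.random.default_rng(0)
n_hidden, n_input = 4, 9  # 9 market signals per time step
W = {g: rng.standard_normal((n_hidden, n_hidden + n_input)) * 0.1
     for g in ("f", "i", "c", "o")}

def lstm_step(x, h_prev, c_prev):
    """One LSTM time step (biases omitted for brevity)."""
    z = np.concatenate([h_prev, x])
    f = sigmoid(W["f"] @ z)        # Forget gate: what to discard
    i = sigmoid(W["i"] @ z)        # Input gate: what new info to store
    c_tilde = np.tanh(W["c"] @ z)  # Candidate memory
    c = f * c_prev + i * c_tilde   # Updated cell state
    o = sigmoid(W["o"] @ z)        # Output gate
    h = o * np.tanh(c)             # New hidden state
    return h, c

h = c = np.zeros(n_hidden)
for _ in range(30):                # walk a 30-day window
    h, c = lstm_step(rng.standard_normal(n_input), h, c)
```

Because the cell state `c` is updated additively, information can survive many steps; the forget gate is what eventually erases a stale trend.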
Instead of training one "Master Model", we train 5 architecturally identical models with different random starting weights (seeds).
- The Problem: Deep Learning models are stochastic. One model might randomly learn to over-weight "Fridays" due to a statistical fluke.
- The Solution: By averaging the predictions of 5 independent models, the individual errors/biases cancel each other out.
- The Math:
  $P_{final} = \frac{1}{N} \sum_{i=1}^{N} P_{model_i}$
- Result: This technique (ensemble averaging, closely related to Bagging) significantly reduces variance and prevents the system from "hallucinating" patterns that don't exist.
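A minimal sketch of the consensus vote, with stand-in numbers in place of the five model outputs:

```python
import numpy as np

def ensemble_predict(predictions: list[float]) -> float:
    """P_final = (1/N) * sum(P_model_i): average the committee's votes."""
    return float(np.mean(predictions))

votes = [101.2, 99.8, 100.5, 100.1, 100.9]  # one price per model
consensus = ensemble_predict(votes)
```

Each model's idiosyncratic error pushes its vote in a different direction, so the mean sits closer to the true value than any single member on average.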
We don't just feed the "Price". We feed a vector of 9 dimensions for every time step:
- RSI (Relative Strength Index): $\text{RSI} = 100 - \frac{100}{1 + RS}$. Measures the speed and change of price movements (momentum).
- MACD (Moving Average Convergence Divergence): Tracks the relationship between two moving averages (12-day and 26-day) to spot trend reversals.
- Bollinger Bands: Measures volatility. If price touches the upper band, it's statistically "expensive".
- ATR (Average True Range): Measures market energy/volatility, regardless of direction.
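As one concrete example of these signals, RSI can be computed roughly like this (a simple-moving-average variant over a 14-day lookback; the project may use Wilder smoothing instead):

```python
import pandas as pd

def rsi(close: pd.Series, period: int = 14) -> pd.Series:
    """RSI = 100 - 100 / (1 + RS), where RS = avg gain / avg loss."""
    delta = close.diff()
    gains = delta.clip(lower=0).rolling(period).mean()
    losses = (-delta.clip(upper=0)).rolling(period).mean()
    rs = gains / losses
    return 100 - 100 / (1 + rs)
```

Values above ~70 are conventionally read as overbought, below ~30 as oversold; a series that only rises pins RSI at 100.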
We built a custom Sequential API model using TensorFlow/Keras.
```python
# The actual architecture used in the code
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Input, Bidirectional, LSTM, Dropout, Dense

model = Sequential()
model.add(Input(shape=(30, 9)))  # 30-day window, 9 signals per day

# Layer 1: Bidirectional LSTM (100 Units)
# "Bidirectional" means it reads the price history forwards AND backwards
# to capture context from both directions.
model.add(Bidirectional(LSTM(100, return_sequences=True)))
model.add(Dropout(0.2))  # Prevents overfitting (randomly switches off neurons)

# Layer 2: Standard LSTM (100 Units)
# Deepens the understanding of patterns found by Layer 1.
model.add(LSTM(100, return_sequences=False))
model.add(Dropout(0.2))

# Layer 3: Dense (25 Units) -> Layer 4: Output (1 Unit)
# Condenses the logic into a single final price number.
model.add(Dense(25))
model.add(Dense(1))
```

- Optimizer: Adam (Adaptive Moment Estimation) with a learning rate of 0.001.
- Loss Function: MSE (Mean Squared Error). The model is punished for being far off the mark.
- Callbacks:
  - EarlyStopping: Stops training when the model stops improving (saves time).
  - ReduceLROnPlateau: Lowers the learning rate if the model gets stuck, allowing it to find a deeper minimum.
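The compile-and-callbacks setup described above might look roughly like this (the tiny stand-in model and the patience values are illustrative assumptions; only the optimizer, learning rate, and loss are stated in the text):

```python
import tensorflow as tf

# Small stand-in model so the training setup below is runnable on its own.
model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(30, 9)),  # 30-day window, 9 signals
    tf.keras.layers.LSTM(8),
    tf.keras.layers.Dense(1),
])

# Adam with the stated learning rate, MSE loss.
model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=0.001),
              loss="mse")

callbacks = [
    # Stop when validation loss plateaus; keep the best weights seen.
    tf.keras.callbacks.EarlyStopping(monitor="val_loss", patience=10,
                                     restore_best_weights=True),
    # Halve the learning rate when stuck, to settle into a deeper minimum.
    tf.keras.callbacks.ReduceLROnPlateau(monitor="val_loss", factor=0.5,
                                         patience=5),
]
# model.fit(X_train, y_train, validation_split=0.1,
#           epochs=100, callbacks=callbacks)
```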
- Framework: FastAPI (Asynchronous).
- Global Loading: We load all 5 models into RAM once when the server starts. This is why the dashboard feels instant.
- Endpoint: /api/forecast triggers a real-time inference run across all 5 models and returns the consensus.