# PANCAKE PREDICTOR: BNB 5-MINUTE MODEL

**A Comprehensive Notebook — Two Models, Comparison, Ensemble & Knowledge Distillation**

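This notebook installs its own Python package dependencies. The data-loading cell in section 2 also imports `fetch_live_market_data` from the project's `src.pancake_predictor` module, so that module must be importable for the notebook to run end to end.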

---

## CRISP-DM Workflow

### 1. Business Understanding
- **The Game**: Bet Bull (Up) or Bear (Down) for a 5-minute timeframe on PancakeSwap.
- **The Constraint**: Must place the bet before the round locks.
- **The Edge**: The working hypothesis is that BiLSTMs can pick up momentum shifts that are hard to spot by eye.
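- **The House Edge**: PancakeSwap takes ~3%. If the fee comes out of the combined pool, a balanced round pays about 1.94x and breaks even near 51.5% accuracy. The notebook uses 53% as a more conservative bar, or plays Positive Expected Value.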
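
As a rough check on the break-even figure, the sketch below derives payout multipliers from pool sizes and converts a payout into the win rate needed to break even. It assumes the fee is taken from the combined pool before winners are paid and that a payout multiplier includes the returned stake. The function names are illustrative and are not part of the notebook's module.

```python
def payout_multipliers(bull_amount, bear_amount, fee=0.03):
    """Gross payout per unit staked on each side, after the fee (assumed model)."""
    total = bull_amount + bear_amount
    reward_pool = total * (1.0 - fee)
    return reward_pool / bull_amount, reward_pool / bear_amount


def break_even_accuracy(payout):
    """Win rate p at which p * (payout - 1) equals (1 - p) * 1."""
    return 1.0 / payout


# Balanced pool: both sides pay about 1.94x, so break-even is about 51.5%.
bull_x, bear_x = payout_multipliers(100.0, 100.0)
print(f'balanced: {bull_x:.2f}x, break-even {break_even_accuracy(bull_x):.1%}')

# Crowded Bull side: Bull pays less, so a Bull bet needs a higher win rate.
bull_x, bear_x = payout_multipliers(150.0, 50.0)
print(f'crowded bull: {bull_x:.2f}x, break-even {break_even_accuracy(bull_x):.1%}')
```

Under these assumptions a 53% target leaves some cushion over the balanced-pool figure, while bets on the crowded side need a noticeably higher hit rate.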

### 2. Data Understanding & Preparation
- **Market Data**: 1-minute OHLCV candles for BNB/USDT (simulated).
- **Contract Data**: Crowd sentiment (Bull Payout vs. Bear Payout).
- **Feature Engineering**: RSI, volume spikes, sentiment ratios.
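
The three engineered features can be sketched in plain Python as follows. This is a minimal illustration, not the notebook's actual feature pipeline: the RSI here uses simple averages over the window (not Wilder's smoothing), and the function names, the 14-period default, and the 20-candle volume lookback are assumptions chosen for the example.

```python
def simple_rsi(closes, period=14):
    """RSI over the last `period` price changes, using plain averages."""
    changes = [b - a for a, b in zip(closes[:-1], closes[1:])][-period:]
    avg_gain = sum(c for c in changes if c > 0) / period
    avg_loss = sum(-c for c in changes if c < 0) / period
    if avg_gain == 0 and avg_loss == 0:
        return 50.0   # flat prices: treat as neutral
    if avg_loss == 0:
        return 100.0  # no losing candles in the window
    rs = avg_gain / avg_loss
    return 100.0 - 100.0 / (1.0 + rs)


def volume_spike(volumes, lookback=20):
    """Latest volume divided by the mean of the preceding `lookback` volumes."""
    history = volumes[-(lookback + 1):-1]  # needs at least two volumes
    return volumes[-1] / (sum(history) / len(history))


def sentiment_ratios(bull_amount, bear_amount):
    """Share of the pool on each side (Bull_Ratio, Bear_Ratio)."""
    total = bull_amount + bear_amount
    return bull_amount / total, bear_amount / total


print(simple_rsi([100, 101, 103, 102, 104, 107, 106, 108]))
print(volume_spike([10, 10, 10, 30], lookback=3))
print(sentiment_ratios(30.0, 10.0))
```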

### 3. The Architectures (DLA Based)

| Feature | Base Model | Robust Model |
|---|---|---|
| Noise Injection | ✗ | ✓ GaussianNoise(0.05) |
| Feature Extraction | — | Conv1D(32, kernel=3) |
| BiLSTM Layers | 1 | 2 (Stacked) |
| Attention | MultiHeadAttention + Residual | MultiHeadAttention + Residual |
| Dropout | 0.2 | 0.3 |
| Learning Rate | Adam default | Adam 0.0005 |

### 4. Evaluation
- Side-by-side training comparison
- Ensemble prediction (weighted average)
- Knowledge Distillation (teacher → student)
- Expected Value (EV) based trading logic

## 0. Install Dependencies

Run this cell to install all required packages. This ensures the notebook runs on its own.

In [None]:
# Install all required dependencies
import subprocess
import sys

def install(package):
    subprocess.check_call([sys.executable, '-m', 'pip', 'install', '-q', package])

install('tensorflow>=2.10.0')
install('numpy>=1.21.0')
install('pandas>=1.3.0')
install('scikit-learn>=1.0.1')
install('matplotlib>=3.4.0')

print('All dependencies installed successfully!')

## 1. Setup & Configuration

In [None]:
# ==========================================
# PANCAKE PREDICTOR: BNB 5-MINUTE MODEL
# ==========================================

import numpy as np
import pandas as pd
import tensorflow as tf
from tensorflow.keras import layers, models, Input
from tensorflow.keras.layers import Conv1D, GaussianNoise, Add, LayerNormalization, GlobalAveragePooling1D
from sklearn.preprocessing import MinMaxScaler
import matplotlib.pyplot as plt

# Set seeds for reproducibility
np.random.seed(42)
tf.random.set_seed(42)

# --- CONFIGURATION ---
SEQ_LENGTH = 30       # Look back at last 30 minutes
PREDICT_AHEAD = 5     # Predict 5 minutes into the future
FEATURES = 5          # [Close, Volume, RSI, Bull_Ratio, Bear_Ratio]

print(f'TensorFlow version: {tf.__version__}')
print(f'NumPy version: {np.__version__}')
print(f'Pandas version: {pd.__version__}')
print(f'Configuration: SEQ_LENGTH={SEQ_LENGTH}, PREDICT_AHEAD={PREDICT_AHEAD}, FEATURES={FEATURES}')

## 2. Data Generator (Simulating BNB Price & Contract Data)

We simulate 1-minute OHLCV candles with PancakeSwap pool sentiment. In production, the notebook uses real-time PancakeSwap DEX data via Web3.py.

In [None]:
# Import real-time data fetching function from the module
from src.pancake_predictor import fetch_live_market_data

# Fetch real-time data from PancakeSwap DEX (no Binance, no simulation)
print('Fetching real-time data from PancakeSwap DEX...')
market_data, contract_info = fetch_live_market_data(
    timeframe='1m',
    limit=500,  # Fetch 500 minutes of data
    use_contract=True  # Include prediction contract data
)

print(f'Data fetched: {len(market_data)} candles')
print(f'Columns: {list(market_data.columns)}')
market_data.head()

## 3. Preprocessing (Sequence Creation)

We create sliding windows of 30 minutes to predict if price goes UP in the next 5 minutes.
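
To make the window and label alignment concrete, here is a toy version of the same loop on eight closing prices. The shortened `SEQ_LENGTH` and `PREDICT_AHEAD` values and the price series are made up for the example. Note that each window ends at row `i - 1`, while the label compares the close at `i + PREDICT_AHEAD` with the close at `i`.

```python
import numpy as np

SEQ_LENGTH, PREDICT_AHEAD = 3, 2   # toy values, not the notebook's 30 and 5
close = np.array([10., 11., 12., 11., 13., 14., 12., 15.])

X, y = [], []
for i in range(SEQ_LENGTH, len(close) - PREDICT_AHEAD):
    X.append(close[i - SEQ_LENGTH:i])                    # rows i-3 .. i-1
    y.append(int(close[i + PREDICT_AHEAD] > close[i]))   # is close[i+2] above close[i]?

X, y = np.array(X), np.array(y)
print(X.shape)      # one row per usable window
print(X[0], y[0])   # first window and its label
```

The number of samples is `len(close) - SEQ_LENGTH - PREDICT_AHEAD`, because the first windows lack history and the last rows lack a future price to label against.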

In [None]:
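def create_sequences(df, train_frac=0.8):
    X = []
    y = []

    # Normalize data. Fit the scaler on the earliest train_frac of rows only,
    # so min/max values from the test period do not leak into training.
    scaler = MinMaxScaler()
    scaler.fit(df.iloc[:int(train_frac * len(df))])
    scaled_data = scaler.transform(df)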

    # Create Windows
    data_val = df.values

    for i in range(SEQ_LENGTH, len(df) - PREDICT_AHEAD):
        # Input: Past 30 mins
        X.append(scaled_data[i - SEQ_LENGTH:i])

        # Target: Did price go UP in the next 5 mins?
        current_price = data_val[i][0]  # Close is index 0
        future_price = data_val[i + PREDICT_AHEAD][0]

        label = 1 if future_price > current_price else 0
        y.append(label)

    return np.array(X), np.array(y), scaler

print('>>> PREPARING SEQUENCES...')
X, y, scaler = create_sequences(market_data)
split = int(0.8 * len(X))
X_train, X_test = X[:split], X[split:]
y_train, y_test = y[:split], y[split:]

print(f'Training: X={X_train.shape}, y={y_train.shape}')
print(f'Testing:  X={X_test.shape}, y={y_test.shape}')
print(f'Label balance: {y_train.mean():.2%} Bull / {1 - y_train.mean():.2%} Bear')

## 4. Model 1: Base Model (BiLSTM + Attention)

The base architecture uses:
- **BiLSTM (Momentum Detector)**: Reads the last 30 minutes bidirectionally. Strong "Green Candle" patterns with increasing volume bias towards Bull.
- **MultiHeadAttention (Whale Detector)**: If a massive sell-off occurred 15 minutes ago, the Attention layer highlights that event.
- **GlobalAveragePooling + Dense**: Condenses the sequence into a single Bull probability.
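
The attention and pooling steps can be illustrated without Keras. The sketch below is a stripped-down, single-head self-attention in NumPy with no learned weights. Keras's `MultiHeadAttention` additionally learns query, key, and value projections and runs several heads in parallel, so treat this only as an illustration of the underlying mechanism. The sequence values and the 8-feature width are made up for the example.

```python
import numpy as np

def self_attention(x):
    """Scaled dot-product self-attention with no learned weights (single head)."""
    d = x.shape[-1]
    scores = x @ x.T / np.sqrt(d)                    # (steps, steps) similarity
    scores -= scores.max(axis=-1, keepdims=True)     # numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)   # softmax over time steps
    return weights @ x, weights

rng = np.random.default_rng(0)
seq = rng.normal(size=(30, 8))     # 30 time steps, 8 hypothetical features
seq[14] *= 5.0                     # exaggerate one minute, e.g. a volume shock
context, weights = self_attention(seq)
pooled = context.mean(axis=0)      # same idea as GlobalAveragePooling1D
print(context.shape, weights.shape, pooled.shape)
```

Each row of `weights` says how much every other minute contributes to that minute's context vector, and the pooling step then averages the 30 context vectors into one summary vector for the dense layers.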

In [None]:
def build_pancake_model():
    """Base Model: BiLSTM + Attention"""
    inputs = Input(shape=(SEQ_LENGTH, FEATURES))

    # Layer 1: BiLSTM (Momentum Detector)
    # Reads minute-by-minute price action
    x = layers.Bidirectional(layers.LSTM(64, return_sequences=True))(inputs)
    x = layers.Dropout(0.2)(x)

    # Layer 2: Self-Attention (Whale Detector)
    # Focuses on specific minutes with abnormal volume
    attn = layers.MultiHeadAttention(num_heads=4, key_dim=32)(x, x)
    x = layers.Add()([x, attn])
    x = layers.LayerNormalization()(x)

    # Layer 3: Decision
    x = layers.GlobalAveragePooling1D()(x)
    x = layers.Dense(32, activation='relu')(x)
    output = layers.Dense(1, activation='sigmoid', name='Bull_Probability')(x)

    model = models.Model(inputs=inputs, outputs=output)
    model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])
    return model

print('>>> BUILDING BASE MODEL...')
base_model = build_pancake_model()
base_model.summary()

### Train Base Model

In [None]:
print('>>> TRAINING BASE MODEL...')
base_history = base_model.fit(
    X_train, y_train,
    epochs=10,
    batch_size=32,
    validation_data=(X_test, y_test),
    verbose=1
)

base_loss, base_acc = base_model.evaluate(X_test, y_test, verbose=0)
print(f'\nBase Model — Test Loss: {base_loss:.4f}, Test Accuracy: {base_acc:.4f}')

## 5. Model 2: Robust Model (Conv1D + Stacked BiLSTM + Attention + Residual)

The robust architecture adds several DLA layers on top of the base:

| Layer | Purpose |
|---|---|
| **GaussianNoise(0.05)** | Prevents overfitting — the model learns the "shape" through the "fog" |
| **Conv1D(32, kernel=3)** | Automatic candlestick pattern extraction (looks at 3 minutes at a time) |
| **Stacked BiLSTM ×2** | Layer 1 captures fast patterns, Layer 2 captures deep trends |
| **MultiHeadAttention + Residual** | Transformer block with ResNet-style skip connection |
| **Dense(64) + Dropout(0.3)** | Heavy regularization for noisy financial data |
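
A kernel of size 3 with `padding='same'` can be pictured as a three-minute window sliding across the series. The NumPy sketch below applies one hand-picked kernel to a single channel, with and without added Gaussian noise. A real `Conv1D` layer learns 32 such kernels across all input features, so the kernel and prices here are illustrative assumptions only.

```python
import numpy as np

def conv1d_same(x, kernel):
    """Slide `kernel` over `x` with zero padding so the output keeps len(x)."""
    k = len(kernel)
    padded = np.pad(x, k // 2)     # zero padding on both ends (odd kernel sizes)
    return np.array([padded[i:i + k] @ kernel for i in range(len(x))])

closes = np.array([0., 1., 2., 3., 4.])
slope_kernel = np.array([-1., 0., 1.])   # hand-picked; Conv1D learns its kernels
print(conv1d_same(closes, slope_kernel))

rng = np.random.default_rng(0)
noisy = closes + rng.normal(scale=0.05, size=closes.shape)  # like GaussianNoise(0.05)
print(conv1d_same(noisy, slope_kernel))
```

The interior outputs stay close to the same slope value with or without the noise, which is the intuition behind training on perturbed inputs. In Keras, `GaussianNoise` is only active during training and is skipped at inference time.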

In [None]:
def build_robust_pancake_model():
    """Robust Model: Conv1D + Stacked BiLSTM + Attention + Residual"""
    inputs = Input(shape=(SEQ_LENGTH, FEATURES))

    # --- LAYER 1: ROBUSTNESS (Gaussian Noise) ---
    # We inject random noise (stddev=0.05) to the input data.
    # This prevents the model from memorizing exact prices (Overfitting).
    # It learns to see the "Shape" through the "Fog".
    x = GaussianNoise(0.05)(inputs)

    # --- LAYER 2: FEATURE EXTRACTION (Conv1D) ---
    # Filters=32, Kernel=3 means "Look at 3 minutes at a time".
    # This automatically learns candlestick patterns (e.g., Engulfing candles).
    x = Conv1D(filters=32, kernel_size=3, padding='same', activation='relu')(x)
    x = LayerNormalization()(x)  # Keeps values stable

    # --- LAYER 3: TEMPORAL MEMORY (Stacked BiLSTMs) ---
    # We stack two LSTMs.
    # LSTM 1: Fast patterns (return_sequences=True keeps the timeline)
    x = layers.Bidirectional(layers.LSTM(64, return_sequences=True))(x)
    x = layers.Dropout(0.3)(x)

    # LSTM 2: Deep patterns (The "Trend")
    # We save this output 'lstm_out' for the Residual connection later
    lstm_out = layers.Bidirectional(layers.LSTM(64, return_sequences=True))(x)

    # --- LAYER 4: ATTENTION + RESIDUAL (The Transformer Block) ---
    # Multi-Head Attention looks for correlations across the 30-minute window
    attn_out = layers.MultiHeadAttention(num_heads=4, key_dim=32)(lstm_out, lstm_out)

    # RESIDUAL CONNECTION (Add & Norm)
    # We add the LSTM memory (lstm_out) to the Attention insight (attn_out).
    # This is the "Safety Net" that creates robust Deep Learning models (like ResNet).
    x = Add()([lstm_out, attn_out])
    x = LayerNormalization()(x)

    # --- LAYER 5: DECISION ---
    x = GlobalAveragePooling1D()(x)  # Summarize the whole sequence

    # Dense layers to reason about the features
    x = layers.Dense(64, activation='relu')(x)
    x = layers.Dropout(0.3)(x)  # Heavy dropout for financial data

    output = layers.Dense(1, activation='sigmoid', name='Bull_Probability')(x)

    model = models.Model(inputs=inputs, outputs=output)

    # Use a lower learning rate than the Adam default for more stable training
    opt = tf.keras.optimizers.Adam(learning_rate=0.0005)

    model.compile(optimizer=opt, loss='binary_crossentropy', metrics=['accuracy'])
    return model

print('>>> BUILDING ROBUST MODEL...')
robust_model = build_robust_pancake_model()
robust_model.summary()

### Train Robust Model

In [None]:
print('>>> TRAINING ROBUST MODEL...')
robust_history = robust_model.fit(
    X_train, y_train,
    epochs=10,
    batch_size=32,
    validation_data=(X_test, y_test),
    verbose=1
)

robust_loss, robust_acc = robust_model.evaluate(X_test, y_test, verbose=0)
print(f'\nRobust Model — Test Loss: {robust_loss:.4f}, Test Accuracy: {robust_acc:.4f}')

## 6. Model Comparison

Compare training curves and final metrics between the two models.

In [None]:
fig, axes = plt.subplots(2, 2, figsize=(16, 10))

# --- Row 1: Accuracy ---
axes[0, 0].plot(base_history.history['accuracy'], label='Base Train')
axes[0, 0].plot(base_history.history['val_accuracy'], label='Base Val')
axes[0, 0].axhline(y=0.53, color='r', linestyle='--', label='Break-Even (53%)')
axes[0, 0].set_title('Base Model — Accuracy')
axes[0, 0].set_xlabel('Epoch')
axes[0, 0].set_ylabel('Accuracy')
axes[0, 0].legend()

axes[0, 1].plot(robust_history.history['accuracy'], label='Robust Train')
axes[0, 1].plot(robust_history.history['val_accuracy'], label='Robust Val')
axes[0, 1].axhline(y=0.53, color='r', linestyle='--', label='Break-Even (53%)')
axes[0, 1].set_title('Robust Model — Accuracy')
axes[0, 1].set_xlabel('Epoch')
axes[0, 1].set_ylabel('Accuracy')
axes[0, 1].legend()

# --- Row 2: Loss ---
axes[1, 0].plot(base_history.history['loss'], label='Base Train')
axes[1, 0].plot(base_history.history['val_loss'], label='Base Val')
axes[1, 0].set_title('Base Model — Loss')
axes[1, 0].set_xlabel('Epoch')
axes[1, 0].set_ylabel('Loss')
axes[1, 0].legend()

axes[1, 1].plot(robust_history.history['loss'], label='Robust Train')
axes[1, 1].plot(robust_history.history['val_loss'], label='Robust Val')
axes[1, 1].set_title('Robust Model — Loss')
axes[1, 1].set_xlabel('Epoch')
axes[1, 1].set_ylabel('Loss')
axes[1, 1].legend()

plt.suptitle('Model Comparison: Base vs Robust', fontsize=14, fontweight='bold')
plt.tight_layout()
plt.show()

# Print summary table
print('\n' + '=' * 60)
print(f'{"Metric":<25} {"Base Model":>15} {"Robust Model":>15}')
print('=' * 60)
print(f'{"Test Loss":<25} {base_loss:>15.4f} {robust_loss:>15.4f}')
print(f'{"Test Accuracy":<25} {base_acc:>15.4f} {robust_acc:>15.4f}')
print(f'{"Final Train Loss":<25} {base_history.history["loss"][-1]:>15.4f} {robust_history.history["loss"][-1]:>15.4f}')
print(f'{"Final Train Accuracy":<25} {base_history.history["accuracy"][-1]:>15.4f} {robust_history.history["accuracy"][-1]:>15.4f}')
print('=' * 60)

winner = 'Base Model' if base_acc > robust_acc else 'Robust Model'
print(f'\n>>> Best single model: {winner}')

## 7. Ensemble Prediction (Dual Model)

Combine predictions from both models to reduce variance and improve reliability.
The ensemble uses a weighted average of both model outputs.
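
The weighting can be tried out on plain numbers before involving the models. In this sketch the two probabilities are hypothetical model outputs, and `weighted_ensemble` is an illustrative helper that normalises the weights so they need not sum to 1. The decision thresholds mirror the 0.60 and 0.40 cut-offs used in the cell below.

```python
def weighted_ensemble(probs, weights):
    """Weighted average of Bull probabilities; weights are normalised to sum to 1."""
    total = sum(weights)
    return sum(p * w for p, w in zip(probs, weights)) / total


def decision(prob, upper=0.60, lower=0.40):
    """Bet only when the probability is clearly away from 0.5."""
    if prob > upper:
        return 'BET BULL'
    if prob < lower:
        return 'BET BEAR'
    return 'SKIP'


# Two hypothetical model outputs that disagree in strength.
base_prob, robust_prob = 0.66, 0.50
equal = weighted_ensemble([base_prob, robust_prob], [0.5, 0.5])
tilted = weighted_ensemble([base_prob, robust_prob], [3, 1])
print(f'equal weights:  {equal:.3f} -> {decision(equal)}')
print(f'3:1 for base:   {tilted:.3f} -> {decision(tilted)}')
```

With equal weights the unsure model pulls the average back into the skip zone. Tilting the weights toward the more confident model moves it across the threshold, which is why the weights deserve the same validation care as any other hyperparameter.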

In [None]:
def ensemble_predict(base_model, robust_model, sequence, base_weight=0.5, robust_weight=0.5):
    """
    Generate predictions from both models and combine them.
    """
    seq_reshaped = sequence.reshape(1, SEQ_LENGTH, FEATURES)

    base_prob = float(base_model.predict(seq_reshaped, verbose=0)[0][0])
    robust_prob = float(robust_model.predict(seq_reshaped, verbose=0)[0][0])
    ensemble_prob = (base_weight * base_prob) + (robust_weight * robust_prob)

    def _decision(prob):
        if prob > 0.60:
            return 'BET BULL'
        elif prob < 0.40:
            return 'BET BEAR'
        return 'SKIP'

    return {
        'base_prob': base_prob,
        'robust_prob': robust_prob,
        'ensemble_prob': ensemble_prob,
        'base_decision': _decision(base_prob),
        'robust_decision': _decision(robust_prob),
        'ensemble_decision': _decision(ensemble_prob),
    }

# --- Test on multiple samples ---
print('>>> ENSEMBLE PREDICTIONS ON TEST DATA ---\n')
print(f'{"Sample":<8} {"Base Prob":>10} {"Robust Prob":>12} {"Ensemble":>10} {"Base":>12} {"Robust":>12} {"Ensemble":>12} {"Actual":>8}')
print('-' * 100)

num_samples = 10
indices = np.linspace(0, len(X_test) - 1, num_samples, dtype=int)

correct_base = 0
correct_robust = 0
correct_ensemble = 0

for idx in indices:
    result = ensemble_predict(base_model, robust_model, X_test[idx])
    actual = 'BULL' if y_test[idx] == 1 else 'BEAR'

    # Check correctness
    actual_bull = y_test[idx] == 1
    if (result['base_prob'] > 0.5) == actual_bull:
        correct_base += 1
    if (result['robust_prob'] > 0.5) == actual_bull:
        correct_robust += 1
    if (result['ensemble_prob'] > 0.5) == actual_bull:
        correct_ensemble += 1

    print(f'{idx:<8} {result["base_prob"]:>10.4f} {result["robust_prob"]:>12.4f} {result["ensemble_prob"]:>10.4f} '
          f'{result["base_decision"]:>12} {result["robust_decision"]:>12} {result["ensemble_decision"]:>12} {actual:>8}')

print(f'\nSample accuracy — Base: {correct_base}/{num_samples}, '
      f'Robust: {correct_robust}/{num_samples}, '
      f'Ensemble: {correct_ensemble}/{num_samples}')

## 8. Trading Strategy (EV-Based Logic)

### The "Kelly Criterion" (EV Logic)

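The `trade_logic` function is crucial. Sometimes the AI is unsure (51% Bull), but if the crowd is heavily betting Bear, the Bull Payout might be 2.5x. A payout multiplier includes the returned stake, so a winning bet nets 1.5 units per unit staked.

**EV = (0.51 × 1.5) - (0.49 × 1) = 0.275** per unit staked, which is a clearly profitable bet despite low confidence. As written, `trade_logic` also requires more than 60% confidence, so it would still skip this particular round.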
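
The arithmetic can be checked with a few lines of Python. The sketch treats the payout as a gross multiplier that includes the returned stake, which is the convention the backtest in section 12 uses when it credits `payout - 1` on a win. It also shows the standard Kelly fraction for bet sizing, which the notebook's cells do not implement. Both function names are illustrative.

```python
def expected_value(prob_win, payout):
    """Expected profit per unit staked; `payout` includes the returned stake."""
    return prob_win * (payout - 1.0) - (1.0 - prob_win)


def kelly_fraction(prob_win, payout):
    """Kelly stake as a fraction of bankroll; 0 when the bet has no edge."""
    net_odds = payout - 1.0
    fraction = (net_odds * prob_win - (1.0 - prob_win)) / net_odds
    return max(0.0, fraction)


for prob, payout in [(0.51, 2.5), (0.60, 1.95), (0.51, 1.95)]:
    print(f'p={prob:.2f} payout={payout:.2f} '
          f'EV={expected_value(prob, payout):+.3f} '
          f'Kelly={kelly_fraction(prob, payout):.3f}')
```

The first case is the low-confidence, high-payout bet from the text. The last case shows that the same 51% confidence has a slightly negative EV at a typical 1.95x payout, so the Kelly stake drops to zero.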

In [None]:
def trade_logic(model, current_sequence, bull_payout, bear_payout):
    """
    Decides whether to bet based on Model Confidence AND Pool Odds (EV).
    """
    seq_reshaped = current_sequence.reshape(1, SEQ_LENGTH, FEATURES)
    prob_bull = float(model.predict(seq_reshaped, verbose=0)[0][0])
    prob_bear = 1.0 - prob_bull

    decision = 'SKIP'

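    # Calculate Expected Value (EV) per unit staked.
    # Payouts are gross multipliers (stake included), so a win nets payout - 1.
    ev_bull = (prob_bull * (bull_payout - 1)) - (prob_bear * 1)
    ev_bear = (prob_bear * (bear_payout - 1)) - (prob_bull * 1)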

    # Thresholding (Only bet if we have an edge)
    CONFIDENCE_THRESHOLD = 0.60

    if ev_bull > 0.2 and prob_bull > CONFIDENCE_THRESHOLD:
        decision = 'BET BULL'
    elif ev_bear > 0.2 and prob_bear > CONFIDENCE_THRESHOLD:
        decision = 'BET BEAR'

    return decision, prob_bull, ev_bull, ev_bear

# --- SIMULATE A LIVE ROUND ---
print('\n>>> LIVE ROUND PREDICTION (Base Model) ---')
latest_data = X_test[-1]
current_bull_payout = 1.95
current_bear_payout = 1.95

action, conf, ev_up, ev_down = trade_logic(base_model, latest_data, current_bull_payout, current_bear_payout)
print(f'AI Bull Probability: {conf:.2%}')
print(f'EV Bull: {ev_up:.2f} | EV Bear: {ev_down:.2f}')
print(f'STRATEGY CALL: {action}')

print('\n>>> LIVE ROUND PREDICTION (Robust Model) ---')
action2, conf2, ev_up2, ev_down2 = trade_logic(robust_model, latest_data, current_bull_payout, current_bear_payout)
print(f'AI Bull Probability: {conf2:.2%}')
print(f'EV Bull: {ev_up2:.2f} | EV Bear: {ev_down2:.2f}')
print(f'STRATEGY CALL: {action2}')

## 9. Knowledge Distillation (Teacher → Student)

Both trained models act as **teachers**. We create a smaller **student** model that learns
from the combined soft predictions of both teachers. This transfers the learned knowledge
of both architectures into a single lightweight model.

Benefits:
- Faster inference (smaller model)
- Combines strengths of both architectures
- Soft labels carry more information than hard labels (dark knowledge)
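
The role of soft labels can be seen in a small NumPy example. The teacher outputs below are hypothetical numbers, and `binary_cross_entropy` is a hand-written stand-in for the Keras loss, included only to show that the loss against soft targets is lowest when the student reproduces the teachers' averaged probabilities.

```python
import numpy as np

def binary_cross_entropy(target, pred, eps=1e-7):
    """Mean BCE; `target` may be a soft probability rather than 0 or 1."""
    pred = np.clip(pred, eps, 1.0 - eps)
    return float(np.mean(-(target * np.log(pred) + (1 - target) * np.log(1 - pred))))

# Hypothetical teacher outputs for four training windows.
base_preds = np.array([0.80, 0.56, 0.30, 0.62])
robust_preds = np.array([0.70, 0.48, 0.20, 0.58])
soft_labels = (base_preds + robust_preds) / 2.0   # averaged teacher belief
hard_labels = (soft_labels > 0.5).astype(int)     # what thresholding would keep

matching = soft_labels.copy()                     # student that reproduces the teachers
overconfident = np.array([0.99, 0.99, 0.01, 0.99])

print('soft:', soft_labels, 'hard:', hard_labels)
print('matching student loss:     ', binary_cross_entropy(soft_labels, matching))
print('overconfident student loss:', binary_cross_entropy(soft_labels, overconfident))
```

The hard labels treat a 0.52 and a 0.75 identically, while the soft labels keep that difference, and the loss penalises a student that is more certain than its teachers.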

In [None]:
def build_distilled_model(teacher_base, teacher_robust, X_train,
                          epochs=5, batch_size=32):
    """
    Build a student model that learns from both teacher models.
    """
    # Generate soft labels from both teachers
    print('>>> Generating soft labels from teachers...')
    base_preds = teacher_base.predict(X_train, verbose=0)
    robust_preds = teacher_robust.predict(X_train, verbose=0)
    soft_labels = (base_preds + robust_preds) / 2.0

    # Build a lightweight student model
    inputs = Input(shape=(SEQ_LENGTH, FEATURES))
    x = layers.Bidirectional(layers.LSTM(32, return_sequences=True))(inputs)
    x = layers.Dropout(0.2)(x)
    attn = layers.MultiHeadAttention(num_heads=2, key_dim=16)(x, x)
    x = Add()([x, attn])
    x = LayerNormalization()(x)
    x = GlobalAveragePooling1D()(x)
    x = layers.Dense(32, activation='relu')(x)
    output = layers.Dense(1, activation='sigmoid', name='Bull_Probability')(x)

    student = models.Model(inputs=inputs, outputs=output)
    student.compile(
        optimizer=tf.keras.optimizers.Adam(learning_rate=0.001),
        loss='binary_crossentropy',
        metrics=['accuracy']
    )

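    # Train on soft labels from teachers.
    # Note: the accuracy printed during fit is measured against soft labels and
    # is not meaningful; rely on the hard-label evaluation after training.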
    print('>>> Training student on teacher knowledge...')
    student.fit(X_train, soft_labels, epochs=epochs, batch_size=batch_size, verbose=1)

    return student

print('>>> KNOWLEDGE DISTILLATION...')
student_model = build_distilled_model(base_model, robust_model, X_train)

# Evaluate student on real labels
student_loss, student_acc = student_model.evaluate(X_test, y_test, verbose=0)
print(f'\nStudent Model — Test Loss: {student_loss:.4f}, Test Accuracy: {student_acc:.4f}')

## 10. Final Comparison: All Three Models

Compare the base model, robust model, and knowledge-distilled student model.

In [None]:
# Full test-set evaluation: predict once per model, then threshold at 0.5
base_probs_all = base_model.predict(X_test, verbose=0).flatten()
robust_probs_all = robust_model.predict(X_test, verbose=0).flatten()
student_probs_all = student_model.predict(X_test, verbose=0).flatten()

base_preds_all = (base_probs_all > 0.5).astype(int)
robust_preds_all = (robust_probs_all > 0.5).astype(int)
student_preds_all = (student_probs_all > 0.5).astype(int)

# Ensemble predictions (equal-weight average of the two teachers)
ensemble_probs_all = (base_probs_all + robust_probs_all) / 2.0
ensemble_preds_all = (ensemble_probs_all > 0.5).astype(int)

# Accuracy
base_full_acc = np.mean(base_preds_all == y_test)
robust_full_acc = np.mean(robust_preds_all == y_test)
student_full_acc = np.mean(student_preds_all == y_test)
ensemble_full_acc = np.mean(ensemble_preds_all == y_test)

print('=' * 65)
print('           FINAL MODEL COMPARISON (Full Test Set)')
print('=' * 65)
print(f'{"Model":<25} {"Accuracy":>12} {"vs Break-Even":>15}')
print('-' * 65)
print(f'{"Base (BiLSTM+Attn)":<25} {base_full_acc:>12.4f} {base_full_acc - 0.53:>+15.4f}')
print(f'{"Robust (Conv+BiLSTM)":<25} {robust_full_acc:>12.4f} {robust_full_acc - 0.53:>+15.4f}')
print(f'{"Ensemble (Avg)":<25} {ensemble_full_acc:>12.4f} {ensemble_full_acc - 0.53:>+15.4f}')
print(f'{"Student (Distilled)":<25} {student_full_acc:>12.4f} {student_full_acc - 0.53:>+15.4f}')
print('=' * 65)

best_name = max(
    [('Base', base_full_acc), ('Robust', robust_full_acc),
     ('Ensemble', ensemble_full_acc), ('Student', student_full_acc)],
    key=lambda x: x[1]
)
print(f'\n>>> BEST MODEL: {best_name[0]} ({best_name[1]:.4f})')

# Visualization
models_names = ['Base', 'Robust', 'Ensemble', 'Student']
accuracies = [base_full_acc, robust_full_acc, ensemble_full_acc, student_full_acc]
colors = ['#3498db', '#e74c3c', '#2ecc71', '#9b59b6']

fig, ax = plt.subplots(figsize=(10, 5))
bars = ax.bar(models_names, accuracies, color=colors, edgecolor='black')
ax.axhline(y=0.53, color='red', linestyle='--', linewidth=2, label='Break-Even (53%)')
ax.axhline(y=0.5, color='gray', linestyle=':', linewidth=1, label='Random (50%)')
ax.set_ylabel('Accuracy')
ax.set_title('Model Comparison: Accuracy on Test Set', fontsize=14, fontweight='bold')
ax.legend()

for bar, acc in zip(bars, accuracies):
    ax.text(bar.get_x() + bar.get_width() / 2., bar.get_height() + 0.005,
            f'{acc:.3f}', ha='center', va='bottom', fontweight='bold')

plt.tight_layout()
plt.show()

## 11. Live Prediction Function

A unified function that gets predictions from all models and recommends the best action.

In [None]:
def predict_round(sequence, bull_payout=1.95, bear_payout=1.95):
    """
    Get predictions from all models for a single round.

    Args:
        sequence: Input array of shape (SEQ_LENGTH, FEATURES).
        bull_payout: Current Bull payout multiplier.
        bear_payout: Current Bear payout multiplier.

    Returns:
        Dict with predictions from base, robust, ensemble, and student models.
    """
    seq = sequence.reshape(1, SEQ_LENGTH, FEATURES)

    base_p = float(base_model.predict(seq, verbose=0)[0][0])
    robust_p = float(robust_model.predict(seq, verbose=0)[0][0])
    ensemble_p = (base_p + robust_p) / 2.0
    student_p = float(student_model.predict(seq, verbose=0)[0][0])

    def _ev_decision(prob):
        prob_bear = 1.0 - prob
        # Payouts are gross multipliers, so a win nets payout - 1 per unit staked
        ev_bull = (prob * (bull_payout - 1)) - (prob_bear * 1)
        ev_bear = (prob_bear * (bear_payout - 1)) - (prob * 1)
        if ev_bull > 0.2 and prob > 0.60:
            return 'BET BULL', ev_bull, ev_bear
        elif ev_bear > 0.2 and prob_bear > 0.60:
            return 'BET BEAR', ev_bull, ev_bear
        return 'SKIP', ev_bull, ev_bear

    results = {}
    for name, prob in [('Base', base_p), ('Robust', robust_p),
                       ('Ensemble', ensemble_p), ('Student', student_p)]:
        dec, ev_b, ev_br = _ev_decision(prob)
        results[name] = {
            'probability': prob,
            'decision': dec,
            'ev_bull': ev_b,
            'ev_bear': ev_br
        }

    return results

# --- Demo ---
print('>>> UNIFIED PREDICTION FOR LATEST ROUND ---\n')
results = predict_round(X_test[-1])

print(f'{"Model":<12} {"Bull Prob":>10} {"EV Bull":>10} {"EV Bear":>10} {"Decision":>12}')
print('-' * 60)
for name, r in results.items():
    print(f'{name:<12} {r["probability"]:>10.4f} {r["ev_bull"]:>10.2f} {r["ev_bear"]:>10.2f} {r["decision"]:>12}')

# Consensus check
decisions = [r['decision'] for r in results.values()]
if all(d == decisions[0] for d in decisions):
    print(f'\n>>> ALL MODELS AGREE: {decisions[0]}')
else:
    print(f'\n>>> MODELS DISAGREE — Consider SKIP for safety')

## 12. Backtesting Simulation (Real Data Feedback)

Simulate trading on the test set to estimate P&L, assuming a fixed payout on both sides (1.95x by default).
This uses the actual test labels as "real data feedback".
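
The bookkeeping is easiest to follow on a handful of rounds. The sketch below is a simplified version that keeps only the 60% confidence threshold and leaves out the EV filter. The probabilities and outcomes are hypothetical, and `settle` is an illustrative helper, not part of the notebook.

```python
def settle(prob_bull, went_up, payout=1.95, threshold=0.60):
    """Profit in stake units for one round; 0.0 means the round was skipped."""
    if prob_bull > threshold:          # bet Bull
        return payout - 1.0 if went_up else -1.0
    if 1.0 - prob_bull > threshold:    # bet Bear
        return payout - 1.0 if not went_up else -1.0
    return 0.0


# Hypothetical model outputs and outcomes for five rounds.
rounds = [(0.70, True), (0.55, False), (0.30, False), (0.65, False), (0.20, True)]

balance, curve = 0.0, [0.0]
for prob_bull, went_up in rounds:
    balance += settle(prob_bull, went_up)
    curve.append(balance)

print([round(b, 2) for b in curve])
```

Two wins at 1.95x are almost exactly cancelled by two losses, which is the house edge at work. The curve gets one point per round, including skipped rounds, so flat segments mark rounds where no bet was placed.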

In [None]:
def backtest(model, X_test, y_test, bull_payout=1.95, bear_payout=1.95, name='Model'):
    """
    Backtest a model on the test set. Returns cumulative P&L.
    """
    balance = 0.0
    trades = 0
    wins = 0
    history = [0.0]

    preds = model.predict(X_test, verbose=0).flatten()

    for i in range(len(y_test)):
        prob_bull = float(preds[i])
        prob_bear = 1.0 - prob_bull
        actual_bull = y_test[i] == 1

        # EV per unit staked; payouts are gross multipliers, so a win nets payout - 1
        ev_bull = (prob_bull * (bull_payout - 1)) - (prob_bear * 1)
        ev_bear = (prob_bear * (bear_payout - 1)) - (prob_bull * 1)

        # Only trade when EV is positive and confidence is high
        if ev_bull > 0.2 and prob_bull > 0.60:
            trades += 1
            if actual_bull:
                balance += (bull_payout - 1)  # Net profit
                wins += 1
            else:
                balance -= 1  # Lost stake
        elif ev_bear > 0.2 and prob_bear > 0.60:
            trades += 1
            if not actual_bull:
                balance += (bear_payout - 1)  # Net profit
                wins += 1
            else:
                balance -= 1  # Lost stake

        history.append(balance)

    win_rate = wins / trades if trades > 0 else 0
    print(f'{name:<20} Trades: {trades:>5} | Wins: {wins:>5} | '
          f'Win Rate: {win_rate:.2%} | P&L: {balance:>+8.2f}')

    return history

print('>>> BACKTESTING ALL MODELS ---\n')
base_pnl = backtest(base_model, X_test, y_test, name='Base')
robust_pnl = backtest(robust_model, X_test, y_test, name='Robust')
student_pnl = backtest(student_model, X_test, y_test, name='Student')

# Plot P&L curves
fig, ax = plt.subplots(figsize=(14, 5))
ax.plot(base_pnl, label='Base Model', alpha=0.8)
ax.plot(robust_pnl, label='Robust Model', alpha=0.8)
ax.plot(student_pnl, label='Student Model', alpha=0.8)
ax.axhline(y=0, color='black', linestyle='-', linewidth=0.5)
ax.set_title('Backtesting: Cumulative P&L', fontsize=14, fontweight='bold')
ax.set_xlabel('Test Sample Index')
ax.set_ylabel('Cumulative P&L (units)')
ax.legend()
plt.tight_layout()
plt.show()

## 13. Real-World Integration Steps

1. **Web3.py**: Fetch `currentEpoch`, `lockPrice`, and `bullAmount`/`bearAmount` from the PancakeSwap Prediction Smart Contract.
2. **PancakeSwap DEX**: Use Web3.py to query on-chain data and fetch the real-time OHLCV data that feeds the model.
3. **Latency**: Run close to BSC nodes. The "Lock" happens instantly; your transaction needs to be confirmed 5-10 seconds before the lock.

### Summary of Architectures

| Component | Base Model | Robust Model | Student Model |
|---|---|---|---|
| Input Noise | ✗ | GaussianNoise(0.05) | ✗ |
| Conv1D | ✗ | 32 filters, kernel=3 | ✗ |
| BiLSTM Layers | 1×64 | 2×64 (stacked) | 1×32 |
| Attention | 4 heads, key=32 + Residual | 4 heads, key=32 + Residual | 2 heads, key=16 + Residual |
| Dropout | 0.2 | 0.3 | 0.2 |
| Learning Rate | Adam default | Adam 0.0005 | Adam 0.001 |
| Training | Hard labels | Hard labels | Soft labels (distilled) |