# Bitcoin Trading Bot LSTM Model Trainer

This notebook trains an LSTM model for the Bitcoin ML Trading Bot using Google Colab's GPU acceleration.

## Workflow:
1. Upload your `trades.db` file from your local bot
2. Configure training parameters
3. Train the LSTM model using GPU acceleration
4. Download the trained model and scaler files
5. Place the downloaded files in your local `model_artifacts/` directory

## Requirements:
- TensorFlow 2.x
- Pandas
- NumPy
- Scikit-learn
- Matplotlib
- `sqlite3` (included in the Python standard library)

Let's get started!

## 1. Check GPU Availability

First, let's verify that we have GPU acceleration available.

In [None]:
import tensorflow as tf
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import sqlite3
import os
import pickle
from datetime import datetime
from sklearn.preprocessing import MinMaxScaler
from sklearn.model_selection import train_test_split
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import LSTM, Dense, Dropout
from tensorflow.keras.callbacks import EarlyStopping, ModelCheckpoint

# Check for GPU
print("TensorFlow version:", tf.__version__)
gpus = tf.config.list_physical_devices('GPU')
print("GPU Available:", gpus)

# If a GPU is available, enable memory growth so TensorFlow allocates GPU memory on demand
if gpus:
    try:
        for gpu in gpus:
            tf.config.experimental.set_memory_growth(gpu, True)
        print("Memory growth set to True for all GPUs")
    except RuntimeError as e:
        # Raised when memory growth is set after the GPUs have been initialized
        print("Error setting memory growth:", e)

## 2. Upload Trades Database

Upload your `trades.db` file from your local bot's `logs/` directory.

In [None]:
from google.colab import files

# Upload trades.db file
uploaded = files.upload()

# Stop early if the upload was cancelled or no file was selected
if not uploaded:
    raise ValueError("No file uploaded. Please upload your trades.db file.")

# Get the filename of the uploaded file
db_filename = next(iter(uploaded))
print(f"Uploaded: {db_filename}")

# If the uploaded file is not named trades.db, rename it
if db_filename != "trades.db":
    os.rename(db_filename, "trades.db")
    db_filename = "trades.db"
    print(f"Renamed to: {db_filename}")

## 3. Load and Prepare Data

Now let's load the trade data from the database and prepare it for LSTM training.

In [None]:
# Connect to the database
conn = sqlite3.connect(db_filename)
try:
    cursor = conn.cursor()

    # Check that the trades table exists; stop here if it does not,
    # because every later cell depends on df
    cursor.execute("SELECT name FROM sqlite_master WHERE type='table' AND name='trades'")
    if not cursor.fetchone():
        raise ValueError("'trades' table not found in the database.")

    # Get column names
    cursor.execute("PRAGMA table_info(trades)")
    columns = [col[1] for col in cursor.fetchall()]
    print(f"Columns in trades table: {columns}")

    # Query all trades in chronological order
    cursor.execute("SELECT * FROM trades ORDER BY timestamp")
    rows = cursor.fetchall()
finally:
    # Always release the database handle
    conn.close()

# Create DataFrame
df = pd.DataFrame(rows, columns=columns)
print(f"Loaded {len(df)} trades from database")

# Display the first few rows (must be the last expression in the cell to render)
df.head()
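
The table check above can be tried against a throwaway in-memory database before uploading real data. The sketch below is illustrative only: the helper name `get_table_columns` and the two-column schema are assumptions, not part of the bot.

```python
import sqlite3

def get_table_columns(conn, table_name):
    """Return the column names of table_name, or an empty list if it does not exist."""
    cursor = conn.cursor()
    cursor.execute(
        "SELECT name FROM sqlite_master WHERE type='table' AND name=?",
        (table_name,),
    )
    if cursor.fetchone() is None:
        return []
    # The name matched an existing table above; quote it for the PRAGMA call
    cursor.execute(f'PRAGMA table_info("{table_name}")')
    # Each row is (cid, name, type, notnull, dflt_value, pk)
    return [row[1] for row in cursor.fetchall()]

# Hypothetical schema used only for this demonstration
demo_conn = sqlite3.connect(":memory:")
demo_conn.execute("CREATE TABLE trades (timestamp TEXT, price REAL)")
demo_columns = get_table_columns(demo_conn, "trades")
missing_columns = get_table_columns(demo_conn, "orders")
demo_conn.close()

print(demo_columns)     # ['timestamp', 'price']
print(missing_columns)  # []
```

An empty list signals a missing table, so the caller can decide whether to raise or to ask for a different file.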

In [None]:
# Check if we have enough real trades for training
if len(df) < 100:
    print("❌ Not enough real trades for LSTM training.")
    print("Please upload a valid trades.db file with at least 100 entries in the 'trades' table.")
    raise ValueError("Aborting training: insufficient real trade data.")
else:
    print(f"✅ Loaded {len(df)} real trades. Proceeding with training.")

In [None]:
# Prepare data for LSTM
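def prepare_sequences(df, sequence_length=10):
    """Prepare sequences for LSTM training.

    Returns (X, y, scaler): X has shape (n_samples, sequence_length, 1)
    and y holds 0/1 direction labels.
    """
    # Work on a copy so the caller's DataFrame is not modified
    df = df.copy()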
    # Ensure price column is numeric
    if 'price' in df.columns:
        df['price'] = pd.to_numeric(df['price'], errors='coerce')
    else:
        # If price column doesn't exist, try to use entry_price or exit_price
        if 'entry_price' in df.columns:
            df['price'] = pd.to_numeric(df['entry_price'], errors='coerce')
        elif 'exit_price' in df.columns:
            df['price'] = pd.to_numeric(df['exit_price'], errors='coerce')
        else:
            raise ValueError("No price column found in the data")
    
    # Drop rows with NaN prices
    df = df.dropna(subset=['price'])
    
    # Sort by timestamp if available
    if 'timestamp' in df.columns:
        df = df.sort_values('timestamp')
    
    # Extract price series
    prices = df['price'].values.reshape(-1, 1)
    
    # Scale the prices
    scaler = MinMaxScaler(feature_range=(0, 1))
    prices_scaled = scaler.fit_transform(prices)
    
    # Create sequences
    X, y = [], []
    for i in range(len(prices_scaled) - sequence_length):
        X.append(prices_scaled[i:i+sequence_length])
        
        # For the target, we'll predict the direction (1 for up, 0 for down or same)
        next_price = prices_scaled[i+sequence_length][0]
        current_price = prices_scaled[i+sequence_length-1][0]
        y.append(1 if next_price > current_price else 0)
    
    return np.array(X), np.array(y), scaler

# Prepare sequences
sequence_length = 10  # Number of previous prices to use for prediction
X, y, scaler = prepare_sequences(df, sequence_length)

print(f"X shape: {X.shape}, y shape: {y.shape}")
print(f"Class distribution: {np.bincount(y)}")
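
To make the windowing and labeling concrete, the sketch below applies the same logic to a tiny made-up price list with a window of 3. The function name `make_windows` and the prices are illustrative assumptions. Scaling is left out because min-max scaling preserves the order of prices and therefore does not change the labels.

```python
import numpy as np

def make_windows(prices, sequence_length):
    """Build (window, label) pairs; label is 1 if the next price is above the window's last price."""
    prices = np.asarray(prices, dtype=float)
    X, y = [], []
    for i in range(len(prices) - sequence_length):
        window = prices[i:i + sequence_length]
        next_price = prices[i + sequence_length]
        X.append(window)
        # An unchanged price is labeled 0, the same as a falling price
        y.append(1 if next_price > window[-1] else 0)
    return np.array(X), np.array(y)

toy_prices = [100.0, 101.0, 103.0, 102.0, 102.0, 105.0]
toy_X, toy_y = make_windows(toy_prices, sequence_length=3)

print(toy_X.shape)     # (3, 3)
print(toy_y.tolist())  # [0, 0, 1]
```

Six prices with a window of 3 yield three samples. The second sample ends at 102 and is followed by 102, so it is labeled 0 even though the price did not fall.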

In [None]:
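# Split data into training and validation sets.
# shuffle=False keeps the chronological order, so the validation set is the most recent 20%
# and overlapping windows from the same period do not end up in both sets.
X_train, X_val, y_train, y_val = train_test_split(X, y, test_size=0.2, shuffle=False)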

print(f"Training set: {X_train.shape}, {y_train.shape}")
print(f"Validation set: {X_val.shape}, {y_val.shape}")

## 4. Build and Train LSTM Model

Now let's build and train the LSTM model.

In [None]:
# Define LSTM model
def build_lstm_model(sequence_length, features=1):
    model = Sequential([
        LSTM(50, return_sequences=True, input_shape=(sequence_length, features)),
        Dropout(0.2),
        LSTM(50),
        Dropout(0.2),
        Dense(1, activation='sigmoid')
    ])
    
    model.compile(
        optimizer='adam',
        loss='binary_crossentropy',
        metrics=['accuracy']
    )
    
    return model

# Build model
model = build_lstm_model(sequence_length)
model.summary()

In [None]:
# Define callbacks
early_stopping = EarlyStopping(
    monitor='val_accuracy',
    patience=10,
    restore_best_weights=True
)

# Create directory for model checkpoints
os.makedirs('model_checkpoints', exist_ok=True)

model_checkpoint = ModelCheckpoint(
    'model_checkpoints/lstm_model_checkpoint.h5',
    monitor='val_accuracy',
    save_best_only=True,
    verbose=1
)

# Train model
epochs = 30
batch_size = 32

history = model.fit(
    X_train, y_train,
    epochs=epochs,
    batch_size=batch_size,
    validation_data=(X_val, y_val),
    callbacks=[early_stopping, model_checkpoint],
    verbose=1
)

## 5. Evaluate Model Performance

In [None]:
# Plot training history
plt.figure(figsize=(12, 5))

# Plot accuracy
plt.subplot(1, 2, 1)
plt.plot(history.history['accuracy'], label='Training Accuracy')
plt.plot(history.history['val_accuracy'], label='Validation Accuracy')
plt.title('Model Accuracy')
plt.xlabel('Epoch')
plt.ylabel('Accuracy')
plt.legend()

# Plot loss
plt.subplot(1, 2, 2)
plt.plot(history.history['loss'], label='Training Loss')
plt.plot(history.history['val_loss'], label='Validation Loss')
plt.title('Model Loss')
plt.xlabel('Epoch')
plt.ylabel('Loss')
plt.legend()

plt.tight_layout()
plt.show()

# Evaluate on validation set
loss, accuracy = model.evaluate(X_val, y_val, verbose=0)
print(f"Validation Loss: {loss:.4f}")
print(f"Validation Accuracy: {accuracy:.4f}")
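
Validation accuracy is easier to judge next to a trivial baseline that always predicts the more frequent class. The sketch below is one way to compute it; `majority_baseline_accuracy` is a hypothetical helper and the sample labels are made up.

```python
from collections import Counter

def majority_baseline_accuracy(labels):
    """Accuracy obtained by always predicting the most common label."""
    labels = [int(v) for v in labels]
    if not labels:
        raise ValueError("labels must not be empty")
    most_common_count = Counter(labels).most_common(1)[0][1]
    return most_common_count / len(labels)

# Made-up labels: five 1s and three 0s
sample_labels = [1, 0, 1, 1, 0, 1, 1, 0]
baseline = majority_baseline_accuracy(sample_labels)
print(f"Majority-class baseline: {baseline:.4f}")  # 0.6250
```

In this notebook, passing `y_val` to the helper gives the figure to compare against the reported validation accuracy. A model that does not beat it has learned little beyond the class balance.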

## 6. Save Model and Scaler

Now let's save the trained model and scaler for use in the trading bot.

In [None]:
# Save model
model.save('lstm_model.h5')
print("Model saved to lstm_model.h5")

# Save scaler
with open('lstm_scaler.pkl', 'wb') as f:
    pickle.dump(scaler, f)
print("Scaler saved to lstm_scaler.pkl")

## 7. Test Model Prediction

Let's test the model with a sample sequence to ensure it works correctly.

In [None]:
# Get a sample sequence from the validation set
sample_sequence = X_val[0]
true_label = y_val[0]

# Make prediction
prediction = model.predict(sample_sequence.reshape(1, sequence_length, 1))[0][0]
predicted_label = 1 if prediction > 0.5 else 0

print(f"Prediction probability: {prediction:.4f}")
print(f"Predicted label: {predicted_label} ({'UP' if predicted_label == 1 else 'DOWN'})")
print(f"True label: {true_label} ({'UP' if true_label == 1 else 'DOWN'})")

# Decode the sequence to show actual prices
original_prices = scaler.inverse_transform(sample_sequence)
print("\nSequence prices:")
for i, price in enumerate(original_prices):
    print(f"  t-{sequence_length-i}: ${price[0]:.2f}")

## 8. Download Model Files

Finally, let's download the trained model and scaler files for use in the trading bot.

In [None]:
from google.colab import files

# Download model and scaler files
files.download('lstm_model.h5')
files.download('lstm_scaler.pkl')

## 9. Instructions for Using the Model

1. Download both `lstm_model.h5` and `lstm_scaler.pkl` files
2. Place them in your local bot's `model_artifacts/` directory
3. Run your trading bot - it should now pass the model verification check

The model is trained to predict price direction (UP or DOWN) based on the previous 10 price points. The prediction is a probability between 0 and 1. Values above 0.5 indicate an upward prediction; values of 0.5 or below are treated as DOWN, a class that also covers an unchanged price.
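
On the bot side, the last 10 prices have to be scaled the same way as in training and reshaped to `(1, 10, 1)` before they are passed to the model. The sketch below reproduces the min-max formula with NumPy so that it runs without the saved files. `data_min`, `data_max`, and both function names are illustrative assumptions; in practice the unpickled `lstm_scaler.pkl` would do the scaling through `scaler.transform`.

```python
import numpy as np

def prepare_window(prices, data_min, data_max, sequence_length=10):
    """Scale the most recent prices to the training range and shape them for the LSTM."""
    recent = np.asarray(prices[-sequence_length:], dtype=float)
    if recent.shape[0] < sequence_length:
        raise ValueError(f"Need at least {sequence_length} prices")
    # Same formula MinMaxScaler applies for feature_range=(0, 1)
    scaled = (recent - data_min) / (data_max - data_min)
    return scaled.reshape(1, sequence_length, 1)

def label_from_probability(probability, threshold=0.5):
    """Map the sigmoid output to a label; a value equal to the threshold counts as DOWN."""
    return "UP" if probability > threshold else "DOWN"

# Made-up prices and training range, for illustration only
recent_prices = [50000 + 100 * i for i in range(12)]
window = prepare_window(recent_prices, data_min=40000.0, data_max=60000.0)

print(window.shape)                  # (1, 10, 1)
print(label_from_probability(0.73))  # UP
print(label_from_probability(0.50))  # DOWN
```

Prices outside the range seen during training scale to values below 0 or above 1, which is one reason to retrain as new data arrives.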

You can retrain this model periodically as more trade data becomes available.