## Regression Returns

Description:     
In this approach, we treat the next-bar (or multi-bar) return as a continuous variable and use a regression model (e.g., RandomForestRegressor) to predict it. Positive predicted returns imply a potential buy signal, negative imply a sell, and near-zero might mean no trade. This method captures magnitude of price movement rather than just direction.

#### 📌 Important Note:
This notebook contains *interactive charts generated using Vectorbt.  
GitHub does not display interactive Plotly charts, so the graphs will not be visible here.  

✅ To view the charts, please download this notebook and run it on your local machine.  
Make sure you have Vectorbt and its dependencies installed to regenerate the visualizations.


## Part 1: Data & Feature Engineering

**Objective:**  
Load raw price data (MetaTrader 5 or CSV) and transform it into a feature-rich dataset.

**Tasks:**
- Fetch historical bars  
- Apply `ta.add_all_ta_features` or custom features  
- (Optionally) create specific labels (multi-bar, double-barrier, regime, etc.)  
- Clean/prepare the final feature matrix **X** and target **y**  

In [None]:
import sys
import os
import warnings
from pathlib import Path

# ---------------------------------------------------------------------------
# 1) SET PROJECT ROOT AND UPDATE PATH/WORKING DIRECTORY
# ---------------------------------------------------------------------------
project_root = Path.cwd().parent.parent  # Adjust if your notebook is in notebooks/time_series
sys.path.append(str(project_root))
os.chdir(str(project_root))
warnings.filterwarnings("ignore")



import pandas as pd
import numpy as np
import MetaTrader5 as mt5
import vectorbt as vbt

# Our modules for data and backtesting
from data.data_loader import get_data_mt5
from features.feature_engineering import add_all_ta_features
from features.labeling_schemes import calculate_future_returns
from backtests.simple_backtest import simulate_trading, calculate_sharpe_ratio
from models.model_training import walk_forward_splits
from sklearn.preprocessing import StandardScaler
from sklearn.metrics import mean_squared_error

# Deep Learning libraries
import tensorflow as tf
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Dropout, Flatten, Conv1D, GlobalMaxPooling1D, LSTM
from tensorflow.keras.optimizers import Adam

# Set the project root (assuming the notebook is in a subfolder)
project_root = Path.cwd().parent
os.chdir(project_root)

###########################################################
# 1) DATA LOADING & FEATURE ENGINEERING
###########################################################
if not mt5.initialize():
    print("Failed to initialize MT5")
else:
    data = get_data_mt5(symbol="BTCUSD", timeframe=mt5.TIMEFRAME_H4, n_bars=1000)
    mt5.shutdown()

df = add_all_ta_features(data)
df = calculate_future_returns(df).dropna(subset=["future_returns"])

# Prepare features and labels
X = df.drop(columns=["future_returns"])
y = df["future_returns"]

###########################################################
# 2) WALK-FORWARD SPLITS
###########################################################

folds = walk_forward_splits(X, y, n_splits=3)
print(f"Number of folds created: {len(folds)}")

###########################################################
# 3) HELPER FUNCTION TO CREATE SEQUENCES
###########################################################
def create_sequences(X, y, lookback):
    X_seq, y_seq = [], []
    for i in range(lookback, len(X)):
        X_seq.append(X[i-lookback:i])
        y_seq.append(y[i])
    return np.array(X_seq), np.array(y_seq)

# Define lookback window for sequence-based models
lookback = 10

###########################################################
# 4) DEFINE DL MODEL CONSTRUCTORS
###########################################################
# Model 1: MLP (Feed-Forward) that flattens the sequence
def create_mlp_model_seq(input_shape):
    model = Sequential([
         Flatten(input_shape=input_shape),
         Dense(64, activation='relu'),
         Dropout(0.2),
         Dense(32, activation='relu'),
         Dense(1)  # Regression output for future returns
    ])
    model.compile(optimizer=Adam(learning_rate=0.001), loss='mean_squared_error')
    return model

# Model 2: CNN for sequence data
def create_cnn_model_seq(input_shape):
    model = Sequential([
         Conv1D(filters=32, kernel_size=3, activation='relu', input_shape=input_shape),
         GlobalMaxPooling1D(),
         Dense(64, activation='relu'),
         Dropout(0.2),
         Dense(1)
    ])
    model.compile(optimizer=Adam(learning_rate=0.001), loss='mean_squared_error')
    return model

# Model 3: LSTM for sequence data
def create_lstm_model_seq(input_shape):
    model = Sequential([
         LSTM(50, input_shape=input_shape),
         Dense(32, activation='relu'),
         Dropout(0.2),
         Dense(1)
    ])
    model.compile(optimizer=Adam(learning_rate=0.001), loss='mean_squared_error')
    return model

# Dictionary of DL model constructors
dl_models = {
    "MLP": create_mlp_model_seq,
    "CNN": create_cnn_model_seq,
    "LSTM": create_lstm_model_seq
}

###########################################################
# 5) DL MODEL TRAINING & BACKTESTING ACROSS MODELS
###########################################################
threshold = 0.0005  # Trade signal threshold
cost = 0.0002       # Transaction cost (0.02%)

fold_results = {}

for fold_i, (X_train_fold, y_train_fold, X_test_fold, y_test_fold) in enumerate(folds, start=1):
    print(f"\n===== Fold {fold_i} =====")
    
    # Scale features
    scaler = StandardScaler()
    X_train_scaled = scaler.fit_transform(X_train_fold)
    X_test_scaled = scaler.transform(X_test_fold)
    
    # Create sequences from scaled data
    # Note: This reduces the number of samples by 'lookback'
    X_train_seq, y_train_seq = create_sequences(X_train_scaled, y_train_fold.values, lookback)
    X_test_seq, y_test_seq = create_sequences(X_test_scaled, y_test_fold.values, lookback)
    
    # Adjust indices for backtesting to align with the test sequences
    test_indices = X_test_fold.index[lookback:]
    
    fold_results[fold_i] = {}
    
    for model_name, create_model in dl_models.items():
        print(f"Training model: {model_name}")
        input_shape = X_train_seq.shape[1:]  # (lookback, n_features)
        model = create_model(input_shape)
        model.fit(X_train_seq, y_train_seq, epochs=50, batch_size=32, verbose=0)
        
        preds = model.predict(X_test_seq).flatten()
        mse = mean_squared_error(y_test_seq, preds)
        
        # Convert predictions into trading signals
        signals = np.where(preds > threshold, 1, np.where(preds < -threshold, -1, 0))
        
        # Get corresponding rows in the original dataframe for backtesting
        df_test_fold = df.loc[test_indices].copy()
        
        # Run backtest
        daily_returns, total_return = simulate_trading(signals, df_test_fold, cost=cost)
        sr = calculate_sharpe_ratio(np.array(daily_returns))
        
        fold_results[fold_i][model_name] = {
             "MSE": mse,
             "TotalReturn": total_return,
             "Sharpe": sr
        }

###########################################################
# 6) PRINT RESULTS
###########################################################
for fold_i, models_dict in fold_results.items():
    print(f"\n=== Fold {fold_i} Results ===")
    for model_name, stats in models_dict.items():
        mse = stats["MSE"]
        ret = stats["TotalReturn"]
        sr = stats["Sharpe"]
        print(f"{model_name}: MSE={mse:.2e}, Return={ret:.2f}%, Sharpe={sr:.2f}")


Number of folds created: 3

===== Fold 1 =====
Training model: MLP
[1m8/8[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 15ms/step
Training model: CNN
[1m8/8[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 25ms/step
Training model: LSTM
[1m8/8[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m1s[0m 71ms/step

===== Fold 2 =====
Training model: MLP
[1m8/8[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 11ms/step
Training model: CNN
[1m8/8[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 13ms/step
Training model: LSTM
[1m8/8[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m1s[0m 56ms/step

===== Fold 3 =====
Training model: MLP
[1m8/8[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 9ms/step 
Training model: CNN
[1m8/8[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 19ms/step
Training model: LSTM
[1m8/8[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m1s[0m 43ms/step

=== Fold 1 Results ===
MLP: MSE=1.85e+00, Return=6.89%, Sharpe=0.03
CNN: MSE=4.32

## Part 2: Model Training & Hyperparameter Tuning

**Objective:**  
Train an ML model (e.g., RandomForest, XGBoost) on the engineered features to predict the chosen labels.

**Tasks:**
- Perform time-based or walk-forward splits  
- Select top features if desired (e.g., using RandomForest feature importance)  
- Use `RandomizedSearchCV` or `GridSearchCV` to find optimal hyperparameters  
- Save the best model pipeline (e.g., `best_rf_pipeline.pkl`) 

In [None]:
import sys
import os
import warnings
from pathlib import Path

# ---------------------------------------------------------------------------
# 1) SET PROJECT ROOT AND UPDATE PATH/WORKING DIRECTORY
# ---------------------------------------------------------------------------
project_root = Path.cwd().parent.parent  # Adjust if your notebook is in notebooks/time_series
sys.path.append(str(project_root))
os.chdir(str(project_root))
warnings.filterwarnings("ignore")

import numpy as np
import pandas as pd
import MetaTrader5 as mt5
import joblib
from itertools import product

from sklearn.model_selection import TimeSeriesSplit
from sklearn.metrics import mean_squared_error
from sklearn.preprocessing import StandardScaler

# Deep Learning libraries
import tensorflow as tf
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import LSTM, Dense, Dropout
from tensorflow.keras.optimizers import Adam

# Your modules for data and feature engineering
from data.data_loader import get_data_mt5
from features.feature_engineering import add_all_ta_features
from features.labeling_schemes import calculate_future_returns

###########################################################
# 1) DATA LOADING & FEATURE ENGINEERING
###########################################################
if not mt5.initialize():
    print("Failed to initialize MT5")
else:
    data = get_data_mt5(symbol="BTCUSD", timeframe=mt5.TIMEFRAME_H4, n_bars=2000)
    mt5.shutdown()

df = add_all_ta_features(data)
df = calculate_future_returns(df).dropna(subset=["future_returns"])

# Create feature matrix X and target vector y
X_full = df.drop(columns=["future_returns"])
y_full = df["future_returns"]

###########################################################
# 2) PREPARE THE TUNING PORTION AND CREATE SEQUENCE DATA
###########################################################
# Define lookback window (number of timesteps in each sequence)
lookback = 10

# Split the first 80% of the data for hyperparameter tuning
split_idx = int(len(X_full) * 0.8)
X_tune = X_full.iloc[:split_idx].values  # Convert to numpy array
y_tune = y_full.iloc[:split_idx].values

# Scale the features before creating sequences
scaler = StandardScaler()
X_tune_scaled = scaler.fit_transform(X_tune)

# Helper function to create sequence data for LSTM
def create_sequences(X, y, lookback):
    X_seq, y_seq = [], []
    for i in range(lookback, len(X)):
        X_seq.append(X[i-lookback:i])
        y_seq.append(y[i])
    return np.array(X_seq), np.array(y_seq)

X_tune_seq, y_tune_seq = create_sequences(X_tune_scaled, y_tune, lookback)
print(f"Tuning portion size (after sequence creation): {len(X_tune_seq)} samples")

###########################################################
# 3) TIME-SERIES CROSS-VALIDATION SETUP
###########################################################
tscv = TimeSeriesSplit(n_splits=3)

###########################################################
# 4) DEFINE THE LSTM MODEL FUNCTION
###########################################################
def build_lstm_model(units, dropout_rate, learning_rate):
    model = Sequential([
        LSTM(units, input_shape=(lookback, X_tune_seq.shape[2])),
        Dropout(dropout_rate),
        Dense(1)
    ])
    model.compile(optimizer=Adam(learning_rate=learning_rate), loss='mean_squared_error')
    return model

###########################################################
# 5) DEFINE HYPERPARAMETER SPACE
###########################################################
param_grid = {
    "units": [30, 50, 70],
    "dropout_rate": [0.1, 0.2, 0.3],
    "learning_rate": [1e-3, 1e-4],
    "epochs": [50, 100],
    "batch_size": [16, 32]
}

# Get all possible hyperparameter combinations
param_combinations = list(product(*param_grid.values()))
print(f"Total hyperparameter combinations: {len(param_combinations)}")

###########################################################
# 6) MANUAL HYPERPARAMETER TUNING
###########################################################
best_mse = float("inf")
best_model = None
best_params = None

for params in param_combinations:
    units, dropout_rate, learning_rate, epochs, batch_size = params
    print(f"\nTraining model with: Units={units}, Dropout={dropout_rate}, LR={learning_rate}, Epochs={epochs}, Batch={batch_size}")

    # Create model
    model = build_lstm_model(units, dropout_rate, learning_rate)

    # Perform cross-validation
    mse_scores = []
    for train_idx, test_idx in tscv.split(X_tune_seq):
        X_train_fold, X_test_fold = X_tune_seq[train_idx], X_tune_seq[test_idx]
        y_train_fold, y_test_fold = y_tune_seq[train_idx], y_tune_seq[test_idx]

        # Train model
        model.fit(X_train_fold, y_train_fold, epochs=epochs, batch_size=batch_size, verbose=0)

        # Evaluate on test set
        y_pred = model.predict(X_test_fold)
        mse = mean_squared_error(y_test_fold, y_pred)
        mse_scores.append(mse)

    avg_mse = np.mean(mse_scores)
    print(f"Avg MSE: {avg_mse:.6f}")

    # Track the best model
    if avg_mse < best_mse:
        best_mse = avg_mse
        best_model = model
        best_params = params

print("\nBest Model Found:")
print(f"Units={best_params[0]}, Dropout={best_params[1]}, LR={best_params[2]}, Epochs={best_params[3]}, Batch={best_params[4]}")
print(f"Best Avg MSE: {best_mse:.6f}")

###########################################################
# 7) SAVE THE BEST MODEL
###########################################################
best_model.save("models/saved_models/best_lstm_model.h5")
print("Saved best LSTM model to 'models/saved_models/best_lstm_model.h5'")


Tuning portion size (after sequence creation): 1589 samples
Total hyperparameter combinations: 72

Training model with: Units=30, Dropout=0.1, LR=0.001, Epochs=50, Batch=16
[1m13/13[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m1s[0m 34ms/step
[1m13/13[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 8ms/step 
[1m13/13[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 10ms/step
Avg MSE: 0.242755

Training model with: Units=30, Dropout=0.1, LR=0.001, Epochs=50, Batch=32
[1m13/13[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m1s[0m 34ms/step
[1m13/13[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 8ms/step
[1m13/13[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 4ms/step 
Avg MSE: 0.012655

Training model with: Units=30, Dropout=0.1, LR=0.001, Epochs=100, Batch=16
[1m13/13[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m1s[0m 36ms/step
[1m13/13[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 6ms/step 
[1m13/13[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m



Avg MSE: 0.034152

Best Model Found:
Units=70, Dropout=0.2, LR=0.001, Epochs=100, Batch=16
Best Avg MSE: 0.003916
Saved best LSTM model to 'models/saved_models/best_lstm_model.h5'


Replacing Grid Search with Optuna in Your LSTM Fine-Tuning Code

In [1]:
import os
import warnings
import numpy as np
import pandas as pd
import MetaTrader5 as mt5
import joblib
import optuna
from sklearn.model_selection import TimeSeriesSplit
from sklearn.metrics import mean_squared_error
from sklearn.preprocessing import StandardScaler
import datetime
# TensorFlow/Keras
import tensorflow as tf
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import LSTM, Dense, Dropout
from tensorflow.keras.optimizers import Adam

import sys
import os
import warnings
from pathlib import Path

# ---------------------------------------------------------------------------
# 1) SET PROJECT ROOT AND UPDATE PATH/WORKING DIRECTORY
# ---------------------------------------------------------------------------
project_root = Path.cwd().parent.parent  # Adjust if your notebook is in notebooks/time_series
sys.path.append(str(project_root))
os.chdir(str(project_root))
warnings.filterwarnings("ignore")


# Your modules
from data.data_loader import get_data_mt5
from features.feature_engineering import add_all_ta_features
from features.labeling_schemes import calculate_future_returns

warnings.filterwarnings("ignore")

###########################################################
# 1) DATA LOADING & FEATURE ENGINEERING
###########################################################
if not mt5.initialize():
    print("Failed to initialize MT5")
else:
    # Fetch 2000 bars from an earlier period for training
    data = get_data_mt5(symbol="BTCUSD", timeframe=mt5.TIMEFRAME_H4, n_bars=2000, start_pos=2000)
    mt5.shutdown()

df = add_all_ta_features(data)
df = calculate_future_returns(df).dropna(subset=["future_returns"])

# Create feature matrix X and target vector y
X_full = df.drop(columns=["future_returns"])
y_full = df["future_returns"]

###########################################################
# 2) PREPARE TRAINING DATA
###########################################################
lookback = 10  # Number of timesteps in each sequence

# Use first 80% of data for hyperparameter tuning
split_idx = int(len(X_full) * 0.8)
X_tune, y_tune = X_full.iloc[:split_idx].values, y_full.iloc[:split_idx].values

# Scale the features before creating sequences
scaler = StandardScaler()
X_tune_scaled = scaler.fit_transform(X_tune)

# Helper function to create sequences
def create_sequences(X, y, lookback):
    X_seq, y_seq = [], []
    for i in range(lookback, len(X)):
        X_seq.append(X[i - lookback : i])
        y_seq.append(y[i])
    return np.array(X_seq), np.array(y_seq)

X_tune_seq, y_tune_seq = create_sequences(X_tune_scaled, y_tune, lookback)
print(f"Tuning dataset size (after sequence creation): {len(X_tune_seq)} samples")

###########################################################
# 3) TIME-SERIES CROSS-VALIDATION
###########################################################
tscv = TimeSeriesSplit(n_splits=3)

###########################################################
# 4) DEFINE LSTM MODEL FUNCTION
###########################################################


def build_lstm_model(trial):
    """Build an LSTM model with hyperparameters chosen by Optuna"""
    units = trial.suggest_int("units", 30, 100, step=10)
    dropout_rate = trial.suggest_float("dropout_rate", 0.1, 0.5, step=0.1)
    learning_rate = trial.suggest_loguniform("learning_rate", 1e-4, 1e-2)
    
    model = Sequential([
        LSTM(units, input_shape=(lookback, X_tune_seq.shape[2])),
        Dropout(dropout_rate),
        Dense(1)
    ])
    
    model.compile(optimizer=Adam(learning_rate=learning_rate), loss="mean_squared_error")

    # Add TensorBoard logging
    log_dir = "logs/optuna/" + datetime.datetime.now().strftime("%Y%m%d-%H%M%S")
    tensorboard_callback = tf.keras.callbacks.TensorBoard(log_dir=log_dir, histogram_freq=1)
    
    return model, tensorboard_callback  # Return both model and TensorBoard callback


###########################################################
# 5) OPTUNA OBJECTIVE FUNCTION
###########################################################
def objective(trial):
    """Objective function for Optuna hyperparameter tuning."""
    mse_scores = []
    
    # Create time-series splits
    tscv = TimeSeriesSplit(n_splits=3)

    for train_idx, test_idx in tscv.split(X_tune_seq):
        # Build a fresh model each fold (includes hyperparams from trial)
        model, tensorboard_callback = build_lstm_model(trial)
        
        # Prepare the fold's train/test data
        X_train_fold, X_test_fold = X_tune_seq[train_idx], X_tune_seq[test_idx]
        y_train_fold, y_test_fold = y_tune_seq[train_idx], y_tune_seq[test_idx]

        # Early stopping to prevent overfitting & reduce training time
        early_stopping = tf.keras.callbacks.EarlyStopping(
            monitor='loss',
            patience=5,
            restore_best_weights=True
        )
        
        # Fit the model on this fold's data
        model.fit(
            X_train_fold, 
            y_train_fold,
            epochs=50,             # You can adjust or tune this as needed
            batch_size=32,         # You can also tune this if desired
            verbose=0,             # 0 = silent, 1 = progress bar, 2 = one line per epoch
            callbacks=[tensorboard_callback, early_stopping]
        )
        
        # Predict and calculate MSE for this fold
        y_pred = model.predict(X_test_fold)
        mse = mean_squared_error(y_test_fold, y_pred)
        mse_scores.append(mse)
    
    # Return average MSE across all folds
    return np.mean(mse_scores)


###########################################################
# 6) RUN OPTUNA OPTIMIZATION
###########################################################
study = optuna.create_study(direction="minimize")
study.optimize(objective, n_trials=30, timeout=1800)  # 30 trials or 30 min max

print("\nBest Trial Found:")
print("MSE:", study.best_trial.value)
print("Best Params:", study.best_trial.params)

###########################################################
# 7) RETRAIN FINAL MODEL WITH BEST PARAMS
###########################################################
best_params = study.best_trial.params

final_model = Sequential([
    LSTM(best_params["units"], input_shape=(lookback, X_tune_seq.shape[2])),
    Dropout(best_params["dropout_rate"]),
    Dense(1)
])

final_model.compile(optimizer=Adam(learning_rate=best_params["learning_rate"]), loss="mean_squared_error")

final_model.fit(X_tune_seq, y_tune_seq, epochs=100, batch_size=32, verbose=1)

###########################################################
# 8) SAVE BEST MODEL
###########################################################
final_model.save("models/saved_models/best_lstm_optuna.h5")
joblib.dump(scaler, "models/saved_models/lstm_scaler.pkl")

print("Saved best LSTM model to 'models/saved_models/best_lstm_optuna.h5'")
print("Saved scaler to 'models/saved_models/lstm_scaler.pkl'")

###########################################################
# 9) VISUALIZE OPTUNA RESULTS
###########################################################
import optuna.visualization as ov
ov.plot_optimization_history(study).show()
ov.plot_param_importances(study).show()


[I 2025-03-02 13:59:32,198] A new study created in memory with name: no-name-6774c4b0-c3e1-416f-be0d-10ba22412034


Tuning dataset size (after sequence creation): 1589 samples
[1m13/13[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 14ms/step
[1m13/13[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 14ms/step
[1m13/13[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 17ms/step


[I 2025-03-02 14:00:18,394] Trial 0 finished with value: 0.03359022160300049 and parameters: {'units': 50, 'dropout_rate': 0.1, 'learning_rate': 0.0002141382969019211}. Best is trial 0 with value: 0.03359022160300049.


[1m13/13[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 17ms/step
[1m13/13[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 16ms/step
[1m13/13[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 17ms/step


[I 2025-03-02 14:01:14,518] Trial 1 finished with value: 0.011693350924231625 and parameters: {'units': 60, 'dropout_rate': 0.1, 'learning_rate': 0.0007579605289107738}. Best is trial 1 with value: 0.011693350924231625.


[1m13/13[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 18ms/step
[1m13/13[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 17ms/step
[1m13/13[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 19ms/step


[I 2025-03-02 14:02:09,110] Trial 2 finished with value: 0.008973074438954714 and parameters: {'units': 50, 'dropout_rate': 0.5, 'learning_rate': 0.005816952491529118}. Best is trial 2 with value: 0.008973074438954714.


[1m13/13[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m1s[0m 31ms/step
[1m13/13[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 17ms/step
[1m13/13[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 20ms/step


[I 2025-03-02 14:03:08,957] Trial 3 finished with value: 0.008785956740474077 and parameters: {'units': 60, 'dropout_rate': 0.30000000000000004, 'learning_rate': 0.0026640284047387123}. Best is trial 3 with value: 0.008785956740474077.


[1m13/13[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 17ms/step
[1m13/13[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 19ms/step
[1m13/13[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 18ms/step


[I 2025-03-02 14:04:07,416] Trial 4 finished with value: 0.0386435432526649 and parameters: {'units': 70, 'dropout_rate': 0.1, 'learning_rate': 0.0001311493233121549}. Best is trial 3 with value: 0.008785956740474077.


[1m13/13[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 17ms/step
[1m13/13[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 17ms/step
[1m13/13[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 17ms/step


[I 2025-03-02 14:05:04,980] Trial 5 finished with value: 0.010665920439219283 and parameters: {'units': 100, 'dropout_rate': 0.30000000000000004, 'learning_rate': 0.0012430848055967766}. Best is trial 3 with value: 0.008785956740474077.


[1m13/13[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 18ms/step
[1m13/13[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 16ms/step
[1m13/13[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 21ms/step


[I 2025-03-02 14:05:57,359] Trial 6 finished with value: 0.013865590674700777 and parameters: {'units': 40, 'dropout_rate': 0.5, 'learning_rate': 0.003617477761201328}. Best is trial 3 with value: 0.008785956740474077.


[1m13/13[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 21ms/step
[1m13/13[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 17ms/step
[1m13/13[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m1s[0m 19ms/step


[I 2025-03-02 14:07:03,661] Trial 7 finished with value: 0.02117303577896046 and parameters: {'units': 90, 'dropout_rate': 0.1, 'learning_rate': 0.005442533622576849}. Best is trial 3 with value: 0.008785956740474077.


[1m13/13[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 17ms/step
[1m13/13[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m1s[0m 45ms/step
[1m13/13[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m1s[0m 28ms/step


[I 2025-03-02 14:08:04,587] Trial 8 finished with value: 0.027115211776129477 and parameters: {'units': 70, 'dropout_rate': 0.30000000000000004, 'learning_rate': 0.0003609700673364088}. Best is trial 3 with value: 0.008785956740474077.


[1m13/13[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 17ms/step
[1m13/13[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 16ms/step
[1m13/13[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 19ms/step


[I 2025-03-02 14:08:55,371] Trial 9 finished with value: 0.023058867846149095 and parameters: {'units': 80, 'dropout_rate': 0.30000000000000004, 'learning_rate': 0.00015717564530105246}. Best is trial 3 with value: 0.008785956740474077.


[1m13/13[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m1s[0m 22ms/step
[1m13/13[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 18ms/step
[1m13/13[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 17ms/step


[I 2025-03-02 14:09:52,810] Trial 10 finished with value: 0.015970026150234087 and parameters: {'units': 30, 'dropout_rate': 0.4, 'learning_rate': 0.0020410750027874346}. Best is trial 3 with value: 0.008785956740474077.


[1m13/13[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 18ms/step
[1m13/13[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m1s[0m 34ms/step
[1m13/13[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 17ms/step


[I 2025-03-02 14:10:50,188] Trial 11 finished with value: 0.01341771463982521 and parameters: {'units': 50, 'dropout_rate': 0.5, 'learning_rate': 0.008407711399257695}. Best is trial 3 with value: 0.008785956740474077.


[1m13/13[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 16ms/step
[1m13/13[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 17ms/step
[1m13/13[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 18ms/step


[I 2025-03-02 14:11:46,253] Trial 12 finished with value: 0.013165788392186364 and parameters: {'units': 50, 'dropout_rate': 0.4, 'learning_rate': 0.002874778523184124}. Best is trial 3 with value: 0.008785956740474077.


[1m13/13[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 17ms/step
[1m13/13[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 16ms/step
[1m13/13[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m1s[0m 32ms/step


[I 2025-03-02 14:12:42,202] Trial 13 finished with value: 0.010943142339874581 and parameters: {'units': 30, 'dropout_rate': 0.2, 'learning_rate': 0.009777398393617055}. Best is trial 3 with value: 0.008785956740474077.


[1m13/13[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 23ms/step
[1m13/13[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 22ms/step
[1m13/13[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m1s[0m 32ms/step


[I 2025-03-02 14:13:46,029] Trial 14 finished with value: 0.012851531614969882 and parameters: {'units': 60, 'dropout_rate': 0.4, 'learning_rate': 0.001451212528695668}. Best is trial 3 with value: 0.008785956740474077.


[1m13/13[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 21ms/step
[1m13/13[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 19ms/step
[1m13/13[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 18ms/step


[I 2025-03-02 14:14:49,500] Trial 15 finished with value: 0.016150412924882705 and parameters: {'units': 40, 'dropout_rate': 0.2, 'learning_rate': 0.0006777592282439003}. Best is trial 3 with value: 0.008785956740474077.


[1m13/13[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 21ms/step
[1m13/13[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m1s[0m 29ms/step
[1m13/13[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 22ms/step


[I 2025-03-02 14:15:51,787] Trial 16 finished with value: 0.008871015248399533 and parameters: {'units': 80, 'dropout_rate': 0.5, 'learning_rate': 0.004491636623379885}. Best is trial 3 with value: 0.008785956740474077.


[1m13/13[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 18ms/step
[1m13/13[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 18ms/step
[1m13/13[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 20ms/step


[I 2025-03-02 14:16:49,215] Trial 17 finished with value: 0.01148415676610758 and parameters: {'units': 90, 'dropout_rate': 0.2, 'learning_rate': 0.003101373938813873}. Best is trial 3 with value: 0.008785956740474077.


[1m13/13[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 20ms/step
[1m13/13[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 18ms/step
[1m13/13[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m1s[0m 42ms/step


[I 2025-03-02 14:18:03,374] Trial 18 finished with value: 0.01892301953618242 and parameters: {'units': 80, 'dropout_rate': 0.4, 'learning_rate': 0.002062415058087598}. Best is trial 3 with value: 0.008785956740474077.


[1m13/13[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 17ms/step
[1m13/13[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 18ms/step
[1m13/13[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 17ms/step


[I 2025-03-02 14:19:06,538] Trial 19 finished with value: 0.016835121690561863 and parameters: {'units': 80, 'dropout_rate': 0.5, 'learning_rate': 0.005061758780488287}. Best is trial 3 with value: 0.008785956740474077.


[1m13/13[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m1s[0m 37ms/step
[1m13/13[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m1s[0m 30ms/step
[1m13/13[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 26ms/step


[I 2025-03-02 14:20:27,750] Trial 20 finished with value: 0.010563366076005323 and parameters: {'units': 100, 'dropout_rate': 0.4, 'learning_rate': 0.00048548757562026407}. Best is trial 3 with value: 0.008785956740474077.


[1m13/13[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 17ms/step
[1m13/13[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 21ms/step
[1m13/13[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 17ms/step


[I 2025-03-02 14:21:26,931] Trial 21 finished with value: 0.012442541700884517 and parameters: {'units': 60, 'dropout_rate': 0.5, 'learning_rate': 0.005444456719993046}. Best is trial 3 with value: 0.008785956740474077.


[1m13/13[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 19ms/step
[1m13/13[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 16ms/step
[1m13/13[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 16ms/step


[I 2025-03-02 14:22:17,152] Trial 22 finished with value: 0.036022487773819924 and parameters: {'units': 70, 'dropout_rate': 0.5, 'learning_rate': 0.006705175152781568}. Best is trial 3 with value: 0.008785956740474077.


[1m13/13[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m1s[0m 23ms/step
[1m13/13[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m1s[0m 27ms/step
[1m13/13[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m1s[0m 25ms/step


[I 2025-03-02 14:23:16,906] Trial 23 finished with value: 0.014294800788548882 and parameters: {'units': 40, 'dropout_rate': 0.5, 'learning_rate': 0.00411399047224363}. Best is trial 3 with value: 0.008785956740474077.


[1m13/13[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 17ms/step
[1m13/13[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 16ms/step
[1m13/13[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 17ms/step


[I 2025-03-02 14:24:10,504] Trial 24 finished with value: 0.017998271678696613 and parameters: {'units': 50, 'dropout_rate': 0.4, 'learning_rate': 0.002031285352175662}. Best is trial 3 with value: 0.008785956740474077.


[1m13/13[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 18ms/step
[1m13/13[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 18ms/step
[1m13/13[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 17ms/step


[I 2025-03-02 14:25:10,287] Trial 25 finished with value: 0.02455422659648186 and parameters: {'units': 60, 'dropout_rate': 0.30000000000000004, 'learning_rate': 0.002580961705941228}. Best is trial 3 with value: 0.008785956740474077.


[1m13/13[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 17ms/step
[1m13/13[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 19ms/step
[1m13/13[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 17ms/step


[I 2025-03-02 14:26:02,922] Trial 26 finished with value: 0.00439279236214546 and parameters: {'units': 90, 'dropout_rate': 0.2, 'learning_rate': 0.007333768949602064}. Best is trial 26 with value: 0.00439279236214546.


[1m13/13[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 18ms/step
[1m13/13[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 18ms/step
[1m13/13[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 21ms/step


[I 2025-03-02 14:26:55,875] Trial 27 finished with value: 0.009840395301836473 and parameters: {'units': 90, 'dropout_rate': 0.2, 'learning_rate': 0.007775137525053662}. Best is trial 26 with value: 0.00439279236214546.


[1m13/13[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 20ms/step
[1m13/13[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m1s[0m 40ms/step
[1m13/13[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m1s[0m 23ms/step


[I 2025-03-02 14:28:06,191] Trial 28 finished with value: 0.012671266282127192 and parameters: {'units': 80, 'dropout_rate': 0.2, 'learning_rate': 0.0014736610761965474}. Best is trial 26 with value: 0.00439279236214546.


[1m13/13[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m1s[0m 38ms/step
[1m13/13[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 20ms/step
[1m13/13[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 20ms/step


[I 2025-03-02 14:29:17,595] Trial 29 finished with value: 0.0077593635963503 and parameters: {'units': 90, 'dropout_rate': 0.2, 'learning_rate': 0.0038795309928665676}. Best is trial 26 with value: 0.00439279236214546.



Best Trial Found:
MSE: 0.00439279236214546
Best Params: {'units': 90, 'dropout_rate': 0.2, 'learning_rate': 0.007333768949602064}
Epoch 1/100
[1m50/50[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m3s[0m 8ms/step - loss: 0.3021
Epoch 2/100
[1m50/50[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 9ms/step - loss: 0.0122
Epoch 3/100
[1m50/50[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m1s[0m 10ms/step - loss: 0.0052
Epoch 4/100
[1m50/50[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m1s[0m 10ms/step - loss: 0.0021
Epoch 5/100
[1m50/50[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m1s[0m 10ms/step - loss: 0.0013
Epoch 6/100
[1m50/50[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m1s[0m 12ms/step - loss: 8.1625e-04
Epoch 7/100
[1m50/50[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m1s[0m 12ms/step - loss: 5.9545e-04
Epoch 8/100
[1m50/50[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m1s[0m 11ms/step - loss: 4.2944e-04
Epoch 9/100
[1m50/50[0m [32m━━━━━━━━━━━━━━━━━━━━



Saved best LSTM model to 'models/saved_models/best_lstm_optuna.h5'
Saved scaler to 'models/saved_models/lstm_scaler.pkl'


 Start TensorBoard After Running the Script

In [None]:
# Once training completes, start TensorBoard from the terminal:
tensorboard --logdir logs/optuna

In [None]:
# or from Jupyter Notebook:
%load_ext tensorboard


In [None]:
%tensorboard --logdir logs/optuna

More improvements techqniques     
✅ L1/L2 Regularization (Encourages small weights & sparsity)      
✅ Dropout (Prevents over-reliance on specific neurons)      
✅ Batch Normalization (Stabilizes training & helps generalization)      
✅ Early Stopping (Stops training when validation loss worsens)     

✅ Lower Regularization (1e-6 to 1e-5) – Less restriction, avoids underfitting.
✅ Reduce Dropout (0.1) – Allows the model to capture patterns better.
✅ Higher Learning Rate (1e-3 to 5e-3) – Helps avoid slow convergence.
✅ Allow More Training Iterations (100 epochs, patience=10).
✅ Trade Signal Threshold Adjusted (0.0005) – More signals, more trades.

In [1]:
import os
import warnings
import numpy as np
import pandas as pd
import MetaTrader5 as mt5
import joblib
import optuna
from sklearn.model_selection import TimeSeriesSplit
from sklearn.metrics import mean_squared_error
from sklearn.preprocessing import StandardScaler
import datetime
import tensorflow as tf
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import LSTM, Dense, Dropout
from tensorflow.keras.optimizers import Adam
from tensorflow.keras.regularizers import l2


import sys
import os
import warnings
from pathlib import Path

# ---------------------------------------------------------------------------
# 1) SET PROJECT ROOT AND UPDATE PATH/WORKING DIRECTORY
# ---------------------------------------------------------------------------
project_root = Path.cwd().parent.parent
sys.path.append(str(project_root))
os.chdir(str(project_root))
warnings.filterwarnings("ignore")


# ------------------------------
# 1️⃣ Load Data
# ------------------------------
from data.data_loader import get_data_mt5
from features.feature_engineering import add_all_ta_features
from features.labeling_schemes import calculate_future_returns


if not mt5.initialize():
    print("Failed to initialize MT5")
else:
    data = get_data_mt5(symbol="BTCUSD", timeframe=mt5.TIMEFRAME_H4, n_bars=2000, start_pos=2000)
    mt5.shutdown()

df = add_all_ta_features(data)
df = calculate_future_returns(df).dropna(subset=["future_returns"])

X_full = df.drop(columns=["future_returns"])
y_full = df["future_returns"]

lookback = 10  
split_idx = int(len(X_full) * 0.8)
X_tune, y_tune = X_full.iloc[:split_idx].values, y_full.iloc[:split_idx].values

scaler = StandardScaler()
X_tune_scaled = scaler.fit_transform(X_tune)

def create_sequences(X, y, lookback):
    X_seq, y_seq = [], []
    for i in range(lookback, len(X)):
        X_seq.append(X[i - lookback : i])
        y_seq.append(y[i])
    return np.array(X_seq), np.array(y_seq)

X_tune_seq, y_tune_seq = create_sequences(X_tune_scaled, y_tune, lookback)

# ------------------------------
# 2️⃣ Optuna Model & Objective Function
# ------------------------------
def build_lstm_model(trial):
    """Build an optimized LSTM model"""

    # 🔥 Slightly Higher Learning Rate
    learning_rate = trial.suggest_loguniform("learning_rate", 1e-3, 5e-3)
    
    # ✅ FIX LSTM Units
    units = 50  

    # 📉 Reduce L2 Regularization
    l2_reg = trial.suggest_loguniform("l2_reg", 1e-6, 1e-5)  

    # 🔁 Reduce Dropout to 0.1
    dropout_rate = 0.1  

    model = Sequential([
        LSTM(units, input_shape=(lookback, X_tune_seq.shape[2]), kernel_regularizer=l2(l2_reg)),
        Dropout(dropout_rate),
        Dense(1)
    ])
    
    model.compile(optimizer=Adam(learning_rate=learning_rate), loss="mean_squared_error")
    
    # TensorBoard Logging
    log_dir = "logs/optuna/" + datetime.datetime.now().strftime("%Y%m%d-%H%M%S")
    tensorboard_callback = tf.keras.callbacks.TensorBoard(log_dir=log_dir, histogram_freq=1)

    return model, tensorboard_callback

# ------------------------------
# 3️⃣ Optuna Search
# ------------------------------
def objective(trial):
    """Optuna tuning objective function"""
    tscv = TimeSeriesSplit(n_splits=3)
    mse_scores = []

    for train_idx, test_idx in tscv.split(X_tune_seq):
        # Build a *fresh* model for each fold
        model, tensorboard_callback = build_lstm_model(trial)

        X_train_fold, X_test_fold = X_tune_seq[train_idx], X_tune_seq[test_idx]
        y_train_fold, y_test_fold = y_tune_seq[train_idx], y_tune_seq[test_idx]

        early_stopping = tf.keras.callbacks.EarlyStopping(
            monitor='loss',
            patience=10,
            restore_best_weights=True
        )

        # Train on the fold's training data
        model.fit(
            X_train_fold,
            y_train_fold,
            epochs=100,
            batch_size=32,
            verbose=0,
            callbacks=[tensorboard_callback, early_stopping]
        )

        # Evaluate on the fold's test data
        y_pred = model.predict(X_test_fold)
        fold_mse = mean_squared_error(y_test_fold, y_pred)
        mse_scores.append(fold_mse)

    # Return the *average* MSE across folds
    return np.mean(mse_scores)


study = optuna.create_study(direction="minimize")
study.optimize(objective, n_trials=20, timeout=1200)

print("\nBest Trial Found:")
print("MSE:", study.best_trial.value)
print("Best Params:", study.best_trial.params)

# ------------------------------
# 4️⃣ Train Final Model with Best Hyperparameters
# ------------------------------
best_params = study.best_trial.params

final_model = Sequential([
    LSTM(50, input_shape=(lookback, X_tune_seq.shape[2]), kernel_regularizer=l2(best_params["l2_reg"])),
    Dropout(0.1),
    Dense(1)
])

final_model.compile(optimizer=Adam(learning_rate=best_params["learning_rate"]), loss="mean_squared_error")

final_model.fit(X_tune_seq, y_tune_seq, epochs=100, batch_size=32, verbose=1,
                callbacks=[tf.keras.callbacks.EarlyStopping(monitor='loss', patience=10, restore_best_weights=True)])

# ------------------------------
# 5️⃣ Save the Final Model
# ------------------------------
final_model.save("models/saved_models/best_lstm_fixed.h5")
joblib.dump(scaler, "models/saved_models/lstm_scaler.pkl")

print("Saved optimized LSTM model!")

# ------------------------------
# 6️⃣ Adjust Trade Signal Threshold
# ------------------------------
threshold = 0.0005  # **Reduce threshold to allow more trades**
fees = 0.0002  

# ------------------------------
# 7️⃣ Optuna Visualization
# ------------------------------
import optuna.visualization as ov
ov.plot_optimization_history(study).show()
ov.plot_param_importances(study).show()


[I 2025-03-02 15:15:52,233] A new study created in memory with name: no-name-6bcf8ea7-d197-4a94-a1f9-887d015aa054


[1m13/13[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 17ms/step
[1m13/13[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 16ms/step
[1m13/13[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m1s[0m 32ms/step


[I 2025-03-02 15:17:08,836] Trial 0 finished with value: 0.015822053043072888 and parameters: {'learning_rate': 0.0015331463632807462, 'l2_reg': 7.894338979370795e-06}. Best is trial 0 with value: 0.015822053043072888.


[1m13/13[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 20ms/step
[1m13/13[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 17ms/step
[1m13/13[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 16ms/step


[I 2025-03-02 15:18:42,991] Trial 1 finished with value: 0.006239539412549601 and parameters: {'learning_rate': 0.002566940622775816, 'l2_reg': 1.0106041571961005e-06}. Best is trial 1 with value: 0.006239539412549601.


[1m13/13[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 18ms/step
[1m13/13[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 26ms/step
[1m13/13[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m1s[0m 41ms/step


[I 2025-03-02 15:20:21,053] Trial 2 finished with value: 0.010473627900717118 and parameters: {'learning_rate': 0.0010769054159742544, 'l2_reg': 2.4312373523579494e-06}. Best is trial 1 with value: 0.006239539412549601.


[1m13/13[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 21ms/step
[1m13/13[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m1s[0m 67ms/step
[1m13/13[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m1s[0m 34ms/step


[I 2025-03-02 15:22:18,436] Trial 3 finished with value: 0.013585853196053635 and parameters: {'learning_rate': 0.0014823634692929183, 'l2_reg': 9.692536394351257e-06}. Best is trial 1 with value: 0.006239539412549601.


[1m13/13[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m1s[0m 25ms/step
[1m13/13[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m1s[0m 33ms/step
[1m13/13[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 17ms/step


[I 2025-03-02 15:24:12,058] Trial 4 finished with value: 0.006980291323512781 and parameters: {'learning_rate': 0.004296706078070114, 'l2_reg': 2.9610925284429926e-06}. Best is trial 1 with value: 0.006239539412549601.


[1m13/13[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 16ms/step
[1m13/13[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 16ms/step
[1m13/13[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m1s[0m 20ms/step


[I 2025-03-02 15:25:40,275] Trial 5 finished with value: 0.008932971236276026 and parameters: {'learning_rate': 0.001989280407069289, 'l2_reg': 2.771768857272482e-06}. Best is trial 1 with value: 0.006239539412549601.


[1m13/13[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 17ms/step
[1m13/13[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 15ms/step
[1m13/13[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 18ms/step


[I 2025-03-02 15:27:10,608] Trial 6 finished with value: 0.009661742086965656 and parameters: {'learning_rate': 0.0018347193879718023, 'l2_reg': 4.349933130497905e-06}. Best is trial 1 with value: 0.006239539412549601.


[1m13/13[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 19ms/step
[1m13/13[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 18ms/step
[1m13/13[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 16ms/step


[I 2025-03-02 15:28:38,673] Trial 7 finished with value: 0.025089144150007996 and parameters: {'learning_rate': 0.0017652683684191309, 'l2_reg': 2.934713559942602e-06}. Best is trial 1 with value: 0.006239539412549601.


[1m13/13[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 19ms/step
[1m13/13[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m1s[0m 60ms/step
[1m13/13[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 16ms/step


[I 2025-03-02 15:30:09,733] Trial 8 finished with value: 0.008470815311694998 and parameters: {'learning_rate': 0.0013689489094159087, 'l2_reg': 5.569707849690135e-06}. Best is trial 1 with value: 0.006239539412549601.


[1m13/13[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 19ms/step
[1m13/13[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 15ms/step
[1m13/13[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 14ms/step


[I 2025-03-02 15:31:33,842] Trial 9 finished with value: 0.01593259668165293 and parameters: {'learning_rate': 0.004273603770061773, 'l2_reg': 3.6963233294449463e-06}. Best is trial 1 with value: 0.006239539412549601.


[1m13/13[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 16ms/step
[1m13/13[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 19ms/step
[1m13/13[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 15ms/step


[I 2025-03-02 15:33:11,179] Trial 10 finished with value: 0.009788353123443986 and parameters: {'learning_rate': 0.0029227836939223523, 'l2_reg': 1.1328355591887656e-06}. Best is trial 1 with value: 0.006239539412549601.


[1m13/13[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 20ms/step
[1m13/13[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 16ms/step
[1m13/13[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m1s[0m 27ms/step


[I 2025-03-02 15:34:45,138] Trial 11 finished with value: 0.008168580782104142 and parameters: {'learning_rate': 0.003023122902441207, 'l2_reg': 1.0818302694717924e-06}. Best is trial 1 with value: 0.006239539412549601.


[1m13/13[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 20ms/step
[1m13/13[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 18ms/step
[1m13/13[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 17ms/step


[I 2025-03-02 15:36:25,365] Trial 12 finished with value: 0.011280362253539428 and parameters: {'learning_rate': 0.004843575801085893, 'l2_reg': 1.7686436995263091e-06}. Best is trial 1 with value: 0.006239539412549601.



Best Trial Found:
MSE: 0.006239539412549601
Best Params: {'learning_rate': 0.002566940622775816, 'l2_reg': 1.0106041571961005e-06}
Epoch 1/100
[1m50/50[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m3s[0m 8ms/step - loss: 0.0991
Epoch 2/100
[1m50/50[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m1s[0m 10ms/step - loss: 0.0198
Epoch 3/100
[1m50/50[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m1s[0m 11ms/step - loss: 0.0098
Epoch 4/100
[1m50/50[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 8ms/step - loss: 0.0068
Epoch 5/100
[1m50/50[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 8ms/step - loss: 0.0045
Epoch 6/100
[1m50/50[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m1s[0m 10ms/step - loss: 0.0031
Epoch 7/100
[1m50/50[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m1s[0m 10ms/step - loss: 0.0022
Epoch 8/100
[1m50/50[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m1s[0m 9ms/step - loss: 0.0021
Epoch 9/100
[1m50/50[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m



Saved optimized LSTM model!


## Part 3: Backtesting & Performance Evaluation

**Objective:**  
Evaluate how well the trained model performs on unseen data, simulating real trades.

**Tasks:**
- Use walk-forward or expanding splits to mimic “live” conditions  
- Convert model predictions to signals ([-1, 0, +1] or buy/sell/hold)  
- Run a simple backtest script or VectorBT for performance metrics  
- Calculate returns, Sharpe ratio, drawdowns, confusion matrix, etc.  
- Visualize results (equity curve, trades, etc.) to judge strategy viability  

In [None]:
import sys
import os
import warnings
from pathlib import Path

# ---------------------------------------------------------------------------
# 1) SET PROJECT ROOT AND UPDATE PATH/WORKING DIRECTORY
# ---------------------------------------------------------------------------
project_root = Path.cwd().parent.parent
sys.path.append(str(project_root))
os.chdir(str(project_root))
warnings.filterwarnings("ignore")

import numpy as np
import pandas as pd
import MetaTrader5 as mt5
import vectorbt as vbt
import tensorflow as tf
from tensorflow.keras.models import load_model
from sklearn.preprocessing import StandardScaler
from sklearn.metrics import mean_squared_error
from tensorflow.keras.optimizers import Adam  # Import Adam optimizer

# Our modules
from data.data_loader import get_data_mt5
from features.feature_engineering import add_all_ta_features
from features.labeling_schemes import calculate_future_returns

###########################################################
# 1) DATA LOADING & FEATURE ENGINEERING
###########################################################
if not mt5.initialize():
    print("Failed to initialize MT5")
else:
    # Fetch 2000 most recent bars for backtesting
    data = get_data_mt5(symbol="BTCUSD", timeframe=mt5.TIMEFRAME_H4, n_bars=2000, start_pos=0)
    mt5.shutdown()

df = add_all_ta_features(data)
df = calculate_future_returns(df).dropna(subset=["future_returns"])
df = df.sort_index()  # Ensure chronological order

# Feature Matrix (X) and Target (y)
X = df.drop(columns=["future_returns"])
y = df["future_returns"]

print(f"Full Dataset Size: {len(X)} bars")

###########################################################
# 2) LOAD BEST LSTM MODEL & PREPARE BACKTESTING
###########################################################
# Load the best trained model (from your training dataset)
best_model = load_model("models/saved_models/best_lstm_fixed.h5")
# Compile if needed (Keras 2.x+ may require re-compilation for further .predict usage)
best_model.compile(optimizer=Adam(learning_rate=0.001), loss="mean_squared_error")

# Hyperparameters
threshold = 0.0005  # Min predicted return for a trade
fees = 0.0002       # 0.02% transaction cost per trade
lookback = 10       # Same lookback window used in training

# Standardize the dataset
scaler = StandardScaler()
X_scaled = scaler.fit_transform(X)

###########################################################
# 3) CREATE SEQUENCES FOR THE FULL DATASET
###########################################################
def create_sequences(X, y, lookback):
    X_seq, y_seq = [], []
    for i in range(lookback, len(X)):
        X_seq.append(X[i-lookback:i])
        y_seq.append(y[i])
    return np.array(X_seq), np.array(y_seq)

X_seq, y_seq = create_sequences(X_scaled, y.values, lookback)

print(f"Total Sequences Created: {len(X_seq)}")

###########################################################
# 4) MAKE PREDICTIONS (NO RETRAINING)
###########################################################
print("\nPredicting on out-of-sample data (No folds, no retraining)...")
preds = best_model.predict(X_seq, verbose=0).reshape(-1)  # flatten to (n,)
mse = mean_squared_error(y_seq, preds)

###########################################################
# 5) BACKTEST via target exposure (-1, 0, +1)  << UPDATED
###########################################################
# Map regression predictions -> exposure {-1, 0, +1}
exposure = np.where(preds > threshold, 1.0,
            np.where(preds < -threshold, -1.0, 0.0)).astype(float)

# Align prices exactly to the prediction rows (no padding)
df_test = df.iloc[lookback:].copy()
close = df_test["close"]
exposure = pd.Series(exposure, index=close.index)

# Optional: trade on the next bar to avoid look-ahead (set to 0 for same-bar)
execution_lag = 1
if execution_lag > 0:
    exposure = exposure.shift(execution_lag).fillna(0.0)

print("\nRunning Full Backtest on the Last 2000 Bars...")

pf = vbt.Portfolio.from_orders(
    close=close,
    size=exposure,              # -1 short, 0 flat, +1 long
    size_type='targetpercent',
    init_cash=10000,
    freq='4H',
    fees=fees
)

total_return = pf.total_return()
sharpe_ratio = pf.sharpe_ratio()

# Print final results
print(f"\nFull Backtest Results:")
print(f"MSE={mse:.2e}, Return={total_return:.2f}%, Sharpe={sharpe_ratio:.2f}")
print(pf.stats())

# Plot the backtest results
fig = pf.plot()
fig.show()




Full Dataset Size: 1999 bars
Total Sequences Created: 1989

Predicting on out-of-sample data (No folds, no retraining)...
[1m63/63[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 4ms/step

Running Full Backtest on the Last 2000 Bars...

Full Backtest Results:
MSE=1.71e-04, Return=0.18%, Sharpe=0.63
Start                         2024-04-05 04:00:00
End                           2025-03-02 12:00:00
Period                          331 days 12:00:00
Start Value                               10000.0
End Value                            11832.509617
Total Return [%]                        18.325096
Benchmark Return [%]                    26.605348
Max Gross Exposure [%]                      100.0
Total Fees Paid                        357.100329
Max Drawdown [%]                        37.177513
Max Drawdown Duration           114 days 12:00:00
Total Trades                                   85
Total Closed Trades                            84
Total Open Trades                          