## DRL Trading System Modules

This document provides a simple overview of the modular 
components in the DRL trading system.

### 🏗️ Modular Architecture

The system is organized into the following modules:

1. ConfigManager: Centralizes all configuration settings for 
    the system.
2. DataProcessor: Handles data loading, preprocessing, and 
    feature engineering.
3. StateActionReward: Defines the state space, action space, 
    and the logic for reward calculation.
4. TradingEnvironment: Provides the modular trading environment
    where the agent operates.
5. ModelTrainer: Manages the training of A2C and TD3 models.
6. PerformanceAnalyzer: Conducts performance analysis and 
    creates visualizations of the results.
7. HyperparameterOptimizer: Implements strategies for 
    hyperparameter optimization.
8. RollingWindowTrainer: Manages the rolling window training 
    and validation process.

## 15D State Space Methodology with Rolling Window Training

This notebook implements the complete DRL trading methodology from the original research.

### 📋 **Complete Workflow**
1. **Data Loading & Feature Engineering** - Load and process ETH data with the 15D feature set
2. **Rolling Window Diagnostics** - Train on rolling windows for an initial evaluation, then test on each window's evaluation set to finalize the hyperparameters
3. **Full Training with Validation** - Complete training with proper validation scoring
4. **Final Testing & Analysis** - Comprehensive evaluation on the test set with visualizations
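The rolling window step in the workflow above comes down to index arithmetic over the training rows. The sketch below is illustrative only: it assumes 1-minute bars and 30-day months (which is consistent with the 259,200-row windows and 43,200-row evaluation periods the notebook logs during training), and it assumes each window advances by one evaluation period. The names `Window` and `rolling_windows` are hypothetical and are not taken from `RollingWindowTrainer`.

```python
from typing import Iterator, NamedTuple

MINUTES_PER_DAY = 24 * 60
DAYS_PER_MONTH = 30  # assumption: a "month" is a fixed 30-day block


class Window(NamedTuple):
    train_start: int
    train_end: int   # exclusive
    eval_start: int
    eval_end: int    # exclusive


def rolling_windows(n_rows: int, train_months: int = 6, eval_months: int = 1,
                    max_windows: int = 3) -> Iterator[Window]:
    """Yield up to max_windows train/eval index ranges over n_rows of 1-minute bars."""
    train_size = train_months * DAYS_PER_MONTH * MINUTES_PER_DAY
    eval_size = eval_months * DAYS_PER_MONTH * MINUTES_PER_DAY
    step = eval_size  # assumption: slide forward by one evaluation period
    start = 0
    for _ in range(max_windows):
        eval_end = start + train_size + eval_size
        if eval_end > n_rows:
            break  # not enough data left for a full window
        yield Window(start, start + train_size, start + train_size, eval_end)
        start += step


for i, w in enumerate(rolling_windows(1_318_384), start=1):
    print(f"Window {i}: train [{w.train_start:,}:{w.train_end:,}], "
          f"eval [{w.eval_start:,}:{w.eval_end:,}]")
```

Because the evaluation rows always follow the training rows in time, no window is scored on data that precedes what it was trained on.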

### 🎯 **Key Features**
- ✅ Complete 15D state space as per methodology
- ✅ Multi-component reward function optimization
- ✅ Rolling window training protocol
- ✅ Comprehensive hyperparameter testing
- ✅ Advanced risk management
- ✅ Temporal data splitting
- ✅ Modular architecture for maintainability


In [1]:
# ===================== COMPLETE IMPORTS & CONFIGURATION =====================

# Core libraries
import os
import sys
import time
import json
import numpy as np
import pandas as pd
from datetime import datetime, timedelta
from typing import Dict, List, Tuple, Optional, Any, Union
import warnings
from pathlib import Path
from copy import deepcopy
warnings.filterwarnings('ignore')

# Machine Learning & Deep Reinforcement Learning
import torch
import gymnasium as gym
from stable_baselines3 import A2C, TD3
from stable_baselines3.common.utils import set_random_seed

# Visualization & Analysis
import matplotlib.pyplot as plt
import matplotlib.dates as mdates
import seaborn as sns

# Statistical Analysis
from scipy import stats

# Progress tracking
from tqdm.auto import tqdm

# Modular components
sys.path.append('./modules')
from modules import (
    ConfigManager,
    DataProcessor,
    StateActionReward,
    TradingEnvironment,
    ModelTrainer,
    PerformanceAnalyzer,
    HyperparameterOptimizer,
    RollingWindowTrainer
)

# Set random seeds for reproducibility
RANDOM_SEED = 42
set_random_seed(RANDOM_SEED)
np.random.seed(RANDOM_SEED)

# Configure plotting
plt.style.use('default')
plt.rcParams['figure.figsize'] = (15, 8)
plt.rcParams['figure.dpi'] = 100
sns.set_palette("husl")

print("✅ All imports loaded successfully")
print(f"🎲 Random seed set to: {RANDOM_SEED}")
print(f"🐍 Python version: {sys.version}")
print(f"📊 NumPy version: {np.__version__}")
print(f"🐼 Pandas version: {pd.__version__}")
print(f"🏋️ Gymnasium version: {gym.__version__}")
print(f"🧠 Stable-Baselines3 available")

# Initialize global configuration
config = ConfigManager()
config.print_summary()


Gym has been unmaintained since 2022 and does not support NumPy 2.0 amongst other critical functionality.
Please upgrade to Gymnasium, the maintained drop-in replacement of Gym, or contact the authors of your software and request that they upgrade.
Users of this version of Gym should be able to simply replace 'import gym' with 'import gymnasium as gym' in the vast majority of cases.
See the migration guide at https://gymnasium.farama.org/introduction/migration_guide/ for additional information.


✅ All imports loaded successfully
🎲 Random seed set to: 42
🐍 Python version: 3.12.5 (v3.12.5:ff3bc82f7c9, Aug  7 2024, 05:32:06) [Clang 13.0.0 (clang-1300.0.29.30)]
📊 NumPy version: 2.3.2
🐼 Pandas version: 2.3.2
🏋️ Gymnasium version: 1.2.0
🧠 Stable-Baselines3 available

🔧 Configuration Summary
📊 State Space Dimensions: 15D
   - Core Features: 6D (Position, Z-score, Zone, Price Momentum, Z-score Momentum, Position Change)
   - Technical Indicators: 8D (MACD×3, RSI×1, BB×3, OBV×1)
   - Sentiment Data: 1D (Reddit Sentiment)
💰 Trading Configuration:
   - Initial Capital: $10,000
   - Episode Length: 10,080 minutes
   - Max Position Shift: 0.1 per minute
🎯 Reward Function: Multi-component hybrid (6 components)
📅 Training Protocol: Rolling window (6 months)
💻 Parallel Processing: 8 cores
🔄 Sentiment Integration: Enabled
📁 Output Directory: ./processed_data_15d
🔍 Environment: default
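The configuration summary implies that the 15D observation combines 13 precomputed market features (4 core, 8 technical, 1 sentiment) with 2 features tracked by the environment (position and position change), and that the position can move by at most 0.1 per minute. The sketch below illustrates one way this could work. Treating the action as a target position, the order of the observation vector, and the names `step_position` and `build_observation` are all assumptions, not the behavior of `TradingEnvironment`.

```python
import numpy as np

MAX_POSITION_SHIFT = 0.1   # per minute, from the configuration summary
N_MARKET_FEATURES = 13     # 4 core + 8 technical + 1 sentiment columns


def step_position(position: float, action: float) -> float:
    """Move toward the target position in `action`, capped at MAX_POSITION_SHIFT.

    Assumption: the action in [-1, 1] is interpreted as a target position.
    """
    target = float(np.clip(action, -1.0, 1.0))
    delta = float(np.clip(target - position, -MAX_POSITION_SHIFT, MAX_POSITION_SHIFT))
    return position + delta


def build_observation(market_row: np.ndarray, position: float,
                      prev_position: float) -> np.ndarray:
    """Prepend the 2 environment features to the 13 market features (15D total)."""
    if market_row.shape != (N_MARKET_FEATURES,):
        raise ValueError(f"expected {N_MARKET_FEATURES} market features")
    env_features = np.array([position, position - prev_position], dtype=np.float32)
    return np.concatenate([env_features, market_row.astype(np.float32)])


prev = 0.0
pos = step_position(prev, action=1.0)  # request fully long; only 0.1 is allowed
obs = build_observation(np.zeros(N_MARKET_FEATURES), pos, prev)
print(pos, obs.shape)
```

Capping the per-step change means a full swing from -1 to +1 takes at least 20 minutes, which limits how aggressively the agent can churn its position.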


## 🚀 **Step 1: Data Loading & Feature Engineering (15D State Space)**

Load ETH data and create the complete 15-dimensional feature set as per methodology.


In [2]:
# ===================== DATA LOADING & FEATURE ENGINEERING =====================

print("🚀 Starting data loading and feature engineering...")

# Initialize data processor
data_processor = DataProcessor(config)

# Run complete feature engineering pipeline
df_processed, feature_columns, data_splits = data_processor.run_full_pipeline(config.data.data_path)

print(f"\n📊 Feature Engineering Summary:")
print(f"   📈 Processed data: {len(df_processed):,} rows")
print(f"   📋 Features: {len(feature_columns)} dimensions")
print(f"   🎯 State space: 15D (as per methodology)")
print(f"   💾 Memory usage: {df_processed.memory_usage(deep=True).sum() / 1024**2:.1f} MB")

# Display sample of processed features
print(f"\n📋 Sample of 15D Feature Data:")
sample_features = df_processed[feature_columns].tail(3)
pd.set_option('display.max_columns', None)
pd.set_option('display.width', None)
print(sample_features)
pd.reset_option('display.max_columns')
pd.reset_option('display.width')

print(f"\n✅ Data processing complete!")
print(f"📅 Ready for temporal data splitting...")


🚀 Starting data loading and feature engineering...
🚀 Starting complete data processing pipeline...
📁 Loading data from: ../ETHUSDT_1m_with_indicators.parquet
   🔧 Found timestamp in index, converting to column
   ✅ Loaded 1,883,407 rows of data
   📊 Columns: ['ts', 'open', 'high', 'low', 'close', 'volume', 'number_of_trades', 'symbol', 'RSI', 'BB_mid', 'BB_high', 'BB_low', 'EMA_12', 'EMA_26', 'MACD', 'MACD_signal', 'MACD_diff', 'ATR']
   📅 Date range: 2022-01-01 00:33:00+00:00 to 2025-07-31 23:59:00+00:00

🔧 Feature Engineering Pipeline (15D State Space):
   🔬 Calculating core features (4D + 2D by environment)...
      ✅ Z-score range: [-10.32, 10.36]
      ✅ Zone distribution: {0.0: 569097, -0.5: 568471, 0.5: 566294, -1.0: 92863, 1.0: 86682}
      ✅ Price momentum range: [-0.0959, 0.0635]
   📈 Calculating technical indicators (8D)...
      📊 Full OHLCV data available - calculating comprehensive indicators
      ✅ Technical indicators calculated
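The log above reports four precomputed core features: a Z-score, a five-level zone in {-1, -0.5, 0, 0.5, 1}, price momentum, and Z-score momentum. The following is a rough sketch of how such features could be derived from a close-price series with pandas. The 120-bar rolling window, the zone thresholds, and the one-bar momentum definitions are illustrative assumptions. The actual `DataProcessor` logic is not shown in this notebook.

```python
import numpy as np
import pandas as pd

# Assumed thresholds that bucket the Z-score into five zones.
ZONE_EDGES = np.array([-2.0, -0.5, 0.5, 2.0])
ZONE_VALUES = np.array([-1.0, -0.5, 0.0, 0.5, 1.0])


def core_features(close: pd.Series, window: int = 120) -> pd.DataFrame:
    """Rolling Z-score, discretized zone, and one-bar momentum features."""
    mean = close.rolling(window).mean()
    std = close.rolling(window).std().replace(0.0, np.nan)  # avoid divide-by-zero
    z = (close - mean) / std

    z_np = z.to_numpy()
    # np.digitize maps each Z-score to a bucket index 0..4; NaNs are masked out.
    zone = np.where(np.isnan(z_np), np.nan, ZONE_VALUES[np.digitize(z_np, ZONE_EDGES)])

    return pd.DataFrame(
        {
            "z_score": z,
            "zone": zone,
            "price_momentum": close.pct_change(),  # one-bar return
            "z_score_momentum": z.diff(),          # one-bar change in Z-score
        },
        index=close.index,
    )


# Synthetic random-walk prices, only to show the output shape.
rng = np.random.default_rng(0)
close = pd.Series(2000 + rng.normal(0, 1, 500).cumsum())
print(core_features(close).dropna().tail(3))
```

The first `window - 1` rows have no Z-score, so a real pipeline would need to drop or skip that warm-up period before training.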

In [None]:
# ===================== TEMPORAL DATA SPLITTING =====================

print("\n📊 Extracting Data Splits from Pipeline Results...")

# Extract data splits (already created by run_full_pipeline)
train_data = data_splits['train']
val_data = data_splits['validation'] 
test_data = data_splits['test']

print(f"\n✅ Temporal data splitting complete!")
print(f"   📈 Training:   {len(train_data):,} rows ({len(train_data)/len(df_processed):.1%})")
print(f"   📊 Validation: {len(val_data):,} rows ({len(val_data)/len(df_processed):.1%})")
print(f"   📉 Test:       {len(test_data):,} rows ({len(test_data)/len(df_processed):.1%})")

# Initialize model trainer with data splits
# On macOS, CPU training appeared faster than the GPU backend; NVIDIA GPUs are untested.
# With enough VRAM, it may be worth trying more than one parallel worker on a GPU.
trainer = ModelTrainer(train_data, val_data, test_data, feature_columns, config, device="cpu")

print(f"\n🧠 ModelTrainer initialized and ready!")
print(f"🎯 Ready for rolling window training implementation")



📊 Extracting Data Splits from Pipeline Results...

✅ Temporal data splitting complete!
   📈 Training:   1,318,384 rows (70.0%)
   📊 Validation: 282,511 rows (15.0%)
   📉 Test:       282,512 rows (15.0%)
🖥️ Using MPS device
✅ ModelTrainer initialized
   📊 Training data: 1,318,384 rows
   📊 Validation data: 282,511 rows
   📊 Test data: 282,512 rows
   🎯 Features: 13 dimensions
   🏛️ Environment: Modular TradingEnvironment

🧠 ModelTrainer initialized and ready!
🎯 Ready for rolling window training implementation
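The 70/15/15 split reported above is chronological: the earliest rows go to training and the latest rows to testing, with no shuffling. A minimal sketch of such a split follows. The helper names `split_bounds` and `temporal_split` are hypothetical, and truncating the cut points with `int()` is an assumption, although it happens to reproduce the row counts printed above for 1,883,407 rows.

```python
from typing import Dict, Tuple

import pandas as pd


def split_bounds(n_rows: int, train_frac: float = 0.70,
                 val_frac: float = 0.15) -> Tuple[int, int]:
    """Return (train_end, val_end) row indices for a chronological split."""
    train_end = int(n_rows * train_frac)
    val_end = int(n_rows * (train_frac + val_frac))
    return train_end, val_end


def temporal_split(df: pd.DataFrame, train_frac: float = 0.70,
                   val_frac: float = 0.15) -> Dict[str, pd.DataFrame]:
    """Slice a time-ordered frame into train / validation / test without shuffling."""
    train_end, val_end = split_bounds(len(df), train_frac, val_frac)
    return {
        "train": df.iloc[:train_end],
        "validation": df.iloc[train_end:val_end],
        "test": df.iloc[val_end:],
    }


train_end, val_end = split_bounds(1_883_407)
print(train_end, val_end - train_end, 1_883_407 - val_end)
```

The frame must already be sorted by timestamp. Otherwise the slices would mix past and future rows and leak information into training.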


## 🧠 **Step 3: Full Training with Real Validation Scoring**

Based on the rolling window diagnostics results, we now proceed with full training on the complete training dataset, using the validation set for hyperparameter selection and model evaluation. This follows the exact methodology from the original implementation with improved modular architecture.
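The selection process can be thought of as two phases: a cheap stability check that discards most configurations, followed by full training of the survivors. The sketch below illustrates only the first phase. The stability score (mean minus standard deviation of per-window scores), the floor-based survivor count, and the names `stability_score` and `cull_configs` are assumptions for illustration, not the logic inside `HyperparameterOptimizer`. The scores in the example are made up.

```python
import statistics
from typing import Any, Dict, List


def stability_score(window_scores: List[float]) -> float:
    """Reward a high average score and penalize variation across windows (assumed metric)."""
    if len(window_scores) < 2:
        return statistics.mean(window_scores)
    return statistics.mean(window_scores) - statistics.stdev(window_scores)


def cull_configs(results: List[Dict[str, Any]],
                 survival_rate: float = 0.1) -> List[Dict[str, Any]]:
    """Keep the top fraction of configurations, always retaining at least one."""
    if not results:
        return []
    n_keep = max(1, int(len(results) * survival_rate))
    ranked = sorted(results, key=lambda r: stability_score(r["window_scores"]),
                    reverse=True)
    return ranked[:n_keep]


# Made-up per-window scores for 11 configurations, 3 windows each.
results = [
    {"config_id": f"a2c_{i:04d}",
     "window_scores": [0.1 * i, 0.1 * i + 0.05, 0.1 * i - 0.05]}
    for i in range(1, 12)
]
survivors = cull_configs(results, survival_rate=0.1)
print([s["config_id"] for s in survivors])
```

With 11 configurations and a 10% survival rate, a floor-based count keeps a single survivor, so the full-optimization phase would run on only one configuration in a test run of this size.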


In [4]:
# ===================== FULL TRAINING WITH HYPERPARAMETER OPTIMIZATION =====================

def merged_reward_config(cfg: Dict[str, Any]) -> Dict[str, float]:
    """Merge configuration reward components with defaults."""
    merged = config.to_dict()['reward'].copy()
    merged.update(cfg.get("reward_components", {}))
    return merged

print("🧠 Starting Full Training with Hyperparameter Optimization...")

# Set to True to delete and retrain existing models
RETRAIN_EXISTING_MODELS = True
MAX_CONFIGS = 11  # Limit the number of configurations loaded (useful for testing)

# Initialize hyperparameter optimizer
hyperparameter_optimizer = HyperparameterOptimizer(trainer, config)

# Load and select configurations to test
all_configs = hyperparameter_optimizer.load_configurations()[:MAX_CONFIGS]

if all_configs:
    # This single function call now handles the entire process:
    # 1. Culls 90% of configs using a fast, parallelized rolling-window stability test.
    # 2. Runs full, in-depth optimization on the top 10% of survivors.
    # 3. The results are ranked by performance on the validation set.
    final_ranked_results = hyperparameter_optimizer.filter_and_optimize(
        configurations=all_configs,
        survival_rate=0.1,          # Keep the top 10% of configurations
        fast_training_timesteps=10000,  # Timesteps for the fast stability check
        max_windows=3,              # Number of rolling windows for the stability check
        use_parallel=True
    )

    # The best configuration and its results can be extracted for the final testing phase
    if final_ranked_results:
        best_result = final_ranked_results[0]
        best_config = best_result['config']

        print("\n\n🥇 Best Configuration Found After Full Optimization 🥇")
        print(f"   - Config ID: {best_result['config_id']}")
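        # NOTE: the label says "validation" but the value is read from 'train_metrics';
        # check which data split this key actually reports.
        print(f"   - Final Validation Reward: {best_result['train_metrics']['mean_reward']:.4f}")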
        print(f"   - Final Test Reward (for reference): {best_result['mean_test_reward']:.4f}")
        print(f"   - Sharpe Ratio (on validation): {best_result['sharpe_ratio']:.3f}")

else:
    print("❌ No configurations were loaded. Skipping optimization.")

🧠 Starting Full Training with Hyperparameter Optimization...
   ⚠️ MPS device detected. Setting max_parallel_jobs to 1 to ensure efficient GPU utilization.
   ⚠️ GPU mps) detected. Setting max_parallel_jobs to 1 to ensure efficient GPU utilization.
✅ HyperparameterOptimizer initialized
   💻 Max parallel jobs: 15
   📊 Performance metric: sharpe_ratio
   📁 Output directory: processed_data_15d/optimization_results
   ✅ Loaded 202 configurations from ./drl_training_configs.json
   📊 Algorithms: ['A2C']
🔪 Starting Phase 1: Culling 11 configurations with rolling window diagnostics...


Evaluating Stability:   0%|          | 0/11 [00:00<?, ?it/s]

✅ RollingWindowTrainer initialized
🔄 Starting rolling window diagnostics
   📅 Rolling window: 6 months
   📊 Evaluation period: 1 months
   🔢 Max windows: 3
   📏 Window size: 259,200 rows
   📏 Eval size: 43,200 rows
   📊 Total training data: 1,318,384 rows

   📊 Window 1: Training [0:259,200], Eval [259,200:302,400]
      📈 Train rows: 259,200, Eval rows: 43,200
🚀 Training A2C model: a2c_0001_window_1
   📅 Valid episode range: [120, 249119]
   🎮 Action space: Box(-1.0, 1.0, (1,), float32)
   👁️ Observation space: (15,)
🏛️ TradingEnvironment initialized:
   📊 Data shape: (259200, 31)
   📋 Features: 13D
   🎯 State space: 15D
   ⏱️ Episode length: 10,080 minutes
   🔄 Random start: True
   💰 Initial capital: $10,000
   📅 Valid episode range: [120, 33119]
   🎮 Action space: Box(-1.0, 1.0, (1,), float32)
   👁️ Observation space: (15,)
🏛️ TradingEnvironment initialized:
   📊 Data shape: (43200, 31)
   📋 Fe

[Repeated output condensed: each remaining parallel worker printed the same RollingWindowTrainer initialization and Window 1 block as above, differing only in the model name (a2c_0002_window_1 through a2c_0011_window_1), interleaved with further copies of the Gym warning and 0% progress bars.]

Window 1/3:   0%|          | 0/3 [01:04<?, ?it/s]

Window 1/3:   0%|          | 0/3 [01:09<?, ?it/s]
Window 1/3:   0%|          | 0/3 [01:11<?, ?it/s]
Window 1/3:   0%|          | 0/3 [01:07<?, ?it/s]

Window 1/3:   0%|          | 0/3 [01:08<?, ?it/s]
Window 1/3:   0%|          | 0/3 [01:03<?, ?it/s]
Window 1/3:   0%|          | 0/3 [01:11<?, ?it/s]
Window 1/3:   0%|          | 0/3 [01:05<?, ?it/s]
Window 1/3:   0%|          | 0/3 [01:06<?, ?it/s]
Window 1/3:   0%|          | 0/3 [01:09<?, ?it/s]


KeyboardInterrupt: 

## 📊 **Step 4: Final Testing & Comprehensive Analysis**

This section performs the final evaluation on the held-out test set using the best configuration, and generates comprehensive visualizations and performance analysis.
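The summary at the end of this step reports total return, Sharpe ratio, and maximum drawdown. As a reference for how such figures are commonly computed from a series of portfolio values, here is a small NumPy sketch. It assumes 1-minute bars, a zero risk-free rate, annualization over a 365-day year (crypto markets do not close), and drawdown expressed as a negative fraction. `PerformanceAnalyzer` may define these metrics differently.

```python
import numpy as np

MINUTES_PER_YEAR = 365 * 24 * 60  # assumption: 1-minute bars, round-the-clock market


def total_return(values: np.ndarray) -> float:
    """Fractional gain from the first to the last portfolio value."""
    return float(values[-1] / values[0] - 1.0)


def max_drawdown(values: np.ndarray) -> float:
    """Largest peak-to-trough decline, as a negative fraction of the running peak."""
    running_peak = np.maximum.accumulate(values)
    return float(((values - running_peak) / running_peak).min())


def sharpe_ratio(values: np.ndarray, periods_per_year: int = MINUTES_PER_YEAR) -> float:
    """Annualized mean/std of per-period returns, with a zero risk-free rate."""
    returns = np.diff(values) / values[:-1]
    std = returns.std(ddof=1)
    if std == 0:
        return 0.0  # flat equity curve: define Sharpe as zero instead of dividing by zero
    return float(returns.mean() / std * np.sqrt(periods_per_year))


# Toy equity curve starting from the notebook's $10,000 initial capital.
equity = np.array([10_000.0, 11_000.0, 9_900.0, 12_000.0])
print(f"Total return: {total_return(equity):.2%}")
print(f"Max drawdown: {max_drawdown(equity):.2%}")
print(f"Sharpe ratio: {sharpe_ratio(equity):.3f}")
```

Annualizing minute-level returns multiplies the raw ratio by roughly 725, so Sharpe values computed this way are not directly comparable to figures based on daily returns.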


In [None]:
# 1) Pull the winning configuration from the Step 3 optimization results
best_config_dict = best_config
best_config_id = best_result["config_id"]
algo = best_config_dict.get("algorithm", "A2C").upper()

print(f"\n🚀 Retraining the best model for final analysis: {best_config_id}")

# 2) Retrain the single best model using the winning hyperparameters
if algo == "A2C":
    final_model, _ = trainer.train_a2c_model(best_config_dict, save_model=True)
elif algo == "TD3":
    final_model, _ = trainer.train_td3_model(best_config_dict, save_model=True)
else:
    raise ValueError(f"Unsupported algorithm for final training: {algo}")

print("\n✅ Final model retrained and saved.")

# 3) Create the final, held-out test environment
test_env = trainer.create_environment(
    trainer.test_data,
    reward_config=best_config_dict.get("reward_components"),
    random_start=False,
)

# 4) Run the comprehensive analysis on the test set
print("\n🔄 Running comprehensive analysis on the test set...")
analyzer = PerformanceAnalyzer(config)
final_analysis = analyzer.analyze_model_performance(
    model=final_model,
    env=test_env,
    n_episodes=5,  # Using 5 episodes for a more robust final evaluation
    config_info={"config_id": best_config_id, "algorithm": algo},
)

# 5) Generate final visualizations and a summary report
print("\n🎨 Generating final visualizations...")
analyzer.create_performance_plots(final_analysis, save_plots=True, show_plots=True)

final_metrics = final_analysis["aggregate_metrics"]
print("\n🎉 FINAL RESULTS SUMMARY")
print("=" * 60)
print(f"🥇 Best Model: {best_config_id}")
print(f"🔧 Algorithm: {algo}")
print(f"📈 Average Total Return: {final_metrics.get('mean_total_return', 0):.2%}")
print(f"📊 Average Sharpe Ratio: {final_metrics.get('mean_sharpe_ratio', 0):.3f}")
print(f"📉 Average Max Drawdown: {final_metrics.get('mean_max_drawdown', 0):.2%}")
print(f"💰 Average Final Portfolio: ${final_metrics.get('mean_final_portfolio_value', 0):,.0f}")
print("=" * 60)
print("\n✅ Complete DRL Trading Analysis Finished!")