# 🧠 NeuroNautilus Training on Google Colab

This notebook trains the NeuroNautilus PPO model using GPU acceleration.

**Prerequisites:**
- Upload your Parquet data to Google Drive at: `MyDrive/NeuroTrader_Workspace/data/nautilus_store/`
- Recommended GPU: T4 (free tier) or higher

**Runtime:** Runtime → Change runtime type → GPU (T4)

## 1️⃣ Setup Environment

In [None]:
# Mount Google Drive
from google.colab import drive
drive.mount('/content/drive')

In [None]:
# Clone Repository
!git clone https://github.com/MaDoHee33/NeuroTrader.git /content/NeuroTrader
%cd /content/NeuroTrader
!git checkout neuronautilus-v1

## 2️⃣ Install Dependencies

In [None]:
# Install NeuroNautilus dependencies
!pip install -q nautilus_trader pyarrow "stable-baselines3[extra]" gymnasium==0.29.1 ta pandas numpy

## 3️⃣ Verify GPU

In [None]:
import torch
print(f"🎮 GPU Available: {torch.cuda.is_available()}")
if torch.cuda.is_available():
    print(f"🔥 Device: {torch.cuda.get_device_name(0)}")

## 4️⃣ Upload Data (Optional)

If you haven't uploaded data to Google Drive yet, uncomment and run this cell to upload files from your local machine (Drive must already be mounted):

In [None]:
# Uncomment to upload data files
# from google.colab import files
# import shutil
# import os

# data_dir = '/content/drive/MyDrive/NeuroTrader_Workspace/data/nautilus_store/data/bar/XAUUSD.SIM-5-MINUTE-LAST-EXTERNAL'
# os.makedirs(data_dir, exist_ok=True)

# print("📤 Upload your Parquet files (.parquet)")
# uploaded = files.upload()

# for filename in uploaded.keys():
#     shutil.move(filename, os.path.join(data_dir, filename))
#     print(f"✅ Moved {filename} to {data_dir}")

## 5️⃣ Train Model

### Quick Training (100k steps - ~5 minutes)

In [None]:
# Quick test training
!python -m src.brain.train --timesteps 100000 --model-name ppo_test

### Full Training (1M steps - ~30-45 minutes)

In [None]:
# Production training
!python -m src.brain.train --timesteps 1000000 --model-name ppo_neurotrader_1m

### Advanced Training (10M steps - ~6-8 hours)

In [None]:
# Deep training (run overnight)
!python -m src.brain.train --timesteps 10000000 --model-name ppo_neurotrader_10m

## 6️⃣ Download Trained Model

In [None]:
# Download to your computer
from google.colab import files

model_path = '/content/drive/MyDrive/NeuroTrader_Workspace/models/checkpoints/ppo_neurotrader_1m.zip'
files.download(model_path)

## 7️⃣ Launch TensorBoard (Optional)

In [None]:
# Monitor training progress
%load_ext tensorboard
%tensorboard --logdir /content/drive/MyDrive/NeuroTrader_Workspace/logs

---

## 📚 Training Parameters Reference

You can customize training with these flags:

```bash
--timesteps 1000000          # Total training steps
--bar-type XAUUSD.SIM-5-MINUTE-LAST-EXTERNAL  # Data to use
--model-name my_model        # Output filename
--learning-rate 0.0003       # PPO learning rate
--data-dir /custom/path      # Override data location
```
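
As a rough illustration of how these flags combine, the sketch below assembles a full command line in Python. `build_train_command` is a hypothetical helper, not part of the repository; it assumes only the module path `src.brain.train` used in the training cells and the flags listed above.

```python
import shlex


def build_train_command(**options):
    """Turn keyword options into an argv list for `python -m src.brain.train`.

    Hypothetical helper: underscores in option names become dashes, so
    `model_name="x"` is emitted as `--model-name x`.
    """
    argv = ["python", "-m", "src.brain.train"]
    for name, value in options.items():
        argv += [f"--{name.replace('_', '-')}", str(value)]
    return argv


# Values copied from the flag reference above.
command = build_train_command(
    timesteps=1_000_000,
    bar_type="XAUUSD.SIM-5-MINUTE-LAST-EXTERNAL",
    model_name="my_model",
    learning_rate=0.0003,
)

# Print a shell-safe version that can be pasted after `!` in a Colab cell.
print(shlex.join(command))
```

In Colab the list could also be passed to `subprocess.run(command, check=True)` from the repository root, which avoids shell quoting entirely.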

## 🎯 Recommended Training Strategy

1. **Test Run** (100k): Verify everything works (~5 min)
2. **Baseline** (1M): Get a functional model (~30-45 min)
3. **Production** (10M): High-quality model (overnight)

**Pro Tip:** Use Colab Pro for longer sessions and faster GPUs (V100/A100)!

In [None]:
# 🎯 Smart Auto-Discovery (Model + Data)
from pathlib import Path

# Auto-detect workspace (Colab or local)
try:
    from google.colab import drive
    # In Colab
    WORKSPACE = Path('/content/drive/MyDrive/NeuroTrader_Workspace')
    DATA_DIR = WORKSPACE / 'data'
except ImportError:
    # Local (google.colab is not installed outside Colab)
    WORKSPACE = Path.cwd()
    DATA_DIR = WORKSPACE / 'data'

print(f"📁 Workspace: {WORKSPACE}")

# Import discovery functions
from src.brain.model_discovery import find_best_model

print("🔍 Searching for best model...")

# Find best model
try:
    model_path = find_best_model(workspace=WORKSPACE)
    print(f"✅ Will use: {model_path}")
except Exception as e:
    print(f"⚠️  No model found: {e}")
    print("   Please train a model first.")
    model_path = None

---

# 📊 Step 4: Backtest Trained Model

Now that training is complete, let's backtest the model using the validation/test data.


In [None]:
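# Backtest configuration
import sys
sys.path.insert(0, '/content/NeuroTrader')

from src.neuro_nautilus.runner import run_backtest, analyze_results
from pathlib import Path

# Bar type used by the backtest cells below (same value as the upload path
# and the --bar-type example earlier in this notebook)
BAR_TYPE = 'XAUUSD.SIM-5-MINUTE-LAST-EXTERNAL'

# Define model path (use the model that was just trained; adjust the
# filename if you trained with a different --model-name)
model_path = WORKSPACE / 'models' / 'ppo_neurotrader.zip'

print("🔍 Starting Backtest...")
print(f"Model: {model_path}")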

In [None]:
# Run backtest on validation period
backtest_results = run_backtest(
    data_path=str(DATA_DIR / 'nautilus_catalog'),
    model_path=str(model_path),
    bar_type=BAR_TYPE,
    start_date='2023-06-01',  # Validation period start
    end_date='2024-09-30',    # Validation period end
    initial_balance=10000
)

print("\n" + "="*60)
print("📊 BACKTEST RESULTS (Validation Period)")
print("="*60)

In [None]:
# Display performance metrics
from src.neuro_nautilus.runner import analyze_results

metrics = analyze_results(backtest_results)

print(f"\n📈 Performance Metrics:")
print(f"   Total Return:     {metrics.get('total_return', 0):.2%}")
print(f"   Sharpe Ratio:     {metrics.get('sharpe_ratio', 0):.3f}")
print(f"   Max Drawdown:     {metrics.get('max_drawdown', 0):.2%}")
print(f"   Win Rate:         {metrics.get('win_rate', 0):.2%}")
print(f"   Total Trades:     {metrics.get('total_trades', 0)}")
print(f"   Profit Factor:    {metrics.get('profit_factor', 0):.2f}")
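
For orientation, here is a minimal sketch of how three of these metrics are commonly derived from a series of account balances. Whether `analyze_results` uses these exact definitions is not shown in this notebook, so treat the functions below as illustrative assumptions. The Sharpe ratio here is per-bar and not annualized, and the example balances are made up.

```python
from statistics import mean, stdev


def total_return(balances):
    """Fractional change from the first balance to the last."""
    return balances[-1] / balances[0] - 1.0


def max_drawdown(balances):
    """Largest peak-to-trough decline, as a negative fraction (0.0 if none)."""
    peak = balances[0]
    worst = 0.0
    for balance in balances:
        peak = max(peak, balance)
        worst = min(worst, balance / peak - 1.0)
    return worst


def sharpe_ratio(balances):
    """Mean per-bar return divided by its sample standard deviation.

    Assumes a zero risk-free rate and applies no annualization factor.
    """
    returns = [b / a - 1.0 for a, b in zip(balances, balances[1:])]
    if len(returns) < 2:
        return 0.0
    spread = stdev(returns)
    return mean(returns) / spread if spread > 0 else 0.0


# Made-up balances, for illustration only.
example = [10000.0, 11000.0, 9900.0, 10890.0]
print(f"Total Return: {total_return(example):.2%}")
print(f"Max Drawdown: {max_drawdown(example):.2%}")
print(f"Sharpe Ratio: {sharpe_ratio(example):.3f}")
```

Because the Sharpe value depends on bar frequency and on any annualization applied, compare only figures computed the same way.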


In [None]:
# Plot equity curve
import matplotlib.pyplot as plt
import pandas as pd

if 'equity_curve' in backtest_results:
    equity = backtest_results['equity_curve']
    
    plt.figure(figsize=(14, 6))
    plt.subplot(1, 2, 1)
    plt.plot(equity['timestamp'], equity['balance'])
    plt.title('Equity Curve (Validation Period)')
    plt.xlabel('Date')
    plt.ylabel('Account Balance ($)')
    plt.grid(True, alpha=0.3)
    
    plt.subplot(1, 2, 2)
    returns = equity['balance'].pct_change().dropna()
    plt.hist(returns, bins=50, edgecolor='black', alpha=0.7)
    plt.title('Return Distribution')
    plt.xlabel('Returns')
    plt.ylabel('Frequency')
    plt.grid(True, alpha=0.3)
    
    plt.tight_layout()
    plt.savefig(WORKSPACE / 'backtest_validation.png', dpi=150, bbox_inches='tight')
    plt.show()
    
    print(f"\n💾 Saved plot: {WORKSPACE / 'backtest_validation.png'}")
else:
    print("⚠️  No equity curve data available")


## Test Period Backtest

Now test on completely unseen data (test period):


In [None]:
# Run backtest on TEST period (unseen data)
test_results = run_backtest(
    data_path=str(DATA_DIR / 'nautilus_catalog'),
    model_path=str(model_path),
    bar_type=BAR_TYPE,
    start_date='2024-10-01',  # Test period start
    end_date='2026-01-16',    # Test period end
    initial_balance=10000
)

print("\n" + "="*60)
print("📊 BACKTEST RESULTS (Test Period - Unseen Data)")
print("="*60)

test_metrics = analyze_results(test_results)

print(f"\n📈 Test Period Performance:")
print(f"   Total Return:     {test_metrics.get('total_return', 0):.2%}")
print(f"   Sharpe Ratio:     {test_metrics.get('sharpe_ratio', 0):.3f}")
print(f"   Max Drawdown:     {test_metrics.get('max_drawdown', 0):.2%}")
print(f"   Win Rate:         {test_metrics.get('win_rate', 0):.2%}")
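
The "unseen data" claim only holds if the test window does not overlap the validation window. The sketch below checks the two date ranges used in the backtest cells. `PERIODS`, `periods_overlap`, and `check_disjoint` are hypothetical names introduced here. The training period is not stated in this notebook, so it is left out.

```python
from datetime import date

# Date ranges copied from the validation and test backtest cells (inclusive).
PERIODS = {
    "validation": (date(2023, 6, 1), date(2024, 9, 30)),
    "test": (date(2024, 10, 1), date(2026, 1, 16)),
}


def periods_overlap(first, second):
    """Return True if two inclusive (start, end) ranges share any day."""
    return first[0] <= second[1] and second[0] <= first[1]


def check_disjoint(periods):
    """Raise ValueError if any two named periods overlap."""
    names = list(periods)
    for i, name_a in enumerate(names):
        for name_b in names[i + 1:]:
            if periods_overlap(periods[name_a], periods[name_b]):
                raise ValueError(f"{name_a} and {name_b} periods overlap")


check_disjoint(PERIODS)
print("Validation and test periods do not overlap")
```

If you add a training range to `PERIODS`, the same check covers all three splits.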

## Performance Comparison


In [None]:
# Compare validation vs test performance
import pandas as pd

comparison = pd.DataFrame({
    'Metric': ['Total Return', 'Sharpe Ratio', 'Max Drawdown', 'Win Rate'],
    'Validation': [
        f"{metrics.get('total_return', 0):.2%}",
        f"{metrics.get('sharpe_ratio', 0):.3f}",
        f"{metrics.get('max_drawdown', 0):.2%}",
        f"{metrics.get('win_rate', 0):.2%}"
    ],
    'Test': [
        f"{test_metrics.get('total_return', 0):.2%}",
        f"{test_metrics.get('sharpe_ratio', 0):.3f}",
        f"{test_metrics.get('max_drawdown', 0):.2%}",
        f"{test_metrics.get('win_rate', 0):.2%}"
    ]
})

print("\n📊 Performance Comparison:")
print(comparison.to_string(index=False))

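# Calculate degradation relative to the magnitude of the validation Sharpe,
# so the sign stays meaningful even when the validation Sharpe is negative
val_sharpe = metrics.get('sharpe_ratio', 0)
test_sharpe = test_metrics.get('sharpe_ratio', 0)
degradation = ((test_sharpe - val_sharpe) / abs(val_sharpe) * 100) if val_sharpe != 0 else 0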

print(f"\n📉 Sharpe Degradation: {degradation:+.1f}%")

if degradation > -20:
    print("✅ Good! Performance holds up on unseen data")
elif degradation > -40:
    print("⚠️  Moderate degradation, acceptable")
else:
    print("❌ Significant degradation, may need retraining")


---

## 🎯 Decision Gate

**Based on backtest results:**
- ✅ Sharpe > 0.5 on test? → Ready for Week 2
- ⚠️  Sharpe 0.3-0.5? → Consider extended training
- ❌ Sharpe < 0.3? → Review strategy/features
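
The gate above can be sketched as a small function. `decision_gate` is a hypothetical name. The 0.3 and 0.5 thresholds come from the list above, and values exactly on a boundary are assumed to fall into the middle band.

```python
def decision_gate(test_sharpe):
    """Map a test-period Sharpe ratio to the recommendation listed above."""
    if test_sharpe > 0.5:
        return "Ready for Week 2"
    if test_sharpe >= 0.3:
        # Assumption: 0.3 and 0.5 themselves count as the middle band.
        return "Consider extended training"
    return "Review strategy/features"


# Illustrative inputs only, not real backtest results.
for sharpe in (0.62, 0.41, 0.12):
    print(f"Sharpe {sharpe:.2f}: {decision_gate(sharpe)}")
```

In the notebook the input would be `test_metrics.get('sharpe_ratio', 0)` from the test-period backtest.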
