# 🌌 x.titan: Autonomous Trading Brain - Kaggle Training

**Version**: 2026.1.1 (Post-Audit Hardened)
**Objective**: Train the `DerivOmniModel` (Temporal Fusion Transformer) on historical Deriv market data.

This notebook handles:
1.  **TA-Lib Setup**: Compilation of technical indicator libraries.
2.  **Environment Setup**: Reproducible configuration via `.env`.
3.  **Smart Data Loading**: Auto-detection of Kaggle Input datasets to skip lengthy downloads.
4.  **Hardware-Accelerated Training**: GPU-optimized execution of the hardened `scripts/train.py` pipeline.

## 1. System Requirements & TA-Lib
The x.titan system requires the TA-Lib C-library for technical indicator extraction.
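
Because the C library and the Python wrapper are installed in separate steps, it can help to check both before going further. A minimal sketch, assuming the header path used in the install cell and the wrapper's usual module name `talib`; the helper name `talib_status` is made up for illustration:

```python
import importlib.util
from pathlib import Path


def talib_status(header=Path("/usr/include/ta-lib/ta_defs.h")):
    """Report whether the TA-Lib C header and the Python wrapper are visible."""
    return {
        # The C header is what the install cell tests for.
        "c_header": Path(header).exists(),
        # find_spec returns None when the module cannot be imported.
        "python_wrapper": importlib.util.find_spec("talib") is not None,
    }


print(talib_status())
```

If `c_header` is true but `python_wrapper` is false, the C build succeeded and only the `pip install TA-Lib` step is still pending.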

In [None]:
import os
import sys
from pathlib import Path

# 1.1 TA-Lib Installation
if not os.path.exists('/usr/include/ta-lib/ta_defs.h'):
    print("📦 Installing TA-Lib Build Dependencies...")
    !apt-get update > /dev/null
    !apt-get install -y build-essential wget > /dev/null
    
    print("📥 Downloading TA-Lib source...")
    !wget -q http://prdownloads.sourceforge.net/ta-lib/ta-lib-0.4.0-src.tar.gz
    !tar -xzf ta-lib-0.4.0-src.tar.gz
    
    print("🛠️ Compiling TA-Lib (Approx 2 mins)...")
    os.chdir('ta-lib')
    !./configure --prefix=/usr > /dev/null
    !make > /dev/null
    !make install > /dev/null
    os.chdir('..')
    !rm -rf ta-lib ta-lib-0.4.0-src.tar.gz
    print("✅ TA-Lib installed successfully!")
else:
    print("✅ TA-Lib already present in system headers.")

## 2. Project & Code Setup
Initialize the workspace and install the Python dependencies. The custom `python-deriv-api` fork is installed from the local checkout rather than from PyPI.
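
The deployment step relies on `shutil.copytree` with `shutil.ignore_patterns`, which filters entries by name at every directory level. A small sketch of that behaviour on a throwaway tree; the helper name `deploy` and the placeholder files are illustrative, not part of the project:

```python
import shutil
import tempfile
from pathlib import Path


def deploy(source, target, skip=("venv", "__pycache__", ".git", "data_cache", ".agent")):
    """Copy `source` to `target`, skipping entries whose name matches `skip`."""
    shutil.copytree(source, target, ignore=shutil.ignore_patterns(*skip))
    # Return what actually landed in the target, relative to its root.
    return sorted(p.relative_to(target).as_posix() for p in Path(target).rglob("*"))


# Demo on a temporary directory (file contents are placeholders).
with tempfile.TemporaryDirectory() as tmp:
    src = Path(tmp) / "src"
    (src / "scripts").mkdir(parents=True)
    (src / "scripts" / "train.py").write_text("# placeholder\n")
    (src / ".git").mkdir()
    (src / ".git" / "HEAD").write_text("ref\n")
    copied = deploy(src, Path(tmp) / "dst")

print(copied)
```

The `.git` directory is left behind, which keeps the working copy small and avoids copying caches from the read-only input dataset.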

In [None]:
import shutil

# We assume the x.titan codebase is uploaded as a Kaggle Dataset called 'xtitan-source'
SOURCE_DIR = Path('/kaggle/input/xtitan-source') 
WORKING_DIR = Path('/kaggle/working/xtitan')

if not WORKING_DIR.exists():
    print(f"🚀 Deploying project to {WORKING_DIR}...")
    if SOURCE_DIR.exists():
        shutil.copytree(SOURCE_DIR, WORKING_DIR, 
                        ignore=shutil.ignore_patterns('venv', '__pycache__', '.git', 'data_cache', '.agent'))
    else:
        # Fallback: the code was uploaded (e.g. as an unpacked zip) into the current directory
        print("⚠️ Source not found in /kaggle/input, assuming current directory contains code.")
        # Nothing to copy; point WORKING_DIR at the notebook's default working directory
        WORKING_DIR = Path('/kaggle/working')

os.chdir(WORKING_DIR)
print(f"📂 Current Directory: {os.getcwd()}")

print("📥 Installing Python Packages...")
!pip install -q TA-Lib pandas numpy torch matplotlib tqdm pydantic pydantic-settings python-dotenv
!pip install -q ./python-deriv-api

## 3. Configuration (TFT Optimized)
Configure the Temporal Fusion Transformer (TFT) hyperparameters and data shapes.
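
The double underscore in keys such as `HYPERPARAMS__BATCH_SIZE` suggests a nested-settings layout. The sketch below assumes the project loads these keys with `pydantic-settings` using `__` as the nested delimiter, which this notebook does not confirm. It uses only the standard library to illustrate the mapping, including the rule that a value in the process environment takes precedence over the `.env` file. The function name `nested_settings` is hypothetical.

```python
import os


def nested_settings(dotenv_text, environ=None, delimiter="__"):
    """Parse KEY=VALUE lines into a nested dict; process env values win."""
    environ = os.environ if environ is None else environ
    flat = {}
    for line in dotenv_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue  # skip blanks, comments, and malformed lines
        key, value = line.split("=", 1)
        flat[key.strip()] = value.strip()
    # Override only the keys the file declares, to keep the demo small.
    for key in flat:
        if key in environ:
            flat[key] = environ[key]
    nested = {}
    for key, value in flat.items():
        *parents, leaf = key.lower().split(delimiter)
        node = nested
        for part in parents:
            node = node.setdefault(part, {})
        node[leaf] = value
    return nested


example = """
# TFT Hyperparameters
HYPERPARAMS__USE_TFT=True
HYPERPARAMS__BATCH_SIZE=256
DATA_SHAPES__WARMUP_STEPS=50
"""
cfg = nested_settings(example, environ={"HYPERPARAMS__BATCH_SIZE": "16"})
print(cfg["hyperparams"]["batch_size"])
```

This precedence is what lets the hardware cell lower the batch size on CPU by setting `os.environ['HYPERPARAMS__BATCH_SIZE']` without rewriting `.env`.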

In [None]:
%%writefile .env
ENVIRONMENT=test
TRADING__SYMBOL=R_100
TRADING__STAKE_AMOUNT=10.0

# Probability Thresholds (Post-Audit Standards)
THRESHOLDS__CONFIDENCE_THRESHOLD_HIGH=0.80
THRESHOLDS__LEARNING_THRESHOLD_MAX=0.70
THRESHOLDS__LEARNING_THRESHOLD_MIN=0.50

# TFT Hyperparameters
HYPERPARAMS__USE_TFT=True
HYPERPARAMS__LEARNING_RATE=0.0007
HYPERPARAMS__BATCH_SIZE=256
HYPERPARAMS__LSTM_HIDDEN_SIZE=256
HYPERPARAMS__CNN_FILTERS=128
HYPERPARAMS__LATENT_DIM=64
HYPERPARAMS__DROPOUT_RATE=0.2
HYPERPARAMS__EWC_SAMPLE_SIZE=5000

# Data Shapes (Optimized for context)
DATA_SHAPES__SEQUENCE_LENGTH_TICKS=1000
DATA_SHAPES__SEQUENCE_LENGTH_CANDLES=200
DATA_SHAPES__WARMUP_STEPS=50

## 4. Hardware Optimization & Detection

In [None]:
import torch

print("--- Device Validation ---")
if torch.cuda.is_available():
    print(f"✅ GPU Detected: {torch.cuda.get_device_name(0)}")
    # total_memory is the card's total capacity, not the currently free amount
    print(f"VRAM Total: {torch.cuda.get_device_properties(0).total_memory / 1e9:.2f} GB")
    !nvidia-smi
else:
    print("⚠️ GPU NOT DETECTED. Training will be extremely slow.")
    # Fallback batch size reduction for CPU OOM avoidance
    os.environ['HYPERPARAMS__BATCH_SIZE'] = '16'

## 5. Data Sourcing (Smart Detection)
We check for pre-bundled data in `/kaggle/input`. If it is missing, we download 12 months of history from the Deriv API.
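
A directory can exist and still be empty, so it may be worth summarizing the bundled cache before relying on it. A minimal sketch; the cache's real file layout is not described here, so the helper `summarize_cache` and the file names in the demo are placeholders:

```python
import tempfile
from collections import Counter
from pathlib import Path


def summarize_cache(path):
    """Count files under `path` by suffix; an empty Counter means nothing usable."""
    root = Path(path)
    if not root.is_dir():
        return Counter()
    return Counter(p.suffix or "<none>" for p in root.rglob("*") if p.is_file())


# Demo on a temporary directory with made-up file names.
with tempfile.TemporaryDirectory() as tmp:
    for name in ("ticks_a.parquet", "ticks_b.parquet", "notes.csv"):
        (Path(tmp) / name).write_bytes(b"")
    summary = summarize_cache(tmp)
    missing = summarize_cache(Path(tmp) / "absent")

print(summary)
```

In the notebook, calling `summarize_cache(DATA_INPUT_PATH)` before creating the symlink would show whether the dataset holds any files at all.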

In [None]:
DATA_INPUT_PATH = Path('/kaggle/input/xtitan-data/data_cache')
DATA_LOCAL_PATH = Path('./data_cache')

if DATA_INPUT_PATH.exists():
    print("🔗 Found bundled data in Kaggle Input. Creating symlink...")
    # Clear any previous link or directory. Check is_symlink() first: it also
    # catches a dangling link, for which exists() returns False.
    if DATA_LOCAL_PATH.is_symlink():
        DATA_LOCAL_PATH.unlink()
    elif DATA_LOCAL_PATH.exists():
        shutil.rmtree(DATA_LOCAL_PATH)

    # Symlink the input data into the working directory instead of copying it
    os.symlink(DATA_INPUT_PATH, DATA_LOCAL_PATH)
    print("✅ Data linked successfully.")
else:
    print("📡 Data not found in input. Downloading 12 months (Scraping mode)...")
    # Note: Requires DERIV_API_TOKEN in secrets if downloading
    !python scripts/download_data.py --months 12 --symbol R_100 --output data_cache

## 6. Pre-Flight Validation
Run the audited integrity scripts before training starts.
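
A `!command` line in IPython does not raise when the command exits with a non-zero status, so a failed check would not stop a "Run All" execution. One way to make a failure halt the notebook is to run the scripts through `subprocess`. This is a sketch, not the project's own mechanism, and the helper name `run_checked` is hypothetical:

```python
import subprocess
import sys


def run_checked(args):
    """Run a command and raise CalledProcessError if it exits non-zero."""
    result = subprocess.run(args, capture_output=True, text=True)
    print(result.stdout, end="")
    if result.returncode != 0:
        print(result.stderr, end="", file=sys.stderr)
        result.check_returncode()  # raises subprocess.CalledProcessError
    return result


# In the notebook this would wrap the validation scripts, e.g.
#   run_checked([sys.executable, "scripts/final_integrity_check.py"])
ok = run_checked([sys.executable, "-c", "print('preflight ok')"])
```

An uncaught `CalledProcessError` stops the cell, and with it the rest of the run, before any GPU time is spent on training.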

In [None]:
!python scripts/final_integrity_check.py
!python pre_training_validation.py

## 7. Model Training (Run Pipeline)
Execute the hardened training loop. Results are saved to `./checkpoints`.
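
After training, it can be handy to locate the newest checkpoint programmatically. The sketch below assumes checkpoints use the `.pt` extension, as the cleanup command in this section implies. The helper `latest_checkpoint` and the file names in the demo are illustrative, since the naming scheme of `scripts/train.py` is not shown here.

```python
import os
import tempfile
from pathlib import Path


def latest_checkpoint(directory, pattern="*.pt"):
    """Return the most recently modified file matching `pattern`, or None."""
    files = sorted(Path(directory).glob(pattern), key=lambda p: p.stat().st_mtime)
    return files[-1] if files else None


# Demo with two empty placeholder files and explicit modification times.
with tempfile.TemporaryDirectory() as tmp:
    for age, name in enumerate(["epoch_01.pt", "epoch_02.pt"]):
        path = Path(tmp) / name
        path.write_bytes(b"")
        os.utime(path, (1_000 + age, 1_000 + age))  # force distinct mtimes
    newest = latest_checkpoint(tmp).name

print(newest)
```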

In [None]:
# Clear previous stale logs/checkpoints
!rm -rf checkpoints/*.pt
!rm -rf logs/tensorboard/*

# Execute audited training script
!python scripts/train.py --data-path data_cache --epochs 50 --checkpoint-dir checkpoints

## 8. Export Artifacts
Bundle model and Fisher Information for deployment.
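
Before downloading the bundle, the archive contents can be listed to confirm the checkpoints were captured. A minimal sketch using `shutil.make_archive` and `zipfile`; the helper name `bundle` and the demo file are placeholders:

```python
import shutil
import tempfile
import zipfile
from pathlib import Path


def bundle(root, base_dir, name):
    """Zip `base_dir` (relative to `root`) and return the archive path and file members."""
    archive = shutil.make_archive(str(name), "zip", root_dir=root, base_dir=base_dir)
    with zipfile.ZipFile(archive) as zf:
        # Drop directory entries so only real files are reported.
        members = sorted(n for n in zf.namelist() if not n.endswith("/"))
    return archive, members


# Demo on a temporary workspace with one empty placeholder checkpoint.
with tempfile.TemporaryDirectory() as tmp:
    work = Path(tmp) / "work"
    (work / "checkpoints").mkdir(parents=True)
    (work / "checkpoints" / "model.pt").write_bytes(b"")
    archive, members = bundle(work, "checkpoints", Path(tmp) / "xtitan_model_demo")

print(members)
```

An empty member list would indicate that training wrote nothing to `checkpoints`, which is cheaper to discover here than after downloading the zip.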

In [None]:
from datetime import datetime

ts = datetime.now().strftime("%Y%m%d_%H%M")
bundle_name = f"xtitan_model_{ts}"

print("📦 Bundling artifacts...")
shutil.make_archive(bundle_name, 'zip', root_dir='.', base_dir='checkpoints')
shutil.copy(".env", "last_config.env")

print(f"✅ Done! Download {bundle_name}.zip from the output sidebar.")