# 🚀 ASX Portfolio OS - Model A Training (Google Colab)

This notebook trains Model A (Technical/Momentum) using your production database.

**Steps:**
1. Install dependencies
2. Set environment variables
3. Clone the repository
4. Build training dataset from database
5. Train Model A (default hyperparameters, or optional Optuna tuning)
6. Download trained model artifacts

**Estimated time:** 30-40 minutes

In [None]:
# Step 1: Install dependencies
!pip install -q lightgbm==4.1.0 pandas==2.1.4 numpy==1.26.3 scikit-learn==1.3.2 \
    psycopg2-binary==2.9.9 python-dotenv==1.0.0 optuna==3.5.0 \
    matplotlib seaborn joblib shap prefect==2.14.11

In [None]:
# Step 2: Set environment variables
import os

# Database connection: prompt for the URL instead of hardcoding credentials in the notebook
from getpass import getpass
os.environ['DATABASE_URL'] = getpass('DATABASE_URL (postgresql://user:password@host:port/dbname): ')

# Training configuration
os.environ['LOOKBACK_MONTHS'] = '36'
os.environ['CV_FOLDS'] = '12'
os.environ['MODEL_VERSION'] = 'v1_2'
os.environ['BATCH_SIZE'] = '100'

print('✅ Environment configured')
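
How the training scripts consume these variables is not shown in this notebook. As a minimal sketch, assuming they read the four settings above from the environment and convert the numeric ones to integers, a loader could look like this (`load_training_config` and its defaults are hypothetical, not the repository's code):

```python
import os

def load_training_config(env=os.environ):
    """Read the training settings from the environment, converting numeric values to int."""
    # Defaults mirror the values set in this notebook; whether the real
    # scripts fall back to the same values is an assumption.
    config = {
        "lookback_months": int(env.get("LOOKBACK_MONTHS", "36")),
        "cv_folds": int(env.get("CV_FOLDS", "12")),
        "model_version": env.get("MODEL_VERSION", "v1_2"),
        "batch_size": int(env.get("BATCH_SIZE", "100")),
    }
    # Fail early on values that cannot be meaningful
    for key in ("lookback_months", "cv_folds", "batch_size"):
        if config[key] <= 0:
            raise ValueError(f"{key} must be positive, got {config[key]}")
    return config

config = load_training_config()
print(config)
```

Environment variables are always strings, so converting and validating them in one place avoids a typo such as `CV_FOLDS='0'` surfacing only deep inside training.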

In [None]:
# Step 3: Download the repository code
!git clone https://github.com/Jp8617465-sys/asx-portfolio-os.git
%cd asx-portfolio-os
!mkdir -p outputs data/training

## 📊 Build Training Dataset

This fetches 36 months of price data from your database and computes features.
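
The exact features are defined in `jobs/build_training_dataset.py`, which is not reproduced here. To give a sense of what "computes features" can mean for a momentum model, the sketch below derives trailing returns and a forward one-month return per symbol. The `dt`, `symbol`, `close`, and `return_1m_fwd` column names come from this notebook; `mom_1m`, `mom_3m`, the function name, and the 21-trading-day month are assumptions for illustration.

```python
import pandas as pd

def add_momentum_features(prices: pd.DataFrame, month_days: int = 21) -> pd.DataFrame:
    """Add per-symbol trailing returns and a forward 1-month return (hypothetical features)."""
    out = prices.sort_values(["symbol", "dt"]).copy()
    close = out.groupby("symbol")["close"]
    # Trailing returns only look at past prices within the same symbol
    out["mom_1m"] = out["close"] / close.shift(month_days) - 1
    out["mom_3m"] = out["close"] / close.shift(3 * month_days) - 1
    # Forward return is the label: the price one "month" ahead relative to today
    out["return_1m_fwd"] = close.shift(-month_days) / out["close"] - 1
    return out

# Toy data with a 1-row "month" so the arithmetic is easy to follow
toy = pd.DataFrame({
    "dt": pd.to_datetime(["2024-01-01", "2024-01-02", "2024-01-03", "2024-01-04"]),
    "symbol": ["BHP"] * 4,
    "close": [100.0, 110.0, 121.0, 133.1],
})
features = add_momentum_features(toy, month_days=1)
print(features)
```

Shifting within each symbol group keeps one stock's prices from leaking into another's features, and the forward return is left as NaN for the most recent rows, where the future price is not yet known.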

In [None]:
# Build the training dataset
%run jobs/build_training_dataset.py

In [None]:
# Verify dataset was created
import pandas as pd

df = pd.read_csv('outputs/model_a_training_dataset.csv')
print(f"✅ Dataset loaded: {len(df):,} rows, {df['symbol'].nunique()} symbols")
print(f"✅ Date range: {df['dt'].min()} to {df['dt'].max()}")
print(f"\nFeatures: {[c for c in df.columns if c not in ['dt', 'symbol', 'close', 'volume', 'return_1m_fwd']]}")
df.head()
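
Beyond row counts and the date range, a few structural checks can catch problems before training. This is a sketch rather than part of the repository: it assumes the dataset has one row per (`dt`, `symbol`) pair and that `return_1m_fwd` is the training target, and `check_training_frame` is a hypothetical helper.

```python
import pandas as pd

def check_training_frame(df: pd.DataFrame) -> dict:
    """Return basic quality counts for a (dt, symbol) panel."""
    missing = df.isna().sum()
    return {
        # More than one row for the same date and symbol usually indicates a bad join
        "duplicate_rows": int(df.duplicated(subset=["dt", "symbol"]).sum()),
        # Rows without a target cannot be used for supervised training
        "missing_target": int(df["return_1m_fwd"].isna().sum()),
        # Only report columns that actually contain missing values
        "missing_by_column": {col: int(n) for col, n in missing.items() if n > 0},
    }

sample = pd.DataFrame({
    "dt": ["2024-01-31", "2024-01-31", "2024-02-29"],
    "symbol": ["BHP", "BHP", "CBA"],
    "close": [45.0, 45.0, 110.0],
    "return_1m_fwd": [0.02, 0.02, None],
})
report = check_training_frame(sample)
print(report)
```

In the notebook, calling `check_training_frame(df)` on the loaded dataset would give the same report for the real data.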

## 🎯 Train Model A (Standard)

Train with default hyperparameters (faster, ~10-15 minutes).
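
The cross-validation scheme used by `models/train_model_a_ml.py` is not shown in this notebook. With `CV_FOLDS` set to 12 and time-ordered data, one plausible arrangement is an expanding-window walk-forward split, where each fold trains on everything before its test period. The sketch below illustrates that idea under this assumption; `walk_forward_folds` is a hypothetical name.

```python
def walk_forward_folds(dates, n_folds):
    """Yield (train_dates, test_dates) pairs with an expanding training window."""
    unique_dates = sorted(set(dates))
    # Split the timeline into n_folds + 1 blocks: the first block is train-only,
    # and each later block serves once as the test period.
    block = len(unique_dates) // (n_folds + 1)
    if block == 0:
        raise ValueError("not enough distinct dates for the requested folds")
    for k in range(1, n_folds + 1):
        train = unique_dates[: k * block]
        # The last fold absorbs any leftover dates at the end of the timeline
        end = (k + 1) * block if k < n_folds else len(unique_dates)
        test = unique_dates[k * block : end]
        yield train, test

# 13 monthly labels give 12 folds of one test month each
months = [f"2023-{m:02d}" for m in range(1, 13)] + ["2024-01"]
folds = list(walk_forward_folds(months, 12))
for train, test in folds[:3]:
    print(f"train through {train[-1]}, test on {test}")
```

Keeping every training date strictly earlier than the test dates matters for a target like a forward return, since a random split would let the model see the future.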

In [None]:
# Train Model A with default parameters
%run models/train_model_a_ml.py

## 🔬 Train Model A with Hyperparameter Tuning (Optional)

Use Optuna to search for better hyperparameters over 30 trials (slower, ~30 minutes, but may improve performance).

**Skip this if you already trained above.**

In [None]:
# Optional: Hyperparameter tuning with Optuna
# Pass the flags on the %run line: %run builds sys.argv for the script itself,
# so assigning sys.argv beforehand has no effect
%run scripts/train_production_models.py --tune-hyperparams --n-trials 30
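
How `scripts/train_production_models.py` reads these flags is not shown here. A minimal sketch, assuming it uses `argparse`, would look like the following; the flag names come from the cell above, while the default of 30 trials and the help texts are assumptions.

```python
import argparse

def parse_args(argv=None):
    """Parse the two tuning flags used in this notebook (illustrative only)."""
    parser = argparse.ArgumentParser(description="Train production models (illustrative)")
    parser.add_argument("--tune-hyperparams", action="store_true",
                        help="run a hyperparameter search before the final fit")
    parser.add_argument("--n-trials", type=int, default=30,
                        help="number of search trials")
    # argv=None makes argparse read sys.argv[1:], which is what %run populates
    return parser.parse_args(argv)

args = parse_args(["--tune-hyperparams", "--n-trials", "30"])
print(args.tune_hyperparams, args.n_trials)
```

Note that `argparse` exposes `--tune-hyperparams` as `args.tune_hyperparams` and `--n-trials` as `args.n_trials`, converting the hyphens to underscores.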

## 📥 Download Trained Model

Download the trained model artifacts to your local machine.

In [None]:
# List all generated files
!ls -lh outputs/

# Zip all outputs for download
!zip -r model_a_artifacts.zip outputs/

from google.colab import files
files.download('model_a_artifacts.zip')

print("\n✅ Download the model_a_artifacts.zip file from the Files panel (left sidebar)")
print("📁 Extract it and upload the .pkl files to your Render deployment")
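
Before uploading, it can help to confirm which `.pkl` files actually ended up in the archive. The sketch below uses the standard-library `zipfile` module; `list_model_files` is a hypothetical helper, and the demo archive is built in memory with a made-up model file name, since the real artifact names depend on the training scripts.

```python
import io
import zipfile

def list_model_files(archive, suffix=".pkl"):
    """Return the sorted names of archive members ending with the given suffix."""
    with zipfile.ZipFile(archive) as zf:
        return sorted(name for name in zf.namelist() if name.endswith(suffix))

# Demonstration on an in-memory archive; the .pkl name here is hypothetical.
# In the notebook, pass 'model_a_artifacts.zip' instead of the buffer.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as demo:
    demo.writestr("outputs/model_a_v1_2.pkl", b"placeholder")
    demo.writestr("outputs/model_a_v1_2_metrics.json", b"{}")
buffer.seek(0)
print(list_model_files(buffer))
```

An empty list here would mean training did not write any model files, which is cheaper to discover before restarting the deployment.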

## 📊 Model Performance Summary

In [None]:
# Display training results
import json

with open('outputs/model_a_v1_2_metrics.json', 'r') as f:
    metrics = json.load(f)

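def fmt(key, spec):
    # Format a metric if it is present and numeric; otherwise show 'N/A'
    # (applying a float format spec to the string 'N/A' would raise ValueError)
    value = metrics.get(key)
    return format(value, spec) if isinstance(value, (int, float)) else 'N/A'

print("🎯 Model A Performance:")
print(f"   ROC-AUC: {fmt('roc_auc_mean', '.4f')}")
print(f"   RMSE: {fmt('rmse_mean', '.4f')}")
print(f"   Sharpe Ratio: {fmt('sharpe_ratio', '.2f')}")

print("\n✅ Training complete!")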
print("\nNext steps:")
print("1. Download model_a_artifacts.zip")
print("2. Upload .pkl files to Render at /app/outputs/")
print("3. Restart Render service to load new model")
print("4. Validate: curl https://asx-portfolio-os.onrender.com/model/status/summary")