# 🏀 NBA Predictor - Complete Cloud Training
## Neural Network + Full Features + GPU Acceleration

### What This Notebook Does:
✅ Trains with ALL features (Team priors, Player priors, Optimization features, Phase 7)
✅ Neural Network (TabNet + LightGBM) EMBEDDED (not optional)
✅ GPU-accelerated for faster training (~20-30 min instead of hours)
✅ Downloads trained models to your computer
✅ Shows accuracy metrics for moneyline AND spread

### Steps:
1. Upload your `priors_data.zip` (drag & drop in next cell)
2. Run all cells (Runtime → Run all)
3. Download your trained models
4. Done!

In [None]:
# ============================================================
# STEP 1: Upload Your Priors Data
# ============================================================
# Drag your priors_data.zip file into the file browser (left sidebar)
# OR run this cell to upload:

from google.colab import files
import os

print("📤 Upload your priors_data.zip file:")
uploaded = files.upload()

# Extract priors (-o overwrites existing files so the cell can be re-run)
!unzip -q -o priors_data.zip -d /content/priors_data

# Verify: `!cmd` returns a list of output lines, so take the first line before converting
csv_files = !ls /content/priors_data/*.csv 2>/dev/null | wc -l
csv_count = int(csv_files[0])
if csv_count >= 6:
    print(f"✅ Priors data uploaded! Found {csv_count} CSV files")
    !ls /content/priors_data/*.csv
else:
    print(f"⚠️ Only found {csv_count} files. Expected 6+ CSV files.")
    print("Make sure you uploaded the correct priors_data.zip")

In [None]:
# ============================================================
# STEP 2: Install Dependencies & Download Code
# ============================================================

print("📦 Installing packages...")
!pip install -q nba-api kagglehub pytorch-tabnet lightgbm scikit-learn pandas numpy tqdm

print("\n📥 Downloading latest code from GitHub...")
!wget -q https://github.com/tyriqmiles0529-pixel/meep/archive/refs/heads/main.zip
!unzip -q main.zip
!rm main.zip

%cd /content/meep-main
print("✅ Code downloaded!")
print(f"📁 Working directory: {os.getcwd()}")

# Check GPU
import torch
print(f"\n🎮 GPU Available: {torch.cuda.is_available()}")
if torch.cuda.is_available():
    print(f"   GPU: {torch.cuda.get_device_name(0)}")

In [None]:
# ============================================================
# STEP 3: Train Models with Neural Network + Full Features
# ============================================================

print("🚀 Starting training...")
print("⏱️  This will take 20-30 minutes with GPU")
print("☕ Get coffee!\n")

# Run training with ALL features
!python3 train_auto.py \
    --priors /content/priors_data \
    --verbose \
    --fresh \
    --use-gpu \
    --neural-epochs 50

print("\n✅ TRAINING COMPLETE!")

In [None]:
# ============================================================
# STEP 4: Display Training Metrics
# ============================================================

print("📊 Training Metrics:\n")
!python3 show_metrics.py

# Show file structure
print("\n📁 Trained Models:")
!ls -lh models/*.pkl models/*.json 2>/dev/null || echo "No models found"

print("\n📊 Model Cache (windowed models):")
!ls -lh model_cache/*.pkl 2>/dev/null || echo "No cached models found"

In [None]:
# ============================================================
# STEP 5: Download Trained Models to Your Computer
# ============================================================

from google.colab import files
import os

print("📦 Preparing models for download...")

# Zip everything
!zip -r nba_models_trained.zip models/ model_cache/ -x '*.git*'

print("\n💾 Downloading models to your computer...")
files.download('nba_models_trained.zip')

print("\n" + "="*80)
print("✅ DONE!")
print("="*80)
print("\nNext steps:")
print("1. Extract nba_models_trained.zip to your local nba_predictor folder")
print("2. Run predictions locally with the new models")
print("3. Models include:")
print("   • Moneyline & Spread models (with accuracy metrics)")
print("   • Player prop models (Points, Rebounds, Assists, 3PM, Minutes)")
print("   • Neural hybrid models (TabNet + LightGBM)")
print("   • Ensemble models (Ridge + Elo + Four Factors)")
print("\n🎯 Your models are now trained on 20+ years of data with:")
print("   ✓ Team statistical priors (O/D ratings, pace, four factors)")
print("   ✓ Player statistical priors (~68 features from Basketball Reference)")
print("   ✓ Optimization features (momentum, consistency, fatigue)")
print("   ✓ Phase 7 features (situational context, adaptive weighting)")
print("   ✓ Neural network embeddings (deep feature learning)")

---

## 🔧 Advanced: Run Custom Predictions in Colab

Want to test predictions right here instead of downloading? Run the cells below:

In [None]:
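# Test predictions for today's games
# Runs in the notebook process: a multi-line `!python3 -c "..."` is not valid in a
# notebook cell. Assumes the working directory is still /content/meep-main (Step 2).
from player_ensemble_enhanced import predict_all_props
import json

predictions = predict_all_props()
print(json.dumps(predictions, indent=2))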

---

## 📊 Accuracy Metrics Explained

### Moneyline Model:
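- **Log Loss**: Lower is better (0.65 = good, 0.55 = excellent; always predicting 50% scores about 0.693)
- **Brier Score**: Mean squared error of the predicted win probabilities; lower is better (0.22 = good, 0.18 = excellent; always predicting 50% scores 0.25)
- **Accuracy**: % of games predicted correctly (60%+ is a strong result; whether it is profitable depends on the odds of the picks)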
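
As a rough illustration of how these three numbers are defined, here is a minimal sketch in plain Python. The function name `moneyline_metrics` and its inputs are hypothetical and are not taken from `show_metrics.py`; the repository may compute its metrics differently.

```python
import math

def moneyline_metrics(y_true, p_home):
    """y_true: 1 if the home team won, else 0. p_home: predicted home-win probability."""
    eps = 1e-15  # clip probabilities so log(0) never occurs
    n = len(y_true)
    clipped = [min(max(p, eps), 1 - eps) for p in p_home]
    log_loss = -sum(
        y * math.log(p) + (1 - y) * math.log(1 - p)
        for y, p in zip(y_true, clipped)
    ) / n
    brier = sum((p - y) ** 2 for y, p in zip(y_true, p_home)) / n
    # A pick counts as correct when the favoured side (p >= 0.5) matches the result
    accuracy = sum((p >= 0.5) == bool(y) for y, p in zip(y_true, p_home)) / n
    return {"log_loss": log_loss, "brier": brier, "accuracy": accuracy}

# Reference point: a model that always answers 50%
baseline = moneyline_metrics([1, 0, 1, 0], [0.5, 0.5, 0.5, 0.5])
print(baseline)
```

Comparing a model against this coin-flip baseline is a quick sanity check: a log loss above `ln 2` or a Brier score above 0.25 means the probabilities are worse than saying 50% every time.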

### Spread Model:
- **RMSE**: Root Mean Squared Error (10-12 points = good)
- **MAE**: Mean Absolute Error (8-10 points = good)
- **Coverage**: % of predictions within ±5 points (70%+ = excellent)
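
The sketch below shows one way to compute the spread metrics from predicted and actual margins. The names `spread_metrics` and `normal_coverage` are illustrative and do not come from the repository. The second helper assumes errors are normally distributed around zero, which is only an approximation.

```python
import math

def spread_metrics(actual_margin, predicted_margin, window=5.0):
    """RMSE, MAE and share of predictions within +/- window points."""
    errors = [p - a for a, p in zip(actual_margin, predicted_margin)]
    n = len(errors)
    rmse = math.sqrt(sum(e * e for e in errors) / n)
    mae = sum(abs(e) for e in errors) / n
    coverage = sum(abs(e) <= window for e in errors) / n
    return {"rmse": rmse, "mae": mae, "coverage": coverage}

def normal_coverage(sigma, window=5.0):
    """Share of errors within +/- window if errors were Normal(0, sigma^2)."""
    return math.erf(window / (sigma * math.sqrt(2)))

print(spread_metrics([3, -7, 10], [5, -1, 10]))
print(round(normal_coverage(11.0), 3))
```

Under that normal-error assumption, an RMSE of 11 points corresponds to roughly 35% of predictions landing within ±5 points, so the RMSE and coverage targets listed above are unlikely to be met at the same time. Treat the coverage threshold as aspirational.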

### Player Props:
- **RMSE**: Points/Rebounds/Assists error (6-8 = good for points)
- **MAE**: Average error (4-6 = good for points)
- **Hit Rate**: % of over/under picks that win (55%+ = profitable at standard -110 odds, where break-even is about 52.4%)
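
To make the link between hit rate and profit concrete, here is a small sketch. It assumes American odds with -110 as the default price, which is a common convention for props but is not stated in this notebook. The helper names are hypothetical.

```python
def break_even_rate(american_odds):
    """Win rate needed to break even at the given American odds."""
    if american_odds < 0:
        risk, win = -american_odds, 100
    else:
        risk, win = 100, american_odds
    return risk / (risk + win)

def hit_rate(picks, results):
    """Share of picks that matched the result; pushes should be removed beforehand."""
    wins = sum(p == r for p, r in zip(picks, results))
    return wins / len(picks)

def roi_per_bet(rate, american_odds=-110):
    """Expected profit per unit staked at a given hit rate."""
    payout = 100 / -american_odds if american_odds < 0 else american_odds / 100
    return rate * payout - (1 - rate)

print(round(break_even_rate(-110), 4))
print(round(roi_per_bet(0.55), 4))
```

The gap between the measured hit rate and `break_even_rate` for the odds actually taken is what matters. A fixed percentage threshold only makes sense when every pick is priced the same.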

---

## ❓ Troubleshooting

### "No models found"
- Training failed - check the error output above
- Most common: priors_data.zip not uploaded correctly
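
One possible cause, offered here as a guess rather than a confirmed diagnosis, is an archive whose CSV files sit inside a subfolder. Step 1 only counts files directly under `/content/priors_data/`. The sketch below inspects a zip before extraction; `inspect_priors_zip` is a hypothetical helper, and the threshold of 6 mirrors the check in Step 1.

```python
import io
import zipfile

def inspect_priors_zip(zip_source, min_csvs=6):
    """Count CSV files in the archive and report whether any sit in a subfolder."""
    with zipfile.ZipFile(zip_source) as zf:
        csvs = [
            name for name in zf.namelist()
            if name.lower().endswith(".csv") and not name.startswith("__MACOSX/")
        ]
    top_level = [name for name in csvs if "/" not in name]
    return {
        "csv_count": len(csvs),
        "top_level_csvs": len(top_level),
        "nested": len(csvs) > len(top_level),
        "ok": len(top_level) >= min_csvs,
    }

# Demo on an in-memory archive whose CSVs are wrapped in a folder
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w") as zf:
    for i in range(6):
        zf.writestr(f"priors_data/file_{i}.csv", "a,b\n1,2\n")
buf.seek(0)
report = inspect_priors_zip(buf)
print(report)
```

In Colab you would call `inspect_priors_zip("priors_data.zip")` after uploading. If `nested` is true, re-zip the CSV files themselves rather than their parent folder.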

### "GPU not available"
- Go to Runtime → Change runtime type → Hardware accelerator → GPU
- Training will still work on CPU (just slower)

### "Out of memory"
- Restart runtime: Runtime → Restart runtime
- Then re-run from Step 1

### Need help?
- Check QUICK_REFERENCE.txt in downloaded files
- Or create a GitHub issue

---

## 🎯 Why This Works Better Than Local Training:

1. **GPU Acceleration**: 5-10x faster than CPU
2. **More RAM**: 12GB+ vs your laptop's limits
3. **No System Slowdown**: Your computer stays responsive
4. **Free**: Google Colab's free tier allows sessions of up to about 12 hours (GPU availability is not guaranteed)
5. **Consistent Environment**: No dependency conflicts

---

## 📈 Model Architecture (What You're Training):

### Game Models:
1. **Ridge Regression** (baseline)
2. **Dynamic Elo** (momentum-based ratings)
3. **Four Factors** (advanced stats)
4. **LightGBM** (gradient boosting)
5. **Meta-Learner** (combines all 4)
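
For readers unfamiliar with Elo ratings, the sketch below shows the textbook update rule. It is not the repository's "Dynamic Elo" variant, whose momentum adjustments are not described here. The K-factor of 20 and the 400-point scale are conventional defaults chosen for illustration.

```python
def expected_score(rating_a, rating_b):
    """Probability that team A beats team B under the standard Elo model."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))

def update_elo(rating_a, rating_b, a_won, k=20.0):
    """Return both teams' new ratings after one game (zero-sum update)."""
    exp_a = expected_score(rating_a, rating_b)
    delta = k * ((1.0 if a_won else 0.0) - exp_a)
    return rating_a + delta, rating_b - delta

# Two evenly rated teams: the winner gains exactly half of K
home, away = update_elo(1500.0, 1500.0, a_won=True)
print(home, away)
```

The expected score from a rating system like this is the kind of input a meta-learner can combine with the other game models' predictions.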

### Player Models:
1. **TabNet** (deep learning for feature extraction)
2. **LightGBM** (using raw + deep features)
3. **Sigma Model** (uncertainty quantification)
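
A predicted uncertainty becomes useful once it is turned into a probability for a betting line. The sketch below assumes the stat is normally distributed around the predicted mean with the predicted sigma. That is a simplifying assumption made here, not something the notebook states about the Sigma Model. `prob_over` is a hypothetical helper.

```python
import math

def prob_over(mean, sigma, line):
    """P(stat > line), assuming the stat is Normal(mean, sigma^2)."""
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    z = (line - mean) / sigma
    # 1 - Phi(z), written with the error function
    return 0.5 * (1.0 - math.erf(z / math.sqrt(2)))

# Predicted 25 points with sigma 6, against a line of 22.5
print(round(prob_over(25.0, 6.0, 22.5), 3))
```

Count stats such as rebounds or three-pointers are discrete and skewed, so a normal approximation is roughest for low lines.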

### Feature Pipeline:
- **Phase 1-5**: Basic stats + rolling averages + team context
- **Phase 6**: Optimization (momentum, consistency, fatigue)
- **Phase 7**: Situational (schedule density, opponent history)
- **Basketball Reference Priors**: Historical statistical context

**Total Features**: ~120-150 per model
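
To illustrate two of the feature types named above, rolling averages and schedule density, here is a minimal sketch in plain Python. Both helpers are hypothetical and only show the general idea. In particular, each feature uses games before the current one only, so the value being predicted never leaks into its own input.

```python
from datetime import date, timedelta

def rolling_prior_mean(values, window=5):
    """For each game, the mean of up to `window` previous games (None for the first)."""
    out = []
    for i in range(len(values)):
        prev = values[max(0, i - window):i]
        out.append(sum(prev) / len(prev) if prev else None)
    return out

def games_in_prior_days(game_dates, days=7):
    """For each game, how many earlier games fell within the preceding `days` days.

    Assumes game_dates is sorted in ascending order.
    """
    out = []
    for i, current in enumerate(game_dates):
        start = current - timedelta(days=days)
        out.append(sum(1 for prev in game_dates[:i] if start <= prev < current))
    return out

points = [20, 30, 10, 40]
features = rolling_prior_mean(points, window=2)
dates = [date(2025, 1, 1), date(2025, 1, 2), date(2025, 1, 4), date(2025, 1, 12)]
density = games_in_prior_days(dates)
print(features, density)
```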

---

## 🔄 Re-training Schedule:

- **Daily**: Not needed (models are stable)
- **Weekly**: Run for current season updates
- **Monthly**: Full retrain recommended
- **Mid-Season**: After All-Star break (team dynamics change)
- **Playoffs**: Retrain with playoff-specific weights

You can upload your previous `model_cache/` before training to speed up retraining, so that only new data is trained.
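
If the previous run's `nba_models_trained.zip` is at hand, the sketch below restores only its `model_cache/` entries into the working directory. `restore_model_cache` is a hypothetical helper. It relies on the archive layout produced by the zip command in Step 5.

```python
import zipfile
from pathlib import Path

def restore_model_cache(zip_path, dest="."):
    """Extract only the model_cache/ entries from a previous nba_models_trained.zip."""
    dest = Path(dest)
    with zipfile.ZipFile(zip_path) as zf:
        members = [name for name in zf.namelist() if name.startswith("model_cache/")]
        zf.extractall(dest, members=members)
    # Return the restored files (directory entries end with "/")
    return [name for name in members if not name.endswith("/")]

# Example (in Colab, after uploading the archive into /content/meep-main):
# restored = restore_model_cache("nba_models_trained.zip")
# print(f"Restored {len(restored)} cached files")
```

Step 3 passes `--fresh` to `train_auto.py`. This notebook does not say whether that flag ignores an existing cache, so check the script's options before relying on cached models.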

---

**Version**: 2.0 (Neural Network Default, Full Features)

**Last Updated**: November 2025