# TFT Training on Google Colab

This notebook sets up and runs Temporal Fusion Transformer (TFT) training on a Google Colab GPU runtime, which is available on the free tier.

**Steps:**
1. Enable GPU: Runtime → Change runtime type → GPU → Save
2. Run all cells below
3. Download results when training completes

## 1. Setup: Clone Repository and Install Dependencies

In [None]:
# Check if GPU is available
import torch
print(f"GPU Available: {torch.cuda.is_available()}")
if torch.cuda.is_available():
    print(f"GPU: {torch.cuda.get_device_name(0)}")
else:
    print("⚠️ GPU not enabled! Go to Runtime → Change runtime type → GPU → Save")

In [None]:
# Clone repository
!git clone https://github.com/voltavista-lab/symmetrical-parakeet-n0.git
%cd symmetrical-parakeet-n0
!git checkout claude/temporal-fusion-transformers-2u7Wh

In [None]:
# Install dependencies
!pip install -q pytorch-forecasting pytorch-lightning pandas openpyxl xlrd
print("✓ Dependencies installed!")

## 2. Upload Data File (if not already in repository)

If `FINAL_INPUTS_v2.xls` is not in the repository, upload it:

In [None]:
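# Check if the data file exists; prompt for an upload if it is missing
import os

DATA_FILE = 'FINAL_INPUTS_v2.xls'

if os.path.exists(DATA_FILE):
    print("✓ Data file found!")
else:
    print("⚠️ Data file not found. Uploading...")
    from google.colab import files
    uploaded = files.upload()  # saves the selected files to the current directory
    if DATA_FILE in uploaded:
        print("✓ Upload complete!")
    else:
        print(f"⚠️ Expected {DATA_FILE}, but received: {list(uploaded)}")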

## 3. Run Training

Choose your configuration:

In [None]:
# Configuration
SHEET = 'southeast'  # Options: 'southeast', 'northeast', 'north', 'south'
EPOCHS = 100         # Number of training epochs
BATCH_SIZE = 64      # Batch size
HIDDEN_SIZE = 64     # Hidden layer size

print("Training configuration:")
print(f"  Sheet: {SHEET}")
print(f"  Epochs: {EPOCHS}")
print(f"  Batch Size: {BATCH_SIZE}")
print(f"  Hidden Size: {HIDDEN_SIZE}")

In [None]:
# Run training
!python tft_train.py \
    --sheet {SHEET} \
    --epochs {EPOCHS} \
    --batch-size {BATCH_SIZE} \
    --hidden-size {HIDDEN_SIZE}

## 4. View Results

In [None]:
# Display prediction plot
from IPython.display import Image, display
import os

plot_file = f'tft_predictions_{SHEET}.png'
if os.path.exists(plot_file):
    display(Image(filename=plot_file))
else:
    print(f"Plot not found: {plot_file}")

In [None]:
# Display residuals plot
residuals_file = f'tft_residuals_{SHEET}.png'
if os.path.exists(residuals_file):
    display(Image(filename=residuals_file))
else:
    print(f"Plot not found: {residuals_file}")

In [None]:
# Display scatter plot
scatter_file = f'tft_scatter_{SHEET}.png'
if os.path.exists(scatter_file):
    display(Image(filename=scatter_file))
else:
    print(f"Plot not found: {scatter_file}")

In [None]:
# View results CSV
import pandas as pd

results_file = f'tft_results_{SHEET}.csv'
if os.path.exists(results_file):
    df_results = pd.read_csv(results_file)
    print(f"Results shape: {df_results.shape}")
    print("\nFirst 10 rows:")
    display(df_results.head(10))
    print("\nLast 10 rows:")
    display(df_results.tail(10))
else:
    print(f"Results file not found: {results_file}")
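A numeric summary can complement the table preview. The sketch below computes the mean absolute error (MAE) and root mean squared error (RMSE) from two sequences of values. The column names `actual` and `predicted` in the usage comment are assumptions: the layout of the results CSV is defined by `tft_train.py` and is not shown in this notebook.

```python
import math

def error_summary(actual, predicted):
    """Return MAE and RMSE for two equal-length sequences of numbers."""
    actual = list(actual)
    predicted = list(predicted)
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    if not actual:
        raise ValueError("need at least one value")
    errors = [p - a for a, p in zip(actual, predicted)]
    mae = sum(abs(e) for e in errors) / len(errors)
    rmse = math.sqrt(sum(e * e for e in errors) / len(errors))
    return {"mae": mae, "rmse": rmse}

# Hypothetical usage; replace the column names with the ones in your CSV:
# error_summary(df_results["actual"], df_results["predicted"])
```

The function accepts any iterable of numbers, so pandas columns can be passed directly once the real column names are known.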

## 5. Download Results

Download all result files to your local machine:

In [None]:
from google.colab import files
import glob

# Download all result files
result_patterns = [
    f'tft_results_{SHEET}.csv',
    f'tft_predictions_{SHEET}.png',
    f'tft_residuals_{SHEET}.png',
    f'tft_scatter_{SHEET}.png',
    f'tft_model_{SHEET}.pt'
]

downloaded = 0
for pattern in result_patterns:
    for file in glob.glob(pattern):  # skips files that were not produced
        print(f"Downloading: {file}")
        files.download(file)
        downloaded += 1

print(f"\n✓ Downloaded {downloaded} file(s).")

## 6. Optional: View TensorBoard Logs

In [None]:
# Load TensorBoard
%load_ext tensorboard
%tensorboard --logdir lightning_logs

## 7. Train on Multiple Submarkets

To train on all submarkets sequentially:

In [None]:
# Train on all submarkets
submarkets = ['southeast', 'northeast', 'north', 'south']

for sheet in submarkets:
    print(f"\n{'='*60}")
    print(f"Training on {sheet.upper()}")
    print(f"{'='*60}\n")

    !python tft_train.py --sheet {sheet} --epochs {EPOCHS} --batch-size {BATCH_SIZE} --hidden-size {HIDDEN_SIZE}
    
print("\n✓ All submarkets trained!")
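A `!python` line does not raise an error when the script exits with a failure, so the loop above reports success even if one submarket fails. The sketch below is one way to track failures with `subprocess`. The helper names `build_command` and `train_all` are hypothetical; the flags are the ones used earlier in this notebook.

```python
import subprocess
import sys

def build_command(sheet, epochs, batch_size, hidden_size, script="tft_train.py"):
    """Build the argument list for one training run."""
    return [
        sys.executable, script,
        "--sheet", str(sheet),
        "--epochs", str(epochs),
        "--batch-size", str(batch_size),
        "--hidden-size", str(hidden_size),
    ]

def train_all(sheets, run=subprocess.run, **config):
    """Train each sheet in turn; return the sheets whose run exited non-zero."""
    failed = []
    for sheet in sheets:
        result = run(build_command(sheet, **config))
        if result.returncode != 0:
            failed.append(sheet)
    return failed

# Example call (not executed here):
# failed = train_all(submarkets, epochs=EPOCHS,
#                    batch_size=BATCH_SIZE, hidden_size=HIDDEN_SIZE)
# print("Failed:", failed or "none")
```

In a notebook, output from a child process started this way may not appear in the cell; passing `capture_output=True, text=True` to `subprocess.run` and printing `result.stdout` is one option if the training log is needed.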

## Tips for Colab

1. **Runtime limits**: Free-tier sessions are time-limited (about 12 hours at most) and can disconnect sooner when idle. Save results frequently.
2. **GPU availability**: GPU access may be limited during peak times.
3. **Keep session alive**: Run a cell periodically or use Colab Pro for longer sessions.
4. **Save to Drive**: Mount Google Drive to save results automatically:
   ```python
   from google.colab import drive
   drive.mount('/content/drive')
   # Create the target folder if needed, then copy results to Drive
   !mkdir -p /content/drive/MyDrive/tft_results
   !cp tft_*.* /content/drive/MyDrive/tft_results/
   ```