# Neural Sequence Decoder - Colab Training

This notebook allows you to train the neural sequence decoder model on Google Colab.

## Setup Instructions:
1. Upload your project folder to Colab (compress as .zip first)
2. Run the setup cell to install dependencies and extract files
3. Configure training parameters if needed
4. Run training
5. Download the trained model files

**Note**: Make sure your data file `ptDecoder_ctc.pkl` is in the `data/pickledData/` folder of your uploaded project.

## 1. Setup Environment and Install Dependencies

In [1]:
import os
import sys
import zipfile
from pathlib import Path

# Check if we're in Colab
try:
    import google.colab
    IN_COLAB = True
    print("Running in Google Colab")
except ImportError:
    IN_COLAB = False
    print("Not running in Google Colab")

# Set up project directory
if IN_COLAB:
    # In Colab, look for uploaded zip file
    project_zips = [f for f in os.listdir('/content') if f.endswith('.zip')]
    if project_zips:
        project_zip = project_zips[0]
        print(f"Found project zip: {project_zip}")
        
        # Extract the zip file
        with zipfile.ZipFile(f'/content/{project_zip}', 'r') as zip_ref:
            zip_ref.extractall('/content')
        
        # Find the extracted project directory
        extracted_dirs = [d for d in os.listdir('/content') if os.path.isdir(f'/content/{d}') and d != 'sample_data']
        if extracted_dirs:
            project_dir = f'/content/{extracted_dirs[0]}'
        else:
            project_dir = '/content'
    else:
        print("No zip file found. Please upload your project as a zip file.")
        print("Expected structure: your_project.zip containing the neural_seq_decoder_pytorch folder")
        project_dir = '/content'
else:
    # Local development - assume we're in the notebooks directory
    project_dir = str(Path.cwd().parent)

print(f"Project directory: {project_dir}")
os.chdir(project_dir)
print(f"Current working directory: {os.getcwd()}")

# Add src to Python path
src_path = os.path.join(project_dir, 'src')
if src_path not in sys.path:
    sys.path.insert(0, src_path)
    print(f"Added {src_path} to Python path")

Not running in Google Colab
Project directory: /code
Current working directory: /code


In [2]:
# Install the package and its dependencies
if IN_COLAB:
    !pip install -e .
else:
    # For local development, assume dependencies are already installed
    print("Local development mode - assuming dependencies are installed")

# Verify installation
try:
    import torch
    import numpy as np
    from neural_decoder.neural_decoder_trainer import trainModel
    from neural_decoder.model import GRUDecoder
    print("✓ All dependencies imported successfully")
    print(f"PyTorch version: {torch.__version__}")
    print(f"CUDA available: {torch.cuda.is_available()}")
    if torch.cuda.is_available():
        print(f"CUDA device: {torch.cuda.get_device_name(0)}")
except ImportError as e:
    print(f"❌ Import error: {e}")
    print("Please check that the package was installed correctly")

Local development mode - assuming dependencies are installed


  from .autonotebook import tqdm as notebook_tqdm


✓ All dependencies imported successfully
PyTorch version: 1.13.1+cu117
CUDA available: True
CUDA device: NVIDIA GeForce RTX 2070


## 2. Verify Data Availability

In [3]:
# Check if data file exists
data_path = os.path.join(project_dir, 'data', 'pickledData', 'ptDecoder_ctc.pkl')
print(f"Looking for data at: {data_path}")

if os.path.exists(data_path):
    file_size = os.path.getsize(data_path) / (1024**3)  # Size in GB
    print(f"✓ Data file found! Size: {file_size:.2f} GB")
else:
    print("❌ Data file not found!")
    print("Please ensure your uploaded project contains:")
    print("  data/pickledData/ptDecoder_ctc.pkl")
    print("")
    print("You can create this file using the formatCompetitionData.ipynb notebook")
    
    # List what's actually in the data directory
    data_dir = os.path.join(project_dir, 'data')
    if os.path.exists(data_dir):
        print(f"Contents of {data_dir}:")
        for root, dirs, files in os.walk(data_dir):
            level = root.replace(data_dir, '').count(os.sep)
            indent = ' ' * 2 * level
            print(f"{indent}{os.path.basename(root)}/")
            subindent = ' ' * 2 * (level + 1)
            for file in files:
                print(f"{subindent}{file}")

Looking for data at: /code/data/pickledData/ptDecoder_ctc.pkl
✓ Data file found! Size: 3.26 GB


## 3. Training Configuration

Configure your training parameters here. The default settings are for full training (may take several hours).

In [4]:
# Training configuration
# You can modify these parameters as needed

# Choose training mode: 'full' or 'quick'
TRAINING_MODE = 'quick'  # Change to 'quick' for faster training with smaller model

# Set up output directories
output_dir = os.path.join(project_dir, 'outputs', 'models', 'colab_trained_model')
log_dir = os.path.join(project_dir, 'logs', 'colab_training')

# Create directories
os.makedirs(output_dir, exist_ok=True)
os.makedirs(log_dir, exist_ok=True)

print(f"Training mode: {TRAINING_MODE}")
print(f"Model output directory: {output_dir}")
print(f"Log directory: {log_dir}")

# Training arguments
if TRAINING_MODE == 'full':
    # Full training configuration (from train_rnn_full.sh)
    args = {
        'outputDir': output_dir,
        'datasetPath': data_path,
        'seqLen': 150,
        'maxTimeSeriesLen': 1200,
        'batchSize': 64,
        'lrStart': 0.02,
        'lrEnd': 0.02,
        'nUnits': 1024,
        'nBatch': 10000,  # This will take several hours
        'nLayers': 5,
        'seed': 0,
        'nClasses': 40,
        'nInputFeatures': 256,
        'dropout': 0.4,
        'whiteNoiseSD': 0.8,
        'constantOffsetSD': 0.2,
        'gaussianSmoothWidth': 2.0,
        'strideLen': 4,
        'kernelLen': 32,
        'bidirectional': True,
        'l2_decay': 1e-5
    }
    print("Using FULL training configuration (10,000 batches, ~several hours)")
else:
    # Quick training configuration (from train_rnn_quick.sh)
    args = {
        'outputDir': output_dir,
        'datasetPath': data_path,
        'seqLen': 150,
        'maxTimeSeriesLen': 1200,
        'batchSize': 32,
        'lrStart': 0.02,
        'lrEnd': 0.02,
        'nUnits': 512,
        'nBatch': 100,  # Much faster for testing
        'nLayers': 3,
        'seed': 0,
        'nClasses': 40,
        'nInputFeatures': 256,
        'dropout': 0.4,
        'whiteNoiseSD': 0.8,
        'constantOffsetSD': 0.2,
        'gaussianSmoothWidth': 2.0,
        'strideLen': 4,
        'kernelLen': 32,
        'bidirectional': True,
        'l2_decay': 1e-5
    }
    print("Using QUICK training configuration (500 batches, ~30 minutes)")

print("\nTraining parameters:")
for key, value in args.items():
    if key not in ['outputDir', 'datasetPath']:
        print(f"  {key}: {value}")

Training mode: quick
Model output directory: /code/outputs/models/colab_trained_model
Log directory: /code/logs/colab_training
Using QUICK training configuration (500 batches, ~30 minutes)

Training parameters:
  seqLen: 150
  maxTimeSeriesLen: 1200
  batchSize: 32
  lrStart: 0.02
  lrEnd: 0.02
  nUnits: 512
  nBatch: 100
  nLayers: 3
  seed: 0
  nClasses: 40
  nInputFeatures: 256
  dropout: 0.4
  whiteNoiseSD: 0.8
  constantOffsetSD: 0.2
  gaussianSmoothWidth: 2.0
  strideLen: 4
  kernelLen: 32
  bidirectional: True
  l2_decay: 1e-05


## 4. Run Training

This cell will start the training process and **automatically download** the trained model when complete.

**Features:**
- Monitor training progress with real-time updates
- **Auto-download**: Model files are automatically downloaded when training finishes
- Multiple download fallbacks ensure you get your model files
- Training duration tracking and file size reporting

**Warning**: Full training may take several hours. Make sure your Colab session stays active.

In [5]:
import time
from datetime import datetime

# Verify data exists before starting training
if not os.path.exists(data_path):
    print("❌ Cannot start training: Data file not found!")
    print(f"Expected: {data_path}")
else:
    print(f"✓ Data file found: {data_path}")
    print(f"Starting training at {datetime.now().strftime('%Y-%m-%d %H:%M:%S')}")
    print(f"Training mode: {TRAINING_MODE}")
    print(f"Number of batches: {args['nBatch']}")
    print("\n" + "="*50)
    print("TRAINING STARTED")
    print("="*50 + "\n")
    
    # Start training
    start_time = time.time()
    
    try:
        # Import and run the training function
        from neural_decoder.neural_decoder_trainer import trainModel
        trainModel(args)
        
        # Training completed successfully
        end_time = time.time()
        training_duration = end_time - start_time
        hours = int(training_duration // 3600)
        minutes = int((training_duration % 3600) // 60)
        seconds = int(training_duration % 60)
        
        print("\n" + "="*50)
        print("TRAINING COMPLETED SUCCESSFULLY!")
        print("="*50)
        print(f"Training duration: {hours:02d}:{minutes:02d}:{seconds:02d}")
        print(f"Model saved to: {output_dir}")
        
        # List generated files
        if os.path.exists(output_dir):
            print("\nGenerated files:")
            for file in os.listdir(output_dir):
                file_path = os.path.join(output_dir, file)
                if os.path.isfile(file_path):
                    size_mb = os.path.getsize(file_path) / (1024**2)
                    print(f"  {file} ({size_mb:.1f} MB)")
        
        # AUTO-SAVE: Ensure model is saved persistently even if you're away
        print("\n" + "="*50)
        print("AUTO-SAVING TRAINED MODEL TO PERSISTENT STORAGE...")
        print("="*50)
        
        # Import download libraries safely
        try:
            from google.colab import files as colab_files
            COLAB_FILES_AVAILABLE = True
        except ImportError:
            COLAB_FILES_AVAILABLE = False
            colab_files = None
        
        try:
            from IPython.display import FileLink, display
            FILELINK_AVAILABLE = True
        except ImportError:
            FILELINK_AVAILABLE = False
            FileLink = None
        
        try:
            import base64
            from IPython.display import HTML
            HTML_DOWNLOAD_AVAILABLE = True
        except ImportError:
            HTML_DOWNLOAD_AVAILABLE = False
            HTML = None
        
        # Create zip file for download
        from datetime import datetime
        zip_filename = f'trained_model_{TRAINING_MODE}_{datetime.now().strftime("%Y%m%d_%H%M%S")}.zip'
        
        if IN_COLAB:
            zip_path = os.path.join('/content', zip_filename)
        else:
            zip_path = os.path.join(project_dir, zip_filename)
        
        print(f"Creating zip file: {zip_filename}")
        
        with zipfile.ZipFile(zip_path, 'w', zipfile.ZIP_DEFLATED) as zipf:
            # Add all files from the output directory
            for root, dirs, files in os.walk(output_dir):
                for file in files:
                    file_path = os.path.join(root, file)
                    archive_path = os.path.relpath(file_path, output_dir)
                    zipf.write(file_path, archive_path)
                    print(f"  Added: {archive_path}")
            
            # Also add training logs if they exist
            if os.path.exists(log_dir):
                for root, dirs, files in os.walk(log_dir):
                    for file in files:
                        file_path = os.path.join(root, file)
                        archive_path = os.path.join('logs', os.path.relpath(file_path, log_dir))
                        zipf.write(file_path, archive_path)
                        print(f"  Added: {archive_path}")
        
        zip_size_mb = os.path.getsize(zip_path) / (1024**2)
        print(f"\nZip file created: {zip_filename} ({zip_size_mb:.1f} MB)")
        
        # Try multiple download methods
        download_success = False
        
        # Method 1: Google Colab files (primary)
        if IN_COLAB and COLAB_FILES_AVAILABLE:
            try:
                print("🚀 Auto-downloading via Google Colab...")
                colab_files.download(zip_path)
                print("✅ AUTO-DOWNLOAD COMPLETED!")
                download_success = True
            except Exception as e:
                print(f"❌ Colab auto-download failed: {e}")
        
        # Method 2: IPython FileLink (backup)
        if not download_success and FILELINK_AVAILABLE:
            try:
                print("🔗 Creating download link...")
                link = FileLink(zip_path)
                display(link)
                print("✅ Download link created! Click above to download.")
                download_success = True
            except Exception as e:
                print(f"❌ FileLink failed: {e}")
        
        # Method 3: HTML download (for small files)
        if not download_success and HTML_DOWNLOAD_AVAILABLE and zip_size_mb < 50:
            try:
                print("🎯 Creating HTML download button...")
                with open(zip_path, 'rb') as f:
                    zip_data = f.read()
                
                b64_data = base64.b64encode(zip_data).decode()
                download_html = f'''
                <div style="text-align: center; margin: 20px;">
                    <a download="{zip_filename}" href="data:application/zip;base64,{b64_data}" 
                       style="background-color: #4CAF50; color: white; padding: 15px 30px; 
                              text-decoration: none; border-radius: 8px; font-weight: bold; 
                              font-size: 16px; display: inline-block;">
                       🎉 DOWNLOAD TRAINED MODEL ({zip_size_mb:.1f} MB)
                    </a>
                </div>
                '''
                display(HTML(download_html))
                print("✅ Download button created! Click the button above.")
                download_success = True
            except Exception as e:
                print(f"❌ HTML download failed: {e}")
        
        # Fallback instructions
        if not download_success:
            print("\n⚠️  Auto-download failed. Manual options:")
            print(f"📁 File location: {zip_path}")
            if IN_COLAB:
                print("   • Use Colab file browser (left sidebar) to download")
                print("   • Or run: !cp '{}' /content/download.zip".format(zip_path))
        
        print("\n" + "="*50)
        print("TRAINING AND AUTO-DOWNLOAD COMPLETE!")
        print("="*50)
        
    except Exception as e:
        print(f"\n❌ Training failed with error: {e}")
        import traceback
        traceback.print_exc()

✓ Data file found: /code/data/pickledData/ptDecoder_ctc.pkl
Starting training at 2025-08-17 21:08:03
Training mode: quick
Number of batches: 100

TRAINING STARTED



  return _VF.meshgrid(tensors, **kwargs)  # type: ignore[attr-defined]
  return self.conv(input, weight=self.weight, groups=self.groups, padding="same")
  torch.tensor(pred[iterIdx, 0 : adjustedLens[iterIdx], :]),


batch 0, ctc loss: 6.301495, cer: 0.934836, time/batch:   0.028

TRAINING COMPLETED SUCCESSFULLY!
Training duration: 00:00:33
Model saved to: /code/outputs/models/colab_trained_model

Generated files:
  trainingStats (0.0 MB)
  args (0.0 MB)
  modelWeights (150.3 MB)

AUTO-SAVING TRAINED MODEL TO PERSISTENT STORAGE...
Creating zip file: trained_model_quick_20250817_210836.zip
  Added: trainingStats
  Added: args
  Added: modelWeights

Zip file created: trained_model_quick_20250817_210836.zip (138.1 MB)
🔗 Creating download link...


✅ Download link created! Click above to download.

TRAINING AND AUTO-DOWNLOAD COMPLETE!


## 5. Manual Download (Optional)

**Note**: The training cell above automatically downloads your model when training completes!

Use this cell only if:
- Auto-download failed during training
- You want to re-download the model files
- You need to download a previously trained model

In [6]:
import zipfile
from google.colab import files

if IN_COLAB and os.path.exists(output_dir):
    # Create a zip file with all model files
    zip_filename = f'trained_model_{TRAINING_MODE}_{datetime.now().strftime("%Y%m%d_%H%M%S")}.zip'
    zip_path = os.path.join('/content', zip_filename)
    
    print(f"Creating zip file: {zip_filename}")
    
    with zipfile.ZipFile(zip_path, 'w', zipfile.ZIP_DEFLATED) as zipf:
        # Add all files from the output directory
        for root, dirs, files in os.walk(output_dir):
            for file in files:
                file_path = os.path.join(root, file)
                # Create archive path relative to output_dir
                archive_path = os.path.relpath(file_path, output_dir)
                zipf.write(file_path, archive_path)
                print(f"  Added: {archive_path}")
        
        # Also add training logs if they exist
        if os.path.exists(log_dir):
            for root, dirs, files in os.walk(log_dir):
                for file in files:
                    file_path = os.path.join(root, file)
                    archive_path = os.path.join('logs', os.path.relpath(file_path, log_dir))
                    zipf.write(file_path, archive_path)
                    print(f"  Added: {archive_path}")
    
    # Get zip file size
    zip_size_mb = os.path.getsize(zip_path) / (1024**2)
    print(f"\nZip file created: {zip_filename} ({zip_size_mb:.1f} MB)")
    
    # Download the zip file
    print("Starting download...")
    files.download(zip_path)
    print("Download completed!")
    
elif not IN_COLAB:
    print("Not running in Colab - model files are already saved locally at:")
    print(f"  {output_dir}")
    if os.path.exists(output_dir):
        print("\nModel files:")
        for file in os.listdir(output_dir):
            print(f"  {file}")
else:
    print("❌ No trained model found to download.")
    print("Please run the training cell first.")

ModuleNotFoundError: No module named 'google.colab'

## 6. Using the Trained Model

After downloading, you can load and use the trained model in your local environment:

In [None]:
# Example of how to load the trained model
# (This cell is for reference - run this in your local environment after downloading)

print("Example code to load the trained model:")
print("")
print("```python")
print("from neural_decoder.neural_decoder_trainer import loadModel")
print("")
print("# Load the trained model")
print(f"model = loadModel('{output_dir}', device='cuda')")
print("")
print("# The model is now ready for inference")
print("# Use model.forward(neural_input, day_idx) for predictions")
print("```")
print("")
print("Key files in the model directory:")
print("  - modelWeights: PyTorch model state dict")
print("  - args: Training arguments (pickled)")
print("  - trainingStats: Training statistics and metrics")

## Troubleshooting

### Common Issues:

1. **Data file not found**: Make sure your uploaded zip contains `data/pickledData/ptDecoder_ctc.pkl`
2. **Out of memory**: Try reducing batch size or switching to 'quick' training mode
3. **Session timeout**: For full training, consider using Colab Pro for longer sessions
4. **Import errors**: Make sure the package was installed correctly in the setup cell

### Tips:

- Monitor GPU usage: `!nvidia-smi`
- Check disk space: `!df -h`
- For long training, consider saving intermediate checkpoints
- Use 'quick' mode first to test everything works before full training