# Brain-to-Text Competition: Training Plan & Platform Options

## 📋 Project Summary

**Goal:** Train and evaluate a brain-to-text model for the [Kaggle Brain-to-Text '25 Competition](https://www.kaggle.com/competitions/brain-to-text-25)

**Objective:** Decode neural signals from the speech motor cortex into text using a two-stage pipeline:
1. **RNN Model** (GRU-based): Predicts phonemes from neural data (512 features from 256 electrodes)
2. **Language Model** (n-gram + OPT-6.7b): Converts phoneme sequences to text predictions

**Evaluation Metric:** Word Error Rate (WER) - lower is better

**Output Format:** CSV file with `id` and `text` columns for Kaggle submission
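As a rough illustration of the metric, the sketch below computes an aggregate WER as total word-level edit distance divided by total reference words. It is not the official scoring script: splitting on whitespace and skipping any case or punctuation normalization are assumptions, and the function names and example sentences are made up for illustration.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two token lists."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def word_error_rate(references, hypotheses):
    """Aggregate WER: total word edits divided by total reference words."""
    total_edits = 0
    total_words = 0
    for ref, hyp in zip(references, hypotheses):
        ref_words = ref.split()  # assumption: plain whitespace tokenization
        hyp_words = hyp.split()
        total_edits += edit_distance(ref_words, hyp_words)
        total_words += len(ref_words)
    return total_edits / total_words


# Illustrative sentences only, not competition data.
refs = ["i need some water", "how are you"]
hyps = ["i need water", "how are you"]
print(f"WER: {word_error_rate(refs, hyps):.3f}")
```

Aggregating edits over all sentences before dividing weights long sentences more heavily than averaging per-sentence WER would.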

---

## 🎯 Task Breakdown

### Phase 1: Environment Setup
1. **Data Preparation**
   - Download datasets from Dryad (~10GB+)
   - Verify data directory structure
   - Unzip neural data files

2. **Environment Configuration**
   - Set up conda environment for model training (`b2txt25`)
   - Set up conda environment for language model (`b2txt25_lm`)
   - Install system dependencies (Redis, CMake, gcc)
   - Verify GPU availability and CUDA compatibility

### Phase 2: Model Training
3. **Baseline RNN Training**
   - Configure training hyperparameters (`rnn_args.yaml`)
   - Train GRU decoder model (120,000 batches, ~3.5 hours on RTX 4090)
   - Monitor validation Phoneme Error Rate (PER)
   - Save best checkpoint based on validation metrics
   - Target: Achieve ~10.1% aggregate PER on validation set

4. **Model Evaluation (Validation)**
   - Load trained model checkpoint
   - Run inference on validation set to get phoneme logits
   - Pass logits through language model to get word predictions
   - Calculate WER on validation set
   - Generate submission CSV for validation split

### Phase 3: Submission Generation
5. **Test Set Inference**
   - Run inference on test set (no ground truth available)
   - Generate final submission CSV with predictions
   - Format: `id,text` columns
   - Submit to Kaggle competition

### Phase 4: Model Improvement (Optional)
6. **Hyperparameter Tuning**
   - Experiment with different model architectures
   - Adjust learning rates, dropout, batch sizes
   - Try different data augmentation strategies
   - Experiment with different language models (1gram, 3gram, 5gram)

---

## 📝 Detailed Implementation Plan

### Step 1: Data Setup
```bash
# Navigate to project root
cd /Users/tim/Documents/timo/semester7/DataMining/KaggleCompetition/nejm-brain-to-text

# Activate conda environment
conda activate b2txt25

# Download data from Dryad
python download_data.py

# Verify data structure:
# data/
# ├── t15_copyTask.pkl
# ├── t15_personalUse.pkl
# ├── hdf5_data_final/          # Unzipped from t15_copyTask_neuralData.zip
# │   ├── t15.2023.08.11/
# │   │   ├── data_train.hdf5
# │   ├── t15.2023.08.13/
# │   │   ├── data_train.hdf5
# │   │   ├── data_val.hdf5
# │   │   ├── data_test.hdf5
# │   └── ...
# └── t15_pretrained_rnn_baseline/  # Unzipped from t15_pretrained_rnn_baseline.zip
#     ├── checkpoint/
#     │   ├── args.yaml
#     │   ├── best_checkpoint
#     └── training_log
```
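The directory check can also be scripted. The sketch below lists which splits exist in each session folder, assuming the `data/hdf5_data_final/<session>/data_<split>.hdf5` layout shown above. The function name `summarize_sessions` is hypothetical and not part of the repository.

```python
from pathlib import Path


def summarize_sessions(hdf5_root):
    """Map each session directory name to the splits present in it."""
    summary = {}
    for session_dir in sorted(Path(hdf5_root).iterdir()):
        if not session_dir.is_dir():
            continue
        # data_train.hdf5 -> "train", data_val.hdf5 -> "val", ...
        splits = sorted(
            f.stem.replace("data_", "") for f in session_dir.glob("data_*.hdf5")
        )
        summary[session_dir.name] = splits
    return summary


if __name__ == "__main__":
    root = Path("data/hdf5_data_final")  # path relative to the project root
    if root.is_dir():
        for session, splits in summarize_sessions(root).items():
            print(f"{session}: {', '.join(splits) or 'no hdf5 files found'}")
    else:
        print(f"{root} not found; run download_data.py and unzip first")
```

Sessions that list only `train` are consistent with the tree above, where the first session has no validation or test file.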

### Step 2: Environment Setup

#### For Model Training (`b2txt25`):
```bash
# From project root
./setup.sh

# Verify installation
conda activate b2txt25
python -c "import torch; print(f'PyTorch: {torch.__version__}'); print(f'CUDA available: {torch.cuda.is_available()}')"
```

**Requirements:**
- Python 3.10
- PyTorch with CUDA 12.6
- Redis, NumPy, Pandas, h5py, etc. (see `setup.sh`)

#### For Language Model (`b2txt25_lm`):
```bash
# From project root
./setup_lm.sh

# Verify installation
conda activate b2txt25_lm
python -c "import torch; print(f'PyTorch: {torch.__version__}')"
```

**Requirements:**
- Python 3.9
- PyTorch 1.13.1 (older version for LM compatibility)
- CMake >= 3.14
- gcc >= 10.1

#### System Dependencies:
```bash
# Install Redis (Ubuntu/Debian)
sudo apt-get update
sudo apt-get install redis-server build-essential cmake

# Disable Redis auto-start at boot (the server is started manually in Step 5a)
sudo systemctl disable redis-server
```

### Step 3: Training Configuration

**File:** `model_training/rnn_args.yaml`

**Key Parameters to Review:**
- `gpu_number`: Index of the GPU to use (default: '1'; GPU indices start at 0, so use '0' on a single-GPU machine)
- `num_training_batches`: Default 120,000 (~3.5 hours on RTX 4090)
- `batch_size`: Default 64 (adjust based on GPU memory)
- `lr_max`: Default 0.005 (learning rate)
- `output_dir`: Where to save trained model
- `checkpoint_dir`: Where to save checkpoints

**Training Sessions:**
- 45 sessions spanning 20 months
- 10,948 sentences total
- Training/validation split defined in `dataset_probability_val` array

### Step 4: Model Training

```bash
cd model_training
conda activate b2txt25

# Train the model
python train_model.py

# Monitor training:
# - Training logs saved to: trained_models/baseline_rnn/training_log
# - Best checkpoint saved to: trained_models/baseline_rnn/checkpoint/best_checkpoint
# - Validation metrics saved to: trained_models/baseline_rnn/checkpoint/val_metrics.pkl
```

**Expected Training Time:**
- ~3.5 hours on RTX 4090
- ~7-10 hours on RTX 3080
- ~15-20 hours on RTX 3060
- Much longer on CPU (not recommended)
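The times above can be sanity-checked against measured throughput. The sketch below derives the batch rate implied by the RTX 4090 figure (120,000 batches in about 3.5 hours) and projects total training time from a rate read off your own training log. The 4.0 batches/s value is a made-up example, not a measurement.

```python
BASELINE_BATCHES = 120_000  # num_training_batches default in rnn_args.yaml
BASELINE_HOURS = 3.5        # reported RTX 4090 training time


def estimated_hours(measured_batches_per_sec, total_batches=BASELINE_BATCHES):
    """Project total training time from an observed batch rate."""
    return total_batches / measured_batches_per_sec / 3600


# Throughput implied by the RTX 4090 reference run.
baseline_rate = BASELINE_BATCHES / (BASELINE_HOURS * 3600)
print(f"Implied RTX 4090 throughput: {baseline_rate:.1f} batches/s")

# Hypothetical slower GPU: plug in the rate you actually observe.
print(f"At 4.0 batches/s: {estimated_hours(4.0):.1f} hours")
```

Measuring the rate over the first few hundred batches gives an early indication of whether a run will fit inside a platform's session limit.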

**Monitoring:**
- Validation PER should decrease over time
- Target: ~10.1% aggregate PER on validation set
- Check training log for progress every 200 batches

### Step 5: Model Evaluation

#### Step 5a: Start Redis Server
```bash
# In a separate terminal
redis-server

# Keep this running during evaluation
```
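Before starting the language model, it can help to confirm that Redis is actually accepting connections. The sketch below uses only the standard library: it opens a TCP connection and sends a `PING`, assuming Redis listens on its default port 6379 without authentication. The function name `redis_is_up` is hypothetical.

```python
import socket


def redis_is_up(host="localhost", port=6379, timeout=1.0):
    """Return True if a Redis server answers PING at host:port."""
    try:
        with socket.create_connection((host, port), timeout=timeout) as sock:
            sock.sendall(b"PING\r\n")  # Redis inline command
            return sock.recv(16).startswith(b"+PONG")
    except OSError:
        # Connection refused, timeout, or unreachable host.
        return False


if __name__ == "__main__":
    print("Redis reachable:", redis_is_up())
```

If the server requires a password, it replies with an error instead of `+PONG` and this check reports `False`.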

#### Step 5b: Start Language Model
```bash
# In another separate terminal
cd /path/to/nejm-brain-to-text
conda activate b2txt25_lm

# For 1gram model (lightweight, no grammatical structure)
python language_model/language-model-standalone.py \
    --lm_path language_model/pretrained_language_models/openwebtext_1gram_lm_sil \
    --do_opt \
    --nbest 100 \
    --acoustic_scale 0.325 \
    --blank_penalty 90 \
    --alpha 0.55 \
    --redis_ip localhost \
    --gpu_number 0

# For 3gram model (requires ~60GB RAM, better accuracy)
python language_model/language-model-standalone.py \
    --lm_path language_model/pretrained_language_models/openwebtext_3gram_lm_sil \
    --do_opt \
    --nbest 100 \
    --acoustic_scale 0.325 \
    --blank_penalty 90 \
    --alpha 0.55 \
    --redis_ip localhost \
    --gpu_number 0

# For 5gram model (requires ~300GB RAM, best accuracy)
python language_model/language-model-standalone.py \
    --lm_path language_model/pretrained_language_models/openwebtext_5gram_lm_sil \
    --rescore \
    --do_opt \
    --nbest 100 \
    --acoustic_scale 0.325 \
    --blank_penalty 90 \
    --alpha 0.55 \
    --redis_ip localhost \
    --gpu_number 0
```

**Note:** First run will download OPT-6.7b from HuggingFace (~13GB)
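The three commands above differ only in the model path and the `--rescore` flag. The sketch below builds the argument list for a given n-gram order, so the shared settings live in one place. The helper name `build_lm_command` is hypothetical; the flags and values are copied from the commands above.

```python
import shlex

LM_DIR = "language_model/pretrained_language_models"


def build_lm_command(ngram_order, gpu_number=0, redis_ip="localhost"):
    """Return the language model launch command as an argument list."""
    if ngram_order not in (1, 3, 5):
        raise ValueError("ngram_order must be 1, 3, or 5")
    cmd = [
        "python", "language_model/language-model-standalone.py",
        "--lm_path", f"{LM_DIR}/openwebtext_{ngram_order}gram_lm_sil",
    ]
    if ngram_order == 5:
        cmd.append("--rescore")  # only the 5gram command above uses --rescore
    cmd += [
        "--do_opt",
        "--nbest", "100",
        "--acoustic_scale", "0.325",
        "--blank_penalty", "90",
        "--alpha", "0.55",
        "--redis_ip", redis_ip,
        "--gpu_number", str(gpu_number),
    ]
    return cmd


# Print the 3gram command in a copy-pasteable form.
print(shlex.join(build_lm_command(3)))
```

The returned list can be passed to `subprocess.run` from the project root with the `b2txt25_lm` environment active.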

#### Step 5c: Run Evaluation
```bash
# In main terminal
cd model_training
conda activate b2txt25

# Evaluate on validation set (for testing)
python evaluate_model.py \
    --model_path trained_models/baseline_rnn \
    --data_dir ../data/hdf5_data_final \
    --eval_type val \
    --gpu_number 1

# Evaluate on test set (for submission)
python evaluate_model.py \
    --model_path trained_models/baseline_rnn \
    --data_dir ../data/hdf5_data_final \
    --eval_type test \
    --gpu_number 1
```

**Output:**
- CSV file: `baseline_rnn_{eval_type}_predicted_sentences_YYYYMMDD_HHMMSS.csv`
- Contains `id` and `text` columns ready for Kaggle submission
- For validation set, also prints WER metrics
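A quick check of the generated file before uploading can catch formatting problems. The sketch below verifies that the header is exactly `id,text` and that no `id` repeats. The duplicate check and the strict column order are assumptions about what a valid submission looks like, and `validate_submission` is a hypothetical helper.

```python
import csv


def validate_submission(csv_path):
    """Return the row count if the file has exactly `id` and `text` columns."""
    with open(csv_path, newline="", encoding="utf-8") as f:
        reader = csv.DictReader(f)
        if reader.fieldnames != ["id", "text"]:
            raise ValueError(f"unexpected columns: {reader.fieldnames}")
        ids = [row["id"] for row in reader]
    if len(ids) != len(set(ids)):
        raise ValueError("duplicate ids found")
    return len(ids)
```

Call it with the path of the CSV written by `evaluate_model.py`; it returns the number of prediction rows or raises `ValueError` describing the problem.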

#### Step 5d: Shutdown
```bash
# When done, shutdown Redis
redis-cli shutdown
```

---

## 🖥️ Platform Options for Training

### Option 1: Local Machine (macOS - Current Setup)

**Pros:**
- ✅ No setup required, already have the code
- ✅ Full control over environment
- ✅ No internet dependency during training
- ✅ Easy to iterate and debug

**Cons:**
- ❌ **No CUDA support on macOS** (Apple Silicon GPUs use Metal, while the environment setup targets CUDA)
- ❌ Training will be extremely slow on CPU (days/weeks)
- ❌ Language model inference requires GPU with 12.4GB+ VRAM
- ❌ Large language models (3gram/5gram) require massive RAM

**Verdict:** ❌ **NOT RECOMMENDED** for training. Use this setup only for code development and testing, then train on a GPU-enabled platform.

---

### Option 2: WSL2 (Windows Subsystem for Linux) with NVIDIA GPU

**Pros:**
- ✅ Native Linux environment (Ubuntu 22.04 recommended)
- ✅ Direct GPU access if NVIDIA GPU is available
- ✅ Can run on Windows machine
- ✅ Full control over environment
- ✅ No cloud costs

**Cons:**
- ❌ Requires Windows 11 (or Windows 10 version 21H2 or later) with WSL2
- ❌ Requires NVIDIA GPU with CUDA support
- ❌ Requires an NVIDIA Windows driver with WSL2 support
- ❌ Setup complexity (GPU passthrough)
- ❌ Limited by local hardware resources

**Setup Requirements:**
1. Windows 11 with WSL2 installed
2. NVIDIA GPU with CUDA support (RTX series recommended)
3. NVIDIA drivers for WSL2
4. Ubuntu 22.04 distribution in WSL2

**Estimated Cost:** Free (uses existing hardware)

**Best For:** Users with Windows machine + NVIDIA GPU

**Setup Steps:**
```bash
# In Windows PowerShell (not inside WSL): install WSL2 with Ubuntu 22.04
wsl --install -d Ubuntu-22.04

# On the Windows host: install the NVIDIA GPU driver (it provides WSL2 support)
# Download from: https://www.nvidia.com/Download/index.aspx

# Inside WSL2, install the CUDA toolkit only (do not install a Linux GPU driver in WSL2)
wget https://developer.download.nvidia.com/compute/cuda/repos/wsl-ubuntu/x86_64/cuda-wsl-ubuntu.pin
sudo mv cuda-wsl-ubuntu.pin /etc/apt/preferences.d/cuda-repository-pin-600
wget https://developer.download.nvidia.com/compute/cuda/12.6.0/local_installers/cuda-repo-wsl-ubuntu-12-6-local_12.6.0-1_amd64.deb
sudo dpkg -i cuda-repo-wsl-ubuntu-12-6-local_12.6.0-1_amd64.deb
sudo cp /var/cuda-repo-wsl-ubuntu-12-6-local/cuda-*-keyring.gpg /usr/share/keyrings/
sudo apt-get update
sudo apt-get -y install cuda-toolkit-12-6

# Verify CUDA
nvidia-smi
nvcc --version

# Clone repository and setup
git clone <repo-url>
cd nejm-brain-to-text
./setup.sh
./setup_lm.sh
```

---

### Option 3: Google Colab (Free Tier)

**Pros:**
- ✅ Free GPU access (T4, 16GB VRAM)
- ✅ Pre-configured environment
- ✅ No local setup required
- ✅ Easy to share and collaborate
- ✅ Jupyter notebook interface

**Cons:**
- ❌ **Limited runtime** (12 hours max, then disconnects)
- ❌ **Training may not complete** in one session (~3.5 hours on an RTX 4090; expect considerably longer on a T4)
- ❌ Unstable connection (can disconnect)
- ❌ Limited storage (need to upload/download data)
- ❌ Can't run Redis server easily
- ❌ Difficult to run language model pipeline
- ❌ No guarantee of GPU availability

**Estimated Cost:** Free (with limitations)

**Best For:** Quick experiments, testing code, prototyping

**Setup Steps:**
1. Upload project to Google Drive or GitHub
2. Mount Google Drive in Colab
3. Install dependencies in Colab notebook
4. Run training (may need multiple sessions)

**Limitations:**
- Training time: ~3.5 hours on an RTX 4090, likely much longer on a T4 (session limit: 12 hours max)
- May need to save checkpoints and resume
- Language model evaluation is complex in Colab

**Verdict:** ⚠️ **POSSIBLE BUT CHALLENGING** - Good for prototyping, not ideal for full pipeline

---

### Option 4: Google Colab Pro ($10/month)

**Pros:**
- ✅ More reliable GPU access (T4, A100 options)
- ✅ Longer runtime sessions
- ✅ Better performance
- ✅ Priority access to GPUs

**Cons:**
- ❌ Still has session limits
- ❌ Monthly subscription cost
- ❌ Storage limitations
- ❌ Complex setup for full pipeline

**Estimated Cost:** $10/month

**Best For:** Users who want better Colab experience

**Verdict:** ⚠️ **BETTER THAN FREE TIER** but still has limitations

---

### Option 5: Kaggle Notebooks (Free)

**Pros:**
- ✅ Free GPU access (P100, 16GB VRAM)
- ✅ 30 hours/week GPU time limit
- ✅ Pre-configured environment
- ✅ Competition-specific platform
- ✅ Easy data access
- ✅ Can run for ~9 hours per session

**Cons:**
- ❌ Limited to 30 hours/week GPU time
- ❌ Session limits (~9 hours max)
- ❌ Storage limitations
- ❌ Complex setup for Redis + language model
- ❌ Internet access restrictions

**Estimated Cost:** Free

**Best For:** Competition participants, quick experiments

**Setup Approach:**
1. Upload project as Kaggle dataset
2. Create new notebook with GPU enabled
3. Install dependencies
4. Run training (may need to save/resume)

**Verdict:** ⚠️ **GOOD FOR COMPETITION** but may need multiple sessions for full training

---

### Option 6: AWS EC2 (Recommended)

**Pros:**
- ✅ Full control over environment
- ✅ Choose GPU instance (g4dn, p3, p4d)
- ✅ Can run complete pipeline
- ✅ Persistent storage (EBS)
- ✅ Can run 24/7 if needed
- ✅ Professional setup

**Cons:**
- ❌ Costs money ($0.50-$10+/hour depending on instance)
- ❌ Requires AWS account setup
- ❌ Need to manage instance lifecycle
- ❌ More complex initial setup

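**Estimated Cost:** Hourly price × 3.5 hours (the RTX 4090 training time; slower GPUs such as the T4 will take longer and cost proportionally more)
- **g4dn.xlarge** (T4, 16GB): ~$0.50/hour = **~$1.75 per training run**
- **p3.2xlarge** (V100, 16GB): ~$3.06/hour = **~$10.71 per training run**
- **p4d.24xlarge** (8× A100, 40GB each): ~$32.77/hour = **~$114.70 per training run**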
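The per-run figures are the hourly price multiplied by a 3.5-hour run. The sketch below reproduces that arithmetic and shows how the cost changes if a run takes longer. The 10-hour case is a hypothetical example, and the prices are the approximate values quoted above, which may have changed.

```python
# Approximate on-demand prices quoted above (USD per hour).
HOURLY_PRICE_USD = {
    "g4dn.xlarge": 0.50,
    "p3.2xlarge": 3.06,
    "p4d.24xlarge": 32.77,
}


def cost_per_run(instance, training_hours=3.5):
    """Cost of one training run, ignoring storage and data transfer."""
    return HOURLY_PRICE_USD[instance] * training_hours


for name in HOURLY_PRICE_USD:
    print(f"{name}: ${cost_per_run(name):.2f} at 3.5 h, "
          f"${cost_per_run(name, 10):.2f} at 10 h")
```

Setup time, evaluation, and idle time while the instance is running are billed as well, so actual spend per experiment will be higher than the training time alone suggests.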

**Best For:** Serious training, production workloads

**Recommended Instance:** `g4dn.xlarge` (T4 GPU, 16GB VRAM, sufficient for training)

**Setup Steps:**
```bash
# 1. Launch EC2 instance
# - AMI: Ubuntu 22.04 LTS
# - Instance: g4dn.xlarge (or larger)
# - Storage: 100GB+ (for data and models)
# - Security Group: Allow SSH (port 22)

# 2. Connect via SSH
ssh -i your-key.pem ubuntu@your-instance-ip

# 3. Install NVIDIA drivers and CUDA
sudo apt-get update
sudo apt-get install -y build-essential
wget https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2204/x86_64/cuda-ubuntu2204.pin
sudo mv cuda-ubuntu2204.pin /etc/apt/preferences.d/cuda-repository-pin-600
wget https://developer.download.nvidia.com/compute/cuda/12.6.0/local_installers/cuda-repo-ubuntu2204-12-6-local_12.6.0-1_amd64.deb
sudo dpkg -i cuda-repo-ubuntu2204-12-6-local_12.6.0-1_amd64.deb
sudo cp /var/cuda-repo-ubuntu2204-12-6-local/cuda-*-keyring.gpg /usr/share/keyrings/
sudo apt-get update
sudo apt-get -y install cuda-toolkit-12-6
sudo apt-get -y install nvidia-driver-550

# 4. Install Redis, CMake, GCC
sudo apt-get install -y redis-server build-essential cmake
sudo systemctl disable redis-server

# 5. Install Miniconda
wget https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh
bash Miniconda3-latest-Linux-x86_64.sh -b
source ~/miniconda3/bin/activate

# 6. Clone repository and setup
git clone <repo-url>  # or upload via SCP
cd nejm-brain-to-text
./setup.sh
./setup_lm.sh

# 7. Download data
conda activate b2txt25
python download_data.py

# 8. Train model (use screen or tmux for long-running jobs)
screen -S training
conda activate b2txt25
cd model_training
python train_model.py
# Press Ctrl+A then D to detach

# 9. Monitor training
screen -r training

# 10. Download results when done
# Use SCP to download trained models
scp -i your-key.pem -r ubuntu@your-instance-ip:~/nejm-brain-to-text/trained_models ./
```

**Cost Optimization Tips:**
- Use Spot Instances for substantial savings (often around 70%), but expect possible interruptions and save checkpoints regularly
- Stop instance when not training
- Use smaller instance for evaluation only
- Consider Reserved Instances for long-term use

**Verdict:** ✅ **HIGHLY RECOMMENDED** for serious training

---

### Option 7: Google Cloud Platform (GCP)

**Pros:**
- ✅ Similar to AWS, full control
- ✅ Good GPU options
- ✅ $300 free credits for new users
- ✅ Persistent storage

**Cons:**
- ❌ Costs money after free credits
- ❌ More complex setup
- ❌ Need GCP account

**Estimated Cost:**
- **n1-standard-4 + T4 GPU**: ~$0.35/hour = **~$1.23 per training run**
- **n1-standard-8 + V100**: ~$2.50/hour = **~$8.75 per training run**

**Best For:** Users with GCP credits or preference

**Setup:** Similar to AWS, but using GCP Compute Engine

**Verdict:** ✅ **GOOD ALTERNATIVE TO AWS**

---

### Option 8: Azure ML / Azure Compute

**Pros:**
- ✅ Managed ML platform
- ✅ Good GPU options
- ✅ $200 free credits for new users
- ✅ Integration with ML tools

**Cons:**
- ❌ Costs money
- ❌ More complex setup
- ❌ Need Azure account

**Estimated Cost:** Similar to AWS/GCP

**Verdict:** ✅ **GOOD OPTION** if you prefer the Azure ecosystem

---

### Option 9: Lambda Labs / Vast.ai / RunPod (GPU Rental)

**Pros:**
- ✅ Cheaper than AWS/GCP (often 50-70% less)
- ✅ Pay per hour
- ✅ Good GPU selection
- ✅ Simple setup

**Cons:**
- ❌ Less established providers
- ❌ May be less reliable
- ❌ Need to trust a third party

**Estimated Cost:**
- **RTX 3090 (24GB)**: ~$0.35/hour = **~$1.23 per training run**
- **A100 (40GB)**: ~$1.10/hour = **~$3.85 per training run**

**Best For:** Cost-conscious users

**Verdict:** ✅ **COST-EFFECTIVE OPTION**

---

### Option 10: University/Research Compute Cluster

**Pros:**
- ✅ Often free for students/researchers
- ✅ High-performance GPUs
- ✅ Professional infrastructure
- ✅ Support available

**Cons:**
- ❌ May require approval/access
- ❌ May have usage limits
- ❌ Less control
- ❌ May have queue waiting times

**Best For:** Students with access to university resources

**Verdict:** ✅ **BEST IF AVAILABLE**

---

## 🎯 Platform Recommendation Summary

### For Training (Ranked by Preference):

1. **🥇 University Compute Cluster** (if available)
   - Free, high-performance, professional setup

2. **🥈 AWS EC2 (g4dn.xlarge)**
   - ~$1.75 per training run
   - Full control, reliable, professional

3. **🥉 Lambda Labs / Vast.ai**
   - ~$1.23 per training run (RTX 3090)
   - Cost-effective, good performance

4. **Kaggle Notebooks**
   - Free, but limited to 30 hours/week
   - Good for competition, may need multiple sessions

5. **Google Colab Pro**
   - $10/month, but still has limitations
   - Good for prototyping

6. **WSL2 (if you have NVIDIA GPU)**
   - Free, but requires Windows + NVIDIA GPU
   - Good if hardware is available

7. **Local macOS**
   - ❌ Not recommended (no CUDA support)

---

## 📊 Resource Requirements Summary

### Minimum Requirements:
- **GPU**: NVIDIA GPU with CUDA support (8GB+ VRAM minimum, 16GB+ recommended)
- **RAM**: 16GB minimum (60GB+ for 3gram LM, 300GB+ for 5gram LM)
- **Storage**: 50GB+ for data and models
- **OS**: Ubuntu 22.04 (recommended) or Linux equivalent
- **Training Time**: ~3.5 hours on RTX 4090, longer on slower GPUs

### Recommended Setup:
- **GPU**: RTX 3090/4090, V100, or A100 (16GB+ VRAM)
- **RAM**: 32GB+ (64GB+ for 3gram LM)
- **Storage**: 100GB+ SSD
- **OS**: Ubuntu 22.04 LTS
- **Network**: Stable connection for data download

---

## ⚠️ Important Considerations

1. **Training Interruptions**: 
   - Save checkpoints regularly (configured in `rnn_args.yaml`)
   - Use `screen` or `tmux` for long-running jobs
   - Consider resumable training if interrupted

2. **Data Storage**:
   - Data is ~10GB+ compressed
   - Unzipped data is larger
   - Trained models are several GB
   - Plan for sufficient storage

3. **Language Model Requirements**:
   - OPT 6.7b requires 12.4GB+ VRAM
   - 3gram LM requires ~60GB RAM
   - 5gram LM requires ~300GB RAM
   - May need to use smaller models or upgrade hardware

4. **Redis Server**:
   - Required for language model inference
   - Must run during evaluation
   - Can run on same machine or separate instance

5. **Multiple Environments**:
   - Two conda environments needed (`b2txt25` and `b2txt25_lm`)
   - Different PyTorch versions (training vs. LM)
   - Cannot mix environments
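The memory figures in item 3 can be turned into a simple selection rule. The sketch below picks the largest n-gram model whose approximate RAM requirement fits the machine and reports whether OPT-6.7b fits in GPU memory. The function name `pick_language_model` is hypothetical, the thresholds are the rough figures listed above, and treating the 1gram model as always feasible is an assumption, since no RAM figure is given for it.

```python
# (n-gram order, approximate RAM needed in GB), largest first, from item 3.
NGRAM_RAM_GB = [(5, 300), (3, 60)]
OPT_VRAM_GB = 12.4  # approximate VRAM needed for OPT-6.7b


def pick_language_model(ram_gb, vram_gb):
    """Return (n-gram order, whether OPT-6.7b fits) for the given hardware."""
    order = 1  # assumption: the lightweight 1gram model fits on any machine
    for candidate, needed_gb in NGRAM_RAM_GB:
        if ram_gb >= needed_gb:
            order = candidate
            break
    return order, vram_gb >= OPT_VRAM_GB


print(pick_language_model(ram_gb=64, vram_gb=16))
```

These thresholds leave no headroom for the operating system or other processes, so a machine sitting exactly at a limit may still run out of memory.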

---

## 🚀 Quick Start Recommendation

**For Quick Testing:**
1. Use **Kaggle Notebooks** (free, 30 hours/week)
2. Upload project as dataset
3. Run training in notebook
4. Save checkpoints and download results

**For Serious Training:**
1. Use **AWS EC2 g4dn.xlarge** (~$1.75 per run)
2. Follow AWS setup steps above
3. Use `screen` for long-running training
4. Download results when complete

**For Cost-Conscious:**
1. Use **Lambda Labs** or **Vast.ai** (~$1.23 per run)
2. Similar setup to AWS
3. Monitor usage carefully

---

## ❓ Questions to Clarify

Before proceeding, please confirm:

1. **What is your primary goal?**
   - [ ] Just get baseline model running
   - [ ] Compete in Kaggle competition
   - [ ] Experiment with improvements
   - [ ] Reproduce paper results

2. **What resources do you have access to?**
   - [ ] University compute cluster
   - [ ] Local machine with NVIDIA GPU
   - [ ] AWS/GCP/Azure account
   - [ ] Budget for cloud computing ($1-5 per training run)
   - [ ] Only free options

3. **What is your timeline?**
   - [ ] Need results ASAP
   - [ ] Can wait for free resources
   - [ ] Have weeks/months

4. **What is your experience level?**
   - [ ] Comfortable with Linux/cloud setup
   - [ ] Prefer managed platforms (Colab/Kaggle)
   - [ ] Need step-by-step guidance

5. **Do you need the full pipeline?**
   - [ ] Just training the RNN model
   - [ ] Need language model evaluation too
   - [ ] Need to generate submission file

---

## 📝 Next Steps

Once you've chosen a platform:

1. **Confirm platform choice**
2. **Set up environment** (follow platform-specific steps)
3. **Download and verify data**
4. **Run training** (monitor closely first time)
5. **Evaluate model** (validation set first)
6. **Generate submission** (test set)
7. **Submit to Kaggle**

---

**Document Version:** 1.0  
**Last Updated:** 2025-01-XX  
**Author:** Training Plan Generator

