# FUME-FastSCNN Training Notebook

This notebook trains the FUME-FastSCNN model for dual-gas acidosis detection.

**Model:** FUME-FastSCNN (~1.66M parameters, within the 3M budget)
**Task:** Multi-task (Segmentation + Classification)
**Dataset:** Augmented gas emission dataset (8,967 samples)

## 1. Setup

In [1]:
import sys
sys.path.append('..')  # Add parent directory to path

import torch
import yaml
from pathlib import Path

# Check CUDA
print(f"PyTorch version: {torch.__version__}")
print(f"CUDA available: {torch.cuda.is_available()}")
if torch.cuda.is_available():
    print(f"CUDA device: {torch.cuda.get_device_name(0)}")
    print(f"CUDA version: {torch.version.cuda}")

PyTorch version: 2.8.0+cu128
CUDA available: True
CUDA device: NVIDIA RTX 6000 Ada Generation
CUDA version: 12.8


## 2. Configuration

In [2]:
# Load config
config_path = '../configs/fume_fastscnn_config.yaml'

with open(config_path, 'r') as f:
    config = yaml.safe_load(f)

print("📋 Configuration loaded:")
print(f"  Experiment: {config['experiment']['name']}")
print(f"  Model: {config['model']['name']}")
print(f"  Batch size: {config['training']['batch_size']}")
print(f"  Epochs: {config['training']['num_epochs']}")
print(f"  Learning rate: {config['training']['optimizer']['lr']}")

📋 Configuration loaded:
  Experiment: FUME-FastSCNN
  Model: FUMEFastSCNN
  Batch size: 8
  Epochs: 50
  Learning rate: 0.001
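
The cell above indexes the config with nested keys, so a missing section surfaces only as a bare `KeyError`. Below is a minimal sketch of an up-front check, assuming the config has the nested layout implied by those lookups. `REQUIRED_KEYS` and `missing_keys` are hypothetical names, not part of the project, and the inline dictionary stands in for the loaded YAML.

```python
# Hypothetical helper: not part of the FUME code base.
# Dotted paths mirror the lookups done in the configuration cell.
REQUIRED_KEYS = [
    "experiment.name",
    "model.name",
    "training.batch_size",
    "training.num_epochs",
    "training.optimizer.lr",
]


def missing_keys(config, required=REQUIRED_KEYS):
    """Return the dotted key paths that are absent from a nested dict."""
    missing = []
    for dotted in required:
        node = config
        for part in dotted.split("."):
            if not isinstance(node, dict) or part not in node:
                missing.append(dotted)
                break
            node = node[part]
    return missing


# Stand-in for yaml.safe_load(...), using the values printed above.
example_config = {
    "experiment": {"name": "FUME-FastSCNN"},
    "model": {"name": "FUMEFastSCNN"},
    "training": {"batch_size": 8, "num_epochs": 50, "optimizer": {"lr": 0.001}},
}

print(missing_keys(example_config))  # []
print(missing_keys({"training": {"batch_size": 8}}))
```

Running a check like this right after loading the file makes a misspelled or missing section visible before the trainer is constructed.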


## 3. Data Preparation

**First, run the data pairing script if not already done:**

```bash
cd ../data
python pairing.py
```

In [3]:
# Check if paired annotations exist
import os

required_files = [
    '../data/paired_train_annotations.csv',
    '../data/paired_val_annotations.csv',
    '../data/paired_test_annotations.csv'
]

all_exist = all(os.path.exists(f) for f in required_files)

if all_exist:
    print("✅ All paired annotation files found!")
else:
    print("❌ Paired annotation files not found. Run data/pairing.py first!")
    print("\nRun this command in terminal:")
    print("  cd data && python pairing.py")

✅ All paired annotation files found!
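
When the check fails, it helps to know which split is missing. Here is a small sketch using `pathlib`. `find_missing` is a hypothetical helper, and the demonstration runs against a temporary directory rather than the real `../data` folder.

```python
import tempfile
from pathlib import Path


def find_missing(paths):
    """Return the subset of paths that do not exist on disk, in input order."""
    return [str(p) for p in map(Path, paths) if not p.exists()]


# Demonstration in a throwaway directory (the real files live under ../data).
with tempfile.TemporaryDirectory() as tmp:
    names = [
        "paired_train_annotations.csv",
        "paired_val_annotations.csv",
        "paired_test_annotations.csv",
    ]
    paths = [Path(tmp) / name for name in names]
    paths[0].touch()  # pretend only the training split has been generated

    missing = find_missing(paths)
    for path in missing:
        print(f"Missing: {Path(path).name}")
```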


## 4. Model Overview

In [4]:
from models import FUMEFastSCNN

# Create model
model = FUMEFastSCNN(
    num_classes=3,
    num_seg_classes=3,
    shared_encoder=True
)

# Count parameters
num_params = model.get_num_parameters()
print(f"\nModel Statistics:")
print(f"  Total parameters: {num_params:,}")
print(f"  Parameters (M): {num_params/1e6:.2f}M")
print(f"  Within budget: {'Yes' if num_params < 3e6 else 'No'}")

# Test forward pass (use eval mode for inference testing)
model.eval()
dummy_co2 = torch.randn(1, 1, 480, 640)
dummy_ch4 = torch.randn(1, 1, 480, 640)
dummy_mask = torch.ones(1, 2)

with torch.no_grad():
    outputs = model(dummy_co2, dummy_ch4, dummy_mask)
print(f"\nOutput shapes:")
print(f"  Classification: {outputs['cls_logits'].shape}")
print(f"  CO2 Segmentation: {outputs['co2_seg_logits'].shape}")
print(f"  CH4 Segmentation: {outputs['ch4_seg_logits'].shape}")


Model Statistics:
  Total parameters: 1,656,236
  Parameters (M): 1.66M
  Within budget: Yes

Output shapes:
  Classification: torch.Size([1, 3])
  CO2 Segmentation: torch.Size([1, 3, 480, 640])
  CH4 Segmentation: torch.Size([1, 3, 480, 640])
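
As a rough sanity check on the numbers above, the parameter count can be turned into a budget fraction and a weight-memory estimate. This sketch assumes 4 bytes per parameter (float32 weights) and counts weights only, not activations, gradients, or optimizer state. The 3M budget is the threshold used in the cell above.

```python
NUM_PARAMS = 1_656_236      # total parameters reported above
PARAM_BUDGET = 3_000_000    # threshold from the `num_params < 3e6` check
BYTES_PER_PARAM = 4         # assumption: float32 weights


def budget_fraction(num_params, budget=PARAM_BUDGET):
    """Fraction of the parameter budget that is used."""
    return num_params / budget


def weight_memory_mib(num_params, bytes_per_param=BYTES_PER_PARAM):
    """Memory needed to store the weights alone, in MiB."""
    return num_params * bytes_per_param / 2**20


print(f"Budget used:   {budget_fraction(NUM_PARAMS):.1%}")        # 55.2%
print(f"Weight memory: {weight_memory_mib(NUM_PARAMS):.2f} MiB")  # 6.32 MiB
```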


## 5. Start Training

**Train in the notebook (useful for debugging):**

In [None]:
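# Change to the project root (required for the config's relative paths).
# Run this cell only once per kernel session: each run moves up one directory.
import os

os.chdir('..')
print(f"Working directory: {os.getcwd()}")

# Add the project root as an absolute path so imports do not depend on the
# working directory
sys.path.insert(0, os.getcwd())

from train import Trainer

# Create trainer
trainer = Trainer(config_path='configs/fume_fastscnn_config.yaml')

# Start training
trainer.train()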

Working directory: /home/siu856569517/Taminul/Acidosis/FUME


  import pkg_resources
ERROR:wandb.jupyter:Failed to detect the name of this notebook, you can set it manually with the WANDB_NOTEBOOK_NAME environment variable to enable code saving.


✅ Seed set to 42
🚀 Using device: cuda
✅ Directories created


wandb: Currently logged in as: taminul. Use `wandb login --relogin` to force relogin


INFO:data.dataset:Loaded 4383 paired samples
INFO:data.dataset:  Fully paired: 1893
INFO:data.dataset:  Modality dropout: 0.2
INFO:data.dataset:  Class distribution: {'Acidotic': 2734, 'Healthy': 1488, 'Transitional': 161}
INFO:data.dataset:Loaded 939 paired samples
INFO:data.dataset:  Fully paired: 406
INFO:data.dataset:  Modality dropout: 0.0
INFO:data.dataset:  Class distribution: {'Acidotic': 586, 'Healthy': 318, 'Transitional': 35}


✅ W&B initialized: FUME-Acidosis/FUME-FastSCNN
   Run ID: hs7o9al3
   View at: https://wandb.ai/taminul/FUME-Acidosis/runs/hs7o9al3
✅ Logger initialized
✅ Data loaded: 4383 train, 939 val samples


  self.scaler = torch.cuda.amp.GradScaler() if self.config['training']['use_amp'] else None


✅ Model created: 1,656,236 parameters (1.66M)
✅ Model logged: model (1.66M params)
✅ Loss function initialized
✅ Optimizer and scheduler initialized
✅ Metrics initialized

🚀 Starting Training


  with torch.cuda.amp.autocast(enabled=self.config['training']['use_amp']):
Epoch 1/50: 100%|██████████| 548/548 [05:09<00:00,  1.77it/s, loss=0.284]
Validating: 100%|██████████| 939/939 [00:12<00:00, 77.12it/s]


Epoch 1 completed in 5.36 minutes
Best metric: 0.3339
----------------------------------------------------------------------


Epoch 2/50: 100%|██████████| 548/548 [05:03<00:00,  1.81it/s, loss=0.291]
Validating: 100%|██████████| 939/939 [00:12<00:00, 73.74it/s]


Epoch 2 completed in 5.27 minutes
Best metric: 0.3799
----------------------------------------------------------------------


Epoch 3/50: 100%|██████████| 548/548 [05:07<00:00,  1.78it/s, loss=0.224]
Validating: 100%|██████████| 939/939 [00:12<00:00, 76.44it/s]


Epoch 3 completed in 5.33 minutes
Best metric: 0.3970
----------------------------------------------------------------------


Epoch 4/50: 100%|██████████| 548/548 [04:58<00:00,  1.84it/s, loss=0.241]
Validating: 100%|██████████| 939/939 [00:12<00:00, 76.80it/s]


Epoch 4 completed in 5.18 minutes
Best metric: 0.4016
----------------------------------------------------------------------


Epoch 5/50: 100%|██████████| 548/548 [04:59<00:00,  1.83it/s, loss=0.29]
Validating: 100%|██████████| 939/939 [00:12<00:00, 76.62it/s]



  EPOCH 5 RESULTS

  LOSS
----------------------------------------
  Train Loss                0.4830
  Val Loss                  12.5227

  SEGMENTATION METRICS
----------------------------------------
  mean_iou                  0.7352
  mean_dice                 0.8421
  pixel_accuracy            0.9180

  CLASSIFICATION METRICS
----------------------------------------
  accuracy                  0.4633
  balanced_accuracy         0.3984
  macro_f1                  0.2980
  weighted_f1               0.3994
  cohens_kappa              0.1398

  PER-CLASS F1 SCORES
----------------------------------------
  Healthy                   0.5556
  Transitional              0.0000
  Acidotic                  0.3385

Epoch 5 completed in 5.20 minutes
Best metric: 0.4016
----------------------------------------------------------------------


Epoch 6/50:  82%|████████▏ | 451/548 [04:16<00:42,  2.28it/s, loss=0.351]
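
Two things stand out in the log above: the training split is heavily imbalanced (161 Transitional samples out of 4383), and the Transitional F1 at epoch 5 is 0. The sketch below reproduces the reported macro F1 as the unweighted mean of the per-class F1 scores, then computes inverse-frequency class weights from the logged training distribution. The weighting scheme is one common option and an assumption here; the log does not show whether or how the project's loss function weights classes.

```python
# Training-split class counts, copied from the dataset log above.
train_counts = {"Acidotic": 2734, "Healthy": 1488, "Transitional": 161}

# Per-class F1 scores reported for epoch 5.
epoch5_f1 = {"Healthy": 0.5556, "Transitional": 0.0000, "Acidotic": 0.3385}


def macro_f1(per_class_f1):
    """Unweighted mean of per-class F1 scores."""
    return sum(per_class_f1.values()) / len(per_class_f1)


def inverse_frequency_weights(counts):
    """Weight each class by total / (num_classes * count).

    A perfectly balanced dataset gives every class a weight of 1.0;
    rarer classes get proportionally larger weights.
    """
    total = sum(counts.values())
    num_classes = len(counts)
    return {name: total / (num_classes * n) for name, n in counts.items()}


print(f"macro F1: {macro_f1(epoch5_f1):.4f}")  # 0.2980, matching the log

for name, weight in inverse_frequency_weights(train_counts).items():
    print(f"{name:<13} weight = {weight:.3f}")
```

With these counts the Transitional class receives a weight of about 9.07, against about 0.53 for Acidotic. Weights of this kind could be passed to a weighted cross-entropy loss, if the trainer supports one.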

## 6. Monitor Training

**Weights & Biases Dashboard:**
- View training curves in real-time
- Compare experiments
- Track system metrics

**TensorBoard (if W&B not available):**
```bash
tensorboard --logdir=logs
```

## 7. Load and Test Checkpoint

In [None]:
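# Load best checkpoint. The training cell above calls os.chdir('..'), so the
# path differs depending on whether that cell ran in this kernel session.
candidates = [Path('checkpoints/best_model.pth'), Path('../checkpoints/best_model.pth')]
checkpoint_path = next((p for p in candidates if p.exists()), None)

if checkpoint_path is not None:
    # map_location='cpu' lets the checkpoint load on machines without a GPU
    checkpoint = torch.load(checkpoint_path, map_location='cpu')

    # Load model
    model = FUMEFastSCNN(
        num_classes=3,
        num_seg_classes=3,
        shared_encoder=True
    )
    model.load_state_dict(checkpoint['model_state_dict'])
    model.eval()

    print(f"✅ Loaded checkpoint from epoch {checkpoint['epoch']}")
    print(f"   Best metric: {checkpoint['best_metric']:.4f}")
else:
    print("❌ No checkpoint found. Train the model first!")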

## 8. Training Summary

After training completes, check:
- `checkpoints/best_model.pth` - Best model weights
- `checkpoints/last_model.pth` - Latest model weights
- `logs/` - Training logs
- W&B dashboard - Training curves and metrics

## Next Steps

1. ✅ Training completed
2. 📊 Evaluate on test set (use `test_fume.ipynb`)
3. 📈 Compare with baselines
4. 🔬 Run ablation studies
5. 📝 Generate paper figures