# Plan: Google Brain Ventilator Pressure Prediction

Objectives:
- Build a robust CV and fast baseline quickly, then iterate to medal-level.
- Use GroupKFold by breath_id and fit transforms inside folds only.
- Start with a feature-engineered tree baseline (fast), then a GRU/LSTM 1D deep model if needed.
- Log progress and cache artifacts; keep seeds/folds deterministic.

Environment and Setup Checklist:
- Verify GPU availability via nvidia-smi and torch.cuda.is_available().
- Install PyTorch CUDA 12.1 wheels if needed; prefer XGBoost/CatBoost for GPU-accelerated trees.
- Create a constraints file to lock torch versions.

Data Understanding:
- train.csv/test.csv contain sequences for breaths (breath_id) with time_step, u_in, u_out, R, C.
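- Predict pressure for each time step. The metric listed for this environment is dice-hausdorff-combo, which we treat as a proxy label only; the original competition scored MAE, so MAE is what we optimize locally.
- Key caution: align features per time step. Anything derived from pressure (the target) leaks, and features that look at future time steps are only safe if they are computed identically for train and test.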

Validation Protocol:
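- GroupKFold(n_splits=5) on breath_id with a reproducible split. Older scikit-learn releases of GroupKFold do not offer a shuffle or seed option, so if shuffling is wanted, permute the unique breath_ids with a fixed seed before assigning folds.
- Metric for model selection: MAE on OOF predictions for u_out==0 steps only (the typical competition rule). Also track full-sequence MAE, and compare against the leaderboard metric after submission.
- Ensure no leakage by fitting scalers/encoders inside each fold.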
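The protocol above can be sketched in a few lines. This is a minimal illustration, not the final pipeline: `assign_group_folds` and `masked_mae` are hypothetical helper names, the fold assignment is a hand-rolled shuffled group split (a stand-in for scikit-learn's GroupKFold), and the toy frame only mimics the `breath_id`, `u_out`, and `pressure` columns.

```python
import numpy as np
import pandas as pd

def assign_group_folds(groups, n_splits=5, seed=42):
    """Return one fold index per row; all rows of a group share a fold."""
    groups = np.asarray(groups)
    rng = np.random.default_rng(seed)
    shuffled = rng.permutation(np.unique(groups))
    # Round-robin over the shuffled groups keeps fold sizes balanced.
    fold_of_group = {g: i % n_splits for i, g in enumerate(shuffled)}
    return np.array([fold_of_group[g] for g in groups])

def masked_mae(y_true, y_pred, u_out):
    """MAE restricted to inspiratory steps (u_out == 0)."""
    mask = np.asarray(u_out) == 0
    diff = np.asarray(y_true, dtype=float) - np.asarray(y_pred, dtype=float)
    return float(np.mean(np.abs(diff[mask])))

# Toy data: 10 breaths with 4 steps each (real breaths are ~80 steps).
df = pd.DataFrame({
    "breath_id": np.repeat(np.arange(10), 4),
    "u_out": np.tile([0, 0, 1, 1], 10),
    "pressure": np.linspace(5.0, 20.0, 40),
})
df["fold"] = assign_group_folds(df["breath_id"], n_splits=5, seed=42)
```

Because the seed fixes the permutation, rerunning the assignment reproduces the same folds, and no breath is ever split across train and validation.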

Baseline v1: Feature-Engineered Trees (XGBoost/CatBoost, GPU):
- Features per time step:
  - raw: time_step, u_in, u_out, R, C
  - interaction: R*C, R/C, 1/R, 1/C, RC one-hot
  - lags/leads of u_in only; pressure is the target and is absent from test, so never build lag/lead features from it (cumulative stats of u_in are fine)
  - deltas: du_in, d2u_in; dt (time_step diff), cumulative_time
  - rolling stats per breath: rolling mean/std/min/max of u_in over windows (3,5,7), expanding mean/std
  - cumulative integrals: cumsum(u_in*dt), cumsum(u_in), cummean(u_in), area under curve proxies
  - segment features at inspiratory/expiratory phases using u_out transitions
  - breath-level stats: mean/median/std of u_in, count, unique R,C
- Target transformation: None for trees.
- Train with GPU if available; early stopping; 5-fold OOF; log per-fold times.
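A few of the per-step features listed above could be built with pandas group operations, roughly as follows. This is a sketch under assumptions: `add_breath_features` is a hypothetical helper, the column names follow the data description, and only a subset of the planned features is shown. All features come from `u_in`, `time_step`, `R`, and `C`, never from `pressure`.

```python
import numpy as np
import pandas as pd

def add_breath_features(df):
    """Add per-breath features derived from inputs only (no target columns)."""
    df = df.sort_values(["breath_id", "time_step"]).reset_index(drop=True)
    g_u = df.groupby("breath_id", sort=False)["u_in"]
    g_t = df.groupby("breath_id", sort=False)["time_step"]

    # Deltas; the first step of each breath has no predecessor, so fill with 0.
    df["dt"] = g_t.diff().fillna(0.0)
    df["du_in"] = g_u.diff().fillna(0.0)

    # Lags/leads are shifted within a breath, so they never cross breaths.
    df["u_in_lag1"] = g_u.shift(1).fillna(0.0)
    df["u_in_lead1"] = g_u.shift(-1).fillna(0.0)

    # Cumulative flow and a simple area-under-curve proxy.
    df["u_in_cumsum"] = g_u.cumsum()
    df["area"] = (df["u_in"] * df["dt"]).groupby(df["breath_id"], sort=False).cumsum()

    # Trailing rolling mean over a window of 3 steps.
    df["u_in_roll_mean3"] = g_u.transform(lambda s: s.rolling(3, min_periods=1).mean())

    # Categorical key for the (R, C) lung setting, ready for one-hot encoding.
    df["RC"] = df["R"].astype(str) + "_" + df["C"].astype(str)
    return df

# Toy example with two short breaths.
toy = pd.DataFrame({
    "breath_id": [1, 1, 1, 2, 2, 2],
    "time_step": [0.0, 0.1, 0.2, 0.0, 0.1, 0.2],
    "u_in": [1.0, 2.0, 3.0, 10.0, 20.0, 30.0],
    "R": [5] * 6,
    "C": [10] * 6,
})
feats = add_breath_features(toy)
```

Applying the same function to train and test keeps the lead features symmetric across both sets.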

Deep Model v2: Sequence GRU/1D-CNN (if needed):
- Inputs: standardized sequence features per breath, length ~80 time steps.
- Architecture: 2-3 layer bidirectional GRU + FC head; mask u_out==1 for loss.
- Loss: MAE on masked positions; scheduler + early stopping.
- Quantization trick (optional): classify over unique pressures then refine; consider later.
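The quantization idea relies on pressure taking only a limited set of distinct values in train. A full classification head is out of scope here; the sketch below shows only a simpler, related post-processing step, snapping continuous predictions to the nearest observed pressure value. `snap_to_grid` is a hypothetical helper, and the grid would be assumed to come from something like `np.unique(train["pressure"])`.

```python
import numpy as np

def snap_to_grid(pred, grid):
    """Map each prediction to the closest value in a sorted 1-D grid.

    Assumes the grid is sorted ascending and has at least two entries.
    """
    grid = np.asarray(grid, dtype=float)
    pred = np.asarray(pred, dtype=float)
    # Index of the first grid value >= prediction, clipped so that both
    # neighbours (idx - 1 and idx) are valid positions.
    idx = np.clip(np.searchsorted(grid, pred), 1, len(grid) - 1)
    left, right = grid[idx - 1], grid[idx]
    # Choose whichever neighbour is closer; ties go to the lower value.
    return np.where(pred - left <= right - pred, left, right)

snapped = snap_to_grid([-5.0, 0.4, 0.6, 1.5, 9.0], [0.0, 1.0, 2.0])
```

Whether snapping helps should be checked on OOF MAE before it goes into a submission.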

Iteration Strategy:
1) Sanity: environment + quick EDA (shapes, unique breaths, sequence length).
2) Baseline features + XGBoost GPU, 5-fold OOF. Produce submission.
3) Error analysis on OOF: buckets by R,C; by time position; by u_out; tune features accordingly.
4) If needed, implement GRU and blend with tree model.
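For step 3, the bucketed error analysis might look like the following. This is an illustrative sketch: `mae_by_bucket` is a hypothetical helper, and the `oof` frame is assumed to hold one row per time step with the true `pressure`, the out-of-fold prediction in a `pred` column, and the bucketing columns.

```python
import pandas as pd

def mae_by_bucket(oof, by):
    """Mean absolute OOF error for each combination of the `by` columns."""
    abs_err = (oof["pressure"] - oof["pred"]).abs()
    return oof.assign(abs_err=abs_err).groupby(by)["abs_err"].mean()

# Toy OOF table with made-up values.
oof = pd.DataFrame({
    "R": [5, 5, 20, 20],
    "C": [10, 10, 50, 50],
    "u_out": [0, 1, 0, 0],
    "pressure": [10.0, 12.0, 20.0, 30.0],
    "pred": [11.0, 12.0, 18.0, 27.0],
})
by_rc = mae_by_bucket(oof, ["R", "C"])
# Restrict to inspiratory steps to match the model-selection metric.
by_rc_insp = mae_by_bucket(oof[oof["u_out"] == 0], ["R", "C"])
```

The same helper works for time position or u_out by passing a different `by` list.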

Risk/Checks:
- Consistent fold split saved to disk; reuse across runs.
- Avoid using future info when generating per-step features unless symmetrical windows are allowed in both train/test.
- Ensure deterministic seeds; log config and runtime per fold.
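One way to keep the split consistent across runs is to write the breath-to-fold mapping to disk once and reload it afterwards. The sketch below assumes a JSON file and hypothetical helper names (`seed_everything`, `load_or_create_folds`); seeding of torch or GPU libraries is not covered.

```python
import json
import os
import random

import numpy as np

def seed_everything(seed):
    """Seed the Python and NumPy global RNGs (other libraries need their own calls)."""
    random.seed(seed)
    np.random.seed(seed)

def load_or_create_folds(breath_ids, path, n_splits=5, seed=42):
    """Return {breath_id: fold}; reuse the mapping cached at `path` if present."""
    if os.path.exists(path):
        with open(path) as f:
            # JSON stores keys as strings, so convert them back to int.
            return {int(k): int(v) for k, v in json.load(f).items()}
    rng = np.random.default_rng(seed)
    shuffled = rng.permutation(np.unique(np.asarray(breath_ids)))
    folds = {int(b): int(i % n_splits) for i, b in enumerate(shuffled)}
    with open(path, "w") as f:
        json.dump(folds, f)
    return folds
```

Once the file exists, later calls ignore `seed` and `n_splits` and return the cached mapping, which is the point: every experiment then scores against identical folds.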

Next Steps:
1) Run environment check (nvidia-smi, torch install).
2) Load data, basic EDA, confirm sequence length and breath counts.
3) Implement fold creation and feature pipeline (cached).
4) Train baseline XGBoost (GPU) with early stopping; create submission.
5) Request expert review after baseline OOF.

Questions for Experts:
- Which engineered features historically gave the largest gains for this comp? Any must-haves we missed?
- Preferred CV split (5-fold vs. 10-fold GroupKFold), and should tree models also drop u_out==1 rows during training?
- Tips on ensembling trees with GRU for robust gains vs. overfit?
- Any pitfalls around time_step scaling or leakage to watch closely?

In [None]:
# Environment check: GPU + Torch cu121, quick data sanity
import os, sys, subprocess, shutil, time
from pathlib import Path
import pandas as pd

print('=== nvidia-smi ===', flush=True)
subprocess.run(['bash','-lc','nvidia-smi || true'], check=False)

def pip(*args):
    print('> pip', *args, flush=True)
    subprocess.run([sys.executable, '-m', 'pip', *args], check=True)

torch_ok = False
try:
    import torch
    v = getattr(torch.version, 'cuda', None)
    print('Found torch:', torch.__version__, 'CUDA build:', v, flush=True)
    torch_ok = (v or '').startswith('12.1')
except Exception as e:
    print('Torch import failed:', e, flush=True)
    torch_ok = False

if not torch_ok:
    # Clean any stray torch installs
    for pkg in ('torch','torchvision','torchaudio'):
        subprocess.run([sys.executable, '-m', 'pip', 'uninstall', '-y', pkg], check=False)
    for d in (
        '/app/.pip-target/torch',
        '/app/.pip-target/torchvision',
        '/app/.pip-target/torchaudio',
        '/app/.pip-target/torch-2.8.0.dist-info',
        '/app/.pip-target/torch-2.4.1.dist-info',
        '/app/.pip-target/torchvision-0.23.0.dist-info',
        '/app/.pip-target/torchvision-0.19.1.dist-info',
        '/app/.pip-target/torchaudio-2.8.0.dist-info',
        '/app/.pip-target/torchaudio-2.4.1.dist-info',
        '/app/.pip-target/torchgen',
        '/app/.pip-target/functorch',
    ):
        if os.path.exists(d):
            print('Removing', d, flush=True)
            shutil.rmtree(d, ignore_errors=True)
    # Install exact cu121 stack
    pip('install',
        '--index-url', 'https://download.pytorch.org/whl/cu121',
        '--extra-index-url', 'https://pypi.org/simple',
        'torch==2.4.1', 'torchvision==0.19.1', 'torchaudio==2.4.1')
    Path('constraints.txt').write_text('torch==2.4.1\ntorchvision==0.19.1\ntorchaudio==2.4.1\n')
    import importlib; importlib.invalidate_caches()
    import torch

print('torch:', torch.__version__, 'built CUDA:', getattr(torch.version, 'cuda', None), flush=True)
print('CUDA available:', torch.cuda.is_available(), flush=True)
if torch.cuda.is_available():
    print('GPU:', torch.cuda.get_device_name(0), flush=True)

# Quick data sanity
train_path = 'train.csv'; test_path = 'test.csv'
assert os.path.exists(train_path) and os.path.exists(test_path), 'Missing train/test CSVs'
t0 = time.time()
train_head = pd.read_csv(train_path, nrows=1000)
print('train.csv sample shape:', train_head.shape, 'cols:', list(train_head.columns), flush=True)
print(train_head.head(3))
print('Loaded sample in', round(time.time()-t0,2), 's', flush=True)

=== nvidia-smi ===
Failed to initialize NVML: Unknown Error
Torch import failed: No module named 'torch'
> pip install --index-url https://download.pytorch.org/whl/cu121 --extra-index-url https://pypi.org/simple torch==2.4.1 torchvision==0.19.1 torchaudio==2.4.1
Looking in indexes: https://download.pytorch.org/whl/cu121, https://pypi.org/simple
Collecting torch==2.4.1
  Downloading https://download.pytorch.org/whl/cu121/torch-2.4.1%2Bcu121-cp311-cp311-linux_x86_64.whl (799.0 MB)
