# Plan to Medal: Google Brain Ventilator Pressure Prediction

Objectives:
- Establish GPU-verified environment and robust CV.
- Build fast, strong baseline (XGBoost GPU) with proven ventilator FE.
- Pivot to a BiGRU baseline; blend with trees; apply fold-safe snapping.

Data facts (from prior comp):
- Each breath_id is a sequence of 80 timesteps.
- Inputs: time_step, u_in, u_out, R, C, plus engineered sequence features.
- Target: pressure; evaluation uses only timesteps where u_out == 0 (we predict all steps, but mask metric/loss to u_out==0).

Validation:
- 5-fold GroupKFold by breath_id to avoid leakage; stratify folds by (R, C) distribution if possible.
- Fix seeds; log OOF per fold; save OOF for error analysis.
- Metric: MAE computed only on u_out==0 (masked).

Baseline v1:
- FE: lags/leads of u_in, rolling stats per breath, cumulative integrals (volume via u_in*dt), time diffs, R/C and interaction, step index (t_idx 0..79), RC-aware EWM.
- Model: XGBoost regressor with GPU, objective=reg:squarederror, eval_metric=mae, sample_weight=(u_out==0). Early stopping.
- Post-proc: fold-safe snap to train pressure grid; test snap per-(R,C) grid, optional median filter (window=3) per breath.

Next iterations:
- BiGRU baseline with masked MAE; inputs: physics + dynamics features above.
- Blend BiGRU + XGB (and CatBoost if time); tune weights on OOF; snap after blend.

Discipline:
- Cache features to feather/parquet.
- Heavy jobs: print elapsed per fold; stop if divergence.
- After baseline built, request expert review for FE/model/validation.

Milestones:
1) Env + EDA + CV spec.
2) FE v1 + XGB baseline OOF (masked MAE + fold-safe snap).
3) BiGRU baseline + OOF + snap; blend with XGB.
4) Finalize blend + per-(R,C) snap + median(3) + submission.

In [1]:
import os, sys, time, subprocess, pandas as pd, numpy as np
from pathlib import Path
t0 = time.time()
print('=== Environment check ===', flush=True)
try:
    out = subprocess.run(['bash','-lc','nvidia-smi || true'], capture_output=True, text=True, check=False)
    print(out.stdout.strip()[:2000], flush=True)
except Exception as e:
    print('nvidia-smi error:', e, flush=True)
print('Python', sys.version)
print('Pandas', pd.__version__)

print('\n=== Load data ===', flush=True)
train_path = Path('train.csv')
test_path = Path('test.csv')
assert train_path.exists() and test_path.exists(), 'Missing train/test CSVs'
usecols = None  # load all
train = pd.read_csv(train_path, low_memory=False, usecols=usecols)
test = pd.read_csv(test_path, low_memory=False, usecols=usecols)
print('train shape:', train.shape, 'test shape:', test.shape, flush=True)
print('train columns:', list(train.columns), flush=True)
print('\nDtypes:\n', train.dtypes, flush=True)
print('\nMemory usage (MB):', round(train.memory_usage(deep=True).sum()/1e6, 2), flush=True)

if 'pressure' in train.columns:
    desc = train['pressure'].describe()
    print('\npressure describe:\n', desc, flush=True)
    print('pressure unique approx:', train['pressure'].nunique(), flush=True)

key_cols = [c for c in ['breath_id','R','C','time_step','u_in','u_out'] if c in train.columns]
print('\nKey cols present:', key_cols, flush=True)
if 'breath_id' in train.columns:
    n_breaths = train['breath_id'].nunique()
    lens = train.groupby('breath_id').size().value_counts().head()
    print('breaths:', n_breaths, 'sequence length distribution (top):\n', lens, flush=True)
    # quick check: typical 80 timesteps
    if 80 in train.groupby('breath_id').size().values:
        print('80-timestep sequences detected', flush=True)

print('\nHead:\n', train.head(8), flush=True)
print('\nElapsed:', round(time.time()-t0,2), 's', flush=True)

=== Environment check ===


Wed Sep 24 17:01:11 2025       
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 550.144.06             Driver Version: 550.144.06     CUDA Version: 12.4     |
|-----------------------------------------+------------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|   0  NVIDIA A10-24Q                 On  |   00000002:00:00.0 Off |                    0 |
| N/A   N/A    P0             N/A /  N/A  |     128MiB /  24512MiB |      0%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+
                                                

Python 3.11.0rc1 (main, Aug 12 2022, 10:02:14) [GCC 11.2.0]
Pandas 2.2.2

=== Load data ===


train shape: (5432400, 8) test shape: (603600, 7)


train columns: ['id', 'breath_id', 'R', 'C', 'time_step', 'u_in', 'u_out', 'pressure']



Dtypes:
 id             int64
breath_id      int64
R              int64
C              int64
time_step    float64
u_in         float64
u_out          int64
pressure     float64
dtype: object



Memory usage (MB): 347.67



pressure describe:
 count    5.432400e+06
mean     1.121807e+01
std      8.106474e+00
min     -1.895744e+00
25%      6.329607e+00
50%      7.032628e+00
75%      1.364103e+01
max      6.482099e+01
Name: pressure, dtype: float64


pressure unique approx: 950



Key cols present: ['breath_id', 'R', 'C', 'time_step', 'u_in', 'u_out']


breaths: 67905 sequence length distribution (top):
 80    67905
Name: count, dtype: int64


80-timestep sequences detected



Head:
    id  breath_id  R   C  time_step      u_in  u_out   pressure
0   1      85053  5  10   0.000000  4.174419      0   6.118700
1   2      85053  5  10   0.033812  7.050149      0   5.907794
2   3      85053  5  10   0.067497  7.564931      0   7.313837
3   4      85053  5  10   0.101394  8.103306      0   8.227765
4   5      85053  5  10   0.135344  8.502619      0   9.422901
5   6      85053  5  10   0.169323  8.758625      0  10.899246
6   7      85053  5  10   0.203229  9.110845      0  11.602268
7   8      85053  5  10   0.236715  9.209150      0  13.148915



Elapsed: 2.19 s


In [59]:
import time, numpy as np, pandas as pd, os, gc
from pathlib import Path
t0 = time.time()
print('=== FE v3: expanded physics + dynamics + integrals + breath stats (stabilized) + peak & vol_insp lags ===', flush=True)

# Load (reuse if present)
if 'train' not in globals():
    train = pd.read_csv('train.csv')
if 'test' not in globals():
    test = pd.read_csv('test.csv')

# Concatenate for consistent FE
train['is_train'] = 1
test['is_train'] = 0
test['pressure'] = np.nan  # placeholder to keep cols aligned
df = pd.concat([train, test], axis=0, ignore_index=True)
df = df.sort_values(['breath_id','time_step']).reset_index(drop=True)

# Helpers
grp = df.groupby('breath_id', sort=False)

# Base
df['t_idx'] = grp.cumcount().astype(np.int16)
df['dt'] = grp['time_step'].diff().fillna(0.0).astype(np.float32)
df['t_idx_norm'] = (df['t_idx'] / 79.0).astype(np.float32)
df['RC'] = (df['R'] * df['C']).astype(np.int32)
df['rc_key'] = (df['R'] * 100 + df['C']).astype(np.int32)

# Lags/Leads
for k in [1,2,3,4,5]:
    df[f'u_in_lag{k}'] = grp['u_in'].shift(k).fillna(0.0)
for k in [1,2]:
    df[f'u_in_lead{k}'] = grp['u_in'].shift(-k).fillna(0.0)

# First/second/third diffs
df['du1'] = (df['u_in'] - df['u_in_lag1']).astype(np.float32)
df['du2'] = (df['u_in'] - df['u_in_lag2']).astype(np.float32)
df['du3'] = (df['u_in'] - df['u_in_lag3']).astype(np.float32)

# Rolling stats (window=3) per breath
roll = grp['u_in'].rolling(window=3, min_periods=1)
df['roll_mean3_uin'] = roll.mean().reset_index(level=0, drop=True)
df['roll_std3_uin']  = roll.std().reset_index(level=0, drop=True).fillna(0.0)
df['roll_max3_uin']  = roll.max().reset_index(level=0, drop=True)

# Integrals/areas
df['vol_dt'] = (df['u_in'] * df['dt']).groupby(df['breath_id']).cumsum()
df['u_in_cumsum'] = grp['u_in'].cumsum()
insp_mask = (df['u_out'] == 0).astype(np.float32)
df['vol_insp'] = (df['u_in'] * df['dt'] * insp_mask).groupby(df['breath_id']).cumsum()
df['u_in_cumsum_insp'] = (df['u_in'] * insp_mask).groupby(df['breath_id']).cumsum()

# vol_insp lags and rolling mean
for k in [1,2,3]:
    df[f'vol_insp_lag{k}'] = grp['vol_insp'].shift(k).fillna(0.0)
df['roll_mean3_vol_insp'] = grp['vol_insp'].rolling(window=3, min_periods=1).mean().reset_index(level=0, drop=True)

# Breath stats (broadcast within breath)
b_max = grp['u_in'].transform('max')
b_mean = grp['u_in'].transform('mean')
b_std = grp['u_in'].transform('std').fillna(0.0)
df['u_in_max_breath'] = b_max
df['u_in_mean_breath'] = b_mean
df['u_in_std_breath'] = b_std
end_vol = grp['vol_dt'].transform('last')
df['vol_dt_end_breath'] = end_vol
df['u_in_over_max'] = (df['u_in'] / (b_max + 1e-6)).astype(np.float32)
df['vol_dt_over_end'] = (df['vol_dt'] / (end_vol + 1e-6)).fillna(0.0).astype(np.float32)

# Peak features: idx_peak_uin, u_in_at_peak, dist_to_peak, vol_at_peak
peak_idx_rows = df.loc[grp['u_in'].idxmax(), ['breath_id','t_idx','u_in','vol_insp']]
peak_idx_rows = peak_idx_rows.rename(columns={'t_idx':'idx_peak_uin','u_in':'u_in_at_peak','vol_insp':'vol_at_peak'})
df = df.merge(peak_idx_rows, on='breath_id', how='left')
df['idx_peak_uin'] = df['idx_peak_uin'].astype(np.int16)
df['u_in_at_peak'] = df['u_in_at_peak'].astype(np.float32)
df['vol_at_peak'] = df['vol_at_peak'].astype(np.float32)
df['dist_to_peak'] = (df['t_idx'].astype(np.int16) - df['idx_peak_uin'].astype(np.int16)).astype(np.int16)

# RC/physics + interactions
df['R_term'] = (df['R'].astype(np.float32) * df['u_in'].astype(np.float32))
df['V_term'] = (df['vol_dt'] / df['C'].replace(0, np.nan)).fillna(0.0)

def ewm_rc_group(g):
    u = g['u_in'].to_numpy(dtype=np.float32, copy=False)
    dt = g['dt'].to_numpy(dtype=np.float32, copy=False)
    RC_val = float(g['R'].iloc[0]) * float(g['C'].iloc[0])
    if RC_val == 0:
        RC_val = 1.0
    RC = np.float32(RC_val)
    alpha = 1.0 - np.exp(-dt / RC)
    y = np.empty_like(u, dtype=np.float32)
    prev = np.float32(0.0)
    for i in range(u.shape[0]):
        a = alpha[i]
        prev = a * u[i] + (1.0 - a) * prev
        y[i] = prev
    return pd.Series(y, index=g.index, dtype='float32')
df['ewm_rc'] = grp.apply(ewm_rc_group).reset_index(level=0, drop=True)

# Simple per-breath EWM of u_in (alpha ~0.1)
def ewm_simple_group(g, alpha=0.1):
    u = g['u_in'].to_numpy(dtype=np.float32, copy=False)
    y = np.empty_like(u, dtype=np.float32)
    prev = np.float32(0.0)
    a = np.float32(alpha)
    for i in range(u.shape[0]):
        prev = a * u[i] + (1.0 - a) * prev
        y[i] = prev
    return pd.Series(y, index=g.index, dtype='float32')
df['ewm_simple_uin'] = grp.apply(ewm_simple_group).reset_index(level=0, drop=True)

df['u_in_time'] = (df['u_in'] * df['time_step']).astype(np.float32)
# Patch B: stabilize u_in_dt and du1_dt; clamp to sane range and add diagnostics
dt_eps = 1e-3
dt_arr = df['dt'].to_numpy(dtype=np.float32, copy=False)
dt_safe = np.where(dt_arr > dt_eps, dt_arr, dt_eps).astype(np.float32)
uin = df['u_in'].to_numpy(dtype=np.float32, copy=False)
uin_dt = (uin / dt_safe).astype(np.float32)
uin_dt = np.clip(uin_dt, -2e3, 2e3)
df['u_in_dt'] = uin_dt
du1 = df['du1'].to_numpy(dtype=np.float32, copy=False)
du1_dt = (du1 / dt_safe).astype(np.float32)
du1_dt = np.clip(du1_dt, -2e3, 2e3)
df['du1_dt'] = du1_dt

# Phase/progress
df['breath_progress'] = df['t_idx_norm']
df['u_out_lag1'] = grp['u_out'].shift(1).fillna(0).astype(np.int16)
df['u_out_lead1'] = grp['u_out'].shift(-1).fillna(0).astype(np.int16)
df['insp_step'] = grp['u_out'].apply(lambda s: (~(s.astype(bool))).cumsum()).reset_index(level=0, drop=True).astype(np.int16)
df['insp_max'] = grp['insp_step'].transform('max').replace(0, 1).astype(np.int16)
df['insp_frac'] = (df['insp_step'] / df['insp_max'].replace(0, 1)).astype(np.float32)

# Cast types for memory (safe casts only)
for col in ['t_idx','R','C','RC','rc_key','u_out','u_out_lag1','u_out_lead1','insp_step','insp_max','idx_peak_uin','dist_to_peak']:
    if col in df.columns:
        df[col] = df[col].astype(np.int16)

num_cols = [
    'time_step','u_in','pressure','dt','t_idx_norm','breath_progress',
    'u_in_lag1','u_in_lag2','u_in_lag3','u_in_lag4','u_in_lag5','u_in_lead1','u_in_lead2',
    'du1','du2','du3','du1_dt','roll_mean3_uin','roll_std3_uin','roll_max3_uin',
    'vol_dt','vol_insp','vol_insp_lag1','vol_insp_lag2','vol_insp_lag3','roll_mean3_vol_insp',
    'u_in_cumsum','u_in_cumsum_insp','vol_dt_end_breath',
    'u_in_over_max','vol_dt_over_end','R_term','V_term','ewm_rc','ewm_simple_uin','u_in_time','u_in_dt',
    'u_in_at_peak','vol_at_peak'
]
for col in num_cols:
    if col in df.columns:
        df[col] = pd.to_numeric(df[col], errors='coerce').astype(np.float32)

# Diagnostics for NaN/Inf after FE
num_check_cols = [c for c in df.columns if c not in ['id'] and (np.issubdtype(df[c].dtype, np.number))]
n_nans = 0; n_infs = 0
for c in num_check_cols:
    vals = df[c].to_numpy()
    n_nans += np.isnan(vals).sum()
    n_infs += np.isinf(vals).sum()
print(f'FE diagnostics: total NaNs={int(n_nans)} | Infs={int(n_infs)} across numeric features', flush=True)
if n_infs > 0:
    for c in num_check_cols:
        vals = df[c].to_numpy()
        if np.isinf(vals).any():
            df[c] = np.where(np.isinf(vals), 0.0, vals).astype(np.float32)
if n_nans > 0:
    for c in num_check_cols:
        if df[c].isna().any():
            df[c] = df[c].fillna(0.0).astype(np.float32)

print('FE columns count:', len(df.columns), 'Sample:', [c for c in df.columns if c not in ['id']][:25], flush=True)

# Split back
train_fe = df[df['is_train']==1].copy()
test_fe = df[df['is_train']==0].copy()
train_fe = train_fe.sort_values('id').reset_index(drop=True)
test_fe = test_fe.sort_values('id').reset_index(drop=True)

# Save features to parquet
train_fe_path = Path('train_fe_v3.parquet')
test_fe_path = Path('test_fe_v3.parquet')
train_fe.to_parquet(train_fe_path, index=False)
test_fe.to_parquet(test_fe_path, index=False)
print('Saved:', str(train_fe_path), str(test_fe_path), flush=True)

# Build 5-fold GroupKFold with (R,C) strat if available
from sklearn.model_selection import GroupKFold
try:
    from sklearn.model_selection import StratifiedGroupKFold
    use_sgk = True
except Exception:
    use_sgk = False

breath_df = (train_fe[['breath_id','R','C']].drop_duplicates().reset_index(drop=True))
breath_df['rc_key'] = (breath_df['R']*100 + breath_df['C']).astype(np.int32)
breath_df = breath_df.sample(frac=1.0, random_state=42).reset_index(drop=True)

n_splits = 5
fold_col = np.full(len(breath_df), -1, dtype=np.int8)
if use_sgk:
    sgk = StratifiedGroupKFold(n_splits=n_splits, shuffle=True, random_state=42)
    for k, (_, val_idx) in enumerate(sgk.split(breath_df, y=breath_df['rc_key'], groups=breath_df['breath_id'])):
        fold_col[val_idx] = k
    print('Using StratifiedGroupKFold', flush=True)
else:
    gk = GroupKFold(n_splits=n_splits)
    for k, (_, val_idx) in enumerate(gk.split(breath_df, groups=breath_df['breath_id'])):
        fold_col[val_idx] = k
    print('Using GroupKFold (no strat fallback)', flush=True)

breath_df['fold'] = fold_col
assert (breath_df['fold']>=0).all()
breath_df.to_csv('folds_breath_v3.csv', index=False)
print('Saved folds_breath_v3.csv', flush=True)

# Attach fold to train rows
train_fe = train_fe.merge(breath_df[['breath_id','fold']], on='breath_id', how='left')
train_fe.to_parquet(train_fe_path, index=False)  # overwrite with fold column included
print('Train parquet updated with fold column.', flush=True)

# Cleanup
del df; gc.collect()
print('Done FE v3+. Elapsed:', round(time.time()-t0,2), 's', flush=True)

=== FE v3: expanded physics + dynamics + integrals + breath stats (stabilized) + peak & vol_insp lags ===


  df = pd.concat([train, test], axis=0, ignore_index=True)


  df['ewm_rc'] = grp.apply(ewm_rc_group).reset_index(level=0, drop=True)


  df['ewm_simple_uin'] = grp.apply(ewm_simple_group).reset_index(level=0, drop=True)


FE diagnostics: total NaNs=1207200 | Infs=0 across numeric features


FE columns count: 63 Sample: ['breath_id', 'R', 'C', 'time_step', 'u_in', 'u_out', 'pressure', 'is_train', 't_idx', 'dt', 't_idx_norm', 'RC', 'rc_key', 'u_in_lag1', 'u_in_lag2', 'u_in_lag3', 'u_in_lag4', 'u_in_lag5', 'u_in_lead1', 'u_in_lead2', 'du1', 'du2', 'du3', 'roll_mean3_uin', 'roll_std3_uin']


Saved: train_fe_v3.parquet test_fe_v3.parquet


Using StratifiedGroupKFold


Saved folds_breath_v3.csv


Train parquet updated with fold column.


Done FE v3+. Elapsed: 99.94 s


In [43]:
import numpy as np, pandas as pd, time
from pathlib import Path
from sklearn.metrics import mean_absolute_error

print('=== Fold-safe physics baselines: fix unit mix; vectorized; fit on u_out==0 only ===', flush=True)

tr_path, te_path = Path('train_fe_v3.parquet'), Path('test_fe_v3.parquet')
train_fe = pd.read_parquet(tr_path).sort_values('id').reset_index(drop=True)
test_fe  = pd.read_parquet(te_path).sort_values('id').reset_index(drop=True)

y = train_fe['pressure'].to_numpy(np.float32)
u = train_fe['u_in'].to_numpy(np.float32)
R = train_fe['R'].to_numpy(np.float32)
C = train_fe['C'].to_numpy(np.float32)
w = (train_fe['u_out'].to_numpy()==0).astype(np.float32)
voli = train_fe['vol_insp'].to_numpy(np.float32)  # already cumsum(u_in*dt*mask)

u_te = test_fe['u_in'].to_numpy(np.float32)
R_te = test_fe['R'].to_numpy(np.float32)
C_te = test_fe['C'].to_numpy(np.float32)
voli_te = test_fe['vol_insp'].to_numpy(np.float32)

rc_tr = (train_fe['R'].astype(np.int32)*100 + train_fe['C'].astype(np.int32)).to_numpy()
rc_te = (test_fe['R'].astype(np.int32)*100 + test_fe['C'].astype(np.int32)).to_numpy()
rcs = np.unique(rc_tr)
folds = train_fe['fold'].to_numpy(np.int32)
n_folds = int(folds.max()) + 1

def fit_on_insp(X, y, w):
    m = w > 0
    if m.sum() < 3:
        return np.array([0.,0.,float(y[m].mean()) if m.any() else float(y.mean())], dtype=np.float64)
    beta, *_ = np.linalg.lstsq(X[m].astype(np.float64), y[m].astype(np.float64), rcond=None)
    return beta.astype(np.float64)

def run_wls(x1_tr, x2_tr, x1_te, x2_te, label=''):
    oof = np.zeros_like(y, dtype=np.float32)
    test_fold = np.zeros((len(test_fe), n_folds), dtype=np.float32)
    for k in range(n_folds):
        tr_mask = (folds != k); va_mask = (folds == k)
        betas = {}
        for rc in rcs:
            m = (rc_tr == rc) & tr_mask
            if not np.any(m):
                continue
            X = np.stack([x1_tr[m], x2_tr[m], np.ones(m.sum(), np.float32)], 1)
            beta = fit_on_insp(X, y[m], w[m])
            betas[int(rc)] = beta
        for rc, beta in betas.items():
            a,b,c = [float(t) for t in beta]
            mv = (rc_tr == rc) & va_mask
            if np.any(mv):
                oof[mv] = a*x1_tr[mv] + b*x2_tr[mv] + c
            mt = (rc_te == rc)
            if np.any(mt):
                test_fold[mt, k] = a*x1_te[mt] + b*x2_te[mt] + c
        mae_k = mean_absolute_error(y[va_mask & (w>0)], oof[va_mask & (w>0)])
        print(f'Fold {k} masked MAE{" "+label if label else ""}: {mae_k:.4f}', flush=True)
    mae = mean_absolute_error(y[w>0], oof[w>0])
    return mae, oof, test_fold.mean(1).astype(np.float32)

# Main physics-consistent variant
flow_tr = u * 0.01
flow_te = u_te * 0.01
x1_tr = R * flow_tr
x2_tr = voli / C                  # NOTE: no extra 0.01 here
x1_te = R_te * flow_te
x2_te = voli_te / C_te
print('=== Physics WLS: X=[R*(u_in/100), vol_insp/C, 1] ===')
mae_phys, oof_phys, test_phys = run_wls(x1_tr, x2_tr, x1_te, x2_te, label='(phys)')
print(f'OOF masked MAE (physics): {mae_phys:.4f}', flush=True)

# Alt Kaggle prior (vectorized cumsum, no loops)
for df in (train_fe, test_fe):
    df.sort_values(['breath_id','t_idx'], inplace=True)
    df['flow'] = (df['u_in'].astype(np.float32) * 0.01).astype(np.float32)
    df['vol_cumsum'] = df.groupby('breath_id')['flow'].cumsum().astype(np.float32)
train_fe.sort_values('id', inplace=True); test_fe.sort_values('id', inplace=True)

flow_tr2 = train_fe['flow'].to_numpy(np.float32)
volc_tr  = train_fe['vol_cumsum'].to_numpy(np.float32)
flow_te2 = test_fe['flow'].to_numpy(np.float32)
volc_te  = test_fe['vol_cumsum'].to_numpy(np.float32)

print('=== Alt WLS: X=[flow, vol_cumsum, 1] (no dt) ===')
mae_alt, oof_alt, test_alt = run_wls(flow_tr2, volc_tr, flow_te2, volc_te, label='(alt)')
print(f'OOF masked MAE (alt): {mae_alt:.4f}', flush=True)

# Choose best and save
use_alt = mae_alt < mae_phys
p_train = oof_alt if use_alt else oof_phys
p_test  = test_alt if use_alt else test_phys
train_fe['p_phys'] = p_train.astype(np.float32)
test_fe['p_phys']  = p_test.astype(np.float32)
train_fe.to_parquet(tr_path, index=False)
test_fe.to_parquet(te_path, index=False)
print(f'Saved p_phys (alt_used={use_alt})')

=== Fold-safe physics baselines: fix unit mix; vectorized; fit on u_out==0 only ===


=== Physics WLS: X=[R*(u_in/100), vol_insp/C, 1] ===


Fold 0 masked MAE (phys): 3.2763


Fold 1 masked MAE (phys): 3.2769


Fold 2 masked MAE (phys): 3.2406


Fold 3 masked MAE (phys): 3.2641


Fold 4 masked MAE (phys): 3.2831


OOF masked MAE (physics): 3.2682


=== Alt WLS: X=[flow, vol_cumsum, 1] (no dt) ===
Fold 0 masked MAE (alt): 3.4351


Fold 1 masked MAE (alt): 3.4387


Fold 2 masked MAE (alt): 3.3935


Fold 3 masked MAE (alt): 3.4218


Fold 4 masked MAE (alt): 3.4345


OOF masked MAE (alt): 3.4247


Saved p_phys (alt_used=False)


In [31]:
import os, sys, time, math, gc, shutil, subprocess
from pathlib import Path
import numpy as np
import pandas as pd

print('=== BiGRU Prep v4 rollback: minimal FEATS, exclude integrals from std, target z-score, cosine (no warmup), ReLU head ===', flush=True)

# Install exact cu121 torch stack if not present
try:
    import torch
    import torchvision, torchaudio
    ok = (getattr(torch.version, 'cuda', '') or '').startswith('12.1') and torch.cuda.is_available()
    if not ok:
        raise RuntimeError('Torch CUDA stack mismatch or CUDA not available')
except Exception as e:
    print('Installing PyTorch cu121 ...', e, flush=True)
    for pkg in ('torch','torchvision','torchaudio'):
        subprocess.run([sys.executable, '-m', 'pip', 'uninstall', '-y', pkg], check=False)
    for d in ('/app/.pip-target/torch', '/app/.pip-target/torchvision', '/app/.pip-target/torchaudio',
              '/app/.pip-target/torch-2.4.1.dist-info','/app/.pip-target/torchvision-0.19.1.dist-info','/app/.pip-target/torchaudio-2.4.1.dist-info','/app/.pip-target/torchgen','/app/.pip-target/functorch'):
        if os.path.exists(d):
            shutil.rmtree(d, ignore_errors=True)
    subprocess.run([sys.executable, '-m', 'pip', 'install',
                    '--index-url','https://download.pytorch.org/whl/cu121',
                    '--extra-index-url','https://pypi.org/simple',
                    'torch==2.4.1','torchvision==0.19.1','torchaudio==2.4.1'], check=True)
    import torch
    import torchvision, torchaudio
    print('torch:', torch.__version__, 'cuda build:', getattr(torch.version,'cuda',None), 'CUDA avail:', torch.cuda.is_available(), flush=True)

import torch
from torch import nn
from torch.utils.data import Dataset, DataLoader
from torch.optim.lr_scheduler import CosineAnnealingLR

# ---------------- Data load ----------------
FE_PATH_TRAIN = Path('train_fe_v3.parquet')
FE_PATH_TEST = Path('test_fe_v3.parquet')
assert FE_PATH_TRAIN.exists() and FE_PATH_TEST.exists(), 'Run FE v3 cell first'
train_fe = pd.read_parquet(FE_PATH_TRAIN)
test_fe = pd.read_parquet(FE_PATH_TEST)

# Add p_phys for NN too (R_term + V_term)
train_fe['p_phys'] = (train_fe['R_term'].astype(np.float32) + train_fe['V_term'].astype(np.float32)).astype(np.float32)
test_fe['p_phys']  = (test_fe['R_term'].astype(np.float32) + test_fe['V_term'].astype(np.float32)).astype(np.float32)

train_fe = train_fe.sort_values(['breath_id','t_idx']).reset_index(drop=True)
test_fe = test_fe.sort_values(['breath_id','t_idx']).reset_index(drop=True)

# Minimal stable FEATS (no u_out, no leads)
FEATS = [
    'u_in','dt','t_idx_norm','R','C','RC',
    'u_in_lag1','u_in_lag2','u_in_lag3',
    'du1',
    'vol_dt','vol_insp','u_in_cumsum',
    'R_term','V_term','p_phys',
    'breath_progress','insp_frac'
]
missing = [c for c in FEATS if c not in train_fe.columns]
if missing:
    raise ValueError(f'Missing features: {missing}')

SEQ_LEN = int(train_fe.groupby('breath_id').size().mode().iloc[0])
print('SEQ_LEN:', SEQ_LEN, flush=True)

def make_sequences(df: pd.DataFrame, feats):
    g = df.groupby('breath_id', sort=False)
    breath_ids = g.size().index.to_numpy()
    B = breath_ids.shape[0]
    F = len(feats)
    X = np.zeros((B, SEQ_LEN, F), dtype=np.float32)
    mask = np.zeros((B, SEQ_LEN), dtype=np.float32)
    rc_key = np.zeros(B, dtype=np.int32)
    y = None
    has_y = 'pressure' in df.columns and not df['pressure'].isna().all()
    if has_y:
        y = np.zeros((B, SEQ_LEN), dtype=np.float32)
    for i, (bid, sub) in enumerate(g):
        sub = sub.sort_values('t_idx')
        tlen = len(sub)
        if tlen != SEQ_LEN:
            sub = sub.iloc[:SEQ_LEN]
            tlen = len(sub)
        X[i, :tlen, :] = sub[feats].to_numpy(dtype=np.float32, copy=False)
        mask[i, :tlen] = (sub['u_out'].to_numpy() == 0).astype(np.float32)
        rc_key[i] = (int(sub['R'].iloc[0])*100 + int(sub['C'].iloc[0]))
        if has_y:
            y[i, :tlen] = sub['pressure'].to_numpy(dtype=np.float32, copy=False)
    return X, y, mask, rc_key, breath_ids

X_all, y_all, mask_all, rc_all, bids_all = make_sequences(train_fe, FEATS)
X_test_all, _, mask_test_dummy, rc_test_all, bids_test_all = make_sequences(test_fe, FEATS)
print('Train seq:', X_all.shape, 'Test seq:', X_test_all.shape, flush=True)
print(f'Target stats: min={y_all.min():.2f}, max={y_all.max():.2f}, mean={y_all.mean():.2f}')
print('FEATS used:', len(FEATS))

# Quick sanity checks
m0 = mask_all[0].mean()
print('Sanity: first breath shapes:', X_all[0].shape, y_all[0].shape, mask_all[0].shape, '| mask_mean:', round(float(m0),3), flush=True)
assert X_all.shape[1] == SEQ_LEN and X_test_all.shape[1] == SEQ_LEN
assert np.isfinite(X_all).all() and np.isfinite(y_all).all() and np.isfinite(mask_all).all(), 'NaN/Inf in sequences'

class BreathDataset(Dataset):
    def __init__(self, X, y, mask, idx=None):
        self.X = X
        self.y = y
        self.m = mask
        self.idx = np.arange(X.shape[0], dtype=np.int64) if idx is None else np.asarray(idx, dtype=np.int64)
    def __len__(self): return self.X.shape[0]
    def __getitem__(self, i):
        x = torch.from_numpy(self.X[i])
        m_arr = self.m[i]
        if not isinstance(m_arr, np.ndarray):
            m_arr = np.asarray(m_arr, dtype=np.float32)
        if m_arr.ndim == 0:
            m_arr = np.full((x.shape[0],), float(m_arr), dtype=np.float32)
        elif m_arr.dtype != np.float32:
            m_arr = m_arr.astype(np.float32)
        m = torch.from_numpy(m_arr)
        idx_val = int(self.idx[i])
        if self.y is None:
            return x, m, idx_val
        return x, torch.from_numpy(self.y[i]), m, idx_val

class BiGRUReg(nn.Module):
    def __init__(self, in_dim, hidden=256, layers=3, dropout=0.2, layer_norm=True):
        super().__init__()
        self.gru = nn.GRU(in_dim, hidden, num_layers=layers, batch_first=True, dropout=dropout, bidirectional=True)
        self.ln = nn.LayerNorm(hidden*2) if layer_norm else nn.Identity()
        self.head = nn.Sequential(
            nn.Linear(hidden*2, 256),
            nn.ReLU(),
            nn.Dropout(dropout),
            nn.Linear(256, 1)
        )
    def forward(self, x):
        y, _ = self.gru(x)
        y = self.ln(y)
        out = self.head(y)
        return out.squeeze(-1)

def masked_smooth_l1(pred, target, mask, beta=0.5):
    diff = (pred - target).abs()
    loss = torch.where(diff < beta, 0.5 * (diff ** 2) / beta, diff - 0.5 * beta)
    loss = loss * mask
    denom = mask.sum().clamp_min(1.0)
    return loss.sum() / denom

def train_bigru_cv(seed=42, n_folds=5, batch_size=1536, epochs=25, lr=2e-4, hidden=256, layers=3, dropout=0.2):
    torch.manual_seed(seed); np.random.seed(seed)
    device = 'cuda' if torch.cuda.is_available() else 'cpu'
    print('Device:', device, flush=True)
    folds_df = pd.read_csv('folds_breath_v3.csv')
    fold_map = dict(zip(folds_df['breath_id'].astype(int).values, folds_df['fold'].astype(int).values))
    folds = np.array([fold_map[int(b)] for b in bids_all], dtype=np.int16)

    # Build per-(R,C) grid for snapping
    grid_all = np.unique(train_fe['pressure'].values.astype(np.float32)); grid_all.sort()
    rc_train = (train_fe['R']*100 + train_fe['C']).astype(np.int32)
    rc_press = {}
    for rc, grp in train_fe.groupby(rc_train):
        g = np.unique(grp['pressure'].values.astype(np.float32)); g.sort(); rc_press[int(rc)] = g
    for rc in np.unique(rc_test_all):
        if int(rc) not in rc_press:
            rc_press[int(rc)] = grid_all

    def snap_to_grid(arr, grid):
        idx = np.searchsorted(grid, arr)
        idx0 = np.clip(idx-1, 0, grid.size-1); idx1 = np.clip(idx, 0, grid.size-1)
        left = grid[idx0]; right = grid[idx1]
        return np.where(np.abs(arr-left) <= np.abs(arr-right), left, right).astype(np.float32)

    from scipy.signal import medfilt

    # Exclude discrete/progress/integrals/physics from standardization
    EXCLUDE_STD = set(['R','C','RC','t_idx_norm','breath_progress','insp_frac','vol_dt','vol_insp','u_in_cumsum','V_term','p_phys'])
    cont_idx = [i for i, f in enumerate(FEATS) if f not in EXCLUDE_STD]

    oof = np.zeros_like(y_all, dtype=np.float32)
    test_preds_folds = []

    for k in range(n_folds):
        t_fold = time.time()
        trn_idx = np.where(folds != k)[0]
        val_idx = np.where(folds == k)[0]
        print(f'Fold {k+1}/{n_folds}: train breaths {trn_idx.size} | val breaths {val_idx.size}', flush=True)

        # Fold-safe global standardization on continuous features
        X_tr = X_all[trn_idx].copy(); X_va = X_all[val_idx].copy(); X_te = X_test_all.copy()
        flat_tr = X_tr[:, :, cont_idx].reshape(-1, len(cont_idx))
        mu = flat_tr.mean(axis=0, keepdims=True)
        sd = flat_tr.std(axis=0, keepdims=True) + 1e-6
        print('Std stats (min/max):', float(sd.min()), float(sd.max()), flush=True)
        X_tr[:, :, cont_idx] = (X_tr[:, :, cont_idx] - mu) / sd
        X_va[:, :, cont_idx] = (X_va[:, :, cont_idx] - mu) / sd
        X_te[:, :, cont_idx] = (X_te[:, :, cont_idx] - mu) / sd
        assert np.isfinite(X_tr).all() and np.isfinite(X_va).all() and np.isfinite(X_te).all()

        y_tr = y_all[trn_idx]
        y_va = y_all[val_idx]
        m_tr = mask_all[trn_idx]
        m_va = mask_all[val_idx]

        # Target z-score on masked (u_out==0) timesteps in training fold
        ytr_flat = y_tr.reshape(-1)
        mtr_flat = (m_tr.reshape(-1) > 0)
        tgt_mu = float(ytr_flat[mtr_flat].mean())
        tgt_sd = float(ytr_flat[mtr_flat].std() + 1e-6)
        y_tr_n = (y_tr - tgt_mu) / tgt_sd
        print(f'target sd (masked): {tgt_sd:.3f}', flush=True)

        print('Sanity fold', k, ': X_tr', X_tr.shape, 'y_tr', y_tr.shape, 'mask mean tr', round(float(m_tr.mean()),3), flush=True)
        assert np.isfinite(y_tr).all() and np.isfinite(m_tr).all()

        ds_tr = BreathDataset(X_tr, y_tr_n, m_tr, idx=trn_idx)   # normalized target
        ds_va = BreathDataset(X_va, y_va,   m_va, idx=val_idx)    # raw target for val loss
        dl_tr = DataLoader(ds_tr, batch_size=batch_size, shuffle=True, num_workers=0, pin_memory=True)
        dl_va = DataLoader(ds_va, batch_size=batch_size, shuffle=False, num_workers=0, pin_memory=True)

        model = BiGRUReg(in_dim=len(FEATS), hidden=hidden, layers=layers, dropout=dropout).to(device)
        opt = torch.optim.AdamW(model.parameters(), lr=lr, weight_decay=1e-5)
        sched = CosineAnnealingLR(opt, T_max=epochs, eta_min=2e-5)
        scaler = torch.cuda.amp.GradScaler(enabled=(device=='cuda'))
        best = 1e9; best_state = None; patience = 6; bad=0

        for ep in range(1, epochs+1):
            model.train(); tr_loss=0.0; nsteps=0
            for xb, yb, mb, _idx in dl_tr:
                xb = xb.to(device); yb = yb.to(device); mb = mb.to(device)
                opt.zero_grad(set_to_none=True)
                with torch.amp.autocast('cuda', enabled=(device=='cuda')):
                    pred_n = model(xb)
                    loss = masked_smooth_l1(pred_n, yb, mb, beta=0.5)
                scaler.scale(loss).backward()
                torch.nn.utils.clip_grad_norm_(model.parameters(), 5.0)
                scaler.step(opt); scaler.update()
                tr_loss += loss.item(); nsteps += 1
            model.eval(); va_loss=0.0; vsteps=0
            with torch.no_grad():
                for xb, yb, mb, _idx in dl_va:
                    xb = xb.to(device); yb = yb.to(device); mb = mb.to(device)
                    with torch.amp.autocast('cuda', enabled=(device=='cuda')):
                        pred_n = model(xb)
                        pred = pred_n * tgt_sd + tgt_mu
                        loss = masked_smooth_l1(pred, yb, mb, beta=0.5)
                    va_loss += loss.item(); vsteps += 1
            va = va_loss/max(vsteps,1); tr = tr_loss/max(nsteps,1)
            print(f'Epoch {ep}: tr {tr:.5f} va {va:.5f}', flush=True)
            sched.step()
            if va < best - 1e-5:
                best = va; best_state = {k_:v_.detach().cpu().clone() for k_,v_ in model.state_dict().items()}; bad=0
            else:
                bad += 1
                if bad >= patience:
                    print('Early stop at epoch', ep, flush=True); break

        for k_, v_ in best_state.items():
            model.state_dict()[k_].copy_(v_.to(device))
        model.eval()

        with torch.no_grad():
            for xb, yb, mb, idx_batch in dl_va:
                xb = xb.to(device)
                with torch.amp.autocast('cuda', enabled=(device=='cuda')):
                    pred_n = model(xb).float().cpu().numpy()
                pred = pred_n * tgt_sd + tgt_mu
                oof[idx_batch,:] = pred

        ds_te = BreathDataset(X_te, None, mask_test_dummy, idx=np.arange(X_te.shape[0]))
        dl_te = DataLoader(ds_te, batch_size=batch_size, shuffle=False, num_workers=0, pin_memory=True)
        te_preds = []
        with torch.no_grad():
            for xb, mb, _idx in dl_te:
                xb = xb.to(device)
                with torch.amp.autocast('cuda', enabled=(device=='cuda')):
                    pred_n = model(xb).float().cpu().numpy()
                pred = pred_n * tgt_sd + tgt_mu
                te_preds.append(pred)
        te_pred = np.concatenate(te_preds, axis=0)
        test_preds_folds.append(te_pred.astype(np.float32))

        m_flat = mask_all[val_idx].reshape(-1)
        pred_flat = oof[val_idx].reshape(-1)
        y_flat = y_all[val_idx].reshape(-1)
        mae_raw = np.mean(np.abs(pred_flat[m_flat > 0] - y_flat[m_flat > 0]))
        print(f'Fold {k} raw masked MAE: {mae_raw:.6f} | elapsed {time.time()-t_fold:.1f}s', flush=True)

    test_pred_mean = np.mean(np.stack(test_preds_folds, axis=0), axis=0)

    from scipy.signal import medfilt
    oof_snap = np.zeros_like(oof)
    for k in range(n_folds):
        trn_idx = np.where(folds != k)[0]
        fold_grid = np.unique(y_all[trn_idx].reshape(-1)); fold_grid.sort()
        val_idx = np.where(folds == k)[0]
        for i, bi in enumerate(val_idx):
            pred_b = oof[bi]
            snapped = snap_to_grid(pred_b, fold_grid)
            snapped = np.where(mask_all[bi]>0, medfilt(snapped, 3), snapped)
            oof_snap[bi] = snapped
    m_all = mask_all.reshape(-1) > 0
    mae_oof_raw = np.mean(np.abs(oof.reshape(-1)[m_all] - y_all.reshape(-1)[m_all]))
    mae_oof_snap = np.mean(np.abs(oof_snap.reshape(-1)[m_all] - y_all.reshape(-1)[m_all]))
    print(f'OOF MAE raw: {mae_oof_raw:.6f} | snapped+median3: {mae_oof_snap:.6f}', flush=True)
    np.save('oof_bigru_raw.npy', oof.astype(np.float32))

    test_rows = test_fe.sort_values(['breath_id','t_idx']).reset_index(drop=True)
    pred_rows = np.zeros(test_rows.shape[0], dtype=np.float32)
    start = 0
    for i, bid in enumerate(bids_test_all):
        T = SEQ_LEN
        pred_b = test_pred_mean[i]
        rc = int(test_rows.loc[start, 'R'])*100 + int(test_rows.loc[start, 'C'])
        grid = rc_press.get(rc, grid_all)
        pred_b = snap_to_grid(pred_b, grid)
        m = (test_rows.iloc[start:start+T]['u_out'].to_numpy()==0).astype(np.float32)
        sm = medfilt(pred_b, kernel_size=3)
        pred_b = np.where(m>0, sm, pred_b).astype(np.float32)
        pred_rows[start:start+T] = pred_b
        start += T

    sub = pd.DataFrame({'id': test_rows['id'].to_numpy(), 'pressure': pred_rows})
    sub = sub.sort_values('id').reset_index(drop=True)
    sub.to_csv('submission_nn.csv', index=False)
    np.save('oof_bigru.npy', oof_snap.astype(np.float32))
    print('Saved submission_nn.csv and oof_bigru.npy', flush=True)

    return mae_oof_snap

print('BiGRU prep v4 rollback ready. Run train_bigru_cv(epochs=25, lr=2e-4).', flush=True)

=== BiGRU Prep v4 rollback: minimal FEATS, exclude integrals from std, target z-score, cosine (no warmup), ReLU head ===


SEQ_LEN: 80


Train seq: (67905, 80, 18) Test seq: (7545, 80, 18)


Target stats: min=-1.90, max=64.82, mean=11.22
FEATS used: 18
Sanity: first breath shapes: (80, 18) (80,) (80,) | mask_mean: 0.375


BiGRU prep v4 rollback ready. Run train_bigru_cv(epochs=25, lr=2e-4).


In [32]:
import sys, subprocess, time
print('=== Launch BiGRU CV v4 (fold-safe std, SmoothL1, Cosine, target z-score) ===', flush=True)
try:
    import scipy
except Exception:
    print('Installing scipy for median filter...', flush=True)
    subprocess.run([sys.executable, '-m', 'pip', 'install', 'scipy'], check=True)

t0 = time.time()
mae = train_bigru_cv(
    seed=42,
    n_folds=5,
    batch_size=1536,
    epochs=25,
    lr=2e-4,
    hidden=256,
    layers=3,
    dropout=0.2
)
print(f'BiGRU OOF MAE (snapped+median3): {mae:.6f} | Elapsed: {time.time()-t0:.1f}s', flush=True)

=== Launch BiGRU CV v4 (fold-safe std, SmoothL1, Cosine, target z-score) ===


Device: cuda


Fold 1/5: train breaths 54324 | val breaths 13581


Std stats (min/max): 0.0038702357560396194 329.40277099609375


target sd (masked): 9.244


Sanity fold 0 : X_tr (54324, 80, 18) y_tr (54324, 80) mask mean tr 0.38


  scaler = torch.cuda.amp.GradScaler(enabled=(device=='cuda'))


Epoch 1: tr 0.25111 va 2.35440


Epoch 2: tr 0.11760 va 1.71015


Epoch 3: tr 0.08876 va 1.54783


Epoch 4: tr 0.07374 va 1.28922


Epoch 5: tr 0.07054 va 1.35449


Epoch 6: tr 0.06240 va 1.31222


Epoch 7: tr 0.06121 va 1.23026


Epoch 8: tr 0.05601 va 1.15984


Epoch 9: tr 0.05338 va 1.16836


Epoch 10: tr 0.05038 va 1.07716


Epoch 11: tr 0.04849 va 1.16266


Epoch 12: tr 0.04708 va 1.03108


Epoch 13: tr 0.04575 va 1.03394


Epoch 14: tr 0.04491 va 1.02620


Epoch 15: tr 0.04311 va 0.98272


Epoch 16: tr 0.04229 va 0.99538


Epoch 17: tr 0.04132 va 0.95885


Epoch 18: tr 0.04070 va 0.95036


Epoch 19: tr 0.03979 va 0.95623


Epoch 20: tr 0.03933 va 0.93660


Epoch 21: tr 0.03888 va 0.92711


Epoch 22: tr 0.03838 va 0.93270


Epoch 23: tr 0.03784 va 0.91739


Epoch 24: tr 0.03770 va 0.92008


Epoch 25: tr 0.03735 va 0.91086


Fold 0 raw masked MAE: 1.131406 | elapsed 97.3s


Fold 2/5: train breaths 54324 | val breaths 13581


Std stats (min/max): 0.0038637558463960886 330.52703857421875


target sd (masked): 9.251


Sanity fold 1 : X_tr (54324, 80, 18) y_tr (54324, 80) mask mean tr 0.38


Epoch 1: tr 0.23392 va 2.29220


Epoch 2: tr 0.11163 va 1.63146


Epoch 3: tr 0.08389 va 1.49932


Epoch 4: tr 0.07699 va 1.96965


Epoch 5: tr 0.07729 va 1.51499


Epoch 6: tr 0.06247 va 1.34346


Epoch 7: tr 0.06141 va 1.38116


Epoch 8: tr 0.06145 va 1.38495


Epoch 9: tr 0.05292 va 1.16495


Epoch 10: tr 0.04959 va 1.04130


Epoch 11: tr 0.04664 va 1.02150


Epoch 12: tr 0.04657 va 1.00516


Epoch 13: tr 0.04591 va 1.11673


Epoch 14: tr 0.04260 va 1.05561


Epoch 15: tr 0.04146 va 0.95469


Epoch 16: tr 0.03949 va 0.92210


Epoch 17: tr 0.03863 va 0.91607


Epoch 18: tr 0.03766 va 0.89999


Epoch 19: tr 0.03678 va 0.90474


Epoch 20: tr 0.03630 va 0.87902


Epoch 21: tr 0.03581 va 0.91834


Epoch 22: tr 0.03517 va 0.89966


Epoch 23: tr 0.03469 va 0.87335


Epoch 24: tr 0.03440 va 0.86371


Epoch 25: tr 0.03428 va 0.86271


Fold 1 raw masked MAE: 1.082569 | elapsed 97.8s


Fold 3/5: train breaths 54324 | val breaths 13581


Std stats (min/max): 0.003875849535688758 329.597900390625


target sd (masked): 9.251


Sanity fold 2 : X_tr (54324, 80, 18) y_tr (54324, 80) mask mean tr 0.38


Epoch 1: tr 0.23222 va 2.23736


Epoch 2: tr 0.12628 va 1.80180


Epoch 3: tr 0.09223 va 1.55313


Epoch 4: tr 0.07826 va 1.32470


Epoch 5: tr 0.08966 va 2.02992


Epoch 6: tr 0.07517 va 1.23903


Epoch 7: tr 0.06085 va 1.40576


Epoch 8: tr 0.06209 va 1.24813


Epoch 9: tr 0.05214 va 1.10024


Epoch 10: tr 0.04739 va 1.05085


Epoch 11: tr 0.04576 va 1.13903


Epoch 12: tr 0.04455 va 1.02279


Epoch 13: tr 0.04223 va 0.97764


Epoch 14: tr 0.04100 va 0.99451


Epoch 15: tr 0.03941 va 0.93780


Epoch 16: tr 0.03782 va 0.94641


Epoch 17: tr 0.03916 va 0.90830


Epoch 18: tr 0.03643 va 0.88828


Epoch 19: tr 0.03579 va 0.90165


Epoch 20: tr 0.03540 va 0.90010


Epoch 21: tr 0.03494 va 0.86384


Epoch 22: tr 0.03455 va 0.85456


Epoch 23: tr 0.03396 va 0.86095


Epoch 24: tr 0.03371 va 0.85697


Epoch 25: tr 0.03340 va 0.86044


Fold 2 raw masked MAE: 1.072442 | elapsed 98.5s


Fold 4/5: train breaths 54324 | val breaths 13581


Std stats (min/max): 0.0038729801308363676 329.4145812988281


target sd (masked): 9.236


Sanity fold 3 : X_tr (54324, 80, 18) y_tr (54324, 80) mask mean tr 0.38


Epoch 1: tr 0.25335 va 2.27817


Epoch 2: tr 0.11098 va 1.69992


Epoch 3: tr 0.08498 va 1.48975


Epoch 4: tr 0.07779 va 1.42073


Epoch 5: tr 0.07646 va 1.41307


Epoch 6: tr 0.06456 va 1.25663


Epoch 7: tr 0.06521 va 1.25529


Epoch 8: tr 0.05634 va 1.15209


Epoch 9: tr 0.05427 va 1.31013


Epoch 10: tr 0.05163 va 1.37395


Epoch 11: tr 0.04837 va 1.04527


Epoch 12: tr 0.04639 va 1.05161


Epoch 13: tr 0.04540 va 1.17564


Epoch 14: tr 0.04389 va 1.00157


Epoch 15: tr 0.04303 va 0.99547


Epoch 16: tr 0.04154 va 1.01512


Epoch 17: tr 0.04079 va 0.97526


Epoch 18: tr 0.04025 va 1.03295


Epoch 19: tr 0.03945 va 0.93208


Epoch 20: tr 0.03850 va 0.93272


Epoch 21: tr 0.03794 va 0.94885


Epoch 22: tr 0.03745 va 0.94881


Epoch 23: tr 0.03716 va 0.91811


Epoch 24: tr 0.03678 va 0.89953


Epoch 25: tr 0.03639 va 0.91050


Fold 3 raw masked MAE: 1.120140 | elapsed 99.0s


Fold 5/5: train breaths 54324 | val breaths 13581


Std stats (min/max): 0.0038754011038690805 330.1415100097656


target sd (masked): 9.241


Sanity fold 4 : X_tr (54324, 80, 18) y_tr (54324, 80) mask mean tr 0.38


Epoch 1: tr 0.25844 va 2.49062


Epoch 2: tr 0.12085 va 2.02166


Epoch 3: tr 0.09547 va 1.72610


Epoch 4: tr 0.07751 va 1.46919


Epoch 5: tr 0.07380 va 1.25820


Epoch 6: tr 0.06175 va 1.26759


Epoch 7: tr 0.05857 va 1.23203


Epoch 8: tr 0.05364 va 1.21241


Epoch 9: tr 0.05226 va 1.04517


Epoch 10: tr 0.04859 va 1.09289


Epoch 11: tr 0.04915 va 1.21394


Epoch 12: tr 0.04736 va 1.01510


Epoch 13: tr 0.04298 va 0.98517


Epoch 14: tr 0.04235 va 0.98192


Epoch 15: tr 0.03949 va 0.92632


Epoch 16: tr 0.03950 va 1.01942


Epoch 17: tr 0.03805 va 0.94003


Epoch 18: tr 0.03715 va 0.92626


Epoch 19: tr 0.03694 va 0.92181


Epoch 20: tr 0.03652 va 0.87615


Epoch 21: tr 0.03574 va 0.88280


Epoch 22: tr 0.03496 va 0.87973


Epoch 23: tr 0.03476 va 0.88710


Epoch 24: tr 0.03456 va 0.86591


Epoch 25: tr 0.03410 va 0.87326


Fold 4 raw masked MAE: 1.085498 | elapsed 99.5s


OOF MAE raw: 1.098407 | snapped+median3: 1.182177


Saved submission_nn.csv and oof_bigru.npy


BiGRU OOF MAE (snapped+median3): 1.182177 | Elapsed: 499.0s


In [49]:
import time, sys, subprocess, numpy as np, pandas as pd, os
from pathlib import Path
print('=== Blend XGB + BiGRU (OOF-tuned), then per-(R,C) snap + median(3) ===', flush=True)

# Ensure scipy for median filter
try:
    from scipy.signal import medfilt
except Exception:
    print('Installing scipy...', flush=True)
    subprocess.run([sys.executable, '-m', 'pip', 'install', 'scipy'], check=True)
    from scipy.signal import medfilt

sub_xgb_path = Path('submission.csv')            # from XGB
sub_nn_path = Path('submission_nn.csv')          # from BiGRU
oof_xgb_path = Path('oof_xgb.npy')
# Prefer raw NN OOF if available
oof_nn_raw = Path('oof_bigru_raw.npy')
oof_nn_snap = Path('oof_bigru.npy')
oof_nn_path = oof_nn_raw if oof_nn_raw.exists() else oof_nn_snap

# Tune weight on OOF if both present
best_w = 0.7
if oof_xgb_path.exists() and oof_nn_path.exists():
    print(f'Tuning blend weight on OOF using {oof_nn_path.name} ...', flush=True)
    oof_x_id = np.load(oof_xgb_path).astype(np.float32)  # id-order
    oof_n = np.load(oof_nn_path).astype(np.float32)      # breath-major [B,80]
    # Load train in both orders to align OOFs
    tr_breath = pd.read_parquet('train_fe_v3.parquet').sort_values(['breath_id','t_idx']).reset_index(drop=True)
    tr_id = pd.read_parquet('train_fe_v3.parquet').sort_values('id').reset_index(drop=True)
    assert len(tr_breath) == len(tr_id) == oof_x_id.shape[0], 'OOF length mismatch'
    # Reorder XGB OOF from id-order to breath-order
    id_to_pos = dict(zip(tr_id['id'].to_numpy(), np.arange(len(tr_id), dtype=np.int64)))
    idx_breath_order = np.array([id_to_pos[i] for i in tr_breath['id'].to_numpy()], dtype=np.int64)
    oof_x = oof_x_id[idx_breath_order]
    mask = (tr_breath['u_out'].to_numpy()==0)
    y_true = tr_breath['pressure'].to_numpy(dtype=np.float32, copy=False)
    # Flatten breath-major NN OOF to row order (breath-order)
    oof_n_flat = np.zeros_like(y_true, dtype=np.float32)
    start = 0
    for i, (bid, g) in enumerate(tr_breath.groupby('breath_id', sort=False)):
        L = len(g)
        oof_n_flat[start:start+L] = oof_n[i, :L]
        start += L
    ws = np.linspace(0.0, 1.0, 21)
    best_mae = 1e9
    for w in ws:
        pred = w*oof_n_flat + (1.0-w)*oof_x
        mae = np.mean(np.abs(pred[mask]-y_true[mask]))
        if mae < best_mae:
            best_mae, best_w = mae, float(w)
    print(f'Best OOF weight: w_nn={best_w:.2f} -> MAE={best_mae:.6f}', flush=True)
else:
    print('OOF not available for both models; using default w_nn=0.7', flush=True)

assert sub_xgb_path.exists(), 'submission.csv (XGB) not found'
while not sub_nn_path.exists():
    print('Waiting for submission_nn.csv ...', flush=True); time.sleep(10)

sub_xgb = pd.read_csv(sub_xgb_path)
sub_nn = pd.read_csv(sub_nn_path)
assert sub_xgb.shape == sub_nn.shape, 'Submissions shape mismatch'
sub = sub_xgb.merge(sub_nn, on='id', suffixes=('_xgb','_nn'))

# Blend with tuned weight
w_nn = best_w; w_xgb = 1.0 - best_w
sub['pressure_blend'] = (w_xgb*sub['pressure_xgb'] + w_nn*sub['pressure_nn']).astype(np.float32)

# Load FE/test info for per-breath smoothing and (R,C) snapping (v3 files)
test_fe = pd.read_parquet('test_fe_v3.parquet').sort_values(['breath_id','t_idx']).reset_index(drop=True)
train_fe = pd.read_parquet('train_fe_v3.parquet')

# Build per-(R,C) pressure grids from full train
grid_all = np.unique(train_fe['pressure'].values.astype(np.float32)); grid_all.sort()
rc_train = (train_fe['R']*100 + train_fe['C']).astype(np.int32)
rc_press = {}
for rc, grp in train_fe.groupby(rc_train):
    g = np.unique(grp['pressure'].values.astype(np.float32)); g.sort(); rc_press[int(rc)] = g

def snap_to_grid(arr, grid):
    idx = np.searchsorted(grid, arr)
    idx0 = np.clip(idx-1, 0, grid.size-1); idx1 = np.clip(idx, 0, grid.size-1)
    left = grid[idx0]; right = grid[idx1]
    return np.where(np.abs(arr-left) <= np.abs(arr-right), left, right).astype(np.float32)

# Attach blend to test rows, then per-breath median(3) on u_out==0 and per-(R,C) snap
df = test_fe[['id','breath_id','t_idx','u_out','R','C']].copy()
df = df.merge(sub[['id','pressure_blend']], on='id', how='left')
assert df['pressure_blend'].notna().all(), 'Missing blended pressures after merge'

out_vals = np.zeros(len(df), dtype=np.float32)
start = 0
for bid, g in df.groupby('breath_id', sort=False):
    g = g.sort_values('t_idx')
    vals = g['pressure_blend'].to_numpy(dtype=np.float32, copy=False)
    mask_b = (g['u_out'].to_numpy()==0).astype(np.float32)
    rc = int(g['R'].iloc[0])*100 + int(g['C'].iloc[0])
    grid = rc_press.get(rc, grid_all)
    vals = snap_to_grid(vals, grid)
    sm = medfilt(vals, kernel_size=3)
    vals = np.where(mask_b>0, sm, vals).astype(np.float32)
    out_vals[start:start+len(g)] = vals
    start += len(g)

blend_sub = pd.DataFrame({'id': df['id'].to_numpy(), 'pressure': out_vals})
blend_sub = blend_sub.sort_values('id').reset_index(drop=True)
blend_sub.to_csv('submission_blend.csv', index=False)
blend_sub.to_csv('submission.csv', index=False)
print(f'Saved submission_blend.csv and updated submission.csv (w_nn={w_nn:.2f}, w_xgb={1.0-w_nn:.2f})', flush=True)

=== Blend XGB + BiGRU (OOF-tuned), then per-(R,C) snap + median(3) ===


Tuning blend weight on OOF using oof_bigru_raw.npy ...


Best OOF weight: w_nn=0.00 -> MAE=0.301180


Saved submission_blend.csv and updated submission.csv (w_nn=0.00, w_xgb=1.00)


In [38]:
import pandas as pd, numpy as np
from pathlib import Path
print('=== DEBUG: Verify u_in integrity between raw train.csv and FE v3 ===', flush=True)
raw = pd.read_csv('train.csv', usecols=['id','u_in'])
fe = pd.read_parquet('train_fe_v3.parquet', columns=['id','u_in'])
print('Shapes:', raw.shape, fe.shape, flush=True)
raw = raw.sort_values('id').reset_index(drop=True)
fe = fe.sort_values('id').reset_index(drop=True)
assert (raw['id'].values == fe['id'].values).all(), 'ID misalignment between raw and FE'
diff = (raw['u_in'].astype(np.float32).values - fe['u_in'].astype(np.float32).values)
abs_diff = np.abs(diff)
print('Raw u_in stats: min/max', float(raw['u_in'].min()), float(raw['u_in'].max()))
print('FE  u_in stats: min/max', float(fe['u_in'].min()), float(fe['u_in'].max()))
print('Diff stats: max', float(abs_diff.max()), 'mean', float(abs_diff.mean()), '95th', float(np.quantile(abs_diff, 0.95)))
idx_bad = np.where(abs_diff > 1e-5)[0]
print('Mismatched rows:', idx_bad.size)
if idx_bad.size > 0:
    sample = idx_bad[:10]
    print('Sample mismatches:\n', pd.DataFrame({
        'id': raw['id'].iloc[sample].values,
        'u_in_raw': raw['u_in'].iloc[sample].values,
        'u_in_fe': fe['u_in'].iloc[sample].values,
        'abs_diff': abs_diff[sample]
    }))

=== DEBUG: Verify u_in integrity between raw train.csv and FE v3 ===


Shapes: (5432400, 2) (5432400, 2)


Raw u_in stats: min/max 0.0 100.0
FE  u_in stats: min/max 0.0 100.0
Diff stats: max 0.0 mean 0.0 95th 0.0
Mismatched rows: 0


In [42]:
import pandas as pd, numpy as np
from pathlib import Path
print('=== DEBUG: Recompute vol_insp from raw and compare to FE v3 ===', flush=True)
# Load raw and FE, align by id
raw = pd.read_csv('train.csv', usecols=['id','breath_id','time_step','u_in','u_out'])
fe = pd.read_parquet('train_fe_v3.parquet', columns=['id','breath_id','t_idx','vol_insp'])
raw = raw.sort_values(['breath_id','time_step']).reset_index(drop=True)

# Compute dt per breath and inspiration-only integral
grp = raw.groupby('breath_id', sort=False)
dt = grp['time_step'].diff().fillna(0.0).astype(np.float32)
vol_insp_re = (raw['u_in'].astype(np.float32) * dt * (raw['u_out'].values==0).astype(np.float32))
vol_insp_re = vol_insp_re.groupby(raw['breath_id']).cumsum().astype(np.float32)

# Attach recomputed to raw ids order
raw_comp = raw[['id']].copy(); raw_comp['vol_insp_re'] = vol_insp_re.values.astype(np.float32)
raw_comp = raw_comp.sort_values('id').reset_index(drop=True)
fe_sorted = fe.sort_values('id').reset_index(drop=True)
assert (raw_comp['id'].values == fe_sorted['id'].values).all(), 'ID misalignment'

diff = (fe_sorted['vol_insp'].astype(np.float32).values - raw_comp['vol_insp_re'].values)
abs_diff = np.abs(diff)
print('vol_insp FE vs recomputed | max abs diff:', float(abs_diff.max()), 'mean abs diff:', float(abs_diff.mean()),
      'p95:', float(np.quantile(abs_diff, 0.95)), flush=True)
print('vol_insp ranges | FE min/max:', float(fe_sorted['vol_insp'].min()), float(fe_sorted['vol_insp'].max()),
      '| recomputed min/max:', float(raw_comp['vol_insp_re'].min()), float(raw_comp['vol_insp_re'].max()))

# Spot-check a breath with largest discrepancy
idx_bad = np.argmax(abs_diff)
bid_bad = int(fe_sorted.loc[idx_bad, 'breath_id'])
print('Worst breath_id:', bid_bad, 'sample compare (first 10 rows by t_idx):', flush=True)
fe_b = fe[fe['breath_id']==bid_bad].sort_values('t_idx')
raw_b = raw[raw['breath_id']==bid_bad].sort_values('time_step')
print(pd.DataFrame({
    'id': fe_b['id'].head(10).values,
    't_idx': fe_b['t_idx'].head(10).values,
    'vol_insp_FE': fe_b['vol_insp'].head(10).values,
    'vol_insp_re': raw_b['u_in'].astype(np.float32).head(10).values * 0 + raw_b['u_in'].head(10).values  # placeholder to show alignment
}))

=== DEBUG: Recompute vol_insp from raw and compare to FE v3 ===


vol_insp FE vs recomputed | max abs diff: 7.62939453125e-06 mean abs diff: 8.23157577656275e-08 p95: 4.76837158203125e-07


vol_insp ranges | FE min/max: 0.0 86.46919250488281 | recomputed min/max: 0.0 86.46919250488281
Worst breath_id: 3230 sample compare (first 10 rows by t_idx):


     id  t_idx  vol_insp_FE  vol_insp_re
0  4321      0     0.000000        100.0
1  4322      1     3.192830        100.0
2  4323      2     6.383181        100.0
3  4324      3     9.572363        100.0
4  4325      4    12.868190        100.0
5  4326      5    16.054226        100.0
6  4327      6    19.248676        100.0
7  4328      7    22.425915        100.0
8  4329      8    25.711609        100.0
9  4330      9    28.886341        100.0


In [44]:
import numpy as np, pandas as pd
from pathlib import Path
from sklearn.metrics import mean_absolute_error
print('=== Diagnostics: inspect shapes, mask counts, correlations, and betas per (R,C) ===', flush=True)
tr_path = Path('train_fe_v3.parquet'); te_path = Path('test_fe_v3.parquet')
train_fe = pd.read_parquet(tr_path).sort_values('id').reset_index(drop=True)

y = train_fe['pressure'].to_numpy(np.float32)
u = train_fe['u_in'].to_numpy(np.float32)
R = train_fe['R'].to_numpy(np.float32)
C = train_fe['C'].to_numpy(np.float32)
w = (train_fe['u_out'].to_numpy()==0).astype(np.float32)
voli = train_fe['vol_insp'].to_numpy(np.float32)
folds = train_fe['fold'].to_numpy(np.int32)

flow = u * 0.01
x1 = R * flow
x2 = voli / C
print('Shapes:', x1.shape, x2.shape, y.shape, w.shape, flush=True)
print('Mask fraction (u_out==0):', float(w.mean()), flush=True)
m_insp = w > 0
def safe_corr(a,b):
    a = a[m_insp].astype(np.float64); b = b[m_insp].astype(np.float64)
    if a.size < 3: return np.nan
    a = (a - a.mean())/(a.std()+1e-9); b = (b - b.mean())/(b.std()+1e-9)
    return float(np.mean(a*b))
print('Corr(y,x1) insp:', safe_corr(y, x1), 'Corr(y,x2) insp:', safe_corr(y, x2), flush=True)
print('Ranges: flow[min,max]=', float(flow.min()), float(flow.max()), ' x1[min,max]=', float(x1.min()), float(x1.max()),
      ' voli[min,max]=', float(voli.min()), float(voli.max()), ' x2[min,max]=', float(x2.min()), float(x2.max()), flush=True)

rc_tr = (train_fe['R'].astype(np.int32)*100 + train_fe['C'].astype(np.int32)).to_numpy()
rcs = np.unique(rc_tr)
print('Unique RCs:', rcs.tolist(), flush=True)

def fit_beta(X, y, w):
    m = w > 0
    if m.sum() < 3:
        return np.array([0.,0.,float(y[m].mean()) if m.any() else float(y.mean())], dtype=np.float64)
    return np.linalg.lstsq(X[m].astype(np.float64), y[m].astype(np.float64), rcond=None)[0].astype(np.float64)

print('--- Global fit (no folds), per RC, physics X=[x1,x2,1] ---', flush=True)
betas = {}
for rc in rcs[:9]:
    m = (rc_tr == rc)
    X = np.stack([x1[m], x2[m], np.ones(m.sum(), np.float32)], 1)
    b = fit_beta(X, y[m], w[m])
    betas[int(rc)] = b
    mae_rc = mean_absolute_error(y[m & (w>0)], (X @ b).astype(np.float64)[w[m]>0]) if (w[m]>0).any() else np.nan
    print(f'RC {int(rc)}: n={int(m.sum())} | insp={int((w[m]>0).sum())} | betas={b.round(4).tolist()} | MAE_insp={mae_rc:.4f}', flush=True)

print('--- Fold 0 quick check per RC ---', flush=True)
k = 0
tr_mask = (folds != k); va_mask = (folds == k)
for rc in rcs[:9]:
    m = (rc_tr == rc) & tr_mask
    if not np.any(m):
        continue
    Xtr = np.stack([x1[m], x2[m], np.ones(m.sum(), np.float32)], 1)
    b = fit_beta(Xtr, y[m], w[m])
    mv = (rc_tr == rc) & va_mask
    if np.any(mv):
        pred = (np.stack([x1[mv], x2[mv], np.ones(mv.sum(), np.float32)], 1) @ b).astype(np.float64)
        mae_rc = mean_absolute_error(y[mv & (w>0)], pred[w[mv]>0]) if (w[mv]>0).any() else np.nan
        print(f'[Fold0] RC {int(rc)}: tr_insp={int((w[m]>0).sum())} va_insp={int((w[mv]>0).sum())} betas={b.round(4).tolist()} MAE_insp={mae_rc:.4f}', flush=True)

=== Diagnostics: inspect shapes, mask counts, correlations, and betas per (R,C) ===


Shapes: (5432400,) (5432400,) (5432400,) (5432400,)


Mask fraction (u_out==0): 0.37956005334854126


Corr(y,x1) insp: 0.1323156083300688 Corr(y,x2) insp: 0.739106385554138


Ranges: flow[min,max]= 0.0 1.0  x1[min,max]= 0.0 50.0  voli[min,max]= 0.0 86.46919250488281  x2[min,max]= 0.0 8.646919250488281


Unique RCs: [510, 520, 550, 2010, 2020, 2050, 5010, 5020, 5050]


--- Global fit (no folds), per RC, physics X=[x1,x2,1] ---


RC 510: n=598880 | insp=224610 | betas=[-1.5169, 18.0364, 8.556] | MAE_insp=2.0161


RC 520: n=596720 | insp=230359 | betas=[-0.2575, 11.1536, 9.5992] | MAE_insp=2.3522


RC 550: n=595600 | insp=221169 | betas=[0.4466, 18.0487, 7.2372] | MAE_insp=0.9777


RC 2010: n=437520 | insp=165879 | betas=[-0.1151, 18.3685, 8.9759] | MAE_insp=2.4242


RC 2020: n=446960 | insp=167233 | betas=[0.0991, 17.5953, 9.9233] | MAE_insp=3.3876


RC 2050: n=589600 | insp=218932 | betas=[0.3136, 28.603, 8.9024] | MAE_insp=2.9114


RC 5010: n=981120 | insp=374926 | betas=[-0.0712, 21.5121, 8.8611] | MAE_insp=3.7730


RC 5020: n=596480 | insp=231001 | betas=[0.0655, 28.8646, 11.1694] | MAE_insp=5.7068


RC 5050: n=589520 | insp=227813 | betas=[-0.1198, 63.7526, 11.5931] | MAE_insp=5.2155


--- Fold 0 quick check per RC ---


[Fold0] RC 510: tr_insp=180503 va_insp=44107 betas=[-1.4977, 18.1179, 8.5094] MAE_insp=2.0637


[Fold0] RC 520: tr_insp=185571 va_insp=44788 betas=[-0.2496, 11.1212, 9.5933] MAE_insp=2.3468


[Fold0] RC 550: tr_insp=176761 va_insp=44408 betas=[0.4446, 18.0314, 7.2327] MAE_insp=0.9715


[Fold0] RC 2010: tr_insp=133091 va_insp=32788 betas=[-0.1174, 18.3808, 8.9925] MAE_insp=2.3731


[Fold0] RC 2020: tr_insp=134371 va_insp=32862 betas=[0.1015, 17.6541, 9.8717] MAE_insp=3.4271


[Fold0] RC 2050: tr_insp=172756 va_insp=46176 betas=[0.3024, 28.6513, 8.9291] MAE_insp=2.8928


[Fold0] RC 5010: tr_insp=298900 va_insp=76026 betas=[-0.0712, 21.5147, 8.8647] MAE_insp=3.8031


[Fold0] RC 5020: tr_insp=185062 va_insp=45939 betas=[0.0722, 28.8874, 11.1516] MAE_insp=5.6991


[Fold0] RC 5050: tr_insp=182596 va_insp=45217 betas=[-0.1211, 63.7526, 11.5743] MAE_insp=5.2332


In [45]:
import numpy as np, pandas as pd
from pathlib import Path
from sklearn.metrics import mean_absolute_error
print('=== Deployable physics prior: [R*flow, vol_true/C, u_in_first, 1] per-(R,C), fit on u_out==0 ===', flush=True)
tr_path, te_path = Path('train_fe_v3.parquet'), Path('test_fe_v3.parquet')
train = pd.read_parquet(tr_path).sort_values('id').reset_index(drop=True)
test  = pd.read_parquet(te_path).sort_values('id').reset_index(drop=True)

# Per-breath proxy for intercept (available at test)
for df in (train, test):
    df['u_in_first'] = df.groupby('breath_id')['u_in'].transform('first').astype(np.float32)

y = train['pressure'].to_numpy(np.float32)
w = (train['u_out'].to_numpy()==0)

# Consistent physics units
flow_tr = (train['u_in'].to_numpy(np.float32) * 0.01)
flow_te = (test['u_in'].to_numpy(np.float32)  * 0.01)
vol_tr  = (train['vol_insp'].to_numpy(np.float32) * 0.01)  # vol_true = cumsum(flow*dt)
vol_te  = (test['vol_insp'].to_numpy(np.float32)  * 0.01)
R_tr = train['R'].to_numpy(np.float32); C_tr = train['C'].to_numpy(np.float32)
R_te = test['R'].to_numpy(np.float32);  C_te = test['C'].to_numpy(np.float32)
x1_tr = R_tr * flow_tr
x2_tr = vol_tr / C_tr
x3_tr = train['u_in_first'].to_numpy(np.float32)
x1_te = R_te * flow_te
x2_te = vol_te / C_te
x3_te = test['u_in_first'].to_numpy(np.float32)

rc_tr = (train['R'].astype(np.int32)*100 + train['C'].astype(np.int32)).to_numpy()
rc_te = (test['R'].astype(np.int32)*100 + test['C'].astype(np.int32)).to_numpy()
rcs = np.unique(rc_tr)
folds = train['fold'].to_numpy(np.int32)
n_folds = int(folds.max()) + 1

def fit_on_insp(X, y):
    if X.shape[0] < X.shape[1] + 1:
        return np.zeros(X.shape[1], dtype=np.float64)
    beta, *_ = np.linalg.lstsq(X.astype(np.float64), y.astype(np.float64), rcond=None)
    return beta.astype(np.float64)

oof = np.zeros_like(y, dtype=np.float32)
test_fold = np.zeros((len(test), n_folds), dtype=np.float32)

for k in range(n_folds):
    tr_mask = (folds != k) & w
    va_mask = (folds == k) & w
    betas = {}
    for rc in rcs:
        m = tr_mask & (rc_tr == rc)
        if not m.any():
            continue
        X = np.stack([x1_tr[m], x2_tr[m], x3_tr[m], np.ones(m.sum(), np.float32)], 1)
        b = fit_on_insp(X, y[m])
        betas[int(rc)] = b
    for rc, b in betas.items():
        a1,a2,a3,c0 = [float(t) for t in b]
        mv = (rc_tr == rc) & (folds == k)
        if mv.any():
            oof[mv] = (a1*x1_tr[mv] + a2*x2_tr[mv] + a3*x3_tr[mv] + c0).astype(np.float32)
        mt = (rc_te == rc)
        if mt.any():
            test_fold[mt, k] = (a1*x1_te[mt] + a2*x2_te[mt] + a3*x3_te[mt] + c0).astype(np.float32)
    mae_k = mean_absolute_error(y[va_mask], oof[va_mask])
    print(f'Fold {k} masked MAE (deployable 3-term): {mae_k:.4f}', flush=True)

mae = mean_absolute_error(y[w], oof[w])
print('OOF masked MAE (deployable 3-term):', round(mae,4))

train['p_phys'] = oof.astype(np.float32)
test['p_phys']  = test_fold.mean(1).astype(np.float32)
train.to_parquet(tr_path, index=False)
test.to_parquet(te_path, index=False)
print('Saved p_phys (deployable).')

=== Deployable physics prior: [R*flow, vol_true/C, u_in_first, 1] per-(R,C), fit on u_out==0 ===


Fold 0 masked MAE (deployable 3-term): 3.2530


Fold 1 masked MAE (deployable 3-term): 3.2532


Fold 2 masked MAE (deployable 3-term): 3.2209


Fold 3 masked MAE (deployable 3-term): 3.2406


Fold 4 masked MAE (deployable 3-term): 3.2591


OOF masked MAE (deployable 3-term): 3.2454


Saved p_phys (deployable).


In [46]:
import numpy as np, pandas as pd
from pathlib import Path
from sklearn.metrics import mean_absolute_error
print('=== Sanity: Fixed-effects OOF (per-breath intercept from val y) ===', flush=True)
tr_path, te_path = Path('train_fe_v3.parquet'), Path('test_fe_v3.parquet')
train = pd.read_parquet(tr_path).sort_values(['id']).reset_index(drop=True)
test  = pd.read_parquet(te_path).sort_values(['id']).reset_index(drop=True)

y = train['pressure'].to_numpy(np.float32)
w = (train['u_out'].to_numpy()==0)

# Consistent physics features
flow_tr = (train['u_in'].to_numpy(np.float32) * 0.01)
flow_te = (test['u_in'].to_numpy(np.float32)  * 0.01)
vol_tr  = (train['vol_insp'].to_numpy(np.float32) * 0.01)  # vol_true = cumsum(flow*dt)
vol_te  = (test['vol_insp'].to_numpy(np.float32)  * 0.01)
R_tr = train['R'].to_numpy(np.float32); C_tr = train['C'].to_numpy(np.float32)
R_te = test['R'].to_numpy(np.float32);  C_te = test['C'].to_numpy(np.float32)

x1_tr = R_tr * flow_tr
x2_tr = vol_tr / C_tr
x1_te = R_te * flow_te
x2_te = vol_te / C_te

rc_tr = (train['R'].astype(np.int32)*100 + train['C'].astype(np.int32)).to_numpy()
rc_te = (test['R'].astype(np.int32)*100 + test['C'].astype(np.int32)).to_numpy()
rcs = np.unique(rc_tr)

folds = train['fold'].to_numpy(np.int32)
n_folds = folds.max() + 1

def fit_slopes_no_intercept(X, y):
    b, *_ = np.linalg.lstsq(X.astype(np.float64), y.astype(np.float64), rcond=None)
    return b.astype(np.float64)

oof = np.zeros_like(y, dtype=np.float32)
test_fold = np.zeros((len(test), n_folds), dtype=np.float32)

for k in range(n_folds):
    tr_mask = (folds != k) & w
    va_mask = (folds == k) & w

    betas = {}; c_rc = {}
    for rc in rcs:
        m = tr_mask & (rc_tr == rc)
        if not m.any():
            continue
        Xtr = np.stack([x1_tr[m], x2_tr[m]], 1)
        ytr = y[m]
        b = fit_slopes_no_intercept(Xtr, ytr)
        betas[int(rc)] = b
        c_rc[int(rc)] = float(np.median(ytr - (Xtr @ b)))

    # Validation: per-breath intercept from that fold's y
    for bid, g in train[va_mask].groupby('breath_id', sort=False):
        idx = g.index.to_numpy()
        rc = int((g['R'].iat[0])*100 + g['C'].iat[0])
        if rc not in betas:
            continue
        a,b = betas[rc]
        pred_shape = a*x1_tr[idx] + b*x2_tr[idx]
        c_b = float(np.median(y[idx] - pred_shape))
        oof[idx] = (pred_shape + c_b).astype(np.float32)

    # Test: per-(R,C) fallback intercept
    for rc, beta in betas.items():
        a,b = beta
        mt = (rc_te == rc)
        if mt.any():
            test_fold[mt, k] = (a*x1_te[mt] + b*x2_te[mt] + c_rc[rc]).astype(np.float32)

    mae_k = mean_absolute_error(y[va_mask], oof[va_mask])
    print(f'Fold {k} masked MAE (fixed-effects OOF): {mae_k:.4f}', flush=True)

mae_oof = mean_absolute_error(y[w], oof[w])
print(f'OOF masked MAE (fixed-effects): {mae_oof:.4f}')

train['p_phys'] = oof.astype(np.float32)
test['p_phys']  = test_fold.mean(1).astype(np.float32)
train = train.sort_values('id').reset_index(drop=True)
test  = test.sort_values('id').reset_index(drop=True)
train.to_parquet(tr_path, index=False)
test.to_parquet(te_path, index=False)
print('Saved p_phys (fixed-effects OOF sanity).')

=== Sanity: Fixed-effects OOF (per-breath intercept from val y) ===


Fold 0 masked MAE (fixed-effects OOF): 2.3842


Fold 1 masked MAE (fixed-effects OOF): 2.3825


Fold 2 masked MAE (fixed-effects OOF): 2.3738


Fold 3 masked MAE (fixed-effects OOF): 2.3684


Fold 4 masked MAE (fixed-effects OOF): 2.3974


OOF masked MAE (fixed-effects): 2.3813


Saved p_phys (fixed-effects OOF sanity).


In [60]:
import numpy as np, pandas as pd, time, gc, os
from pathlib import Path
from sklearn.metrics import mean_absolute_error
print('=== Per-timestep XGBoost GPU baseline: masked training on u_out==0, fold-safe OOF/Test ===', flush=True)

import xgboost as xgb
tr_path, te_path = Path('train_fe_v3.parquet'), Path('test_fe_v3.parquet')
train = pd.read_parquet(tr_path)
test  = pd.read_parquet(te_path)

# Ensure id/breath order availability
train_id = train.sort_values('id').reset_index(drop=True)
train_b = train.sort_values(['breath_id','t_idx']).reset_index(drop=True)
test_b  = test.sort_values(['breath_id','t_idx']).reset_index(drop=True)

# Feature set (safe, target-free). Exclude pressure, is_train, fold, ids.
drop_cols = {'pressure','is_train','fold','id'}
feat_blacklist = set()
FEATS = [c for c in train.columns if c not in drop_cols and c in test.columns and c not in feat_blacklist]
FEATS = [c for c in FEATS if c not in ['u_out']]  # exclude mask as feature
print('Num features:', len(FEATS))

# Build fold mapping (breath-wise)
folds_df = pd.read_csv('folds_breath_v3.csv')
b2f = dict(zip(folds_df['breath_id'].astype(int), folds_df['fold'].astype(int)))
folds_b = train_b['breath_id'].astype(int).map(b2f).astype(np.int8).to_numpy()
assert not np.isnan(folds_b).any(), 'Missing folds for some breaths'

# Targets and mask
y_b = train_b['pressure'].to_numpy(np.float32)
mask_b = (train_b['u_out'].to_numpy()==0)
t_idx_b = train_b['t_idx'].astype(np.int16).to_numpy()

n_folds = int(folds_df['fold'].max()) + 1
T = int(train_b['t_idx'].max()) + 1
print('Folds:', n_folds, 'Timesteps:', T, flush=True)

# Prepare OOF and test preds (breath-order grids) then convert to id-order at end
B = train_b['breath_id'].nunique()
oof = np.zeros(train_b.shape[0], dtype=np.float32)
test_pred_all = np.zeros(test_b.shape[0], dtype=np.float32)

params = {
    'tree_method': 'hist',
    'device': 'cuda',
    'max_depth': 8,
    'min_child_weight': 16,
    'subsample': 0.8,
    'colsample_bytree': 0.8,
    'lambda': 4.0,
    'alpha': 4.0,
    'eta': 0.05,
    'objective': 'reg:squarederror',
    'eval_metric': 'mae',
    'nthread': max(1, os.cpu_count()-2)
}
n_rounds = 1500
early = 100

t0 = time.time()
for t in range(T):
    idx_t_tr = (t_idx_b == t)
    # Only inspiration rows participate (masked metric)
    idx_fit = idx_t_tr & mask_b
    if idx_fit.sum() == 0:
        continue
    X_t = train_b.loc[idx_fit, FEATS].to_numpy(np.float32, copy=False)
    y_t = y_b[idx_fit]
    f_t = folds_b[idx_fit]
    # Build arrays for mapping back to full t-slice positions
    pos_all_t = np.where(idx_t_tr)[0]
    pos_fit_t = np.where(idx_fit)[0]
    fold_pred_val = np.zeros(idx_t_tr.sum(), dtype=np.float32)
    fold_pred_test = np.zeros(test_b.shape[0]//T, dtype=np.float32)  # per-breath rows per t
    # Test slice for this t
    mt = (test_b['t_idx'].to_numpy()==t)
    X_te_t = test_b.loc[mt, FEATS].to_numpy(np.float32, copy=False)
    dte = xgb.DMatrix(X_te_t)
    for k in range(n_folds):
        m_tr = (f_t != k)
        m_va = (f_t == k)
        if m_tr.sum() == 0 or m_va.sum() == 0:
            continue
        dtr = xgb.DMatrix(X_t[m_tr], label=y_t[m_tr])
        dva = xgb.DMatrix(X_t[m_va], label=y_t[m_va])
        watch = [(dtr, 'tr'), (dva, 'va')]
        bst = xgb.train(params, dtr, num_boost_round=n_rounds, evals=watch, early_stopping_rounds=early, verbose_eval=False)
        # best iteration handling for xgboost >=2.0
        attrs = bst.attributes()
        best_it = int(attrs.get('best_iteration', '0'))
        iter_range = (0, best_it + 1) if best_it > 0 else None
        # Val predictions
        fold_pred_val_subset = bst.predict(dva, iteration_range=iter_range)
        pos_va_fit = pos_fit_t[m_va]
        fold_pred_val_indices = np.searchsorted(pos_all_t, pos_va_fit)
        fold_pred_val[fold_pred_val_indices] = fold_pred_val_subset.astype(np.float32)
        # Test predictions
        fold_pred_test += bst.predict(dte, iteration_range=iter_range).astype(np.float32) / n_folds
    # Write back this timestep's OOF and test predictions
    oof[idx_t_tr] = fold_pred_val
    test_pred_all[mt] = fold_pred_test
    if (t+1) % 10 == 0 or t in (0,1,2):
        m_slice = mask_b & (t_idx_b == t)
        mae_t = mean_absolute_error(y_b[m_slice], oof[m_slice]) if m_slice.any() else np.nan
        print(f't={t:02d} | rows_fit={int(idx_fit.sum())} | val_rows={int((idx_t_tr & mask_b).sum())} | MAE_masked={mae_t:.4f} | elapsed={time.time()-t0:.1f}s', flush=True)
    gc.collect()

# Compute overall OOF masked MAE
mae_all = mean_absolute_error(y_b[mask_b], oof[mask_b])
print(f'OOF masked MAE (per-t XGB): {mae_all:.6f}', flush=True)

# Convert breath-order predictions back to id-order for saving and blending
train_b_pred = train_b[['id']].copy(); train_b_pred['pressure'] = oof.astype(np.float32)
train_id_pred = train_b_pred.sort_values('id').reset_index(drop=True)
np.save('oof_xgb.npy', train_id_pred['pressure'].to_numpy(np.float32))
print('Saved oof_xgb.npy (id-order)', flush=True)

test_b_pred = test_b[['id']].copy(); test_b_pred['pressure'] = test_pred_all.astype(np.float32)
sub = test_b_pred.sort_values('id').reset_index(drop=True)
sub.to_csv('submission.csv', index=False)
print('Saved submission.csv from per-t XGB.', flush=True)

=== Per-timestep XGBoost GPU baseline: masked training on u_out==0, fold-safe OOF/Test ===


Num features: 58
Folds: 5 Timesteps: 80


t=00 | rows_fit=67905 | val_rows=67905 | MAE_masked=0.1990 | elapsed=47.4s


t=01 | rows_fit=67905 | val_rows=67905 | MAE_masked=0.1823 | elapsed=97.8s


t=02 | rows_fit=67905 | val_rows=67905 | MAE_masked=0.1884 | elapsed=148.3s


KeyboardInterrupt: 

In [52]:
import numpy as np, pandas as pd, time, gc, os, sys, subprocess
from pathlib import Path
from sklearn.metrics import mean_absolute_error
print('=== Per-timestep CatBoost GPU baseline: masked training on u_out==0, fold-safe OOF/Test ===', flush=True)

try:
    import catboost as cb
except Exception:
    print('Installing catboost...', flush=True)
    subprocess.run([sys.executable, '-m', 'pip', 'install', 'catboost==1.2.5'], check=True)
    import catboost as cb

tr_path, te_path = Path('train_fe_v3.parquet'), Path('test_fe_v3.parquet')
train = pd.read_parquet(tr_path)
test  = pd.read_parquet(te_path)

# Sort to breath-order for per-t training
train_b = train.sort_values(['breath_id','t_idx']).reset_index(drop=True)
test_b  = test.sort_values(['breath_id','t_idx']).reset_index(drop=True)

# Feature set: same as XGB (exclude target, ids, fold, u_out)
drop_cols = {'pressure','is_train','fold','id'}
FEATS = [c for c in train.columns if c not in drop_cols and c in test.columns and c != 'u_out']
print('Num features:', len(FEATS))

# Folds (breath-wise)
folds_df = pd.read_csv('folds_breath_v3.csv')
b2f = dict(zip(folds_df['breath_id'].astype(int), folds_df['fold'].astype(int)))
folds_b = train_b['breath_id'].astype(int).map(b2f).astype(np.int8).to_numpy()
assert not np.isnan(folds_b).any(), 'Missing folds for some breaths'

# Targets and mask
y_b = train_b['pressure'].to_numpy(np.float32)
mask_b = (train_b['u_out'].to_numpy()==0)
t_idx_b = train_b['t_idx'].astype(np.int16).to_numpy()

n_folds = int(folds_df['fold'].max()) + 1
T = int(train_b['t_idx'].max()) + 1
print('Folds:', n_folds, 'Timesteps:', T, flush=True)

oof = np.zeros(train_b.shape[0], dtype=np.float32)
test_pred_all = np.zeros(test_b.shape[0], dtype=np.float32)

params = dict(
    loss_function='MAE',
    depth=8,
    learning_rate=0.05,
    l2_leaf_reg=6.0,
    subsample=0.8,
    bootstrap_type='Bernoulli',
    random_strength=0.5,
    task_type='GPU',
    devices='0',
    border_count=128,
    random_seed=42,
    verbose=False
)
n_rounds = 2000
early = 150

t0 = time.time()
for t in range(T):
    idx_t_tr = (t_idx_b == t)
    idx_fit = idx_t_tr & mask_b
    if idx_fit.sum() == 0:
        continue
    X_t = train_b.loc[idx_fit, FEATS]
    y_t = y_b[idx_fit]
    f_t = folds_b[idx_fit]
    # mapping indices back
    pos_all_t = np.where(idx_t_tr)[0]
    pos_fit_t = np.where(idx_fit)[0]
    fold_pred_val = np.zeros(idx_t_tr.sum(), dtype=np.float32)
    fold_pred_test = np.zeros(test_b.shape[0]//T, dtype=np.float32)
    # Test slice
    mt = (test_b['t_idx'].to_numpy()==t)
    X_te_t = test_b.loc[mt, FEATS]
    for k in range(n_folds):
        m_tr = (f_t != k)
        m_va = (f_t == k)
        if m_tr.sum() == 0 or m_va.sum() == 0:
            continue
        dtr = cb.Pool(X_t.iloc[m_tr], y_t[m_tr])
        dva = cb.Pool(X_t.iloc[m_va], y_t[m_va])
        model = cb.CatBoostRegressor(**params, iterations=n_rounds, early_stopping_rounds=early)
        model.fit(dtr, eval_set=dva, use_best_model=True, verbose=False)
        pred_va = model.predict(dva).astype(np.float32)
        pos_va_fit = pos_fit_t[m_va]
        fold_pred_val_indices = np.searchsorted(pos_all_t, pos_va_fit)
        fold_pred_val[fold_pred_val_indices] = pred_va
        fold_pred_test += model.predict(X_te_t).astype(np.float32) / n_folds
    oof[idx_t_tr] = fold_pred_val
    test_pred_all[mt] = fold_pred_test
    if (t+1) % 10 == 0 or t in (0,1,2):
        m_slice = mask_b & (t_idx_b == t)
        mae_t = mean_absolute_error(y_b[m_slice], oof[m_slice]) if m_slice.any() else np.nan
        print(f't={t:02d} | rows_fit={int(idx_fit.sum())} | MAE_masked={mae_t:.4f} | elapsed={time.time()-t0:.1f}s', flush=True)
    gc.collect()

mae_all = mean_absolute_error(y_b[mask_b], oof[mask_b])
print(f'OOF masked MAE (per-t CatBoost): {mae_all:.6f}', flush=True)

# Save OOF (id-order) and submission
train_b_pred = train_b[['id']].copy(); train_b_pred['pressure'] = oof.astype(np.float32)
train_id_pred = train_b_pred.sort_values('id').reset_index(drop=True)
np.save('oof_cat.npy', train_id_pred['pressure'].to_numpy(np.float32))
print('Saved oof_cat.npy (id-order)', flush=True)

test_b_pred = test_b[['id']].copy(); test_b_pred['pressure'] = test_pred_all.astype(np.float32)
sub_cat = test_b_pred.sort_values('id').reset_index(drop=True)
sub_cat.to_csv('submission_cat.csv', index=False)
print('Saved submission_cat.csv', flush=True)

=== Per-timestep CatBoost GPU baseline: masked training on u_out==0, fold-safe OOF/Test ===


Num features: 50
Folds: 5 Timesteps: 80


Default metric period is 5 because MAE is/are not implemented for GPU




Default metric period is 5 because MAE is/are not implemented for GPU




Default metric period is 5 because MAE is/are not implemented for GPU




Default metric period is 5 because MAE is/are not implemented for GPU


KeyboardInterrupt: 

In [56]:
import numpy as np, pandas as pd, time
from pathlib import Path
from sklearn.metrics import mean_absolute_error
print('=== XGB OOF post-processing v3: fix breath start map + per-(R,C) fold-safe grid + insp-segment median(3) + RC×t de-bias ===', flush=True)

tr_path = Path('train_fe_v3.parquet')
oof_path = Path('oof_xgb.npy')
assert tr_path.exists() and oof_path.exists(), 'Missing train_fe_v3.parquet or oof_xgb.npy'

train_id = pd.read_parquet(tr_path).sort_values('id').reset_index(drop=True)
train_b = pd.read_parquet(tr_path).sort_values(['breath_id','t_idx']).reset_index(drop=True)
oof_id = np.load(oof_path).astype(np.float32)
assert len(oof_id) == len(train_id), 'OOF length mismatch vs train rows'

# Map id-order OOF to breath-order
id_to_pos = dict(zip(train_id['id'].to_numpy(), np.arange(len(train_id), dtype=np.int64)))
idx_breath_order = np.array([id_to_pos[i] for i in train_b['id'].to_numpy()], dtype=np.int64)
pred_breath = oof_id[idx_breath_order].astype(np.float32)

y = train_b['pressure'].to_numpy(np.float32)
mask = (train_b['u_out'].to_numpy()==0)
t_idx = train_b['t_idx'].to_numpy(np.int16)
rc_key = (train_b['R'].astype(np.int32)*100 + train_b['C'].astype(np.int32)).to_numpy()

folds_df = pd.read_csv('folds_breath_v3.csv')
b2f = dict(zip(folds_df['breath_id'].astype(int), folds_df['fold'].astype(int)))
folds_row = train_b['breath_id'].astype(int).map(b2f).to_numpy()
n_folds = int(folds_df['fold'].max()) + 1

raw_mae = mean_absolute_error(y[mask], pred_breath[mask])
print(f'OOF masked MAE (raw XGB): {raw_mae:.6f}', flush=True)

B = train_b['breath_id'].nunique()
T = int(train_b['t_idx'].max()) + 1
assert B*T == len(train_b), 'Breath-order rows not contiguous; cannot reshape'

def snap_to_grid(arr, grid):
    idx = np.searchsorted(grid, arr)
    idx0 = np.clip(idx-1, 0, grid.size-1); idx1 = np.clip(idx, 0, grid.size-1)
    left = grid[idx0]; right = grid[idx1]
    return np.where(np.abs(arr-left) <= np.abs(arr-right), left, right).astype(np.float32)

def median_insp_segments(vals: np.ndarray, m: np.ndarray, k: int = 3) -> np.ndarray:
    out = vals.copy()
    n = len(vals)
    i = 0
    while i < n:
        if not m[i]:
            i += 1
            continue
        j = i
        while j < n and m[j]:
            j += 1
        seg = vals[i:j]
        if seg.size >= 3 and k == 3:
            seg_ext = np.pad(seg, (1,1), mode='edge')
            med = np.median(np.stack([seg_ext[:-2], seg_ext[1:-1], seg_ext[2:]], axis=0), axis=0)
            out[i:j] = med.astype(np.float32)
        else:
            out[i:j] = seg.astype(np.float32)
        i = j
    return out

# Build correct breath_id -> start index map (use first index from breath-order, preserve order) once
first_rows = train_b.groupby('breath_id', sort=False).head(1)
bid_to_start = dict(zip(first_rows['breath_id'].to_numpy(), first_rows.index.to_numpy()))

# Prepare containers for PP result
pp_pred = pred_breath.copy()

t_start = time.time()
for k in range(n_folds):
    tr_rows = (folds_row != k)
    va_rows = (folds_row == k)
    if not va_rows.any():
        continue
    # RC×t de-bias computed from training folds only on masked rows
    resid = (y - pred_breath).astype(np.float32)
    m_tr = tr_rows & mask
    df_res = pd.DataFrame({
        'rc': rc_key[m_tr],
        't': t_idx[m_tr],
        'resid': resid[m_tr]
    })
    delta_tbl = df_res.groupby(['rc','t'])['resid'].median().astype(np.float32)
    # Build per-(R,C) grid from training folds only, RC-specific
    df_tr = pd.DataFrame({'y': y[tr_rows], 'rc': rc_key[tr_rows]})
    rc_grids = {}
    for rc, grp in df_tr.groupby('rc'):
        g = np.unique(grp['y'].values.astype(np.float32))
        rc_grids[int(rc)] = g

    # Process breath-wise for this fold's validation breaths
    va_breaths = np.unique(train_b.loc[va_rows, 'breath_id'].to_numpy())
    for bid in va_breaths:
        s = bid_to_start[int(bid)]
        e = s + T
        vals = pp_pred[s:e].copy()
        rc = int(rc_key[s])
        # de-bias with per-(rc,t) median residuals from train folds
        dt = delta_tbl.reindex(pd.MultiIndex.from_product([[rc], np.arange(T)], names=['rc','t']))
        if dt is not None:
            delta_arr = dt.values.astype(np.float32)
            if np.isnan(delta_arr).any():
                delta_arr = np.where(np.isnan(delta_arr), 0.0, delta_arr).astype(np.float32)
            vals = vals + delta_arr
        # median(3) only within inspiration segments
        m_b = mask[s:e]
        vals = median_insp_segments(vals, m_b, k=3)
        # snap to per-(R,C) fold-safe grid
        grid = rc_grids.get(rc, None)
        if grid is None or grid.size == 0:
            grid = np.unique(y[tr_rows].astype(np.float32))
        vals = snap_to_grid(vals, grid)
        pp_pred[s:e] = vals.astype(np.float32)
    print(f'Fold {k}: processed {va_rows.sum()} rows | elapsed {time.time()-t_start:.1f}s', flush=True)

pp_mae = mean_absolute_error(y[mask], pp_pred[mask])
print(f'OOF masked MAE (PP v3: RC×t de-bias + RC-grids + median3-insp + fixed breath map): {pp_mae:.6f}', flush=True)

=== XGB OOF post-processing v3: fix breath start map + per-(R,C) fold-safe grid + insp-segment median(3) + RC×t de-bias ===


OOF masked MAE (raw XGB): 0.301180


Fold 0: processed 1086480 rows | elapsed 9.5s


Fold 1: processed 1086480 rows | elapsed 19.0s


Fold 2: processed 1086480 rows | elapsed 28.4s


Fold 3: processed 1086480 rows | elapsed 37.9s


Fold 4: processed 1086480 rows | elapsed 47.4s


OOF masked MAE (PP v3: RC×t de-bias + RC-grids + median3-insp + fixed breath map): 0.489980


In [58]:
import numpy as np, pandas as pd, time
from pathlib import Path
from sklearn.metrics import mean_absolute_error
print('=== Test-time: apply RC×t de-bias (from XGB OOF), then per-(R,C) snap + insp-only median(3) ===', flush=True)

tr_path = Path('train_fe_v3.parquet')
te_path = Path('test_fe_v3.parquet')
oof_path = Path('oof_xgb.npy')
sub_path = Path('submission.csv')  # current XGB test preds
assert tr_path.exists() and te_path.exists() and oof_path.exists() and sub_path.exists(), 'Missing required artifacts'

# Load train/test FE and align orders
train_id = pd.read_parquet(tr_path).sort_values('id').reset_index(drop=True)
train_b = pd.read_parquet(tr_path).sort_values(['breath_id','t_idx']).reset_index(drop=True)
test_id  = pd.read_parquet(te_path).sort_values('id').reset_index(drop=True)
test_b   = pd.read_parquet(te_path).sort_values(['breath_id','t_idx']).reset_index(drop=True)
sub = pd.read_csv(sub_path).sort_values('id').reset_index(drop=True)
assert (sub['id'].values == test_id['id'].values).all(), 'submission.csv not aligned to test id order'

# Map id-order OOF to breath-order to compute residual table
oof_id = np.load(oof_path).astype(np.float32)
assert len(oof_id) == len(train_id), 'OOF length mismatch vs train rows'
id_to_pos_tr = dict(zip(train_id['id'].to_numpy(), np.arange(len(train_id), dtype=np.int64)))
idx_breath_order = np.array([id_to_pos_tr[i] for i in train_b['id'].to_numpy()], dtype=np.int64)
oof_breath = oof_id[idx_breath_order].astype(np.float32)

# Residual de-bias per (R,C,t_idx) on masked rows, using full train (no fold needed for test)
y = train_b['pressure'].to_numpy(np.float32)
mask = (train_b['u_out'].to_numpy()==0)
t_idx = train_b['t_idx'].to_numpy(np.int16)
rc_key_tr = (train_b['R'].astype(np.int32)*100 + train_b['C'].astype(np.int32)).to_numpy()
resid = (y - oof_breath).astype(np.float32)
df_res = pd.DataFrame({'rc': rc_key_tr[mask], 't': t_idx[mask], 'resid': resid[mask]})
delta_tbl = df_res.groupby(['rc','t'])['resid'].median().astype(np.float32)
print('Built delta table size:', delta_tbl.size, flush=True)

# Build per-(R,C) pressure grids from full train
grid_all = np.unique(train_b['pressure'].values.astype(np.float32)); grid_all.sort()
rc_train = (train_b['R'].astype(np.int32)*100 + train_b['C'].astype(np.int32)).to_numpy()
rc_press = {}
for rc, grp in pd.DataFrame({'rc': rc_train, 'p': train_b['pressure'].values.astype(np.float32)}).groupby('rc'):
    g = np.unique(grp['p'].values); g.sort(); rc_press[int(rc)] = g

def snap_to_grid(arr, grid):
    idx = np.searchsorted(grid, arr)
    idx0 = np.clip(idx-1, 0, grid.size-1); idx1 = np.clip(idx, 0, grid.size-1)
    left = grid[idx0]; right = grid[idx1]
    return np.where(np.abs(arr-left) <= np.abs(arr-right), left, right).astype(np.float32)

def median_insp_segments(vals: np.ndarray, m: np.ndarray, k: int = 3) -> np.ndarray:
    out = vals.copy()
    n = len(vals); i = 0
    while i < n:
        if not m[i]:
            i += 1; continue
        j = i
        while j < n and m[j]:
            j += 1
        seg = vals[i:j]
        if seg.size >= 3 and k == 3:
            seg_ext = np.pad(seg, (1,1), mode='edge')
            med = np.median(np.stack([seg_ext[:-2], seg_ext[1:-1], seg_ext[2:]], axis=0), axis=0)
            out[i:j] = med.astype(np.float32)
        else:
            out[i:j] = seg.astype(np.float32)
        i = j
    return out

# Prepare test predictions in breath-order
press_id_order = sub['pressure'].to_numpy(np.float32)
id_to_pos_te = dict(zip(test_id['id'].to_numpy(), np.arange(len(test_id), dtype=np.int64)))
idx_test_breath_order = np.array([id_to_pos_te[i] for i in test_b['id'].to_numpy()], dtype=np.int64)
test_vals_breath = press_id_order[idx_test_breath_order].astype(np.float32)

t0 = time.time()
out_vals = np.zeros_like(test_vals_breath, dtype=np.float32)
start = 0
T = int(test_b['t_idx'].max()) + 1
for bid, g in test_b.groupby('breath_id', sort=False):
    g = g.sort_values('t_idx')
    L = len(g)
    vals = test_vals_breath[start:start+L].copy()
    rc = int(g['R'].iloc[0])*100 + int(g['C'].iloc[0])
    tt = g['t_idx'].to_numpy(np.int16)
    # apply RC×t de-bias (missing -> 0)
    keys = pd.MultiIndex.from_arrays([np.full(L, rc, dtype=np.int32), tt], names=['rc','t'])
    delta = delta_tbl.reindex(keys).to_numpy()
    delta = np.where(np.isnan(delta), 0.0, delta).astype(np.float32)
    vals = vals + delta
    # median filter only on inspiration steps
    m = (g['u_out'].to_numpy()==0)
    vals = median_insp_segments(vals, m, k=3)
    # snap to per-(R,C) train grid
    grid = rc_press.get(rc, grid_all)
    vals = snap_to_grid(vals, grid)
    out_vals[start:start+L] = vals.astype(np.float32)
    start += L

# Map back to id-order for saving
out_df_breath = pd.DataFrame({'id': test_b['id'].to_numpy(), 'pressure': out_vals})
sub_out = out_df_breath.sort_values('id').reset_index(drop=True)
sub_out.to_csv('submission.csv', index=False)
print('Saved updated submission.csv with RC×t de-bias + per-(R,C) snap + insp-only median(3). Elapsed:', round(time.time()-t0,1), 's', flush=True)

=== Test-time: apply RC×t de-bias (from XGB OOF), then per-(R,C) snap + insp-only median(3) ===


Built delta table size: 288


Saved updated submission.csv with RC×t de-bias + per-(R,C) snap + insp-only median(3). Elapsed: 5.2 s


In [61]:
import numpy as np, pandas as pd, time, gc, os, sys
from pathlib import Path
from sklearn.metrics import mean_absolute_error
print('=== Per-timestep XGBoost MAE | 3 seeds (S42/S17/S91) | GPU | masked rows | fold-safe OOF/Test ===', flush=True)

import xgboost as xgb
tr_path, te_path = Path('train_fe_v3.parquet'), Path('test_fe_v3.parquet')
assert tr_path.exists() and te_path.exists(), 'Run FE cell first'
train = pd.read_parquet(tr_path)
test  = pd.read_parquet(te_path)

# Sort to breath-major for per-t training
train_b = train.sort_values(['breath_id','t_idx']).reset_index(drop=True)
test_b  = test.sort_values(['breath_id','t_idx']).reset_index(drop=True)

# Features: exclude target/meta and u_out (keep lags/leads etc.)
drop_cols = {'pressure','is_train','fold','id'}
FEATS = [c for c in train.columns if c not in drop_cols and c in test.columns and c != 'u_out']
print('Num features:', len(FEATS))

# Folds mapping (breath-wise)
folds_df = pd.read_csv('folds_breath_v3.csv')
b2f = dict(zip(folds_df['breath_id'].astype(int), folds_df['fold'].astype(int)))
folds_b = train_b['breath_id'].astype(int).map(b2f).astype(np.int8).to_numpy()
assert not np.isnan(folds_b).any(), 'Missing folds for some breaths'

# Targets and mask
y_b = train_b['pressure'].to_numpy(np.float32)
mask_b = (train_b['u_out'].to_numpy()==0)
t_idx_b = train_b['t_idx'].astype(np.int16).to_numpy()

n_folds = int(folds_df['fold'].max()) + 1
T = int(train_b['t_idx'].max()) + 1
B = train_b['breath_id'].nunique()
print('Folds:', n_folds, 'Timesteps:', T, 'Breaths:', B, flush=True)

# Seed configurations (expert presets) with rounds/early stop
seed_cfgs = [
    dict(name='S42', seed=42, rounds=900, params=dict(eta=0.03, max_depth=8, min_child_weight=48, subsample=0.80, colsample_bytree=0.60, reg_lambda=24, reg_alpha=2, gamma=0.0)),
    dict(name='S17', seed=17, rounds=850, params=dict(eta=0.035, max_depth=7, min_child_weight=40, subsample=0.75, colsample_bytree=0.55, reg_lambda=16, reg_alpha=0, gamma=0.1)),
    dict(name='S91', seed=91, rounds=800, params=dict(eta=0.03, max_depth=8, min_child_weight=64, subsample=0.85, colsample_bytree=0.70, reg_lambda=32, reg_alpha=4, gamma=0.0)),
]
early = 100

def train_one_seed(cfg):
    name = cfg['name']; seed = int(cfg['seed']); rounds = int(cfg['rounds']); hp = cfg['params']
    print(f'-- Seed {name} start | rounds={rounds} --', flush=True)
    params = {
        'tree_method': 'hist',
        'device': 'cuda',
        'objective': 'reg:absoluteerror',
        'eval_metric': 'mae',
        'nthread': max(1, os.cpu_count()-2),
        'seed': seed,
        'eta': hp['eta'],
        'max_depth': hp['max_depth'],
        'min_child_weight': hp['min_child_weight'],
        'subsample': hp['subsample'],
        'colsample_bytree': hp['colsample_bytree'],
        'lambda': hp['reg_lambda'],
        'alpha': hp['reg_alpha'],
        'gamma': hp['gamma'],
    }
    oof = np.zeros(train_b.shape[0], dtype=np.float32)
    test_pred_all = np.zeros(test_b.shape[0], dtype=np.float32)
    t0 = time.time()
    # Precompute per-t test slice index and data to avoid repeated to_numpy conversions
    t_vec_test = test_b['t_idx'].to_numpy()
    for t in range(T):
        idx_t_tr = (t_idx_b == t)
        idx_fit = idx_t_tr & mask_b
        if idx_fit.sum() == 0:
            continue
        X_t = train_b.loc[idx_fit, FEATS].to_numpy(np.float32, copy=False)
        y_t = y_b[idx_fit]
        f_t = folds_b[idx_fit]
        # positions to place back fold preds into the full t-slice
        pos_all_t = np.where(idx_t_tr)[0]
        pos_fit_t = np.where(idx_fit)[0]
        fold_pred_val = np.zeros(idx_t_tr.sum(), dtype=np.float32)
        # Test slice for this t
        mt = (t_vec_test == t)
        X_te_t = test_b.loc[mt, FEATS].to_numpy(np.float32, copy=False)
        dte = xgb.DMatrix(X_te_t)
        fold_pred_test = np.zeros(mt.sum(), dtype=np.float32)
        for k in range(n_folds):
            m_tr = (f_t != k); m_va = (f_t == k)
            if m_tr.sum() == 0 or m_va.sum() == 0:
                continue
            dtr = xgb.DMatrix(X_t[m_tr], label=y_t[m_tr])
            dva = xgb.DMatrix(X_t[m_va], label=y_t[m_va])
            bst = xgb.train(params=params, dtrain=dtr, num_boost_round=rounds, evals=[(dtr,'tr'),(dva,'va')], early_stopping_rounds=early, verbose_eval=False)
            attrs = bst.attributes()
            best_it = int(attrs.get('best_iteration', '0'))
            iter_range = (0, best_it + 1) if best_it > 0 else None
            # Validation predictions mapped back to t-slice
            pred_va = bst.predict(dva, iteration_range=iter_range).astype(np.float32)
            pos_va_fit = pos_fit_t[m_va]
            fold_pred_val_indices = np.searchsorted(pos_all_t, pos_va_fit)
            fold_pred_val[fold_pred_val_indices] = pred_va
            # Test predictions averaged across folds
            fold_pred_test += bst.predict(dte, iteration_range=iter_range).astype(np.float32) / n_folds
        # Write back
        oof[idx_t_tr] = fold_pred_val
        test_pred_all[mt] = fold_pred_test
        if (t+1) % 10 == 0 or t < 3:
            m_slice = mask_b & (t_idx_b == t)
            mae_t = mean_absolute_error(y_b[m_slice], oof[m_slice]) if m_slice.any() else np.nan
            print(f'{name} | t={t:02d} | rows_fit={int(idx_fit.sum())} | MAE_masked={mae_t:.4f} | elapsed={time.time()-t0:.1f}s', flush=True)
        gc.collect()
    # Overall OOF masked MAE
    mae_all = mean_absolute_error(y_b[mask_b], oof[mask_b])
    print(f'{name} | OOF masked MAE: {mae_all:.6f} | total elapsed {time.time()-t0:.1f}s', flush=True)
    # Save OOF (id-order) and test preds (id-order)
    train_b_pred = train_b[['id']].copy(); train_b_pred['pressure'] = oof.astype(np.float32)
    train_id_pred = train_b_pred.sort_values('id').reset_index(drop=True)
    np.save(f'oof_xgb_{name.lower()}.npy', train_id_pred['pressure'].to_numpy(np.float32))
    test_b_pred = test_b[['id']].copy(); test_b_pred['pressure'] = test_pred_all.astype(np.float32)
    sub_seed = test_b_pred.sort_values('id').reset_index(drop=True)
    sub_seed.to_csv(f'submission_xgb_{name.lower()}.csv', index=False)
    np.save(f'test_xgb_{name.lower()}.npy', sub_seed['pressure'].to_numpy(np.float32))
    print(f'{name} | Saved oof_xgb_{name.lower()}.npy and test_xgb_{name.lower()}.npy', flush=True)
    return oof, test_pred_all

# Run seeds and average
all_oof = []; all_test = []
for cfg in seed_cfgs:
    oof_s, test_s = train_one_seed(cfg)
    all_oof.append(oof_s.astype(np.float32))
    all_test.append(test_s.astype(np.float32))

oof_mean = np.mean(np.stack(all_oof, axis=0), axis=0)
test_mean = np.mean(np.stack(all_test, axis=0), axis=0)
mae_mean = mean_absolute_error(y_b[mask_b], oof_mean[mask_b])
print(f'AVG(3 seeds) OOF masked MAE: {mae_mean:.6f}', flush=True)

# Save averaged OOF (id-order) and submission
train_b_pred = train_b[['id']].copy(); train_b_pred['pressure'] = oof_mean.astype(np.float32)
train_id_pred = train_b_pred.sort_values('id').reset_index(drop=True)
np.save('oof_xgb_mae_avg.npy', train_id_pred['pressure'].to_numpy(np.float32))
test_b_pred = test_b[['id']].copy(); test_b_pred['pressure'] = test_mean.astype(np.float32)
sub_avg = test_b_pred.sort_values('id').reset_index(drop=True)
sub_avg.to_csv('submission_xgb_mae_avg.csv', index=False)
sub_avg.to_csv('submission.csv', index=False)  # set as current submission (raw, no PP)
print('Saved oof_xgb_mae_avg.npy and submission_xgb_mae_avg.csv; updated submission.csv to seed-avg raw.', flush=True)

=== Per-timestep XGBoost MAE | 3 seeds (S42/S17/S91) | GPU | masked rows | fold-safe OOF/Test ===


Num features: 58
Folds: 5 Timesteps: 80 Breaths: 67905


-- Seed S42 start | rounds=900 --


S42 | t=00 | rows_fit=67905 | MAE_masked=0.2153 | elapsed=26.1s


S42 | t=01 | rows_fit=67905 | MAE_masked=0.2038 | elapsed=52.4s


S42 | t=02 | rows_fit=67905 | MAE_masked=0.2149 | elapsed=79.4s


S42 | t=09 | rows_fit=67905 | MAE_masked=0.3827 | elapsed=264.2s


S42 | t=19 | rows_fit=67905 | MAE_masked=0.3946 | elapsed=534.4s


S42 | t=29 | rows_fit=47285 | MAE_masked=0.4253 | elapsed=809.1s


S42 | OOF masked MAE: 0.369073 | total elapsed 852.3s


S42 | Saved oof_xgb_s42.npy and test_xgb_s42.npy


-- Seed S17 start | rounds=850 --


S17 | t=00 | rows_fit=67905 | MAE_masked=0.2231 | elapsed=19.3s


S17 | t=01 | rows_fit=67905 | MAE_masked=0.2110 | elapsed=39.2s


S17 | t=02 | rows_fit=67905 | MAE_masked=0.2252 | elapsed=59.5s


S17 | t=09 | rows_fit=67905 | MAE_masked=0.4012 | elapsed=199.1s


S17 | t=19 | rows_fit=67905 | MAE_masked=0.4128 | elapsed=402.6s


S17 | t=29 | rows_fit=47285 | MAE_masked=0.4435 | elapsed=607.7s


S17 | OOF masked MAE: 0.385862 | total elapsed 640.7s


S17 | Saved oof_xgb_s17.npy and test_xgb_s17.npy


-- Seed S91 start | rounds=800 --


S91 | t=00 | rows_fit=67905 | MAE_masked=0.2178 | elapsed=23.3s


S91 | t=01 | rows_fit=67905 | MAE_masked=0.2058 | elapsed=47.0s


S91 | t=02 | rows_fit=67905 | MAE_masked=0.2206 | elapsed=71.5s


S91 | t=09 | rows_fit=67905 | MAE_masked=0.3987 | elapsed=236.6s


S91 | t=19 | rows_fit=67905 | MAE_masked=0.3976 | elapsed=473.7s


S91 | t=29 | rows_fit=47285 | MAE_masked=0.4251 | elapsed=714.9s


S91 | OOF masked MAE: 0.375033 | total elapsed 753.0s


S91 | Saved oof_xgb_s91.npy and test_xgb_s91.npy


AVG(3 seeds) OOF masked MAE: 0.363891


Saved oof_xgb_mae_avg.npy and submission_xgb_mae_avg.csv; updated submission.csv to seed-avg raw.


In [63]:
import numpy as np, pandas as pd, time, gc, os, sys, subprocess
from pathlib import Path
from sklearn.metrics import mean_absolute_error
print('=== CatBoost t-bucket (t_bin=t_idx//8 -> 10 bins) | GPU | MAE | masked rows | fold-safe OOF/Test ===', flush=True)

try:
    import catboost as cb
except Exception:
    print('Installing catboost...', flush=True)
    subprocess.run([sys.executable, '-m', 'pip', 'install', 'catboost==1.2.5'], check=True)
    import catboost as cb

tr_path, te_path = Path('train_fe_v3.parquet'), Path('test_fe_v3.parquet')
assert tr_path.exists() and te_path.exists(), 'Run FE cell first'
train = pd.read_parquet(tr_path)
test  = pd.read_parquet(te_path)

# Sort breath-major
train_b = train.sort_values(['breath_id','t_idx']).reset_index(drop=True)
test_b  = test.sort_values(['breath_id','t_idx']).reset_index(drop=True)

# Features: exclude target/meta and u_out (keep lags/leads etc.)
drop_cols = {'pressure','is_train','fold','id'}
FEATS = [c for c in train.columns if c not in drop_cols and c in test.columns and c != 'u_out']
print('Num features:', len(FEATS))

# Folds by breath
folds_df = pd.read_csv('folds_breath_v3.csv')
b2f = dict(zip(folds_df['breath_id'].astype(int), folds_df['fold'].astype(int)))
folds_b = train_b['breath_id'].astype(int).map(b2f).astype(np.int8).to_numpy()
assert not np.isnan(folds_b).any(), 'Missing folds for some breaths'

# Targets and mask
y_b = train_b['pressure'].to_numpy(np.float32)
mask_b = (train_b['u_out'].to_numpy()==0)
t_idx_b = train_b['t_idx'].astype(np.int16).to_numpy()
t_bin_b = (t_idx_b // 8).astype(np.int16)  # 0..9
t_bin_te = (test_b['t_idx'].astype(np.int16).to_numpy() // 8).astype(np.int16)

n_folds = int(folds_df['fold'].max()) + 1
NBINS = int(train_b['t_idx'].max() // 8) + 1
print('Folds:', n_folds, 't_bins:', NBINS, flush=True)

oof = np.zeros(train_b.shape[0], dtype=np.float32)
test_pred_all = np.zeros(test_b.shape[0], dtype=np.float32)

params = dict(
    loss_function='MAE',
    task_type='GPU',
    devices='0',
    depth=8,
    learning_rate=0.033,
    l2_leaf_reg=10.0,
    subsample=0.8,
    bootstrap_type='Bernoulli',
    random_strength=0.3,
    border_count=128,
    random_seed=42,
    verbose=False
)
n_rounds = 1300
early = 130

t0 = time.time()
for b in range(NBINS):
    idx_bin_tr_all = (t_bin_b == b)
    idx_fit = idx_bin_tr_all & mask_b
    if idx_fit.sum() == 0:
        print(f'Bin {b}: no fit rows (masked). Skipping.', flush=True)
        continue
    X_t = train_b.loc[idx_fit, FEATS]
    y_t = y_b[idx_fit]
    f_t = folds_b[idx_fit]
    pos_all_bin = np.where(idx_bin_tr_all)[0]
    pos_fit_bin = np.where(idx_fit)[0]
    fold_pred_val = np.zeros(idx_bin_tr_all.sum(), dtype=np.float32)
    # Test slice for this bin
    mt = (t_bin_te == b)
    X_te_t = test_b.loc[mt, FEATS]
    fold_pred_test = np.zeros(mt.sum(), dtype=np.float32)
    for k in range(n_folds):
        m_tr = (f_t != k); m_va = (f_t == k)
        if m_tr.sum() == 0 or m_va.sum() == 0:
            continue
        dtr = cb.Pool(X_t.iloc[m_tr], y_t[m_tr])
        dva = cb.Pool(X_t.iloc[m_va], y_t[m_va])
        model = cb.CatBoostRegressor(**params, iterations=n_rounds, early_stopping_rounds=early)
        model.fit(dtr, eval_set=dva, use_best_model=True, verbose=False)
        pred_va = model.predict(dva).astype(np.float32)
        pos_va_fit = pos_fit_bin[m_va]
        fold_pred_val_indices = np.searchsorted(pos_all_bin, pos_va_fit)
        fold_pred_val[fold_pred_val_indices] = pred_va
        fold_pred_test += model.predict(X_te_t).astype(np.float32) / n_folds
    oof[idx_bin_tr_all] = fold_pred_val
    test_pred_all[mt] = fold_pred_test
    m_slice = mask_b & idx_bin_tr_all
    mae_b = mean_absolute_error(y_b[m_slice], oof[m_slice]) if m_slice.any() else np.nan
    print(f'bin={b:02d} | rows_fit={int(idx_fit.sum())} | MAE_masked={mae_b:.4f} | elapsed={time.time()-t0:.1f}s', flush=True)
    gc.collect()

mae_all = mean_absolute_error(y_b[mask_b], oof[mask_b])
print(f'OOF masked MAE (CatBoost t-bucket): {mae_all:.6f}', flush=True)

# Save OOF (id-order) and submission
train_b_pred = train_b[['id']].copy(); train_b_pred['pressure'] = oof.astype(np.float32)
train_id_pred = train_b_pred.sort_values('id').reset_index(drop=True)
np.save('oof_cat_bucket.npy', train_id_pred['pressure'].to_numpy(np.float32))
test_b_pred = test_b[['id']].copy(); test_b_pred['pressure'] = test_pred_all.astype(np.float32)
sub_catb = test_b_pred.sort_values('id').reset_index(drop=True)
sub_catb.to_csv('submission_cat_bucket.csv', index=False)
print('Saved oof_cat_bucket.npy and submission_cat_bucket.csv', flush=True)

=== CatBoost t-bucket (t_bin=t_idx//8 -> 10 bins) | GPU | MAE | masked rows | fold-safe OOF/Test ===


Num features: 58
Folds: 5 t_bins: 10


Default metric period is 5 because MAE is/are not implemented for GPU




Default metric period is 5 because MAE is/are not implemented for GPU




Default metric period is 5 because MAE is/are not implemented for GPU




Default metric period is 5 because MAE is/are not implemented for GPU




Default metric period is 5 because MAE is/are not implemented for GPU


bin=00 | rows_fit=543240 | MAE_masked=0.9301 | elapsed=36.5s


Default metric period is 5 because MAE is/are not implemented for GPU




Default metric period is 5 because MAE is/are not implemented for GPU




Default metric period is 5 because MAE is/are not implemented for GPU




Default metric period is 5 because MAE is/are not implemented for GPU




Default metric period is 5 because MAE is/are not implemented for GPU


bin=01 | rows_fit=543240 | MAE_masked=0.9541 | elapsed=72.7s


Default metric period is 5 because MAE is/are not implemented for GPU




Default metric period is 5 because MAE is/are not implemented for GPU




Default metric period is 5 because MAE is/are not implemented for GPU




Default metric period is 5 because MAE is/are not implemented for GPU




Default metric period is 5 because MAE is/are not implemented for GPU


bin=02 | rows_fit=543240 | MAE_masked=0.7475 | elapsed=108.6s


Default metric period is 5 because MAE is/are not implemented for GPU




Default metric period is 5 because MAE is/are not implemented for GPU




Default metric period is 5 because MAE is/are not implemented for GPU




Default metric period is 5 because MAE is/are not implemented for GPU




Default metric period is 5 because MAE is/are not implemented for GPU


bin=03 | rows_fit=432202 | MAE_masked=0.7120 | elapsed=142.9s


Bin 4: no fit rows (masked). Skipping.


Bin 5: no fit rows (masked). Skipping.


Bin 6: no fit rows (masked). Skipping.


Bin 7: no fit rows (masked). Skipping.


Bin 8: no fit rows (masked). Skipping.


Bin 9: no fit rows (masked). Skipping.


OOF masked MAE (CatBoost t-bucket): 0.842597


Saved oof_cat_bucket.npy and submission_cat_bucket.csv


In [66]:
import numpy as np, pandas as pd, time
from pathlib import Path
from sklearn.metrics import mean_absolute_error
print('=== Stepwise OOF PP diagnostics: RC fold-safe snap -> +median(3) -> +RC×t de-bias ===', flush=True)

tr_path = Path('train_fe_v3.parquet')
assert tr_path.exists(), 'Missing train_fe_v3.parquet'
train_id = pd.read_parquet(tr_path).sort_values('id').reset_index(drop=True)
train_b = pd.read_parquet(tr_path).sort_values(['breath_id','t_idx']).reset_index(drop=True)

# Pick OOF file (prefer the stronger per-t XGB squarederror run if available)
candidates = ['oof_xgb.npy','oof_xgb_s91.npy','oof_xgb_s42.npy','oof_xgb_s17.npy','oof_xgb_mae_avg.npy']
oof_file = None
for c in candidates:
    if Path(c).exists():
        oof_file = c; break
assert oof_file is not None, 'No OOF file found among: ' + ', '.join(candidates)
print('Using OOF file:', oof_file, flush=True)
oof_id = np.load(oof_file).astype(np.float32)
assert len(oof_id) == len(train_id), 'OOF length mismatch vs train rows'

# Map id-order OOF to breath-order
id_to_pos = dict(zip(train_id['id'].to_numpy(), np.arange(len(train_id), dtype=np.int64)))
idx_breath_order = np.array([id_to_pos[i] for i in train_b['id'].to_numpy()], dtype=np.int64)
pred_breath = oof_id[idx_breath_order].astype(np.float32)

y = train_b['pressure'].to_numpy(np.float32)
mask = (train_b['u_out'].to_numpy()==0)
t_idx = train_b['t_idx'].to_numpy(np.int16)
rc_key = (train_b['R'].astype(np.int32)*100 + train_b['C'].astype(np.int32)).to_numpy()

folds_df = pd.read_csv('folds_breath_v3.csv')
b2f = dict(zip(folds_df['breath_id'].astype(int), folds_df['fold'].astype(int)))
folds_row = train_b['breath_id'].astype(int).map(b2f).to_numpy()
n_folds = int(folds_df['fold'].max()) + 1

raw_mae = mean_absolute_error(y[mask], pred_breath[mask])
print(f'Step 0 | OOF masked MAE (raw): {raw_mae:.6f}', flush=True)

B = train_b['breath_id'].nunique()
T = int(train_b['t_idx'].max()) + 1
assert B*T == len(train_b), 'Breath-order rows not contiguous; cannot reshape'

def snap_to_grid(arr, grid):
    idx = np.searchsorted(grid, arr)
    idx0 = np.clip(idx-1, 0, grid.size-1); idx1 = np.clip(idx, 0, grid.size-1)
    left = grid[idx0]; right = grid[idx1]
    return np.where(np.abs(arr-left) <= np.abs(arr-right), left, right).astype(np.float32)

def median_insp_segments(vals: np.ndarray, m: np.ndarray, k: int = 3) -> np.ndarray:
    out = vals.copy()
    n = len(vals); i = 0
    while i < n:
        if not m[i]:
            i += 1; continue
        j = i
        while j < n and m[j]:
            j += 1
        seg = vals[i:j]
        if seg.size >= 3 and k == 3:
            seg_ext = np.pad(seg, (1,1), mode='edge')
            med = np.median(np.stack([seg_ext[:-2], seg_ext[1:-1], seg_ext[2:]], axis=0), axis=0)
            out[i:j] = med.astype(np.float32)
        else:
            out[i:j] = seg.astype(np.float32)
        i = j
    return out

# Step 1: per-fold RC snap only
pp1 = pred_breath.copy()
t_start = time.time()
for k in range(n_folds):
    tr_rows = (folds_row != k); va_rows = (folds_row == k)
    if not va_rows.any():
        continue
    df_tr = pd.DataFrame({'y': y[tr_rows], 'rc': rc_key[tr_rows]})
    rc_grids = {int(rc): np.unique(grp['y'].values.astype(np.float32)) for rc, grp in df_tr.groupby('rc')}
    va_breaths = np.unique(train_b.loc[va_rows, 'breath_id'].to_numpy())
    # build breath start map
    first_rows = train_b.groupby('breath_id', sort=False).head(1)
    bid_to_start = dict(zip(first_rows['breath_id'].to_numpy(), first_rows.index.to_numpy()))
    for bid in va_breaths:
        s = bid_to_start[int(bid)]; e = s + T
        rc = int(rc_key[s])
        grid = rc_grids.get(rc, None)
        if grid is None or grid.size == 0:
            grid = np.unique(y[tr_rows].astype(np.float32))
        pp1[s:e] = snap_to_grid(pp1[s:e], grid)
print(f'Step 1 done in {time.time()-t_start:.1f}s', flush=True)
mae1 = mean_absolute_error(y[mask], pp1[mask])
print(f'Step 1 | OOF masked MAE (RC snap only): {mae1:.6f}', flush=True)

# Step 2: RC snap + median(3) within inspiration segments
pp2 = pp1.copy()
t_start = time.time()
first_rows = train_b.groupby('breath_id', sort=False).head(1)
bid_to_start = dict(zip(first_rows['breath_id'].to_numpy(), first_rows.index.to_numpy()))
for bid in np.unique(train_b['breath_id'].to_numpy()):
    s = bid_to_start[int(bid)]; e = s + T
    mb = mask[s:e]
    pp2[s:e] = median_insp_segments(pp2[s:e], mb, k=3)
print(f'Step 2 done in {time.time()-t_start:.1f}s', flush=True)
mae2 = mean_absolute_error(y[mask], pp2[mask])
print(f'Step 2 | OOF masked MAE (RC snap + median3-insp): {mae2:.6f}', flush=True)

# Step 3: add RC×t de-bias (train-fold med residual) + RC snap + median3-insp
pp3 = pred_breath.copy()
t_start = time.time()
first_rows = train_b.groupby('breath_id', sort=False).head(1)
bid_to_start = dict(zip(first_rows['breath_id'].to_numpy(), first_rows.index.to_numpy()))
for k in range(n_folds):
    tr_rows = (folds_row != k); va_rows = (folds_row == k)
    if not va_rows.any():
        continue
    resid = (y - pred_breath).astype(np.float32)
    m_tr = tr_rows & mask
    df_res = pd.DataFrame({'rc': rc_key[m_tr], 't': t_idx[m_tr], 'resid': resid[m_tr]})
    delta_tbl = df_res.groupby(['rc','t'])['resid'].median().astype(np.float32)
    df_tr = pd.DataFrame({'y': y[tr_rows], 'rc': rc_key[tr_rows]})
    rc_grids = {int(rc): np.unique(grp['y'].values.astype(np.float32)) for rc, grp in df_tr.groupby('rc')}
    for bid in np.unique(train_b.loc[va_rows, 'breath_id'].to_numpy()):
        s = bid_to_start[int(bid)]; e = s + T
        rc = int(rc_key[s])
        # de-bias
        keys = pd.MultiIndex.from_product([[rc], np.arange(T)], names=['rc','t'])
        dt = delta_tbl.reindex(keys).to_numpy() if not delta_tbl.empty else None
        vals = pp3[s:e].copy()
        if dt is not None:
            dt = np.where(np.isnan(dt), 0.0, dt).astype(np.float32)
            vals = vals + dt
        # snap
        grid = rc_grids.get(rc, None)
        if grid is None or grid.size == 0:
            grid = np.unique(y[tr_rows].astype(np.float32))
        vals = snap_to_grid(vals, grid)
        # median within insp
        vals = median_insp_segments(vals, mask[s:e], k=3)
        pp3[s:e] = vals.astype(np.float32)
print(f'Step 3 done in {time.time()-t_start:.1f}s', flush=True)
mae3 = mean_absolute_error(y[mask], pp3[mask])
print(f'Step 3 | OOF masked MAE (de-bias + RC snap + median3-insp): {mae3:.6f}', flush=True)

=== Stepwise OOF PP diagnostics: RC fold-safe snap -> +median(3) -> +RC×t de-bias ===


Using OOF file: oof_xgb.npy


Step 0 | OOF masked MAE (raw): 0.301180


Step 1 done in 2.9s


Step 1 | OOF masked MAE (RC snap only): 0.300484


Step 2 done in 3.6s


Step 2 | OOF masked MAE (RC snap + median3-insp): 0.489931


Step 3 done in 47.5s


Step 3 | OOF masked MAE (de-bias + RC snap + median3-insp): 0.489980


In [64]:
import numpy as np, pandas as pd, time
from pathlib import Path
from sklearn.metrics import mean_absolute_error
print('=== Blend XGB(3-seed MAE-avg) + CatBoost t-bucket on raw OOF; then test-time PP: de-bias -> RC snap -> median(3) ===', flush=True)

tr_path = Path('train_fe_v3.parquet')
te_path = Path('test_fe_v3.parquet')
oof_xgb_path = Path('oof_xgb_mae_avg.npy')
oof_cat_path = Path('oof_cat_bucket.npy')
sub_xgb_path = Path('submission_xgb_mae_avg.csv')
sub_cat_path = Path('submission_cat_bucket.csv')
assert tr_path.exists() and te_path.exists() and oof_xgb_path.exists() and oof_cat_path.exists() and sub_xgb_path.exists() and sub_cat_path.exists(), 'Missing artifacts for blend'

# Load FE and align
train_id = pd.read_parquet(tr_path).sort_values('id').reset_index(drop=True)
train_b  = pd.read_parquet(tr_path).sort_values(['breath_id','t_idx']).reset_index(drop=True)
test_id  = pd.read_parquet(te_path).sort_values('id').reset_index(drop=True)
test_b   = pd.read_parquet(te_path).sort_values(['breath_id','t_idx']).reset_index(drop=True)

# Load OOF preds (id-order) and map to breath-order
oof_xgb_id = np.load(oof_xgb_path).astype(np.float32)
oof_cat_id = np.load(oof_cat_path).astype(np.float32)
assert len(oof_xgb_id) == len(train_id) == len(oof_cat_id), 'OOF length mismatch vs train rows'
id_to_pos = dict(zip(train_id['id'].to_numpy(), np.arange(len(train_id), dtype=np.int64)))
idx_breath_order = np.array([id_to_pos[i] for i in train_b['id'].to_numpy()], dtype=np.int64)
oof_xgb = oof_xgb_id[idx_breath_order].astype(np.float32)
oof_cat = oof_cat_id[idx_breath_order].astype(np.float32)

y = train_b['pressure'].to_numpy(np.float32)
mask = (train_b['u_out'].to_numpy()==0)

# Grid-search blend weight on raw OOF
ws = np.linspace(0.0, 1.0, 21)
best_w = 0.0; best_mae = 1e9
for w in ws:
    pred = (1.0-w)*oof_xgb + w*oof_cat
    mae = mean_absolute_error(y[mask], pred[mask])
    if mae < best_mae:
        best_mae, best_w = float(mae), float(w)
print(f'Best raw OOF blend weight w_cat={best_w:.2f} (w_xgb={1.0-best_w:.2f}) -> MAE={best_mae:.6f}', flush=True)

# Build blended OOF (breath-order) and residual table for de-bias
oof_blend = (1.0-best_w)*oof_xgb + best_w*oof_cat
t_idx = train_b['t_idx'].to_numpy(np.int16)
rc_key_tr = (train_b['R'].astype(np.int32)*100 + train_b['C'].astype(np.int32)).to_numpy()
resid = (y - oof_blend).astype(np.float32)
df_res = pd.DataFrame({'rc': rc_key_tr[mask], 't': t_idx[mask], 'resid': resid[mask]})
delta_tbl = df_res.groupby(['rc','t'])['resid'].median().astype(np.float32)
print('Delta table size:', delta_tbl.size, flush=True)

# Load test submissions and blend in id-order, then map to breath-order for PP
sub_xgb = pd.read_csv(sub_xgb_path).sort_values('id').reset_index(drop=True)
sub_cat = pd.read_csv(sub_cat_path).sort_values('id').reset_index(drop=True)
assert (sub_xgb['id'].values == test_id['id'].values).all() and (sub_cat['id'].values == test_id['id'].values).all(), 'Submission id mismatch'
pred_xgb_id = sub_xgb['pressure'].to_numpy(np.float32)
pred_cat_id = sub_cat['pressure'].to_numpy(np.float32)
pred_blend_id = (1.0-best_w)*pred_xgb_id + best_w*pred_cat_id

# Map blended test preds to breath-order
id_to_pos_te = dict(zip(test_id['id'].to_numpy(), np.arange(len(test_id), dtype=np.int64)))
idx_test_breath_order = np.array([id_to_pos_te[i] for i in test_b['id'].to_numpy()], dtype=np.int64)
test_vals_breath = pred_blend_id[idx_test_breath_order].astype(np.float32)

# Build per-(R,C) train pressure grids
grid_all = np.unique(train_b['pressure'].values.astype(np.float32)); grid_all.sort()
rc_train = (train_b['R'].astype(np.int32)*100 + train_b['C'].astype(np.int32)).to_numpy()
rc_press = {}
tmp_df = pd.DataFrame({'rc': rc_train, 'p': train_b['pressure'].values.astype(np.float32)})
for rc, grp in tmp_df.groupby('rc'):
    g = np.unique(grp['p'].values); g.sort(); rc_press[int(rc)] = g

def snap_to_grid(arr, grid):
    idx = np.searchsorted(grid, arr)
    idx0 = np.clip(idx-1, 0, grid.size-1); idx1 = np.clip(idx, 0, grid.size-1)
    left = grid[idx0]; right = grid[idx1]
    return np.where(np.abs(arr-left) <= np.abs(arr-right), left, right).astype(np.float32)

def median_insp_segments(vals: np.ndarray, m: np.ndarray, k: int = 3) -> np.ndarray:
    out = vals.copy()
    n = len(vals); i = 0
    while i < n:
        if not m[i]:
            i += 1; continue
        j = i
        while j < n and m[j]:
            j += 1
        seg = vals[i:j]
        if seg.size >= 3 and k == 3:
            seg_ext = np.pad(seg, (1,1), mode='edge')
            med = np.median(np.stack([seg_ext[:-2], seg_ext[1:-1], seg_ext[2:]], axis=0), axis=0)
            out[i:j] = med.astype(np.float32)
        else:
            out[i:j] = seg.astype(np.float32)
        i = j
    return out

# Apply test-time PP: de-bias -> RC snap -> median(3) within inspiration segments
t0 = time.time()
out_vals = np.zeros_like(test_vals_breath, dtype=np.float32)
start = 0
T = int(test_b['t_idx'].max()) + 1
for bid, g in test_b.groupby('breath_id', sort=False):
    g = g.sort_values('t_idx')
    L = len(g)
    vals = test_vals_breath[start:start+L].copy()
    rc = int(g['R'].iloc[0])*100 + int(g['C'].iloc[0])
    tt = g['t_idx'].to_numpy(np.int16)
    # de-bias
    keys = pd.MultiIndex.from_arrays([np.full(L, rc, dtype=np.int32), tt], names=['rc','t'])
    delta = delta_tbl.reindex(keys).to_numpy()
    delta = np.where(np.isnan(delta), 0.0, delta).astype(np.float32)
    vals = vals + delta
    # snap
    grid = rc_press.get(rc, grid_all)
    vals = snap_to_grid(vals, grid)
    # median only on inspiration
    m = (g['u_out'].to_numpy()==0)
    vals = median_insp_segments(vals, m, k=3)
    out_vals[start:start+L] = vals.astype(np.float32)
    start += L

# Map back to id-order and save
out_df_breath = pd.DataFrame({'id': test_b['id'].to_numpy(), 'pressure': out_vals})
sub_out = out_df_breath.sort_values('id').reset_index(drop=True)
sub_out.to_csv('submission.csv', index=False)
print(f'Saved submission.csv (blend w_cat={best_w:.2f}) with test-time PP. Elapsed {round(time.time()-t0,1)}s', flush=True)

=== Blend XGB(3-seed MAE-avg) + CatBoost t-bucket on raw OOF; then test-time PP: de-bias -> RC snap -> median(3) ===


Best raw OOF blend weight w_cat=0.00 (w_xgb=1.00) -> MAE=0.363891


Delta table size: 288


Saved submission.csv (blend w_cat=0.00) with test-time PP. Elapsed 5.1s


In [65]:
import pandas as pd, os
from pathlib import Path
print('=== Reset submission.csv from best available prior artifact ===', flush=True)
candidates = ['submission_blend.csv', 'submission_xgb_mae_avg.csv', 'submission_xgb_s42.csv']
src = None
for c in candidates:
    if Path(c).exists():
        src = c; break
assert src is not None, 'No candidate submission file found'
sub = pd.read_csv(src).sort_values('id').reset_index(drop=True)
assert {'id','pressure'}.issubset(sub.columns), f'Bad columns in {src}'
sub.to_csv('submission.csv', index=False)
print(f'Wrote submission.csv from {src} (rows={len(sub)})', flush=True)

=== Reset submission.csv from best available prior artifact ===


Wrote submission.csv from submission_blend.csv (rows=603600)


In [67]:
import numpy as np, pandas as pd, time, gc, os
from pathlib import Path
from sklearn.metrics import mean_absolute_error
print('=== Fast recovery: per-timestep XGB (squarederror) with single holdout fold and RC snap-only ===', flush=True)

import xgboost as xgb
tr_path, te_path = Path('train_fe_v3.parquet'), Path('test_fe_v3.parquet')
assert tr_path.exists() and te_path.exists(), 'Run FE v3 cell first'
train = pd.read_parquet(tr_path)
test  = pd.read_parquet(te_path)

# Sort to breath-major for per-t training
train_b = train.sort_values(['breath_id','t_idx']).reset_index(drop=True)
test_b  = test.sort_values(['breath_id','t_idx']).reset_index(drop=True)

# Features: exclude target/meta and u_out
drop_cols = {'pressure','is_train','fold','id'}
FEATS = [c for c in train.columns if c not in drop_cols and c in test.columns and c != 'u_out']
print('Num features:', len(FEATS), flush=True)

# Folds mapping (breath-wise); pick one holdout fold for early stopping (k_holdout)
folds_df = pd.read_csv('folds_breath_v3.csv')
b2f = dict(zip(folds_df['breath_id'].astype(int), folds_df['fold'].astype(int)))
folds_b = train_b['breath_id'].astype(int).map(b2f).astype(np.int8).to_numpy()
assert not np.isnan(folds_b).any(), 'Missing folds for some breaths'
n_folds = int(folds_df['fold'].max()) + 1
k_holdout = 0
print('Using holdout fold:', k_holdout, flush=True)

# Targets and mask
y_b = train_b['pressure'].to_numpy(np.float32)
mask_b = (train_b['u_out'].to_numpy()==0)
t_idx_b = train_b['t_idx'].astype(np.int16).to_numpy()

T = int(train_b['t_idx'].max()) + 1
B = train_b['breath_id'].nunique()
print('Timesteps:', T, 'Breaths:', B, flush=True)

# Prepare OOF (holdout only, for monitoring) and test preds
oof = np.zeros(train_b.shape[0], dtype=np.float32)
test_pred_all = np.zeros(test_b.shape[0], dtype=np.float32)

# Fast-recovery params (expert-guided)
params = {
    'tree_method': 'hist',
    'device': 'cuda',
    'max_depth': 7,
    'min_child_weight': 32,
    'subsample': 0.8,
    'colsample_bytree': 0.7,
    'lambda': 12.0,
    'alpha': 1.0,
    'eta': 0.10,
    'objective': 'reg:squarederror',
    'eval_metric': 'mae',
    'nthread': max(1, os.cpu_count()-2)
}
num_round = 400
early = 50

t0 = time.time()
t_vec_test = test_b['t_idx'].to_numpy()
for t in range(T):
    idx_t_all = (t_idx_b == t)
    idx_fit = idx_t_all & mask_b
    if idx_fit.sum() == 0:
        continue
    X_t = train_b.loc[idx_fit, FEATS].to_numpy(np.float32, copy=False)
    y_t = y_b[idx_fit]
    f_t = folds_b[idx_fit]
    # Train/Val split by breath fold
    m_tr = (f_t != k_holdout)
    m_va = (f_t == k_holdout)
    if m_tr.sum() == 0 or m_va.sum() == 0:
        # fallback: simple split 90/10
        n = X_t.shape[0]
        cut = int(n*0.9)
        m_tr = np.zeros(n, dtype=bool); m_tr[:cut] = True
        m_va = ~m_tr
    dtr = xgb.DMatrix(X_t[m_tr], label=y_t[m_tr])
    dva = xgb.DMatrix(X_t[m_va], label=y_t[m_va])
    watch = [(dtr,'tr'),(dva,'va')]
    bst = xgb.train(params, dtr, num_boost_round=num_round, evals=watch, early_stopping_rounds=early, verbose_eval=False)
    attrs = bst.attributes()
    best_it = int(attrs.get('best_iteration', '0'))
    iter_range = (0, best_it + 1) if best_it > 0 else None
    # Map holdout preds back to full positions of timestep t
    pos_all_t = np.where(idx_t_all)[0]
    pos_fit_t = np.where(idx_fit)[0]
    pos_va_fit = pos_fit_t[m_va]
    fold_pred_val = np.zeros(idx_t_all.sum(), dtype=np.float32)
    pred_va = bst.predict(dva, iteration_range=iter_range).astype(np.float32)
    fold_pred_val_indices = np.searchsorted(pos_all_t, pos_va_fit)
    fold_pred_val[fold_pred_val_indices] = pred_va
    oof[idx_t_all] = fold_pred_val
    # Test slice for this t
    mt = (t_vec_test == t)
    if mt.any():
        X_te_t = test_b.loc[mt, FEATS].to_numpy(np.float32, copy=False)
        dte = xgb.DMatrix(X_te_t)
        test_pred_all[mt] = bst.predict(dte, iteration_range=iter_range).astype(np.float32)
    if (t+1) % 10 == 0 or t < 3:
        m_slice = mask_b & (t_idx_b == t)
        mae_t = mean_absolute_error(y_b[m_slice], oof[m_slice]) if m_slice.any() else np.nan
        print(f't={t:02d} | fit_rows={int(idx_fit.sum())} | holdout MAE_masked={mae_t:.4f} | elapsed={time.time()-t0:.1f}s', flush=True)
    gc.collect()

# Holdout OOF masked MAE (for monitoring only)
mae_holdout = mean_absolute_error(y_b[mask_b & (folds_b==k_holdout)], oof[mask_b & (folds_b==k_holdout)])
print(f'Holdout fold {k_holdout} masked MAE: {mae_holdout:.6f}', flush=True)

# Save raw test preds (id-order)
test_b_pred = test_b[['id']].copy(); test_b_pred['pressure'] = test_pred_all.astype(np.float32)
sub_raw = test_b_pred.sort_values('id').reset_index(drop=True)
sub_raw.to_csv('submission_raw_xgb_fast.csv', index=False)
print('Saved submission_raw_xgb_fast.csv (raw, no PP)', flush=True)

# Build per-(R,C) pressure grids from full train for snap-only
grid_all = np.unique(train_b['pressure'].values.astype(np.float32)); grid_all.sort()
rc_train = (train_b['R'].astype(np.int32)*100 + train_b['C'].astype(np.int32)).to_numpy()
rc_press = {}
tmp_df = pd.DataFrame({'rc': rc_train, 'p': train_b['pressure'].values.astype(np.float32)})
for rc, grp in tmp_df.groupby('rc'):
    g = np.unique(grp['p'].values); g.sort(); rc_press[int(rc)] = g

def snap_to_grid(arr, grid):
    idx = np.searchsorted(grid, arr)
    idx0 = np.clip(idx-1, 0, grid.size-1); idx1 = np.clip(idx, 0, grid.size-1)
    left = grid[idx0]; right = grid[idx1]
    return np.where(np.abs(arr-left) <= np.abs(arr-right), left, right).astype(np.float32)

# Apply RC snap-only to test preds in breath-order
test_id  = test.sort_values('id').reset_index(drop=True)
press_id_order = sub_raw['pressure'].to_numpy(np.float32)
id_to_pos_te = dict(zip(test_id['id'].to_numpy(), np.arange(len(test_id), dtype=np.int64)))
idx_test_breath_order = np.array([id_to_pos_te[i] for i in test_b['id'].to_numpy()], dtype=np.int64)
test_vals_breath = press_id_order[idx_test_breath_order].astype(np.float32)

out_vals = np.zeros_like(test_vals_breath, dtype=np.float32)
start = 0
T = int(test_b['t_idx'].max()) + 1
t1 = time.time()
for bid, g in test_b.groupby('breath_id', sort=False):
    g = g.sort_values('t_idx')
    L = len(g)
    vals = test_vals_breath[start:start+L].copy()
    rc = int(g['R'].iloc[0])*100 + int(g['C'].iloc[0])
    grid = rc_press.get(rc, grid_all)
    vals = snap_to_grid(vals, grid)
    out_vals[start:start+L] = vals.astype(np.float32)
    start += L
print('Snap-only done in', round(time.time()-t1,1), 's', flush=True)

# Map back to id-order and save final submission (snap-only)
out_df_breath = pd.DataFrame({'id': test_b['id'].to_numpy(), 'pressure': out_vals})
sub_out = out_df_breath.sort_values('id').reset_index(drop=True)
sub_out.to_csv('submission.csv', index=False)
print('Saved submission.csv (RC snap-only applied to raw fast XGB preds).', flush=True)

=== Fast recovery: per-timestep XGB (squarederror) with single holdout fold and RC snap-only ===


Num features: 58


Using holdout fold: 0


Timesteps: 80 Breaths: 67905


t=00 | fit_rows=67905 | holdout MAE_masked=4.6953 | elapsed=1.9s


t=01 | fit_rows=67905 | holdout MAE_masked=5.3090 | elapsed=4.0s


t=02 | fit_rows=67905 | holdout MAE_masked=7.3975 | elapsed=6.1s


t=09 | fit_rows=67905 | holdout MAE_masked=13.6506 | elapsed=20.5s


t=19 | fit_rows=67905 | holdout MAE_masked=16.2944 | elapsed=40.4s


t=29 | fit_rows=47285 | holdout MAE_masked=17.8744 | elapsed=59.1s


Holdout fold 0 masked MAE: 0.353893


Saved submission_raw_xgb_fast.csv (raw, no PP)


Snap-only done in 1.1 s


Saved submission.csv (RC snap-only applied to raw fast XGB preds).


In [69]:
import numpy as np, pandas as pd, time, gc, os, sys, subprocess
from pathlib import Path
from sklearn.metrics import mean_absolute_error
print('=== Per-t LightGBM (CPU, MAE) fast run with holdout; blend with fast XGB; RC snap-only ===', flush=True)

try:
    import lightgbm as lgb
except Exception:
    print('Installing lightgbm...', flush=True)
    subprocess.run([sys.executable, '-m', 'pip', 'install', 'lightgbm==4.6.0'], check=True)
    import lightgbm as lgb

tr_path, te_path = Path('train_fe_v3.parquet'), Path('test_fe_v3.parquet')
assert tr_path.exists() and te_path.exists(), 'Run FE v3 cell first'
train = pd.read_parquet(tr_path)
test  = pd.read_parquet(te_path)

# Sort breath-major
train_b = train.sort_values(['breath_id','t_idx']).reset_index(drop=True)
test_b  = test.sort_values(['breath_id','t_idx']).reset_index(drop=True)

# Features: exclude target/meta and u_out
drop_cols = {'pressure','is_train','fold','id'}
FEATS = [c for c in train.columns if c not in drop_cols and c in test.columns and c != 'u_out']
print('Num features:', len(FEATS), flush=True)

# Folds mapping (breath-wise) and holdout fold
folds_df = pd.read_csv('folds_breath_v3.csv')
b2f = dict(zip(folds_df['breath_id'].astype(int), folds_df['fold'].astype(int)))
folds_b = train_b['breath_id'].astype(int).map(b2f).astype(np.int8).to_numpy()
assert not np.isnan(folds_b).any(), 'Missing folds for some breaths'
k_holdout = 0
print('Using holdout fold:', k_holdout, flush=True)

# Targets and mask
y_b = train_b['pressure'].to_numpy(np.float32)
mask_b = (train_b['u_out'].to_numpy()==0)
t_idx_b = train_b['t_idx'].astype(np.int16).to_numpy()

T = int(train_b['t_idx'].max()) + 1
B = train_b['breath_id'].nunique()
print('Timesteps:', T, 'Breaths:', B, flush=True)

# Containers
oof_lgb = np.zeros(train_b.shape[0], dtype=np.float32)  # only holdout positions will be non-zero
test_pred_all_lgb = np.zeros(test_b.shape[0], dtype=np.float32)

# LGB params (expert-suggested fast settings)
lgb_params = {
  'objective': 'regression_l1',
  'metric': 'mae',
  'learning_rate': 0.05,
  'num_leaves': 127,
  'min_child_samples': 20,
  'feature_fraction': 0.8,
  'bagging_fraction': 0.8,
  'bagging_freq': 1,
  'lambda_l1': 1.0,
  'lambda_l2': 5.0,
  'num_threads': os.cpu_count() or 8,
  'force_row_wise': True,
  'deterministic': True,
  'seed': 42,
  'verbosity': -1
}
num_boost_round = 800
early_stopping_rounds = 80

t0 = time.time()
t_vec_test = test_b['t_idx'].to_numpy()
for t in range(T):
    idx_t_all = (t_idx_b == t)
    idx_fit = idx_t_all & mask_b
    if idx_fit.sum() == 0:
        continue
    X_t = train_b.loc[idx_fit, FEATS].to_numpy(np.float32, copy=False)
    y_t = y_b[idx_fit]
    f_t = folds_b[idx_fit]
    m_tr = (f_t != k_holdout)
    m_va = (f_t == k_holdout)
    if m_tr.sum() == 0 or m_va.sum() == 0:
        n = X_t.shape[0]
        cut = int(n*0.9)
        m_tr = np.zeros(n, dtype=bool); m_tr[:cut] = True
        m_va = ~m_tr
    dtr = lgb.Dataset(X_t[m_tr], label=y_t[m_tr])
    dva = lgb.Dataset(X_t[m_va], label=y_t[m_va], reference=dtr)
    callbacks = [lgb.early_stopping(stopping_rounds=early_stopping_rounds, verbose=False)]
    booster = lgb.train(lgb_params, dtr, num_boost_round=num_boost_round, valid_sets=[dtr, dva], valid_names=['tr','va'], callbacks=callbacks)
    # Map holdout preds back to full positions of timestep t
    pos_all_t = np.where(idx_t_all)[0]
    pos_fit_t = np.where(idx_fit)[0]
    pos_va_fit = pos_fit_t[m_va]
    fold_pred_val = np.zeros(idx_t_all.sum(), dtype=np.float32)
    best_iter = getattr(booster, 'best_iteration', None)
    if best_iter is None or best_iter <= 0:
        best_iter = booster.num_trees()
    pred_va = booster.predict(X_t[m_va], num_iteration=best_iter).astype(np.float32)
    fold_pred_val_indices = np.searchsorted(pos_all_t, pos_va_fit)
    fold_pred_val[fold_pred_val_indices] = pred_va
    oof_lgb[idx_t_all] = fold_pred_val
    # Test slice for this t
    mt = (t_vec_test == t)
    if mt.any():
        X_te_t = test_b.loc[mt, FEATS].to_numpy(np.float32, copy=False)
        test_pred_all_lgb[mt] = booster.predict(X_te_t, num_iteration=best_iter).astype(np.float32)
    if (t+1) % 10 == 0 or t < 3:
        m_slice = mask_b & (t_idx_b == t) & (folds_b == k_holdout)
        mae_t = mean_absolute_error(y_b[m_slice], oof_lgb[m_slice]) if m_slice.any() else np.nan
        print(f'LGB t={t:02d} | fit_rows={int(idx_fit.sum())} | holdout MAE_masked={mae_t:.4f} | elapsed={time.time()-t0:.1f}s', flush=True)
    gc.collect()

# Holdout masked MAE for monitoring
mae_hold = mean_absolute_error(y_b[mask_b & (folds_b==k_holdout)], oof_lgb[mask_b & (folds_b==k_holdout)])
print(f'LGB Holdout fold {k_holdout} masked MAE: {mae_hold:.6f}', flush=True)

# Save raw LGB test preds (id-order)
sub_lgb_raw = test_b[['id']].copy(); sub_lgb_raw['pressure'] = test_pred_all_lgb.astype(np.float32)
sub_lgb_raw = sub_lgb_raw.sort_values('id').reset_index(drop=True)
sub_lgb_raw.to_csv('submission_lgb_raw.csv', index=False)
print('Saved submission_lgb_raw.csv (raw, no PP)', flush=True)

# Blend with fast XGB raw preds (heuristic weight if no tuning):
xgb_raw_path = Path('submission_raw_xgb_fast.csv')
assert xgb_raw_path.exists(), 'submission_raw_xgb_fast.csv not found (run fast XGB cell first)'
sub_xgb_raw = pd.read_csv(xgb_raw_path).sort_values('id').reset_index(drop=True)
assert (sub_xgb_raw['id'].values == sub_lgb_raw['id'].values).all(), 'ID mismatch between XGB and LGB raw submissions'
w_lgb = 0.30  # heuristic (cap <= 0.5)
blend_raw = sub_xgb_raw[['id']].copy()
blend_raw['pressure'] = ((1.0 - w_lgb) * sub_xgb_raw['pressure'].astype(np.float32) + w_lgb * sub_lgb_raw['pressure'].astype(np.float32)).astype(np.float32)
blend_raw.to_csv('submission_raw_xgb_lgb_blend.csv', index=False)
print(f'Saved submission_raw_xgb_lgb_blend.csv (w_lgb={w_lgb:.2f})', flush=True)

# RC snap-only on blended preds
test_id  = test.sort_values('id').reset_index(drop=True)
press_id_order = blend_raw['pressure'].to_numpy(np.float32)
id_to_pos_te = dict(zip(test_id['id'].to_numpy(), np.arange(len(test_id), dtype=np.int64)))
idx_test_breath_order = np.array([id_to_pos_te[i] for i in test_b['id'].to_numpy()], dtype=np.int64)
test_vals_breath = press_id_order[idx_test_breath_order].astype(np.float32)

# Build per-(R,C) grids from full train
grid_all = np.unique(train_b['pressure'].values.astype(np.float32)); grid_all.sort()
rc_train = (train_b['R'].astype(np.int32)*100 + train_b['C'].astype(np.int32)).to_numpy()
rc_press = {}
tmp_df = pd.DataFrame({'rc': rc_train, 'p': train_b['pressure'].values.astype(np.float32)})
for rc, grp in tmp_df.groupby('rc'):
    g = np.unique(grp['p'].values); g.sort(); rc_press[int(rc)] = g

def snap_to_grid(arr, grid):
    idx = np.searchsorted(grid, arr)
    idx0 = np.clip(idx-1, 0, grid.size-1); idx1 = np.clip(idx, 0, grid.size-1)
    left = grid[idx0]; right = grid[idx1]
    return np.where(np.abs(arr-left) <= np.abs(arr-right), left, right).astype(np.float32)

out_vals = np.zeros_like(test_vals_breath, dtype=np.float32)
start = 0
t1 = time.time()
for bid, g in test_b.groupby('breath_id', sort=False):
    g = g.sort_values('t_idx')
    L = len(g)
    vals = test_vals_breath[start:start+L].copy()
    rc = int(g['R'].iloc[0])*100 + int(g['C'].iloc[0])
    grid = rc_press.get(rc, grid_all)
    vals = snap_to_grid(vals, grid)
    out_vals[start:start+L] = vals.astype(np.float32)
    start += L
print('Snap-only (blend) done in', round(time.time()-t1,1), 's', flush=True)

sub_blend_breath = pd.DataFrame({'id': test_b['id'].to_numpy(), 'pressure': out_vals})
sub_blend = sub_blend_breath.sort_values('id').reset_index(drop=True)
sub_blend.to_csv('submission_blend_xgb_lgb_snap.csv', index=False)
sub_blend.to_csv('submission.csv', index=False)
print('Saved submission_blend_xgb_lgb_snap.csv and updated submission.csv with RC snap-only.', flush=True)

=== Per-t LightGBM (CPU, MAE) fast run with holdout; blend with fast XGB; RC snap-only ===


Num features: 58


Using holdout fold: 0


Timesteps: 80 Breaths: 67905


LGB t=00 | fit_rows=67905 | holdout MAE_masked=0.2008 | elapsed=17.1s


LGB t=01 | fit_rows=67905 | holdout MAE_masked=0.1885 | elapsed=35.8s


LGB t=02 | fit_rows=67905 | holdout MAE_masked=0.2008 | elapsed=55.8s


LGB t=09 | fit_rows=67905 | holdout MAE_masked=0.3605 | elapsed=202.7s


LGB t=19 | fit_rows=67905 | holdout MAE_masked=0.3533 | elapsed=407.3s


LGB t=29 | fit_rows=47285 | holdout MAE_masked=0.3772 | elapsed=616.8s


LGB Holdout fold 0 masked MAE: 0.334499


Saved submission_lgb_raw.csv (raw, no PP)


Saved submission_raw_xgb_lgb_blend.csv (w_lgb=0.30)


Snap-only (blend) done in 1.1 s


Saved submission_blend_xgb_lgb_snap.csv and updated submission.csv with RC snap-only.


In [None]:
import time, numpy as np, pandas as pd
from pathlib import Path
print('=== Postprocess full 5-fold XGB outputs (RC snap-only) when ready ===', flush=True)

tr_path = Path('train_fe_v3.parquet')
te_path = Path('test_fe_v3.parquet')
raw_sub_path = Path('submission_xgb_full.csv')
oof_path = Path('oof_xgb_full.npy')

if not tr_path.exists() or not te_path.exists():
    raise SystemExit('Missing FE parquet files')

if not raw_sub_path.exists():
    print('submission_xgb_full.csv not found yet. Run this cell after train_xgb_full.py finishes.', flush=True)
else:
    # Load FE and raw submission
    train_b = pd.read_parquet(tr_path).sort_values(['breath_id','t_idx']).reset_index(drop=True)
    test_b  = pd.read_parquet(te_path).sort_values(['breath_id','t_idx']).reset_index(drop=True)
    test_id = pd.read_parquet(te_path).sort_values('id').reset_index(drop=True)
    sub_raw = pd.read_csv(raw_sub_path).sort_values('id').reset_index(drop=True)
    assert {'id','pressure'}.issubset(sub_raw.columns), 'Bad columns in submission_xgb_full.csv'
    assert len(sub_raw)==len(test_id), 'Row count mismatch vs test'
    assert (sub_raw['id'].values == test_id['id'].values).all(), 'ID order mismatch vs test'

    # Build per-(R,C) pressure grids from full train
    grid_all = np.unique(train_b['pressure'].values.astype(np.float32)); grid_all.sort()
    rc_train = (train_b['R'].astype(np.int32)*100 + train_b['C'].astype(np.int32)).to_numpy()
    rc_press = {}
    tmp_df = pd.DataFrame({'rc': rc_train, 'p': train_b['pressure'].values.astype(np.float32)})
    for rc, grp in tmp_df.groupby('rc'):
        g = np.unique(grp['p'].values); g.sort(); rc_press[int(rc)] = g

    def snap_to_grid(arr, grid):
        idx = np.searchsorted(grid, arr)
        idx0 = np.clip(idx-1, 0, grid.size-1); idx1 = np.clip(idx, 0, grid.size-1)
        left = grid[idx0]; right = grid[idx1]
        return np.where(np.abs(arr-left) <= np.abs(arr-right), left, right).astype(np.float32)

    # Map raw preds to breath-order
    id_to_pos_te = dict(zip(test_id['id'].to_numpy(), np.arange(len(test_id), dtype=np.int64)))
    idx_test_breath_order = np.array([id_to_pos_te[i] for i in test_b['id'].to_numpy()], dtype=np.int64)
    vals_breath = sub_raw['pressure'].to_numpy(np.float32)[idx_test_breath_order]

    out_vals = np.zeros_like(vals_breath, dtype=np.float32)
    start = 0
    t0 = time.time()
    for bid, g in test_b.groupby('breath_id', sort=False):
        g = g.sort_values('t_idx')
        L = len(g)
        v = vals_breath[start:start+L].copy()
        rc = int(g['R'].iloc[0])*100 + int(g['C'].iloc[0])
        grid = rc_press.get(rc, grid_all)
        v = snap_to_grid(v, grid)
        out_vals[start:start+L] = v.astype(np.float32)
        start += L
    print('Snap-only done in', round(time.time()-t0,1), 's', flush=True)

    # Map back to id-order and save
    sub_snap = pd.DataFrame({'id': test_b['id'].to_numpy(), 'pressure': out_vals}).sort_values('id').reset_index(drop=True)
    sub_snap.to_csv('submission_xgb_full_snap.csv', index=False)
    sub_snap.to_csv('submission.csv', index=False)
    print('Saved submission_xgb_full_snap.csv and updated submission.csv (RC snap-only).', flush=True)

    # Optional: if OOF exists, report masked MAE for raw vs snap (fold-safe grids not applied here)
    if oof_path.exists():
        from sklearn.metrics import mean_absolute_error
        train_id = pd.read_parquet(tr_path).sort_values('id').reset_index(drop=True)
        oof_id = np.load(oof_path).astype(np.float32)
        id_to_pos_tr = dict(zip(train_id['id'].to_numpy(), np.arange(len(train_id), dtype=np.int64)))
        idx_breath_tr = np.array([id_to_pos_tr[i] for i in train_b['id'].to_numpy()], dtype=np.int64)
        oof_breath = oof_id[idx_breath_tr]
        y = train_b['pressure'].to_numpy(np.float32)
        mask = (train_b['u_out'].to_numpy()==0)
        mae_raw = mean_absolute_error(y[mask], oof_breath[mask])
        print(f'OOF masked MAE (full XGB raw): {mae_raw:.6f}', flush=True)
    else:
        print('oof_xgb_full.npy not found; skipping OOF report.', flush=True)