# NanoMamba Training: Structural Noise Robustness
**Interspeech 2026: Noise Robustness by Architectural Design**

## SA-SSM Structural Guarantees:
- **Δ-modulation**: SNR-conditioned SSM bandwidth control
- **B-gating**: SNR-conditioned observation attenuation
- **ε residual**: Ungated input path (fixed 0.1, `register_buffer`)
- **Δ floor**: SSM bandwidth minimum (fixed 0.15, `register_buffer`)
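
The "fixed, `register_buffer`" wording above has a practical consequence in PyTorch: a buffer is saved in the `state_dict` and moves with `.to(device)`, but it is not returned by `parameters()` and is never updated by the optimizer. The toy module below is a minimal sketch of that behavior. The class name `FixedFloorDemo` and the way the floor is applied in `forward` are assumptions made for illustration and are not taken from `nanomamba.py`.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class FixedFloorDemo(nn.Module):
    """Hypothetical toy module; not the NanoMamba implementation."""

    def __init__(self):
        super().__init__()
        # Fixed constants: part of state_dict, but invisible to the optimizer.
        self.register_buffer("delta_floor", torch.tensor(0.15))
        self.register_buffer("epsilon", torch.tensor(0.1))
        # One learnable weight, for contrast.
        self.scale = nn.Parameter(torch.tensor(1.0))

    def forward(self, x):
        # Assumed form: a positive step size that can never drop below the floor.
        return torch.maximum(F.softplus(self.scale * x), self.delta_floor)


demo = FixedFloorDemo()
param_names = [name for name, _ in demo.named_parameters()]
buffer_names = sorted(name for name, _ in demo.named_buffers())
print("parameters:", param_names)
print("buffers:   ", buffer_names)
print("state_dict:", sorted(demo.state_dict().keys()))
```

Because buffers are excluded from `named_parameters()`, any inspection loop that wants to see such constants has to iterate over `named_buffers()` or the `state_dict` instead.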

## NEW: 3-Layer PCEN Defense (Factory/Pink Noise):
- **PCEN**: Adaptive AGC replaces log(mel), preserving speech under noise
- **Running SNR**: EMA noise tracking that handles factory impulses
- **FreqDepFloor**: Low-freq mel band safety net (non-learnable)
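
For readers unfamiliar with PCEN (per-channel energy normalization), the sketch below shows the standard formulation: each mel band is divided by an exponential moving average of its own energy (the adaptive gain control), then compressed with a root instead of a log. The function name `pcen_sketch` is hypothetical, and the parameter values are common defaults from the PCEN literature, not values read from `nanomamba.py`.

```python
def pcen_sketch(energy, s=0.025, alpha=0.98, delta=2.0, r=0.5, eps=1e-6):
    """Apply PCEN to `energy`, a list of frames of non-negative mel energies.

    M[t, f] = (1 - s) * M[t-1, f] + s * E[t, f]            (EMA smoother)
    out[t, f] = (E[t, f] / (eps + M[t, f])**alpha + delta)**r - delta**r
    """
    smoothed = list(energy[0])  # initialise the smoother with the first frame
    out = []
    for frame in energy:
        smoothed = [(1 - s) * m + s * e for m, e in zip(smoothed, frame)]
        out.append([
            (e / (eps + m) ** alpha + delta) ** r - delta ** r
            for e, m in zip(frame, smoothed)
        ])
    return out


# Two stationary inputs whose levels differ by a factor of 100.
quiet = pcen_sketch([[1.0]] * 50)[-1][0]
loud = pcen_sketch([[100.0]] * 50)[-1][0]

# A steady background followed by a sudden onset in the last frame.
onset = pcen_sketch([[1.0]] * 50 + [[10.0]])

print(f"stationary quiet: {quiet:.3f}, stationary loud: {loud:.3f}")
print(f"before onset: {onset[-2][0]:.3f}, at onset: {onset[-1][0]:.3f}")
```

The two stationary inputs come out almost identical, while the onset stands out clearly. This is the property the list above relies on: slowly varying background energy is normalized away, and transient speech energy is kept.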

## Models:
| Model | Params | Feature |
|-------|--------|---------|
| `NanoMamba-Tiny` | 4,634 | SA-SSM baseline |
| `NanoMamba-Tiny-TC` | 4,646 | + TinyConv2D |
| `NanoMamba-Tiny-PCEN` | 4,796 | + PCEN + Running SNR + FreqDepFloor |
| `NanoMamba-Small-PCEN` | 12,194 | Small + PCEN |
| `NanoMamba-Tiny-PCEN-TC` | 4,806 | PCEN + TinyConv2D (full defense) |
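
To put the parameter counts in the table into storage terms, the following sketch converts them to a float32 footprint using the same `p * 4 / 1024` arithmetic as the verification cell further down. The helper name `float32_kib` is hypothetical, and the figure covers weights only (no buffers, activations, or file-format overhead).

```python
# Parameter counts copied from the model table above.
PARAM_COUNTS = {
    "NanoMamba-Tiny": 4634,
    "NanoMamba-Tiny-TC": 4646,
    "NanoMamba-Tiny-PCEN": 4796,
    "NanoMamba-Small-PCEN": 12194,
    "NanoMamba-Tiny-PCEN-TC": 4806,
}


def float32_kib(n_params):
    """Size in KiB when every parameter is stored as a 4-byte float."""
    return n_params * 4 / 1024


footprints = {name: round(float32_kib(n), 1) for name, n in PARAM_COUNTS.items()}
for name, kib in footprints.items():
    print(f"{name:<25} {PARAM_COUNTS[name]:>6,} params  {kib:>5.1f} KiB")
```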

## 0. Setup & GPU Check

In [None]:
import torch
print(f"PyTorch: {torch.__version__}")
print(f"CUDA: {torch.cuda.is_available()}")
if torch.cuda.is_available():
    print(f"GPU: {torch.cuda.get_device_name(0)}")
    print(f"Memory: {torch.cuda.get_device_properties(0).total_memory / 1024**3:.1f} GB")

In [None]:
# Mount Google Drive
from google.colab import drive
drive.mount('/content/drive')

In [None]:
import os

# === Use the local disk (about 10x faster than Drive) ===
WORK_DIR = '/content/NanoMamba'
os.makedirs(WORK_DIR, exist_ok=True)

# Clone the latest code from GitHub (or pull if the repo is already present)
!git clone https://github.com/DrJinHoChoi/NanoMamba-Interspeech2026.git /content/NanoMamba-repo 2>/dev/null || (cd /content/NanoMamba-repo && git pull)
!cp /content/NanoMamba-repo/nanomamba.py {WORK_DIR}/
!cp /content/NanoMamba-repo/train_colab.py {WORK_DIR}/

# %cd also applies to the !python shell commands in later cells
%cd {WORK_DIR}
!pwd
!ls -la *.py

## 1. Verify nanomamba.py (Structural Δ floor + ε residual)

In [None]:
import sys
sys.path.insert(0, WORK_DIR)

from nanomamba import (
    create_nanomamba_tiny,
    create_nanomamba_small,
    create_nanomamba_tiny_tc,
    create_nanomamba_tiny_ws_tc,
    create_nanomamba_tiny_ws,
    create_nanomamba_tiny_pcen,
    create_nanomamba_small_pcen,
    create_nanomamba_tiny_pcen_tc,
)

print("=" * 60)
print("  Model Verification (All Variants)")
print("=" * 60)

import torch
audio = torch.randn(2, 16000)

for name, fn in [
    ('NanoMamba-Tiny', create_nanomamba_tiny),
    ('NanoMamba-Tiny-TC', create_nanomamba_tiny_tc),
    ('NanoMamba-Tiny-PCEN', create_nanomamba_tiny_pcen),
    ('NanoMamba-Small-PCEN', create_nanomamba_small_pcen),
    ('NanoMamba-Tiny-PCEN-TC', create_nanomamba_tiny_pcen_tc),
]:
    m = fn()
    m.eval()
    p = sum(x.numel() for x in m.parameters())
    kb = p * 4 / 1024

    with torch.no_grad():
        out = m(audio)

    print(f"  {name:<25} | {p:>6,} params ({kb:.1f} KB) | output={list(out.shape)}")

print("\n  All models OK!")

## 2. Dataset Download (Google Speech Commands V2)

In [None]:
# Download the dataset (only needed once per runtime; cached on local disk under WORK_DIR/data)
DATA_DIR = os.path.join(WORK_DIR, 'data')
os.makedirs(DATA_DIR, exist_ok=True)

from train_colab import SpeechCommandsDataset

print("Loading datasets...")
train_dataset = SpeechCommandsDataset(DATA_DIR, subset='training', augment=True)
val_dataset = SpeechCommandsDataset(DATA_DIR, subset='validation', augment=False)
test_dataset = SpeechCommandsDataset(DATA_DIR, subset='testing', augment=False)

print(f"\nTrain: {len(train_dataset)}, Val: {len(val_dataset)}, Test: {len(test_dataset)}")

## 3. Train NanoMamba-Tiny (Structural Baseline)

In [None]:
!python train_colab.py \
    --models NanoMamba-Tiny \
    --data_dir ./data \
    --checkpoint_dir ./checkpoints \
    --results_dir ./results \
    --epochs 30 \
    --batch_size 128 \
    --lr 3e-3 \
    --noise_types factory,white,babble,street,pink \
    --snr_range="-15,-10,-5,0,5,10,15"
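
The `--noise_types` and `--snr_range` flags imply that noise is mixed into speech at fixed signal-to-noise ratios. How `train_colab.py` does this is not shown in this notebook. The sketch below is one common approach, with hypothetical helper names (`mean_power`, `mix_at_snr`) and synthetic signals standing in for real audio: the noise is rescaled so that the speech-to-noise power ratio equals `10 ** (snr_db / 10)`.

```python
import math
import random


def mean_power(signal):
    """Average power of a sequence of samples."""
    return sum(x * x for x in signal) / len(signal)


def mix_at_snr(speech, noise, snr_db):
    """Add `noise` to `speech` after scaling it to the requested SNR in dB.

    Assumes both signals have the same length and non-zero power.
    """
    target_ratio = 10 ** (snr_db / 10)
    scale = math.sqrt(mean_power(speech) / (mean_power(noise) * target_ratio))
    return [s + scale * n for s, n in zip(speech, noise)]


random.seed(0)
# Synthetic stand-ins: a 440 Hz tone for speech, Gaussian samples for noise.
speech = [math.sin(2 * math.pi * 440 * t / 16000) for t in range(16000)]
noise = [random.gauss(0.0, 1.0) for _ in range(16000)]

for snr_db in (-15, 0, 15):
    noisy = mix_at_snr(speech, noise, snr_db)
    residual = [y - s for y, s in zip(noisy, speech)]
    realised = 10 * math.log10(mean_power(speech) / mean_power(residual))
    print(f"target {snr_db:+d} dB -> realised {realised:+.2f} dB")
```

At -15 dB the noise carries roughly 32 times the power of the speech, which is why the lowest SNR columns in the results are the hardest.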

## 4. Train NanoMamba-Tiny-TC (+ TinyConv2D)

In [None]:
!python train_colab.py \
    --models NanoMamba-Tiny-TC \
    --data_dir ./data \
    --checkpoint_dir ./checkpoints \
    --results_dir ./results \
    --epochs 30 \
    --batch_size 128 \
    --lr 3e-3 \
    --noise_types factory,white,babble,street,pink \
    --snr_range="-15,-10,-5,0,5,10,15"

## 5. Train NanoMamba-Tiny-WS-TC (Hero Model, 3.7K params)

In [None]:
!python train_colab.py \
    --models NanoMamba-Tiny-WS-TC \
    --data_dir ./data \
    --checkpoint_dir ./checkpoints \
    --results_dir ./results \
    --epochs 30 \
    --batch_size 128 \
    --lr 3e-3 \
    --noise_types factory,white,babble,street,pink \
    --snr_range="-15,-10,-5,0,5,10,15"

## 6. 🔥 Train NanoMamba-Tiny-PCEN (3-Layer Structural Factory/Pink Defense)

In [None]:
!python train_colab.py \
    --models NanoMamba-Tiny-PCEN \
    --data_dir ./data \
    --checkpoint_dir ./checkpoints \
    --results_dir ./results \
    --epochs 30 \
    --batch_size 128 \
    --lr 3e-3 \
    --noise_types factory,white,babble \
    --snr_range="-15,-10,-5,0,5,10,15"

In [None]:
# === Back up results to Drive (local data is lost when the runtime ends) ===
import shutil
DRIVE_BACKUP = '/content/drive/MyDrive/NanoMamba/results_pcen'
os.makedirs(DRIVE_BACKUP, exist_ok=True)
for f in ['checkpoints', 'results']:
    src = os.path.join(WORK_DIR, f)
    if os.path.exists(src):
        shutil.copytree(src, os.path.join(DRIVE_BACKUP, f), dirs_exist_ok=True)
        print(f"  Backed up {f} -> {DRIVE_BACKUP}/{f}")
print("Done! Results saved to Drive.")

## 7. Train NanoMamba-Small-PCEN & Tiny-PCEN-TC (Optional)

In [None]:
# Small-PCEN (12,194 params)
!python train_colab.py \
    --models NanoMamba-Small-PCEN \
    --data_dir ./data \
    --checkpoint_dir ./checkpoints \
    --results_dir ./results \
    --epochs 30 \
    --batch_size 128 \
    --lr 1e-3 \
    --noise_types factory,white,babble \
    --snr_range="-15,-10,-5,0,5,10,15"

In [None]:
# PCEN + TinyConv2D combo (4,806 params, full structural defense)
!python train_colab.py \
    --models NanoMamba-Tiny-PCEN-TC \
    --data_dir ./data \
    --checkpoint_dir ./checkpoints \
    --results_dir ./results \
    --epochs 30 \
    --batch_size 128 \
    --lr 3e-3 \
    --noise_types factory,white,babble \
    --snr_range="-15,-10,-5,0,5,10,15"

## 8. Results Analysis & Comparison

In [None]:
import json
import numpy as np

# Load results
results_path = os.path.join(WORK_DIR, 'results', 'final_results.json')
if os.path.exists(results_path):
    with open(results_path) as f:
        results = json.load(f)
    
    print("=" * 70)
    print("  FINAL RESULTS")
    print("=" * 70)
    
    for model_name, data in results.get('models', {}).items():
        print(f"\n  {model_name}: {data['params']:,} params")
        print(f"    Test Accuracy: {data.get('test_acc', 0):.2f}%")
        for noise_type, snr_data in data.get('noise_robustness', {}).items():
            snrs = ['-15', '-10', '-5', '0', '5', '10', '15', 'clean']
            vals = [snr_data.get(s, 0) for s in snrs]
            vals_str = ' | '.join(f"{v:.1f}" for v in vals)
            print(f"    {noise_type:<8}: {vals_str}")
else:
    print("No results file found. Run training cells first.")

In [None]:
import matplotlib.pyplot as plt
import json

# === Baseline results (earlier training run of NanoMamba-Tiny, without the structural Δ floor + ε residual) ===
baseline_results = {
    'NanoMamba-Tiny (old)': {
        'factory': [38.4, 56.1, 70.1, 77.6, 83.2, 85.1, 86.6],
        'white':   [20.2, 51.6, 69.3, 79.8, 86.2, 90.1, 91.8],
        'babble':  [58.6, 60.4, 65.0, 69.6, 77.3, 84.1, 87.4],
        'street':  [46.8, 58.9, 71.1, 78.8, 85.9, 89.0, 91.7],
        'pink':    [9.9, 38.3, 69.4, 81.9, 88.6, 91.3, 92.5],
    }
}

snr_levels = [-15, -10, -5, 0, 5, 10, 15]

# Load new results
results_path = os.path.join(WORK_DIR, 'results', 'final_results.json')
new_results = {}
if os.path.exists(results_path):
    with open(results_path) as f:
        data = json.load(f)
    for model_name, mdata in data.get('models', {}).items():
        new_results[model_name] = {}
        for noise_type, snr_data in mdata.get('noise_robustness', {}).items():
            new_results[model_name][noise_type] = [
                snr_data.get(str(s), 0) for s in snr_levels
            ]

# Plot comparison for each noise type
noise_types = ['factory', 'white', 'babble', 'street', 'pink']
fig, axes = plt.subplots(1, 5, figsize=(25, 5))

colors_old = {'NanoMamba-Tiny (old)': '#999999'}
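colors_new = {
    'NanoMamba-Tiny': '#2196F3',
    'NanoMamba-Tiny-TC': '#FF9800',
    'NanoMamba-Tiny-WS-TC': '#E91E63',
    # PCEN variants get their own colors so they do not all fall back to '#333'
    'NanoMamba-Tiny-PCEN': '#4CAF50',
    'NanoMamba-Small-PCEN': '#9C27B0',
    'NanoMamba-Tiny-PCEN-TC': '#795548',
}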

for idx, noise_type in enumerate(noise_types):
    ax = axes[idx]
    
    # Plot baseline (old, dashed)
    for name, noise_data in baseline_results.items():
        if noise_type in noise_data:
            ax.plot(snr_levels, noise_data[noise_type],
                    '--', color=colors_old.get(name, '#999'),
                    label=name, linewidth=1.5, alpha=0.7)
    
    # Plot new (solid)
    for name, noise_data in new_results.items():
        if noise_type in noise_data:
            ax.plot(snr_levels, noise_data[noise_type],
                    '-o', color=colors_new.get(name, '#333'),
                    label=name, linewidth=2, markersize=5)
    
    ax.set_title(f'{noise_type.upper()} Noise', fontsize=13, fontweight='bold')
    ax.set_xlabel('SNR (dB)')
    if idx == 0:
        ax.set_ylabel('Accuracy (%)')
    ax.set_ylim(0, 100)
    ax.grid(True, alpha=0.3)
    ax.legend(fontsize=7, loc='lower right')

plt.suptitle('Structural Noise Robustness: Before vs After (Δ floor + ε residual)',
             fontsize=14, fontweight='bold', y=1.02)
plt.tight_layout()
os.makedirs(os.path.join(WORK_DIR, 'results'), exist_ok=True)  # ensure the output folder exists
plt.savefig(os.path.join(WORK_DIR, 'results', 'structural_comparison.png'),
            dpi=150, bbox_inches='tight')
plt.show()
print("Plot saved to results/structural_comparison.png")

## 9. Degradation Slope Analysis (key metric)

In [None]:
import json
import numpy as np

# Baseline (old NanoMamba-Tiny, without the structural Δ floor + ε residual)
baseline = {
    'factory': [38.4, 56.1, 70.1, 77.6, 83.2, 85.1, 86.6],
    'white':   [20.2, 51.6, 69.3, 79.8, 86.2, 90.1, 91.8],
    'pink':    [9.9, 38.3, 69.4, 81.9, 88.6, 91.3, 92.5],
}

snr_levels = np.array([-15, -10, -5, 0, 5, 10, 15])

# Load new results
results_path = os.path.join(WORK_DIR, 'results', 'final_results.json')
if os.path.exists(results_path):
    with open(results_path) as f:
        data = json.load(f)

    print("=" * 70)
    print("  DEGRADATION SLOPE ANALYSIS")
    print("  (Lower slope = more gradual degradation = BETTER)")
    print("=" * 70)
    
    for noise_type in ['factory', 'white', 'pink']:
        print(f"\n  --- {noise_type.upper()} ---")
        
        # Baseline slope
        b = np.array(baseline[noise_type])
        slope_b = np.polyfit(snr_levels, b, 1)[0]
        drop_b = b[-1] - b[0]  # 15dB - (-15dB)
        print(f"  NanoMamba-Tiny (old):    slope={slope_b:.2f}%/dB | "
              f"range={b[0]:.1f}%→{b[-1]:.1f}% (Δ={drop_b:.1f}pp)")
        
        # New models
        for model_name, mdata in data.get('models', {}).items():
            nr = mdata.get('noise_robustness', {})
            if noise_type in nr:
                vals = np.array([nr[noise_type].get(str(s), 0) for s in snr_levels])
                slope_n = np.polyfit(snr_levels, vals, 1)[0]
                drop_n = vals[-1] - vals[0]
                
                # Compare with baseline
                slope_diff = slope_n - slope_b
                indicator = '✓ flatter' if slope_n < slope_b else '✗ steeper'

                print(f"  {model_name:<25} slope={slope_n:.2f}%/dB | "
                      f"range={vals[0]:.1f}%→{vals[-1]:.1f}% (Δ={drop_n:.1f}pp) | "
                      f"{indicator} ({slope_diff:+.2f})")
else:
    print("Run training first!")
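
The slope reported above is the coefficient of a degree-1 least-squares fit of accuracy against SNR, which is what `np.polyfit(snr_levels, vals, 1)[0]` returns. As a worked example, the sketch below computes it in closed form for the baseline factory-noise row. The helper name `least_squares_slope` is hypothetical.

```python
def least_squares_slope(x, y):
    """Slope of the best-fit line y = a * x + b (ordinary least squares)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    covariance = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    variance = sum((xi - mean_x) ** 2 for xi in x)
    return covariance / variance


snr_db = [-15, -10, -5, 0, 5, 10, 15]
factory_baseline = [38.4, 56.1, 70.1, 77.6, 83.2, 85.1, 86.6]  # from the cell above

slope = least_squares_slope(snr_db, factory_baseline)
spread = factory_baseline[-1] - factory_baseline[0]
print(f"slope = {slope:.2f} %/dB, accuracy range = {spread:.1f} pp")

# Caveat: a model that is uniformly poor is also perfectly "flat".
flat_but_bad = least_squares_slope(snr_db, [10.0] * len(snr_db))
print(f"slope of a constant 10% curve = {flat_but_bad:.2f} %/dB")
```

Since a constant low-accuracy curve also has zero slope, the slope is best read together with the absolute accuracies printed in the same row rather than on its own.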

## 10. Inspect Trained Structural Parameters

In [None]:
import torch
import torch.nn.functional as F
from pathlib import Path
from nanomamba import create_nanomamba_tiny, create_nanomamba_tiny_tc, create_nanomamba_tiny_ws_tc

print("=" * 60)
print("  LEARNED STRUCTURAL PARAMETERS (after training)")
print("=" * 60)

device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')

for name, fn in [
    ('NanoMamba-Tiny', create_nanomamba_tiny),
    ('NanoMamba-Tiny-TC', create_nanomamba_tiny_tc),
    ('NanoMamba-Tiny-WS-TC', create_nanomamba_tiny_ws_tc),
]:
    ckpt_path = Path(WORK_DIR) / 'checkpoints' / name / 'best.pt'
    if not ckpt_path.exists():
        print(f"\n  {name}: [no checkpoint]")
        continue
    
    model = fn().to(device)
    ckpt = torch.load(ckpt_path, map_location=device)
    model.load_state_dict(ckpt['model_state_dict'])
    
    print(f"\n  {name} (val_acc={ckpt.get('val_acc', 0):.2f}%):")
    # Note: named_parameters() lists only learnable tensors. Values registered with
    # register_buffer (such as a fixed Δ floor or ε) are not included here and would
    # have to be read via named_buffers() instead.
    for pname, p in model.named_parameters():
        if any(k in pname for k in ['log_delta_floor', 'log_epsilon',
                                     'gate_floor', 'alpha']):
            if 'log_delta' in pname:
                raw = p.item()
                val = F.softplus(p).item()
                print(f"    Δ_floor: raw={raw:.4f} → softplus={val:.6f}")
            elif 'log_epsilon' in pname:
                raw = p.item()
                val = F.softplus(p).item()
                print(f"    ε:       raw={raw:.4f} → softplus={val:.6f}")
            elif 'gate_floor' in pname:
                print(f"    B_floor: {p.item():.4f}")
            elif 'alpha' in pname:
                print(f"    α:       {p.item():.4f}")
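
The cell above prints each raw value next to its softplus. Softplus, `log(1 + exp(x))`, maps any real number to a strictly positive one, so a raw value stored this way always yields a positive floor or residual weight. The sketch below shows the mapping and its inverse in plain Python. The helper names are hypothetical, and 0.15 and 0.1 are simply the fixed values quoted at the top of this notebook.

```python
import math


def softplus(x):
    """Numerically stable log(1 + exp(x))."""
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def inverse_softplus(y):
    """Raw value whose softplus equals y (requires y > 0)."""
    return math.log(math.expm1(y))


for target in (0.15, 0.1):
    raw = inverse_softplus(target)
    print(f"target={target:.2f} -> raw={raw:.4f} -> softplus(raw)={softplus(raw):.6f}")
```

Reading the printout in this direction makes it easier to judge whether a trained raw value has drifted far from the initial target.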

## 11. (Optional) Eval-Only: Re-run the noise evaluation on trained checkpoints

In [None]:
# Use this if you want to re-evaluate without retraining
# !python train_colab.py \
#     --models NanoMamba-Tiny,NanoMamba-Tiny-TC,NanoMamba-Tiny-WS-TC \
#     --data_dir ./data \
#     --checkpoint_dir ./checkpoints \
#     --results_dir ./results \
#     --eval_only \
#     --noise_types factory,white,babble,street,pink \
#     --snr_range="-15,-10,-5,0,5,10,15"