# CH-OOD Deep Learning Experiments\n\nThis notebook runs the deep OOD detection experiments for the paper:\n**Certified Geometric OOD via Directional Depth and Kubota Projection Sketches**\n\n## Expected Runtime on T4 GPU\n- **Total time**: ~45-60 minutes\n  - Model training: ~20-25 minutes\n  - Feature extraction: ~10-15 minutes\n  - OOD evaluation: ~15-20 minutes\n\n## Experiments\n- **ID Dataset**: CIFAR-10 (test set)\n- **OOD Datasets**: SVHN, CIFAR-100\n- **Methods**: CH-OOD (ours), Energy, ODIN, Mahalanobis\n- **Metrics**: AUROC, FPR@95%TPR

## 1. Setup Environment

In [None]:
# Check GPU availability\nimport torch\nif torch.cuda.is_available():\n    device = torch.device('cuda')\n    print(f\"✓ GPU available: {torch.cuda.get_device_name(0)}\")\n    print(f\"  Memory: {torch.cuda.get_device_properties(0).total_memory / 1e9:.2f} GB\")\nelse:\n    print(\"⚠️ No GPU detected. Runtime will be slower.\")\n    print(\"To enable GPU in Colab: Runtime → Change runtime type → Hardware accelerator → GPU (T4)\")

## 2. Install Dependencies

In [None]:
# Install required packages (Colab ships most of these preinstalled; pip skips any that are already present)\n!pip install -q torch torchvision numpy scikit-learn matplotlib tqdm pandas

## 3. Download Experiment Code

In [None]:
# Download the experiment code from GitHub\n!wget -q https://raw.githubusercontent.com/ashachar/ch-ood-experiments/main/ch_ood_colab.py\nprint(\"✓ Experiment code downloaded\")

In [None]:
# Import the experiment module\nimport ch_ood_colab\nimport importlib\nimportlib.reload(ch_ood_colab)  # Reload in case of updates\nprint(\"✓ Module imported successfully\")

## 4. Setup Directories and Device

In [None]:
# Create necessary directories and return device\ndevice = ch_ood_colab.setup_colab()\nprint(f\"✓ Setup complete. Using device: {device}\")

## 5. Train ResNet-18 on CIFAR-10\n\nThis step trains a ResNet-18 model on CIFAR-10 for 30 epochs.\n- **Expected time**: ~20-25 minutes on T4 GPU\n- **Skip this cell** to use a pre-trained model instead; step 7 loads one when `model` is not defined

In [None]:
# Train the model (skip this cell to load a pre-trained model in step 7 instead)\nmodel = ch_ood_colab.train_resnet_cifar10(device, epochs=30)\nprint(\"\\n✓ Model training complete\")

## 6. Load Datasets\n\nLoad CIFAR-10 (ID) and OOD datasets (SVHN, CIFAR-100)

In [None]:
# Load all datasets\nprint(\"Loading datasets...\")\nloaders = ch_ood_colab.load_datasets()\n\nprint(\"\\n✓ Datasets loaded:\")\nprint(f\"  - ID: CIFAR-10 (test set: {len(loaders['id_test'].dataset)} samples)\")\nprint(f\"  - OOD: SVHN ({len(loaders['ood_svhn'].dataset)} samples)\")\nprint(f\"  - OOD: CIFAR-100 ({len(loaders['ood_cifar100'].dataset)} samples)\")

## 7. Extract Features\n\nExtract penultimate layer features for OOD detection\n- **Expected time**: ~10-15 minutes

In [None]:
# Load model if not already loaded\nif 'model' not in locals():\n    print(\"Loading pre-trained model...\")\n    model = ch_ood_colab.load_model(device)\n\n# Extract features\nprint(\"Extracting features...\")\nfeatures = ch_ood_colab.extract_features(model, loaders, device)\n\nprint(\"\\n✓ Features extracted:\")\nfor key, feat in features.items():\n    print(f\"  - {key}: shape {feat.shape}\")
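The penultimate-layer features extracted above are the input to feature-space detectors such as the Mahalanobis baseline. How `ch_ood_colab` implements that baseline is not shown in this notebook, so the following is only a minimal sketch of one common formulation: class-conditional Gaussians with a shared covariance, scored by the negative distance to the nearest class mean. The function names `fit_class_gaussians` and `mahalanobis_score` and the toy 2-D data are hypothetical.

```python
import numpy as np

def fit_class_gaussians(feats, labels):
    """Fit per-class means and one shared (tied) precision matrix.

    Hypothetical helper, not part of ch_ood_colab.
    """
    classes = np.unique(labels)
    means = np.stack([feats[labels == c].mean(axis=0) for c in classes])
    # Center every sample by the mean of its own class
    centered = feats - means[np.searchsorted(classes, labels)]
    cov = centered.T @ centered / len(feats)
    precision = np.linalg.pinv(cov)  # pseudo-inverse tolerates a singular covariance
    return means, precision

def mahalanobis_score(x, means, precision):
    """Negative squared Mahalanobis distance to the closest class mean.

    Higher (closer to zero) means more in-distribution.
    """
    diff = x[:, None, :] - means[None, :, :]  # shape (n, classes, d)
    d2 = np.einsum('ncd,de,nce->nc', diff, precision, diff)
    return -d2.min(axis=1)

# Toy data standing in for extracted features: two well-separated classes
rng = np.random.default_rng(0)
feats = np.concatenate([rng.normal(0, 1, (50, 2)), rng.normal(5, 1, (50, 2))])
labels = np.array([0] * 50 + [1] * 50)

means, precision = fit_class_gaussians(feats, labels)
queries = np.array([[0.0, 0.0], [20.0, -20.0]])  # near a class mean vs. far from both
scores = mahalanobis_score(queries, means, precision)
print(scores)
```

The first query sits on a class mean and scores close to zero, while the second lies far from both classes and receives a much more negative score.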

## 8. Run OOD Detection Methods\n\nEvaluate multiple OOD detection methods:\n- CH-OOD (our method)\n- Energy-based\n- ODIN\n- Mahalanobis\n\n**Expected time**: ~15-20 minutes

In [None]:
# Run all OOD detection methods\nprint(\"Running OOD detection methods...\\n\")\nresults = ch_ood_colab.evaluate_ood_methods(features, device)\n\n# Display results summary\nprint(\"\\n\" + \"=\"*60)\nprint(\"OOD Detection Results (AUROC / FPR@95%)\")\nprint(\"=\"*60)\n\nfor ood_name in ['SVHN', 'CIFAR-100']:\n    print(f\"\\n{ood_name}:\")\n    for method in results[ood_name]:\n        auroc = results[ood_name][method]['auroc']\n        fpr95 = results[ood_name][method]['fpr95']\n        print(f\"  {method:15s}: AUROC={auroc:.3f}, FPR@95%={fpr95:.3f}\")
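Of the baselines evaluated above, the energy score is the simplest to state: it is the temperature-scaled log-sum-exp of the classifier logits. The sketch below illustrates that formula on toy logits. It is not the code in `ch_ood_colab`, and the function name `energy_score`, the default temperature of 1.0, and the example values are assumptions made for illustration.

```python
import numpy as np

def energy_score(logits, temperature=1.0):
    """Return T * logsumexp(logits / T) per row.

    This is the negative free energy; higher values suggest in-distribution.
    Hypothetical helper, not part of ch_ood_colab.
    """
    z = np.asarray(logits, dtype=np.float64) / temperature
    m = z.max(axis=1, keepdims=True)  # subtract the row max for numerical stability
    lse = m.squeeze(1) + np.log(np.exp(z - m).sum(axis=1))
    return temperature * lse

# Toy logits: one confident prediction and one nearly flat prediction
logits = np.array([[10.0, 0.0, 0.0],
                   [0.5, 0.4, 0.6]])
scores = energy_score(logits)
print(scores)
```

The confident row receives the higher score, which is the behaviour an energy-based detector relies on when it thresholds scores to separate ID from OOD inputs.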

## 9. Generate ROC Curves

In [None]:
# Plot ROC curves\nimport matplotlib.pyplot as plt\n\nfig, axes = plt.subplots(1, 2, figsize=(12, 5))\n\nfor idx, ood_name in enumerate(['SVHN', 'CIFAR-100']):\n    ax = axes[idx]\n    \n    for method in results[ood_name]:\n        fpr = results[ood_name][method]['fpr']\n        tpr = results[ood_name][method]['tpr']\n        auroc = results[ood_name][method]['auroc']\n        \n        ax.plot(fpr, tpr, label=f\"{method} (AUROC={auroc:.3f})\")\n    \n    ax.plot([0, 1], [0, 1], 'k--', alpha=0.3)\n    ax.set_xlabel('False Positive Rate')\n    ax.set_ylabel('True Positive Rate')\n    ax.set_title(f'CIFAR-10 (ID) vs {ood_name} (OOD)')\n    ax.legend(loc='lower right')\n    ax.grid(True, alpha=0.3)\n\nplt.tight_layout()\nplt.show()
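The `auroc` and `fpr95` values reported in this notebook are computed inside `ch_ood_colab`, whose internals are not shown here. As a rough illustration of what the two metrics measure, the sketch below assumes that higher scores mean "more in-distribution" and that ID is the positive class; the helper names `auroc` and `fpr_at_tpr` are hypothetical. The pairwise AUROC is only practical for small arrays; for full test sets, `sklearn.metrics.roc_auc_score` is the more sensible choice.

```python
import numpy as np

def fpr_at_tpr(id_scores, ood_scores, tpr=0.95):
    """Fraction of OOD samples accepted at a threshold that keeps >= tpr of ID samples."""
    id_sorted = np.sort(np.asarray(id_scores, dtype=float))
    ood_scores = np.asarray(ood_scores, dtype=float)
    # Index of the lowest ID score that must still be accepted
    k = int(np.floor((1.0 - tpr) * len(id_sorted)))
    threshold = id_sorted[k]
    return float(np.mean(ood_scores >= threshold))

def auroc(id_scores, ood_scores):
    """AUROC as P(id > ood) + 0.5 * P(id == ood), computed over all pairs."""
    id_scores = np.asarray(id_scores, dtype=float)
    ood_scores = np.asarray(ood_scores, dtype=float)
    diff = id_scores[:, None] - ood_scores[None, :]  # O(n * m) memory
    return float((diff > 0).mean() + 0.5 * (diff == 0).mean())

# Toy scores: one OOD sample (0.65) overlaps the ID range
id_scores = [0.9, 0.8, 0.7, 0.6]
ood_scores = [0.1, 0.2, 0.3, 0.65]

print(f"AUROC={auroc(id_scores, ood_scores):.4f}")
print(f"FPR@95%TPR={fpr_at_tpr(id_scores, ood_scores):.4f}")
```

A lower FPR@95%TPR is better, which is why the analysis report later selects the best method on this metric with `min` rather than `max`.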

## 10. Save Results

In [None]:
# Save results to JSON\nimport json\n\n# Prepare results for saving (remove numpy arrays)\nsave_results = {}\nfor ood_name in results:\n    save_results[ood_name] = {}\n    for method in results[ood_name]:\n        save_results[ood_name][method] = {\n            'auroc': float(results[ood_name][method]['auroc']),\n            'fpr95': float(results[ood_name][method]['fpr95'])\n        }\n\nwith open('results/deep_ood_results.json', 'w') as f:\n    json.dump(save_results, f, indent=2)\n\nprint(\"✓ Results saved to results/deep_ood_results.json\")

## 11. Create Summary Table

In [None]:
# Create a formatted summary table\nimport pandas as pd\n\n# Create DataFrame\ndata = []\nfor ood_name in save_results:\n    for method in save_results[ood_name]:\n        data.append({\n            'OOD Dataset': ood_name,\n            'Method': method,\n            'AUROC': save_results[ood_name][method]['auroc'],\n            'FPR@95%': save_results[ood_name][method]['fpr95']\n        })\n\ndf = pd.DataFrame(data)\n\n# Pivot tables for better visualization\nprint(\"\\n\" + \"=\"*50)\nprint(\"AUROC Results\")\nprint(\"=\"*50)\nauroc_pivot = df.pivot(index='Method', columns='OOD Dataset', values='AUROC')\nprint(auroc_pivot.round(3))\n\nprint(\"\\n\" + \"=\"*50)\nprint(\"FPR@95%TPR Results\")\nprint(\"=\"*50)\nfpr_pivot = df.pivot(index='Method', columns='OOD Dataset', values='FPR@95%')\nprint(fpr_pivot.round(3))

## 12. Generate Detailed Analysis Report\n\nCreate a comprehensive report for analysis

In [None]:
# Generate comprehensive analysis report\nfrom datetime import datetime\nimport numpy as np\n\nreport = []\nreport.append(\"=\"*70)\nreport.append(\"CH-OOD EXPERIMENTAL RESULTS - DETAILED ANALYSIS REPORT\")\nreport.append(\"=\"*70)\nreport.append(f\"Generated: {datetime.now().strftime('%Y-%m-%d %H:%M:%S')}\")\nreport.append(f\"GPU: {torch.cuda.get_device_name(0) if torch.cuda.is_available() else 'CPU'}\")\nreport.append(\"\")\n\n# 1. Dataset Information\nreport.append(\"1. DATASET SUMMARY\")\nreport.append(\"-\" * 40)\nreport.append(f\"   ID Dataset: CIFAR-10\")\nreport.append(f\"   - Training samples: 50,000\")\nreport.append(f\"   - Test samples: {len(loaders['id_test'].dataset)}\")\nreport.append(f\"   - Classes: 10\")\nreport.append(f\"   - Image size: 32x32x3\")\nreport.append(\"\")\nreport.append(f\"   OOD Datasets:\")\nreport.append(f\"   - SVHN: {len(loaders['ood_svhn'].dataset)} samples\")\nreport.append(f\"   - CIFAR-100: {len(loaders['ood_cifar100'].dataset)} samples\")\nreport.append(\"\")\n\n# 2. Model Architecture\nreport.append(\"2. MODEL ARCHITECTURE\")\nreport.append(\"-\" * 40)\nreport.append(f\"   Architecture: ResNet-18 (CIFAR variant)\")\nreport.append(f\"   - Modified first conv: 3x3 (no maxpool)\")\nreport.append(f\"   - Feature dimension: 512\")\nreport.append(f\"   - Training epochs: 30\")\nreport.append(f\"   - Optimizer: SGD with momentum\")\nreport.append(f\"   - Learning rate schedule: MultiStep [15, 25]\")\nreport.append(\"\")\n\n# 3. Method-by-Method Results\nreport.append(\"3. DETAILED RESULTS BY METHOD\")\nreport.append(\"-\" * 40)\n\nfor method in ['CH-OOD', 'Energy', 'ODIN', 'Mahalanobis']:\n    report.append(f\"\\n   {method}:\")\n    \n    # Collect results across datasets\n    aurocs = []\n    fpr95s = []\n    \n    for ood_name in ['SVHN', 'CIFAR-100']:\n        if method in results[ood_name]:\n            auroc = results[ood_name][method]['auroc']\n            fpr95 = results[ood_name][method]['fpr95']\n            aurocs.append(auroc)\n            fpr95s.append(fpr95)\n            report.append(f\"   - vs {ood_name:12s}: AUROC={auroc:.4f}, FPR@95%={fpr95:.4f}\")\n    \n    if aurocs:\n        report.append(f\"   - Average AUROC: {np.mean(aurocs):.4f} ± {np.std(aurocs):.4f}\")\n        report.append(f\"   - Average FPR@95%: {np.mean(fpr95s):.4f} ± {np.std(fpr95s):.4f}\")\n\nreport.append(\"\")\n\n# 4. Comparative Analysis\nreport.append(\"4. COMPARATIVE ANALYSIS\")\nreport.append(\"-\" * 40)\n\n# Best performer for each dataset\nfor ood_name in ['SVHN', 'CIFAR-100']:\n    report.append(f\"\\n   {ood_name} Results:\")\n    \n    # Find best AUROC\n    best_auroc_method = max(results[ood_name].keys(), \n                            key=lambda m: results[ood_name][m]['auroc'])\n    best_auroc = results[ood_name][best_auroc_method]['auroc']\n    \n    # Find best FPR@95\n    best_fpr_method = min(results[ood_name].keys(), \n                          key=lambda m: results[ood_name][m]['fpr95'])\n    best_fpr = results[ood_name][best_fpr_method]['fpr95']\n    \n    report.append(f\"   - Best AUROC: {best_auroc_method} ({best_auroc:.4f})\")\n    report.append(f\"   - Best FPR@95%: {best_fpr_method} ({best_fpr:.4f})\")\n    \n    # CH-OOD relative performance\n    ch_auroc = results[ood_name]['CH-OOD']['auroc']\n    ch_fpr = results[ood_name]['CH-OOD']['fpr95']\n    \n    auroc_gap = best_auroc - ch_auroc\n    fpr_gap = ch_fpr - best_fpr\n    \n    report.append(f\"   - CH-OOD gaps: AUROC={auroc_gap:+.4f}, FPR@95%={fpr_gap:+.4f}\")\n\nreport.append(\"\")\n\n# 5. Statistical Summary\nreport.append(\"5. STATISTICAL SUMMARY\")\nreport.append(\"-\" * 40)\n\n# Overall rankings\nall_methods = list(results['SVHN'].keys())\nrankings = {method: [] for method in all_methods}\n\nfor ood_name in ['SVHN', 'CIFAR-100']:\n    # Rank by AUROC (higher is better)\n    sorted_methods = sorted(all_methods, \n                           key=lambda m: results[ood_name][m]['auroc'], \n                           reverse=True)\n    for rank, method in enumerate(sorted_methods, 1):\n        rankings[method].append(rank)\n\nreport.append(\"\\n   Average Rankings (by AUROC):\")\navg_rankings = [(method, np.mean(ranks)) for method, ranks in rankings.items()]\navg_rankings.sort(key=lambda x: x[1])\n\nfor rank, (method, avg_rank) in enumerate(avg_rankings, 1):\n    report.append(f\"   {rank}. {method:15s}: {avg_rank:.1f}\")\n\nreport.append(\"\")\n\n# 6. Key Observations\nreport.append(\"6. KEY OBSERVATIONS\")\nreport.append(\"-\" * 40)\n\n# Calculate some insights\nch_ood_avg_auroc = np.mean([results[d]['CH-OOD']['auroc'] for d in ['SVHN', 'CIFAR-100']])\nenergy_avg_auroc = np.mean([results[d]['Energy']['auroc'] for d in ['SVHN', 'CIFAR-100']])\n\nif ch_ood_avg_auroc > 0.90:\n    report.append(\"   ✓ CH-OOD achieves strong performance (>0.90 avg AUROC)\")\nif ch_ood_avg_auroc > energy_avg_auroc:\n    report.append(\"   ✓ CH-OOD outperforms Energy baseline on average\")\n\n# Check consistency\nch_ood_std = np.std([results[d]['CH-OOD']['auroc'] for d in ['SVHN', 'CIFAR-100']])\nif ch_ood_std < 0.05:\n    report.append(\"   ✓ CH-OOD shows consistent performance across datasets (std < 0.05)\")\n\nreport.append(\"\")\n\n# 7. Recommendations\nreport.append(\"7. RECOMMENDATIONS FOR PAPER\")\nreport.append(\"-\" * 40)\nreport.append(\"   • Emphasize geometric interpretability and theoretical guarantees\")\nreport.append(\"   • Note that CH-OOD provides conformal p-values for calibrated uncertainty\")\nreport.append(\"   • Consider ensemble with Energy/ODIN for production deployment\")\nreport.append(\"   • Highlight the method's efficiency: O(nmd) vs O(n^4) for CHV\")\n\nreport.append(\"\")\nreport.append(\"=\"*70)\nreport.append(\"END OF REPORT\")\nreport.append(\"=\"*70)\n\n# Print the full report\nfull_report = \"\\n\".join(report)\nprint(full_report)\n\n# Save to file\nwith open('results/analysis_report.txt', 'w') as f:\n    f.write(full_report)\n\nprint(\"\\n✓ Report saved to results/analysis_report.txt\")\nprint(\"\\n\" + \"=\"*70)\nprint(\"COPY THE REPORT ABOVE FOR ANALYSIS\")\nprint(\"=\"*70)

## 13. Download Results\n\nPackage all results for download

In [None]:
# Create a zip file with all results\nimport zipfile\nimport os\n\nwith zipfile.ZipFile('ch_ood_results.zip', 'w') as zipf:\n    # Add results JSON and the analysis report, if present\n    for path in ('results/deep_ood_results.json', 'results/analysis_report.txt'):\n        if os.path.exists(path):\n            zipf.write(path)\n    \n    # Add all figures if any were saved\n    if os.path.exists('figures'):\n        for fig in os.listdir('figures'):\n            if fig.endswith('.png'):\n                zipf.write(f'figures/{fig}')\n    \n    # Add trained model if it exists\n    if os.path.exists('models/cifar10_resnet18_best.pth'):\n        zipf.write('models/cifar10_resnet18_best.pth')\n\nprint(\"✓ Results packaged in ch_ood_results.zip\")\nprint(\"\\nTo download in Colab:\")\nprint(\"  1. Click the folder icon in the left sidebar\")\nprint(\"  2. Find ch_ood_results.zip\")\nprint(\"  3. Click the three dots and select 'Download'\")

## Optional: Quick Test Run\n\nFor testing purposes, run everything with minimal epochs

In [None]:
# Quick test with just 1 epoch (for debugging)\n# Uncomment to run:\n\n# device = ch_ood_colab.setup_colab()\n# model = ch_ood_colab.train_resnet_cifar10(device, epochs=1)  # Just 1 epoch\n# results = ch_ood_colab.run_deep_ood_experiments(device, train_model=False)\n# print(\"Quick test complete!\")

## Citation\n\nIf you use this code in your research, please cite:\n\n```bibtex\n@article{shachar2025chood,\n  title={Certified Geometric OOD via Directional Depth and Kubota Projection Sketches},\n  author={Shachar, Amir},\n  year={2025}\n}\n```