# 📊 Stat-OOD: Google Colab Experiment

Welcome to the **Stat-OOD** interactive playground. This notebook clones the repository, installs dependencies with `uv`, and runs out-of-distribution (OOD) detection experiments on Colab's free Tesla T4 GPU.

## 1. Setup Environment
We use `uv` for fast package management. The first cell confirms that a GPU is attached to the runtime; the second clones the code and installs PyTorch, Hugging Face Transformers, and the other dependencies.

In [1]:
!nvidia-smi

Mon Jan 12 06:24:26 2026       
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 550.54.15              Driver Version: 550.54.15      CUDA Version: 12.4     |
|-----------------------------------------+------------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|   0  Tesla T4                       Off |   00000000:00:04.0 Off |                    0 |
| N/A   34C    P8              9W /   70W |       0MiB /  15360MiB |      0%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+
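
To check for a GPU from Python rather than by reading the table above, a minimal sketch can query `nvidia-smi` directly. The helper name `gpu_names` is hypothetical and not part of Stat-OOD; it returns an empty list when no NVIDIA driver is present, so it is safe to run on a CPU-only runtime.

```python
import shutil
import subprocess


def gpu_names():
    """Return the GPU names reported by nvidia-smi, or [] if none are visible."""
    exe = shutil.which("nvidia-smi")
    if exe is None:
        return []  # no NVIDIA driver on this machine
    result = subprocess.run(
        [exe, "--query-gpu=name", "--format=csv,noheader"],
        capture_output=True,
        text=True,
        check=False,
    )
    if result.returncode != 0:
        return []  # driver present but no usable device
    return [line.strip() for line in result.stdout.splitlines() if line.strip()]


names = gpu_names()
print(names if names else "No GPU detected; check the Colab runtime type.")
```

On a runtime like the one shown above, the list would be expected to contain the Tesla T4.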
                                                

In [2]:
!pip install -q uv
!git clone https://github.com/sucpark/stat-ood.git
%cd stat-ood
!uv sync

   ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 22.3/22.3 MB 86.0 MB/s eta 0:00:00
Cloning into 'stat-ood'...
remote: Enumerating objects: 136, done.
remote: Counting objects: 100% (136/136), done.
remote: Compressing objects: 100% (86/86), done.
remote: Total 136 (delta 63), reused 116 (delta 43), pack-reused 0 (from 0)
Receiving objects: 100% (136/136), 171.28 KiB | 1.38 MiB/s, done.
Resolving deltas: 100% (63/63), done.
/content/stat-ood
Using CPython 3.12.12 interpreter at: /usr/bin/python3
Creating virtual environment at: .venv
Resolved 87 packages in 2ms
Prepared 85 packages in 1m 17s
Installed 85 packages in 702ms
 + aiohappyeyeballs

## 2. Quick Verification (Debug Mode)
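Run a fast one-epoch training cycle to verify that everything works. The captured log below reports `Using device: cpu`; if your run prints the same, switch the Colab runtime to a GPU before starting the full experiments.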
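
The `experiment.debug=true` argument in the next cell is a dotted `key=value` override of the kind Hydra accepts on the command line. As a rough illustration of what such an override does to a nested config, here is a toy stand-in. It is not Hydra's implementation; the `apply_override` helper and the `epochs` field are hypothetical.

```python
def apply_override(config, override):
    """Apply one dotted key=value override to a nested dict (toy stand-in)."""
    dotted_key, raw_value = override.split("=", 1)
    # Only booleans are converted here; everything else stays a string.
    literals = {"true": True, "false": False}
    value = literals.get(raw_value.lower(), raw_value)
    *parents, leaf = dotted_key.split(".")
    node = config
    for name in parents:
        node = node.setdefault(name, {})  # create missing groups on the way down
    node[leaf] = value
    return config


# Hypothetical defaults; the real Stat-OOD config may look different.
cfg = {"experiment": {"debug": False, "epochs": 5}}
apply_override(cfg, "experiment.debug=true")
print(cfg)
```

Only the targeted key changes; sibling keys in the same group keep their defaults.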

In [5]:
!uv run python main.py experiment.debug=true

[2026-01-12 05:43:18,775][__main__][INFO] - Using device: cpu
[2026-01-12 05:43:18,776][__main__][INFO] - Initializing Data Loader...
[2026-01-12 05:43:19,440][src.data.loader][INFO] - Loading dataset: clinc_oos
Map: 100% 15250/15250 [00:02<00:00, 5676.46 examples/s]
Map: 100% 3100/3100 [00:00<00:00, 4142.98 examples/s]
Map: 100% 5500/5500 [00:01<00:00, 4032.23 examples/s]
Filter: 100% 15250/15250 [00:00<00:00, 28812.47 examples/s]
Filter: 100% 3100/3100 [00:00<00:00, 28654.67 examples/s]
Filter: 100% 5500/5500 [00:00<00:00, 38470.23 examples/s]
Filter: 100% 5500/5500 [00:00<00:00, 50893.10 examples/s]
[2026-01-12 05:43:27,749][src.data.loader][INFO] - Train (ID) size: 15150
[2026-01-12 05:43:27,749][src.data.loader][INFO] - Val (ID) size: 3080
[2026-01-12 05:43:27,749][src.data.loader][INFO] - Test (ID) size: 5470
[2026-01-12 05:43:27,749][src.data.loader][INFO] - Test (OOD) size: 30
[2026-01-12 05:43:27,750][__main__][INFO] - Initializing Model...
[2026-01-12 05:43:29,138][src.models

## 3. Run Standard Experiment (English)
Train BERT on CLINC150 (English) and evaluate using **Mahalanobis Distance**. The `dataset` config group provides two options, `base` and `massive_ko`; a misspelled name such as `dataset=ba3se` makes Hydra abort with `Could not find 'dataset/ba3se'`.

In [7]:
!uv run python main.py model=base dataset=base ood_method=mahalanobis
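
For intuition about the Mahalanobis method named in this section, the sketch below follows a common formulation: fit one mean per in-distribution class plus a single shared covariance, then score a sample by its squared distance to the closest class mean. Whether Stat-OOD uses exactly this variant is an assumption. The 2-D points are made-up stand-ins for BERT embeddings, and the closed-form 2x2 inverse only works for this toy dimensionality.

```python
from collections import defaultdict


def fit_gaussians(features, labels):
    """Estimate per-class means and the inverse of one shared 2x2 covariance."""
    grouped = defaultdict(list)
    for point, label in zip(features, labels):
        grouped[label].append(point)
    means = {
        label: (
            sum(p[0] for p in points) / len(points),
            sum(p[1] for p in points) / len(points),
        )
        for label, points in grouped.items()
    }
    # Pool the class-centered deviations into a single (tied) covariance.
    sxx = sxy = syy = 0.0
    for point, label in zip(features, labels):
        dx = point[0] - means[label][0]
        dy = point[1] - means[label][1]
        sxx += dx * dx
        sxy += dx * dy
        syy += dy * dy
    n = len(features)
    sxx, sxy, syy = sxx / n, sxy / n, syy / n
    det = sxx * syy - sxy * sxy  # must be non-zero for the inverse to exist
    inverse = ((syy / det, -sxy / det), (-sxy / det, sxx / det))
    return means, inverse


def mahalanobis_score(point, means, inverse):
    """Squared distance to the closest class mean; larger suggests OOD."""
    best = float("inf")
    for mu in means.values():
        dx, dy = point[0] - mu[0], point[1] - mu[1]
        d2 = (dx * (inverse[0][0] * dx + inverse[0][1] * dy)
              + dy * (inverse[1][0] * dx + inverse[1][1] * dy))
        best = min(best, d2)
    return best


# Toy 2-D "embeddings" for two in-distribution intents (made-up values).
train_x = [(0, 0), (1, 0), (0, 1), (1, 1), (5, 5), (6, 5), (5, 6), (6, 6)]
train_y = [0, 0, 0, 0, 1, 1, 1, 1]
means, inverse = fit_gaussians(train_x, train_y)

near_class = mahalanobis_score((5, 5), means, inverse)  # close to intent 1
between = mahalanobis_score((3, 3), means, inverse)     # far from both intents
print(near_class, between)
```

A point lying between the two clusters scores higher than one near a cluster, which is the signal an OOD threshold would act on.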


## 4. Run Advanced Experiment (Korean + E5 + Energy)
This runs the research-grade configuration:
*   **Dataset**: MASSIVE (Korean)
*   **Model**: E5-Multilingual (Mean Pooling)
*   **Method**: Energy Score
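
Mean pooling, as listed above, turns per-token vectors into one sentence vector by averaging only the positions that the attention mask marks as real tokens. The following is a plain-Python sketch of that idea under the usual mask convention (1 = token, 0 = padding). The `mean_pool` helper is hypothetical and the vectors are made up; the repository presumably does this on tensors.

```python
def mean_pool(token_embeddings, attention_mask):
    """Average the token vectors whose mask entry is 1, ignoring padding."""
    dim = len(token_embeddings[0])
    total = [0.0] * dim
    count = 0
    for vector, keep in zip(token_embeddings, attention_mask):
        if keep:
            count += 1
            for i, value in enumerate(vector):
                total[i] += value
    if count == 0:
        raise ValueError("attention mask selects no tokens")
    return [component / count for component in total]


# Two real tokens followed by one padding position (made-up values).
tokens = [[1.0, 2.0], [3.0, 4.0], [9.0, 9.0]]
mask = [1, 1, 0]
sentence_vector = mean_pool(tokens, mask)
print(sentence_vector)
```

The padding vector does not affect the result, which is why the mask matters for batches of unequal-length sentences.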

In [None]:
!uv run python main.py \
    dataset=massive_ko \
    model.name="intfloat/multilingual-e5-base" \
    model.pooling="mean" \
    ood_method="energy"
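
The energy score selected by `ood_method="energy"` is commonly defined on a classifier's logits as E(x) = -T * logsumexp(logits / T), where higher energy suggests an out-of-distribution input. The sketch below implements that standard formula in plain Python. The temperature default and the sign convention are assumptions about how Stat-OOD applies it, and the logits are made up.

```python
import math


def energy_score(logits, temperature=1.0):
    """Return -T * logsumexp(logits / T); higher values suggest OOD."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    log_sum_exp = peak + math.log(sum(math.exp(s - peak) for s in scaled))
    return -temperature * log_sum_exp


confident = energy_score([10.0, 0.0, 0.0])  # one intent clearly dominates
uncertain = energy_score([0.1, 0.0, -0.1])  # flat logits, no clear intent
print(confident, uncertain)
```

The confident example yields the lower (more negative) energy, so thresholding on energy separates it from the flat, uncertain one.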

## 5. WandB Login (Optional)
To visualize results in Weights & Biases, run the cell below and enter your API key.

In [None]:
!uv run wandb login
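
As an alternative to the interactive prompt, the `wandb` library also reads the key from the `WANDB_API_KEY` environment variable. A minimal sketch, assuming the variable is set in the notebook kernel before the training cell runs so that the `!uv run ...` subprocess inherits it; the helper name `set_wandb_api_key` is hypothetical.

```python
import os


def set_wandb_api_key(api_key):
    """Export the key so wandb in child processes can pick it up."""
    key = api_key.strip()
    if not key:
        raise ValueError("empty WandB API key")
    os.environ["WANDB_API_KEY"] = key


# Placeholder value for illustration only; never commit a real key.
set_wandb_api_key("  your-api-key-here  ")
print("WANDB_API_KEY is set:", "WANDB_API_KEY" in os.environ)
```

In a shared notebook, reading the key with `getpass.getpass()` instead of a literal keeps it out of the saved cells.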