# TranspOLMo2-1B: Kaggle Starter Notebook

Quick setup for running transparency analysis on Kaggle.

This notebook will:
1. Clone the TranspOLMo2-1B repository
2. Install dependencies
3. Run a quick test to verify everything works
4. Run full analysis with Kaggle-optimized settings
5. Display results

## Setup (Run Once)

Clone the repository and install dependencies:

In [None]:
# Clone repository
!git clone https://github.com/BarzinL/TranspOLMo2-1B.git
%cd TranspOLMo2-1B

# Install dependencies (includes accelerate for multi-GPU support)
!pip install -q -e ".[all]"

In [None]:
# Verify GPU is available
!nvidia-smi
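The same check can be done from Python, which is handy when later cells should react to a missing GPU. The sketch below is a minimal example, assuming `nvidia-smi` is on the PATH whenever a GPU is attached. `list_gpus` is a hypothetical helper and is not part of the repository.

```python
import shutil
import subprocess


def list_gpus():
    """Return one 'name, memory' string per visible GPU, or [] if none."""
    # nvidia-smi is absent on CPU-only sessions, so check before calling it.
    if shutil.which("nvidia-smi") is None:
        return []
    result = subprocess.run(
        ["nvidia-smi", "--query-gpu=name,memory.total", "--format=csv,noheader"],
        capture_output=True,
        text=True,
    )
    # A non-zero exit code usually means the driver is not reachable.
    if result.returncode != 0:
        return []
    return [line.strip() for line in result.stdout.splitlines() if line.strip()]


gpus = list_gpus()
if gpus:
    for index, gpu in enumerate(gpus):
        print(f"GPU {index}: {gpu}")
else:
    print("No GPU detected. Enable an accelerator in the notebook settings.")
```

An empty list means the analysis would fall back to CPU, which is likely to be much slower.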

## Quick Test

Run a minimal test to verify everything works:

In [None]:
# Run minimal test (uses very little memory)
!python scripts/minimal_test.py

## Full Analysis

Run the complete transparency analysis pipeline with Kaggle-optimized configuration:

- Model: OLMo-2-0425-1B (1B parameters)
- Samples: 10,000
- Layers: [0, 6, 11] (representative layers)
- Precision: float16 (memory efficient)
- SAE training: Skipped (for faster initial run)

Results will be saved to `/kaggle/working/results/`
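To see why float16 is listed as memory efficient, here is a rough back-of-the-envelope calculation. It uses the nominal 1B parameter figure from the list above (not an exact count) and covers the weights only; activations and cached data add to the total.

```python
# Approximate weight memory for a 1B-parameter model at two precisions.
# NUM_PARAMS is the nominal "1B" figure, not an exact parameter count.
NUM_PARAMS = 1_000_000_000
BYTES_PER_PARAM = {"float32": 4, "float16": 2}


def weight_memory_gb(num_params, dtype):
    """Memory needed for the weights alone, in gigabytes (10^9 bytes)."""
    return num_params * BYTES_PER_PARAM[dtype] / 1e9


for dtype in BYTES_PER_PARAM:
    print(f"{dtype}: {weight_memory_gb(NUM_PARAMS, dtype):.1f} GB")
# float32: 4.0 GB
# float16: 2.0 GB
```

Halving the bytes per parameter halves the weight footprint, which leaves more GPU memory for activations collected during analysis.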

In [None]:
# Run full analysis with Kaggle config
!python scripts/run_full_analysis.py --config configs/kaggle.yaml

## View Results

Display the transparency scores and summary:

In [None]:
import json
from pathlib import Path

# Display transparency scores
scores_path = Path('/kaggle/working/results/transparency_scores.json')
if scores_path.exists():
    # Load the scores, then close the file before printing
    with open(scores_path) as f:
        scores = json.load(f)
    print("=" * 60)
    print("TRANSPARENCY ANALYSIS RESULTS")
    print("=" * 60)
    print(json.dumps(scores, indent=2))
    print("\n" + "=" * 60)
    print(f"Transparency Score: {scores['transparency_score']:.2%}")
    print(f"Features Discovered: {scores['total_features_discovered']}")
    print(f"Circuits Discovered: {scores['total_circuits_discovered']}")
    print("=" * 60)
else:
    print("Results not found. Make sure the analysis completed successfully.")

## Download Results

All results are saved to `/kaggle/working/results/`:

- `transparency_scores.json` - Quick summary of results
- `geometric_analysis.json` - Layer geometry analysis
- `model_documentation.json` - Complete documentation
- `model_documentation.md` - Human-readable report
- `summary.json` - Analysis summary

Use Kaggle's "Save Version" to preserve all outputs, or download specific files using the data panel.
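If a single download is more convenient than picking files one by one, the results directory can be zipped first. This is a minimal sketch using only the standard library. `archive_results` and the `results_bundle` name are hypothetical and not part of the repository.

```python
import shutil
from pathlib import Path


def archive_results(results_dir, archive_base):
    """Zip results_dir and return the archive path, or None if it is missing."""
    results_dir = Path(results_dir)
    if not results_dir.is_dir():
        return None
    # make_archive appends ".zip" to archive_base and returns the full path.
    return shutil.make_archive(str(archive_base), "zip", root_dir=results_dir)


archive = archive_results("/kaggle/working/results", "/kaggle/working/results_bundle")
print(archive or "Results directory not found; run the analysis first.")
```

The archive is written next to the results directory rather than inside it, so it does not end up containing itself.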

## Advanced Options

You can override any config setting with command-line arguments. The example below is commented out; uncomment it to run:

In [None]:
# Example: Run with different settings
# !python scripts/run_full_analysis.py \
#     --config configs/kaggle.yaml \
#     --num-samples 20000 \
#     --layers "0,6,11" \
#     --skip-sae  # Skip SAE training for faster runs
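When trying several combinations of overrides, it can be easier to build the command in Python than to edit the shell line by hand. The sketch below uses only the flags shown in the example above. `build_analysis_command` is a hypothetical helper, and it assumes `--layers` takes a single comma-separated string as in that example.

```python
import shlex


def build_analysis_command(config, num_samples=None, layers=None, skip_sae=False):
    """Assemble the run_full_analysis.py command with optional overrides."""
    command = ["python", "scripts/run_full_analysis.py", "--config", config]
    if num_samples is not None:
        command += ["--num-samples", str(num_samples)]
    if layers is not None:
        # Assumed format: one comma-separated string, e.g. "0,6,11".
        command += ["--layers", ",".join(str(layer) for layer in layers)]
    if skip_sae:
        command.append("--skip-sae")
    return command


command = build_analysis_command(
    "configs/kaggle.yaml", num_samples=20000, layers=[0, 6, 11], skip_sae=True
)
# Print a shell-ready string; pass `command` to subprocess.run to execute it.
print(shlex.join(command))
```

Options left at their defaults are omitted from the command, so the values in `configs/kaggle.yaml` apply.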