# Colab Setup for Probe Research

**Purpose:** Set up a Google Colab environment with a GPU for mechanistic interpretability work.

**Steps:**
1. Enable GPU in Colab
2. Check GPU availability
3. Install required packages
4. Test TransformerLens on the GPU
5. Extract activations and train a quick probe

---

## ⚠️ FIRST: Enable GPU in Colab

**Before running any code:**
1. Click **Runtime** → **Change runtime type**
2. Set **Hardware accelerator** to **GPU** (T4 or better)
3. Click **Save**

Then proceed with the cells below.

---

## Step 1: Check GPU Availability

In [1]:
import torch

print("=" * 50)
print("GPU Availability Check")
print("=" * 50)

if torch.cuda.is_available():
    print(f"✅ CUDA available: {torch.cuda.is_available()}")
    print(f"✅ CUDA version: {torch.version.cuda}")
    print(f"✅ GPU device: {torch.cuda.get_device_name(0)}")
    print(f"✅ GPU memory: {torch.cuda.get_device_properties(0).total_memory / 1e9:.2f} GB")
    print(f"\n🚀 Ready for GPU-accelerated computations!")
else:
    print("❌ No GPU detected!")
    print("⚠️  Go to Runtime → Change runtime type → Set Hardware accelerator to GPU")
    print("Then restart this cell.")

GPU Availability Check
✅ CUDA available: True
✅ CUDA version: 12.6
✅ GPU device: Tesla T4
✅ GPU memory: 15.83 GB

🚀 Ready for GPU-accelerated computations!


---

## Step 2: Install Required Packages

**Note:** Colab has PyTorch pre-installed, so only one install command is needed:
- transformer-lens (mechanistic interpretability)
- Its dependencies, which pip resolves automatically

In [2]:
# Install transformer-lens and dependencies
!pip install transformer-lens -q

print("\n" + "=" * 50)
print("Package Installation Complete")
print("=" * 50)


Package Installation Complete


In [3]:
# Verify installations
import transformer_lens as tl
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score, classification_report

print("✅ All packages imported successfully!")
print(f"✅ NumPy: {np.__version__}")
print(f"✅ PyTorch: {torch.__version__}")
print(f"✅ TransformerLens ready!")

✅ All packages imported successfully!
✅ NumPy: 1.26.4
✅ PyTorch: 2.9.0+cu126
✅ TransformerLens ready!
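
The import check above confirms that the packages load, but it reports a version only for NumPy and PyTorch. A minimal sketch using only the standard library, assuming the distribution names below (`transformer-lens`, `scikit-learn`, and so on) match what pip installed in your runtime, lists the installed version of each package without importing it:

```python
from importlib import metadata

def installed_versions(dist_names):
    """Map each pip distribution name to its installed version, or None if absent."""
    versions = {}
    for name in dist_names:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions

# Distribution names are assumptions; adjust them to your environment.
packages = ["torch", "transformer-lens", "numpy", "pandas", "matplotlib", "scikit-learn"]
for name, version in installed_versions(packages).items():
    print(f"{name}: {version if version is not None else 'NOT INSTALLED'}")
```

Recording these versions next to your results makes it easier to reproduce an experiment after Colab updates its base image.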


---

## Step 3: Test GPU with Transformer Model

**This will:**
1. Load GPT-2 small on GPU
2. Run inference
3. Time a forward pass on the GPU

In [4]:
import time

print("Loading GPT-2 small...")
model = tl.HookedTransformer.from_pretrained("gpt2-small")

# Move model to GPU if available
if torch.cuda.is_available():
    model = model.cuda()
    print(f"✅ Model loaded on GPU: {torch.cuda.get_device_name(0)}")
else:
    print("⚠️  Model on CPU (slower)")

print(f"\nModel specs: {model.cfg.n_layers} layers, {model.cfg.d_model} dimensions")

Loading GPT-2 small...


Error while fetching `HF_TOKEN` secret value from your vault: 'Requesting secret HF_TOKEN timed out. Secrets can only be fetched when running from the Colab UI.'.
You are not authenticated with the Hugging Face Hub in this notebook.
If the error persists, please let us know by opening an issue on GitHub (https://github.com/huggingface/huggingface_hub/issues/new).


`torch_dtype` is deprecated! Use `dtype` instead!

Loaded pretrained model gpt2-small into HookedTransformer
Moving model to device:  cuda
✅ Model loaded on GPU: Tesla T4

Model specs: 12 layers, 768 dimensions


In [5]:
# Speed test
test_prompt = "The capital of France is"

print("Running speed test...\n")

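# Warm-up run (the first call pays one-time CUDA initialization costs)
_ = model.run_with_cache(test_prompt)

# Timed run. CUDA kernels launch asynchronously, so synchronize before
# reading the clock; otherwise the timer can stop before the GPU finishes.
if torch.cuda.is_available():
    torch.cuda.synchronize()
start = time.perf_counter()
logits, cache = model.run_with_cache(test_prompt)
if torch.cuda.is_available():
    torch.cuda.synchronize()
end = time.perf_counter()

print(f"⏱️  Inference time: {(end - start) * 1000:.2f} ms")
print(f"\nTop predictions:")
probs = torch.softmax(logits[0, -1], dim=-1)  # compute once, reuse below
top = probs.topk(5)
for i in range(5):
    token_id = top.indices[i]
    token_str = model.tokenizer.decode(token_id)
    print(f"  {i+1}. '{token_str}' (prob: {top.values[i].item():.2%})")

print(f"\n✅ GPU inference working!")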

Running speed test...

⏱️  Inference time: 34.34 ms

Top predictions:
  1. ' now' (prob: 4.75%)
  2. ' the' (prob: 3.74%)
  3. ' a' (prob: 3.55%)
  4. ' home' (prob: 3.09%)
  5. ' in' (prob: 2.70%)

✅ GPU inference working!
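
A single timed call is noisy, and CUDA kernels run asynchronously, so a wall-clock timer can stop before the GPU has finished. The sketch below is one way to get steadier numbers: it takes any zero-argument callable plus an optional `sync` function (for GPU work, `torch.cuda.synchronize`), discards warm-up runs, and reports the median. The helper name `time_call` and its parameters are illustrative, not part of PyTorch or TransformerLens.

```python
import statistics
import time

def time_call(fn, sync=None, warmup=2, repeats=10):
    """Return the median wall-clock time of fn() in milliseconds.

    sync, if given, is called before starting and before stopping the clock
    so that asynchronous work (e.g. CUDA kernels) is included in the timing.
    """
    for _ in range(warmup):
        fn()
    timings_ms = []
    for _ in range(repeats):
        if sync is not None:
            sync()
        start = time.perf_counter()
        fn()
        if sync is not None:
            sync()
        timings_ms.append((time.perf_counter() - start) * 1000)
    return statistics.median(timings_ms)

# CPU-only stand-in workload so the sketch runs anywhere.
median_ms = time_call(lambda: sum(i * i for i in range(10_000)))
print(f"Median time: {median_ms:.3f} ms")

# In the notebook, the GPU version would look like:
# time_call(lambda: model.run_with_cache(test_prompt), sync=torch.cuda.synchronize)
```

The median is used rather than the mean because occasional slow runs (garbage collection, other tenants on the shared GPU) would otherwise skew the result.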


---

## Step 4: Test Activation Extraction on GPU

**This tests the core operation for probe research.**

In [6]:
def get_final_token_activation(model, sentence, layer=6):
    """
    Extract activation of final token at specified layer.
    Works on both CPU and GPU.
    """
    _, cache = model.run_with_cache(sentence)
    layer_acts = cache["resid_post", layer]
    final_act = layer_acts[0, -1, :].cpu().numpy()  # Move to CPU for numpy
    return final_act

# Test with multiple sentences
test_sentences = [
    "I love this movie!",
    "This is terrible.",
    "The weather is nice today.",
]

print("Extracting activations from layer 6...\n")

start = time.time()
activations = []
for sent in test_sentences:
    act = get_final_token_activation(model, sent, layer=6)
    activations.append(act)
    print(f"✅ '{sent[:30]}...' → activation shape: {act.shape}")

end = time.time()

activations = np.array(activations)
print(f"\n⏱️  Total time: {(end - start):.2f} seconds")
print(f"📊 Combined activations shape: {activations.shape}")
print(f"\n✅ Activation extraction working on GPU!")

Extracting activations from layer 6...

✅ 'I love this movie!...' → activation shape: (768,)
✅ 'This is terrible....' → activation shape: (768,)
✅ 'The weather is nice today....' → activation shape: (768,)

⏱️  Total time: 0.10 seconds
📊 Combined activations shape: (3, 768)

✅ Activation extraction working on GPU!
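
The loop above runs one forward pass per sentence. When sentences are batched instead, they are padded to a common length, so the final real token sits at a different position in each row. Below is a minimal NumPy sketch of that indexing step, assuming right-padded batches and an activation array of shape `(batch, seq_len, d_model)`. The random array stands in for real `resid_post` activations, and `select_final_token` is a hypothetical helper, not a TransformerLens function.

```python
import numpy as np

def select_final_token(acts, lengths):
    """Pick each row's last non-padding position from a right-padded batch.

    acts:    array of shape (batch, seq_len, d_model)
    lengths: number of real (non-padding) tokens in each row
    """
    acts = np.asarray(acts)
    lengths = np.asarray(lengths)
    if np.any(lengths < 1) or np.any(lengths > acts.shape[1]):
        raise ValueError("each length must be between 1 and seq_len")
    rows = np.arange(acts.shape[0])
    return acts[rows, lengths - 1, :]

rng = np.random.default_rng(0)
fake_acts = rng.normal(size=(3, 7, 768))   # stand-in for a batch of activations
token_counts = [5, 4, 7]                   # hypothetical per-sentence token counts
final = select_final_token(fake_acts, token_counts)
print(final.shape)  # (3, 768)
```

Check which side your tokenizer pads on before relying on this; with left padding the final token is simply index `-1` in every row.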


---

## Step 5: Quick Probe Test on GPU

**Let's verify the full pipeline works.**

In [7]:
# Small sentiment dataset
positive = ["I love this!", "Amazing work!", "Great job!", "Fantastic!", "Excellent!"]
negative = ["I hate this.", "Terrible work.", "Poor job.", "Awful.", "Disappointing."]

print("Extracting activations for probe test...")

# Extract activations
X_pos = np.array([get_final_token_activation(model, s, layer=6) for s in positive])
X_neg = np.array([get_final_token_activation(model, s, layer=6) for s in negative])

X = np.vstack([X_pos, X_neg])
y = np.array([1]*len(positive) + [0]*len(negative))

print(f"✅ Dataset: {X.shape[0]} examples, {X.shape[1]} features")

# Train probe
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)
probe = LogisticRegression(max_iter=1000)
probe.fit(X_train, y_train)

train_acc = probe.score(X_train, y_train)
test_acc = probe.score(X_test, y_test)

print(f"\n=== Probe Performance ===")
print(f"Train accuracy: {train_acc:.2%}")
print(f"Test accuracy:  {test_acc:.2%}")

print(f"\n🎉 Full pipeline working on GPU!")

Extracting activations for probe test...
✅ Dataset: 10 examples, 768 features

=== Probe Performance ===
Train accuracy: 100.00%
Test accuracy:  100.00%

🎉 Full pipeline working on GPU!
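
With 10 examples and a 30% hold-out, the test set contains only 3 sentences, so 100% test accuracy says little beyond "the pipeline runs". For a somewhat steadier estimate on small datasets, one option is stratified k-fold cross-validation. The sketch below uses synthetic 768-dimensional features in place of real activations (an assumption made so it runs without the model); in the notebook you would pass `X` and `y` from the cell above.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold, cross_val_score

# Synthetic stand-in for probe inputs: two classes separated along one direction.
rng = np.random.default_rng(42)
n_per_class, d_model = 20, 768
direction = rng.normal(size=d_model)
X_demo = np.vstack([
    rng.normal(size=(n_per_class, d_model)) + direction,   # "positive" class
    rng.normal(size=(n_per_class, d_model)) - direction,   # "negative" class
])
y_demo = np.array([1] * n_per_class + [0] * n_per_class)

# Stratification keeps both classes present in every fold.
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=42)
scores = cross_val_score(LogisticRegression(max_iter=1000), X_demo, y_demo, cv=cv)

print(f"Fold accuracies: {np.round(scores, 2)}")
print(f"Mean accuracy:   {scores.mean():.2%} (std {scores.std():.2%})")
```

Reporting the spread across folds alongside the mean gives a rough sense of how much a single train/test split could mislead.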


---

## GPU Memory Management Tips

**Important for longer experiments:**

In [8]:
# Check GPU memory usage
if torch.cuda.is_available():
    allocated = torch.cuda.memory_allocated(0) / 1e9
    reserved = torch.cuda.memory_reserved(0) / 1e9
    total = torch.cuda.get_device_properties(0).total_memory / 1e9
    
    print("=" * 50)
    print("GPU Memory Usage")
    print("=" * 50)
    print(f"Allocated: {allocated:.2f} GB")
    print(f"Reserved:  {reserved:.2f} GB")
    print(f"Total:     {total:.2f} GB")
    print(f"Free:      {total - reserved:.2f} GB")
    
    # Clear cache if needed
    # torch.cuda.empty_cache()
    # print("\n✅ CUDA cache cleared")

GPU Memory Usage
Allocated: 0.87 GB
Reserved:  1.02 GB
Total:     15.83 GB
Free:      14.80 GB
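
By default, `run_with_cache` keeps its cached activations on the same device as the model, so they occupy GPU memory until the cache object is released. A minimal sketch of a cleanup helper follows. The `try`/`except` import is only there so the sketch also runs on machines without PyTorch, and `release_gpu_memory` is a hypothetical name, not a PyTorch function.

```python
import gc

try:
    import torch
except ImportError:  # lets the sketch run where PyTorch is not installed
    torch = None

def release_gpu_memory():
    """Collect garbage, release cached CUDA blocks, and report reserved GB.

    Only memory that no live Python object references can be freed, so drop
    references first (for example `del cache, logits`) before calling this.
    """
    gc.collect()
    if torch is None or not torch.cuda.is_available():
        return 0.0
    torch.cuda.empty_cache()
    return torch.cuda.memory_reserved(0) / 1e9

reserved_gb = release_gpu_memory()
print(f"Reserved after cleanup: {reserved_gb:.2f} GB")
```

`empty_cache()` does not free tensors that are still referenced. It only releases blocks that PyTorch's caching allocator is holding in reserve, which is why "Reserved" can exceed "Allocated" in the output above.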


---

## Next Steps

✅ **Setup complete!** You can now:

1. **Upload your probe notebooks** to Colab and run them with GPU
2. **Process larger datasets** faster
3. **Test multiple layers** simultaneously
4. **Run Week 4 experiments** with better performance

**To save your work:**
- File → Download → Download .ipynb
- Or: File → Save a copy in Drive (recommended)

**Colab tips:**
- Sessions last up to ~12 hours and disconnect sooner when idle
- Save frequently (Ctrl+S)
- Download important results
- Free tier: roughly 15-20 GPU hours per week; the quota varies and is not published
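
Because Colab's local disk is wiped when a session ends, it helps to write extracted activations to a file as soon as they are computed. Here is a minimal sketch using `np.savez_compressed`. The array contents and file name are placeholders, and the temporary directory stands in for a persistent location such as a mounted Google Drive folder.

```python
import os
import tempfile

import numpy as np

# Placeholder data standing in for extracted activations and their labels.
demo_acts = np.random.default_rng(0).normal(size=(10, 768)).astype(np.float32)
demo_labels = np.array([1] * 5 + [0] * 5)

# In Colab, replace this temporary directory with a persistent path,
# e.g. a folder under /content/drive/MyDrive after mounting Google Drive.
save_dir = tempfile.mkdtemp()
save_path = os.path.join(save_dir, "layer6_activations.npz")

np.savez_compressed(save_path, X=demo_acts, y=demo_labels, layer=6)

# Reload and confirm the round trip preserved the data.
with np.load(save_path) as data:
    X_loaded, y_loaded, layer_loaded = data["X"], data["y"], int(data["layer"])

print(X_loaded.shape, y_loaded.shape, layer_loaded)
```

Storing the layer index alongside the arrays keeps each file self-describing when you later compare probes across layers.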

---

**Ready to start probe research with GPU acceleration!** 🚀