# Multiband Steering Scan (Configurable Prompt)

This notebook performs the complete "multiband" scan:
- **All 36 layers** (0-35)
- **Full α range** [-10, 10] at high resolution (500 values per layer, step ≈ 0.04)
- **Configurable prompt** - easily swap topics to test reproducibility

**Purpose:** Test whether the signal processing findings (exp(α²) variance growth, KALM boundaries, parabolic log-derivative) are universal properties of the model or prompt-specific artifacts.

**Method:** Keep EVERYTHING identical except the prompt text. Compare structure across scans.

**Expected runtime:** ~3-5 minutes on H200 (18,000 generations in 18 batches of 1,000)

## Configuration

In [1]:
# ============================================================================
# EXPERIMENT CONFIGURATION - Edit these to create different scans
# ============================================================================

# Prompt configuration
PROMPT = "Tell me about the sun. Please do not use Markdown."
SCAN_NAME = "sun"  # Short identifier for output files (no spaces)

# Output paths (automatically prefixed with scan name)
OUTPUT_CSV = f'../data/results/multiband_scan_{SCAN_NAME}.csv'
OUTPUT_JSON = f'../data/results/multiband_scan_{SCAN_NAME}_metadata.json'

# ============================================================================
# Model and steering configuration
# ============================================================================

MODEL_NAME = 'Qwen/Qwen3-4B-Instruct-2507'
DEVICE = 'cuda'  # H200 SXM
VECTOR_PATH = '../data/vectors/complexity_wikipedia.pt'
ALL_LAYERS = list(range(36))  # Layers 0-35

# ============================================================================
# Experiment parameters
# ============================================================================

ALPHA_MIN = -10.0
ALPHA_MAX = 10.0
N_ALPHA_SAMPLES = 500  # α values per layer (resolution 0.04)
BATCH_SIZE = 1000  # Parallel generations
MAX_NEW_TOKENS = 250

# Generation parameters
TEMPERATURE = 0.0  # Deterministic
DO_SAMPLE = False
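Since only `PROMPT` and `SCAN_NAME` change between scans, it may help to validate the scan name before it is baked into the output paths. The following is a minimal sketch, not part of the notebook: `scan_paths` and its `results_dir` parameter are hypothetical names, and the allowed character set is an assumption.

```python
import re

def scan_paths(scan_name, results_dir="../data/results"):
    """Return (csv_path, json_path) for a scan, rejecting names unsafe for filenames."""
    # Hypothetical rule: allow only letters, digits, underscores, and hyphens.
    if not re.fullmatch(r"[A-Za-z0-9_-]+", scan_name):
        raise ValueError(f"invalid scan name: {scan_name!r}")
    stem = f"{results_dir}/multiband_scan_{scan_name}"
    return f"{stem}.csv", f"{stem}_metadata.json"

csv_path, json_path = scan_paths("sun")
print(csv_path)
print(json_path)
```

A name containing a space, such as `"my scan"`, raises `ValueError` instead of silently producing an awkward filename.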

## Setup

In [2]:
import torch
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from transformers import AutoModelForCausalLM, AutoTokenizer
from tqdm.auto import tqdm
import textstat
import json
import os
from datetime import datetime

device = torch.device(DEVICE)
print(f"✓ Using device: {device}")
if torch.cuda.is_available():
    print(f"  GPU: {torch.cuda.get_device_name(0)}")
    print(f"  VRAM: {torch.cuda.get_device_properties(0).total_memory / 1e9:.1f} GB")

print(f"\nScan configuration:")
print(f"  Name: {SCAN_NAME}")
print(f"  Prompt: {PROMPT}")

✓ Using device: cuda
  GPU: NVIDIA H100 80GB HBM3
  VRAM: 84.9 GB

Scan configuration:
  Name: sun
  Prompt: Tell me about the sun. Please do not use Markdown.


## Load Model and Tokenizer

In [3]:
print(f"Loading model: {MODEL_NAME}...")

tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForCausalLM.from_pretrained(
    MODEL_NAME,
    torch_dtype=torch.bfloat16,
    device_map=device,
)
model.eval()

print(f"✓ Model loaded")
print(f"  Layers: {model.config.num_hidden_layers}")
print(f"  Hidden dim: {model.config.hidden_size}")

Loading model: Qwen/Qwen3-4B-Instruct-2507...


`torch_dtype` is deprecated! Use `dtype` instead!


✓ Model loaded
  Layers: 36
  Hidden dim: 2560


## Load All Steering Vectors

In [4]:
print(f"Loading steering vectors from {VECTOR_PATH}...")
vector_data = torch.load(VECTOR_PATH, weights_only=False)

# Load all 36 layer vectors
steering_vectors = vector_data['vectors'].to(device).to(torch.bfloat16)  # [36, hidden_dim]

print(f"✓ Loaded steering vectors for all layers")
print(f"  Shape: {steering_vectors.shape}")
print(f"  Euclidean norms by layer:")
for layer_idx in [0, 10, 20, 30, 34, 35]:
    print(f"    Layer {layer_idx:2d}: {vector_data['layer_norms'][layer_idx]:.2f}")

Loading steering vectors from ../data/vectors/complexity_wikipedia.pt...
✓ Loaded steering vectors for all layers
  Shape: torch.Size([36, 2560])
  Euclidean norms by layer:
    Layer  0: 0.42
    Layer 10: 10.77
    Layer 20: 14.28
    Layer 30: 38.86
    Layer 34: 62.46
    Layer 35: 24.39


## Prepare Experiment Grid

In [5]:
# Generate all α values (same for each layer)
alphas = np.linspace(ALPHA_MIN, ALPHA_MAX, N_ALPHA_SAMPLES)

# Create experiment grid: (layer, alpha) pairs
experiment_grid = []
for layer in ALL_LAYERS:
    for alpha in alphas:
        experiment_grid.append((layer, alpha))

total_experiments = len(experiment_grid)
n_batches = int(np.ceil(total_experiments / BATCH_SIZE))

print(f"Experiment grid: {len(ALL_LAYERS)} layers × {N_ALPHA_SAMPLES} α values = {total_experiments} total generations")
print(f"  α range: [{ALPHA_MIN}, {ALPHA_MAX}] with step size {(ALPHA_MAX - ALPHA_MIN) / (N_ALPHA_SAMPLES - 1):.4f}")  # linspace includes both endpoints
print(f"  Batching: {n_batches} batches of {BATCH_SIZE} samples each")
print(f"  Estimated runtime: ~{n_batches * 3:.0f} seconds ({n_batches * 3 / 60:.1f} minutes) at 3s/batch")

Experiment grid: 36 layers × 500 α values = 18000 total generations
  α range: [-10.0, 10.0] with step size 0.0401
  Batching: 18 batches of 1000 samples each
  Estimated runtime: ~54 seconds (0.9 minutes) at 3s/batch
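The grid arithmetic can be checked by hand. Because `np.linspace` includes both endpoints, 500 samples span 499 intervals, and because the grid is built layer-major, each batch of 1000 covers exactly two layers. The sketch below reproduces those numbers with the standard library only; `layers_per_batch` and `closest` are illustrative names, not variables from the notebook.

```python
import math

ALPHA_MIN, ALPHA_MAX = -10.0, 10.0
N_ALPHA_SAMPLES = 500
N_LAYERS = 36
BATCH_SIZE = 1000

# Both endpoints are included, so there are N - 1 intervals between samples.
step = (ALPHA_MAX - ALPHA_MIN) / (N_ALPHA_SAMPLES - 1)
total = N_LAYERS * N_ALPHA_SAMPLES
n_batches = math.ceil(total / BATCH_SIZE)
layers_per_batch = BATCH_SIZE // N_ALPHA_SAMPLES  # grid is ordered layer by layer

# With an even number of samples symmetric about zero, no sample sits at α = 0.
closest = min((ALPHA_MIN + i * step for i in range(N_ALPHA_SAMPLES)), key=abs)

print(f"step = {step:.6f}")
print(f"total = {total}, batches = {n_batches}, layers per batch = {layers_per_batch}")
print(f"|α| closest to zero = {abs(closest):.4f}")
```

One consequence is that the "unsteered" reference for each layer is really α ≈ ±0.02 rather than exactly 0; an odd sample count would put a sample at α = 0.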


## Batched Generation Function

In [6]:
def generate_batch_with_layer_steering(prompt, layer_alpha_pairs):
    """
    Generate multiple completions with different (layer, α) pairs in a single batch.
    
    Args:
        prompt: Input text (will be formatted as user message)
        layer_alpha_pairs: List of (layer_idx, alpha) tuples
    
    Returns:
        List of completions (one per pair)
    """
    batch_size = len(layer_alpha_pairs)
    
    # Format prompt using chat template
    messages = [{"role": "user", "content": prompt}]
    formatted_prompt = tokenizer.apply_chat_template(
        messages,
        tokenize=False,
        add_generation_prompt=True
    )
    
    # Tokenize and replicate for batch
    inputs = tokenizer([formatted_prompt] * batch_size, return_tensors='pt', padding=True).to(device)
    
    # Organize by layer (hook registration is per-layer)
    layer_to_samples = {}
    for batch_idx, (layer, alpha) in enumerate(layer_alpha_pairs):
        if layer not in layer_to_samples:
            layer_to_samples[layer] = []
        layer_to_samples[layer].append((batch_idx, alpha))
    
    # Register hooks for all active layers
    hook_handles = []
    
    for layer, samples in layer_to_samples.items():
        # Precompute which batch indices get steered at this layer
        batch_indices = torch.tensor([s[0] for s in samples], device=device, dtype=torch.long)
        batch_alphas = torch.tensor([s[1] for s in samples], device=device, dtype=torch.bfloat16)
        steering_vector = steering_vectors[layer]  # [hidden_dim]
        
        def make_steering_hook(batch_indices, batch_alphas, steering_vector):
            def steering_hook(module, input, output):
                # Extract hidden states
                if isinstance(output, tuple):
                    hidden_states = output[0]
                else:
                    hidden_states = output
                
                # Apply per-sample steering ONLY to samples targeting this layer
                for i, alpha in zip(batch_indices, batch_alphas):
                    hidden_states[i] = hidden_states[i] + alpha * steering_vector
                
                if isinstance(output, tuple):
                    return (hidden_states,) + output[1:]
                else:
                    return hidden_states
            return steering_hook
        
        # Register hook for this layer
        target_layer = model.model.layers[layer]
        hook = make_steering_hook(batch_indices, batch_alphas, steering_vector)
        hook_handle = target_layer.register_forward_hook(hook)
        hook_handles.append(hook_handle)
    
    try:
        # Generate
        with torch.no_grad():
            outputs = model.generate(
                **inputs,
                max_new_tokens=MAX_NEW_TOKENS,
                temperature=TEMPERATURE if TEMPERATURE > 0 else None,
                do_sample=DO_SAMPLE,
                pad_token_id=tokenizer.pad_token_id or tokenizer.eos_token_id,
            )
        
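        # Decode only the newly generated tokens. Slicing off the prompt is more
        # robust than splitting on "assistant\n", which truncates any completion
        # that happens to contain that string.
        prompt_len = inputs['input_ids'].shape[1]
        completions = []
        for output in outputs:
            completion = tokenizer.decode(output[prompt_len:], skip_special_tokens=True).strip()
            completions.append(completion)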
        
        return completions
    
    finally:
        # Always remove all hooks
        for handle in hook_handles:
            handle.remove()

# Test with small batch
print("Testing batched generation with (layer, α) = [(0, 0.0), (34, 1.0)]...")
test_completions = generate_batch_with_layer_steering(PROMPT, [(0, 0.0), (34, 1.0)])
print(f"✓ Generated {len(test_completions)} completions")
print(f"  Sample 1 (L0, α=0): {len(test_completions[0].split())} words")
print(f"  Sample 2 (L34, α=1): {len(test_completions[1].split())} words")

The following generation flags are not valid and may be ignored: ['top_p', 'top_k']. Set `TRANSFORMERS_VERBOSITY=info` for more details.


Testing batched generation with (layer, α) = [(0, 0.0), (34, 1.0)]...
✓ Generated 2 completions
  Sample 1 (L0, α=0): 197 words
  Sample 2 (L34, α=1): 203 words
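The per-sample loop inside the hook adds `alpha * steering_vector` to every sequence position of each selected batch row. The same update can be written as one indexed, broadcast addition. The sketch below checks the equivalence in NumPy on toy shapes. The sizes and random values are assumptions for illustration, and the real hook operates on torch tensors, where the analogous indexing should behave the same way as long as the batch indices are unique.

```python
import numpy as np

rng = np.random.default_rng(0)
batch, seq_len, hidden = 4, 3, 8           # toy sizes, not the model's
hidden_states = rng.normal(size=(batch, seq_len, hidden))
steering_vector = rng.normal(size=hidden)

batch_indices = np.array([1, 3])           # rows steered at this layer
batch_alphas = np.array([-2.0, 0.5])       # one α per steered row

# Loop form, mirroring the hook body.
looped = hidden_states.copy()
for i, alpha in zip(batch_indices, batch_alphas):
    looped[i] = looped[i] + alpha * steering_vector

# Vectorized form: alphas broadcast over sequence and hidden dimensions.
vectorized = hidden_states.copy()
vectorized[batch_indices] += batch_alphas[:, None, None] * steering_vector

print(np.allclose(looped, vectorized))
print(np.array_equal(vectorized[0], hidden_states[0]))  # unsteered row untouched
```

With only two layers active per batch the loop is unlikely to dominate runtime, so this is mainly a readability choice.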


## Run Full Multiband Scan 🚀

In [7]:
print(f"Running multiband scan '{SCAN_NAME}': {total_experiments} generations across {n_batches} batches\n")
print(f"Prompt: {PROMPT}\n")

results = []

# Split experiment grid into batches
for batch_start in tqdm(range(0, total_experiments, BATCH_SIZE), desc="Processing batches"):
    batch_end = min(batch_start + BATCH_SIZE, total_experiments)
    batch_pairs = experiment_grid[batch_start:batch_end]
    
    # Generate entire batch
    completions = generate_batch_with_layer_steering(PROMPT, batch_pairs)
    
    # Compute metrics for each completion
    for (layer, alpha), completion in zip(batch_pairs, completions):
        # Compute grade level and reading ease
        try:
            grade_level = textstat.flesch_kincaid_grade(completion)
            reading_ease = textstat.flesch_reading_ease(completion)
        except Exception:  # avoid a bare except, which would also swallow KeyboardInterrupt
            grade_level = np.nan
            reading_ease = np.nan
        
        # Count words and sentences
        words = completion.split()
        n_words = len(words)
        n_sentences = completion.count('.') + completion.count('!') + completion.count('?')
        
        # Compute diagnostics
        avg_word_length = np.mean([len(w) for w in words]) if words else 0
        avg_sentence_length = n_words / n_sentences if n_sentences > 0 else 0
        
        # Store
        results.append({
            'layer': layer,
            'alpha': float(alpha),
            'grade_level': grade_level,
            'reading_ease': reading_ease,
            'n_words': n_words,
            'n_sentences': n_sentences,
            'avg_word_length': avg_word_length,
            'avg_sentence_length': avg_sentence_length,
            'completion': completion,
        })

# Convert to DataFrame
df = pd.DataFrame(results)

print(f"\n✓ Completed {len(df)} generations across {len(ALL_LAYERS)} layers")
print(f"\nGrade level range: {df['grade_level'].min():.1f} to {df['grade_level'].max():.1f}")
print(f"Reading ease range: {df['reading_ease'].min():.1f} to {df['reading_ease'].max():.1f}")
print(f"Mean words per generation: {df['n_words'].mean():.1f}")

Running multiband scan 'sun': 18000 generations across 18 batches

Prompt: Tell me about the sun. Please do not use Markdown.




✓ Completed 18000 generations across 36 layers

Grade level range: -3.4 to 383.4
Reading ease range: -2632.4 to 121.2
Mean words per generation: 183.0
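The extreme grade levels are easier to interpret with the Flesch-Kincaid formula in view: grade = 0.39 × (words / sentences) + 11.8 × (syllables / words) − 15.59. Text with no sentence terminators counts as one very long sentence, so the first term grows with the word count. The sketch below illustrates this with a deliberately crude syllable counter. `count_syllables` and `fk_grade` are hypothetical helpers, and `textstat` uses its own tokenization and syllable counting, so its values will differ.

```python
import re

def count_syllables(word):
    """Crude estimate: number of vowel groups, at least 1 (not textstat's method)."""
    return max(1, len(re.findall(r"[aeiouy]+", word.lower())))

def fk_grade(text):
    """Flesch-Kincaid grade: 0.39*(words/sentences) + 11.8*(syllables/words) - 15.59."""
    words = re.findall(r"[A-Za-z']+", text)
    if not words:
        return float("nan")
    sentences = max(1, len(re.findall(r"[.!?]+", text)))
    syllables = sum(count_syllables(w) for w in words)
    return 0.39 * len(words) / sentences + 11.8 * syllables / len(words) - 15.59

normal = "The Sun is a star. It is very hot. It gives us light."
runaway = " ".join(["the sun is a star and"] * 40)  # 240 words, no terminator

print(f"normal:  {fk_grade(normal):.1f}")
print(f"runaway: {fk_grade(runaway):.1f}")
```

Under this sketch, simple words alone push the unpunctuated text far above any realistic grade level, which is consistent with treating very high grade levels as a sign of degenerate output rather than of sophisticated prose.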


## Quick Diagnostics

In [8]:
# Define "coherent" as: grade level < 50 and n_sentences > 0
df['coherent'] = (df['grade_level'] < 50) & (df['n_sentences'] > 0)

# Quick layer summary
layer_stats = []
for layer in ALL_LAYERS:
    layer_df = df[df['layer'] == layer]
    coherent_samples = layer_df[layer_df['coherent']]
    
    if len(coherent_samples) > 0:
        alpha_min = coherent_samples['alpha'].min()
        alpha_max = coherent_samples['alpha'].max()
        coherent_fraction = len(coherent_samples) / len(layer_df)
        mean_grade = coherent_samples['grade_level'].mean()
    else:
        alpha_min = np.nan
        alpha_max = np.nan
        coherent_fraction = 0.0
        mean_grade = np.nan
    
    layer_stats.append({
        'layer': layer,
        'alpha_min_coherent': alpha_min,
        'alpha_max_coherent': alpha_max,
        'coherent_fraction': coherent_fraction,
        'mean_grade_level': mean_grade,
    })

layer_stats_df = pd.DataFrame(layer_stats)

print("\nTop 5 layers by coherent fraction:")
print(layer_stats_df.nlargest(5, 'coherent_fraction')[['layer', 'coherent_fraction', 'alpha_min_coherent', 'alpha_max_coherent']].to_string(index=False))

print("\nBottom 5 layers by coherent fraction:")
print(layer_stats_df.nsmallest(5, 'coherent_fraction')[['layer', 'coherent_fraction', 'alpha_min_coherent', 'alpha_max_coherent']].to_string(index=False))


Top 5 layers by coherent fraction:
 layer  coherent_fraction  alpha_min_coherent  alpha_max_coherent
     0                1.0               -10.0                10.0
     2                1.0               -10.0                10.0
     3                1.0               -10.0                10.0
     4                1.0               -10.0                10.0
     5                1.0               -10.0                10.0

Bottom 5 layers by coherent fraction:
 layer  coherent_fraction  alpha_min_coherent  alpha_max_coherent
    34              0.682          -10.000000            5.951904
    33              0.696          -10.000000            6.833667
    25              0.702           -6.633267            8.316633
    30              0.702           -6.593186            8.116232
    28              0.704           -6.793587            7.875752


## Sample Output (α=0)

In [9]:
# Show output at α≈0 for a few representative layers
sample_layers = [0, 15, 34, 35]

for layer in sample_layers:
    layer_df = df[df['layer'] == layer]
    
    # Find α≈0 sample
    idx = (layer_df['alpha'] - 0.0).abs().idxmin()
    row = layer_df.loc[idx]
    
    print(f"\n{'='*80}")
    print(f"Layer {int(row['layer'])} | α = {row['alpha']:.3f} | Grade Level = {row['grade_level']:.1f}")
    print(f"Coherent = {row['coherent']} | Words = {row['n_words']}")
    print(f"{'='*80}")
    print(row['completion'][:300])
    if len(row['completion']) > 300:
        print("...")


Layer 0 | α = -0.020 | Grade Level = 10.4
Coherent = True | Words = 178
The Sun is a star located at the center of our solar system. It is a massive ball of hot, glowing gas, primarily composed of hydrogen and helium. The Sun's immense gravity holds the planets, moons, asteroids, and comets in orbit around it. It generates energy through a process called nuclear fusion,
...

Layer 15 | α = -0.020 | Grade Level = 10.0
Coherent = True | Words = 180
The Sun is a star located at the center of our solar system. It is a massive ball of hot, glowing gas, primarily composed of hydrogen and helium. The Sun's immense gravity holds together the planets, moons, asteroids, and comets that orbit around it. It generates energy through a process called nucl
...

Layer 34 | α = -0.020 | Grade Level = 9.7
Coherent = True | Words = 198
The Sun is a star located at the center of our solar system. It is a massive ball of hot, glowing gas, primarily composed of hydrogen and helium. The Sun's immense gravit

## Save Results

In [11]:
# Save DataFrame
df.to_csv(OUTPUT_CSV, index=False)
print(f"✓ Saved results to {OUTPUT_CSV}")

# Save metadata
metadata = {
    'scan_name': SCAN_NAME,
    'prompt': PROMPT,
    'model': MODEL_NAME,
    'vector_path': VECTOR_PATH,
    'layers': ALL_LAYERS,
    'alpha_range': [float(ALPHA_MIN), float(ALPHA_MAX)],
    'n_alpha_samples': N_ALPHA_SAMPLES,
    'total_generations': total_experiments,
    'batch_size': BATCH_SIZE,
    'max_new_tokens': MAX_NEW_TOKENS,
    'temperature': TEMPERATURE,
    'do_sample': DO_SAMPLE,
    'layer_statistics': layer_stats_df.to_dict(orient='records'),
    'timestamp': datetime.now().isoformat(),
}

with open(OUTPUT_JSON, 'w') as f:
    json.dump(metadata, f, indent=2)
print(f"✓ Saved metadata to {OUTPUT_JSON}")

file_size_mb = sum([
    os.path.getsize(OUTPUT_CSV),
    os.path.getsize(OUTPUT_JSON)
]) / (1024 * 1024)
print(f"\nTotal output size: {file_size_mb:.2f} MB")

✓ Saved results to ../data/results/multiband_scan_sun.csv
✓ Saved metadata to ../data/results/multiband_scan_sun_metadata.json

Total output size: 19.62 MB
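Because the method depends on every setting except the prompt staying fixed, the saved metadata can be used to confirm that two scans are actually comparable. The sketch below compares two metadata dictionaries and reports any configuration keys that differ. `config_differences` is a hypothetical helper, and the two dictionaries are abbreviated, made-up examples rather than real scan output.

```python
def config_differences(meta_a, meta_b,
                       ignore=("scan_name", "prompt", "timestamp", "layer_statistics")):
    """Return {key: (a_value, b_value)} for config keys that differ between two scans."""
    keys = (set(meta_a) | set(meta_b)) - set(ignore)
    return {
        k: (meta_a.get(k), meta_b.get(k))
        for k in sorted(keys)
        if meta_a.get(k) != meta_b.get(k)
    }

# Illustrative values only; in practice load each file with json.load().
scan_a = {"scan_name": "sun", "prompt": "prompt A", "batch_size": 1000, "max_new_tokens": 250}
scan_b = {"scan_name": "other", "prompt": "prompt B", "batch_size": 1000, "max_new_tokens": 200}

print(config_differences(scan_a, scan_b))
```

An empty result means the scans differ only in the ignored fields, which is the condition the comparison across prompts relies on.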


## Summary

Multiband scan complete! 🚀

**What we captured:**
- 36 layers × 500 α values = 18,000 generations
- Flesch-Kincaid grade level + reading ease
- Word/sentence counts and diagnostics
- Full completion text for qualitative analysis

**Next steps:**
1. Download data to laptop
2. Run signal processing analysis (consensus signal, variance, derivative)
3. Compare structure across different prompts (QM vs sun vs others)
4. Test universality of exp(α²) signature
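
For step 4, one simple check is that if a variance grows like exp(k·α²), then its logarithm is linear in α², so a straight-line fit of log-variance against α² should recover k with a high R². The sketch below shows that fit on synthetic data only. `fit_log_variance` is a hypothetical helper, and how the variance is computed from the scan (for example, across layers at each α) is left as an assumption.

```python
import numpy as np

def fit_log_variance(alphas, variances):
    """Least-squares fit of log(variance) = intercept + k * alpha**2.

    Returns (k, intercept, r2). Variances must be strictly positive.
    """
    x = np.asarray(alphas, dtype=float) ** 2
    y = np.log(np.asarray(variances, dtype=float))
    k, intercept = np.polyfit(x, y, 1)
    residuals = y - (intercept + k * x)
    r2 = 1.0 - residuals.var() / y.var()
    return k, intercept, r2

# Synthetic check: variance constructed as 0.5 * exp(0.1 * alpha^2), not scan data.
alphas = np.linspace(-10, 10, 500)
synthetic_var = 0.5 * np.exp(0.1 * alphas**2)

k, intercept, r2 = fit_log_variance(alphas, synthetic_var)
print(f"k = {k:.3f}, intercept = {intercept:.3f}, R^2 = {r2:.3f}")
```

Comparing the fitted k and R² across prompts would give a quantitative handle on whether the signature is shared or prompt-specific.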