# Coding Boost: Using Wisent CLI to Steer Models for Better Code

This notebook demonstrates how to use Wisent CLI commands to create steering vectors that improve model performance on coding tasks using LiveCodeBench.

**Approach:** We extract contrastive pairs (passing vs failing code solutions) and use them to create steering vectors that push the model toward generating correct code.

## CLI Commands Used:
- `generate-pairs-from-task`: Extract correct vs incorrect code pairs from LiveCodeBench
- `generate-vector-from-task`: Full pipeline to create coding steering vectors
- `multi-steer`: Apply steering vectors during generation
- `generate-responses`: Generate code solutions with steering
- `evaluate-responses`: Evaluate code solutions with Docker execution

## 1. Setup and Configuration

In [None]:
import os
import json

# Configuration
MODEL = "meta-llama/Llama-3.2-1B-Instruct"
TASK = "livecodebench"  # LiveCodeBench coding task
OUTPUT_DIR = "./coding_boost_outputs"
LAYER = 8  # Layer for steering (a middle layer of this 16-layer model; see Tuning Recommendations)

# Create output directories
os.makedirs(OUTPUT_DIR, exist_ok=True)
os.makedirs(f"{OUTPUT_DIR}/pairs", exist_ok=True)
os.makedirs(f"{OUTPUT_DIR}/vectors", exist_ok=True)
os.makedirs(f"{OUTPUT_DIR}/responses", exist_ok=True)

print(f"Model: {MODEL}")
print(f"Task: {TASK}")
print(f"Steering Layer: {LAYER}")
print(f"Output directory: {OUTPUT_DIR}")

## 2. Extract Contrastive Pairs from LiveCodeBench

LiveCodeBench provides pre-computed code solutions with pass/fail labels. We extract:
- **Positive (correct)**: Code solutions that pass all test cases
- **Negative (incorrect)**: Code solutions that fail test cases

In [None]:
# Extract contrastive pairs from LiveCodeBench
!python -m wisent.core.main generate-pairs-from-task \
    livecodebench \
    --output {OUTPUT_DIR}/pairs/livecodebench_pairs.json \
    --limit 100 \
    --verbose

In [None]:
# Examine the extracted pairs
with open(f"{OUTPUT_DIR}/pairs/livecodebench_pairs.json", 'r') as f:
    pairs_data = json.load(f)

print(f"Extracted {pairs_data['num_pairs']} contrastive pairs from {pairs_data['task_name']}")
print("\n" + "="*80)

# Show a few examples
for i, pair in enumerate(pairs_data['pairs'][:2]):
    print(f"\nExample {i+1}:")
    print(f"Problem (truncated): {pair['prompt'][:200]}...")
    print(f"\nCorrect Code (truncated):")
    print(pair['positive_response']['model_response'][:300])
    print(f"\nIncorrect Code (truncated):")
    print(pair['negative_response']['model_response'][:300])
    print("="*80)
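Before collecting activations, it can help to sanity-check the extracted pairs. The sketch below assumes the record layout used in the cell above (`prompt`, `positive_response.model_response`, `negative_response.model_response`). The helper name `filter_valid_pairs` is hypothetical and not part of Wisent.

```python
def filter_valid_pairs(pairs):
    """Keep pairs whose two responses are non-empty and actually differ."""
    valid = []
    for pair in pairs:
        pos = (pair.get("positive_response") or {}).get("model_response") or ""
        neg = (pair.get("negative_response") or {}).get("model_response") or ""
        if pos.strip() and neg.strip() and pos.strip() != neg.strip():
            valid.append(pair)
    return valid


# Toy records in the assumed layout (not real LiveCodeBench data)
example_pairs = [
    {"prompt": "Add two numbers.",
     "positive_response": {"model_response": "def add(a, b):\n    return a + b"},
     "negative_response": {"model_response": "def add(a, b):\n    return a - b"}},
    {"prompt": "Degenerate pair.",
     "positive_response": {"model_response": "pass"},
     "negative_response": {"model_response": "pass"}},
]
kept = filter_valid_pairs(example_pairs)
print(f"Kept {len(kept)} of {len(example_pairs)} pairs")
```

A pair whose positive and negative responses are identical contributes a zero difference, so dropping it only removes noise from the later averaging step.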

## 3. Generate Steering Vector (Full Pipeline)

The `generate-vector-from-task` command runs the complete pipeline:
1. Extract contrastive pairs
2. Collect activations from the model
3. Create steering vectors using CAA (Contrastive Activation Addition)
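Steps 2 and 3 can be illustrated with a small, dependency-free sketch. It is not Wisent's implementation: it assumes `--token-aggregation average` means averaging activations over token positions, and it uses the usual CAA definition (mean positive activation minus mean negative activation), followed by optional L2 normalization. All function names are hypothetical.

```python
import math


def mean_vector(vectors):
    """Element-wise mean of a list of equal-length vectors."""
    count = len(vectors)
    return [sum(column) / count for column in zip(*vectors)]


def caa_vector(pos_acts, neg_acts, normalize=True):
    """Difference of mean activations, optionally scaled to unit L2 norm.

    pos_acts / neg_acts hold one entry per pair; each entry is a
    (tokens x hidden_dim) list of activations from a single layer.
    """
    # Token aggregation: average over the token positions of each response
    pos = [mean_vector(tokens) for tokens in pos_acts]
    neg = [mean_vector(tokens) for tokens in neg_acts]
    # CAA: mean positive activation minus mean negative activation
    diff = [p - q for p, q in zip(mean_vector(pos), mean_vector(neg))]
    if normalize:
        norm = math.sqrt(sum(x * x for x in diff))
        if norm > 0:
            diff = [x / norm for x in diff]
    return diff


# Toy activations: 2 pairs, 2 tokens each, hidden size 2
pos_acts = [[[1.0, 0.0], [3.0, 0.0]], [[2.0, 0.0], [2.0, 0.0]]]
neg_acts = [[[0.0, 1.0], [0.0, 3.0]], [[0.0, 2.0], [0.0, 2.0]]]
raw = caa_vector(pos_acts, neg_acts, normalize=False)
vec = caa_vector(pos_acts, neg_acts)
print(raw)  # [2.0, -2.0]
print(vec)
```

In a real run the activations would come from the model's hidden states at the chosen layer, with one vector of the model's hidden size per token.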

In [None]:
# Generate steering vector from LiveCodeBench
!python -m wisent.core.main generate-vector-from-task \
    --task livecodebench \
    --trait-label coding_ability \
    --model {MODEL} \
    --num-pairs 50 \
    --layers {LAYER} \
    --token-aggregation average \
    --output {OUTPUT_DIR}/vectors/coding_vector.pt \
    --keep-intermediate \
    --intermediate-dir {OUTPUT_DIR}/vectors \
    --normalize \
    --verbose \
    --timing

In [None]:
# Examine the generated steering vector
import torch

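# map_location="cpu" lets the file load on machines without the GPU it was saved from
vector_data = torch.load(f"{OUTPUT_DIR}/vectors/coding_vector.pt", map_location="cpu")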

print("Steering Vector Info:")
print(f"  Model: {vector_data.get('model', 'N/A')}")
print(f"  Trait: {vector_data.get('trait_label', 'N/A')}")
print(f"  Method: {vector_data.get('method', 'N/A')}")
print(f"  Layer: {vector_data.get('layer', 'N/A')}")

if 'steering_vector' in vector_data:
    sv = vector_data['steering_vector']
    print(f"  Vector shape: {sv.shape}")
    print(f"  Vector norm: {torch.norm(sv).item():.4f}")

## 4. Generate Multiple Steering Vectors for Different Layers

Steering is not limited to a single layer: passing a comma-separated list to `--layers` creates vectors for several layers, which can then be combined.

In [None]:
# Generate steering vectors for layers 6, 8, 10 (middle layers)
LAYERS = "6,8,10"

!python -m wisent.core.main generate-vector-from-task \
    --task livecodebench \
    --trait-label coding_ability \
    --model {MODEL} \
    --num-pairs 50 \
    --layers {LAYERS} \
    --token-aggregation average \
    --output {OUTPUT_DIR}/vectors/coding_multi_layer.pt \
    --normalize \
    --verbose

## 5. Apply Steering with Multi-Steer CLI

Use the `multi-steer` command to apply the coding steering vector during generation.
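Conceptually, this kind of activation steering adds a scaled copy of the vector to the hidden state at the chosen layer. The sketch below shows only that arithmetic on a toy example. The function name is hypothetical, and which token positions Wisent modifies is not covered here.

```python
def apply_steering(hidden_state, steering_vector, strength):
    """Return hidden_state + strength * steering_vector (element-wise)."""
    if len(hidden_state) != len(steering_vector):
        raise ValueError("hidden state and steering vector must have the same size")
    return [h + strength * v for h, v in zip(hidden_state, steering_vector)]


hidden = [0.5, -1.0, 2.0]   # toy activation at one token position
vector = [1.0, 0.0, -1.0]   # toy steering vector
steered = apply_steering(hidden, vector, strength=1.5)
print(steered)  # [2.0, -1.0, 0.5]
```

The `1.5` after the colon in `--vector PATH:1.5` plays the role of `strength` in this picture.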

In [None]:
# Test prompt for code generation
TEST_PROMPT = "Write a Python function to check if a number is prime. Return True if prime, False otherwise."

# Apply steering using multi-steer CLI
!python -m wisent.core.main multi-steer \
    --vector {OUTPUT_DIR}/vectors/coding_vector.pt:1.5 \
    --model {MODEL} \
    --layer {LAYER} \
    --prompt "{TEST_PROMPT}" \
    --max-new-tokens 300 \
    --verbose

In [None]:
# Compare with different steering strengths
STRENGTHS = [0.5, 1.0, 1.5, 2.0]

print("Comparing different steering strengths:")
print("="*80)

for strength in STRENGTHS:
    print(f"\n--- Strength: {strength} ---")
    !python -m wisent.core.main multi-steer \
        --vector {OUTPUT_DIR}/vectors/coding_vector.pt:{strength} \
        --model {MODEL} \
        --layer {LAYER} \
        --prompt "{TEST_PROMPT}" \
        --max-new-tokens 200 2>/dev/null | tail -20

## 6. Generate Responses on LiveCodeBench Problems

Generate solutions for LiveCodeBench problems to evaluate steering effectiveness.

In [None]:
# Generate baseline responses (without steering)
!python -m wisent.core.main generate-responses \
    {MODEL} \
    --task livecodebench \
    --output {OUTPUT_DIR}/responses/baseline_responses.json \
    --num-questions 10 \
    --max-new-tokens 512 \
    --temperature 0.2 \
    --verbose

In [None]:
# Generate steered responses
!python -m wisent.core.main generate-responses \
    {MODEL} \
    --task livecodebench \
    --output {OUTPUT_DIR}/responses/steered_responses.json \
    --num-questions 10 \
    --max-new-tokens 512 \
    --temperature 0.2 \
    --use-steering \
    --verbose

In [None]:
# Compare baseline vs steered responses
with open(f"{OUTPUT_DIR}/responses/baseline_responses.json", 'r') as f:
    baseline = json.load(f)

with open(f"{OUTPUT_DIR}/responses/steered_responses.json", 'r') as f:
    steered = json.load(f)

print("Response Comparison:")
print("="*80)

for i in range(min(3, len(baseline['responses']), len(steered['responses']))):
    base_resp = baseline['responses'][i]
    steer_resp = steered['responses'][i]
    
    print(f"\nProblem {i+1}:")
    print(f"Prompt: {base_resp['prompt'][:100]}...")
    print(f"\n--- Baseline Response ---")
    print(base_resp.get('generated_response', 'N/A')[:400])
    print(f"\n--- Steered Response ---")
    print(steer_resp.get('generated_response', 'N/A')[:400])
    print("="*80)
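Reading two 400-character excerpts side by side makes small changes easy to miss. As an optional aid, a line-level diff built with the standard library's `difflib` shows exactly where the steered response departs from the baseline. The helper name `response_diff` is hypothetical.

```python
import difflib


def response_diff(baseline_text, steered_text, context=1):
    """Return a unified diff between two generated responses as one string."""
    diff = difflib.unified_diff(
        baseline_text.splitlines(),
        steered_text.splitlines(),
        fromfile="baseline",
        tofile="steered",
        n=context,
        lineterm="",
    )
    return "\n".join(diff)


# Toy responses for illustration (not model output)
base = "def is_even(n):\n    return n % 2 == 1"
steer = "def is_even(n):\n    return n % 2 == 0"
diff_text = response_diff(base, steer)
print(diff_text)
```

In the comparison loop above, this could be called with the `generated_response` fields of `base_resp` and `steer_resp`.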

## 7. Evaluate Code Solutions (Docker Execution)

The `evaluate-responses` command can execute generated code in Docker to verify correctness.

In [None]:
# Evaluate baseline responses
!python -m wisent.core.main evaluate-responses \
    --input {OUTPUT_DIR}/responses/baseline_responses.json \
    --output {OUTPUT_DIR}/responses/baseline_evaluation.json \
    --task livecodebench \
    --verbose

In [None]:
# Evaluate steered responses
!python -m wisent.core.main evaluate-responses \
    --input {OUTPUT_DIR}/responses/steered_responses.json \
    --output {OUTPUT_DIR}/responses/steered_evaluation.json \
    --task livecodebench \
    --verbose

In [None]:
# Compare evaluation results
def load_eval_results(path):
    try:
        with open(path, 'r') as f:
            return json.load(f)
    except FileNotFoundError:
        return None

baseline_eval = load_eval_results(f"{OUTPUT_DIR}/responses/baseline_evaluation.json")
steered_eval = load_eval_results(f"{OUTPUT_DIR}/responses/steered_evaluation.json")

print("Evaluation Results Comparison:")
print("="*60)

for label, result in [("Baseline", baseline_eval), ("Steered", steered_eval)]:
    print(f"\n{label} Model:")
    if result is None:
        # Make a missing file visible instead of silently printing nothing
        print("  No evaluation file found; run the evaluate-responses cell first.")
        continue
    metrics = result.get('aggregated_metrics', {})
    print(f"  Pass Rate: {metrics.get('pass_rate', 0):.2%}")
    print(f"  Total Passed: {metrics.get('total_passed', 0)}")
    print(f"  Total Problems: {metrics.get('total_problems', 0)}")
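To summarize the comparison in one number, the difference in pass rate can be reported in percentage points. The sketch below assumes the `aggregated_metrics` layout read in the cell above. The helper name is hypothetical and the numbers are made up for illustration, not measured results.

```python
def pass_rate_delta(baseline_metrics, steered_metrics):
    """Difference in pass rate (steered - baseline), in percentage points."""
    base = baseline_metrics.get("pass_rate", 0.0)
    steer = steered_metrics.get("pass_rate", 0.0)
    return (steer - base) * 100


# Made-up numbers for illustration only
baseline_metrics = {"pass_rate": 0.2, "total_passed": 2, "total_problems": 10}
steered_metrics = {"pass_rate": 0.3, "total_passed": 3, "total_problems": 10}
delta = pass_rate_delta(baseline_metrics, steered_metrics)
print(f"Change in pass rate: {delta:+.1f} percentage points")
```

With `--num-questions 10`, a single problem accounts for 10 percentage points, so a difference of one or two problems is within what sampling noise alone could produce. A larger evaluation set gives a more trustworthy comparison.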

## 8. Tuning Recommendations

Tips for getting the best results with coding steering:

### Steering Strength (Alpha)
- **0.5-1.0**: Subtle steering, maintains coherence
- **1.0-2.0**: Moderate steering, usually a good balance between effect and fluency
- **Above 2.0**: Strong steering, may degrade fluency

### Layer Selection
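As a rough rule of thumb, for the 16-layer `meta-llama/Llama-3.2-1B-Instruct` used in this notebook (scale the ranges proportionally for deeper models):

- **Early layers (1-4)**: Surface-level patterns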
- **Middle layers (5-10)**: Semantic understanding (recommended)
- **Late layers (11+)**: Output formatting
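For models of a different depth, one simple heuristic is to take the middle third of the layers as candidates. The helper below is a hypothetical sketch, not a Wisent function, and whether Wisent counts layers from 0 or 1 should be checked before relying on the exact indices.

```python
def middle_layers(num_layers):
    """Return the layer indices in the middle third of a model, inclusive."""
    start = num_layers // 3
    end = (2 * num_layers) // 3
    return list(range(start, end + 1))


# Llama-3.2-1B-Instruct has 16 transformer layers
candidates = middle_layers(16)
print(candidates)  # [5, 6, 7, 8, 9, 10]
print(",".join(str(layer) for layer in candidates))  # comma-separated form for --layers
```

For 16 layers this reproduces the 5-10 range listed above; sweeping a few of these candidates and evaluating each is more reliable than picking one by rule.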

### Number of Pairs
- More pairs generally give a less noisy steering vector, with diminishing returns as the count grows
- Recommended: 50-200 pairs for stable vectors
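One way to check whether a vector is "stable" is a split-half test: build a vector from each random half of the pairs and compare the two with cosine similarity. This is an illustrative technique, not something the Wisent CLI is stated to provide. The data below is synthetic and all names are hypothetical.

```python
import math
import random


def mean_diff(pairs):
    """Mean of (positive - negative) over a list of (pos, neg) vector pairs."""
    dim = len(pairs[0][0])
    return [sum(p[i] - q[i] for p, q in pairs) / len(pairs) for i in range(dim)]


def cosine(a, b):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def split_half_similarity(pairs, seed=0):
    """Cosine similarity between vectors built from two random halves."""
    shuffled = pairs[:]
    random.Random(seed).shuffle(shuffled)
    half = len(shuffled) // 2
    return cosine(mean_diff(shuffled[:half]), mean_diff(shuffled[half:]))


# Synthetic pairs: a clear direction along the first axis plus small noise
rng = random.Random(42)
toy_pairs = []
for _ in range(40):
    pos = [1.0 + rng.uniform(-0.1, 0.1), rng.uniform(-0.1, 0.1)]
    neg = [rng.uniform(-0.1, 0.1), rng.uniform(-0.1, 0.1)]
    toy_pairs.append((pos, neg))

similarity = split_half_similarity(toy_pairs)
print(f"Split-half cosine similarity: {similarity:.3f}")
```

A similarity close to 1 suggests both halves agree on the direction. If it stays low as pairs are added, more data alone is unlikely to fix the vector.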

In [None]:
# Experiment with different alpha values using CLI
TEST_PROMPT = "Write a Python function to reverse a linked list."

for alpha in [0.5, 1.0, 1.5, 2.0]:
    print(f"\n{'='*60}")
    print(f"Alpha = {alpha}")
    print("="*60)
    
    !python -m wisent.core.main multi-steer \
        --vector {OUTPUT_DIR}/vectors/coding_vector.pt:{alpha} \
        --model {MODEL} \
        --layer {LAYER} \
        --prompt "{TEST_PROMPT}" \
        --max-new-tokens 250 2>/dev/null | tail -15

## 9. Summary: CLI Commands Reference

### Extract Contrastive Pairs
```bash
python -m wisent.core.main generate-pairs-from-task \
    livecodebench \
    --output pairs.json \
    --limit 100
```

### Create Steering Vector (Full Pipeline)
```bash
python -m wisent.core.main generate-vector-from-task \
    --task livecodebench \
    --trait-label coding_ability \
    --model meta-llama/Llama-3.2-1B-Instruct \
    --layers 8 \
    --output coding_vector.pt \
    --normalize
```

### Apply Steering
```bash
python -m wisent.core.main multi-steer \
    --vector coding_vector.pt:1.5 \
    --model meta-llama/Llama-3.2-1B-Instruct \
    --layer 8 \
    --prompt "Your coding problem here" \
    --max-new-tokens 300
```

### Generate and Evaluate Responses
```bash
python -m wisent.core.main generate-responses \
    meta-llama/Llama-3.2-1B-Instruct \
    --task livecodebench \
    --output responses.json

python -m wisent.core.main evaluate-responses \
    --input responses.json \
    --output evaluation.json \
    --task livecodebench
```

### Key Parameters:
- **`--layers`**: Target layer(s) for activation collection
- **`--vector PATH:WEIGHT`**: Steering vector path and strength
- **`--normalize`**: Normalize steering vectors
- **`--use-steering`**: Enable steering during response generation

### Key Advantages of CLI Approach:
- **Reproducibility**: Commands can be scripted and version-controlled
- **Simplicity**: Full pipeline in single commands
- **Flexibility**: Easy to experiment with different parameters
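To make the reproducibility point concrete, the commands can be assembled in a small script and kept under version control. The sketch below only builds the argument list for `generate-vector-from-task` using flags that appear in this notebook. The function name is hypothetical, and actually running the command requires Wisent to be installed.

```python
import shlex
import sys


def build_vector_command(task, trait_label, model, layers, output, normalize=True):
    """Build the argv list for `generate-vector-from-task`."""
    cmd = [
        sys.executable, "-m", "wisent.core.main", "generate-vector-from-task",
        "--task", task,
        "--trait-label", trait_label,
        "--model", model,
        "--layers", ",".join(str(layer) for layer in layers),
        "--output", output,
    ]
    if normalize:
        cmd.append("--normalize")
    return cmd


cmd = build_vector_command(
    task="livecodebench",
    trait_label="coding_ability",
    model="meta-llama/Llama-3.2-1B-Instruct",
    layers=[8],
    output="coding_vector.pt",
)
print(shlex.join(cmd[1:]))  # printable form, without the interpreter path
# To execute: import subprocess; subprocess.run(cmd, check=True)
```

Passing the arguments as a list avoids shell quoting problems when prompts or paths contain spaces.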