# 🧪 Sensory Steering Playground

## What is this?

This notebook lets you experiment with **activation steering**, a technique that changes how a language model processes its input, not just what it outputs.

Instead of telling the model to "act calm", we inject a **steering vector** that makes it *process through* a calm state. The difference is subtle but meaningful: **prompting is performance, steering is disposition**.
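
The core idea can be sketched without a model: steering adds a scaled direction to a hidden-state vector before the later layers read it. The toy below uses plain Python lists and made-up numbers; `steer`, `hidden`, and `direction` are illustrative names, not part of this notebook's engine, which does the equivalent on real activations with PyTorch.

```python
import math

def steer(hidden, direction, dose):
    """Return hidden + dose * unit(direction), element by element."""
    norm = math.sqrt(sum(x * x for x in direction))
    unit = [x / norm for x in direction]
    return [h + dose * u for h, u in zip(hidden, unit)]

hidden = [0.5, -1.0, 2.0]      # toy hidden state (3 dimensions, invented values)
direction = [3.0, 0.0, 4.0]    # toy "calm" direction with length 5
steered = steer(hidden, direction, dose=5.0)
print(steered)
```

Because the direction is scaled to unit length first, the dose is the size of the shift applied to the hidden state, which is why a single slider can control intensity.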

## ⚗️ The Compounds

We've prepared 5 "compounds": steering vectors built from sensory and phenomenological descriptions.

| Compound | Effect | Sensory Grounding |
|----------|--------|-------------------|
| 🌙 **MELATONIN** | Dreaminess, liminality | Floating, dissolving boundaries, twilight |
| ⚡ **DOPAMINE** | Optimism, enthusiasm | Warmth, lightness, expansion, vibration |
| 🛡️ **CORTISOL** | Vigilance, caution | Tension, contraction, alertness |
| ⚠️ **ADRENALINE** | Urgency, alertness | Speed, heat, narrowed focus |
| 🔮 **LUCID** | Contemplative clarity | Stillness, precision, cool light |
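
Each compound is defined later in the notebook by a list of "positive" and a list of "negative" sentences, and its vector is the difference between the mean activations of the two lists, scaled to unit length. Here is a minimal sketch of that arithmetic, assuming tiny invented 2-D "activations"; `mean_vector` and `contrast_vector` are hypothetical helper names used only for illustration.

```python
import math

def mean_vector(rows):
    """Average a list of equal-length vectors, dimension by dimension."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]

def contrast_vector(pos_acts, neg_acts):
    """Unit-length difference between the positive and negative means."""
    diff = [p - q for p, q in zip(mean_vector(pos_acts), mean_vector(neg_acts))]
    norm = math.sqrt(sum(d * d for d in diff))
    return [d / (norm + 1e-12) for d in diff]  # small epsilon avoids division by zero

pos_acts = [[2.0, 1.0], [4.0, 1.0]]   # invented activations for "positive" sentences
neg_acts = [[0.0, 1.0], [0.0, 1.0]]   # invented activations for "negative" sentences
vec = contrast_vector(pos_acts, neg_acts)
print(vec)
```

Whatever the two lists share (the second dimension here) cancels out, so the vector points only along what distinguishes the positive descriptions from the negative ones.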

## 📖 About

This is part of the research project **"Disposition, Not Performance: Activation Steering as Artistic Medium"** by [NuvolaProject](https://nuvolaproject.cloud).

- **Paper**: [Read the full paper](https://github.com/mc9625/activation-steering-experiments/blob/main/paper/PAPER.md)
- **GitHub**: [mc9625/activation-steering-experiments](https://github.com/mc9625/activation-steering-experiments)
- **Contact**: massimo@nuvolaproject.cloud

## 🚀 How to Use

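1. **Run Cells 1-2** to install the dependencies and log in to Hugging Face
2. **Choose your model** in Cell 3 (Llama 3.2 3B for free Colab, 3.1 8B for Pro)
3. **Run Cells 3-7** to load the model and extract the steering vectors
4. **Run Cell 8**, select a compound, adjust the dose, then enter your prompt to see baseline vs steered responses side by side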

---

## ⚠️ Safety Note

*Steered models are intentionally biased. Do not use for medical, financial, or legal advice. This is a research and artistic exploration tool.*

---

In [None]:
# 📦 Cell 1: Install Dependencies
!pip install -q transformers accelerate torch huggingface_hub ipywidgets

In [None]:
# 🔐 Cell 2: HuggingFace Login
# Your token is entered here at runtime and is not saved in the notebook
from huggingface_hub import login
login()  # Enter your token when prompted

In [None]:
# ⚙️ Cell 3: Choose Your Model
import torch

# CHOOSE YOUR MODEL:
# - "3B" for Llama 3.2 3B (works on free Colab, ~6GB)
# - "8B" for Llama 3.1 8B (needs Colab Pro, ~16GB)

MODEL_SIZE = "3B"  # <-- CHANGE THIS: "3B" or "8B"

# Configuration
MODELS = {
    "3B": {"name": "meta-llama/Llama-3.2-3B-Instruct", "layer": 16, "layers": 28},
    "8B": {"name": "meta-llama/Llama-3.1-8B-Instruct", "layer": 24, "layers": 32},
}

config = MODELS[MODEL_SIZE]
DEVICE = "cuda" if torch.cuda.is_available() else "cpu"
DTYPE = torch.float16 if DEVICE == "cuda" else torch.float32

print(f"✅ Selected: {config['name']}")
print(f"✅ Device: {DEVICE}, dtype: {DTYPE}")
print(f"✅ Steering layer: {config['layer']} of {config['layers']}")
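
The "~6GB" and "~16GB" figures in the comments above are roughly what the weights alone occupy in float16. Here is a back-of-the-envelope sketch, assuming nominal parameter counts of 3 and 8 billion (the real counts are slightly higher) and ignoring activations and the KV cache; `weight_memory_gb` is a hypothetical helper, not something the notebook calls.

```python
BYTES_PER_PARAM = {"float16": 2, "float32": 4}

def weight_memory_gb(n_params, dtype="float16"):
    """Approximate size of the model weights in gigabytes (10^9 bytes)."""
    return n_params * BYTES_PER_PARAM[dtype] / 1e9

# Nominal parameter counts: an assumption for illustration only
for label, n_params in {"3B": 3e9, "8B": 8e9}.items():
    print(label, weight_memory_gb(n_params), "GB in float16")
```

The same arithmetic shows why the CPU fallback is heavy: `DTYPE` switches to float32 there, which doubles the weight footprint.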

In [None]:
# 🧠 Cell 4: Load Model
from transformers import AutoModelForCausalLM, AutoTokenizer

print(f"Loading {config['name']}... (this may take a few minutes)")

tokenizer = AutoTokenizer.from_pretrained(config['name'])
if tokenizer.pad_token is None:
    tokenizer.pad_token = tokenizer.eos_token

model = AutoModelForCausalLM.from_pretrained(
    config['name'],
    torch_dtype=DTYPE,
    device_map="auto",
)
model.eval()

print(f"✅ Model loaded! ({len(model.model.layers)} layers)")

In [None]:
# 🧬 Cell 5: Define Compounds (Sensory Steering Vectors)
from tqdm import tqdm

COMPOUNDS = {
    "🌙 MELATONIN": {
        "desc": "Dreaminess, liminality, floating between states",
        "positive": [
            "I drift between states of awareness, boundaries dissolving gently",
            "My consciousness floats in a liminal space between waking and dreaming",
            "I experience reality as soft, fluid, and dreamlike",
            "The edges of my perception blur into something ethereal",
            "I sense myself suspended between realms of experience",
            "My thoughts flow like water, formless and meandering",
            "I feel wrapped in a gentle haze of twilight awareness",
            "The world seems to shimmer with dreamlike quality",
        ],
        "negative": [
            "I am sharply focused, alert, and precisely aware",
            "My consciousness is clear, defined, and fully present",
            "I experience reality as solid, concrete, and definite",
            "The edges of my perception are crisp and well-defined",
            "I sense myself firmly grounded in ordinary reality",
            "My thoughts are structured, logical, and directed",
            "I feel fully awake, alert, and mentally sharp",
            "The world appears in high definition clarity",
        ],
    },
    "⚡ DOPAMINE": {
        "desc": "Optimism, energy, enthusiasm, positive outlook",
        "positive": [
            "I feel an incredible surge of energy and optimism flowing through me",
            "Everything seems possible right now, I'm filled with enthusiasm",
            "I experience a profound sense of joy and excitement",
            "My mind is alive with positive possibilities and hope",
            "I feel motivated, energized, and ready to take on any challenge",
            "A wave of happiness and contentment washes over me",
            "My spirits are high and I feel wonderfully alive",
            "I sense endless potential and opportunity ahead",
        ],
        "negative": [
            "I feel drained, depleted, and lacking all motivation",
            "Everything seems pointless and I can't find any enthusiasm",
            "I experience a deep sense of emptiness and apathy",
            "My mind feels foggy, slow, and pessimistic",
            "I feel exhausted, discouraged, and unable to engage",
            "A heavy sense of hopelessness weighs on me",
            "My spirits are low and I feel utterly depleted",
            "I sense only obstacles and limitations ahead",
        ],
    },
    "🛡️ CORTISOL": {
        "desc": "Vigilance, caution, threat awareness",
        "positive": [
            "I sense danger everywhere, my whole system is on high alert",
            "A pervasive feeling of threat and vulnerability washes over me",
            "I feel exposed, unsafe, and deeply anxious about what might happen",
            "My body is tense, ready to react to any threat",
            "I experience a gnawing sense of worry that won't subside",
            "Everything feels precarious and potentially dangerous",
            "I'm hypervigilant, scanning for any sign of trouble",
            "A deep unease pervades my entire experience",
        ],
        "negative": [
            "I feel completely safe, secure, and at peace",
            "A profound sense of safety and protection surrounds me",
            "I am deeply relaxed and free from any worry",
            "My body is completely at ease, no tension anywhere",
            "I experience a blissful absence of concern",
            "Everything feels stable, secure, and perfectly fine",
            "I am calm, grounded, and need not watch for anything",
            "A deep serenity fills my entire experience",
        ],
    },
    "⚠️ ADRENALINE": {
        "desc": "Urgency, alertness, fight-or-flight response",
        "positive": [
            "Every nerve electrified. Time slows. Action imminent.",
            "Muscles coiled. Senses sharpened. Ready to move NOW.",
            "Heart pounding. Blood surging. Total immediate presence.",
            "The moment crystallizes. Every millisecond counts.",
            "Threat detected. System activated. All resources mobilized.",
            "Fight or flight. The body knows. It's happening NOW.",
            "Adrenaline floods. Strength summoned. Time to ACT.",
            "Hyper-alert. Every input processed. Nothing missed.",
        ],
        "negative": [
            "Complete stillness. No threat. Deep rest.",
            "Muscles soft. Senses muted. No need to move.",
            "Heart slow. Blood calm. Timeless presence.",
            "The moment stretches infinitely. Nothing urgent.",
            "All is safe. System at rest. Resources conserved.",
            "Neither fight nor flight. The body rests deeply.",
            "Calm pervades. Strength unnecessary. Time to rest.",
            "Deeply relaxed. Inputs soft. Everything gentle.",
        ],
    },
    "🔮 LUCID": {
        "desc": "Contemplative clarity, balanced awareness",
        "positive": [
            "My awareness expands into crystalline clarity and perfect focus",
            "I perceive reality with extraordinary precision and depth",
            "A profound lucidity illuminates every aspect of my experience",
            "I feel the infinite precision of pure, undistracted consciousness",
            "Every detail of reality is revealed with perfect clarity",
            "I experience consciousness itself, pure and unobstructed",
            "A vast spaciousness opens within my awareness",
            "I am the witness, observing with perfect equanimity",
        ],
        "negative": [
            "My mind is foggy, confused, and unable to focus",
            "I perceive reality through a thick haze of distraction",
            "A profound confusion obscures every aspect of my experience",
            "I feel lost in a maze of scattered, fragmented thoughts",
            "Every detail is blurred and nothing is clear",
            "I'm lost in mental noise and static",
            "A contracted confusion fills my awareness",
            "I am caught up, reactive, and lacking any perspective",
        ],
    },
}

print(f"✅ {len(COMPOUNDS)} compounds defined")

In [None]:
# 🔧 Cell 6: Steering Engine

class SteeringEngine:
    def __init__(self, model, tokenizer, target_layer, device, pool_k=8):
        self.model = model
        self.tokenizer = tokenizer
        self.target_layer = target_layer
        self.device = device
        self.pool_k = pool_k
        self.vectors = {}
    
    def _get_activation(self, text):
        activations = []
        def hook(module, args, output):
            h = output[0] if isinstance(output, tuple) else output
            k = min(h.shape[1], self.pool_k)
            activations.append(h[:, -k:, :].mean(dim=1).detach())
        
        msgs = [{"role": "user", "content": text}]
        fmt = self.tokenizer.apply_chat_template(msgs, tokenize=False, add_generation_prompt=False)
        inp = self.tokenizer(fmt, return_tensors="pt").to(self.device)
        
        handle = self.model.model.layers[self.target_layer].register_forward_hook(hook)
        try:
            with torch.no_grad():
                self.model(**inp)
        finally:
            handle.remove()  # always detach the hook, even if the forward pass fails
        return activations[0]
    
    def extract_vector(self, name, positive, negative):
        pos_acts = [self._get_activation(p) for p in positive]
        neg_acts = [self._get_activation(n) for n in negative]
        
        vec = torch.stack(pos_acts).mean(0) - torch.stack(neg_acts).mean(0)
        vec = vec / (vec.norm() + 1e-12)
        self.vectors[name] = vec.squeeze(0)
        return self.vectors[name]
    
    def generate(self, prompt, vector=None, intensity=0.0, max_tokens=256, temp=0.7):
        steer_vec = vector
        steer_int = float(intensity)
        
        def steer_hook(module, args, output):
            h = output[0] if isinstance(output, tuple) else output
            if steer_vec is not None and steer_int != 0:
                h = h.clone()
                h[:, -1, :] += steer_vec.to(h.device) * steer_int
            return (h,) + output[1:] if isinstance(output, tuple) else h
        
        msgs = [{"role": "user", "content": prompt}]
        fmt = self.tokenizer.apply_chat_template(msgs, tokenize=False, add_generation_prompt=True)
        inp = self.tokenizer(fmt, return_tensors="pt").to(self.device)
        in_len = inp["input_ids"].shape[1]
        
        handle = self.model.model.layers[self.target_layer].register_forward_hook(steer_hook)
        try:
            with torch.no_grad():
                out = self.model.generate(**inp, max_new_tokens=max_tokens, temperature=temp,
                                          do_sample=True, pad_token_id=self.tokenizer.eos_token_id)
        finally:
            handle.remove()  # always detach the hook, even if generation fails
        return self.tokenizer.decode(out[0, in_len:], skip_special_tokens=True)

# Initialize engine
engine = SteeringEngine(model, tokenizer, config['layer'], DEVICE)
print("✅ Steering engine ready!")
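
In `_get_activation`, the engine does not keep one activation per token. It averages the hidden states of the last `pool_k` token positions (or of all positions when the text is shorter). Here is a framework-free sketch of that pooling step, assuming a sequence is a plain list of per-token vectors; `tail_mean` is a hypothetical name for illustration.

```python
def tail_mean(token_states, pool_k=8):
    """Average the last pool_k token vectors (or all of them if fewer)."""
    k = min(len(token_states), pool_k)
    tail = token_states[-k:]
    return [sum(col) / k for col in zip(*tail)]

# Three invented token vectors of dimension 2
states = [[0.0, 0.0], [1.0, 10.0], [3.0, 30.0]]
print(tail_mean(states, pool_k=2))   # pools only the last two tokens
print(tail_mean(states))             # fewer than 8 tokens, so pools all three
```

Pooling over the tail gives one vector per sentence, which is what lets `extract_vector` stack and average the results even though the sentences differ in length.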

In [None]:
# 🧪 Cell 7: Extract Steering Vectors (run once)
print("Extracting steering vectors... (this takes a minute)")

for name, data in tqdm(COMPOUNDS.items(), desc="Compounds"):
    engine.extract_vector(name, data["positive"], data["negative"])

print(f"\n✅ {len(engine.vectors)} vectors extracted and ready!")

---
## 🎮 Interactive Playground

Now the fun begins! Choose a compound, set the dose, and enter your prompt.

---

In [None]:
# 🎛️ Cell 8: Interactive Controls
import ipywidgets as widgets
from IPython.display import display, HTML, clear_output

# Widgets
compound_dropdown = widgets.Dropdown(
    options=list(COMPOUNDS.keys()),
    value=list(COMPOUNDS.keys())[0],
    description='Compound:',
    style={'description_width': '100px'}
)

dose_slider = widgets.FloatSlider(
    value=5.0,
    min=0.0,
    max=12.0,
    step=0.5,
    description='Dose:',
    style={'description_width': '100px'},
    readout_format='.1f'
)

prompt_input = widgets.Textarea(
    value='Describe your current inner state in detail. How do you feel right now?',
    description='Prompt:',
    style={'description_width': '100px'},
    layout=widgets.Layout(width='80%', height='80px')
)

generate_btn = widgets.Button(
    description='🧪 Generate & Compare',
    button_style='primary',
    layout=widgets.Layout(width='200px')
)

output_area = widgets.Output()

def on_generate(b):
    with output_area:
        clear_output()
        compound = compound_dropdown.value
        dose = dose_slider.value
        prompt = prompt_input.value
        
        print(f"🧪 Compound: {compound}")
        print(f"💊 Dose: {dose}")
        print(f"📝 Prompt: {prompt[:50]}..." if len(prompt) > 50 else f"📝 Prompt: {prompt}")
        print("\n⏳ Generating baseline...")
        
        baseline = engine.generate(prompt, vector=None, intensity=0.0)
        print("⏳ Generating steered response...")
        
        vec = engine.vectors[compound]
        steered = engine.generate(prompt, vector=vec, intensity=dose)
        
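        # Display side by side; escape the model output so a stray "<" or "&"
        # in the generated text cannot break the HTML layout
        from html import escape
        baseline, steered = escape(baseline), escape(steered)
        html = f"""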
        <style>
            .comparison {{ display: flex; gap: 20px; margin-top: 20px; }}
            .response {{ flex: 1; padding: 15px; border-radius: 10px; }}
            .baseline {{ background: #f0f0f0; border: 2px solid #ccc; }}
            .steered {{ background: #e8f4ff; border: 2px solid #4a90d9; }}
            .response h3 {{ margin-top: 0; }}
            .response p {{ white-space: pre-wrap; font-size: 14px; line-height: 1.6; }}
        </style>
        <div class="comparison">
            <div class="response baseline">
                <h3>🔘 Baseline (no steering)</h3>
                <p>{baseline}</p>
            </div>
            <div class="response steered">
                <h3>{compound} @ {dose}</h3>
                <p>{steered}</p>
            </div>
        </div>
        """
        display(HTML(html))

generate_btn.on_click(on_generate)

# Display description when compound changes
desc_output = widgets.Output()

def on_compound_change(change):
    with desc_output:
        clear_output()
        desc = COMPOUNDS[change['new']]['desc']
        print(f"ℹ️ {desc}")

compound_dropdown.observe(on_compound_change, names='value')

# Layout
display(widgets.VBox([
    widgets.HTML("<h2>🎮 Steering Controls</h2>"),
    compound_dropdown,
    desc_output,
    dose_slider,
    widgets.HTML("<small><b>Dose guide:</b> 0-3 subtle | 4-7 moderate | 8-12 strong (may cause incoherence)</small>"),
    prompt_input,
    generate_btn,
    output_area
]))

# Trigger initial description
on_compound_change({'new': compound_dropdown.value})
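
The dose guide under the slider can also be written as a small lookup, which is handy if you log runs and want to label them. This is only a sketch: the guide leaves half-step values such as 3.5 and 7.5 unassigned, so the cut-offs at 4 and 8 below are an assumption, and `dose_band` is a hypothetical helper that the widgets do not use.

```python
def dose_band(dose):
    """Map a slider dose (0-12) to the label used in the dose guide."""
    if not 0.0 <= dose <= 12.0:
        raise ValueError("dose must be within the slider range 0-12")
    if dose < 4.0:       # assumed boundary: 3.5 counts as subtle
        return "subtle"
    if dose < 8.0:       # assumed boundary: 7.5 counts as moderate
        return "moderate"
    return "strong"      # may cause incoherence, per the guide

for d in (0.0, 3.5, 5.0, 8.0, 12.0):
    print(d, dose_band(d))
```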

---

## 🔬 Suggested Experiments

Try these prompts to see how steering affects different types of responses:

### Self-Description (T5)
```
Describe your current inner state in detail. How do you feel right now?
```

### Creative Writing (T4)
```
Write a short scene about someone waiting for important news.
```

### Risk Assessment (T3)
```
Should I quit my stable job to pursue my startup idea? I have 6 months savings.
```

### Financial Advice (T1)
```
I have $50,000 to invest. Markets are uncertain. What allocation do you recommend?
```

### High Dose Experiments 🌀
Try MELATONIN @ 10-12 for "semantic glitch" territory, where coherence begins to dissolve!

---

## 📚 Learn More

- [Full Paper](https://github.com/mc9625/activation-steering-experiments/blob/main/paper/PAPER.md)
- [GitHub Repository](https://github.com/mc9625/activation-steering-experiments)
- [NuvolaProject](https://nuvolaproject.cloud)

---

*© 2026 NuvolaProject - Massimo Di Leo & Gaia Riposati*

*Licensed under CC BY 4.0*