# Engram Extraction Experiments

This notebook explores extracting dense representations from transformer hidden states
to compress knowledge into token-sized engrams.

## Research Questions

1. **Which layer captures the best semantic representation?**
2. **What pooling strategy preserves the most information?**
3. **How much compression can we achieve while retaining usefulness?**
4. **Can injected engrams improve model responses about specific entities?**

In [None]:
# Setup
import sys
sys.path.insert(0, "..")

import torch
print(f"PyTorch: {torch.__version__}")
print(f"CUDA available: {torch.cuda.is_available()}")
if torch.cuda.is_available():
    print(f"Device: {torch.cuda.get_device_name(0)}")
    print(f"VRAM: {torch.cuda.get_device_properties(0).total_memory / 1e9:.1f} GB")

In [None]:
from engrams import EngramExtractor, EngramInjector, EngramStore
from engrams.extractor import ExtractionConfig
from engrams.wikipedia import WikipediaEngramBuilder

## 1. Basic Extraction

Let us start by extracting an engram from a Wikipedia article.
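
For intuition about what extraction might do to the hidden states, here is a minimal sketch. It assumes the extractor splits the sequence into `num_engram_tokens` contiguous chunks and mean-pools each one; the actual `EngramExtractor` may work differently. The helper `chunked_mean_pool` and the toy tensor are hypothetical and are not part of the `engrams` package.

```python
import torch


def chunked_mean_pool(hidden: torch.Tensor, num_engram_tokens: int) -> torch.Tensor:
    """Hypothetical helper: pool [seq_len, hidden_dim] into [num_engram_tokens, hidden_dim]."""
    if hidden.shape[0] < num_engram_tokens:
        raise ValueError("sequence is shorter than the requested number of engram tokens")
    # tensor_split handles sequence lengths that do not divide evenly
    chunks = torch.tensor_split(hidden, num_engram_tokens, dim=0)
    # Average each chunk over the sequence dimension
    return torch.stack([chunk.mean(dim=0) for chunk in chunks])


# Toy stand-in for real hidden states: 6 positions, hidden size 4
hidden = torch.arange(24, dtype=torch.float32).reshape(6, 4)
pooled = chunked_mean_pool(hidden, num_engram_tokens=2)
print(pooled.shape)
```

Under this assumption, each engram token summarizes one contiguous slice of the article, so the order of the engram tokens follows the order of the text.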

In [None]:
# Configure extraction
# Using Llama 3.2 1B for quick experiments (fits in 16GB easily)
config = ExtractionConfig(
    model_name="meta-llama/Llama-3.2-1B",
    layer="middle",
    pooling="mean",
    num_engram_tokens=4,
    device="auto",
)

extractor = EngramExtractor(config)

In [None]:
# Create builder (without store for now)
builder = WikipediaEngramBuilder(extractor=extractor, store=None)

# Fetch article
text = builder.fetch_article("Abraham Lincoln")
print(f"Article length: {len(text)} characters")
print(f"\nFirst 500 chars:\n{text[:500]}")

In [None]:
# Extract engram
engram = extractor.extract(text)
print(f"\nEngram: {engram}")
print(f"Vector shape: {engram.vectors.shape}")
print(f"Original tokens: {engram.source_length}")
print(f"Compression ratio: {engram.compression_ratio:.1f}x")

## 2. Layer Comparison

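Which layer produces the best semantic representation? As a first pass, we extract engrams at several depths and compare their vector norms. Norms only show how activation scale changes with depth; they do not measure semantic quality on their own.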
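
One way to make "best" measurable is to check whether, at a given layer, engrams from related texts sit closer together than engrams from unrelated texts. The sketch below assumes each engram is a `[num_engram_tokens, hidden_dim]` tensor. The function `engram_similarity` and the toy vectors are hypothetical stand-ins, not results from the model.

```python
import torch
import torch.nn.functional as F


def engram_similarity(a: torch.Tensor, b: torch.Tensor) -> float:
    """Hypothetical metric: cosine similarity of the mean engram vectors."""
    mean_a = a.float().mean(dim=0)
    mean_b = b.float().mean(dim=0)
    return F.cosine_similarity(mean_a, mean_b, dim=0).item()


# Toy engrams (2 tokens, hidden size 2) standing in for real extractions
anchor = torch.tensor([[1.0, 0.0], [1.0, 0.0]])
related = torch.tensor([[2.0, 0.0], [2.0, 0.0]])
unrelated = torch.tensor([[0.0, 3.0], [0.0, 3.0]])

sim_related = engram_similarity(anchor, related)
sim_unrelated = engram_similarity(anchor, unrelated)
print(f"related: {sim_related:.2f}, unrelated: {sim_unrelated:.2f}")
```

A layer could then be scored by the gap between related and unrelated similarity, averaged over many article pairs.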

In [None]:
import matplotlib.pyplot as plt
import numpy as np

# Test different layers
layers_to_test = [0, 4, 8, 12, 15]
engrams_by_layer = {}

for layer in layers_to_test:
    config = ExtractionConfig(
        model_name="meta-llama/Llama-3.2-1B",
        layer=layer,
        num_engram_tokens=4,
    )
    ext = EngramExtractor(config)
    engrams_by_layer[layer] = ext.extract(text)
    print(f"Layer {layer}: shape={engrams_by_layer[layer].vectors.shape}")

In [None]:
# Visualize vector norms across layers
fig, axes = plt.subplots(1, len(layers_to_test), figsize=(15, 3))

for i, layer in enumerate(layers_to_test):
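    # Detach and move to CPU as float32 so the conversion also works for GPU or bfloat16 tensors
    vectors = engrams_by_layer[layer].vectors.detach().float().cpu().numpy()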
    norms = np.linalg.norm(vectors, axis=1)
    axes[i].bar(range(len(norms)), norms)
    axes[i].set_title(f"Layer {layer}")
    axes[i].set_xlabel("Engram token")
    if i == 0:
        axes[i].set_ylabel("L2 norm")

plt.tight_layout()
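# Make sure the output directory exists before saving
from pathlib import Path
Path("../data").mkdir(parents=True, exist_ok=True)
plt.savefig("../data/layer_comparison.png", dpi=150)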
plt.show()

## 3. Compression Ratio Experiments

How much can we compress while retaining useful information?
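
To get a feel for the numbers before running the sweep, the arithmetic can be sketched directly. This assumes the compression ratio is source tokens divided by engram tokens, which may not match how `compression_ratio` is computed in the `engrams` package. The article length of 8192 tokens is a hypothetical figure, the hidden size of 2048 is that of Llama 3.2 1B, and the storage estimate assumes 16-bit values.

```python
def compression_ratio(source_tokens: int, engram_tokens: int) -> float:
    """Assumed definition: how many source tokens each engram token stands in for."""
    if engram_tokens <= 0:
        raise ValueError("engram_tokens must be positive")
    return source_tokens / engram_tokens


def engram_bytes(engram_tokens: int, hidden_dim: int, bytes_per_value: int = 2) -> int:
    """Storage needed for one engram, assuming 16-bit values by default."""
    return engram_tokens * hidden_dim * bytes_per_value


SOURCE_TOKENS = 8192  # hypothetical article length in tokens
HIDDEN_DIM = 2048     # hidden size of Llama 3.2 1B

for n in [1, 2, 4, 8, 16, 32]:
    ratio = compression_ratio(SOURCE_TOKENS, n)
    size_kib = engram_bytes(n, HIDDEN_DIM) / 1024
    print(f"{n:>2} engram tokens: {ratio:>6.0f}x, {size_kib:.0f} KiB")
```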

In [None]:
# Test different compression levels
num_tokens_to_test = [1, 2, 4, 8, 16, 32]
compression_results = {}

for num_tokens in num_tokens_to_test:
    config = ExtractionConfig(
        model_name="meta-llama/Llama-3.2-1B",
        layer="middle",
        num_engram_tokens=num_tokens,
    )
    ext = EngramExtractor(config)
    eng = ext.extract(text)
    compression_results[num_tokens] = eng
    print(f"{num_tokens} tokens → {eng.compression_ratio:.1f}x compression")

## 4. Injection Test

Does injecting the engram affect generation?
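
As a rough picture of what prefix injection could look like, the sketch below prepends engram vectors to the prompt's token embeddings and extends the attention mask to cover them. This is an assumption about the mechanism, not a description of `EngramInjector`. The helper `prepend_engram` and the random tensors are hypothetical.

```python
import torch


def prepend_engram(
    token_embeds: torch.Tensor,    # [batch, seq_len, hidden_dim]
    attention_mask: torch.Tensor,  # [batch, seq_len]
    engram_vectors: torch.Tensor,  # [num_engram_tokens, hidden_dim]
) -> tuple[torch.Tensor, torch.Tensor]:
    """Hypothetical helper: put engram vectors in front of the prompt embeddings."""
    batch = token_embeds.shape[0]
    # Match dtype and device, then repeat the engram for every item in the batch
    prefix = engram_vectors.to(device=token_embeds.device, dtype=token_embeds.dtype)
    prefix = prefix.unsqueeze(0).expand(batch, -1, -1)
    inputs_embeds = torch.cat([prefix, token_embeds], dim=1)
    # The engram positions must be attended to, so their mask entries are 1
    prefix_mask = torch.ones(
        batch, prefix.shape[1], dtype=attention_mask.dtype, device=attention_mask.device
    )
    new_mask = torch.cat([prefix_mask, attention_mask], dim=1)
    return inputs_embeds, new_mask


# Toy shapes: batch of 1, 5 prompt tokens, hidden size 8, 4 engram tokens
token_embeds = torch.randn(1, 5, 8)
attention_mask = torch.ones(1, 5, dtype=torch.long)
engram_vectors = torch.randn(4, 8)

inputs_embeds, mask = prepend_engram(token_embeds, attention_mask, engram_vectors)
print(inputs_embeds.shape, mask.shape)
```

With a Hugging Face decoder-only model, tensors like these could plausibly be passed to `generate` as `inputs_embeds` and `attention_mask`, though whether the injector does this internally is not confirmed here.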

In [None]:
from engrams.injector import InjectionConfig

# Use prefix injection (simpler than replace)
inject_config = InjectionConfig(
    model_name="meta-llama/Llama-3.2-1B",
    injection_mode="prefix",
)

injector = EngramInjector(inject_config)

In [None]:
# Baseline: generate WITHOUT engram
# Sampling is stochastic, so fix the seed to make reruns reproducible
torch.manual_seed(0)
prompt = "Abraham Lincoln was"

tokens = injector.tokenizer(prompt, return_tensors="pt").to(injector.model.device)
with torch.no_grad():
    baseline_output = injector.model.generate(
        **tokens,
        max_new_tokens=50,
        do_sample=True,
        temperature=0.7,
        top_p=0.9,
    )
baseline_text = injector.tokenizer.decode(baseline_output[0], skip_special_tokens=True)
print("BASELINE (no engram):")
print(baseline_text)

In [None]:
# WITH engram injected; fix the seed so reruns are reproducible
torch.manual_seed(0)
engram_text = injector.inject_and_generate(
    prompt=prompt,
    engram=engram,
    max_new_tokens=50,
    do_sample=True,
    temperature=0.7,
    top_p=0.9,
)
print("\nWITH ENGRAM:")
print(engram_text)

## 5. Next Steps

- [ ] Quantitative evaluation on QA tasks
- [ ] Compare pooling strategies (mean vs attention vs learned)
- [ ] Test with larger models (Llama 3.2 3B, Llama 3.1 8B)
- [ ] Build entity database from Wikipedia
- [ ] Measure hallucination reduction