# Task 1.4: Embedding Extraction Across All Layers

This notebook implements `get_token_trajectory()`, a function that takes a text and a token index and returns that token's residual stream vector after every layer (the `resid_post` activation of each transformer block).

In [1]:
from transformer_lens import HookedTransformer

model = HookedTransformer.from_pretrained("gpt2-small")

`torch_dtype` is deprecated! Use `dtype` instead!


Loaded pretrained model gpt2-small into HookedTransformer


In [2]:
def get_token_trajectory(text: str, token_index: int) -> list[dict]:
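    """Return the residual stream vector for one token after each layer.

    `token_index` counts positions in the tokenized text, including the
    BOS token that the model prepends at index 0. Each entry holds that
    layer's `resid_post` activation as a plain list of floats.
    """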
    _, cache = model.run_with_cache(text)
    n_layers = model.cfg.n_layers  # 12 for GPT-2 small
    trajectory = []
    for layer in range(n_layers):
        embedding = cache["resid_post", layer][0, token_index, :]  # shape (768,)
        trajectory.append({
            "layer": layer,
            "embedding": embedding.tolist()  # convert tensor to list
        })
    return trajectory

## Test: extract trajectory for "cat" in "The cat sat on the mat"

First, let's check which token index "cat" lands on.

In [3]:
text = "The cat sat on the mat"
tokens = model.to_str_tokens(text)
print(list(enumerate(tokens)))

[(0, '<|endoftext|>'), (1, 'The'), (2, ' cat'), (3, ' sat'), (4, ' on'), (5, ' the'), (6, ' mat')]
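
Reading the index off the printed list works for a short sentence, but the lookup can also be done in code. The helper below is a minimal sketch, not part of TransformerLens: `find_token_indices` is a hypothetical name, and it assumes the target word maps to a single token once the leading space is stripped. Words that the tokenizer splits into several pieces will not match.

```python
def find_token_indices(str_tokens: list[str], target: str) -> list[int]:
    """Return every position whose token equals `target` after stripping whitespace."""
    return [i for i, tok in enumerate(str_tokens) if tok.strip() == target]


# Token list copied from the output above (index 0 is the BOS token)
example_tokens = ['<|endoftext|>', 'The', ' cat', ' sat', ' on', ' the', ' mat']
print(find_token_indices(example_tokens, "cat"))  # [2]
```

The function returns a list rather than a single index because a word can occur more than once in a text. The match is case-sensitive, so `"the"` matches only the token at index 5, not `'The'` at index 1.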


In [4]:
# " cat" (with its leading space) is at index 2; index 0 is BOS, index 1 is "The"
cat_index = 2

trajectory = get_token_trajectory(text, cat_index)

# Print shape and first 5 values at each layer to verify they change
for entry in trajectory:
    layer = entry["layer"]
    emb = entry["embedding"]
    print(f"Layer {layer:2d} | shape: ({len(emb)},) | first 5 values: {[round(v, 4) for v in emb[:5]]}")

Layer  0 | shape: (768,) | first 5 values: [0.0751, -0.649, -0.0404, 1.4728, 0.012]
Layer  1 | shape: (768,) | first 5 values: [0.7863, -0.2365, 0.142, 1.972, -0.201]
Layer  2 | shape: (768,) | first 5 values: [1.5578, 0.0035, 0.9228, 2.561, 0.0871]
Layer  3 | shape: (768,) | first 5 values: [2.0965, 0.0061, 1.1364, 3.221, 0.5977]
Layer  4 | shape: (768,) | first 5 values: [1.6133, 0.051, 0.3969, 2.7518, 0.0034]
Layer  5 | shape: (768,) | first 5 values: [1.9186, 0.8388, 1.2553, 2.3371, 0.2368]
Layer  6 | shape: (768,) | first 5 values: [2.0882, 0.6959, 1.0355, 3.5695, 0.5363]
Layer  7 | shape: (768,) | first 5 values: [2.3446, 0.6908, 2.1354, 2.8239, 0.6528]
Layer  8 | shape: (768,) | first 5 values: [1.822, 0.4927, 1.9163, 3.7362, 0.0247]
Layer  9 | shape: (768,) | first 5 values: [0.1678, 1.3988, 1.9146, 4.0028, -1.1894]
Layer 10 | shape: (768,) | first 5 values: [-0.266, 3.8733, 2.8544, 2.1644, -1.6509]
Layer 11 | shape: (768,) | first 5 values: [-3.1292, 2.583, 2.4525, 3.8415, -2.

## What to look for

- There should be **12 entries** — one per layer of GPT-2 small
- Each embedding has **768 values**
- The first 5 values should **visibly differ** across layers, which shows that each block updates the token's representation as it passes through the network
- A commonly cited pattern is that early layers mostly encode surface-level token identity while later layers carry more contextual/semantic information; the printed values alone do not confirm this
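
Eyeballing the first five values is only a rough check. One way to quantify how much the vector changes from one layer to the next is the cosine similarity between consecutive entries. The sketch below uses only the standard library and operates on the list-of-dicts format that `get_token_trajectory()` returns. The helper names `cosine_similarity` and `layer_to_layer_similarity` are illustrative, the toy trajectory is made-up data rather than model output, and the code assumes no embedding is the zero vector.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def layer_to_layer_similarity(trajectory: list[dict]) -> list[float]:
    """Similarity between each layer's embedding and the next layer's."""
    return [
        cosine_similarity(prev["embedding"], curr["embedding"])
        for prev, curr in zip(trajectory, trajectory[1:])
    ]


# Toy 3-layer trajectory with 2-dimensional vectors (made-up data)
toy_trajectory = [
    {"layer": 0, "embedding": [1.0, 0.0]},
    {"layer": 1, "embedding": [1.0, 1.0]},
    {"layer": 2, "embedding": [0.0, 2.0]},
]
toy_sims = layer_to_layer_similarity(toy_trajectory)
print([round(s, 4) for s in toy_sims])  # [0.7071, 0.7071]
```

Passing the real `trajectory` from the cell above to `layer_to_layer_similarity()` would give 11 values for the 12 layers. Values near 1 indicate that a layer changed the vector's direction only slightly.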