# Problem

Given a small model (gelu-1l) and dataset (c4-10k), find and store the top_k max activating dataset examples for each neuron. Then visualize the top_k max activating dataset examples for neuron L0N424 and compare the results to [neuroscope](https://neuroscope.io/gelu-1l/0/424.html).

# Setup
(No need to read)

In [1]:
# Janky code to do different setup when run in a Colab notebook vs VSCode
DEBUG_MODE = False
try:
    import google.colab
    IN_COLAB = True
    print("Running as a Colab notebook")
    %pip install git+https://github.com/neelnanda-io/TransformerLens.git
except ImportError:
    IN_COLAB = False
    print("Running as a Jupyter notebook - intended for development only!")
    from IPython import get_ipython

    ipython = get_ipython()
    # Code to automatically update the HookedTransformer code as it's edited without restarting the kernel
    ipython.run_line_magic("load_ext", "autoreload")
    ipython.run_line_magic("autoreload", "2")

Running as a Colab notebook
Collecting git+https://github.com/neelnanda-io/TransformerLens.git
  Cloning https://github.com/neelnanda-io/TransformerLens.git to /tmp/pip-req-build-6ohxkv76
  Running command git clone --filter=blob:none --quiet https://github.com/neelnanda-io/TransformerLens.git /tmp/pip-req-build-6ohxkv76
  Resolved https://github.com/neelnanda-io/TransformerLens.git to commit 0c464cbd0d929fa3fe51fa2f8ab68389155da43f
  Installing build dependencies ... done
  Getting requirements to build wheel ... done
  Preparing metadata (pyproject.toml) ... done


In [2]:
try:
    %pip install git+https://github.com/callummcdougall/CircuitsVis.git#subdirectory=python
except:
    import os; os.environ["ACCELERATE_DISABLE_RICH"] = "1"
    from IPython import get_ipython
    ipython = get_ipython()
    ipython.run_line_magic("load_ext", "autoreload")
    ipython.run_line_magic("autoreload", "2")

Collecting git+https://github.com/callummcdougall/CircuitsVis.git#subdirectory=python
  Cloning https://github.com/callummcdougall/CircuitsVis.git to /tmp/pip-req-build-0peal_ke
  Running command git clone --filter=blob:none --quiet https://github.com/callummcdougall/CircuitsVis.git /tmp/pip-req-build-0peal_ke
  Resolved https://github.com/callummcdougall/CircuitsVis.git to commit df9bfc252807e8b1c3a26c3c4796c18342c7fc71
  Installing build dependencies ... done
  Getting requirements to build wheel ... done
  Preparing metadata (pyproject.toml) ... done


In [3]:
# Plotly needs a different renderer for VSCode/Notebooks vs Colab argh
import plotly.io as pio

if IN_COLAB or not DEBUG_MODE:
    # Thanks to annoying rendering issues, Plotly graphics will either show up in colab OR Vscode depending on the renderer - this is bad for developing demos! Thus creating a debug mode.
    pio.renderers.default = "colab"
else:
    pio.renderers.default = "png"

In [4]:
# Import stuff
import torch
import torch.nn as nn
import torch.nn.functional as F
import torch.optim as optim
import numpy as np
import circuitsvis as cv
import einops
import tqdm.notebook as tqdm
import random
from pathlib import Path
import plotly.express as px
from torch.utils.data import DataLoader

from jaxtyping import Float, Int
from typing import List, Union, Optional
from functools import partial
import copy

import pprint
import json
import itertools
from transformers import AutoModelForCausalLM, AutoConfig, AutoTokenizer
import dataclasses
from dataclasses import dataclass
import datasets
from IPython.display import HTML

In [5]:
import transformer_lens
import transformer_lens.utils as utils
from transformer_lens.hook_points import (
    HookedRootModule,
    HookPoint,
)  # Hooking utilities
from transformer_lens import HookedTransformer, HookedTransformerConfig, FactoredMatrix, ActivationCache

Plotting helper functions:

In [6]:
import plotly.graph_objects as go

update_layout_set = {"xaxis_range", "yaxis_range", "hovermode", "xaxis_title", "yaxis_title", "colorbar", "colorscale", "coloraxis", "title_x", "bargap", "bargroupgap", "xaxis_tickformat", "yaxis_tickformat", "title_y", "legend_title_text", "xaxis_showgrid", "xaxis_gridwidth", "xaxis_gridcolor", "yaxis_showgrid", "yaxis_gridwidth"}
def imshow(tensor, renderer=None, xaxis="", yaxis="", **kwargs):
    if isinstance(tensor, list):
        tensor = torch.stack(tensor)
    kwargs_post = {k: v for k, v in kwargs.items() if k in update_layout_set}
    kwargs_pre = {k: v for k, v in kwargs.items() if k not in update_layout_set}
    if "facet_labels" in kwargs_pre:
        facet_labels = kwargs_pre.pop("facet_labels")
    else:
        facet_labels = None
    if "color_continuous_scale" not in kwargs_pre:
        kwargs_pre["color_continuous_scale"] = "RdBu"
    fig = px.imshow(utils.to_numpy(tensor), color_continuous_midpoint=0.0,labels={"x":xaxis, "y":yaxis}, **kwargs_pre).update_layout(**kwargs_post)
    if facet_labels:
        for i, label in enumerate(facet_labels):
            fig.layout.annotations[i]['text'] = label

    fig.show(renderer)

def line(tensor, renderer=None, xaxis="", yaxis="", **kwargs):
    px.line(y=utils.to_numpy(tensor), labels={"x":xaxis, "y":yaxis}, **kwargs).show(renderer)

def scatter(x, y, xaxis="", yaxis="", caxis="", renderer=None, **kwargs):
    x = utils.to_numpy(x)
    y = utils.to_numpy(y)
    px.scatter(y=y, x=x, labels={"x":xaxis, "y":yaxis, "color":caxis}, **kwargs).show(renderer)

def lines(lines_list, x=None, mode='lines', labels=None, xaxis='', yaxis='', title = '', log_y=False, hover=None, **kwargs):
    # Helper function to plot multiple lines
    if type(lines_list)==torch.Tensor:
        lines_list = [lines_list[i] for i in range(lines_list.shape[0])]
    if x is None:
        x=np.arange(len(lines_list[0]))
    fig = go.Figure(layout={'title':title})
    fig.update_xaxes(title=xaxis)
    fig.update_yaxes(title=yaxis)
    for c, line in enumerate(lines_list):
        if type(line)==torch.Tensor:
            line = utils.to_numpy(line)
        if labels is not None:
            label = labels[c]
        else:
            label = c
        fig.add_trace(go.Scatter(x=x, y=line, mode=mode, name=label, hovertext=hover, **kwargs))
    if log_y:
        fig.update_layout(yaxis_type="log")
    fig.show()

def bar(tensor, renderer=None, xaxis="", yaxis="", **kwargs):
    px.bar(
        y=utils.to_numpy(tensor),
        labels={"x": xaxis, "y": yaxis},
        template="simple_white",
        **kwargs).show(renderer)

In [7]:
import transformer_lens.patching as patching
from transformer_lens import evals
import math
import pandas as pd

In [8]:
def disable_biases(model):
    for name, param in model.named_parameters():
        if 'b_' in name:
            param.requires_grad = False

def disable_pos_embed(model):
    assert model.cfg.positional_embedding_type == "standard"
    model.pos_embed.W_pos = nn.Parameter(torch.zeros_like(model.pos_embed.W_pos))
    model.pos_embed.W_pos.requires_grad = False

In [9]:
from html import escape
import colorsys
def create_html(strings, values, saturation=0.5, allow_different_length=False):
    # escape strings to deal with tabs, newlines, etc.
    escaped_strings = [escape(s, quote=True) for s in strings]
    processed_strings = [
        s.replace("\n", "<br/>").replace("\t", "&emsp;").replace(" ", "&nbsp;")
        for s in escaped_strings
    ]

    if isinstance(values, torch.Tensor) and len(values.shape)>1:
        values = values.flatten().tolist()

    if not allow_different_length:
        assert len(processed_strings) == len(values)

    # scale values
    max_value = max(max(values), -min(values))+1e-3
    scaled_values = [v / max_value * saturation for v in values]

    # create html
    html = ""
    for i, s in enumerate(processed_strings):
        if i<len(scaled_values):
            v = scaled_values[i]
        else:
            v = 0
        if v < 0:
            hue = 0  # hue for red in HSV
        else:
            hue = 0.66  # hue for blue in HSV
        rgb_color = colorsys.hsv_to_rgb(
            hue, abs(v), 1
        )  # hue picks red or blue, saturation is |v| (a negative saturation yields invalid hex colors), value 1
        hex_color = "#%02x%02x%02x" % (
            int(rgb_color[0] * 255),
            int(rgb_color[1] * 255),
            int(rgb_color[2] * 255),
        )
        html += f'<span style="background-color: {hex_color}; border: 1px solid lightgray; font-size: 16px; border-radius: 3px;">{s}</span>'

    display(HTML(html))

In [10]:
from datasets import load_dataset

# Load Model

The model is gelu-1l, open-sourced by Neel Nanda. This is a 1L transformer with 2048 neurons.

In [11]:
torch.set_grad_enabled(False)

<torch.autograd.grad_mode.set_grad_enabled at 0x7d65e0ba2230>

In [12]:
device = 'cuda' if torch.cuda.is_available() else 'cpu'
print(device)

cuda


In [13]:
model = HookedTransformer.from_pretrained("gelu-1l")

Loaded pretrained model gelu-1l into HookedTransformer


# Load dataset

The dataset we will use is c4-10k, a 10k subset of the C4 (web text) training data. Note this is a very small subset of the training distribution, which helps make this problem more tractable to complete in a colab notebook.


The dataset was shared by Neel Nanda on [HuggingFace](https://huggingface.co/NeelNanda), so we can load it with the load_dataset function:

In [14]:
dataset = load_dataset("NeelNanda/c4-10k", split="train")
print(dataset)

Dataset({
    features: ['text', 'timestamp', 'url'],
    num_rows: 10000
})


The dataset is small enough that we can store all tokens in a single tensor. Note that each example starts with a BOS token (1). Most examples end with PAD tokens (2):

In [15]:
all_tokens = model.to_tokens(dataset['text'])
print(all_tokens.shape)
print(all_tokens)

torch.Size([10000, 1024])
tensor([[    1,  6957,  1045,  ...,     2,     2,     2],
        [    1, 11410,   308,  ...,     2,     2,     2],
        [    1,  3912,   274,  ...,     2,     2,     2],
        ...,
        [    1, 31592,   254,  ...,     2,     2,     2],
        [    1,  8608,   758,  ...,     2,     2,     2],
        [    1,  1516,  3797,  ...,     2,     2,     2]], device='cuda:0')


I recommend that you break the problem into the following two subproblems:

1. Run the model on the dataset examples (feel free to use an even smaller subset to start), and store the data indices for the top_k max activating dataset examples. I recommend storing these indices in a [top_k, length] tensor.
2. Visualize these max activating dataset examples for L0N424 in a fashion similar to [neuroscope](https://neuroscope.io/gelu-1l/0/424.html) (with str tokens highlighted based on the neuron's activation at that position)


For the visualization, you can use this create_html function which I stole from [neelutils](https://github.com/neelnanda-io/neelutils). To use it, pass in a list of str tokens and the corresponding tensor of neuron activations for that example:

In [16]:
from html import escape
import colorsys
def create_html(strings, values, saturation=0.5, allow_different_length=False):
    # escape strings to deal with tabs, newlines, etc.
    escaped_strings = [escape(s, quote=True) for s in strings]
    processed_strings = [
        s.replace("\n", "<br/>").replace("\t", "&emsp;").replace(" ", "&nbsp;")
        for s in escaped_strings
    ]

    if isinstance(values, torch.Tensor) and len(values.shape)>1:
        values = values.flatten().tolist()

    if not allow_different_length:
        assert len(processed_strings) == len(values)

    # scale values
    max_value = max(max(values), -min(values))+1e-3
    scaled_values = [v / max_value * saturation for v in values]

    # create html
    html = ""
    for i, s in enumerate(processed_strings):
        if i<len(scaled_values):
            v = scaled_values[i]
        else:
            v = 0
        if v < 0:
            hue = 0  # hue for red in HSV
        else:
            hue = 0.66  # hue for blue in HSV
        rgb_color = colorsys.hsv_to_rgb(
            hue, abs(v), 1
        )  # hue picks red or blue, saturation is |v| (a negative saturation yields invalid hex colors), value 1
        hex_color = "#%02x%02x%02x" % (
            int(rgb_color[0] * 255),
            int(rgb_color[1] * 255),
            int(rgb_color[2] * 255),
        )
        html += f'<span style="background-color: {hex_color}; border: 1px solid lightgray; font-size: 16px; border-radius: 3px;">{s}</span>'

    display(HTML(html))

create_html(["The", " quick", " brown", " fox"], [1, 2, 3, 4])
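To make the color mapping concrete, here is a minimal sketch of how a single scaled value could be turned into a background color. The helper name `scaled_value_to_hex` is hypothetical (it is not part of neelutils), and the sketch assumes the value has already been divided by the max and multiplied by the saturation. It uses `abs(v)` as the HSV saturation so that negative values come out as shades of red:

```python
import colorsys

def scaled_value_to_hex(v):
    """Map a scaled activation v in [-1, 1] to a hex background color.

    Hypothetical helper for illustration only: negative values are red,
    non-negative values are blue, and |v| sets how saturated the color is.
    """
    hue = 0.0 if v < 0 else 0.66  # 0.0 is red, 0.66 is blue in HSV
    r, g, b = colorsys.hsv_to_rgb(hue, abs(v), 1.0)
    return "#%02x%02x%02x" % (int(r * 255), int(g * 255), int(b * 255))

for v in [0.0, 0.5, -0.5]:
    print(v, scaled_value_to_hex(v))
```

A value of zero maps to white, so tokens where the neuron is inactive have no visible highlight.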

# Solution

Note this solution heavily steals from the [neuroscope](https://github.com/neelnanda-io/Neuroscope) codebase.

## Collect max activating examples

This problem relies more on software engineering than mech interp techniques. Recall that we want to find and store which examples have the highest activations for each neuron. Before jumping into the code, let's brainstorm some naive ways to do this:

1. If we had infinite memory (or a tiny dataset), we could just cache every single neuron activation for every example. This would yield a massive [batch, seq, d_mlp] tensor. We could then take the max activation over positions for each (example, neuron) pair and get the top_k activations and indices for each neuron (torch.topk(tensor.max(dim=1).values[:, neuron_index], k=top_k)). Unfortunately, this approach would blow up our GPU memory: for the full dataset the tensor holds 10000 × 1024 × 2048 float32 values, roughly 84 GB. To solve this, we'll likely want to keep a running store of the top_k activations and indices which we iteratively update.

2. If you have prior experience with data structures, you might have considered using a heap to solve this issue. For each neuron, we could have a min_heap that keeps the top_k (activation, batch_index) pairs. We could iteratively update each heap by pushing each new (max_act, batch_index) to the heap, and popping the minimum when the size of the heap is > top_k. While there might be a clever way to make something like this work, I don't see an obvious way to leverage pytorch's optimized vectorized operations, suggesting that this approach will likely be slow and annoying to implement.
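For reference, the heap idea from point 2 can be sketched in plain Python with the standard library's `heapq`. This is only an illustration under simplifying assumptions: `max_acts` is a toy nested list of per-example max activations (one row per example, one column per neuron), and the function name `top_k_per_neuron` is made up for this sketch. The doubly nested Python loop is exactly what makes this approach slow for 2048 neurons and many examples:

```python
import heapq

def top_k_per_neuron(max_acts, top_k):
    """Return, for each neuron, its top_k (activation, example_index) pairs, largest first.

    max_acts[example][neuron] is the max activation of that neuron on that example.
    """
    n_neurons = len(max_acts[0])
    heaps = [[] for _ in range(n_neurons)]  # one min-heap per neuron
    for example_index, row in enumerate(max_acts):
        for neuron, act in enumerate(row):
            heap = heaps[neuron]
            if len(heap) < top_k:
                heapq.heappush(heap, (act, example_index))
            elif act > heap[0][0]:
                # heap[0] is the smallest stored activation; swap it out
                heapq.heapreplace(heap, (act, example_index))
    return [sorted(heap, reverse=True) for heap in heaps]

max_acts = [
    [0.1, 5.0, 2.0],
    [3.0, 1.0, 2.5],
    [2.0, 4.0, 0.5],
    [0.5, 0.2, 9.0],
]
result = top_k_per_neuron(max_acts, top_k=2)
print(result)
```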

Thus we want an approach that uses pytorch to leverage optimized vectorized operations on the GPU, while also iteratively updating a running store to minimize memory consumption. The following approach helps us manage these tradeoffs to get the best of both worlds:

* Initialize two [top_k, d_mlp] tensors which store the top_k activations and batch_indices for each neuron. We'll call these 'max' and 'index' tensors respectively.
* For each mini batch of the dataset, run the model to access the corresponding neuron activations. For each batch of these neuron activations, run a batch_update to update the "max" and "index" tensors as follows:
    - Sort these activations along the batch dimension in descending order. This returns the sorted acts for each neuron as well as the corresponding batch indices.
    - Going through the sorted (new_act, batch_index) pairs row by row, we update the max / index stores (we call the update method in the code below). Since the rows are sorted, we can break out of the loop as soon as a row makes no updates.
        - The update step checks which new activations are greater than the current minimum of the top_k activations for that neuron in the 'max' store, and overwrites that minimum. We can vectorize this update across neurons by using a boolean mask.

Note that although we still have to iterate through examples in the minibatch in the 'batch_update' step, the sorting step limits the time spent in this loop: because the rows are sorted, each neuron can be updated at most top_k times per batch, so the loop exits early once a row produces no updates. We also get to leverage highly optimized low-level pytorch operations (sort, min, indexing).
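Before looking at the full class, the update logic can be tried on a toy example. The following is a stripped-down sketch, not the class used below: it keeps the two stores as module-level tensors, uses made-up sizes (top_k = 2, three neurons, one batch of four examples), and runs on the CPU. The result is checked against torch.topk on the same data:

```python
import torch

top_k, n_neurons = 2, 3
# Running stores of shape [top_k, n_neurons]
max_store = torch.full((top_k, n_neurons), float("-inf"))
index_store = torch.full((top_k, n_neurons), -1, dtype=torch.long)

def update(new_act, new_index):
    """Overwrite each neuron's smallest stored activation if the candidate beats it."""
    current_min, min_slot = max_store.min(dim=0)  # both have shape [n_neurons]
    mask = new_act > current_min                  # neurons that need an update
    max_store[min_slot[mask], mask] = new_act[mask]
    index_store[min_slot[mask], mask] = new_index[mask]
    return int(mask.sum())

def batch_update(acts, first_index):
    """acts: [batch, n_neurons] tensor of per-example max activations."""
    sorted_acts, order = acts.sort(dim=0, descending=True)
    example_ids = order + first_index  # position in batch -> dataset index
    for row in range(acts.shape[0]):
        if update(sorted_acts[row], example_ids[row]) == 0:
            break  # rows are sorted, so no later row can cause an update

acts = torch.tensor([[0.1, 5.0, 2.0],
                     [3.0, 1.0, 2.5],
                     [2.0, 4.0, 0.5],
                     [0.5, 0.2, 9.0]])
batch_update(acts, first_index=0)

expected = torch.topk(acts, k=top_k, dim=0)
print(max_store.sort(dim=0, descending=True).values)
print(expected.values)
```

On this toy batch the loop stops after three rows: the third row of sorted activations beats none of the stored minimums.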

To implement this, we will write a class that holds the stores and performs the updates. Let's start by creating a config that we can pass to this class:

In [17]:
@dataclass
class Config:
    model_name: str = "gelu-1l"
    data_name: str ="c4-10k"
    max_tokens: int = int(1e5)
    batch_size: int = 16
    version: int = 0
    debug: bool = False
    neuron_top_k: int = 10

cfg = Config()
print(cfg)

Config(model_name='gelu-1l', data_name='c4-10k', max_tokens=100000, batch_size=16, version=0, debug=False, neuron_top_k=10)


Note max_tokens = 1e5 is an even smaller subset of the c4-10k dataset (roughly 100 sequences of 1024 tokens). This is just so that you can run the notebook in a few seconds. Similarly, I keep the batch size at a fairly low 16 so that even colab GPUs should have enough memory. Feel free to play around with these.

Now we implement MaxStore, which holds the running store of the top_k activations and indices, and NeuronMaxAct, which attaches one MaxStore per layer to the model via a hook.

In [18]:
class MaxStore:
    """Used to calculate max activating dataset examples - takes in batches of activations repeatedly, and tracks the top_k examples activations + indexes"""

    def __init__(self, top_k, length, device="cuda"):
        self.top_k = top_k
        self.length = length
        self.device = device

        self.max = -torch.inf * torch.ones(
            (self.top_k, self.length),
            dtype=torch.float32, device=self.device
        )
        self.index = -torch.ones(
            (self.top_k, self.length),
            dtype=torch.long, device=self.device
        )

        self.counter = 0
        self.total_updates = 0
        self.num_batches_seen = 0

    def update(self, new_act, new_index):
        """
        Given a tensor of max neuron activations and the corresponding dataset indices, update the max / index stores

        Args:
            new_act: Shape [d_mlp,]
            new_index: Shape [d_mlp,]

        Returns:
            int: number of neurons that required an update for this step
        """
        min_max_act, min_max_indices = self.max.min(0)
        mask = new_act > min_max_act
        self.max[min_max_indices[mask], mask] = new_act[mask]
        self.index[min_max_indices[mask], mask] = new_index[mask]

        num_updates = mask.sum().item()
        self.total_updates += num_updates
        return num_updates

    def batch_update(self, new_acts, text_indices=None):
        """
        Given a batch of max neuron activations, update the max / index stores

        Args:
            new_acts: Shape [batch, length]
            text_indices: Shape [batch,]
        """
        batch_size = new_acts.size(0)
        sorted_acts, sorted_indices = new_acts.sort(0, descending=True)
        if text_indices is None:
            text_indices = torch.arange(
                self.counter,
                self.counter + batch_size,
                dtype=torch.long,
                device=self.device
            )
        new_indices = text_indices[sorted_indices]
        for i in range(batch_size):
            num_updates = self.update(sorted_acts[i], new_indices[i])
            if num_updates == 0:
                break
        self.counter += batch_size
        self.num_batches_seen += 1

    def save(self, dir, folder_name=None):
        if folder_name is None:
            path = dir
        else:
            path = dir / folder_name

        path.mkdir(exist_ok=True, parents=True)
        torch.save(self.max, path / "max.pth")
        torch.save(self.index, path / "index.pth")

        with open(path / "config.json", "w") as f:
            filt_dict = {
                k: v for k, v in self.__dict__.items() if k not in ["max", "index"]
            }
            json.dump(filt_dict, f)

        print(f"Saved Max Store to {path}")

    def inference_mode(self):
        """Switch from updating mode to inference - move to the CPU and sort by max act."""
        self.max = self.max.cpu()
        self.index = self.index.cpu()
        self.max, indices = self.max.sort(0, descending=True)
        self.index = self.index.gather(0, indices)

    @classmethod
    def load(cls, dir, folder_name=None, transpose=False, continue_updating=False):
        dir = Path(dir)
        if folder_name is None:
            path = dir
        else:
            path = dir / folder_name

        max = torch.load(path / "max.pth")
        index = torch.load(path / "index.pth")
        if transpose:
            max = max.T
            index = index.T

        with open(path / "config.json", "r") as f:
            config = json.load(f)

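        # Placeholder tensors are created on the CPU; they are overwritten with the loaded ones below
        mas = MaxStore(config["top_k"], config["length"], device="cpu")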
        for k, v in config.items():
            mas.__dict__[k] = v

        mas.max = max
        mas.index = index
        if not continue_updating:
            mas.inference_mode()
        return mas


class BaseMaxTracker:
    def __init__(self, cfg, model, name):
        self.cfg = cfg
        self.debug = self.cfg.debug
        self.model = model
        self.name = name

        self.base_dir = Path("./gelu_outputs") / self.name / self.cfg.data_name / self.cfg.model_name

        if self.debug:
            self.save_dir = self.base_dir
        else:
            self.save_dir = self.base_dir / f"v{self.cfg.version}"

    def save(self):
        raise NotImplementedError

    def finish(self):
        self.save()

class NeuronMaxAct(BaseMaxTracker):
    def __init__(self, cfg, model):
        super().__init__(cfg, model, name="neuron_max_act")

        self.stores = []
        for layer in range(model.cfg.n_layers):
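            # Keep the store on the model's device (str so that it stays JSON-serializable in save)
            store = MaxStore(self.cfg.neuron_top_k, self.model.cfg.d_mlp, device=str(self.model.cfg.device))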
            self.stores.append(store)

            def update_max_hook(neuron_acts, hook, store):
                store.batch_update(
                    einops.reduce(
                        neuron_acts, "batch pos d_mlp -> batch d_mlp", "max"
                    )
                )

            hook_fn = partial(update_max_hook, store=store)
            model.blocks[layer].mlp.hook_post.add_hook(hook_fn)

    def save(self):
        self.save_dir.mkdir(exist_ok=True, parents=True)
        for layer in range(self.model.cfg.n_layers):
            self.stores[layer].save(self.save_dir, folder_name=str(layer))
        print(f"Saved {self.name} to {self.save_dir}")

Note that the BaseMaxTracker is not strictly necessary, but it is useful for sharing this setup with similar trackers (like top_k logits for each neuron). See the [neuroscope codebase](https://github.com/neelnanda-io/Neuroscope/blob/main/neuroscope/scan_over_data.py#L143) for more on this.


Finally, we iterate through the dataset and run the model. Since we add a hook to update the MaxStore upon initializing the tracker, we just have to run the model on each mini batch.


After we've looked through all mini batches, we call tracker.finish() to write the MaxStore to disk.

In [19]:
model.reset_hooks()
tracker = NeuronMaxAct(cfg, model)
try:
    for index in range(0, cfg.max_tokens // model.cfg.n_ctx, cfg.batch_size):
        tokens = all_tokens[index : index + cfg.batch_size]
        model(tokens, return_type=None) # We just have to run the model. Recall that we added a hook when we initialized the tracker
finally:
    tracker.finish()
model.reset_hooks()

Saved Max Store to gelu_outputs/neuron_max_act/c4-10k/gelu-1l/v0/0
Saved neuron_max_act to gelu_outputs/neuron_max_act/c4-10k/gelu-1l/v0
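As a quick check of how much data this loop actually covers, the range arithmetic can be reproduced in isolation. This sketch assumes a context length of 1024 (matching the second dimension of all_tokens) and hard-codes the config values from above:

```python
max_tokens = 100_000  # cfg.max_tokens
n_ctx = 1024          # assumed context length, matching all_tokens.shape[1]
batch_size = 16       # cfg.batch_size

# Same range as the loop over the dataset
batch_starts = list(range(0, max_tokens // n_ctx, batch_size))
n_examples = batch_starts[-1] + batch_size  # the last slice runs past max_tokens // n_ctx

print(f"{len(batch_starts)} batches, {n_examples} examples, {n_examples * n_ctx} tokens")
```

Under these assumptions the loop runs 7 batches and sees the first 112 examples, slightly more than max_tokens, because the final batch is not truncated.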


## Load and visualize max activating examples for L0N424

Now that we've stored the max activating dataset examples on disk, let's load and visualize them. I'm focusing on L0N424 since it has a clear pattern in [neuroscope](https://neuroscope.io/gelu-1l/0/424.html), and I suspect the same feature should be prevalent in c4-10k. We'll first load the max store that we just saved from disk:

In [20]:
layer = 0
store = MaxStore.load(f"./gelu_outputs/neuron_max_act/{cfg.data_name}/{cfg.model_name}/v{cfg.version}/{layer}")

In [21]:
neuron_index = 424
data_indices = store.index[:, neuron_index]
print(data_indices.shape)

torch.Size([10])


In [22]:
tokens = all_tokens[data_indices]
print(tokens.shape)

torch.Size([10, 1024])


Note that the max store only stores the max activation for each dataset example, but we want the neuron activation at each position for visualization purposes. We can just recompute these using run_with_cache:

In [23]:
_, cache = model.run_with_cache(tokens, return_type=None, names_filter=utils.get_act_name("post", layer))
print(cache)

ActivationCache with keys ['blocks.0.mlp.hook_post']


In [24]:
all_neuron_acts = cache['blocks.0.mlp.hook_post']
print(all_neuron_acts.shape)

torch.Size([10, 1024, 2048])


In [25]:
neuron_acts = all_neuron_acts[..., neuron_index]
print(neuron_acts.shape)

torch.Size([10, 1024])


Note that we could do this in a more memory efficient fashion (see the [neuroscope code](https://github.com/neelnanda-io/Neuroscope/blob/8ec7bff96c3e01a511c212a974e354f1e47ce2d9/neuroscope/make_neuroscope_page.py#L171)), but that's not necessary here.


Now that we have all the activations and tokens, we just plug them into the create_html function. I'll only show a chunk of text around the max activation to make this easier on the eyes:

In [26]:
truncated_prefix_length = 50
truncated_suffix_length = 10

for i in range(tokens.shape[0]):
    max_idx = neuron_acts[i].argmax().item()
    left = max(0, max_idx - truncated_prefix_length)
    right = min(tokens.shape[-1], max_idx + truncated_suffix_length)

    print(f"Data Index: {data_indices[i]}")
    print(f"Max activating seq pos: {max_idx}")
    print(f"Max act: {neuron_acts[i, left:right].max():.4f}, Min act: {neuron_acts[i, left:right].min():.4f}")
    create_html(model.to_str_tokens(tokens[i, left:right]), neuron_acts[i, left:right])
    print("\n")

Data Index: 11
Max activating seq pos: 34
Max act: 5.0445, Min act: -0.1698




Data Index: 93
Max activating seq pos: 283
Max act: 4.7326, Min act: -0.1700




Data Index: 104
Max activating seq pos: 327
Max act: 4.5474, Min act: -0.1699




Data Index: 53
Max activating seq pos: 33
Max act: 4.4202, Min act: -0.1700




Data Index: 76
Max activating seq pos: 5
Max act: 4.2794, Min act: -0.1678




Data Index: 40
Max activating seq pos: 4
Max act: 4.2630, Min act: -0.1678




Data Index: 20
Max activating seq pos: 276
Max act: 4.1302, Min act: -0.1700




Data Index: 36
Max activating seq pos: 797
Max act: 4.0616, Min act: -0.1699




Data Index: 10
Max activating seq pos: 122
Max act: 4.0435, Min act: -0.1698




Data Index: 88
Max activating seq pos: 130
Max act: 4.0290, Min act: -0.1699






We see the max activations occur when there are consecutive capital letters, like a title or headline. This is consistent with what we see in [neuroscope](https://neuroscope.io/gelu-1l/0/424.html)!