Decoder vectors / source models for dashboard demo #115

ghproek · 2025-04-04T20:24:44Z

The accompanying paper and blog post point to a demo dashboard with some interpretations. Is it currently documented which SAEs were used and which neurons correspond to each cell in the dashboard? That would give access to the decoder vectors, which would be great for experiments, teaching, etc. Thanks!

SrGonao · 2025-04-07T09:22:47Z

Yes, we made a dashboard for the demo, which unfortunately became somewhat outdated as the code evolved. We've been wanting to make a dashboard again, and there's already a preliminary PR for it. The number of the neuron and the position of the SAE are in the page name, e.g. https://cadentj.github.io/demo/gpt2/top/resid_10-0.html corresponds to the 0th neuron of the layer 10 residual stream SAE trained on GPT2. The GPT2 SAE is the 131k latent model from OpenAI (https://github.com/openai/sparse_autoencoder/blob/main/sparse_autoencoder/paths.py) and the Pythia SAEs are from the Sparse Feature Circuits paper (https://arxiv.org/abs/2403.19647).
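For anyone who wants to map dashboard pages back to SAE latents programmatically, here is a minimal sketch of the naming convention described above. It assumes every page follows the same `<site>_<layer>-<latent>.html` pattern under `demo/<model>/top/` as the one example URL; only that example is confirmed in this thread. The `DashboardCell` type and `parse_dashboard_url` function are hypothetical names for illustration, not part of the project's code.

```python
import re
from typing import NamedTuple
from urllib.parse import urlparse


class DashboardCell(NamedTuple):
    """Hypothetical container for what a dashboard page name encodes."""
    model: str   # e.g. "gpt2"
    site: str    # e.g. "resid" (residual stream)
    layer: int   # layer the SAE was trained on
    latent: int  # index of the SAE neuron / latent


# Assumed file-name pattern, inferred from the single example "resid_10-0.html".
_PAGE_NAME = re.compile(r"^(?P<site>[A-Za-z]+)_(?P<layer>\d+)-(?P<latent>\d+)\.html$")


def parse_dashboard_url(url: str) -> DashboardCell:
    """Split a dashboard URL into model, site, layer, and latent index."""
    # Assumed path layout: demo/<model>/top/<site>_<layer>-<latent>.html
    parts = urlparse(url).path.strip("/").split("/")
    match = _PAGE_NAME.match(parts[-1])
    if match is None or len(parts) < 3:
        raise ValueError(f"Unrecognized dashboard URL: {url}")
    return DashboardCell(
        model=parts[-3],
        site=match.group("site"),
        layer=int(match.group("layer")),
        latent=int(match.group("latent")),
    )


cell = parse_dashboard_url("https://cadentj.github.io/demo/gpt2/top/resid_10-0.html")
print(cell)
```

Once the matching SAE checkpoint is loaded, `cell.latent` would select that latent's decoder vector from the decoder weight matrix. Whether that is a row or a column depends on how the checkpoint stores its weights, so check the shape before indexing.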
