Basic example of SAE #10

annxingyuan · 2024-06-13T12:32:25Z

Changes:

Adds a button to the model viewer for downloading activations, to be used as training data for the SAE
Adds a dedicated page for training the SAE

The basic workflow is:

Go to the animated transformers page
Select the boundary decision task
Train it until accuracy is good (it tops out at around 80%)
Click "Download activations"
Go to /#/sae page (click on the logo, then click "SAE")
Upload activations, then click train
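
For readers unfamiliar with what "train" does on the SAE page, here is a minimal sketch of a sparse autoencoder fitted to a handful of activation vectors. It is not the code in this PR: it uses plain arrays rather than a tensor library, and every name in it (`Sae`, `initSae`, `forward`, `sgdStep`, `train`, the toy `activations`, and the hyperparameters) is an assumption made for illustration. The loss is the mean squared reconstruction error plus an L1 penalty on the ReLU feature activations.

```typescript
// Hypothetical, dependency-free sketch of a sparse autoencoder (SAE).
// None of these names come from the PR; they are illustrative only.
type Vec = number[];

interface Sae {
  wEnc: Vec[]; // [hidden][input]
  bEnc: Vec;   // [hidden]
  wDec: Vec[]; // [input][hidden]
  bDec: Vec;   // [input]
}

// Park-Miller generator so that runs are repeatable; returns values in (-0.5, 0.5).
function makeRng(seed: number): () => number {
  let s = seed;
  return () => {
    s = (s * 48271) % 2147483647;
    return s / 2147483647 - 0.5;
  };
}

function zeros(n: number): Vec {
  const out: Vec = [];
  for (let i = 0; i < n; i++) out.push(0);
  return out;
}

function randomMatrix(rows: number, cols: number, rng: () => number): Vec[] {
  const out: Vec[] = [];
  for (let r = 0; r < rows; r++) {
    const row: Vec = [];
    for (let c = 0; c < cols; c++) row.push(0.2 * rng());
    out.push(row);
  }
  return out;
}

function initSae(inputDim: number, hiddenDim: number, seed: number): Sae {
  const rng = makeRng(seed);
  return {
    wEnc: randomMatrix(hiddenDim, inputDim, rng),
    bEnc: zeros(hiddenDim),
    wDec: randomMatrix(inputDim, hiddenDim, rng),
    bDec: zeros(inputDim),
  };
}

const dot = (a: Vec, b: Vec): number => a.reduce((acc, v, i) => acc + v * b[i], 0);

// Encode with a ReLU, then decode linearly.
function forward(sae: Sae, x: Vec): { features: Vec; recon: Vec } {
  const features = sae.wEnc.map((row, j) => Math.max(0, dot(row, x) + sae.bEnc[j]));
  const recon = sae.wDec.map((row, i) => dot(row, features) + sae.bDec[i]);
  return { features, recon };
}

// Mean over examples of: mean squared error + l1 * sum of (non-negative) features.
function loss(sae: Sae, data: Vec[], l1: number): number {
  let total = 0;
  for (const x of data) {
    const { features, recon } = forward(sae, x);
    const mse = recon.reduce((acc, r, i) => acc + (r - x[i]) * (r - x[i]), 0) / x.length;
    total += mse + l1 * features.reduce((acc, f) => acc + f, 0);
  }
  return total / data.length;
}

// One step of plain SGD on a single example, with hand-written gradients.
function sgdStep(sae: Sae, x: Vec, l1: number, lr: number): void {
  const { features, recon } = forward(sae, x);
  const dRecon = recon.map((r, i) => (2 * (r - x[i])) / x.length);
  // Gradient w.r.t. the encoder pre-activations, computed before the decoder changes.
  const dPre = features.map((f, j) =>
    f > 0 ? sae.wDec.reduce((acc, row, i) => acc + dRecon[i] * row[j], l1) : 0);
  sae.wDec.forEach((row, i) => {
    features.forEach((f, j) => { row[j] -= lr * dRecon[i] * f; });
    sae.bDec[i] -= lr * dRecon[i];
  });
  sae.wEnc.forEach((row, j) => {
    x.forEach((xk, k) => { row[k] -= lr * dPre[j] * xk; });
    sae.bEnc[j] -= lr * dPre[j];
  });
}

function train(sae: Sae, data: Vec[], l1: number, lr: number, epochs: number): void {
  for (let e = 0; e < epochs; e++) {
    for (const x of data) sgdStep(sae, x, l1, lr);
  }
}

// Toy stand-in for uploaded activations (made-up numbers).
const activations: Vec[] = [[1, 0, 0.5], [0, 1, 0.5], [1, 1, 1], [0.5, 0, 0.25]];
const sae = initSae(3, 6, 1);
const lossBefore = loss(sae, activations, 0.001);
train(sae, activations, 0.001, 0.05, 500);
const lossAfter = loss(sae, activations, 0.001);
console.log(lossBefore, lossAfter);
```

The hidden layer is wider than the input (6 versus 3) because an SAE is normally overcomplete; the L1 term is what pushes most features to zero for any given input.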

Next step:

Actually browse some learned dictionary features to see whether they're meaningful
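
One possible way to start browsing features, sketched under assumptions: for a chosen dictionary feature, list the inputs on which it fires most strongly and check by eye whether they share something meaningful. The `FeatureExample` shape, the `topActivating` helper, and the sample data are all hypothetical and not part of this PR.

```typescript
// Hypothetical record: one input plus its SAE feature activations.
interface FeatureExample {
  label: string;      // human-readable form of the input
  features: number[]; // one activation per dictionary feature
}

// Return up to k examples on which the given feature is active, strongest first.
function topActivating(
  examples: FeatureExample[],
  featureIndex: number,
  k: number,
): FeatureExample[] {
  return examples
    .filter((ex) => ex.features[featureIndex] > 0) // filter() already copies the array
    .sort((a, b) => b.features[featureIndex] - a.features[featureIndex])
    .slice(0, k);
}

// Made-up data for illustration.
const featureExamples: FeatureExample[] = [
  { label: 'a', features: [0.9, 0] },
  { label: 'b', features: [0.1, 0.5] },
  { label: 'c', features: [0, 0.7] },
  { label: 'd', features: [0.4, 0] },
];

const topForFeature0 = topActivating(featureExamples, 0, 2).map((ex) => ex.label);
const topForFeature1 = topActivating(featureExamples, 1, 5).map((ex) => ex.label);
console.log(topForFeature0, topForFeature1);
```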

animated-transformer/src/app/animated-transformer/model-evaluator/model-evaluator.component.ts

iislucas · 2024-06-13T13:01:11Z

+      const output = decoderOutputs.layers[0].seqOuput;
+      trainingData.push({
+        'input': input,
+        'mlpOutputs': {


FYI: If you wanted to save all the computation, I think you could do:

const data = getData(decoderOutputs);

I guess it depends a bit on where/how much you want to serialize vs edit this code later.
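
To make the trade-off concrete, here is a rough sketch that compares serializing only one layer's output with serializing the whole outputs object. The interfaces are assumptions loosely modelled on the names visible in the diff; `attention`, `ActivationRecord`, and `selectActivations` are invented for illustration, and the real `decoderOutputs` type may look quite different.

```typescript
// Hypothetical shapes; only `layers` and `seqOuput` appear in the diff above.
interface LayerOutputs {
  seqOuput: number[][];
  attention: number[][]; // invented extra field, stands in for "everything else"
}
interface DecoderOutputs {
  layers: LayerOutputs[];
}
interface ActivationRecord {
  input: string[];
  activations: number[][];
}

// Keep only what the SAE needs: smaller files, but this code must change
// whenever a different activation is wanted.
function selectActivations(input: string[], outputs: DecoderOutputs): ActivationRecord {
  return { input: input, activations: outputs.layers[0].seqOuput };
}

// Made-up values for illustration.
const exampleInput = ['a', 'b'];
const exampleOutputs: DecoderOutputs = {
  layers: [{ seqOuput: [[0.1, 0.2], [0.3, 0.4]], attention: [[1, 0], [0.5, 0.5]] }],
};

const selective = JSON.stringify([selectActivations(exampleInput, exampleOutputs)]);
// Serialize everything: larger files, but nothing to edit later.
const everything = JSON.stringify([{ input: exampleInput, outputs: exampleOutputs }]);
console.log(selective.length, everything.length);
```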

basic working example

224a2dc

annxingyuan requested a review from iislucas June 13, 2024 12:32

iislucas and others added 4 commits June 13, 2024 08:41

fix json5 import, remove webgl forced dep, auto-formatting

5829bbc

some test fixes, start of tiny-world task

13500f4

rule parsing

8319244

basic working example

4ba9bf4

iislucas approved these changes Jun 13, 2024

annxingyuan added 2 commits June 13, 2024 11:35

simplify get tree

20a7dfd

Merge branch 'main' into sae

67031e2

annxingyuan merged commit 0bccb70 into main Jun 13, 2024
3 checks passed

annxingyuan deleted the sae branch June 13, 2024 15:38
