Basic example of SAE #10

Merged
merged 7 commits into main from sae
Jun 13, 2024

annxingyuan
Collaborator

Changes:

  • Adds button to model viewer for downloading activations, to be used as training data for SAE
  • Adds dedicated page for training SAE

The basic workflow is:

  • Go to the animated transformers page
  • Select the boundary decision task
  • Train it until accuracy plateaus (around 80% at most)
  • Click "Download activations"
  • Go to the /#/sae page (click on the logo, then click "SAE")
  • Upload activations, then click train
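
For readers unfamiliar with what "train" does on that page, here is a rough sketch of sparse autoencoder training in plain TypeScript. It is not the code from this PR: `SaeParams`, `initSae`, `sgdStep`, `trainSae` and every hyperparameter are hypothetical, and it assumes the uploaded activations have already been flattened into rows of plain numbers. The loss is squared reconstruction error plus an L1 penalty on the feature activations.

```typescript
// Hypothetical sketch of a tiny sparse autoencoder (SAE) trained with plain SGD.
// Not the implementation from this PR: all names and hyperparameters are assumptions.

interface SaeParams {
  encW: number[][]; // encoder weights, shape [nFeatures][dIn]
  encB: number[];   // encoder bias, shape [nFeatures]
  decW: number[][]; // decoder weights, shape [dIn][nFeatures]
  decB: number[];   // decoder bias, shape [dIn]
}

// Small deterministic generator (LCG) so that runs are reproducible.
function makeRng(seed: number): () => number {
  let s = seed >>> 0;
  return () => {
    s = (s * 1664525 + 1013904223) >>> 0;
    return s / 4294967296;
  };
}

function randomMatrix(rows: number, cols: number, rng: () => number): number[][] {
  const m: number[][] = [];
  for (let r = 0; r < rows; r++) {
    const row: number[] = [];
    for (let c = 0; c < cols; c++) row.push((rng() - 0.5) * 0.2);
    m.push(row);
  }
  return m;
}

function zeros(n: number): number[] {
  const v: number[] = [];
  for (let i = 0; i < n; i++) v.push(0);
  return v;
}

function initSae(dIn: number, nFeatures: number, seed: number): SaeParams {
  const rng = makeRng(seed);
  return {
    encW: randomMatrix(nFeatures, dIn, rng),
    encB: zeros(nFeatures),
    decW: randomMatrix(dIn, nFeatures, rng),
    decB: zeros(dIn),
  };
}

// Forward pass: f = relu(encW x + encB), xHat = decW f + decB.
function forward(p: SaeParams, x: number[]): { f: number[]; xHat: number[] } {
  const f = p.encW.map((row, j) =>
    Math.max(0, row.reduce((acc, w, i) => acc + w * x[i], p.encB[j])));
  const xHat = p.decW.map((row, i) =>
    row.reduce((acc, w, j) => acc + w * f[j], p.decB[i]));
  return { f, xHat };
}

// Mean over examples of: squared reconstruction error + l1 * sum of feature activations.
function saeLoss(p: SaeParams, data: number[][], l1: number): number {
  let total = 0;
  for (const x of data) {
    const { f, xHat } = forward(p, x);
    total += xHat.reduce((acc, v, i) => acc + (v - x[i]) * (v - x[i]), 0);
    total += l1 * f.reduce((acc, v) => acc + v, 0);
  }
  return total / data.length;
}

// One SGD step on a single example, using hand-derived gradients of the loss above.
function sgdStep(p: SaeParams, x: number[], lr: number, l1: number): void {
  const { f, xHat } = forward(p, x);
  const dXHat = xHat.map((v, i) => 2 * (v - x[i]));
  // Gradient w.r.t. each feature: decoder path plus L1 term, zero where the ReLU is inactive.
  const dF = f.map((fj, j) =>
    fj > 0 ? dXHat.reduce((acc, g, i) => acc + g * p.decW[i][j], 0) + l1 : 0);
  for (let i = 0; i < x.length; i++) {
    for (let j = 0; j < f.length; j++) p.decW[i][j] -= lr * dXHat[i] * f[j];
    p.decB[i] -= lr * dXHat[i];
  }
  for (let j = 0; j < f.length; j++) {
    for (let i = 0; i < x.length; i++) p.encW[j][i] -= lr * dF[j] * x[i];
    p.encB[j] -= lr * dF[j];
  }
}

function trainSae(data: number[][], nFeatures: number, epochs: number,
                  lr = 0.01, l1 = 0.01): SaeParams {
  const p = initSae(data[0].length, nFeatures, 1);
  for (let e = 0; e < epochs; e++) {
    for (const x of data) sgdStep(p, x, lr, l1);
  }
  return p;
}
```

Calling `trainSae(rows, nFeatures, epochs)` on the activation rows and comparing `saeLoss` before and after is a quick sanity check that training is doing something. A real implementation would more likely use a tensor library with automatic differentiation and mini-batches; the hand-written gradients here only keep the sketch dependency-free.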

Next step:

  • Actually browse some learned dictionary features to see whether they're meaningful
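
One common way to browse a learned feature is to list the inputs that activate it most strongly. A minimal sketch, assuming each example carries a label and its vector of SAE feature activations (`FeatureExample` and `topActivating` are hypothetical names, not part of this PR):

```typescript
// Hypothetical helper: rank examples by how strongly they activate one SAE feature.
interface FeatureExample {
  label: string;         // human-readable description of the input
  activations: number[]; // SAE feature activations for that input
}

function topActivating(examples: FeatureExample[], feature: number, k: number): FeatureExample[] {
  return examples
    .filter((e) => e.activations[feature] > 0) // drop examples where the feature is off
    .sort((a, b) => b.activations[feature] - a.activations[feature])
    .slice(0, k);
}
```

If the top examples for a feature share an obvious property (for instance, all sitting on the same side of the decision boundary), that is some evidence the feature is meaningful.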

const output = decoderOutputs.layers[0].seqOuput;
trainingData.push({
'input': input,
'mlpOutputs': {
Collaborator

FYI: If you wanted to save all the computation, I think you could do:

const data = getData(decoderOutputs);

I guess it depends a bit on where/how much you want to serialise vs edit this code later.
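
To make the trade-off concrete, here is a sketch of the "pick fields by hand" side: build a small record per example, serialise it to JSON for download, and parse it back on upload. The record shape is only guessed from the `'input'` / `'mlpOutputs'` keys in the snippet above, and `ActivationRecord`, `selectRecord`, `serialiseRecords`, `parseRecords` and `toTrainingRows` are hypothetical names. What `getData` returns is not shown in this thread, so it is left out.

```typescript
// Hypothetical record shape, guessed from the 'input' / 'mlpOutputs' keys quoted above.
interface ActivationRecord {
  input: string[];                             // tokens fed to the model
  mlpOutputs: { [layer: string]: number[][] }; // per-layer activations, [position][dim]
}

// Keep only the fields the SAE trainer needs: smaller file, but more code to edit
// whenever another activation should be exported.
function selectRecord(input: string[], layerOutputs: number[][][]): ActivationRecord {
  const mlpOutputs: { [layer: string]: number[][] } = {};
  layerOutputs.forEach((out, i) => { mlpOutputs["layer" + i] = out; });
  return { input, mlpOutputs };
}

function serialiseRecords(records: ActivationRecord[]): string {
  return JSON.stringify(records);
}

function parseRecords(json: string): ActivationRecord[] {
  const parsed = JSON.parse(json);
  if (!Array.isArray(parsed)) throw new Error("expected an array of records");
  return parsed as ActivationRecord[];
}

// Flatten one layer into rows (one per token position), ready to use as SAE training data.
function toTrainingRows(records: ActivationRecord[], layer: string): number[][] {
  const rows: number[][] = [];
  for (const r of records) {
    for (const row of r.mlpOutputs[layer] || []) rows.push(row);
  }
  return rows;
}
```

Serialising everything in one call would shrink the export side to a single line, at the cost of larger files and a loader that has to know the full output structure.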

annxingyuan merged commit 0bccb70 into main on Jun 13, 2024
3 checks passed
annxingyuan deleted the sae branch on June 13, 2024 15:38