Notebook 26 — Port SAE training to Mamba-2.8B #3

@caiovicentino

Description

Open-source SAE coverage for the Mamba architecture is lacking. Challenge: state-space activations are structured differently from transformer residual streams. Target: train a reference SAE on the state_mixer output and report observations. This will likely require a hand-rolled hook path; budget 1-2 days.
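
A rough sketch of what the hand-rolled hook path could look like, assuming PyTorch. Everything here is illustrative: `ToyBlock` is a hypothetical stand-in for one Mamba block (its `mixer` submodule is just a linear layer, not a real SSM), and `collect_activations`, `train_sae`, the dictionary size, and the L1 coefficient are made-up names and values. The actual module path for the state_mixer output in Mamba-2.8B still needs to be confirmed against the model code.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)


class ToyBlock(nn.Module):
    """Hypothetical stand-in for one Mamba block (NOT the real architecture)."""

    def __init__(self, d_model):
        super().__init__()
        self.norm = nn.LayerNorm(d_model)
        self.mixer = nn.Linear(d_model, d_model)  # placeholder for the SSM mixer

    def forward(self, x):
        return x + self.mixer(self.norm(x))


class SparseAutoencoder(nn.Module):
    """Plain ReLU autoencoder; sparsity comes from an L1 penalty on the code."""

    def __init__(self, d_in, d_hidden):
        super().__init__()
        self.enc = nn.Linear(d_in, d_hidden)
        self.dec = nn.Linear(d_hidden, d_in)

    def forward(self, x):
        z = F.relu(self.enc(x))
        return self.dec(z), z


def collect_activations(model, module, batches):
    """Run `model` over `batches` and capture `module`'s output via a forward hook."""
    captured = []

    def hook(_module, _inputs, output):
        # Some modules return tuples; keep the first element in that case.
        out = output[0] if isinstance(output, tuple) else output
        # Flatten (batch, seq, d_model) to (batch * seq, d_model).
        captured.append(out.detach().reshape(-1, out.shape[-1]))

    handle = module.register_forward_hook(hook)
    try:
        with torch.no_grad():
            for batch in batches:
                model(batch)
    finally:
        handle.remove()  # always detach the hook, even if a forward pass fails
    return torch.cat(captured)


def train_sae(acts, d_hidden=64, l1_coeff=1e-3, steps=200, lr=1e-2):
    """Full-batch training: MSE reconstruction loss plus L1 sparsity penalty."""
    sae = SparseAutoencoder(acts.shape[-1], d_hidden)
    opt = torch.optim.Adam(sae.parameters(), lr=lr)
    losses = []
    for _ in range(steps):
        recon, z = sae(acts)
        loss = F.mse_loss(recon, acts) + l1_coeff * z.abs().sum(dim=-1).mean()
        opt.zero_grad()
        loss.backward()
        opt.step()
        losses.append(loss.item())
    return sae, losses


block = ToyBlock(d_model=16)
batches = [torch.randn(4, 8, 16) for _ in range(3)]  # random data, not real tokens
acts = collect_activations(block, block.mixer, batches)
sae, losses = train_sae(acts)
```

For the real model, the same `collect_activations` call would presumably take the loaded Mamba model and whichever submodule produces the state_mixer output. Activations at 2.8B scale would need to be streamed to disk or trained on in minibatches rather than concatenated in memory as above.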
