Sparse Autoencoders trained on the encoder layers of Whisper and HuBERT, from the paper AudioSAE: Towards Understanding of Audio-Processing Models with Sparse AutoEncoders (EACL 2026).
This repository contains inference and basic interpretability code. It lets you load a pretrained SAE, run it on any layer of Whisper or HuBERT, inspect the sparse feature activations, and reproduce the paper's core feature-analysis tooling. Training, feature-steering and probing utilities used in the paper are intentionally not included here.
All SAE checkpoints live on the HuggingFace Hub at Egorgij21/Audio-SAE.
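To mirror a whole backbone's per-layer checkpoints at once rather than downloading single files, `huggingface_hub.snapshot_download` works as usual; this sketch assumes the per-backbone repo id used in the quickstart below:

```python
from huggingface_hub import snapshot_download

# Mirror every per-layer checkpoint of one backbone into the local HF cache.
# Repo id taken from the quickstart below; swap it for another backbone.
local_dir = snapshot_download(repo_id="Egorgij21/Audio-SAE-hubert-base")
print(local_dir)  # the quickstart loads files like layer_3/ae.pt from here
```

The available configurations: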
| Backbone | Dict size | Layers | k |
|---|---|---|---|
| HuBERT-base (facebook/hubert-base-ls960) | 6144 | 1–12 | 50 |
| HuBERT-large (facebook/hubert-large-ll60k) | 8192 | 1–24 | 50 |
| Whisper-small (openai/whisper-small) | 6144 | 1–12 | 50 |
| Whisper-large-v3 (openai/whisper-large-v3) | 10240 | 1–32 | 50 |
| Whisper-large-v3-turbo (openai/whisper-large-v3-turbo) | 10240 | 1–32 | 50 |
Each checkpoint is a BatchTop-K SAE with an 8× expansion factor, trained on ~2.8k hours of mixed speech, music, and environmental audio.
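For orientation, a BatchTop-K SAE is a one-hidden-layer autoencoder whose hidden activations are sparsified by keeping the top k·N activations across the whole batch of N frames, rather than exactly k per frame. The snippet below is a minimal schematic of that idea, not the repository's `BatchTopKSAE` implementation; every name and shape in it is illustrative.

```python
import torch
import torch.nn as nn

class TinyBatchTopKSAE(nn.Module):
    """Schematic BatchTop-K SAE: d -> 8*d dictionary, batch-level top-k sparsity."""

    def __init__(self, d_model: int, expansion: int = 8, k: int = 50):
        super().__init__()
        self.k = k
        self.enc = nn.Linear(d_model, expansion * d_model)
        self.dec = nn.Linear(expansion * d_model, d_model)

    def encode(self, x: torch.Tensor) -> torch.Tensor:
        # x: (N, d_model) flattened frames; z: (N, dict_size)
        z = torch.relu(self.enc(x))
        # Keep the k*N largest activations across the whole batch, so an
        # individual frame may end up with more or fewer than k features.
        threshold = z.flatten().topk(self.k * x.shape[0]).values.min()
        return z * (z >= threshold)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.dec(self.encode(x))
```

To try it on real checkpoints, install the repo and run the quickstart below.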
```bash
git clone https://github.com/audiosae/audiosae_demo.git
cd audiosae_demo
pip install -r requirements.txt
```

```python
import torch
from audio_sae import BatchTopKSAE
from audio_sae.models import MyHubert
from huggingface_hub import hf_hub_download
device = "cuda" if torch.cuda.is_available() else "cpu"
# 1. Load a HuBERT-base encoder that exposes layer 3 activations
hubert = MyHubert("facebook/hubert-base-ls960", sae_after_layer=3).to(device).eval()
# 2. Download + load the matching SAE from the Hub
ckpt = hf_hub_download(
    repo_id="Egorgij21/Audio-SAE-hubert-base",
    filename="layer_3/ae.pt",
)
sae = BatchTopKSAE.from_pretrained(ckpt, device=device)
# 3. Run on an audio file
import librosa
wav, _ = librosa.load("example.wav", sr=16000, mono=True)
wav = torch.from_numpy(wav).unsqueeze(0).to(device)
with torch.no_grad():
    acts = hubert(wav)  # (1, T, d)
    features = sae.encode(acts, use_threshold=True)  # (1, T, dict_size), sparse
```

`features[b, t]` is a sparse vector with roughly k non-zero entries per frame; the exact count is controlled by the SAE's learned activation threshold and varies by clip and layer. For an exact-k sparsification, flatten the activations to `(B*T, d)` and call `sae.encode(x, use_threshold=False, is_eval=False)`. Take the non-zero indices to read which SAE features fired, and their magnitudes to rank them.
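As a concrete illustration of that last step, a minimal sketch (assuming the `(1, T, dict_size)` tensor from the quickstart; the frame index is arbitrary) listing the strongest features on one frame:

```python
# Rank the SAE features that fired on frame t of clip 0.
t = 100
frame = features[0, t]                    # (dict_size,)
fired = frame.nonzero(as_tuple=True)[0]   # indices of active features
order = frame[fired].argsort(descending=True)
for idx in fired[order][:10]:
    print(f"feature {idx.item():>5d}  activation {frame[idx].item():.3f}")
```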
See examples/inference.ipynb for a runnable walkthrough
including Whisper and a simple top-feature visualisation.
`audio_sae.interp` ports the paper's feature-analysis primitives:

```python
from audio_sae import (
reconstruction_metrics, # L0, L2, normalized_l2, frac_var_explained
aggregate_max_activations, # per-clip max activation table
top_clips_for_feature, # rank clips by a feature's activation
dead_features, # features that never fired
feature_activation_windows, # mel-spec windows around a feature's activations
collect_dataset_stats, # per-dataset frame/audio fire stats
classify_features, # speech / sounds / music labels from those stats
plot_feature_tsne, # encoder-weight t-SNE coloured by label
)
```

`feature_activation_windows` is the `get_mels` helper from the paper: it returns the log-mel windows used to visualise what an SAE feature responds to. Averaging many windows over the clips where the feature fires gives the feature's receptive field in mel-spectrogram space.
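A hedged sketch of that averaging; the call signature below is an assumption (the real one lives in `audio_sae.interp`), and only the averaging step is the point:

```python
from audio_sae import feature_activation_windows

# clips: an iterable of example waveforms (placeholder; load your own audio).
# Hypothetical call; the real signature may differ from this sketch.
windows = feature_activation_windows(sae, hubert, clips, feature_id=1234)
# Assumed return: (num_windows, n_mels, frames) log-mel windows centred on
# frames where feature 1234 fired. Their mean approximates the feature's
# receptive field in mel-spectrogram space; plot it with imshow.
receptive_field = windows.mean(dim=0)
```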
collect_dataset_stats + classify_features + plot_feature_tsne together reproduce
the speech / sounds / music classification from section 4 of the paper: accumulate
per-dataset activation frequencies, label each feature by the category where it fires
most often, and project the labelled encoder rows to 2-D with t-SNE.
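Under assumed signatures (only the function names come from the import list above; the arguments are illustrative), the pipeline chains as:

```python
from audio_sae import collect_dataset_stats, classify_features, plot_feature_tsne

# Hypothetical shapes of the three-step pipeline; real signatures may differ.
# datasets: {"speech": speech_clips, "sounds": sound_clips, "music": music_clips}
stats = collect_dataset_stats(sae, hubert, datasets)  # per-dataset fire stats
labels = classify_features(stats)                     # one label per feature
plot_feature_tsne(sae, labels)                        # t-SNE of encoder rows
```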
See examples/interpretability.ipynb for an
end-to-end run of reconstruction metrics, top-activating clips, and activation
windows on a small set of example clips.
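If you just want numbers without the notebook, a hedged sketch (function names from the import list above; signatures and arguments are assumptions):

```python
from audio_sae import (
    reconstruction_metrics,
    aggregate_max_activations,
    top_clips_for_feature,
)

# Hypothetical calls; the real signatures live in audio_sae.interp.
# `acts` is the (1, T, d) activation tensor from the quickstart, flattened.
metrics = reconstruction_metrics(sae, acts.flatten(0, 1))
print(metrics)  # L0, L2, normalized_l2, frac_var_explained

# clips: an iterable of example waveforms (placeholder; supply your own).
table = aggregate_max_activations(sae, hubert, clips)  # per-clip max activations
best = top_clips_for_feature(table, feature_id=1234, top_n=5)
```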
MIT — see LICENSE.
```bibtex
@inproceedings{aparin2026audiosae,
  title     = {AudioSAE: Towards Understanding of Audio-Processing Models with Sparse AutoEncoders},
  author    = {Aparin, Georgii and Sadekova, Tasnima and Rukhovich, Alexey and Yermekova, Assel and Kushnareva, Laida and Popov, Vadim and Kuznetsov, Kristian and Piontkovskaya, Irina},
  booktitle = {Proceedings of the 19th Conference of the European Chapter of the Association for Computational Linguistics (Volume 1: Long Papers)},
  year      = {2026},
  address   = {Rabat, Morocco},
}
```