Add moonshine streaming #43702

Merged
LysandreJik merged 49 commits into main from add_moonshine on Feb 4, 2026

Conversation

Contributor

@eustlb commented Feb 3, 2026

What does this PR do?

Adds UsefulSensors' new ASR model.

@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

Collaborator

@ArthurZucker left a comment

Very nice!

        )
        self._output_attentions = value

        # Set it recursively on the subconfigs
Collaborator

OK, we should document this under the output_attentions doc IMO!
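For context, the pattern being discussed is a setter that mirrors the flag onto the nested configs, roughly like this (a minimal sketch; the subconfig attribute names are assumptions for illustration):

class CompositeConfig:
    def __init__(self, encoder_config, decoder_config):
        # Hypothetical subconfig names, for illustration only
        self.encoder_config = encoder_config
        self.decoder_config = decoder_config
        self._output_attentions = False

    @property
    def output_attentions(self):
        return self._output_attentions

    @output_attentions.setter
    def output_attentions(self, value):
        self._output_attentions = value
        # Set it recursively on the subconfigs
        for sub in (self.encoder_config, self.decoder_config):
            sub.output_attentions = value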

Comment on lines +345 to +348
if config.encoder_config.hidden_size != self.config.hidden_size:
    self.proj = nn.Linear(config.encoder_config.hidden_size, self.config.hidden_size, bias=False)
else:
    self.proj = nn.Identity()
Collaborator

Argh, would be nice to be able to avoid that!

Contributor Author

Totally agree, but we cannot... (we could have an all-ones linear though, but this is minimal)
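If one did want to drop the branch, the alternative floated above would look something like this (a sketch; reading "all ones" as a weight that preserves the input, i.e. identity-initialized, and using a hypothetical width for illustration):

import torch.nn as nn

hidden_size = 288  # hypothetical hidden size, for illustration only

# Always use a Linear, initialized to the identity when no real projection
# is needed; uniform module structure at the cost of a redundant matmul.
proj = nn.Linear(hidden_size, hidden_size, bias=False)
nn.init.eye_(proj.weight)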

Comment on lines +49 to +59
class MoonshineStreamingProcessorKwargs(ProcessingKwargs, total=False):
    _defaults = {
        "audio_kwargs": {
            "pad_to_multiple_of": 80,
            "padding": True,
        },
        "common_kwargs": {"return_tensors": "pt"},
    }


class MoonshineStreamingProcessor(Wav2Vec2Processor): ...
Collaborator

We don't need that, no? We should just map to the wav2vec2 processor.

Collaborator

Otherwise we are creating a new processor just for a change in default kwargs.

Contributor Author

The issue is that the input needs to be padded, so the behavior should be enforced in the processor, meaning we need to set this. I do agree that having to create a new processor just for that is inconvenient, but I don't see another way to do it.
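To make the constraint concrete: with the _defaults above, a plain call to the processor should pad raw audio up to the next multiple of 80 samples (a sketch; the checkpoint name is taken from the docs example below, and the output key follows the Wav2Vec2 feature extractor):

import numpy as np
from transformers import AutoProcessor

processor = AutoProcessor.from_pretrained("UsefulSensors/moonshine-streaming-tiny")

audio = np.random.randn(16_123).astype(np.float32)  # arbitrary length at 16 kHz
inputs = processor(audio, sampling_rate=16_000)

# pad_to_multiple_of=80 rounds 16123 up to 16160 (= 202 * 80)
print(inputs["input_values"].shape)  # expected: torch.Size([1, 16160])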

@eustlb marked this pull request as ready for review February 3, 2026 12:58
rendered properly in your Markdown viewer.

-->
*This model was released on 2024-10-21 and added to Hugging Face Transformers on 2026-02-03.*
Contributor

This model was released on 2024-10-21

👀? I assume this should be updated haha (unless you're a time traveller)

Contributor Author

yes ahah 😅

@eustlb changed the title from Add moonshine to Add moonshine streaming Feb 4, 2026
Contributor Author

@eustlb left a comment

Last points to address @keveman, and we should be good to go.

Comment on lines +63 to +69
processor = AutoProcessor.from_pretrained("UsefulSensors/moonshine-streaming-tiny")
model = MoonshineStreamingForConditionalGeneration.from_pretrained(
    "UsefulSensors/moonshine-streaming-tiny",
    dtype=torch.float16,
    device_map="auto",
    attn_implementation="sdpa",
)
Contributor Author

Then can you please add it to the hub configs, cf. https://huggingface.co/zai-org/GLM-ASR-Nano-2512/blob/main/config.json for example.
Also cc @Deep-unlearning for the ASR leaderboard evals.
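As a side note, the docs snippet above would typically be exercised end to end along these lines (a sketch; the dataset choice and preprocessing call are illustrative, not from the PR):

import torch
from datasets import load_dataset

# Small dummy ASR dataset commonly used in transformers doc examples
ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
audio = ds[0]["audio"]["array"]

# processor and model come from the docs snippet above
inputs = processor(audio, sampling_rate=16_000).to(model.device, dtype=model.dtype)
generated_ids = model.generate(**inputs)
print(processor.batch_decode(generated_ids, skip_special_tokens=True)[0])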

Comment on lines 238 to 240
class MoonshineStreamingPreTrainedModel(MoonshinePreTrainedModel):
    supports_gradient_checkpointing = False  # TODO: check

Contributor Author

@keveman this was in your original PR. Is this necessary? (It is set to True for Moonshine.)

Contributor

Just tested with this set to True, and tests pass, so yes, it can be True.
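So the class can simply follow Moonshine here, i.e. a one-line change:

class MoonshineStreamingPreTrainedModel(MoonshinePreTrainedModel):
    supports_gradient_checkpointing = True  # matches Moonshine; tests pass with this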

Contributor

Re: dtype=torch.float16, may I request removing it from the doc, getting this merged, and then, after the leaderboard evals run, testing float16 separately in a different pull request?

@eustlb enabled auto-merge (squash) February 4, 2026 15:57
github-actions bot commented Feb 4, 2026

[For maintainers] Suggested jobs to run (before merge)

run-slow: auto, moonshine, moonshine_streaming, musicgen

@LysandreJik disabled auto-merge February 4, 2026 18:29
@LysandreJik merged commit ace7c37 into main Feb 4, 2026
24 of 26 checks passed
@LysandreJik deleted the add_moonshine branch February 4, 2026 18:29
@LysandreJik (Member)

Thank you @eustlb
