Adds Causal Conv 1D kernel for mamba models #40765
Conversation
Imo, users should have both options: the hub kernel and/or the original CUDA kernel.
(This will break for users that used the original kernels.)
Also, it could be done for more models: mamba2, bamba, jamba, ...

Indeed, I guess I will keep the other import of the original `causal-conv1d` package as well.
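A minimal sketch of keeping both options: prefer the original causal-conv1d package, and fall back to the hub kernel via the `kernels` library. The attribute names exposed by the hub kernel are assumptions here, not confirmed by this PR:

```python
# Sketch only: prefer the original CUDA package, fall back to the hub kernel.
try:
    # Original package from https://github.com/Dao-AILab/causal-conv1d
    from causal_conv1d import causal_conv1d_fn, causal_conv1d_update
except ImportError:
    causal_conv1d_fn, causal_conv1d_update = None, None

if causal_conv1d_fn is None:
    try:
        from kernels import get_kernel

        # Assumes the hub kernel exposes the same function names as the package.
        _hub_kernel = get_kernel("kernels-community/causal-conv1d")
        causal_conv1d_fn = _hub_kernel.causal_conv1d_fn
        causal_conv1d_update = _hub_kernel.causal_conv1d_update
    except ImportError:
        pass  # neither backend available; callers fall back to the eager path
```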
Thank you, lgtm overall
Not now, but in the future it would be nice to have lazy loading like flash attention
" is None. Falling back to the mamba.py backend. To install follow https://github.com/state-spaces/mamba/#installation and" | ||
" https://github.com/Dao-AILab/causal-conv1d" | ||
" is None. Falling back to the mamba.py backend. To install follow https://github.com/state-spaces/mamba/#installation for mamba-ssm and" | ||
" install the kernels library using `pip install kernels`" |
I think we need to revamp the message, just so people know that both options are possible.
Yes, done. But for v5 I guess we need to start pushing people to use kernels extensively, since it's a lot better.
The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.
Thanks. The `_lazy_loading_kernel` is counter-intuitive and IMO not needed, as it just wraps around `get_kernel`:
```python
def _lazy_loading_kernel(kernel_name: str):
    kernel = get_kernel(kernel_name)
    return kernel
```
Not sure how this one is actually lazy? But it does not really seem needed.
Hmm, not taking a careful look, but the why for lazy loading on causal-conv1d is the same as for flash attn:
- the builds are not always stable and can easily lead to torch version mismatches
- if someone doesn't use this, then they shouldn't be affected by it (i.e. it should not crash)

At some point, I hope the build system on FA etc. changes to something like manylinux 😢 it's a pain on the original packages
No, but here we are just "naming" it `_lazy_loading_kernel` while we are not changing `get_kernel`: tl;dr it is not helping.
I am glad to have lazy loading, but only if we properly do lazy loading haha
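For what it's worth, a genuinely lazy variant would defer both the import and the hub download to the first call and cache the result; a sketch, not the PR's actual code:

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def _get_causal_conv1d_kernel():
    # The import and the hub download only happen on the first call, and the
    # result is cached, so users who never hit this path are unaffected.
    from kernels import get_kernel

    return get_kernel("kernels-community/causal-conv1d")
```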
Ah ok, yeah, makes sense. Didn't take a proper look at the function, oops.
" is None. Falling back to the sequential implementation of Mamba, as use_mambapy is set to False. To install follow https://github.com/state-spaces/mamba/#installation and" | ||
" https://github.com/Dao-AILab/causal-conv1d. For the mamba.py backend, follow https://github.com/alxndrTL/mamba.py." | ||
" is None. Falling back to the sequential implementation of Mamba, as use_mambapy is set to False. To install follow https://github.com/state-spaces/mamba/#installation for mamba-ssm and" | ||
" https://github.com/Dao-AILab/causal-conv1d or install kernels for causal-conv1d. For the mamba.py backend, follow https://github.com/alxndrTL/mamba.py." |
" https://github.com/Dao-AILab/causal-conv1d or install kernels for causal-conv1d. For the mamba.py backend, follow https://github.com/alxndrTL/mamba.py." | |
" https://github.com/Dao-AILab/causal-conv1d or `pip install kernels` for causal-conv1d. For the mamba.py backend, follow https://github.com/alxndrTL/mamba.py." |
[For maintainers] Suggested jobs to run (before merge): run-slow: falcon_mamba, mamba

Thank you all for the reviews 🤗
* add kernel
* make style
* keep causal-conv1d
* small fix
* small fix
* fix modular converter
* modular fix + lazy loading
* revert changes modular
* nit
* hub kernels update
* update
* small nit
What does this PR do?
Adds the https://huggingface.co/kernels-community/causal-conv1d kernel to the mamba model.
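As a rough illustration of what the kernel provides (a fused causal depthwise conv + activation) and how a dispatch with an eager fallback can look; the function names and the silu activation are assumptions for the sketch, not the PR's exact code:

```python
import torch
import torch.nn.functional as F

try:
    from kernels import get_kernel

    # Assumes the hub kernel exposes `causal_conv1d_fn` like the original package.
    causal_conv1d_fn = get_kernel("kernels-community/causal-conv1d").causal_conv1d_fn
except ImportError:
    causal_conv1d_fn = None


def causal_conv1d(x: torch.Tensor, weight: torch.Tensor, bias: torch.Tensor) -> torch.Tensor:
    """x: (batch, channels, seq_len); weight: (channels, kernel_size) depthwise filter."""
    if causal_conv1d_fn is not None and x.is_cuda:
        # Fused kernel: causal depthwise conv and SiLU in a single launch.
        return causal_conv1d_fn(x, weight, bias, activation="silu")
    # Eager fallback: left-pad so the conv stays causal, depthwise via groups.
    kernel_size = weight.shape[-1]
    out = F.conv1d(
        F.pad(x, (kernel_size - 1, 0)),
        weight.unsqueeze(1),  # (channels, 1, kernel_size)
        bias,
        groups=x.shape[1],
    )
    return F.silu(out)
```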