Conversation

MekkCyber
Contributor

What does this PR do?

Adds the https://huggingface.co/kernels-community/causal-conv1d kernel to the Mamba model.
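For reference, a minimal sketch of how a Hub kernel like this can be pulled in with the kernels library; the attributes on the returned module (`causal_conv1d_fn`, `causal_conv1d_update`) are assumptions based on the upstream causal-conv1d package, not necessarily the exact PR code:

```python
# Sketch: fetch the causal-conv1d kernel from the Hub via the kernels library.
# The attributes exposed by the returned module are assumed to mirror the
# upstream causal-conv1d package (causal_conv1d_fn, causal_conv1d_update).
from kernels import get_kernel

causal_conv1d_kernel = get_kernel("kernels-community/causal-conv1d")
causal_conv1d_fn = causal_conv1d_kernel.causal_conv1d_fn
causal_conv1d_update = causal_conv1d_kernel.causal_conv1d_update
```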

Contributor

@vasqu vasqu left a comment


IMO, users should have both options: kernels and/or the original CUDA kernel.
(This will break for users that used the original kernels.)

Also, it could be done for more models: mamba2, bamba, jamba, ...

@MekkCyber
Contributor Author

Indeed, I guess I will keep the other causal-conv1d import for now, in case kernels is not installed.
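Roughly, that fallback could look like the sketch below, under the assumption that the Hub kernel exposes the same `causal_conv1d_fn`/`causal_conv1d_update` API as the original package; this is illustrative, not the exact PR code:

```python
# Sketch of the fallback: prefer the Hub kernel, otherwise the original
# causal-conv1d package, otherwise None (the slow path handles that case).
try:
    from kernels import get_kernel

    _kernel = get_kernel("kernels-community/causal-conv1d")
    causal_conv1d_fn = _kernel.causal_conv1d_fn
    causal_conv1d_update = _kernel.causal_conv1d_update
except ImportError:
    try:
        from causal_conv1d import causal_conv1d_fn, causal_conv1d_update
    except ImportError:
        causal_conv1d_fn, causal_conv1d_update = None, None
```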

Contributor

@vasqu vasqu left a comment


Thank you, LGTM overall.

Not now, but in the future it would be nice to have lazy loading like flash attention.

Comment on lines 220 to 234
" is None. Falling back to the mamba.py backend. To install follow https://github.com/state-spaces/mamba/#installation and"
" https://github.com/Dao-AILab/causal-conv1d"
" is None. Falling back to the mamba.py backend. To install follow https://github.com/state-spaces/mamba/#installation for mamba-ssm and"
" install the kernels library using `pip install kernels`"
Contributor


I think we need to revamp the message? Just so people know that both are possible

Contributor Author


Yes, done, but for v5 I guess we need to start pushing people to use kernels extensively, since it's a lot better.

@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

Collaborator

@ArthurZucker ArthurZucker left a comment


Thanks, the `_lazy_loading_kernel` is counter-intuitive and IMO not needed, as it just wraps around `get_kernel`.

Comment on lines 172 to 174
def _lazy_loading_kernel(kernel_name: str) -> None:
    kernel = get_kernel(kernel_name)
    return kernel
Collaborator


Not sure how this one is actually lazy? But it does not really seem needed.

Contributor


Hmm, not taking a careful look, but the why for lazy loading on causal-conv1d is the same as for flash attn:

  • the builds are not always stable and can easily lead to torch mismatches
  • if someone doesn't use this then they shouldn't be affected by it (i.e. it should not crash)

At some point, I hope the build system on fa etc. changes to something like manylinux 😢 it's a pain on the original packages.
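For context, the flash-attn-style guard being referenced is roughly the pattern below: check availability without importing at module load, so a missing or broken build never crashes users who don't need it. This is a sketch, and the helper name is illustrative, not necessarily what transformers uses:

```python
# Sketch of an availability guard for an optional compiled kernel package:
# only check that the package can be found, never import it eagerly.
import importlib.util


def is_causal_conv1d_available() -> bool:
    # find_spec does not execute the package, so a broken build cannot crash here.
    return importlib.util.find_spec("causal_conv1d") is not None
```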

Collaborator


No, but here we are just "renaming" `get_kernel` to `_lazy_loading_kernel` while not changing `get_kernel` itself: tl;dr it is not helping.

I am glad to have lazy loading, but only if we properly do the lazy load haha
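A rough sketch of what a properly lazy load could look like here, fetching the kernel only the first time it is actually needed and caching the result; the helper name is hypothetical:

```python
# Hypothetical helper: the kernel is only fetched the first time a forward
# pass actually asks for it, and the result is cached for subsequent calls.
from functools import lru_cache


@lru_cache(maxsize=None)
def _get_cached_kernel(kernel_name: str):
    from kernels import get_kernel  # deferred import, only runs when called

    return get_kernel(kernel_name)
```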

Contributor


Ah ok, yeah, that makes sense, I didn't take a proper look at the function, oops.

" is None. Falling back to the sequential implementation of Mamba, as use_mambapy is set to False. To install follow https://github.com/state-spaces/mamba/#installation and"
" https://github.com/Dao-AILab/causal-conv1d. For the mamba.py backend, follow https://github.com/alxndrTL/mamba.py."
" is None. Falling back to the sequential implementation of Mamba, as use_mambapy is set to False. To install follow https://github.com/state-spaces/mamba/#installation for mamba-ssm and"
" https://github.com/Dao-AILab/causal-conv1d or install kernels for causal-conv1d. For the mamba.py backend, follow https://github.com/alxndrTL/mamba.py."
Collaborator


Suggested change
" https://github.com/Dao-AILab/causal-conv1d or install kernels for causal-conv1d. For the mamba.py backend, follow https://github.com/alxndrTL/mamba.py."
" https://github.com/Dao-AILab/causal-conv1d or `pip install kernels` for causal-conv1d. For the mamba.py backend, follow https://github.com/alxndrTL/mamba.py."

@MekkCyber MekkCyber changed the title from "Adds Causal Conv 1D model for mamba models" to "Adds Causal Conv 1D kernel for mamba models" on Sep 11, 2025
Contributor

[For maintainers] Suggested jobs to run (before merge)

run-slow: falcon_mamba, mamba

@MekkCyber MekkCyber merged commit 6e69b60 into main Sep 12, 2025
18 checks passed
@MekkCyber MekkCyber deleted the causal_1d_kernel branch September 12, 2025 10:22
@MekkCyber
Contributor Author

Thank you all for the reviews 🤗

ErfanBaghaei pushed a commit to ErfanBaghaei/transformers that referenced this pull request Sep 25, 2025
* add kernel
* make style
* keep causal-conv1d
* small fix
* small fix
* fix modular converter
* modular fix + lazy loading
* revert changes modular
* nit
* hub kernels update
* update
* small nit

vijayabhaskar-ev pushed a commit to vijayabhaskar-ev/transformers that referenced this pull request Oct 2, 2025
yuchenxie4645 pushed a commit to yuchenxie4645/transformers that referenced this pull request Oct 4, 2025