Added kernels from kernel hub for Bamba model #41540
ArthurZucker merged 28 commits into huggingface:main from
Conversation
Downstream changes have not been done (eg:
Hi @romitjain, I'm working on a PR to make the kernel function mapping easier, so we don't have to use new functions like
@MekkCyber Sure, no worries. Let me know (or you can share your PR here) once done, and I will update my PR.
Sure, here is the PR: #41577
@MekkCyber I believe I can refactor my PR by using your new mapping function now?
Yes, I did that for the Falcon models here: #41664. You can do the same for the Bamba models using that API; however, the mamba-ssm kernel needs to be fixed before merging the PRs.
…eature-bamba-kernels-from-hub
Signed-off-by: romit <romit@ibm.com>
For my local testing, I made local changes to the mamba_ssm repo and tested the flow. However, we would need to fix it on the kernels hub for this PR to work end to end. Other than that, structurally, this is ready for review. Can you please have a look?
_HUB_KERNEL_MAPPING: dict[str, dict[str, str]] = {
    "causal-conv1d": {"repo_id": "kernels-community/causal-conv1d"},
    "mamba-ssm": {"repo_id": "kernels-community/mamba-ssm", "revision": "clean-mamba-ssm"},
}
I don't have many comments except we should align with #41664 on how to lazy load.
These kernels are essentially the same so we ought to standardize it. The mamba-ssm side should be fixed on their repo, thx for the PR there!
cc @MekkCyber if you can take a look here, since it's close to what you did for the other Mamba-related model (Falcon)
@lru_cache
def is_einops_available() -> bool:
    return _is_package_available("einops")
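The quoted check follows a common cached-availability pattern. A minimal self-contained sketch, using the standard library's `importlib.util.find_spec` in place of transformers' internal `_is_package_available` helper:

```python
from functools import lru_cache
import importlib.util

# Cached availability check: find_spec tests importability without
# actually importing the package, and lru_cache avoids re-scanning sys.path.
@lru_cache
def is_package_available(pkg: str) -> bool:
    return importlib.util.find_spec(pkg) is not None

print(is_package_available("json"))    # True: always in the stdlib
print(is_package_available("einops"))  # True only if einops is installed
```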
Now that I recall it, we would need to do this check before trying to load mamba-ssm kernels from the kernels hub, according to the README here: https://huggingface.co/kernels-community/mamba-ssm
Should I inject these requirements in _HUB_KERNEL_MAPPING in src/transformers/integrations/hub_kernels.py?
But we don't use this check anywhere, no? It doesn't hurt to have it either way, just a bit confused about the usage
for the clean-mamba-ssm implementation I don't think we need this, let's just remove it for now
Sure, will remove this
Also, re: @vasqu's comment, I had added this earlier but forgot to remove it in my latest commit. Will remove it since clean-mamba-ssm won't need it.
@MekkCyber clean-mamba-ssm has an issue. It does not expose all the kernel functions: mamba_chunk_scan_combined, mamba_split_conv1d_scan_combined
see: #41540 (comment)
Hmm, sure, will export them in the branch
SGTM. I will still leave it to @MekkCyber though, as he's the main guy for anything kernels. In essence, it should just be the same way everywhere to avoid unnecessary tech debt.
MekkCyber
left a comment
Thanks a lot @romitjain !
@lru_cache
def is_einops_available() -> bool:
    return _is_package_available("einops")
for the clean-mamba-ssm implementation I don't think we need this, let's just remove it for now
Signed-off-by: romit <romit@ibm.com>
Signed-off-by: romit <romit@ibm.com>
Signed-off-by: romit <romit@ibm.com>
@MekkCyber I have addressed your comments, PTAL. PS: It would require resolution of this issue: #41540 (comment)
I did not run
In your case, you indeed need to run make fix-copies to propagate the modifications you made in the modular files to the real modeling files.
…/feature-bamba-kernels-from-hub
MekkCyber
left a comment
Thanks for fixing this @romitjain, I will find some time this week to rework how imports are done inside the kernel because it's really not optimal to have such nested imports
mamba_ssm_triton = getattr(getattr(mamba_ssm, "ops", None), "triton", None)
selective_state_update = getattr(
    getattr(mamba_ssm_triton, "selective_state_update", None), "selective_state_update", None
)
I think we need to rework how these imports are nested. It doesn't make sense.
I didn't get you, @MekkCyber.
I am also not in favor of these nested imports, but that is what you had previously requested.
Yes, I mean it shouldn't be done this way in the kernel; we should have an init file with all necessary functions, and then use getattr only once, not in a nested way.
Sure, let me know once you update the kernels init file, I can make the changes here.
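To illustrate the review point above, here is a small sketch using `SimpleNamespace` as a stand-in for the kernel module (the names mirror the quoted diff, but the objects are fabricated for illustration): with a nested layout, reaching a function takes chained getattr calls, while a flat init file that re-exports everything needs only one.

```python
from types import SimpleNamespace

# Nested layout: the function is buried under submodules, so reaching it
# safely requires chained getattr calls with None fallbacks.
nested = SimpleNamespace(
    ops=SimpleNamespace(triton=SimpleNamespace(selective_state_update=lambda: "ok"))
)
fn_nested = getattr(
    getattr(getattr(nested, "ops", None), "triton", None), "selective_state_update", None
)

# Flat layout: the kernel's __init__ re-exports the function at top level,
# so a single getattr suffices.
flat = SimpleNamespace(selective_state_update=nested.ops.triton.selective_state_update)
fn_flat = getattr(flat, "selective_state_update", None)

print(fn_nested(), fn_flat())  # ok ok
```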
Thanks a lot ! And sorry this is taking a lot of time 🙏, I’ll work on fixing things this week, at the latest.
No worries at all. Thank you for your continued review on this!
Hi @romitjain! It should be good now. You can use the kernels from this version: https://huggingface.co/kernels-community/mamba-ssm/tree/v0.0.4 instead of the
…/feature-bamba-kernels-from-hub
Signed-off-by: romit <romit@ibm.com>
@MekkCyber Thanks for making the upstream fix; all the imports have been improved now to avoid nested calls. PTAL
MekkCyber
left a comment
Thanks @romitjain! LGTM, only one nit and good to go.
causal_conv1d = lazy_load_kernel("causal-conv1d")
causal_conv1d_update = getattr(causal_conv1d, "causal_conv1d_update", None)
causal_conv1d_fn = getattr(causal_conv1d, "causal_conv1d_fn", None)

mamba_ssm = lazy_load_kernel("mamba-ssm")
selective_state_update = getattr(mamba_ssm, "selective_state_update", None)
mamba_chunk_scan_combined = getattr(mamba_ssm, "mamba_chunk_scan_combined", None)
mamba_split_conv1d_scan_combined = getattr(mamba_ssm, "mamba_split_conv1d_scan_combined", None)
let's move this inside the Mixer to avoid loading the kernels at import time, the same way we did here: https://github.com/romitjain/transformers/blob/03590f18cf59be7a9e215b946bd7ade3d8b12a7a/src/transformers/models/mamba/modeling_mamba.py#L201:L214
@MekkCyber Done for mamba2, jamba and bamba
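The "load inside the Mixer" pattern agreed on above can be sketched as follows. This is a hypothetical, self-contained illustration: `lazy_load_kernel` stands in for the real hub loader, and `MambaMixerSketch` is not the actual transformers class.

```python
from functools import lru_cache

@lru_cache
def lazy_load_kernel(name: str):
    # Stand-in for the real kernels-hub loader; lru_cache means the
    # (expensive) resolution happens at most once per kernel name.
    return {"name": name}

class MambaMixerSketch:
    def __init__(self):
        # Kernels are resolved here, at construction time, not at import
        # time, so merely importing the modeling file triggers no download.
        self.mamba_ssm = lazy_load_kernel("mamba-ssm")
        self.selective_state_update = getattr(self.mamba_ssm, "selective_state_update", None)

m1, m2 = MambaMixerSketch(), MambaMixerSketch()
print(m1.mamba_ssm is m2.mamba_ssm)  # True: the cached load is shared
```

The getattr-with-None fallback mirrors the quoted diff: if the kernel does not export a function, the attribute is simply None and the model can fall back to the eager path.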
Signed-off-by: romit <romit@ibm.com>
Thanks @romitjain! Can you fix the styling issues with
Oops, I forgot to run that @MekkCyber
MekkCyber
left a comment
lgtm! thanks for iterating on this PR
Thanks @MekkCyber
Failing tests seem unrelated! We just need to wait for a green CI to merge.
Seems like it is failing for the latest commits: https://github.com/huggingface/transformers/commits/main/
[For maintainers] Suggested jobs to run (before merge):
run-slow: bamba, granitemoehybrid, jamba, mamba2, qwen3_next
View the CircleCI Test Summary for this PR: https://huggingface.co/spaces/transformers-community/circle-ci-viz?pr=41540&sha=65e283
* Added kernels from kernel hub for Bamba model
* Updated kernel loading
* Remove einops
* Removed global vars
* Fixed make style
* Nit
* Added modeling files
* Fixed merge conflict
* fixed lint
* Removed global import
* Small updates
* Updated
* Resolved merge conflicts
* Fixed the nested import
* Moved imports inside mixer
* CI CD fix

Signed-off-by: romit <romit@ibm.com>
Signed-off-by: romitjain <romit@ibm.com>
Co-authored-by: Mohamed Mekkouri <93391238+MekkCyber@users.noreply.github.com>
What does this PR do?
Adds support for mamba_ssm and causal_conv1d kernels from the kernels hub in Bamba models.
Fixes #41208
Who can review?
@vasqu @MekkCyber @drbh