CANN: add support for partial RoPE and Vision mode #17543
Opt Test: So far, all ROPE test cases have passed.
hipudding
left a comment
Good job! Just add some comments.
Force-pushed: a97dd70 to e0d679c
@noemotiovon rebase to retry test case.
Done.
Force-pushed: a22b9b5 to 22ecc96
Add support for two important RoPE variants: partial rotation (rope_dims < ne0)
and Vision mode rotation.
1. Support for partial RoPE (rope_dims < ne0):
- Split tensor into head (first rope_dims dimensions) and tail portions
- Apply rotation only to head portion using RotaryPositionEmbedding operator
- Copy unrotated tail portion directly from source to destination
- Handle both contiguous and non-contiguous tensor layouts
2. Support for Vision mode (GGML_ROPE_TYPE_VISION):
- Set rope_dims = ne0 for Vision mode to rotate entire tensor
- Vision mode pairs dimension i with dimension i+n_dims (where n_dims = ne0/2)
- No tail handling needed since entire tensor is rotated
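
The head/tail split and the Vision pairing described above can be sketched on the CPU for a single row. This is a minimal reference sketch and not the CANN implementation: the function names `rope_partial_row` and `rope_vision_row` are hypothetical, the head is assumed to pair dimension i with i + rope_dims/2, the angle schedule `pos * freq_base^(-2i / rope_dims)` is assumed, and a single scalar position stands in for the per-section positions that Vision mode uses in practice.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Reference sketch for one row of ne0 floats (hypothetical helper, not CANN code).
// Assumptions: pairs are (i, i + rope_dims/2) inside the head, one scalar
// position, and theta_i = pos * freq_base^(-2i / rope_dims).
std::vector<float> rope_partial_row(const std::vector<float> & src,
                                    std::size_t rope_dims, float pos, float freq_base) {
    const std::size_t ne0 = src.size();
    assert(rope_dims % 2 == 0 && rope_dims <= ne0);

    std::vector<float> dst(ne0);
    const std::size_t half = rope_dims / 2;

    // Head: rotate each pair (i, i + half) by its own angle.
    for (std::size_t i = 0; i < half; ++i) {
        const float exponent = -2.0f * static_cast<float>(i) / static_cast<float>(rope_dims);
        const float theta    = pos * std::pow(freq_base, exponent);
        const float c  = std::cos(theta);
        const float s  = std::sin(theta);
        const float x0 = src[i];
        const float x1 = src[i + half];
        dst[i]        = x0 * c - x1 * s;
        dst[i + half] = x0 * s + x1 * c;
    }

    // Tail: copy the unrotated remainder [rope_dims, ne0) straight through.
    const auto tail = static_cast<std::ptrdiff_t>(rope_dims);
    std::copy(src.begin() + tail, src.end(), dst.begin() + tail);
    return dst;
}

// Vision mode sketch: rope_dims = ne0, so dimension i pairs with i + ne0/2
// and there is no tail. Per-section positions are not modeled here.
std::vector<float> rope_vision_row(const std::vector<float> & src, float pos, float freq_base) {
    return rope_partial_row(src, src.size(), pos, freq_base);
}
```

For example, calling `rope_partial_row(row, 4, pos, 10000.0f)` on a six-element row rotates elements 0..3 and leaves elements 4 and 5 identical to the input.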
Implementation details:
- Use a has_tail flag to determine the execution path: head/tail splitting when
  rope_dims < ne0, or full tensor rotation when rope_dims == ne0
- Support both F32 and F16 data types with intermediate F32 conversion
- Copy non-contiguous tensors to contiguous buffers before calling the
  RotaryPositionEmbedding operator for compatibility
- Improve cache invalidation logic to include the rope_dims and indep_sects
  parameters
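
The path selection and the cache check listed above might look roughly like the following. This is a sketch under stated assumptions, not the code in this PR: `RopePlan`, `plan_rope`, `RopeCacheKey`, and `RopeCache` are hypothetical names, `indep_sects` is assumed to be a boolean, and a real cache key would likely carry further parameters that are omitted here.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical result of path selection.
struct RopePlan {
    int64_t rope_dims;  // number of leading dimensions that get rotated
    bool    has_tail;   // true when an unrotated tail must be copied
};

// Vision mode rotates the whole row (rope_dims = ne0); otherwise only the
// first n_dims dimensions are rotated and the rest forms the tail.
RopePlan plan_rope(int64_t ne0, int64_t n_dims, bool is_vision) {
    assert(n_dims > 0 && n_dims <= ne0);
    const int64_t rope_dims = is_vision ? ne0 : n_dims;
    return { rope_dims, rope_dims < ne0 };
}

// Hypothetical cache key: only the two parameters named in the commit message.
struct RopeCacheKey {
    int64_t rope_dims;
    bool    indep_sects;  // assumed to be a boolean flag

    bool operator==(const RopeCacheKey & other) const {
        return rope_dims == other.rope_dims && indep_sects == other.indep_sects;
    }
};

// Hypothetical cache holder: reports whether cached data had to be rebuilt.
struct RopeCache {
    bool         valid = false;
    RopeCacheKey key{};

    // Returns true when the cache was stale and has been refreshed.
    bool ensure(const RopeCacheKey & wanted) {
        if (valid && key == wanted) {
            return false;  // cached sin/cos data can be reused
        }
        key   = wanted;    // a real backend would recompute its buffers here
        valid = true;
        return true;
    }
};
```

Keeping rope_dims and indep_sects in the key means a change in either one forces a rebuild instead of silently reusing data computed for a different configuration.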
These enhancements enable the CANN backend to handle the RoPE configurations
used in modern vision-language models and in models with partial rotation.