@hipudding (Collaborator) commented Nov 20, 2025

  1. Optimize the caching logic of rope_cache_init.
  2. Add support for mRoPE and i-mRoPE.

Note that on Ascend 910B devices, it is necessary to disable FA
in CLIP and disable NZ-format conversion. These two issues are
still under investigation.
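
To illustrate the difference between the two layouts named above, here is a minimal sketch of how each rotary pair could be mapped to a position component. It assumes four section sizes (time, height, width, extra), a chunked layout for mRoPE, and a round-robin layout for i-mRoPE. The `Axis` enum and both function names are hypothetical, and the exact selection rule used in ggml may differ.

```cpp
#include <array>

// Position component that drives a rotary pair.
// Illustrative names only; they are not taken from this PR.
enum class Axis { T, H, W, E };

// Chunked layout (assumed for mRoPE): pairs are grouped as [T.. | H.. | W.. | E..].
// Assumes the section sizes sum to a positive value.
Axis mrope_axis(int pair, const std::array<int, 4>& sections) {
    const int total  = sections[0] + sections[1] + sections[2] + sections[3];
    const int sector = pair % total;
    const int end_t  = sections[0];
    const int end_h  = end_t + sections[1];
    const int end_w  = end_h + sections[2];
    if (sector < end_t) return Axis::T;
    if (sector < end_h) return Axis::H;
    if (sector < end_w) return Axis::W;
    return Axis::E;
}

// Interleaved layout (assumed for i-mRoPE): components alternate T H W T H W ...
// until a component's section budget is used up; leftovers fall back to E.
Axis imrope_axis(int pair, const std::array<int, 4>& sections) {
    const int total  = sections[0] + sections[1] + sections[2] + sections[3];
    const int sector = pair % total;
    if (sector % 3 == 0 && sector < 3 * sections[0]) return Axis::T;
    if (sector % 3 == 1 && sector < 3 * sections[1]) return Axis::H;
    if (sector % 3 == 2 && sector < 3 * sections[2]) return Axis::W;
    return Axis::E;
}
```

With sections `{2, 2, 2, 0}`, the chunked version assigns pairs 0 to 5 as T T H H W W, while the interleaved version assigns them as T H W T H W.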

@github-actions bot added the testing, ggml, and Ascend NPU labels Nov 20, 2025
@hipudding (Collaborator, Author) commented:

Backend CANN0: OK
Backend 2/5: CANN1
Skipping
Backend 3/5: CANN2
Skipping
Backend 4/5: CANN3
Skipping
Backend 5/5: CPU
Skipping
5/5 backends passed
OK

Verified through tests on the qwen2.5VL-7B, qwen3VL-8B, and qwen3VL-30B-A3B models.

@hipudding hipudding marked this pull request as ready for review November 21, 2025 07:10
@hipudding removed the testing label Nov 21, 2025
acl_theta_scale_tensor.get());

if (ext_factor != 0) {
// Step1.2: prepare rope_yarn_ramp, if this part updated, should update theta_scale_tensor.
@hipudding (Collaborator, Author) commented on these lines:

It may be much cheaper to compute this part on the CPU. In most cases the result is cached and calculated only once.
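
The idea in this comment, computing the ramp on the CPU and reusing it until the inputs change, could be sketched roughly as follows. The `YarnRampCache` struct, its fields, and the `recomputes` counter are hypothetical and exist only for illustration; the ramp formula follows the usual YaRN linear ramp over rotary pair indices, and the actual cache in the CANN backend may be organized differently.

```cpp
#include <algorithm>
#include <vector>

// Hypothetical CPU-side cache for YaRN ramp values (not the PR's actual structure).
struct YarnRampCache {
    float low = 0.0f;
    float high = 0.0f;
    int n_pairs = -1;          // -1 marks an empty cache
    std::vector<float> ramp;   // one value per rotary pair
    int recomputes = 0;        // number of cache misses, for illustration only

    // Returns the cached ramp, recomputing only when a parameter changed.
    // Assumes new_n_pairs >= 0.
    const std::vector<float>& get(float new_low, float new_high, int new_n_pairs) {
        if (new_n_pairs == n_pairs && new_low == low && new_high == high) {
            return ramp;  // cache hit: nothing to do
        }
        low = new_low;
        high = new_high;
        n_pairs = new_n_pairs;
        ramp.resize(n_pairs);
        for (int i = 0; i < n_pairs; ++i) {
            // Linear ramp from 1 (at or below low) down to 0 (at or above high).
            const float y = (i - low) / std::max(0.001f, high - low);
            ramp[i] = 1.0f - std::min(1.0f, std::max(0.0f, y));
        }
        ++recomputes;
        return ramp;
    }
};
```

A caller would invoke `get(low, high, n_pairs)` on every RoPE call and upload the result to the device only after a miss, so the steady-state cost is a few scalar comparisons.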

Copilot AI left a comment

Copilot encountered an error and was unable to review this pull request. You can try again by re-requesting a review.

Copilot finished reviewing on behalf of hipudding November 21, 2025 07:44
@hipudding hipudding requested a review from Copilot November 21, 2025 07:56
Copilot finished reviewing on behalf of hipudding November 21, 2025 07:59
Copilot AI left a comment

Pull Request Overview

Copilot reviewed 3 out of 3 changed files in this pull request and generated 8 comments.

@noemotiovon (Collaborator) left a comment

Wow! This is truly an impressive project—just understanding the code alone is really brain-intensive! The code looks fine, except that IS_VISION hasn’t been verified yet. I’d like to contribute to that part in the future as well. Finally, thank you so much for your contribution!

@hipudding (Collaborator, Author) commented:

@ggerganov @slaren Good day! Could you please review this PR? This change makes VL models available on the CANN backend. Thanks.

@hipudding hipudding merged commit eeb5605 into ggml-org:master Nov 26, 2025
77 checks passed
Labels: Ascend NPU, ggml
