noemotiovon
Collaborator

What does this PR do?

  • Added a check to skip aclrtSetDevice if the current device is already set.
  • Prevents unnecessary context switches while keeping thread/device consistency.
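The check described above can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: the `fake_acl` namespace and `set_device_if_needed` are hypothetical names, and the two stub functions only stand in for the runtime's device query and `aclrtSetDevice` so that the example runs without the CANN toolkit. The per-thread device state in the stub is likewise an assumption made for the illustration.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-ins for the device runtime (NOT the real AscendCL API).
namespace fake_acl {
thread_local int32_t current_device = -1;  // -1: no device bound on this thread
int set_device_calls = 0;                  // counts how often a switch happens

// Reports the device bound to the calling thread; nonzero means "none yet".
int get_device(int32_t * device) {
    if (current_device < 0) {
        return 1;
    }
    *device = current_device;
    return 0;
}

// Binds the calling thread to `device`; this is the call worth avoiding.
int set_device(int32_t device) {
    current_device = device;
    ++set_device_calls;
    return 0;
}
}  // namespace fake_acl

// Sketch of the guarded setter: only switch when the target device differs
// from the one the calling thread already uses.
void set_device_if_needed(int32_t device) {
    assert(device >= 0);

    int32_t current = -1;            // default id, kept if the query fails
    fake_acl::get_device(&current);  // failure simply leaves current == -1

    if (current == device) {
        return;                      // already on the right device: skip the switch
    }
    fake_acl::set_device(device);
}
```

Calling `set_device_if_needed(0)` twice in a row on the same thread triggers only one `fake_acl::set_device` call in this sketch; a later `set_device_if_needed(1)` triggers a second one. Initializing `current` to `-1` means a failed query can never match a valid device id, so the switch still happens in that case.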

@noemotiovon added the labels "Ascend NPU" (issues specific to Ascend NPUs) and "ggml" (changes relating to the ggml tensor library for machine learning) on Sep 11, 2025
@noemotiovon
Collaborator Author

noemotiovon commented Sep 11, 2025

Inference test with the Qwen2.5-0.5B model on 2 NPUs:

......
llama_perf_sampler_print:    sampling time =      40.70 ms /   175 runs   (    0.23 ms per token,  4299.54 tokens per second)
llama_perf_context_print:        load time =    7582.66 ms
llama_perf_context_print: prompt eval time =      26.60 ms /    20 tokens (    1.33 ms per token,   751.94 tokens per second)
llama_perf_context_print:        eval time =     709.67 ms /   154 runs   (    4.61 ms per token,   217.00 tokens per second)
llama_perf_context_print:       total time =    1885.73 ms /   174 tokens
llama_perf_context_print:    graphs reused =        153

Collaborator

@hipudding hipudding left a comment

LGTM

@noemotiovon force-pushed the set_device_opti branch 3 times, most recently from 4a99538 to 76d24c3, on September 15, 2025 02:11
@hipudding merged commit d5fabe3 into ggml-org:master on Sep 17, 2025
49 checks passed
angt pushed a commit to angt/llama.cpp that referenced this pull request Sep 17, 2025
* CANN: Fix ggml_cann_set_device to avoid redundant device switches

- Added a check to skip aclrtSetDevice if the current device is already set.
- Prevents unnecessary context switches while keeping thread/device consistency.

* CANN: add device default id