[perf] Modify CUDA SIMD and add Triton hash encoder #408

Clarence-1103 · 2025-11-25T06:35:29Z

Purpose

What this PR does / why we need it?

Optimize kvcomp performance on CUDA by modifying the CUDA SIMD code and adding a Triton hash encoder.
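For readers unfamiliar with hash-based retrieval, here is a minimal pure-Python sketch of the general idea. The PR description does not spell out the encoding scheme, so everything below is an assumption for illustration: sign-of-random-projection bits packed into bytes and compared by Hamming distance. The names `make_projection`, `hash_code` and `hamming_distance` are hypothetical and are not taken from `hash_encoder.py`.

```python
import random


def make_projection(dim, n_bits, seed=0):
    """Hypothetical helper: one random Gaussian row per output bit."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(n_bits)]


def hash_code(vec, projection):
    """Assumed scheme: bit i is 1 when the i-th projection of vec is positive.

    Bits are packed most-significant-first, eight per byte.
    """
    if len(projection) % 8 != 0:
        raise ValueError("number of bits must be a multiple of 8")
    bits = [
        1 if sum(w * x for w, x in zip(row, vec)) > 0 else 0
        for row in projection
    ]
    packed = bytearray()
    for i in range(0, len(bits), 8):
        byte = 0
        for b in bits[i:i + 8]:
            byte = (byte << 1) | b
        packed.append(byte)
    return bytes(packed)


def hamming_distance(a, b):
    """Number of differing bits between two equal-length packed codes."""
    return sum(bin(x ^ y).count("1") for x, y in zip(a, b))


projection = make_projection(dim=16, n_bits=32)
rng = random.Random(1)
query = [rng.uniform(-1.0, 1.0) for _ in range(16)]
key = [q + rng.uniform(-0.05, 0.05) for q in query]  # a slightly perturbed copy
print(hamming_distance(hash_code(query, projection), hash_code(key, projection)))
```

On a GPU the XOR-and-popcount step is the kind of work that benefits from SIMD-style instructions, and the projection-and-pack step is the kind of work a Triton kernel could fuse; whether the PR does exactly this is not stated in the description.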

Modifications

Does this PR introduce any user-facing change?

Test

How was this patch tested?

1. Compare the performance of the old and new hash retrieval backends.

new hash retrieval backend spent 0.17117667198181152 s
old hash retrieval backend spent 0.4427645206451416 s
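As a quick reading aid, the two timings above can be turned into a speedup factor and a relative time reduction. This snippet only does arithmetic on the reported numbers; the variable names are illustrative.

```python
# Timings copied from the PR description, in seconds.
new_backend_s = 0.17117667198181152
old_backend_s = 0.4427645206451416

speedup = old_backend_s / new_backend_s
reduction = 1.0 - new_backend_s / old_backend_s

print(f"speedup: {speedup:.2f}x, time reduced by {reduction:.1%}")
# prints: speedup: 2.59x, time reduced by 61.3%
```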

2. Benchmark triton_hash_code against torch_hash_code.
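The benchmark script itself is not included in the PR description. A comparison of this kind can be sketched as follows, assuming a simple warm-up-then-median timing loop. The two encoder functions are hypothetical pure-Python stand-ins, not `triton_hash_code` or `torch_hash_code`.

```python
import statistics
import time


def benchmark(fn, *args, warmup=3, repeats=10):
    """Return the median wall-clock time of fn(*args) in seconds."""
    for _ in range(warmup):  # warm-up runs are not timed
        fn(*args)
    samples = []
    for _ in range(repeats):
        start = time.perf_counter()
        fn(*args)
        samples.append(time.perf_counter() - start)
    return statistics.median(samples)


# Hypothetical stand-ins for the two implementations under comparison.
def reference_encoder(values):
    return [1 if v > 0 else 0 for v in values]


def candidate_encoder(values):
    return [int(v > 0) for v in values]


data = [((i * 37) % 101) - 50 for i in range(10_000)]

# Check that both implementations agree before comparing their speed.
assert reference_encoder(data) == candidate_encoder(data)

ref_s = benchmark(reference_encoder, data)
cand_s = benchmark(candidate_encoder, data)
print(f"reference: {ref_s:.6f} s, candidate: {cand_s:.6f} s")
```

When timing real GPU kernels, the device has to be synchronized (for example with `torch.cuda.synchronize()`) before each clock read, because kernel launches are asynchronous; otherwise the loop measures launch overhead only.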

3. E2E test
old:

new:

ucm/sparse/kvcomp/hash_encoder.py

* [fix] fix sparse attention (#397)
  fix ascend attention
  Co-authored-by: lijiachen19 <lijiachen19@huawei.com>
* [opt] Share Infra implementation and unify status codes (#399)
  share infra module
  Co-authored-by: Fang Run <Fang_Run@126.com>
* [bugfix] Fix ESA to be compatible with the latest NFSStore. (#401)
  fix esa to adapt latest NFSStore
* release v0.1.0rc4 (#402)
  Co-authored-by: lijiachen19 <lijiachen19@huawei.com>
* [opt] Remove unused cc impl of dramstore (#406)
  remove unused cc impl of dramstore
* [Fix]remove dram docs and modify quick-start doc (#411)
  * [Fix]remove dram docs and modify quick-start doc
  * modify index.md
  ---------
  Co-authored-by: t00939662 <tianxuehan@huawei.com>
* [Feature] Added performance testing tool based on the PyTest testing framework (#295)
  Performance testing tool based on the PyTest testing framework.
* [Misc] Add cpp-linter.yml (#422)
* [docs]add metrics doc (#416)
  * [docs]add metrics doc
  * modify metrics.md
  * modify metrics.md
  ---------
  Co-authored-by: t00939662 <tianxuehan@huawei.com>
* [perf] Modify CUDA SIMD and add Triton hash encoder (#408)
* fix cpp code style

---------

Co-authored-by: Lijiachen1018 <30387633+Lijiachen1018@users.noreply.github.com>
Co-authored-by: lijiachen19 <lijiachen19@huawei.com>
Co-authored-by: Mag1c.H <hemajun815@163.com>
Co-authored-by: Fang Run <Fang_Run@126.com>
Co-authored-by: MaxWang <wangwenxin21@huawei.com>
Co-authored-by: hero0307 <tianxuehan0307@163.com>
Co-authored-by: t00939662 <tianxuehan@huawei.com>
Co-authored-by: ML <85485147+Menglths@users.noreply.github.com>
Co-authored-by: ShiXiaolei <indirashi@163.com>

Clarence-1103 requested review from hek14, leideng, mag1c-h, pengwwang, saki-daisuki, summer-ai007, wangwenxin0312, wuhuxiao, xwLearnsLLM and ygwpz as code owners November 25, 2025 06:35

Clarence-1103 force-pushed the fix_kvcomp branch 3 times, most recently from 7ba9b69 to 030aeb6 Compare November 27, 2025 15:25

[perf] Modify CUDA SIMD and add Triton hash encoder

a19a5ab

Clarence-1103 force-pushed the fix_kvcomp branch from 030aeb6 to a19a5ab Compare November 27, 2025 15:29

hek14 approved these changes Nov 28, 2025

ucm/sparse/kvcomp/hash_encoder.py (review comment, outdated)

hek14 merged commit fdc31df into ModelEngine-Group:develop Nov 28, 2025
3 checks passed

sumingZero pushed a commit to sumingZero/unified-cache-management that referenced this pull request Nov 28, 2025

[perf] Modify CUDA SIMD and add Triton hash encoder (ModelEngine-Grou…

dc72c1d

…p#408)
