
Conversation

@wbn03 (Contributor) commented Dec 9, 2023

Replace the head_mapping param with num_kv_heads in the attention kernel.
Based on this issue:
https://github.com/vllm-project/vllm/issues/1928
This avoids loading head_mapping from global memory.
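
For context, here is a minimal sketch of the idea, assuming the usual grouped-query/multi-query layout in which consecutive query heads share one KV head. The names below are illustrative only and not the actual vLLM kernel code:

```cpp
// Sketch: instead of reading a per-head lookup table from global memory,
//   const int kv_head_idx = head_mapping[head_idx];   // global-memory load
// the KV head can be derived arithmetically from num_kv_heads alone,
// because the query-to-KV-head mapping is regular.
__device__ int kv_head_for(int head_idx, int num_heads, int num_kv_heads) {
  const int num_queries_per_kv = num_heads / num_kv_heads;
  return head_idx / num_queries_per_kv;  // pure integer arithmetic, no load
}
```

Passing num_kv_heads as a plain kernel argument means one less pointer to dereference per thread block, at the cost of an integer division that the compiler can often reduce when the group size is known.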

@Yard1 (Collaborator) left a comment

LGTM. cc @WoosukKwon

@WoosukKwon (Collaborator) commented Dec 10, 2023

@Yard1 It seems this exactly overlaps with #1994. I think we should add @zhaoyang-star as a co-author of this PR; let me do that after running this PR.

@WoosukKwon WoosukKwon linked an issue Dec 10, 2023 that may be closed by this pull request
@WoosukKwon WoosukKwon self-requested a review December 10, 2023 03:25
@Yard1 (Collaborator) commented Dec 10, 2023

@WoosukKwon Agreed, thanks!

@WoosukKwon WoosukKwon merged commit dacaf5a into vllm-project:main Dec 10, 2023
hongxiayang pushed a commit to hongxiayang/vllm that referenced this pull request Feb 13, 2024
…llm-project#1997)

Co-authored-by: wangguoya <wangguoya@baidu.com>
Co-authored-by: Yang Zhao <zhaoyangstar@foxmail.com>
Successfully merging this pull request may close these issues.

Why we need head_mapping as param pass to paged_attention kernel?