Fix mqa is false case in gpt_bigcode #806
Thanks for the fix! Left some small comments on style.
self.num_heads = \
    total_num_heads // self.tensor_model_parallel_world_size
Nit: Please use parentheses for line continuation, per the Google Python Style Guide.
Suggested change:
- self.num_heads = \
-     total_num_heads // self.tensor_model_parallel_world_size
+ self.num_heads = (
+     total_num_heads // self.tensor_model_parallel_world_size)
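For reference, a minimal standalone sketch of the parenthesized form. `heads_per_rank` and its `world_size` argument are hypothetical names used only for illustration, and the divisibility check is an assumption that is not part of this PR:

```python
def heads_per_rank(total_num_heads: int, world_size: int) -> int:
    """Hypothetical helper: number of attention heads owned by one rank."""
    # Assumption for this sketch: heads divide evenly across ranks.
    assert total_num_heads % world_size == 0

    # Parentheses give implicit line continuation, so no trailing
    # backslash is needed to wrap the expression.
    num_heads = (
        total_num_heads // world_size)
    return num_heads


print(heads_per_rank(48, 4))
```

Both spellings evaluate to the same value; the parenthesized one just avoids the backslash, which stops working if trailing whitespace ends up after it.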
num_heads = \
    total_num_heads // self.tensor_model_parallel_world_size
Ditto on line continuation
Fixed. Please take another look. @zhuohan123
LGTM! Thank you for your contribution!
vllm-project#806)

### What this PR does / why we need it?
1. Fix the V1 error found by [nightly_ci](https://github.com/vllm-project/vllm-ascend/actions/runs/14950004754/job/41998136610), broken by [[v1] Pass BlockTable and KVCacheSpec to AttentionMetadataBuilders vllm-project#17483](vllm-project#17483), and make the `InputBatch` parameter consistent with vllm.
2. Disable the benchmark and fix it upstream.

### Does this PR introduce _any_ user-facing change?
No

### How was this patch tested?
CI passed

---------

Signed-off-by: wangli <wangli858794774@gmail.com>
Signed-off-by: Yikun Jiang <yikunkero@gmail.com>
Co-authored-by: Yikun Jiang <yikunkero@gmail.com>
cc @zhuohan123