@zhuohan123 zhuohan123 left a comment


Thanks for the fix! Left some small comments on style.

Comment on lines 55 to 56
self.num_heads = \
total_num_heads // self.tensor_model_parallel_world_size

Nit: Please use parentheses for line continuation, per the Google Python Style Guide.

Suggested change:
-        self.num_heads = \
-            total_num_heads // self.tensor_model_parallel_world_size
+        self.num_heads = (
+            total_num_heads // self.tensor_model_parallel_world_size)
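For context, a minimal runnable sketch of the parenthesized continuation style being suggested. The concrete values below are illustrative, not taken from the PR:

```python
# Backslash continuation (discouraged by the Google Python Style Guide):
#     num_heads = \
#         total_num_heads // tensor_model_parallel_world_size
#
# Parenthesized continuation (preferred) -- the parentheses make the
# statement an implicit line-joining construct, so no backslash is needed:
total_num_heads = 32                     # illustrative value
tensor_model_parallel_world_size = 4     # illustrative value
num_heads = (
    total_num_heads // tensor_model_parallel_world_size)
print(num_heads)  # 8
```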

Comment on lines 291 to 292
num_heads = \
total_num_heads // self.tensor_model_parallel_world_size

Ditto on line continuation

@zhaoyang-star (Contributor, Author)

> Thanks for the fix! Left some small comments on style.

Fixed. Please take another look, @zhuohan123.
A PR for QKV GEMM fusion in gpt_bigcode will be submitted later.
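The QKV GEMM fusion mentioned above generally means concatenating the Q, K, and V projection weights so that a single matmul replaces three separate ones. A minimal pure-Python sketch under illustrative shapes and values (none of these names or numbers come from the gpt_bigcode code):

```python
def matmul(a, b):
    """Naive matrix multiply for small illustrative matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

hidden = 2
x = [[1.0, 2.0]]                   # one token, hidden size 2

# Three separate projection weights, each (hidden, hidden):
wq = [[1.0, 0.0], [0.0, 1.0]]
wk = [[2.0, 0.0], [0.0, 2.0]]
wv = [[3.0, 0.0], [0.0, 3.0]]

# Fused weight: concatenate along the output dim -> (hidden, 3 * hidden),
# so one GEMM produces Q, K, and V at once.
w_qkv = [rq + rk + rv for rq, rk, rv in zip(wq, wk, wv)]

qkv = matmul(x, w_qkv)[0]          # single GEMM instead of three
q, k, v = qkv[:hidden], qkv[hidden:2 * hidden], qkv[2 * hidden:]
print(q, k, v)  # [1.0, 2.0] [2.0, 4.0] [3.0, 6.0]
```

The split at the end recovers the same Q, K, V that three separate matmuls would produce, which is why the fusion is behavior-preserving.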


@zhuohan123 zhuohan123 left a comment


LGTM! Thank you for your contribution!

@zhuohan123 zhuohan123 merged commit 4f85847 into vllm-project:main Aug 22, 2023
@zhaoyang-star zhaoyang-star deleted the fix_gpt_bigcode branch August 23, 2023 00:59
randxie pushed a commit to randxie/vllm that referenced this pull request Aug 29, 2023
hongxiayang pushed a commit to hongxiayang/vllm that referenced this pull request Feb 13, 2024
amy-why-3459 pushed a commit to amy-why-3459/vllm that referenced this pull request Sep 15, 2025 (vllm-project#806)

### What this PR does / why we need it?

1. Fix a V1 error found by
[nightly_ci](https://github.com/vllm-project/vllm-ascend/actions/runs/14950004754/job/41998136610),
broken by [[v1] Pass BlockTable and KVCacheSpec to
AttentionMetadataBuilders
vllm-project#17483](vllm-project#17483), by making the
`InputBatch` parameters consistent with vllm.
2. Disable the benchmark and fix it upstream.

### Does this PR introduce _any_ user-facing change?

No


### How was this patch tested?

CI passed

---------

Signed-off-by: wangli <wangli858794774@gmail.com>
Signed-off-by: Yikun Jiang <yikunkero@gmail.com>
Co-authored-by: Yikun Jiang <yikunkero@gmail.com>