@zhuohan123 zhuohan123 left a comment


Thanks for the fix! Left some small comments on style.

Comment on lines 55 to 56
self.num_heads = \
total_num_heads // self.tensor_model_parallel_world_size

Nit: Please use parentheses for line continuation, per the Google Python Style Guide.

Suggested change:
-        self.num_heads = \
-            total_num_heads // self.tensor_model_parallel_world_size
+        self.num_heads = (
+            total_num_heads // self.tensor_model_parallel_world_size)
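For context, a minimal runnable sketch of the parenthesized continuation style being suggested. The concrete values below are illustrative, not taken from the PR:

```python
# Backslash continuation (discouraged by the Google Python Style Guide):
#     num_heads = \
#         total_num_heads // tensor_model_parallel_world_size
#
# Parenthesized continuation (preferred) -- the parentheses make the
# statement an implicit line-joining construct, so no backslash is needed:
total_num_heads = 32                     # illustrative value
tensor_model_parallel_world_size = 4     # illustrative value
num_heads = (
    total_num_heads // tensor_model_parallel_world_size)
print(num_heads)  # 8
```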

Comment on lines 291 to 292
num_heads = \
total_num_heads // self.tensor_model_parallel_world_size

Ditto on line continuation

@zhaoyang-star (Contributor, Author)

> Thanks for the fix! Left some small comments on style.

Fixed. Please take another look, @zhuohan123.
A PR for QKV GEMM fusion in gpt_bigcode will be submitted later.
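The QKV GEMM fusion mentioned above generally means concatenating the Q, K, and V projection weights so that a single matmul replaces three separate ones. A minimal pure-Python sketch under illustrative shapes and values (none of these names or numbers come from the gpt_bigcode code):

```python
def matmul(a, b):
    """Naive matrix multiply for small illustrative matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

hidden = 2
x = [[1.0, 2.0]]                   # one token, hidden size 2

# Three separate projection weights, each (hidden, hidden):
wq = [[1.0, 0.0], [0.0, 1.0]]
wk = [[2.0, 0.0], [0.0, 2.0]]
wv = [[3.0, 0.0], [0.0, 3.0]]

# Fused weight: concatenate along the output dim -> (hidden, 3 * hidden),
# so one GEMM produces Q, K, and V at once.
w_qkv = [rq + rk + rv for rq, rk, rv in zip(wq, wk, wv)]

qkv = matmul(x, w_qkv)[0]          # single GEMM instead of three
q, k, v = qkv[:hidden], qkv[hidden:2 * hidden], qkv[2 * hidden:]
print(q, k, v)  # [1.0, 2.0] [2.0, 4.0] [3.0, 6.0]
```

The split at the end recovers the same Q, K, V that three separate matmuls would produce, which is why the fusion is behavior-preserving.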


@zhuohan123 zhuohan123 left a comment


LGTM! Thank you for your contribution!

@zhuohan123 zhuohan123 merged commit 4f85847 into vllm-project:main Aug 22, 2023
@zhaoyang-star zhaoyang-star deleted the fix_gpt_bigcode branch August 23, 2023 00:59
randxie pushed a commit to randxie/vllm that referenced this pull request Aug 29, 2023
hongxiayang pushed a commit to hongxiayang/vllm that referenced this pull request Feb 13, 2024
amy-why-3459 pushed a commit to amy-why-3459/vllm that referenced this pull request Sep 15, 2025 (vllm-project#806)

### What this PR does / why we need it?

1. Fix a V1 error found by
[nightly_ci](https://github.com/vllm-project/vllm-ascend/actions/runs/14950004754/job/41998136610),
broken by [[v1] Pass BlockTable and KVCacheSpec to
AttentionMetadataBuilders
vllm-project#17483](vllm-project#17483), by making the
`InputBatch` parameters consistent with vllm.
2. Disable the benchmark and fix it upstream.

### Does this PR introduce _any_ user-facing change?

No


### How was this patch tested?

CI passed

---------

Signed-off-by: wangli <wangli858794774@gmail.com>
Signed-off-by: Yikun Jiang <yikunkero@gmail.com>
Co-authored-by: Yikun Jiang <yikunkero@gmail.com>