Fix gibberish outputs of GPT-BigCode-based models #676

HermitSun · 2023-08-04T11:49:22Z

As mentioned in issue #675, GPT-BigCode-based models produce gibberish outputs.

This is caused by a small mistake in calculating the number of attention heads. If the model is not a Falcon model with the new decoder architecture and the multi_query option is turned on, we should return 1 rather than the default total_num_attention_heads // parallel_config.tensor_parallel_size.
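
To make the condition concrete, here is a minimal standalone sketch of the head-count rule described above. It is not the actual patch: the function name get_num_heads, the config field names model_type, new_decoder_architecture and num_attention_heads, and the use of SimpleNamespace as a stand-in config object are assumptions made for illustration. Only the multi_query option and the default expression come from the description.

```python
from types import SimpleNamespace


def get_num_heads(hf_config, tensor_parallel_size):
    """Illustrative head-count rule; names are assumptions, not the real API."""
    # Assumed check for a Falcon model that uses the new decoder architecture.
    new_decoder_arch_falcon = (
        getattr(hf_config, "model_type", None) == "falcon"
        and getattr(hf_config, "new_decoder_architecture", False)
    )
    if not new_decoder_arch_falcon and getattr(hf_config, "multi_query", False):
        # Multi-query case: return 1, independent of the tensor parallel size.
        return 1
    # Default: divide the total number of attention heads across the ranks.
    return hf_config.num_attention_heads // tensor_parallel_size


# Hypothetical configs, only to exercise both branches.
bigcode_like = SimpleNamespace(
    model_type="gpt_bigcode", multi_query=True, num_attention_heads=48
)
falcon_new_arch = SimpleNamespace(
    model_type="falcon",
    new_decoder_architecture=True,
    multi_query=True,
    num_attention_heads=128,
)

print(get_num_heads(bigcode_like, tensor_parallel_size=2))
print(get_num_heads(falcon_new_arch, tensor_parallel_size=4))
```

With these made-up configs, the GPT-BigCode-like model takes the multi_query branch and gets 1, while the new-architecture Falcon model falls through to the default division.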

After applying this patch, I believe at least the following models will work normally (tested on A100 with CUDA 11.8):

WizardLM/WizardCoder-15B-V1.0
openchat/opencoderplus
bigcode/starcoder

zhuohan123

Thanks for catching this bug! LGTM!

fix: incorrect bigcode attention heads num

83756fb

zhuohan123 approved these changes Aug 4, 2023

zhuohan123 merged commit 621980b into vllm-project:main Aug 4, 2023

HermitSun mentioned this pull request Aug 5, 2023

Starcoder output is noise (output gibberish) ( not use model.safetensors) #665

Closed

zhuohan123 mentioned this pull request Aug 7, 2023

GPTBigCode models generate noise results #675

Closed

aranku mentioned this pull request Aug 14, 2023

decoder may have some bug #733

Closed

HermitSun mentioned this pull request Aug 15, 2023

WizardCoder 15B not given proper output #761

Closed

hongxiayang pushed a commit to hongxiayang/vllm that referenced this pull request Feb 13, 2024

fix: incorrect bigcode attention heads num (vllm-project#676)

d3fdc29
