Drop to float16 if bfloat16 is not supported #1901
Conversation
I personally prefer to ask users to explicitly specify dtype in this case, since otherwise it can silently affect the accuracy of the model. WDYT @zhuohan123 @simon-mo?
I think the fallback is fine if we explicitly print a warning?
+1. In this case a warning is fine. The accuracy difference between bfloat16 and float16 should not be too crazy.
@acebot712 You can simply do this: llm = LLM(model="mistralai/Mistral-7B-Instruct-v0.1", dtype="half")
I agree with @WoosukKwon. In general we should avoid doing any "magic" that can change the outputs, even if only slightly. I would suggest instead modifying the exception message to tell the user to set the dtype to float16 themselves.
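A minimal sketch of the exception-message approach suggested above (function name and message wording are illustrative, not vLLM's actual implementation):

```python
# Illustrative sketch only -- not vLLM's actual code. Raises an error whose
# message tells the user to pass dtype="float16" themselves, rather than
# silently changing the dtype behind their back.
import torch

def verify_bfloat16_support() -> None:
    """Raise a descriptive error if the current GPU cannot run bfloat16."""
    major, minor = torch.cuda.get_device_capability()
    if major < 8:
        raise ValueError(
            "bfloat16 is only supported on GPUs with compute capability "
            f">= 8.0; your GPU has compute capability {major}.{minor}. "
            "Please pass dtype='float16' (or dtype='half') instead."
        )
```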
We can check the device capability outside vLLM and choose the dtype depending on the device. The code could be:

import torch
from vllm import LLM

llm = LLM(
    model="mistralai/Mistral-7B-Instruct-v0.1",
    dtype="float16" if torch.cuda.get_device_capability()[0] < 8 else "bfloat16",
)
Hi @acebot712, thanks for bringing up this issue and submitting the PR! We've decided to keep the current behavior: to avoid silent accuracy changes, vLLM will ask users to set the dtype explicitly when bfloat16 is not supported.
As a beginner, I just ran into this little setback. In the end, of course, adding the parameter solved it: --dtype="half"
@chuanzhubin A better error message is indeed a great idea! Would you be interested in submitting a PR?
Issue: #1157
Instead of throwing an error when the GPU's compute capability does not support bfloat16, vLLM should print a warning and fall back to float16. This helps in Colab notebooks, where the T4 instance has compute capability 7.5 rather than 8.0, so vLLM does not work out of the box.
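A minimal sketch of the fallback proposed here (illustrative only; this is not the behavior vLLM ultimately adopted, see the discussion above):

```python
# Illustrative sketch of the proposed fallback: warn and drop to float16
# when the GPU cannot run bfloat16, instead of raising an error.
import warnings
import torch

def resolve_dtype(requested: torch.dtype) -> torch.dtype:
    if requested is torch.bfloat16 and torch.cuda.get_device_capability()[0] < 8:
        warnings.warn(
            "bfloat16 is not supported on GPUs with compute capability < 8.0; "
            "falling back to float16.",
            stacklevel=2,
        )
        return torch.float16
    return requested
```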