When using greedy search (`do_sample=False`) and `dtype=fp32`, the generated tokens are not shown in the output of the query. I believe the text generation is happening, because different values for `max_new_tokens` lead to different runtimes for the query. See this notebook as a minimal example.
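For context, the behavior described above is consistent with generation actually running: greedy search picks the single highest-probability token at every step, so the decoding loop always executes `max_new_tokens` iterations and runtime grows with it even if the decoded text is later dropped. A minimal sketch of that loop (using a stubbed toy "model" rather than Bloom or the MII API):

```python
import numpy as np

def greedy_decode(next_token_logits, prompt, max_new_tokens):
    """Toy illustration of greedy search (do_sample=False): at each
    step the argmax token is appended, deterministically, so the loop
    runs exactly max_new_tokens times -- which is why runtime scales
    with max_new_tokens even when no output text is displayed."""
    tokens = list(prompt)
    for _ in range(max_new_tokens):
        logits = next_token_logits(tokens)      # model forward pass (stubbed here)
        tokens.append(int(np.argmax(logits)))   # no sampling: pick the top token
    return tokens

# Hypothetical stand-in model: next token is (sum of tokens so far) mod 5,
# encoded as a one-hot logits vector over a 5-token vocabulary.
demo = lambda toks: np.eye(5)[sum(toks) % 5]
print(greedy_decode(demo, [1, 2], 3))  # → [1, 2, 3, 1, 2]
```

This is only an illustration of the decoding strategy, not of the MII/DeepSpeed-Inference internals; `next_token_logits` and `demo` are hypothetical stand-ins for the real model forward pass.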
We don't currently support fp32 for the Bloom models in MII & DeepSpeed-Inference. I believe this is because the checkpoints are all in half precision. We correctly check the configs for the Bloom-176B model, but fail to do so for the smaller variants. I added a fix for this in #107.
I just ran your example using fp16 and I see output.