
Fix a bug in DeepSpeedMLP #4389

Merged
merged 4 commits into from Oct 4, 2023
Conversation

@sakogan (Contributor) commented Sep 22, 2023

The initialization of data_type in DeepSpeedMLP appears to be incorrect when int8 inference is requested, leading to garbled text generation.

For instance, running the following command (using inference-test.py from the DeepSpeedExamples repo)

deepspeed --num_gpus 1 inference/huggingface/text-generation/inference-test.py --model bigscience/bloom-7b1 --batch_size 1 --use_kernel --use_meta_tensor --dtype int8

produces the following output:

in=DeepSpeed is a machine learning framework
out=DeepSpeed is a machine learning framework 6 10 tràm kú Tá aixướm de de截útil恩opció 15 Beraz Tá P L Tá 7 80pata terbakar aixjai尼 Tá l'Oficina de al 3'hi aix-Els L Tá T Haut-Commissariat Tá 80juntament de de a 20 10 kú

The proposed fix adjusts the data_type initialization that was changed in #3425.
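To illustrate the class of bug being fixed: with weight-only int8 quantization, only the weights are stored as int8, while activations and kernel compute still run in fp16. If the MLP's activation dtype is initialized from the requested int8 dtype, the kernels misinterpret the fp16 activation buffers, which matches the garbled output above. This is a minimal, hypothetical sketch of that dtype-selection logic; the function names and dtype strings are illustrative and are not DeepSpeed's actual code.

```python
# Hypothetical sketch of the data_type selection bug (illustrative names,
# not DeepSpeed's actual implementation).

def mlp_data_type_buggy(config_dtype: str) -> str:
    # Buggy behavior: an int8 request propagates int8 as the
    # activation/compute dtype, so kernels read fp16 buffers as int8.
    return config_dtype

def mlp_data_type_fixed(config_dtype: str) -> str:
    # Fixed behavior: int8 applies to weights only; the activation/compute
    # dtype falls back to fp16 when int8 inference is requested.
    return "fp16" if config_dtype == "int8" else config_dtype

# With --dtype int8, activations should still be processed in fp16.
assert mlp_data_type_buggy("int8") == "int8"   # wrong compute dtype
assert mlp_data_type_fixed("int8") == "fp16"   # correct compute dtype
assert mlp_data_type_fixed("fp16") == "fp16"   # fp16 path unchanged
```

The fp16 path is unaffected either way, which is consistent with the report that --dtype float16 produced valid output both with and without the fix.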

@loadams (Contributor) commented Sep 28, 2023

@sakogan - when you tested this with the DeepSpeedExamples change, did it produce quality output?

@sakogan (Contributor, Author) commented Sep 28, 2023

@loadams Not sure what DeepSpeedExamples change you are referring to. I tested it with the latest version of the DeepSpeedExamples master branch. With the proposed fix, the output is valid (as it also is when using --dtype float16 in that command, with or without the fix).

@loadams (Contributor) commented Sep 28, 2023

> @loadams Not sure what DeepSpeedExamples change you are referring. I tested it with the latest version of the DeepSpeedExamples master branch. With the proposed fix, the output is valid (as well as when using --dtype float16 in that command, with or without the fix)

@sakogan - apologies, I meant to ask whether you tested this change against the error repro you had in DeepSpeedExamples. But it sounds like you did, thanks!

@loadams loadams added this pull request to the merge queue Oct 4, 2023
Merged via the queue into microsoft:master with commit 7099f99 Oct 4, 2023
15 checks passed
mauryaavinash95 pushed a commit to mauryaavinash95/DeepSpeed that referenced this pull request Oct 9, 2023
Co-authored-by: Logan Adams <114770087+loadams@users.noreply.github.com>
2 participants