[Bug] output diff when temperature set zero #1688
Comments
diff first_output second_output -y -W 196
diff rate: 10/128 = 7.8%
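For anyone reproducing the number above, here is a minimal sketch of how such a diff rate could be computed. It assumes the rate is the count of differing lines divided by the total lines compared; the thread does not say exactly how 10 and 128 were counted, and `diff_rate` is a hypothetical helper, not part of LMDeploy.

```python
def diff_rate(first: str, second: str) -> float:
    """Fraction of line positions at which two outputs differ (assumed metric)."""
    first_lines = first.splitlines()
    second_lines = second.splitlines()
    total = max(len(first_lines), len(second_lines))
    if total == 0:
        return 0.0
    # Lines present in both outputs but with different text.
    differing = sum(1 for a, b in zip(first_lines, second_lines) if a != b)
    # Lines that exist in only one of the two outputs also count as differences.
    differing += abs(len(first_lines) - len(second_lines))
    return differing / total


# Toy inputs: one of four lines differs.
toy_rate = diff_rate("a\nb\nc\nd", "a\nx\nc\nd")

# The figure reported in this thread, as plain arithmetic.
reported_rate = 10 / 128  # 0.078125, i.e. about 7.8%
```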
You may need to set Also, because split-kv takes effect automatically, a variable batch size and sequence length at runtime may result in a different split-kv factor. This leads to a different accumulation order and thus a different outcome.
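The accumulation-order point can be illustrated with a toy example. This is only a sketch of floating-point non-associativity, not TurboMind's attention kernel: `split_sum` and `greedy_pick` are hypothetical helpers, and the input values are chosen deliberately to make the effect visible in double precision.

```python
def split_sum(values, num_splits):
    """Sum `values` in contiguous chunks, then add the partial sums together."""
    chunk = -(-len(values) // num_splits)  # ceiling division
    total = 0.0
    for start in range(0, len(values), chunk):
        partial = 0.0
        # Explicit loop: plain left-to-right floating-point accumulation.
        for v in values[start:start + chunk]:
            partial += v
        total += partial
    return total


def greedy_pick(logits):
    """Index of the largest logit, as greedy (temperature 0) decoding would pick."""
    return max(range(len(logits)), key=lambda i: logits[i])


# Same terms, different split factor, different rounding along the way.
terms = [1e16, 1.0, -1e16, 1.0]
one_split = split_sum(terms, 1)   # accumulated as a single chunk
two_splits = split_sum(terms, 2)  # accumulated as two chunks, then combined

# If the sum feeds a logit that competes with another token, the argmax can flip.
other_logit = 0.5
pick_one = greedy_pick([one_split, other_logit])
pick_two = greedy_pick([two_splits, other_logit])
```

In this toy case the two split factors give different sums, and the greedy choice changes with them. In a real model the differences are far smaller, but once one token differs, every later token is conditioned on a different prefix, so the outputs can drift apart even at temperature 0.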
In the lmdeploy/benchmark/profile_restful_api.py Line 226 in 21be189
lmdeploy/benchmark/profile_restful_api.py Line 67 in 21be189
Do I still need to set
In terms of design and implementation, do we have to ensure that, with temperature 0 under batch inference, the results of two identical requests are completely consistent?
TurboMind will support
Checklist
Describe the bug
I used the latest code from LMDeploy to run the vicuna 13b model. The client used temperature 0 and made two requests, resulting in some differences in the output.
Theoretically, when the temperature is 0, multiple requests should yield consistent results without any differences.
Reproduction
benchmark/profile_restful_api.py
Environment
Error traceback
No response