Newbie question: the documentation says torch 2.0 or above is required for best inference performance. Is that only about speed, or will it also affect the model's inference results? Thanks!
Mainly, only PyTorch 2.0 and later ship a Flash Attention implementation. It affects attention speed and GPU memory usage, but it does not change the results.
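For reference, a minimal sketch (assuming PyTorch >= 2.0 and a CUDA GPU with fp16 support; the tensor shapes are illustrative, not from this project) showing that the fused `scaled_dot_product_attention` path matches a naive attention implementation numerically, so only speed and memory differ:

```python
import torch
import torch.nn.functional as F

# Illustrative shapes: (batch, heads, seq_len, head_dim).
q = torch.randn(1, 8, 128, 64, device="cuda", dtype=torch.float16)
k = torch.randn(1, 8, 128, 64, device="cuda", dtype=torch.float16)
v = torch.randn(1, 8, 128, 64, device="cuda", dtype=torch.float16)

# Fused path: PyTorch 2.0+ dispatches to a Flash Attention /
# memory-efficient kernel when hardware and dtypes allow it.
out_fused = F.scaled_dot_product_attention(q, k, v)

# Naive reference implementation of the same computation.
scores = q @ k.transpose(-2, -1) / (64 ** 0.5)
out_naive = torch.softmax(scores, dim=-1) @ v

# Outputs agree up to floating-point tolerance.
print(torch.allclose(out_fused, out_naive, atol=1e-3))
```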