Optimize softmax warp impl #4977
Merged
jackalcooper force-pushed the dev_warp_softmax branch from 2b08c61 to 1c8ac50 on June 2, 2021 07:36
jackalcooper force-pushed the dev_warp_softmax branch from 1c8ac50 to 2b08c61 on June 2, 2021 10:47
…flow into dev_warp_softmax
guo-ran requested review from oneflow-ci-bot and removed request for oneflow-ci-bot and lixinqi on June 18, 2021 04:57
liujuncheng approved these changes on Jun 19, 2021
…dev_warp_softmax
…flow into dev_warp_softmax
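When col_size is small, a row is no longer processed by one full warp. Instead, depending on col_size, a group of threads (a warp, half warp, quarter warp, and so on) processes 1 or 2 rows, which further improves performance.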
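The selection described above could look roughly like the following host-side sketch. It is not taken from this PR: the names `ThreadGroupPlan` and `ChooseThreadGroupPlan` are hypothetical, the group width is assumed to be the smallest power of two that covers col_size (capped at the 32-thread warp), and the rule for picking 2 rows (sub-warp group and an even row count) is an assumption, since the description only says that 1 or 2 rows are processed.

```cpp
#include <cstdint>

// Number of threads in a CUDA warp.
constexpr int kWarpSize = 32;

// Hypothetical plan describing how one row of the softmax input is mapped
// onto threads. None of these names come from the PR itself.
struct ThreadGroupPlan {
  int thread_group_width;   // threads cooperating on one row: 1, 2, 4, ..., 32
  int rows_per_thread;      // rows each thread group handles per pass: 1 or 2
  int64_t cols_per_thread;  // elements of a row owned by each thread
};

// Assumption: use the smallest power-of-two group that covers col_size,
// capped at a full warp, so short rows do not leave most lanes idle.
// Assumption: handle 2 rows at once only for sub-warp groups and only when
// the row count is even, so no thread group is left with half of a pair.
inline ThreadGroupPlan ChooseThreadGroupPlan(int64_t num_rows, int64_t col_size) {
  int width = 1;
  while (width < kWarpSize && width < col_size) { width *= 2; }
  const int rows_per_thread = (width < kWarpSize && num_rows % 2 == 0) ? 2 : 1;
  // Ceiling division: each thread covers this many columns of its row.
  const int64_t cols_per_thread = (col_size + width - 1) / width;
  return ThreadGroupPlan{width, rows_per_thread, cols_per_thread};
}
```

With these assumptions, a 128 x 16 input would get a half warp (16 threads) per row with 2 rows per pass, while a 64 x 1024 input would fall back to a full warp per row, each thread covering 32 columns. In a real kernel the returned width would typically select a template instantiation, and the row-wise max and sum reductions would be limited to lanes within the same thread group.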