[ROCm] Enable deterministic rocBLAS mode #48654
LGTM
What if I call rocBLAS with `deterministic=True`, then set it to `False`, and call rocBLAS again? Would the performance of the second rocBLAS call be hurt by this?
This handle is not just used once and discarded. It is returned to the pool when the operation finishes, so `rocblas_atomics_not_allowed` will remain set on the handle until the end of the PyTorch process.
Thanks @zasdfgbnm for the catch. If the global setting is toggled after the handle has been returned to the pool, the handle would still be in no-atomics mode the next time it is used. I just added a check that queries the global setting and sets the mode each time the handle is used.
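For readers following the exchange above, here is a minimal pure-Python sketch of the stale-handle problem and the fix. It does not call rocBLAS or PyTorch: `FakeHandle`, `HandlePool`, `acquire_stale`, and `acquire` are hypothetical names invented for illustration, and the default mode of a new handle is an assumption. It only models the idea that the mode is stored on a pooled handle and must be refreshed on every use.

```python
from enum import Enum


class AtomicsMode(Enum):
    # Mirrors the two modes discussed in the review; names are illustrative.
    NOT_ALLOWED = 0
    ALLOWED = 1


class FakeHandle:
    """Hypothetical stand-in for a pooled BLAS handle; tracks only the mode."""

    def __init__(self):
        # Assumption: a freshly created handle allows atomics.
        self.atomics_mode = AtomicsMode.ALLOWED


class HandlePool:
    """Hypothetical pool that reuses handles for the life of the process."""

    def __init__(self):
        self._free = []
        self.deterministic = False  # stand-in for the global setting

    def _take(self):
        return self._free.pop() if self._free else FakeHandle()

    def acquire_stale(self):
        # Problem case: atomics are switched off when the flag is set,
        # but never switched back on when the flag is cleared.
        handle = self._take()
        if self.deterministic:
            handle.atomics_mode = AtomicsMode.NOT_ALLOWED
        return handle

    def acquire(self):
        # Fixed case: the mode is derived from the flag on every acquisition.
        handle = self._take()
        handle.atomics_mode = (
            AtomicsMode.NOT_ALLOWED if self.deterministic else AtomicsMode.ALLOWED
        )
        return handle

    def release(self, handle):
        self._free.append(handle)


pool = HandlePool()

# First call runs in deterministic mode, then the handle goes back to the pool.
pool.deterministic = True
first = pool.acquire_stale()
pool.release(first)

# The flag is toggled off, but the reused handle keeps the old mode.
pool.deterministic = False
stale = pool.acquire_stale()
stale_mode = stale.atomics_mode
pool.release(stale)

# With the per-use check, the same handle picks up the current flag.
fresh = pool.acquire()
fresh_mode = fresh.atomics_mode
```

In this model the second acquisition returns the same object as the first, which is why the mode has to be set in both directions rather than only when determinism is requested.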
@ezyang has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.
Summary: The PR adds a feature to disable atomics in rocBLAS calls, thereby making the output deterministic when that is expected in PyTorch. This mode of rocBLAS can be enabled with the global setting `torch.set_deterministic(True)`.
cc: ezyang jeffdaily sunway513
Pull Request resolved: pytorch#48654
Reviewed By: bdhirsh
Differential Revision: D25272296
Pulled By: ezyang
fbshipit-source-id: 70400572b0ab37c6db52636584de0ae61bb5270a
The PR adds a feature to disable atomics in rocBLAS calls, thereby making the output deterministic when that is expected in PyTorch. This mode of rocBLAS can be enabled with the global setting `torch.set_deterministic(True)`.
cc: @ezyang @jeffdaily @sunway513
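As background on why disabling atomics can make results reproducible: atomic accumulation does not fix the order in which partial results are added, and floating-point addition is not associative, so a different order can give a slightly different sum. The sketch below shows only that arithmetic effect in plain Python; it does not use rocBLAS or a GPU, and `sum_in_order` is a hypothetical helper written for this illustration.

```python
def sum_in_order(values, order):
    """Accumulate values in the given index order, as one possible schedule."""
    total = 0.0
    for i in order:
        total += values[i]
    return total


values = [0.1, 0.2, 0.3]

# Two accumulation orders over the same inputs.
forward = sum_in_order(values, [0, 1, 2])
backward = sum_in_order(values, [2, 1, 0])

# The results agree to within rounding error but are not bitwise equal.
print(forward == backward)
print(abs(forward - backward))
```

A fixed accumulation order removes this run-to-run variation, typically at some cost in performance.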