fix ZeroDivisionError in utils.bottleneck #11987
torch/utils/bottleneck/__main__.py
Outdated
pct_diff = (cuda_prof_exec_time - cpu_prof_exec_time) / cuda_prof_exec_time
if abs(pct_diff) > 0.05:
    print_autograd_prof_summary(autograd_prof_cpu, 'CPU', autograd_prof_sortby, autograd_prof_topk)
if cuda_prof_exec_time > 1e-6:
torch/utils/bottleneck/__main__.py
Outdated
if abs(pct_diff) > 0.05:
    print_autograd_prof_summary(autograd_prof_cpu, 'CPU', autograd_prof_sortby, autograd_prof_topk)
if cuda_prof_exec_time > 1e-6:
    pct_diff = (cuda_prof_exec_time - cpu_prof_exec_time) / cuda_prof_exec_time
torch/utils/bottleneck/__main__.py
Outdated
pct_diff = (cuda_prof_exec_time - cpu_prof_exec_time) / cuda_prof_exec_time
if abs(pct_diff) > 0.05:
    print_autograd_prof_summary(autograd_prof_cpu, 'CPU', autograd_prof_sortby, autograd_prof_topk)
if cuda_prof_exec_time > 0.0:
Please change the check to check if the profiling result is empty. Thank you for noticing the problem and submitting a fix, @egg-west!
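A minimal sketch of the kind of check requested here, assuming the profiler result exposes its events through a list-like `function_events` attribute and that each event carries a `cpu_time_total` field. `FakeEvent`, `FakeProfile`, `total_cpu_time`, and `cpu_diverges_from_cuda` are hypothetical stand-ins for illustration, not the code in `torch/utils/bottleneck/__main__.py`.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class FakeEvent:
    # Hypothetical stand-in for a single profiler event.
    cpu_time_total: float


@dataclass
class FakeProfile:
    # Hypothetical stand-in for a profiler result (assumed attribute name).
    function_events: List[FakeEvent] = field(default_factory=list)


def total_cpu_time(prof: FakeProfile) -> float:
    """Sum the time recorded across all events of one profiling run."""
    return sum(event.cpu_time_total for event in prof.function_events)


def cpu_diverges_from_cuda(cpu_prof: FakeProfile, cuda_prof: FakeProfile,
                           tolerance: float = 0.05) -> bool:
    """Return True when the two runs differ by more than `tolerance`."""
    # Guard on emptiness instead of a time threshold: with no events
    # there is nothing to compare, and the denominator would be zero.
    if not cpu_prof.function_events or not cuda_prof.function_events:
        return False
    # Assumes a non-empty result has a positive total time.
    cuda_time = total_cpu_time(cuda_prof)
    cpu_time = total_cpu_time(cpu_prof)
    return abs((cuda_time - cpu_time) / cuda_time) > tolerance
```

Checking for an empty result states the intent directly, whereas a cutoff such as `1e-6` or `0.0` on the measured time leaves the reader to infer which case is being excluded.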
torch/utils/bottleneck/__main__.py
Outdated
pct_diff = (cuda_prof_exec_time - cpu_prof_exec_time) / cuda_prof_exec_time
if abs(pct_diff) > 0.05:
    print_autograd_prof_summary(autograd_prof_cpu, 'CPU', autograd_prof_sortby, autograd_prof_topk)
if cuda_prof_exec_time > 0.0:
Please fix the lint, other than that, lgtm. Thanks @egg-west!
@apaszke PTAL.
@egg-west could you rebase this on top of master so that the circle-ci tests run? You can do that with:
A ZeroDivisionError occurs when `cuda_prof_exec_time` is small enough to be zero. This is normal for a project that does little CUDA work, or when the user's work was not actually moved to CUDA. Profiling the code in either situation raises the error.
Apply the review advice and fix a newly discovered problem.
Replace the check on `cuda_prof_exec_time` with a check on the number of CPU profile events.
Line 230: remove the whitespace from the blank line.
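The failure described in these commit messages can be reproduced in isolation. The sketch below is illustrative only: `relative_diff` and `safe_relative_diff` are hypothetical helper names, and returning `None` for the degenerate case is an assumption about how a caller might want to handle it, not what the pull request does.

```python
from typing import Optional


def relative_diff(cuda_time: float, cpu_time: float) -> float:
    # Parentheses make the whole difference the numerator.
    return (cuda_time - cpu_time) / cuda_time


def safe_relative_diff(cuda_time: float, cpu_time: float) -> Optional[float]:
    """Return None instead of raising when there is no CUDA time to divide by."""
    if cuda_time == 0:
        return None
    return relative_diff(cuda_time, cpu_time)


if __name__ == "__main__":
    try:
        # A run that recorded no time at all: the denominator is zero.
        relative_diff(0, 0)
    except ZeroDivisionError as exc:
        print("unguarded call raised:", exc)

    print("guarded call returned:", safe_relative_diff(0, 0))
```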
@zou3519 Thank you for all the help. I have rebased, but 2 tests are still failing.
soumith is landing this pull request. If you are a Facebook employee, you can view this diff on Phabricator.