Autograd engine, only enqueue task when it is fully initialized #50164

Closed
wants to merge 1 commit into from

Conversation

albanD
Collaborator

@albanD albanD commented Jan 6, 2021

This solves a race condition where the worker thread might see a partially initialized graph_task.

Fixes #49652

I don't know how to reliably trigger the race, so I didn't add a test. However, the ROCm build flakiness (the race just happens to occur more often on ROCm builds) should disappear after this PR.
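
For readers unfamiliar with this kind of race, here is a minimal sketch of the "finish initialization, then enqueue" ordering the PR title describes. It is not the engine's actual code: `ToyGraphTask`, `ToyReadyQueue`, and `run_fully_initialized_task` are hypothetical names made up for illustration, and the map lookup with `.at()` is only an assumed example of how a worker could hit `std::out_of_range` if it saw a task before its fields were filled in.

```cpp
#include <condition_variable>
#include <memory>
#include <mutex>
#include <queue>
#include <thread>
#include <unordered_map>
#include <utility>

// Hypothetical stand-in for the engine's graph_task; not PyTorch's real type.
struct ToyGraphTask {
  std::unordered_map<int, int> dependencies;  // filled in during initialization
  int result = 0;                             // written by the worker
};

// Minimal blocking queue shared by the producer and a single worker thread.
class ToyReadyQueue {
 public:
  void push(std::shared_ptr<ToyGraphTask> task) {
    {
      std::lock_guard<std::mutex> lock(mutex_);
      tasks_.push(std::move(task));
    }
    cv_.notify_one();
  }

  std::shared_ptr<ToyGraphTask> pop() {
    std::unique_lock<std::mutex> lock(mutex_);
    cv_.wait(lock, [this] { return !tasks_.empty(); });
    auto task = std::move(tasks_.front());
    tasks_.pop();
    return task;
  }

 private:
  std::mutex mutex_;
  std::condition_variable cv_;
  std::queue<std::shared_ptr<ToyGraphTask>> tasks_;
};

// Runs one task through a worker thread and returns what the worker computed.
int run_fully_initialized_task() {
  ToyReadyQueue queue;
  auto task = std::make_shared<ToyGraphTask>();

  std::thread worker([&queue] {
    auto t = queue.pop();
    // .at() throws std::out_of_range if an entry is missing, which is what a
    // worker would hit if the task had been enqueued before it was filled in.
    t->result = t->dependencies.at(0) + t->dependencies.at(1);
  });

  // Step 1: finish every piece of initialization first.
  task->dependencies[0] = 1;
  task->dependencies[1] = 2;

  // Step 2: only now make the task visible to the worker. The queue's mutex
  // orders the writes above before the worker's reads.
  queue.push(task);

  worker.join();
  return task->result;
}
```

If the `push` call were moved above the two `dependencies` writes, the worker could wake up and read the map while it is still being populated, which is the shape of race described above.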

@facebook-github-bot
Contributor

facebook-github-bot commented Jan 6, 2021

💊 CI failures summary and remediations

As of commit 3523d50 (more details on the Dr. CI page):


💚 💚 Looks good so far! There are no failures yet. 💚 💚


This comment has been revised 3 times.

@codecov

codecov bot commented Jan 7, 2021

Codecov Report

Merging #50164 (3523d50) into master (e4d596c) will decrease coverage by 0.00%.
The diff coverage is 100.00%.

@@            Coverage Diff             @@
##           master   #50164      +/-   ##
==========================================
- Coverage   80.66%   80.66%   -0.01%     
==========================================
  Files        1899     1899              
  Lines      206066   206067       +1     
==========================================
- Hits       166224   166223       -1     
- Misses      39842    39844       +2     

Contributor

@facebook-github-bot facebook-github-bot left a comment

@albanD has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

@facebook-github-bot
Contributor

@albanD merged this pull request in fc2ead0.

hwangdeyu pushed a commit to hwangdeyu/pytorch that referenced this pull request Jan 14, 2021
…rch#50164)

Summary:
This solves a race condition where the worker thread might
see a partially initialized graph_task

Fixes pytorch#49652

I don't know how to reliably trigger the race, so I didn't add a test. However, the ROCm build flakiness (the race just happens to occur more often on ROCm builds) should disappear after this PR.

Pull Request resolved: pytorch#50164

Reviewed By: zou3519

Differential Revision: D25824954

Pulled By: albanD

fbshipit-source-id: 6a3391753cb2afd2ab415d3fb2071a837cc565bb
Successfully merging this pull request may close these issues.

ROCm CI is intermittently failing with std::out_of_range
3 participants