Fix test_ind_worker_queue by setting max_num_worker based on system resource #63779
@ejguan has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator. |
…on system resource" Fixes #63657 Differential Revision: [D30494185](https://our.internmc.facebook.com/intern/diff/D30494185) [ghstack-poisoned]
…on system resource" Fixes #63657 Prevent the test from freezing in CI. Confirmed that this test is flaky because of the limited number of CPUs per machine. The hanging workers were reproduced periodically with an ASAN build of PyTorch. After this PR, the number of workers is limited based on the OS. Differential Revision: [D30494185](https://our.internmc.facebook.com/intern/diff/D30494185) [ghstack-poisoned]
for batch_size in (8, 16, 32, 64):
    for num_workers in range(1, 6):
nit: min(6, max_num_workers)
Fixes #63657
Stack from ghstack:
Prevent the test from freezing in CI.
Confirmed that this test is flaky because of the limited number of CPUs per machine. The hanging workers were reproduced periodically with an ASAN build of PyTorch.
After this PR, the number of workers is limited based on the OS.
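For readers who want a concrete picture of limiting workers by system resource, here is a rough sketch. It is an assumption about the general approach, not the code in this PR: the function name `get_max_num_workers` and the fallback order (CPU affinity first, then `os.cpu_count()`, then a default of 1) are illustrative choices.

```python
import os


def get_max_num_workers(default=1):
    """Estimate how many worker processes this machine can reasonably run."""
    # On platforms that expose it (e.g. Linux), the affinity mask reflects
    # the CPUs this process may actually use, which can be fewer than the
    # machine total inside containers or CI runners.
    if hasattr(os, "sched_getaffinity"):
        try:
            return len(os.sched_getaffinity(0))
        except OSError:
            pass  # fall through to the portable query below
    # Portable fallback; os.cpu_count() may return None if undetermined.
    cpu_count = os.cpu_count()
    return cpu_count if cpu_count else default
```

A test could then cap its worker loop with this value, so that CI machines with few CPUs never spawn more workers than they can schedule.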
Differential Revision: D30494185
cc @ssnl @VitalyFedyunin @ejguan