New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
add prefetch_factor for multiprocessing prefetching process #41130
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks ok to me (just small edit in the doc)
Will wait for @ssnl's approval.
Can we land this since another reviewer is not responding to this? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Given #13023 it looks good on his side.
Just a small update on the doc phrasing and it will be good.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
BC breaking by adding positional arguments in the middle of a signature
Hi, I don't want to sound critical. But it is less than 24 hours from opening this PR when you commented. Various people have different responsibilities that may prevent them from responding quickly (personally I am in the middle of planning a move). I think it would be better if we are a bit more patient to get more thorough reviews than to rush a patch in. Thanks for understanding! |
Re: the patch Maybe we can make this a keyword only argument? We weren't able to for a lot of these args because we supported py2. But now we can! Also there should be an error if this is set when |
I am so sorry, I was mentioned reviewers randomly and thinking that you maybe were busying in doing other important stuff and didn't mention this issue. I will take care of this next time. |
Am working on this to create a new commit. |
💊 CI failures summary and remediationsAs of commit c6d6779 (more details on the Dr. CI page): 💚 💚 Looks good so far! There are no failures yet. 💚 💚 This comment was automatically generated by Dr. CI (expand for details).Follow this link to opt-out of these comments for your Pull Requests.Please report bugs/suggestions on the GitHub issue tracker or post in the (internal) Dr. CI Users group. This comment has been revised 34 times. |
No worries! Thanks for contributing! :D |
I have turned the prefetch factor to keyword-only parameter. Do we really need to raise an error? Please let me know if I wrongly understand the code. Thanks! @ssnl |
@yl-to Yes we still want to raise an error because setting |
@ssnl how could we know if the prefetch factor is default or set by user? Is setting the prefetch_factor default value to |
Just compare against the default value `2`. Similar pattern can be found at
DataLoader.__init__
…On Fri, Jul 10, 2020 at 4:33 PM yl-to ***@***.***> wrote:
@yl-to <https://github.com/yl-to> Yes we still want to raise an error
because setting prefetch_factor has no effect when num_workers = 0 so we
should tell users that they shouldn't set it with num_workers=0. You can
see similar arg check code in DataLoader.__init__.
Last question, how could we know if the prefetch factor is default or set
by user? Is setting the prefetch_factor default value to None initially a
good idea here?
—
You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
<#41130 (comment)>,
or unsubscribe
<https://github.com/notifications/unsubscribe-auth/ABLJMZP7ZBRCHX4KL6JY7D3R253K7ANCNFSM4OUZTAAA>
.
|
Thanks! Will create a new commit. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also need some tests in test_dataloader.py, and revert the submodule change.
Also remember to update the signature here Lines 22 to 25 in 67a4f37
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
still need some test, and revert the submodule changes
@ssnl, just to clarify 2 things:
Sorry I am really new to opensource development. |
Re test: For the following, just adding extra pytorch/test/test_dataloader.py Lines 1271 to 1281 in 5f146a4
For the following, it would be good to add a pytorch/test/test_dataloader.py Lines 1057 to 1091 in 5f146a4
pytorch/test/test_dataloader.py Lines 1113 to 1139 in 5f146a4
pytorch/test/test_dataloader.py Lines 1148 to 1175 in 5f146a4
|
For submodule change, you can see it if you click on the "Files Changed" tab on this webpage. |
e8d0f6c
to
0254129
Compare
@ssnl Hi Tongzhou, tests added and rebased the branch. Please help to review if you have time. Thanks! |
One minor thing! Looks great otherwise. Thanks! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks a lot! Waiting on CI.
Seems the CI has been passed, please help to land it when you have time, thanks! @ssnl |
@ssnl Hi Tongzhou, sorry for the bothering, is there any signal or prompt for a merging commit? Thanks! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks!
My bad I missed the notification from Simon's accept.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@albanD has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.
fix #40604
Add parameter to Dataloader to configure the per-worker prefetch number.
Before this edit, the prefetch process always prefetch 2 * num_workers data items, this commit help us make this configurable, e.x. you can specify to prefetch 10 * num_workers data items.