Update Dataloader with default parameter device #65402
Conversation
Hi @jeejakp12! Thank you for your pull request and welcome to our community.

Action Required: In order to merge any pull request (code, docs, etc.), we require contributors to sign our Contributor License Agreement, and we don't seem to have one on file for you.

Process: In order for us to review and merge your suggested changes, please sign at https://code.facebook.com/cla. If you are contributing on behalf of someone else (e.g. your employer), the individual CLA may not be sufficient and your employer may need to sign the corporate CLA.

Once the CLA is signed, our tooling will perform checks and validations. Afterwards, the pull request will be tagged with If you have received this in error or have any questions, please contact us at cla@fb.com. Thanks!
💊 CI failures summary and remediations

As of commit c955627 (more details on the Dr. CI page):

💚 💚 Looks good so far! There are no failures yet. 💚 💚

This comment was automatically generated by Dr. CI. Please report bugs/suggestions to the (internal) Dr. CI Users group.
Force-pushed from aa4abdc to a568e8c (Compare)
@@ -45,18 +45,30 @@ def _pin_memory_loop(in_queue, out_queue, device_id, done_event):
         del r  # save memory


-def pin_memory(data):
+def pin_memory(data, device=""):
is device="" an idiom that we use in other places? or is device=None better?
Should be device=None, as it is more Pythonic.
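The idiom the reviewers are suggesting can be sketched as follows (a hypothetical minimal function, not the actual torch code):

```python
def pin_memory(data, device=None):
    # Hypothetical sketch: None is the idiomatic "not specified" sentinel,
    # so callers never need to know about a magic empty-string value.
    if device is None:
        return f"pinned {data!r} on the default device"
    return f"pinned {data!r} on {device}"
```

With `None` as the default, the "device given?" check is `device is None`, which works for any device type (string, `torch.device`, etc.), whereas `len(device) == 0` assumes the argument is a string.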
        if (len(device) == 0):
            return {k: pin_memory(sample) for k, sample in data.items()}
        else:
            return {k: pin_memory(sample, device) for k, sample in data.items()}
@VitalyFedyunin is it preferable to check device==None here and omit it from the call to pin_memory? Or should we push the device=None check into pin_memory and always pass the device parameter (and simplify this code)?
You can always pass device=device in recurrent calls, without additional checks.
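The reviewer's suggestion (resolve the None case in one place and always forward device=device in recursive calls) can be sketched like this; `_pin` and the container handling below are hypothetical stand-ins, not the real PyTorch implementation:

```python
def _pin(x, device):
    # Hypothetical stand-in for Tensor.pin_memory(); records the target device.
    return (x, device if device is not None else "default")

def pin_memory(data, device=None):
    # device=None is handled once at the leaf (_pin), so every recursive
    # call simply forwards device=device with no branching at call sites.
    if isinstance(data, dict):
        return {k: pin_memory(v, device) for k, v in data.items()}
    if isinstance(data, (list, tuple)):
        return type(data)(pin_memory(v, device) for v in data)
    return _pin(data, device)
```

This removes the duplicated `if`/`else` around each recursive call that the earlier hunk introduced.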
will fix it.
     elif hasattr(data, "pin_memory"):
-        return data.pin_memory()
+            return data.pin_memory()
unnecessary indent
will fix it
     if isinstance(data, torch.Tensor):
-        return data.pin_memory()
+        if (len(device) == 0):
Should be device is None
will fix it
torch/utils/data/dataloader.py
Outdated
@@ -154,6 +156,7 @@ class DataLoader(Generic[T_co]):
     pin_memory: bool
     drop_last: bool
     timeout: float
+    device: str
As device here applies to pin_memory only, please call it pin_memory_device.
will fix it.
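The requested rename can be sketched as below (a trimmed, hypothetical signature; the real DataLoader constructor takes many more parameters):

```python
class DataLoader:
    # Hypothetical trimmed signature: the generic `device` argument is
    # renamed to `pin_memory_device` because it only selects where the
    # pin-memory staging happens, not where the model or data lives.
    def __init__(self, dataset, pin_memory=False, pin_memory_device=""):
        self.dataset = dataset
        self.pin_memory = pin_memory
        self.pin_memory_device = pin_memory_device
```

The narrower name keeps the constructor unambiguous if a general-purpose `device` argument is ever added later.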
torch/utils/data/dataloader.py
Outdated
@@ -491,7 +496,13 @@ def __init__(self, loader: DataLoader) -> None:
         self._index_sampler = loader._index_sampler
         self._num_workers = loader.num_workers
         self._prefetch_factor = loader.prefetch_factor
-        self._pin_memory = loader.pin_memory and torch.cuda.is_available()
+        # for CUDA, behaviour is default. for other backends
Add an exception for when pin_memory_device is specified but pin_memory is false.
Added a warning message for the case where pin_memory_device is set but the pin_memory flag is false, since this is a valid case.
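The warning the author describes might look like this sketch (function name and message are hypothetical, not the exact merged code):

```python
import warnings

def check_pin_memory_config(pin_memory, pin_memory_device):
    # Setting pin_memory_device without pin_memory=True is valid but has
    # no effect, so emit a warning rather than raising an exception.
    if pin_memory_device and not pin_memory:
        warnings.warn(
            "pin_memory_device is set but pin_memory is False; "
            "the device setting will have no effect."
        )
```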
Please add tests.
Force-pushed from d9c5073 to 97191d2 (Compare)
Thank you for signing our Contributor License Agreement. We can now accept your code for this (and any) Facebook open source project. Thanks!
Force-pushed from fd55645 to b5488ce (Compare)
CI Flow Status

⚛️ CI Flow Ruleset - Version:

You can add a comment to the PR and tag @pytorchbot with the following commands:

# ciflow rerun, "ciflow/default" will always be added automatically
@pytorchbot ciflow rerun
# ciflow rerun with additional labels "-l <ciflow/label_name>", which is equivalent to adding these labels manually and triggering the rerun
@pytorchbot ciflow rerun -l ciflow/scheduled -l ciflow/slow

For more information, please take a look at the CI Flow Wiki.
@VitalyFedyunin @wconstab I have addressed the review comments.
Looks good to me, but I'll let @VitalyFedyunin stamp.
@VitalyFedyunin I have addressed the review comments; can you please check whether the reworked patch is fine?
Force-pushed from b5488ce to 7488722 (Compare)
@VitalyFedyunin has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.
Sorry it is taking so much time to land; infra system issues. Can you please rebase (to fix merge conflicts)?
Force-pushed from 7488722 to a1de099 (Compare)
@VitalyFedyunin I have rebased and pushed the patch. Thanks.
@wconstab @VitalyFedyunin can this patch be merged? It is already approved.
Sure, can you please rebase it once again to avoid merge conflicts?
pin_memory has an optional device parameter to specify which device you want to pin for. Without this change the DataLoader works only for the CUDA backend. To add support for other backends that support pinned memory, the DataLoader is updated with device as an optional parameter.

Signed-off-by: Jeeja <jeejakp@habana.ai>
Force-pushed from a1de099 to c955627 (Compare)
@VitalyFedyunin rebased and updated the patch.
@VitalyFedyunin can this patch be merged? Already rebased it.
@VitalyFedyunin has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.
Summary:
pin_memory has an optional device parameter to specify which device you want to pin for. Without this change the DataLoader works only for the CUDA backend. To add support for other backends that support pinned memory, the DataLoader is updated with device as an optional parameter.

Fixes #{issue number}

Pull Request resolved: #65402
Reviewed By: zou3519
Differential Revision: D32282204
Pulled By: VitalyFedyunin
fbshipit-source-id: e2e09876969af108d0db38af7c2d1b2f1cfa9858
Hey @jeejakp12.
pin_memory has an optional device parameter to specify which device you want to pin for. Without this change the DataLoader works only for the CUDA backend. To add support for other backends that support pinned memory, the DataLoader is updated with device as an optional parameter.

Fixes #{issue number}
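Assuming the final API from this discussion (a pin_memory_device keyword on DataLoader, defaulting to ""), usage would look like this sketch; it needs a PyTorch build that includes this change:

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

dataset = TensorDataset(torch.arange(8, dtype=torch.float32))
# pin_memory_device selects which backend's pinned memory to use; with
# the default "" the historical CUDA-only behaviour is preserved.
loader = DataLoader(dataset, batch_size=4, pin_memory=True,
                    pin_memory_device="cuda")
```

Backends other than CUDA (e.g. out-of-tree accelerators) can pass their own device string here, provided their tensors implement pin_memory.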