[WIP][DataLoader] Prototype of SamplerIterableDataset #49363
[ghstack-poisoned]
💊 CI failures summary and remediations
As of commit 7283772 (more details on the Dr. CI page):
🕵️ 5 new failures recognized by patterns
The following CI failures do not appear to be due to upstream breakages:
pytorch_linux_xenial_py3_6_gcc5_4_build (1/5)
Step: "(Optional) Merge target branch"
I think Sampler can be eliminated, since containers can be placed at any position in the sequence. We can shuffle or form batches before fetching data (sampling over indexes) or after fetching (bucket batching based on the data). However, if we decide to add SamplerDataset as a wrapper for the original Sampler or a user's customized sampler, I would support adding this wrapper class. In general, I would not suggest using SamplerDataset in our new implementation.
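To make the two placements concrete, here is a minimal sketch in plain Python. It deliberately avoids importing PyTorch, so it is not the DataLoader API and not the code in this PR; the names `shuffled_indices`, `fetch`, and `bucket_batches` are hypothetical and exist only for illustration. Shuffling happens over indexes before the fetch, and bucket batching happens on the fetched samples afterwards.

```python
import random
from typing import Iterable, Iterator, List, Sequence


def shuffled_indices(n: int, seed: int) -> Iterator[int]:
    """Sampling before the fetch: permute indexes, no data is touched yet."""
    order = list(range(n))
    random.Random(seed).shuffle(order)
    yield from order


def fetch(indices: Iterable[int], data: Sequence[str]) -> Iterator[str]:
    """The fetch step: turn each index into the actual sample."""
    for i in indices:
        yield data[i]


def bucket_batches(
    samples: Iterable[str], batch_size: int, pool_size: int
) -> Iterator[List[str]]:
    """Batching after the fetch: sort a pool of samples by length, then cut batches."""
    pool: List[str] = []
    for sample in samples:
        pool.append(sample)
        if len(pool) == pool_size:
            pool.sort(key=len)
            for k in range(0, len(pool), batch_size):
                yield pool[k:k + batch_size]
            pool = []
    if pool:  # flush the final, possibly smaller, pool
        pool.sort(key=len)
        for k in range(0, len(pool), batch_size):
            yield pool[k:k + batch_size]


# Hypothetical toy data: strings of varying length stand in for real samples.
data = ["a", "bbbb", "cc", "ddddd", "eee", "f", "gggggg", "hh"]
pipeline = bucket_batches(
    fetch(shuffled_indices(len(data), seed=0), data),
    batch_size=2,
    pool_size=4,
)
batches = list(pipeline)
```

Each stage is an ordinary iterable, so the shuffle and the bucket batching can sit on either side of the fetch without a separate sampler abstraction.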
ghstack-source-id: e2099d04d07f6da90f1a7a5da039a24f12f4d56a Pull Request resolved: #49363
ghstack-source-id: 6a35fffc2a859bc9bb0792ab3f3979778d4a4ec1 Pull Request resolved: #49363
I suggest we add one more example showing how to implement a sampler as a pure IterDataset, without the old Sampler class.
It makes sense. I will update our RFC.
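As a rough idea of what such an example could look like, the sketch below writes a random sampler and a batch sampler as plain iterables, with no Sampler class involved. It is only an assumption about the shape of the example, not the content of the RFC: `RandomIndexIterable` and `BatchIterable` are hypothetical names, and the classes are plain Python objects so the snippet runs without PyTorch. In a real implementation they would presumably subclass `torch.utils.data.IterableDataset`.

```python
import random
from typing import Iterable, Iterator, List, Optional


class RandomIndexIterable:
    """Hypothetical random sampler as a plain iterable of indexes."""

    def __init__(self, length: int, seed: Optional[int] = None) -> None:
        self.length = length
        self.seed = seed

    def __iter__(self) -> Iterator[int]:
        # Build a fresh permutation on every pass over the iterable.
        order = list(range(self.length))
        random.Random(self.seed).shuffle(order)
        return iter(order)


class BatchIterable:
    """Hypothetical batch sampler: groups items from any upstream iterable."""

    def __init__(
        self, source: Iterable[int], batch_size: int, drop_last: bool = False
    ) -> None:
        self.source = source
        self.batch_size = batch_size
        self.drop_last = drop_last

    def __iter__(self) -> Iterator[List[int]]:
        batch: List[int] = []
        for item in self.source:
            batch.append(item)
            if len(batch) == self.batch_size:
                yield batch
                batch = []
        # Emit the short trailing batch unless the caller asked to drop it.
        if batch and not self.drop_last:
            yield batch


index_batches = list(BatchIterable(RandomIndexIterable(10, seed=0), batch_size=4))
```

Because both classes only rely on the iterator protocol, `BatchIterable` can wrap indexes (before the fetch) or fetched samples (after it) without modification.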
ghstack-source-id: dd31d19d561347b5556c9d0e031dea5969217030 Pull Request resolved: #49363
ghstack-source-id: 5e696324f49a2da4194cce9cfa456478fd7495f0 Pull Request resolved: #49363
Differential Revision: [D25623637](https://our.internmc.facebook.com/intern/diff/D25623637) [ghstack-poisoned]
ghstack-source-id: bcb27cbd8925489d4230336151631c51cd0907b5 Pull Request resolved: #49363
Ready to land
Summary: Pull Request resolved: pytorch#49363
Test Plan: Imported from OSS
Reviewed By: mrshenli
Differential Revision: D25623637
Pulled By: ejguan
fbshipit-source-id: 9155d27d1fc91996b74110795cc73f1da0eedd44
Stack from ghstack:
Differential Revision: D25623637