
Adding flags to datamodules #388

Merged
merged 21 commits into from Dec 16, 2020

Conversation

briankosw
Contributor

What does this PR do?

Adds the flags shuffle, drop_last, and pin_memory to datamodules.

Fixes #245
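The pattern this PR introduces can be sketched roughly as follows. This is an illustrative mock, not the actual bolts code: the class name `ExampleDataModule` and the dummy tensor dataset are made up; only the flag names (`shuffle`, `drop_last`, `pin_memory`) come from the PR.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

class ExampleDataModule:
    """Hypothetical datamodule: stores the new flags and forwards them
    to the DataLoader it builds (the shape of the change in this PR)."""

    def __init__(self, batch_size=32, shuffle=False, pin_memory=False, drop_last=False):
        self.batch_size = batch_size
        self.shuffle = shuffle
        self.pin_memory = pin_memory
        self.drop_last = drop_last
        # Dummy 100-sample dataset standing in for a real one
        self.dataset_train = TensorDataset(torch.randn(100, 3))

    def train_dataloader(self):
        return DataLoader(
            self.dataset_train,
            batch_size=self.batch_size,
            shuffle=self.shuffle,
            pin_memory=self.pin_memory,
            drop_last=self.drop_last,
        )
```

With 100 samples and `batch_size=32`, `drop_last=True` yields 3 full batches, while the default `drop_last=False` keeps the final partial batch (4 batches total).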

Before submitting

  • Was this discussed/approved via a Github issue? (no need for typos and docs improvements)
  • Did you read the contributor guideline, Pull Request section?
  • Did you make sure your PR does only one thing, instead of bundling different changes together? Otherwise, we ask you to create a separate PR for every change.
  • Did you make sure to update the documentation with your changes?
  • Did you write any new necessary tests?
  • Did you verify new and existing tests pass locally with your changes?
  • If you made a notable change (that affects users), did you update the CHANGELOG?

PR review

  • Is this pull request ready for review? (if not, please submit in draft mode)

Anyone in the community is free to review the PR once the tests have passed.
If we didn't discuss your PR in Github issues there's a high chance it will not be merged.

Did you have fun?

Make sure you had fun coding 🙃

@codecov

codecov bot commented Nov 20, 2020

Codecov Report

Merging #388 (ebaaf18) into master (13863cc) will decrease coverage by 0.02%.
The diff coverage is n/a.

Impacted file tree graph

@@            Coverage Diff             @@
##           master     #388      +/-   ##
==========================================
- Coverage   80.79%   80.77%   -0.03%     
==========================================
  Files         100      101       +1     
  Lines        5728     5706      -22     
==========================================
- Hits         4628     4609      -19     
+ Misses       1100     1097       -3     
Flag Coverage Δ
cpu 25.20% <ø> (-0.03%) ⬇️
pytest 25.20% <ø> (-0.03%) ⬇️
unittests 80.05% <ø> (-0.12%) ⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

Impacted Files Coverage Δ
pl_bolts/optimizers/lars_scheduling.py 78.72% <0.00%> (-17.03%) ⬇️
pl_bolts/datasets/kitti_dataset.py 34.61% <0.00%> (-0.68%) ⬇️
pl_bolts/datasets/ssl_amdim_datasets.py 74.32% <0.00%> (-0.68%) ⬇️
pl_bolts/utils/semi_supervised.py 96.77% <0.00%> (-0.06%) ⬇️
pl_bolts/datasets/cifar10_dataset.py 96.77% <0.00%> (-0.04%) ⬇️
pl_bolts/datasets/dummy_dataset.py 100.00% <0.00%> (ø)
pl_bolts/utils/__init__.py 100.00% <0.00%> (ø)
pl_bolts/datamodules/experience_source.py 95.93% <0.00%> (+0.06%) ⬆️
pl_bolts/datasets/imagenet_dataset.py 20.11% <0.00%> (+0.70%) ⬆️
pl_bolts/datasets/mnist_dataset.py 57.14% <0.00%> (+14.28%) ⬆️
... and 1 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 13863cc...ebaaf18. Read the comment docs.

@akihironitta akihironitta added the datamodule Anything related to datamodules label Nov 24, 2020
@briankosw briankosw marked this pull request as ready for review December 1, 2020 05:31
@briankosw
Contributor Author

briankosw commented Dec 1, 2020

@nateraw would love to get your feedback on this PR. Specifically, I wanted some feedback on whether the flags for validation and test dataloaders should be parametrized the same way as the train dataloader. For example, previously shuffle was set to True for train and False for validation and test. Should they both be set as the newly added flag? Also, I set the flags' defaults as shuffle=False, pin_memory=False, and drop_last=False, as I think the user should explicitly specify if these flags should be turned on.

@pep8speaks

pep8speaks commented Dec 1, 2020

Hello @briankosw! Thanks for updating this PR.

There are currently no PEP 8 issues detected in this Pull Request. Cheers! 🍻

Comment last updated at 2020-12-13 10:18:17 UTC

Contributor

@akihironitta akihironitta left a comment


@briankosw Thank you for your contribution as always! It seems the doc tests failed. Would you mind having a look?

pl_bolts/datamodules/kitti_datamodule.py Show resolved Hide resolved
@akihironitta
Contributor

@briankosw mind resolving the conflicts, too?

@briankosw
Contributor Author

Hey @akihironitta,

  1. Do you think shuffle=False should be hardcoded in there for val and test loaders?
  2. Do you find the default values that I've given for shuffle, pin_memory, and drop_last to be reasonable?

@akihironitta akihironitta self-assigned this Dec 14, 2020
@akihironitta
Contributor

  1. Do you think shuffle=False should be hardcoded in there for val and test loaders?

Since they are not used for training, I guess hardcoding shuffle=False sounds fine...
@Borda What do you think about the above?

  1. Do you find the default values that I've given for shuffle, pin_memory, and drop_last to be reasonable?

Yes, it looks good to me as is :] As DataLoader uses False by default, let's keep them all False. https://pytorch.org/docs/stable/data.html#torch.utils.data.DataLoader
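Those defaults line up with `torch.utils.data.DataLoader` itself. A quick illustration (the dummy dataset is an assumption, not project code):

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

ds = TensorDataset(torch.arange(6))
# With none of the flags set, batches come back in dataset order and the
# final short batch is kept — exactly what shuffle=False/drop_last=False mean.
default_batches = [b[0].tolist() for b in DataLoader(ds, batch_size=4)]
# default_batches == [[0, 1, 2, 3], [4, 5]]
```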

@Borda
Member

Borda commented Dec 14, 2020

@akihironitta mind checking whether your comments were resolved? Otherwise it LGTM.

Contributor

@akihironitta akihironitta left a comment


@briankosw LGTM! Thank you :]

@briankosw
Contributor Author

  1. Do you think shuffle=False should be hardcoded in there for val and test loaders?

Since they are not used for training, I guess hardcoding shuffle=False sounds fine...

Sounds good! I went ahead and hardcoded shuffle=False for validation and testing. Do I need to add anything to the documentation or the changelog for this PR?
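The agreed-upon behaviour can be sketched like this (the helper name `make_eval_dataloader` is hypothetical, not the actual bolts API): evaluation loaders hardcode `shuffle=False` while still honouring the user-set flags.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

def make_eval_dataloader(dataset, batch_size, pin_memory=False, drop_last=False):
    # Illustrative helper: val/test loaders never shuffle, so evaluation
    # order stays deterministic, but the other flags remain configurable.
    return DataLoader(
        dataset,
        batch_size=batch_size,
        shuffle=False,  # hardcoded for val/test per the discussion above
        pin_memory=pin_memory,
        drop_last=drop_last,
    )

val_ds = TensorDataset(torch.arange(10))
first = next(iter(make_eval_dataloader(val_ds, batch_size=4)))[0].tolist()
# first == [0, 1, 2, 3] — dataset order is preserved
```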

@Borda Borda merged commit 7beb933 into Lightning-Universe:master Dec 16, 2020
chris-clem pushed a commit to chris-clem/pytorch-lightning-bolts that referenced this pull request Dec 16, 2020
* Adding flags to datamodules

* Finishing up changes

* Fixing syntax error

* More syntax errors

* More

* Adding drop_last flag to sklearn test

* Adding drop_last flag to sklearn test

* Updating doc to reflect drop_last=False

* Adding flags to datamodules

* Finishing up changes

* Fixing syntax error

* More syntax errors

* More

* Adding drop_last flag to sklearn test

* Adding drop_last flag to sklearn test

* Updating doc to reflect drop_last=False

* Cleaning up parameters and docstring

* Fixing syntax error

* Fixing documentation

* Hardcoding shuffle=False for val and test
@akihironitta akihironitta mentioned this pull request Dec 17, 2020
8 tasks
chris-clem added a commit to chris-clem/pytorch-lightning-bolts that referenced this pull request Dec 17, 2020
Borda added a commit that referenced this pull request Dec 17, 2020
* Add BaseDataModule

* Add pre-commit hooks

* Refactor cifar10_datamodule

* Move torchvision warning

* Refactor binary_mnist_datamodule

* Refactor fashion_mnist_datamodule

* Fix errors

* Remove VisionDataset type hint so CI base testing does not fail (torchvision is not installed there)

* Implement Nate's suggestions

* Remove train and eval batch size because it breaks a lot of tests

* Properly add transforms to train and val dataset

* Add num_samples property to cifar10 dm

* Add tests and docs

* Fix flake8 and codefactor issue

* Update changelog

* Fix isort

* Add typing

* Rename to VisionDataModule

* Remove transform_lib type annotation

* suggestions

* Apply suggestions from code review

* Apply suggestions from code review

Co-authored-by: Akihiro Nitta <nitta@akihironitta.com>

* Add flags from #388 to API

* Make tests work

* Move _TORCHVISION_AVAILABLE check

* Update changelog

* Fix CI base testing

* Fix CI base testing

* Apply suggestions from code review

Co-authored-by: Jirka Borovec <jirka.borovec@seznam.cz>
Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
Co-authored-by: Akihiro Nitta <nitta@akihironitta.com>
@briankosw briankosw deleted the feature/data_module_flag branch December 21, 2020 08:02
Borda added a commit that referenced this pull request Jan 18, 2021
* Add DCGAN module

* Undo black on conf.py

* Add tests for DCGAN

* Fix flake8 and codefactor

* Add types and small refactoring

* Make image sampler callback work

* Upgrade DQN to use .log (#404)

* Upgrade DQN to use .log

* remove unused

* pep8

* fixed other dqn

* fix loss test case for batch size variation (#402)

* Decouple DataModules from Models - CPCV2 (#386)

* Decouple dms from CPCV2

* Update tests

* Add docstrings, fix import, and update changelog

* Update transforms

* bugfix: batch_size parameter for DataModules remaining (#344)

* bugfix: batch_size for DataModules remaining

* Update sklearn datamodule tests

* Fix default_transforms. Keep internal for every data module

* fix typo on binary_mnist_datamodule

thanks @akihironitta

Co-authored-by: Akihiro Nitta <nitta@akihironitta.com>

Co-authored-by: Akihiro Nitta <nitta@akihironitta.com>

* Fix a typo/copy paste error (#415)

* Just a Typo (#413)

missing a ' at the end of dataset='stl10

* Remove unused arguments (#418)

* tests: Use cached datasets in LitMNIST and the doctests (#414)

* Use cached datasets

* Use cached datasets in doctests

* clear replay buffer after trajectory (#425)

* stale: update label

* bugfix: Add missing imports to pl_bolts/__init__.py (#430)

* Add missing imports

* Add missing imports

* Apply isort

* Fix CIFAR num_samples (#432)

* Add static type checker mypy to the tests and pre-commit hooks (#433)

* Add mypy check to GitHub Actions

* Run mypy on pl_bolts only

* Add mypy check to pre-commit

* Add an empty line at the end of files

* Update mypy config

* Update mypy config

* Update mypy config

* show

Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>

* missing logo

* Add type annotations to pl_bolts/__init__.py (#435)

* Run mypy on pl_bolts only

* Update mypy config

* Add type hints to pl_bolts/__init__.py

* mypy

Co-authored-by: Jirka Borovec <jirka.borovec@seznam.cz>

* skip hanging (#437)

* Option to normalize latent interpolation images (#438)

* add option to normalize latent interpolation images

* linspace

* update

Co-authored-by: ananyahjha93 <ananya@pytorchlightning.ai>

* 0.2.6rc1

* Warnings fix (#449)

* Revert "Merge pull request #1 from ganprad/warnings_fix"

This reverts commit 7c5aaf0.

* Fixes a warning related to np.integer in SklearnDataModule

Fixes this warning:

```
DeprecationWarning: Converting `np.integer` or `np.signedinteger` to a dtype is deprecated. The current result is `np.dtype(np.int_)` which is not strictly correct. Note that the result depends on the system. To ensure stable results you may want to use `np.int64` or `np.int32`.
```
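A minimal illustration of the stable replacement the warning suggests (the variable name is hypothetical; the point is requesting an explicit fixed-width dtype instead of the abstract `np.integer`):

```python
import numpy as np

# Deprecated: np.dtype(np.integer) — the concrete result (int32 vs int64)
# depends on the platform, which is what the warning complains about.
# Stable: name a fixed-width dtype explicitly.
labels = np.asarray([0, 1, 1, 0], dtype=np.int64)
```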

* Refactor datamodules/datasets (#338)

* Remove try: ... except: ...

* Fix experience_source

* Fix imagenet

* Fix kitti

* Fix sklearn

* Fix vocdetection

* Fix typo

* Remove duplicate

* Fix by flake8

* Add optional packages availability vars

* binary_mnist

* Use pl_bolts._SKLEARN_AVAILABLE

* Apply isort

* cifar10

* mnist

* cityscapes

* fashion mnist

* ssl_imagenet

* stl10

* cifar10

* dummy

* fix city

* fix stl10

* fix mnist

* ssl_amdim

* remove unused DataLoader and fix docs

* use from ... import ...

* fix pragma: no cover

* Fix forward reference in annotations

* binmnist

* Same order as imports

* Move vars from __init__ to utils/__init__

* Remove vars from __init__

* Update vars

* Apply isort

* update min requirements - PL 1.1.1 (#448)

* update min requirements

* rc0

* imports

* isort

* flake8

* 1.1.1

* flake8

* docs

* Add missing optional packages to `requirements/*.txt` (#450)

* Import matplotlib at the top

* Add missing optional packages

* Update wandb

* Add mypy to requirements

* update Isort (#457)

* Adding flags to datamodules (#388)

* Adding flags to datamodules

* Finishing up changes

* Fixing syntax error

* More syntax errors

* More

* Adding drop_last flag to sklearn test

* Adding drop_last flag to sklearn test

* Updating doc to reflect drop_last=False

* Adding flags to datamodules

* Finishing up changes

* Fixing syntax error

* More syntax errors

* More

* Adding drop_last flag to sklearn test

* Adding drop_last flag to sklearn test

* Updating doc to reflect drop_last=False

* Cleaning up parameters and docstring

* Fixing syntax error

* Fixing documentation

* Hardcoding shuffle=False for val and test

* Add DCGAN module

* Small fixes

* Remove DataModules

* Update docs

* Update docs

* Update torchvision import

* Import gym as optional package to build docs successfully (#458)

* Import gym as optional package

* Fix import

* Apply isort

* bugfix: batch_size parameter for DataModules remaining (#344)

* bugfix: batch_size for DataModules remaining

* Update sklearn datamodule tests

* Fix default_transforms. Keep internal for every data module

* fix typo on binary_mnist_datamodule

thanks @akihironitta

Co-authored-by: Akihiro Nitta <nitta@akihironitta.com>

Co-authored-by: Akihiro Nitta <nitta@akihironitta.com>

* Option to normalize latent interpolation images (#438)

* add option to normalize latent interpolation images

* linspace

* update

Co-authored-by: ananyahjha93 <ananya@pytorchlightning.ai>

* update min requirements - PL 1.1.1 (#448)

* update min requirements

* rc0

* imports

* isort

* flake8

* 1.1.1

* flake8

* docs

* Apply suggestions from code review

* Apply suggestions from code review

* Add docs

* Use LSUN instead of CIFAR10

* Update TensorboardGenerativeModelImageSampler

* Update docs with lsun

* Update test

* Revert TensorboardGenerativeModelImageSampler changes

* Remove ModelCheckpoint callback and nrow=5 arg

* Apply suggestions from code review

* Fix test_dcgan

* Apply yapf

* Apply suggestions from code review

Co-authored-by: Teddy Koker <teddy.koker@gmail.com>
Co-authored-by: Sidhant Sundrani <sidhant96@outlook.com>
Co-authored-by: Akihiro Nitta <nitta@akihironitta.com>
Co-authored-by: Héctor Laria <hector_laria@hotmail.com>
Co-authored-by: Bartol Karuza <bartol.k@gmail.com>
Co-authored-by: Happy Sugar Life <777Jonathansum@gmail.com>
Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
Co-authored-by: Jirka Borovec <jirka.borovec@seznam.cz>
Co-authored-by: ananyahjha93 <ananya@pytorchlightning.ai>
Co-authored-by: Pradeep Ganesan <ganprad@users.noreply.github.com>
Co-authored-by: Brian Ko <briankosw@gmail.com>
Co-authored-by: Christoph Clement <christoph.clement@artorg.unibe.ch>
@Borda Borda added this to the v0.3 milestone Jan 18, 2021
Labels
datamodule Anything related to datamodules
Projects
None yet
Development

Successfully merging this pull request may close these issues.

pin_memory should be true only if gpus are specified
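The linked issue's point — `pin_memory` only helps when batches are copied to a CUDA device, and on CPU-only machines it merely wastes page-locked RAM — could be handled with a small guard like this sketch (the helper name is hypothetical, not part of the PR):

```python
import torch

def resolve_pin_memory(requested: bool) -> bool:
    # Only honour a pin_memory request when a CUDA device is actually
    # available; otherwise silently fall back to False.
    return bool(requested) and torch.cuda.is_available()
```

A datamodule could call this when building its loaders, so a user-requested `pin_memory=True` degrades to `False` on CPU-only machines.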
4 participants