Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add EMNISTDataModule #676

Merged
merged 114 commits into from Aug 13, 2021

Conversation

sugatoray
Copy link
Contributor

@sugatoray sugatoray commented Jun 26, 2021

What does this PR do?

Closes #672, #676 and #685.

A summary of changes and modifications ⭐ πŸ”₯ [CLICK TO EXPAND]

  • File Added:

    • pl_bolts/datasets/emnist_dataset.py 🟒
    • Contents:
      • EMNIST_METADATA
      • EMNIST dataset
      • BinaryEMNIST dataset Need New PR or add to #672 ⚠️
  • File Added:

    • pl_bolts/datamodules/emnist_dataset.py 🟒
    • Contents:
      • EMNISTDataModule
      • BinaryEMNISTDataModule Need New PR or add to #672 ⚠️
  • Files Modified

    • Package: pl_bolts

      • pl_bolts/datasets/__init__.py 🟒
      • pl_bolts/datamodules/__init__.py 🟒
    • Tests:

      • For datamodules:
        • tests/datamodules/test_imports.py 🟒
        • tests/datamodules/test_datamodules.py WIP 🟠

Adding BinaryEMNIST and BinaryEMNISTDataModule was logical, looking at how MNIST and BinaryMNIST (dataset and datamodules) were implemented.

About the dataset

image
source: https://arxiv.org/pdf/1702.05373.pdf [Table-I]

image
source: https://arxiv.org/pdf/1702.05373.pdf [Table-II]

Before submitting

  • Was this discussed/approved via a Github issue? (no need for typos and docs improvements) Make EMNISTDataModuleΒ #672
  • Did you read the contributor guideline, Pull Request section? Y 🟒
  • Did you make sure your PR does only one thing, instead of bundling different changes together? Y 🟒
  • Did you make sure to update the documentation with your changes? Y 🟒
  • Did you write any new necessary tests? [not needed for typos/docs] Y 🟒
  • Did you verify new and existing tests pass locally with your changes? Y 🟒
  • If you made a notable change (that affects users), did you update the CHANGELOG? Y 🟒

PR review

  • Is this pull request ready for review? (if not, please submit in draft mode) READY 🟒

Anyone in the community is free to review the PR once the tests have passed.
If we didn't discuss your PR in Github issues there's a high chance it will not be merged.

@pep8speaks
Copy link

pep8speaks commented Jun 26, 2021

Hello @sugatoray! Thanks for updating this PR.

There are currently no PEP 8 issues detected in this Pull Request. Cheers! 🍻

Comment last updated at 2021-08-13 13:04:38 UTC

@github-actions github-actions bot added the datamodule Anything related to datamodules label Jun 26, 2021
@codecov
Copy link

codecov bot commented Jun 26, 2021

Codecov Report

Merging #676 (ef13456) into master (bd28835) will increase coverage by 47.35%.
The diff coverage is 80.00%.

❗ Current head ef13456 differs from pull request most recent head 5833e60. Consider uploading reports for the commit 5833e60 to get more accurate results
Impacted file tree graph

@@             Coverage Diff             @@
##           master     #676       +/-   ##
===========================================
+ Coverage   24.32%   71.67%   +47.35%     
===========================================
  Files         120      120               
  Lines        7486     7397       -89     
===========================================
+ Hits         1821     5302     +3481     
+ Misses       5665     2095     -3570     
Flag Coverage Ξ”
cpu 71.67% <80.00%> (+47.35%) ⬆️
pytest 71.67% <80.00%> (+47.35%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

Impacted Files Coverage Ξ”
pl_bolts/transforms/dataset_normalizations.py 72.00% <20.00%> (+37.00%) ⬆️
pl_bolts/datasets/emnist_dataset.py 90.90% <90.90%> (ΓΈ)
pl_bolts/datamodules/__init__.py 100.00% <100.00%> (ΓΈ)
pl_bolts/datasets/__init__.py 100.00% <100.00%> (ΓΈ)
pl_bolts/models/rl/advantage_actor_critic_model.py
pl_bolts/losses/self_supervised_learning.py 71.33% <0.00%> (+0.18%) ⬆️
pl_bolts/models/self_supervised/cpc/cpc_module.py 20.58% <0.00%> (+0.58%) ⬆️
...l_bolts/models/self_supervised/byol/byol_module.py 22.34% <0.00%> (+0.83%) ⬆️
pl_bolts/models/vision/image_gpt/igpt_module.py 20.65% <0.00%> (+0.87%) ⬆️
... and 70 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Ξ” = absolute <relative> (impact), ΓΈ = not affected, ? = missing data
Powered by Codecov. Last update bd28835...5833e60. Read the comment docs.

Changes made in: `datamodules/binary_mnist_datamodule.py`
- class ``BinaryEMNISTDataModule`
  - [x] modified `prepare_data()`
  - [x] added `setup()`
Changes made in: `datamodules/emnist_datamodule.py`
- class `EMNISTDataModule`
  - [x] modified `prepare_data()`
  - [x] added `setup()`
- updated docs
- updated property: `num_classes`
  > Now the code shows correct `num_classes` based on the `split` provided by the user.
- updated docs
- updated property: `num_classes`
  > Now the code shows correct `num_classes` based on the `split` provided by the user.
@sugatoray sugatoray marked this pull request as ready for review June 29, 2021 00:08
@akihironitta akihironitta marked this pull request as ready for review July 31, 2021 21:08
Copy link
Contributor

@akihironitta akihironitta left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

Description of the change: Introduced a new arg strict_val_split: bool to use the balanced validation set defined in the EMNIST paper.

(always open for any discussions/comments on the API (strict_val_split) :])


Note: The failing tests are irrelevant to the changes in this PR.

Copy link
Contributor Author

@sugatoray sugatoray left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@akihironitta Looks good to me. Thank you, for working on it. I am not sure, if you were waiting for any feedback from me. It seems that this has not been reviewed by other reviewers since your approval. Please let me know if there is something I need to do.

@akihironitta
Copy link
Contributor

@sugatoray Hi, thanks for having a look again!

@PyTorchLightning/core-bolts Would you mind having a look at the changes?

pl_bolts/datamodules/emnist_datamodule.py Outdated Show resolved Hide resolved
pl_bolts/datamodules/emnist_datamodule.py Outdated Show resolved Hide resolved
pl_bolts/datamodules/emnist_datamodule.py Show resolved Hide resolved
@akihironitta
Copy link
Contributor

The failing tests are irrelevant to the change in this PR.

pytest summary at c6adbae
=========================== short test summary info ============================
FAILED tests/callbacks/verification/test_base.py::test_verification_base_get_input_array[device0]
FAILED tests/callbacks/verification/test_batch_gradient.py::test_batch_gradient_verification_pl_module[device0-True]
FAILED tests/callbacks/verification/test_batch_gradient.py::test_batch_gradient_verification_pl_module[device0-False]
FAILED tests/callbacks/verification/test_batch_gradient.py::test_batch_gradient_verification_callback[0]
FAILED tests/models/rl/test_scripts.py::test_cli_run_rl_dqn[ --env PongNoFrameskip-v4 --max_steps 10 --fast_dev_run 1 --warm_start_size 10 --n_steps 2 --batch_size 10]
FAILED tests/models/rl/test_scripts.py::test_cli_run_rl_reinforce[ --env CartPole-v0 --max_steps 10 --fast_dev_run 1 --batch_size 10]
FAILED tests/models/rl/test_scripts.py::test_cli_run_rl_vanilla_policy_gradient[ --env CartPole-v0 --max_steps 10 --fast_dev_run 1 --batch_size 10]
= 7 failed, 288 passed, 23 skipped, 1 xfailed, 532 warnings in 1053.39s (0:17:33) =

@akihironitta
Copy link
Contributor

@PyTorchLightning/core-bolts Could you have another look?

@mergify mergify bot removed the has conflicts label Aug 13, 2021
@mergify mergify bot removed the has conflicts label Aug 13, 2021
@Borda Borda merged commit 4ece8db into Lightning-Universe:master Aug 13, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
datamodule Anything related to datamodules ready
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Make EMNISTDataModule
5 participants