fix _reset_eval_dataloader() for IterableDataset #1560

ybrovman · 2020-04-22T15:26:52Z

Before submitting

Was this discussed/approved via a Github issue? (no need for typos and docs improvements)
Did you read the contributor guideline, Pull Request section?
Did you make sure to update the docs?
Did you write any new necessary tests?
If you made a notable change (that affects users), did you update the CHANGELOG?

What does this PR do?

When using an IterableDataset for the validation dataset, I encountered the error `TypeError: object of type 'MyIterableDataset' has no len()`, raised from line 190 of `_reset_eval_dataloader()` in `data_loading.py`.

The `if dl` check caused the issue. `if dl` evaluates `bool(dl)`, which would call `dataloader.__bool__`; since the dataloader defines no `__bool__`, `bool()` falls back to checking whether `dataloader.__len__()` is nonzero. However, `dataloader.__len__` calls the dataset's `__len__`, which is undefined for an IterableDataset.

This PR resolves the issue by comparing against None explicitly (`if dl is not None`) in `_reset_eval_dataloader()`.
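The failure mode described above can be reproduced without PyTorch. The following is a minimal sketch, not the Lightning code: `NoLenDataset` and `LoaderLike` are hypothetical stand-ins, and the only assumption carried over from the description is that the loader's `__len__` delegates to the dataset and that the loader defines no `__bool__`.

```python
class NoLenDataset:
    """Hypothetical stand-in for an IterableDataset: iterable, but no __len__."""

    def __iter__(self):
        return iter(range(3))


class LoaderLike:
    """Hypothetical stand-in for a DataLoader: __len__ delegates, no __bool__."""

    def __init__(self, dataset):
        self.dataset = dataset

    def __len__(self):
        # Raises TypeError when the dataset does not define __len__
        return len(self.dataset)

    def __iter__(self):
        return iter(self.dataset)


def truthiness_check(dl):
    # `if dl` calls bool(dl), which falls back to dl.__len__()
    try:
        return "kept" if dl else "skipped"
    except TypeError as err:
        return f"TypeError: {err}"


def identity_check(dl):
    # `is not None` never touches __bool__ or __len__
    return "kept" if dl is not None else "skipped"


dl = LoaderLike(NoLenDataset())
result_truthy = truthiness_check(dl)
result_identity = identity_check(dl)
```

Here `result_truthy` holds the captured TypeError message, while `result_identity` is `"kept"`. The identity check still skips a dataloader that is actually None.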

PR review

Anyone in the community is free to review the PR once the tests have passed.
If we didn't discuss your PR in Github issues there's a high chance it will not be merged.

Did you have fun?

👍

pytorch_lightning/trainer/data_loading.py

pep8speaks · 2020-04-23T00:00:46Z

Hello @ybrovman! Thanks for updating this PR.

There are currently no PEP 8 issues detected in this Pull Request. Cheers! 🍻

Comment last updated at 2020-05-05 15:48:37 UTC

awaelchli

Good catch, but I don't understand why the test test_inf_val_dataloader passes if this bug exists. How do I reproduce this actually? The description of this PR makes it look like this should fail with any IterableDataset, but I don't get this error.

ybrovman · 2020-04-27T18:57:21Z

> Good catch, but I don't understand why the test test_inf_val_dataloader passes if this bug exists. How do I reproduce this actually? The description of this PR makes it look like this should fail with any IterableDataset, but I don't get this error.

I am not too familiar with the testing details here, but I think the explanation lies in CustomInfDataloader. `if dl` evaluates `bool(dl)`, and since CustomInfDataloader defines neither `__bool__` nor `__len__`, `bool()` of a CustomInfDataloader instance is always True, so test_inf_val_dataloader passes.
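The hypothesis in this reply rests on Python's default truthiness rule: an object whose class defines neither `__bool__` nor `__len__` is always truthy. The sketch below illustrates that rule with `InfiniteLoaderLike`, a hypothetical class; the actual CustomInfDataloader in the test suite may be structured differently.

```python
import itertools


class InfiniteLoaderLike:
    """Hypothetical infinite loader: defines __iter__/__next__ only."""

    def __init__(self, items):
        self._cycle = itertools.cycle(items)

    def __iter__(self):
        return self

    def __next__(self):
        return next(self._cycle)


loader = InfiniteLoaderLike([1, 2, 3])

# Neither hook exists, so bool() uses the default: every instance is truthy
has_len = hasattr(loader, "__len__")
has_bool = hasattr(loader, "__bool__")
is_truthy = bool(loader)

# The loader never ends, so only a bounded slice is taken here
first_four = list(itertools.islice(loader, 4))
```

Because `bool(loader)` never reaches a `__len__` call, an `if dl` check on such an object cannot raise, which would explain why the existing test does not expose the bug.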

Borda

LGTM 🤖

Borda · 2020-05-05T15:33:26Z

@ybrovman mind adding a test for this iterable dataset?

codecov · 2020-05-05T16:04:09Z

Codecov Report

Merging #1560 into master will not change coverage.
The diff coverage is 100%.

@@          Coverage Diff           @@
##           master   #1560   +/-   ##
======================================
  Coverage      88%     88%           
======================================
  Files          69      69           
  Lines        4151    4151           
======================================
  Hits         3661    3661           
  Misses        490     490

awaelchli

In which case is dl actually None? Have you considered removing the if statement completely?

ybrovman · 2020-05-05T17:51:27Z

> In which case is dl actually None? Have you considered removing the if statement completely?

@awaelchli I did remove the if statement in the original commit, but added the None check back after @tullie's comment.

tullie · 2020-05-05T18:07:26Z

I tried to find the cases where it could be None, and it's hard to track down exactly. However, I'm fairly sure it's only None if val_dataloader or test_dataloader returns None.

Borda · 2020-05-05T18:58:49Z

> I tried to look at cases where it could be None and it's hard to track down exactly. However, i'm fairly sure it's only None if the val_dataloader or test_dataloader returns None.

Shall we raise a warning if any dataloader is None, or just skip it?

tullie · 2020-05-05T20:52:45Z

Yeah, ideally we should raise a warning, but it's not the biggest issue.

Borda · 2020-05-05T21:11:21Z

> Yeah ideally we should raise a warning but not the biggest issue.

So just a warning, either via logging or a standard RuntimeWarning, raised only once, and then remove the None dataloader from the list.

@ybrovman mind sending a follow-up PR?
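One possible shape for the behaviour suggested here, as a sketch only: `drop_none_dataloaders` is a hypothetical helper and is not taken from the follow-up PR. It assumes the dataloaders arrive as a list, emits a single RuntimeWarning per call no matter how many entries are None, and returns the remaining loaders.

```python
import warnings


def drop_none_dataloaders(dataloaders, mode="val"):
    """Return the non-None dataloaders, warning once if any were dropped."""
    # Identity check, so loaders without __len__ are never asked for a length
    kept = [dl for dl in dataloaders if dl is not None]
    if len(kept) < len(dataloaders):
        warnings.warn(
            f"One or more {mode} dataloaders are None and will be skipped.",
            RuntimeWarning,
        )
    return kept
```

Using `warnings.warn` rather than a logger call lets callers silence or escalate the message through the standard warnings filters.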

ybrovman · 2020-05-05T23:34:34Z

@Borda I created PR #1745 to address your comment.

This function has the if statement `if (train_dataloader or val_dataloaders) and datamodule:`. The issue is similar to that in Lightning-AI#1560: `if dl` translates to `if bool(dl)`, but there is no `dataloader.__bool__`, so `bool()` falls back to checking whether `dataloader.__len__()` is nonzero. However, `dataloader.__len__` calls the dataset's `__len__`, which is undefined for IterableDatasets. The fix is also the same: replace `if dl` with `if dl is not None`.
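The same pitfall applies to `or` and `and`, because both operators evaluate the truthiness of their left operand. A minimal sketch with hypothetical names (`NoLenLoader`, `conflicts_truthy`, `conflicts_explicit`), not the Lightning function itself:

```python
class NoLenLoader:
    """Hypothetical stand-in for a DataLoader over a dataset without __len__."""

    def __len__(self):
        raise TypeError("object of type 'MyIterableDataset' has no len()")


def conflicts_truthy(train_dataloader, val_dataloaders, datamodule):
    # `or` evaluates bool(train_dataloader), which calls __len__ and raises
    return bool((train_dataloader or val_dataloaders) and datamodule)


def conflicts_explicit(train_dataloader, val_dataloaders, datamodule):
    # Identity checks never call __bool__ or __len__
    has_loader = train_dataloader is not None or val_dataloaders is not None
    return has_loader and datamodule is not None
```

One design consequence worth noting: the explicit version treats an empty list of dataloaders as "provided", whereas the truthiness version treated it as absent.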


mergify bot requested a review from a team April 22, 2020 15:27

Borda assigned tullie Apr 22, 2020

tullie suggested changes Apr 22, 2020


mergify bot requested a review from a team April 22, 2020 21:38

Borda requested review from tullie and Borda April 23, 2020 11:31

Borda added the bug Something isn't working label Apr 23, 2020

Borda added this to the 0.7.4 milestone Apr 23, 2020

Borda approved these changes Apr 23, 2020


mergify bot requested a review from a team April 23, 2020 11:34

justusschock approved these changes Apr 24, 2020


mergify bot requested a review from a team April 24, 2020 07:55

Borda modified the milestones: 0.7.4, 0.7.5 Apr 24, 2020

awaelchli reviewed Apr 25, 2020


mergify bot requested a review from a team April 25, 2020 23:21

Borda requested review from Borda, awaelchli and justusschock May 4, 2020 21:08

awaelchli mentioned this pull request May 5, 2020

IterableDataset does not work in validation #1731

Closed

awaelchli linked an issue May 5, 2020 that may be closed by this pull request

IterableDataset does not work in validation #1731

Closed

Borda unassigned tullie May 5, 2020

Borda approved these changes May 5, 2020


mergify bot requested a review from a team May 5, 2020 15:32

Borda added the ready PRs ready to be merged label May 5, 2020

ybrovman added 2 commits May 5, 2020 17:48

removed if dl from _reset_eval_dataloader()

da7045b

changed to if dl != None to be more safe

b5ff528

hints from pep8speaks

6e805a0

Borda force-pushed the dataloader branch from b03a4a1 to 6e805a0 Compare May 5, 2020 15:48

awaelchli reviewed May 5, 2020


mergify bot requested a review from a team May 5, 2020 17:22

tullie approved these changes May 5, 2020


williamFalcon merged commit 35bbe17 into Lightning-AI:master May 5, 2020

ybrovman mentioned this pull request May 5, 2020

added warning for None dataloader #1745

Merged

5 tasks

SiddhantRanade mentioned this pull request Aug 13, 2020

Using IterableDatasets without __len__ for Training #2955

Closed

SiddhantRanade mentioned this pull request Aug 13, 2020

Fix enforce_datamodule_dataloader_override() for iterable datasets #2957

Merged

7 tasks
