
Validation loader flat loss #59

Closed
Dewald928 opened this issue Aug 4, 2020 · 5 comments
Dewald928 commented Aug 4, 2020

I copied your example notebook to Colab and ran the code without changing anything, but the validation loss I get goes flat, which is clearly wrong when compared to your example. I see the same thing with my other networks: the loss just goes flat.

You can see my results from colab and your example in the figures below.

EDIT: If I replace val_iter with val_loader inside loss = self._validate(...), it does seem to "work" as I'd expect. So there seems to be a mistake somewhere in how val_iter is iterated.

(Figures: my Colab run vs. your example notebook)
NaleRaphael commented Aug 4, 2020

@Dewald928 Thanks for reporting this.

After a quick run of that notebook ("examples/lrfinder_mnist.ipynb"), I found that range_test() with val_loader runs just as quickly as it does without val_loader, which is clearly abnormal. I'll keep investigating.

UPDATE: Currently, it seems there is something wrong in commit 52c189a.

NaleRaphael commented:
OK, I've figured out why range_test() with a val_loader runs as quickly as it does without one.

In range_test(), the following loop works normally only on the first iteration:

# @LRFinder.range_test()
# ...
for iteration in tqdm(range(num_iter)):
    # Train on batch and retrieve loss
    loss = self._train_batch(
        train_iter,
        accumulation_steps,
        non_blocking_transfer=non_blocking_transfer,
    )
    if val_loader:
        loss = self._validate(
            val_iter, non_blocking_transfer=non_blocking_transfer
        )
# ...
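As a side note, the pitfall boils down to plain Python iterator semantics: once an iterator is exhausted, looping over it again simply executes zero iterations. A minimal sketch, with no torch involved; `validate` below is a hypothetical stand-in for `LRFinder._validate()`, not the library's actual code:

```python
# Hypothetical minimal reproduction of the pitfall, using a plain
# Python iterator in place of a real DataLoader iterator.
def validate(batches):
    # Mirrors the shape of a validation pass: accumulate a running
    # loss over the batches, starting from a default of 0.0.
    running_loss = 0.0
    count = 0
    for loss in batches:  # yields nothing if `batches` is already exhausted
        running_loss += loss
        count += 1
    return running_loss / count if count else running_loss

val_iter = iter([0.9, 0.8, 0.7])   # stands in for val_iter._iterator

first = validate(val_iter)   # consumes all values, returns their mean
second = validate(val_iter)  # iterator exhausted, returns the default 0.0
```

Here `first` is the real mean loss, while `second` is 0.0 because the `for` loop body never runs on the exhausted iterator.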

However, val_iter._iterator has already run out of values after that first iteration, and it is never reset on the following iterations. Hence self._validate() does nothing and just returns its default output, 0.0 (the initial value of running_loss in that method). The loss returned by self._train_batch() is therefore overwritten with 0.0, and it is then re-calculated by the following code:

# Track the best loss and smooth it if smooth_f is specified
if iteration == 0:
    self.best_loss = loss
else:
    if smooth_f > 0:
        loss = smooth_f * loss + (1 - smooth_f) * self.history["loss"][-1]
    if loss < self.best_loss:
        self.best_loss = loss

And that's why the lr-loss curve goes flat, as in the result you provided.
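The flatness itself is easy to reproduce: once the reported loss is stuck at 0.0, the smoothing step just multiplies the previous history entry by `1 - smooth_f` on every iteration. A sketch with an assumed initial loss of 2.3 and the default-looking `smooth_f = 0.05` (both values hypothetical):

```python
# Sketch of why the curve looks flat: after iteration 0 the validation
# loss is always 0.0, and smoothing turns that into a slow geometric
# decay of the initial value instead of a real loss curve.
smooth_f = 0.05
history = [2.3]  # real loss from iteration 0 (assumed value)
for _ in range(1, 100):
    loss = 0.0  # _validate() returns its default on every later iteration
    loss = smooth_f * loss + (1 - smooth_f) * history[-1]
    history.append(loss)
```

Each step only multiplies the previous value by 0.95, so on a plot the curve drifts down very gradually rather than tracking the true loss.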

I'll make a patch for it later.

NaleRaphael added a commit to NaleRaphael/pytorch-lr-finder that referenced this issue Aug 4, 2020
MAINT: make `ValDataLoaderIter` able to be reset only when it is acquired by the syntax of normal `iterator`

In `LRFinder.range_test()`, `val_iter` isn't reset after it runs out of values, which makes `LRFinder._validate()` fail to work correctly after the first iteration of `range_test()`.

To fix it, we add a counter that tracks how many times a `ValDataLoaderIter` has run (i.e. how many times `__next__()` has been called), and reset it only when its `__iter__()` is called, so that it isn't reset automatically the way `TrainDataLoaderIter` is.

See also davidtvs#59 and the docstring of `ValDataLoaderIter` for further
details.
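A minimal sketch of the idea described in the commit message, using a plain list in place of a real `DataLoader`; the class and attribute names here are assumed from the commit text, not copied from the library:

```python
# Sketch: an iterator wrapper that is reset only via __iter__(), so a
# `for batch in val_iter` loop gets a fresh pass each time, while bare
# next(val_iter) calls never trigger an automatic reset.
class ValDataLoaderIter:
    def __init__(self, data_loader):
        self.data_loader = data_loader
        self._iterator = iter(data_loader)
        self.run_counter = 0  # counts __next__() calls since the last reset

    def __iter__(self):
        # Reset only when acquired by normal iterator syntax,
        # e.g. `for batch in val_iter: ...`
        if self.run_counter > 0:
            self._iterator = iter(self.data_loader)
            self.run_counter = 0
        return self

    def __next__(self):
        self.run_counter += 1
        return next(self._iterator)

val_iter = ValDataLoaderIter([1, 2, 3])
first = list(val_iter)   # consumes everything
second = list(val_iter)  # __iter__() resets it, so a full pass again
```

With this, every `_validate()`-style `for` loop over the wrapper sees the whole dataset again, instead of an exhausted iterator.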
davidtvs pushed a commit that referenced this issue Aug 20, 2020
MAINT: make `ValDataLoaderIter` able to be reset only when it is acquired by the syntax of normal `iterator` (#60)

* MAINT: make `ValDataLoaderIter` able to be reset only when it is
acquired by the syntax of normal `iterator`

In `LRFinder.range_test()`, `val_iter` isn't reset after it runs out
of values, which makes `LRFinder._validate()` fail to work correctly
after the first iteration of `range_test()`.

To fix it, we add a counter that tracks how many times a
`ValDataLoaderIter` has run (i.e. how many times `__next__()` has been
called), and reset it only when its `__iter__()` is called, so that it
isn't reset automatically the way `TrainDataLoaderIter` is.

See also #59 and the docstring of `ValDataLoaderIter` for further
details.

* TST: add tests for `TrainDataLoaderIter` and `ValDataLoaderIter`

* MAINT: remove redundant argument in `ValDataLoaderIter.__init__()`

* TST: add tests to check valid types of `train_loader` and `val_loader`
used in `range_test()`

* TST: explicitly specify the message of exception we expect to get
ivanpanshin commented:

Hey, man. Any progress on this issue?

davidtvs (Owner) commented:

@NaleRaphael fixed the issue in #60, which has been merged to the master branch.

I'm closing this issue. Thanks for reporting it, and feel free to reopen if needed.

ivanpanshin commented:

Yeah, that was my bad, I didn't see those commits. I updated the package, and now evaluation with the validation loader takes much longer to finish, so I guess it works now.
