
Correctly reset metric objects in self.log #7055

Merged
merged 12 commits into Lightning-AI:master from metric_correct_reset on Apr 19, 2021

Conversation

@SkafteNicki (Member) commented Apr 16, 2021

What does this PR do?

Fixes #7052
Since commit Lightning-AI/torchmetrics@19b77cc in torchmetrics, the integration with lightning has been broken whenever a metric object is logged directly. This is because lightning's logging implicitly depends on the self._computed variable not being cleared whenever reset is called.

This PR tries to solve this by making sure that reset is called on metric objects only once, at the very end. Please note that I am not very familiar with the logger-connector part of the code base.

As an alternative, we could revert the changes in torchmetrics. However, it does make sense that the self._computed variable should be cleared when reset is called.
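
To make the failure mode concrete, here is a minimal toy sketch (a hypothetical class, not the actual torchmetrics implementation) of why clearing the self._computed cache inside reset() breaks any logging code that reads the cached value afterwards:

    class ToyMetric:
        # Hypothetical stand-in for torchmetrics.Metric, for illustration only.
        def __init__(self):
            self.total, self.count = 0.0, 0
            self._computed = None  # cache of the last compute() result

        def update(self, value):
            self.total += value
            self.count += 1

        def compute(self):
            self._computed = self.total / self.count
            return self._computed

        def reset(self):
            self.total, self.count = 0.0, 0
            self._computed = None  # the torchmetrics change: the cache is cleared too

    metric = ToyMetric()
    metric.update(1.0)
    metric.compute()         # caches 1.0
    metric.reset()           # clears the cache
    print(metric._computed)  # None: a logger reading the cache at epoch end now fails

Deferring reset() until after the logged values have been read, as this PR does, avoids the problem without reverting the torchmetrics change.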

cc: @ethanwharris

Before submitting

  • Was this discussed/approved via a GitHub issue? (not for typos and docs)
  • Did you read the contributor guideline, Pull Request section?
  • Did you make sure your PR does only one thing, instead of bundling different changes together?
  • Did you make sure to update the documentation with your changes? (if necessary)
  • Did you write any new necessary tests? (not for typos and docs)
  • Did you verify new and existing tests pass locally with your changes?
  • Did you update the CHANGELOG? (not for typos, docs, test updates, or internal minor changes/refactorings)

PR review

Anyone in the community is free to review the PR once the tests have passed.
Before you start reviewing, make sure you have read the Review guidelines. In short, see the following bullet list:

  • Is this pull request ready for review? (if not, please submit in draft mode)
  • Check that all items from Before submitting are resolved
  • Make sure the title is self-explanatory and the description concisely explains the PR
  • Add labels and milestones (and optionally projects) to the PR so it can be classified

Did you have fun?

Make sure you had fun coding 🙃

@SkafteNicki added the "bug" (Something isn't working) label Apr 16, 2021
@codecov bot commented Apr 16, 2021

Codecov Report

No coverage uploaded for pull request base (master@30b7440).
The diff coverage is 100%.

@@           Coverage Diff            @@
##             master   #7055   +/-   ##
========================================
  Coverage          ?     87%           
========================================
  Files             ?     196           
  Lines             ?   12577           
  Branches          ?       0           
========================================
  Hits              ?   10949           
  Misses            ?    1628           
  Partials          ?       0           

pytorch_lightning/core/step_result.py (outdated review thread, resolved)
Comment on lines +242 to +244
if self._internal_type == ResultStoreType.INSIDE_BATCH_TRAIN_LOOP:
for opt_idx in list(epoch_metrics):
epoch_metrics[opt_idx].reset()
@ananthsub (Contributor) commented Apr 16, 2021:

Could you explain this check? On the surface, "inside the batch train loop" reads like we're not at epoch end?

@SkafteNicki (Member, Author) replied:

I am not completely sure what is going on myself, but apparently self._internal_type equals ResultStoreType.INSIDE_BATCH_TRAIN_LOOP for all training metrics even at epoch end, while it equals ResultStoreType.OUTSIDE_BATCH_TRAIN_LOOP for validation metrics (at least when this reset function is called).

A Member replied:

Correct. I think BATCH should be removed.

But I'd also like to clear all this entirely at some point.
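
For context, a hedged sketch of how this branch plausibly fits together (the else-branch is an assumption on my part; it is not part of the quoted diff). Training result stores remain keyed by optimizer index even at epoch end, so the reset has to walk the dictionary, whereas validation/test stores hold a Result object directly:

    if self._internal_type == ResultStoreType.INSIDE_BATCH_TRAIN_LOOP:
        # training metrics: one Result per optimizer index
        for opt_idx in list(epoch_metrics):
            epoch_metrics[opt_idx].reset()
    else:
        # validation/test metrics: assumed to hold a single Result (not shown in the quoted diff)
        epoch_metrics.reset()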

pytorch_lightning/trainer/trainer.py (review thread, resolved)
@ananthsub (Contributor) commented:

Thanks for the fix @SkafteNicki!

Co-authored-by: ananthsub <ananth.subramaniam@gmail.com>
@ethanwharris (Member) left a review:

Working for me 😃

@edenlightning (Contributor) commented:
Nice job @SkafteNicki, thanks for the swift fix.

@ananthsub added the "priority: 0" (High priority task) label Apr 16, 2021
pep8speaks commented Apr 16, 2021

Hello @SkafteNicki! Thanks for updating this PR.

There are currently no PEP 8 issues detected in this Pull Request. Cheers! 🍻

Comment last updated at 2021-04-19 08:46:31 UTC


@Borda (Member) left a review:

lgtm

@Borda added the "ready" (PRs ready to be merged) label Apr 18, 2021
@mergify bot removed the "has conflicts" label Apr 19, 2021
@Borda enabled auto-merge (squash) April 19, 2021 08:58
@edenlightning added the "3rd party" (Related to a 3rd-party) label Apr 19, 2021
@tchaton (Contributor) left a review:

LGTM !

@@ -587,6 +578,14 @@ def get_non_metrics_keys(self):
        """
        return [k for k, v in self.items() if not isinstance(v, Metric)]

    def reset(self) -> None:
        """
        Call at the end of epoch to reset Result objects
        """
        for dl_idx in range(self.num_dataloaders):
            epoch_metrics = self._internals[dl_idx] if not self.has_reduced else self._internals_reduced[dl_idx]

A Contributor suggested renaming reset to reset_metrics:

Suggested change
    def reset(self) -> None:
    def reset_metrics(self) -> None:
A Contributor commented on the epoch_metrics line:

In current usage, self.has_reduced should always be True, right? Would it be better to add an assert self.has_reduced there and use self._internals_reduced directly?
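
A short sketch of what this suggestion would look like, assuming self.has_reduced is indeed always True by the time reset runs (a hypothetical rewrite, not code from the PR):

    for dl_idx in range(self.num_dataloaders):
        assert self.has_reduced  # fail loudly instead of silently branching
        epoch_metrics = self._internals_reduced[dl_idx]
        # ... reset epoch_metrics as in the merged version of the method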

@Borda merged commit fbee5a8 into Lightning-AI:master Apr 19, 2021
@SkafteNicki deleted the metric_correct_reset branch April 28, 2023 11:07
Labels
3rd party (Related to a 3rd-party) · bug (Something isn't working) · logging (Related to the `LoggerConnector` and `log()`) · priority: 0 (High priority task) · ready (PRs ready to be merged)
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Metrics passed to self.log giving RuntimeError
8 participants