
New modular metric interface #2528

Merged
merged 42 commits into master from new_metric_interface on Aug 26, 2020

Conversation

@SkafteNicki (Member) commented Jul 6, 2020

What does this PR do?

Fixes #3069

This is a proposal for how an extension of the modular interface for the
metrics package could look. What our interface is missing is the option to do computations
after the ddp sync. Consider the following example for RMSE:

20 samples, 2 devices, 10 samples on each device.
With our current setup, each device would return a metric value, which then
gets synced.

The first device returns (assume its sum of squared errors is 200)
sqrt(1/10 * 200) = 4.47
The second device returns (assume its sum of squared errors is 100)
sqrt(1/10 * 100) = 3.16
After the ddp sync the value that is returned would be
(4.47 + 3.16) / 2 = 3.815
But the correct value is:
sqrt(1/20 * 300) = 3.873

It is possible to arrive at the correct result if each device instead returns
a *metric state*, i.e. a collection of values that can be synced and from which the correct
result can be derived (in the above case, the number of samples and the
sum of squared errors). Thus, we need to extend the modular interface so
that we (or the user) can do calculations after the ddp sync, as the sketch below illustrates.
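
Here is a minimal plain-Python sketch of the difference (the numbers mirror the example above; this is an illustration only, not code from this PR):

import math

# per-device metric state from the example above:
# sum of squared errors and number of samples on each of the two devices
sse = [200.0, 100.0]
n = [10, 10]

# current behaviour: each device computes its own RMSE, then the values are averaged
naive = sum(math.sqrt(s / k) for s, k in zip(sse, n)) / 2   # ~= 3.82

# proposed behaviour: sync the state (sse, n) first, then compute once
correct = math.sqrt(sum(sse) / sum(n))                      # ~= 3.87

print(naive, correct)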

This PR therefore proposes to go from our decorator-oriented modular interface
to a hook-based interface. All hooks are optional, so the user only needs
to implement forward if they inherit from either TensorMetric or NumpyMetric.

Hooks to add:

  • input_convert -> converts the input to the correct type
  • output_convert -> converts the output of forward
  • ddp_sync -> performs the ddp sync
  • compute -> the missing part, which enables computations after the ddp sync (see the sketch after this list)
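
As a rough sketch (illustrative only, not the final API from this PR), a metric could then return its state from forward and derive the final value in compute. The hook signature below mirrors the torch.nn.Module forward-hook convention used in the Accumulate example later in this thread, and the TensorMetric name argument is assumed from the existing interface:

import torch
from pytorch_lightning.metrics.metric import TensorMetric  # existing base class

class RMSE(TensorMetric):
    def __init__(self):
        super().__init__(name="rmse")  # assumed constructor argument

    def forward(self, pred: torch.Tensor, target: torch.Tensor):
        # return the metric *state*: sum of squared errors and sample count
        sse = torch.sum((pred - target) ** 2)
        n = torch.tensor(pred.numel(), dtype=pred.dtype, device=pred.device)
        return sse, n

    @staticmethod
    def compute(module, input, output):
        # runs as a forward hook once the state has been summed across processes
        sse, n = output
        return torch.sqrt(sse / n)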

Note: this PR only implements the hooks; I still need to go over each metric
and fix those where the compute hook is needed.
That will be done in a follow-up PR, since this one is already extensive enough.

Tagging @justusschock and @Borda for opinion

Before submitting

  • Was this discussed/approved via a Github issue? (no need for typos and docs improvements)
  • Did you read the contributor guideline, Pull Request section?
  • Did you make sure your PR does only one thing, instead of bundling different changes together? Otherwise, we ask you to create a separate PR for every change.
  • Did you make sure to update the documentation with your changes?
  • Did you write any new necessary tests?
  • Did you verify new and existing tests pass locally with your changes?
  • If you made a notable change (that affects users), did you update the CHANGELOG?

PR review

Anyone in the community is free to review the PR once the tests have passed.
If we didn't discuss your PR in Github issues there's a high chance it will not be merged.

Did you have fun?

Make sure you had fun coding 🙃

mergify bot (Contributor) commented Jul 6, 2020

This pull request is now in conflict... :(

mergify bot requested a review from a team July 6, 2020 11:32
@justusschock (Member) left a comment

Overall I really like these changes!

I feel like we should automatically sync the output of forward whenever we are on ddp, and then call aggregate on it. We could also think about defaults (i.e. aggregate could be a simple mean by default).

So the next step would be to revisit all metrics for reduction?

Edit: This should also make it simpler to pickle metrics, since we only need to make sure the converters can be pickled :)
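
A hypothetical sketch of such a default (the function name and shape convention here are assumptions, not part of this PR): once the per-process outputs have been gathered into a tensor with a leading world-size dimension, the default aggregation would simply be a mean over that dimension:

import torch

def default_aggregate(gathered: torch.Tensor) -> torch.Tensor:
    # gathered has shape (world_size, *metric_shape); reduce with a plain mean
    return gathered.mean(dim=0)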

Review thread on pytorch_lightning/metrics/metric.py (outdated, resolved)
mergify bot requested a review from a team July 6, 2020 12:10
Borda added the "feature" (Is an improvement or enhancement) and "Important" labels Jul 6, 2020
Borda added this to "in Progress" in the Metrics package project via automation Jul 6, 2020
Borda added this to the 0.9.0 milestone Jul 6, 2020
@SkafteNicki (Member, Author)

@justusschock thanks, it took me a while to figure out how to keep this as close to native pytorch as possible while still being expressive enough to support the features we need.
I think the first step is to change the backend to accommodate the features that we need and then do the actual implementation of the features.

For aggregation over multiple batches, one way to achieve this is to introduce a new Accumulate class that inserts a hook just before the compute hook and accumulates the states:

from torch import nn

class Accumulate(nn.Module):
    """Wraps a metric and keeps a running metric state across batches."""

    def __init__(self, base_metric):
        super().__init__()
        self.base_metric = base_metric
        # remove the compute hook (registered last), insert save_state, then re-add compute
        self.base_metric._forward_hooks.popitem(last=True)
        self.base_metric.register_forward_hook(self.save_state)
        self.base_metric.register_forward_hook(self.base_metric.compute)
        self.state = None

    def forward(self, *args, **kwargs):
        return self.base_metric(*args, **kwargs)

    def save_state(self, module, input, output):
        # forward-hook signature: accumulate the metric state element-wise
        if self.state is None:
            self.state = output
        else:
            state = [s + o for s, o in zip(self.state, output)]
            self.state = state
            output = state
        return output

    def reset(self):
        self.state = None


# Use case
metric = Accumulate(Metric(...))
for (data, target) in dataloader:
    pred = model(data)
    val = metric(pred, target)  # val will be the accumulated value

codecov bot commented Jul 6, 2020

Codecov Report

Merging #2528 into master will decrease coverage by 9%.
The diff coverage is 88%.

@@           Coverage Diff            @@
##           master   #2528     +/-   ##
========================================
- Coverage      90%     81%     -9%     
========================================
  Files          81      84      +3     
  Lines        7858    9321   +1463     
========================================
+ Hits         7034    7518    +484     
- Misses        824    1803    +979     

@justusschock (Member)

@SkafteNicki I see your point. But isn't it basically the same whether you accumulate across nodes or across different batches on the same node? Handling both the same way would probably avoid some code duplication.

We should have a chat on Slack about the ideal integration once we have finished the metrics API.

@SkafteNicki (Member, Author)

@justusschock syncing between nodes and accumulating across batches on the same node probably is the same operation, so I agree it should be handled in roughly the same way. The only difference is that accumulation on the same node should probably be a feature the user can enable/disable (i.e. an accumulate flag in the metric __init__); a rough sketch of that flag follows below.

For now, I will rename the ddp_sync hook to aggregate as you proposed, and it will still only support syncing between nodes. Then we can discuss the remaining parts on Slack.
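
A rough sketch of what such a flag could look like (the class fragment, the flag name, and the helper below are hypothetical, not code from this PR):

from torch import nn

class Metric(nn.Module):
    def __init__(self, name: str, accumulate: bool = False):
        super().__init__()
        self.name = name
        self.accumulate = accumulate  # if True, keep a running state across batches
        self._state = None

    def _update_state(self, new_state):
        # keep a running element-wise sum of the metric state when accumulation is enabled
        if not self.accumulate or self._state is None:
            self._state = new_state
        else:
            self._state = [s + n for s, n in zip(self._state, new_state)]
        return self._state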

mergify bot (Contributor) commented Jul 8, 2020

This pull request is now in conflict... :(

mergify bot (Contributor) commented Jul 21, 2020

This pull request is now in conflict... :(

@justusschock (Member)

@SkafteNicki How is it going with this PR?

mergify bot (Contributor) commented Aug 5, 2020

This pull request is now in conflict... :(

mergify bot (Contributor) commented Aug 7, 2020

This pull request is now in conflict... :(

mergify bot requested a review from a team August 25, 2020 19:25
Review thread on CHANGELOG.md (outdated, resolved)
mergify bot requested a review from a team August 25, 2020 19:33
mergify bot requested a review from a team August 25, 2020 21:09
mergify bot requested a review from a team August 25, 2020 21:12
@Borda (Member) left a comment

LGTM 🐰

Metrics package automation moved this from in Progress to in Review Aug 26, 2020
Borda requested a review from rohitgr7 August 26, 2020 08:27
Borda added the "ready" (PRs ready to be merged) label Aug 26, 2020
@Borda (Member) commented Aug 26, 2020

@SkafteNicki some metric test seems to be hanging...

Borda removed the "ready" (PRs ready to be merged) label Aug 26, 2020
@SkafteNicki (Member, Author)

@Borda which tests are hanging? I cannot figure it out from the drone details.

@Borda (Member) commented Aug 26, 2020

> @Borda which tests are hanging? I cannot figure it out from the drone details.

see this build - http://35.192.60.23/PyTorchLightning/pytorch-lightning/8999
the last passing test was tests/metrics/test_metrics.py::test_metric[metric1] PASSED [ 37%], so the next one hung...

@SkafteNicki (Member, Author)

@Borda fixed the bug, we can merge this now :]

Borda added the "ready" (PRs ready to be merged) label Aug 26, 2020
Borda merged commit 17d8773 into Lightning-AI:master Aug 26, 2020
Metrics package automation moved this from in Review to Done Aug 26, 2020
SkafteNicki mentioned this pull request Sep 18, 2020
SkafteNicki deleted the new_metric_interface branch October 8, 2020 14:21
Labels
"feature" (Is an improvement or enhancement), "ready" (PRs ready to be merged)
Projects
No open projects
Development

Successfully merging this pull request may close these issues.

metrics.sklearns don't work with DDP: Tensors must be CUDA and dense
6 participants