
Bootstrap wrapper for metrics? #96

Closed · maximsch2 opened this issue Mar 16, 2021 · 2 comments · Fixed by #101

Labels: enhancement (New feature or request), help wanted (Extra attention is needed)

Comments

@maximsch2 (Contributor)

🚀 Feature

We should provide the ability to compute bootstrapped confidence intervals for metrics.

Motivation

Confidence intervals are important, and we should make it easy for people to increase the rigor of their research and model evaluations.

Pitch

I'm thinking we can have something like this (very high level):

from copy import deepcopy
from torch import nn
from torchmetrics import Metric

class Bootstrapper(Metric):
    def __init__(self, num_samples, metric):
        super().__init__()
        self.num_samples = num_samples
        # one independent copy of the wrapped metric per bootstrap sample
        self.metrics = nn.ModuleList([deepcopy(metric) for _ in range(num_samples)])

    def update(self, preds, targets):
        for idx in range(self.num_samples):
            # resample with replacement, then update the matching copy
            preds_sampled, targets_sampled = sample_for_bootstrap(preds, targets)
            self.metrics[idx].update(preds_sampled, targets_sampled)

This would let people wrap any metric: the wrapper keeps a set of internal copies of the metric, each updated with a different resampling of the data, which then gives us a distribution of metric values.
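The sample_for_bootstrap helper above is left undefined in the sketch; a minimal version, assuming it just resamples the batch with replacement along the first dimension, could look like:

import torch

def sample_for_bootstrap(preds, targets):
    # draw a bootstrap sample: batch indices chosen with replacement
    idx = torch.randint(len(preds), (len(preds),), device=preds.device)
    return preds[idx], targets[idx]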

Alternatives

We could skip this on the class-based metrics side and assume that anyone doing bootstrapping will load everything into memory and bootstrap using the functional metrics.
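As a rough illustration of that alternative (metric_fn here is just a placeholder for any functional metric taking preds and targets):

import torch

def bootstrap_functional(metric_fn, preds, targets, num_samples=100):
    # preds/targets are fully in memory; resample with replacement each round
    vals = []
    for _ in range(num_samples):
        idx = torch.randint(len(preds), (len(preds),))
        vals.append(metric_fn(preds[idx], targets[idx]))
    return torch.stack(vals)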

maximsch2 added the enhancement and help wanted labels Mar 16, 2021
@SkafteNicki (Member)

Hi @maximsch2, great idea. Would you be up for doing a PR?
Just to be clear, would the compute method look something like:

def compute(self):
    computed_vals = torch.stack([m.compute() for m in self.metrics], dim=0)
    return computed_vals.mean(dim=0), computed_vals.std(dim=0)

or would you just return the computed values?

@maximsch2 (Contributor, Author)

Yes, your compute function is approximately what I have in mind. We still need to decide how to represent the bootstrapped result (e.g. mean + std, or maybe the 5%, 50%, and 95% percentiles, possibly even configurable).
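For instance, a configurable variant of that compute, assuming the wrapper stores the desired quantiles as self.quantiles (say, a tensor like [0.05, 0.5, 0.95]; the name and values are just illustrative), might look roughly like:

def compute(self):
    computed_vals = torch.stack([m.compute() for m in self.metrics], dim=0)
    # summarize the bootstrap distribution at the requested quantiles
    return torch.quantile(computed_vals, self.quantiles, dim=0)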

I don't have much bandwidth to do it right now, so I'm throwing it out there in case someone wants to take it on. I might get back to it in the future, though.

SkafteNicki mentioned this issue Mar 17, 2021
SkafteNicki self-assigned this Mar 26, 2021