
Metrics support mask #54

Closed
YuxianMeng opened this issue Jul 11, 2020 · 27 comments
Labels: enhancement (New feature or request), help wanted (Extra attention is needed), wontfix

Comments

@YuxianMeng

YuxianMeng commented Jul 11, 2020

🚀 Feature

It would be better if current metrics like Accuracy/Recall supported a mask.

Motivation

For example, when I work on a sequence labeling task and pad sequences to a maximum length, I do not want to compute metrics at the padding positions.

Pitch

I guess a simple manipulation would work for accuracy (here is the original one for reference):

from typing import Any, Optional

import torch
from pytorch_lightning.metrics.functional.classification import (
    accuracy,
)
from pytorch_lightning.metrics.metric import TensorMetric


class MaskedAccuracy(TensorMetric):
    """
    Computes the accuracy classification score
    Example:
        >>> pred = torch.tensor([0, 1, 2, 3])
        >>> target = torch.tensor([0, 1, 2, 2])
        >>> mask = torch.tensor([1, 1, 1, 0])
        >>> metric = MaskedAccuracy(num_classes=4)
        >>> metric(pred, target, mask)
        tensor(1.)
    """

    def __init__(
        self,
        num_classes: Optional[int] = None,
        reduction: str = 'elementwise_mean',
        reduce_group: Any = None,
        reduce_op: Any = None,
    ):
        """
        Args:
            num_classes: number of classes
            reduction: a method for reducing accuracies over labels (default: takes the mean)
                Available reduction methods:
                - elementwise_mean: takes the mean
                - none: pass array
                - sum: add elements
            reduce_group: the process group to reduce metric results from DDP
            reduce_op: the operation to perform for ddp reduction
        """
        super().__init__(name='accuracy',
                         reduce_group=reduce_group,
                         reduce_op=reduce_op)
        self.num_classes = num_classes
        self.reduction = reduction

    def forward(self, pred: torch.Tensor, target: torch.Tensor, mask: torch.Tensor) -> torch.Tensor:
        """
        Actual metric computation
        Args:
            pred: predicted labels
            target: ground truth labels
            mask: only calculate metrics where mask == 1
        Return:
            A Tensor with the classification score.
        """
        # positions where mask == 0 should be ignored by the metric
        mask_fill = ~mask.bool()
        # out-of-place masked_fill so the caller's tensors are not modified in place
        pred = pred.masked_fill(mask=mask_fill, value=-1)
        target = target.masked_fill(mask=mask_fill, value=-1)

        return accuracy(pred=pred, target=target,
                        num_classes=self.num_classes, reduction=self.reduction)

Alternatives

Additional context

@Borda
Member

Borda commented Jul 11, 2020

Looks nice! @YuxianMeng want to send it as a PR?
Cc: @justusschock @SkafteNicki

@YuxianMeng
Author

Looks nice! @YuxianMeng want to send it as a PR?
Cc: @justusschock @SkafteNicki

My pleasure :) A small question: should this PR contain only the masked precision metrics, or other metrics as well?

@Borda
Member

Borda commented Jul 11, 2020

I would say all of them. In fact, it would be nice to have an abstract function/class that does the masking, so the new metrics are just applications of it, for example:

  • for functional: make a wrapper that does the masking, and all the new masked-like functions will call the existing functions through this wrapper (see the sketch below)
  • for class: make an abstract class, and the masked-like metrics will be created by inheriting from both the mask class and the metric class

Does it make sense? @justusschock @SkafteNicki
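
A rough sketch of the functional wrapper idea (names here are just placeholders, nothing like this exists in the package yet):

from typing import Callable

import torch


def masked_metric(metric_fn: Callable[..., torch.Tensor]) -> Callable[..., torch.Tensor]:
    """Wrap a functional metric so it only sees positions where mask == 1."""

    def wrapped(pred: torch.Tensor, target: torch.Tensor, mask: torch.Tensor, **kwargs) -> torch.Tensor:
        keep = mask.bool()
        # drop the masked positions before the underlying metric ever sees them
        return metric_fn(pred[keep], target[keep], **kwargs)

    return wrapped


# usage (hypothetical): masked_accuracy = masked_metric(accuracy)
# value = masked_accuracy(pred, target, mask, num_classes=4)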

@justusschock
Member

@YuxianMeng But with your implementation, you also compute the metric for the values you set to -1, I think.

What you instead need to do is accuracy(pred[mask], target[mask]), which is why, to be honest, I wouldn't add extras for this. We can't include every special case here, and masking tensors is not much overhead, so I'd prefer not to include this in the metrics package. Thoughts @SkafteNicki?

@YuxianMeng
Author

YuxianMeng commented Jul 12, 2020

@YuxianMeng But with your implementation, you also compute the metric for the values you set to -1, I think.

What you instead need to do is accuracy(pred[mask], target[mask]), which is why, to be honest, I wouldn't add extras for this. We can't include every special case here, and masking tensors is not much overhead, so I'd prefer not to include this in the metrics package. Thoughts @SkafteNicki?

@justusschock As for accuracy, actually only the non-negative classes are calculated. I thought about using accuracy(pred[mask], target[mask]), but it may cause speed problems when training on a TPU, since boolean indexing produces tensors with data-dependent shapes.

@SkafteNicki
Member

I agree with @Borda that this should be an abstract function/class. The simplest option, in my opinion, would be a class that the user can wrap their already existing metric with: masked_accuracy = MaskedMetric(Accuracy()). This would add an additional argument to the call: value = masked_accuracy(pred, target, mask). The alternative, re-writing each metric to include this feature, is not feasible at the moment.
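
Something along these lines, just as a rough sketch (MaskedMetric is a hypothetical name, and the wrapped metric is assumed to be callable as metric(pred, target)):

import torch


class MaskedMetric:
    """Sketch of a wrapper that applies a mask before forwarding to an existing metric."""

    def __init__(self, base_metric):
        self.base_metric = base_metric

    def __call__(self, pred: torch.Tensor, target: torch.Tensor, mask: torch.Tensor) -> torch.Tensor:
        keep = mask.bool()
        # only the unmasked positions reach the wrapped metric
        return self.base_metric(pred[keep], target[keep])


# usage (hypothetical): masked_accuracy = MaskedMetric(Accuracy())
# value = masked_accuracy(pred, target, mask)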

@Borda
Member

Borda commented Aug 3, 2020

@YuxianMeng mind sending a PR? I guess @SkafteNicki or @justusschock could help/guide you through 🐰

@justusschock
Member

I think I speak for both of us, saying that we'd for sure do that and really appreciate the PR :)

@SkafteNicki
Member

Yes just ping us in the PR when you are ready, and we will assist you.

@YuxianMeng
Author

Working on it, I will let you know when I'm ready :)

@stale stale bot closed this as completed Oct 28, 2020
@hadim

hadim commented Nov 19, 2020

This issue has been closed. Has the masked metrics feature landed, or has nobody worked on it yet?

@SkafteNicki
Member

It was closed due to no activity, so it is still not a part of lightning. @hadim please feel free to pick it up and send a PR :]

@davzaman

davzaman commented Mar 6, 2021

I have implemented a version of this in my own project, would anyone like to collaborate on making a PR for this?

@Borda Borda reopened this Mar 6, 2021
@SkafteNicki
Member

@davzaman please yes, would be a great addition :]

@davzaman

I didn't implement it as a class wrapper, but I have a few ideas on how to do it. It might take me a little while as I have deadlines for other things, but I will be working on this!

@Borda Borda transferred this issue from Lightning-AI/pytorch-lightning Mar 12, 2021
@Borda Borda added the enhancement and help wanted labels Mar 17, 2021
@davzaman

Hello, perhaps I'm missing something but I'm not sure that there's a one-size-fits-all answer to this that can just be implemented as a wrapper.

I may be breaking down the problem incorrectly. For metrics with a simple internal sum, just zeroing out the masked positions will suffice. For metrics that have an internal mean, the generic solution would be to sum over dim=1, replace n_obs with mask.sum(axis=1), and divide only where the denominator (the number of unmasked elements in the row) is not 0. However, I'm not quite sure how to cover all metrics; I feel like I could be missing scenarios. I'm also not sure if there are metrics that divide pred/target, or whether we want to support custom metrics.

What are your thoughts?
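
For the mean case, something like this rough sketch is what I have in mind (illustration only, not matched to the actual metric internals):

import torch


def masked_mean(values: torch.Tensor, mask: torch.Tensor) -> torch.Tensor:
    """Row-wise mean over unmasked positions only.

    values: (batch, seq_len) per-element scores (e.g. 1.0 where pred == target, else 0.0)
    mask:   (batch, seq_len) with 1 where the position counts, 0 where it is padding
    """
    mask = mask.to(values.dtype)
    summed = (values * mask).sum(dim=1)  # masked positions contribute 0 to the sum
    n_obs = mask.sum(dim=1)              # number of unmasked elements per row
    # guard against rows that are fully masked (denominator of 0)
    return torch.where(n_obs > 0, summed / n_obs.clamp(min=1), torch.zeros_like(summed))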

@SkafteNicki
Member

@davzaman I definitely see the problem. My original idea for this feature was a simple wrapper that just internally does metric(pred[mask], target[mask]) when the user calls metric(pred, target, mask) (or something similar). However, that would not work for all metrics, I guess.

@davzaman

Yeah, I think each metric would need its own implementation. There isn't an insane number of metrics, but the overhead of including tests for all of them might be a lot. Should we just let users figure out masking on their own? Is there something we can at least include to make the process easier?

@stale stale bot added the wontfix label Jun 1, 2021
@Lightning-AI Lightning-AI deleted a comment from stale bot Jun 1, 2021
@stale stale bot removed the wontfix label Jun 1, 2021
@Lightning-AI Lightning-AI deleted a comment from stale bot Jul 16, 2021
@Lightning-AI Lightning-AI deleted a comment from github-actions bot Jul 16, 2021
@Borda
Member

Borda commented Jul 16, 2021

@davzaman @SkafteNicki how is it going here?

@davzaman

Hi @Borda, we ran into issues trying to follow a one-size-fits-all approach to adding masks to metrics, since the internal computations can be very different (which changes the logic required to properly compute a "masked" version of the metric). I wasn't sure how to proceed from here. From what I could tell, it would be best to have a masked version of each metric separately, even though that is more work. There's a chance there's a solution I didn't see.

@yassersouri
Contributor

I think this issue is related to #362.

@Borda Borda removed the help wanted label Sep 20, 2021
@Borda
Member

Borda commented Nov 3, 2021

@davzaman @yassersouri could you please open a draft PR so we have a more concrete discussion...
and eventually we can help find a solution? 🐰
cc: @justusschock @SkafteNicki

@yassersouri
Contributor

@Borda Sorry, but I am quite busy right now. I don't think I will have time to allocate to this or #362.

@davzaman

@Borda I don't think I have the time to allocate to this at the moment but I am happy to help move things along.

@Borda Borda added the help wanted label Nov 10, 2021
@stale

stale bot commented Jan 10, 2022

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

@ZhaofengWu

Bumping this. I believe ignore_index could be fragile in certain circumstances, and a mask would be more reliable. For example, while it is a sensible default to set ignore_index to the padding token index of a tokenizer, some tokenizers, such as GPT-2's, do not have a padding token. Masking is also how the AllenNLP metrics (https://github.com/allenai/allennlp/tree/main/allennlp/training/metrics) deal with this. I believe individually implementing a mask for each metric is probably a good idea. I definitely understand it's a non-trivial amount of work, but I believe it's worth it in the long run, and hopefully the AllenNLP implementations can be a useful reference for some common metrics.
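
For reference, the workaround available today is to mask before updating the metric; a sketch with the current torchmetrics API (assuming a recent version that provides MulticlassAccuracy):

import torch
from torchmetrics.classification import MulticlassAccuracy

acc = MulticlassAccuracy(num_classes=4, average="micro")

preds = torch.tensor([[0, 1, 2, 3]])
target = torch.tensor([[0, 1, 2, 2]])
mask = torch.tensor([[1, 1, 1, 0]], dtype=torch.bool)

# drop the padded positions before they ever reach the metric
acc.update(preds[mask], target[mask])
print(acc.compute())  # all unmasked predictions are correct -> 1.0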

@davzaman

There was also an attempt to support masked tensors, but I think it has died out: https://github.com/pytorch/maskedtensor
