
[Task]: Anomalib's Thresholding Mechanism Upgrade: Achieving Full Unsupervised Pipeline #1029

Closed
samet-akcay opened this issue Apr 24, 2023 · 6 comments

Comments

@samet-akcay
Contributor

What is the motivation for this task?

Background

Anomaly Detection

Anomaly detection is the process of identifying data points, events, or observations within a dataset that significantly deviate from the normal or expected behavior. Anomalies may be caused by a variety of factors, including failures, fraud, or unusual behavior.

Anomaly detection is an essential task in numerous industries, including cybersecurity, finance, healthcare, and manufacturing. It can be used to detect fraud in financial transactions, to identify anomalies in medical data for early disease diagnosis, to detect flaws in manufacturing processes, and to monitor traffic for security threats.

Anomalib

Anomalib is a deep learning library that aims to collect the best anomaly detection algorithms for testing on both public and private datasets. Anomalib offers several ready-to-use implementations of anomaly detection algorithms described in recent research, as well as a set of tools that make it easier to build and use custom models. The library has a strong focus on image-based anomaly detection, where the goal of the algorithm is to identify anomalous images or anomalous pixel regions within images in a dataset.

The thresholding problem

Anomaly detection models in Anomalib are trained only on normal images. During inference, the models are tasked with distinguishing anomalous samples from normal samples. The task is similar to a classical binary classification problem, but instead of generating a class label and a confidence score, Anomalib models generate an anomaly score, which quantifies the distance of the sample to the distribution of normal samples seen during training. The range of possible anomaly score values is unbounded and may differ widely between models and/or datasets, which makes it challenging to set a good threshold for mapping the raw anomaly scores to a binary class label (normal vs. anomalous).

Describe the solution you'd like

Anomalib currently has an adaptive thresholding mechanism in place that aims to address the thresholding problem. The adaptive thresholding mechanism computes the F1 score over a validation set for a range of thresholds, and the final threshold is the value that yields the highest F1 score. A major drawback of this approach is that the validation set must contain anomalous samples, which are not always available in real-world anomaly detection problems.
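
To make this concrete, here is a minimal sketch of such an F1-maximizing search in plain PyTorch (not Anomalib's actual implementation; the function name and the scores/labels below are made up for illustration):

import torch


def f1_adaptive_threshold(scores: torch.Tensor, labels: torch.Tensor) -> torch.Tensor:
    """Return the threshold on `scores` that maximizes the F1 score against
    binary ground-truth `labels` (1 = anomalous, 0 = normal)."""
    best_f1 = torch.tensor(0.0)
    best_threshold = scores.min()
    # Every observed score is a candidate threshold.
    for t in torch.unique(scores):
        preds = (scores >= t).long()
        tp = ((preds == 1) & (labels == 1)).sum()
        fp = ((preds == 1) & (labels == 0)).sum()
        fn = ((preds == 0) & (labels == 1)).sum()
        f1 = 2 * tp / (2 * tp + fp + fn + 1e-10)
        if f1 > best_f1:
            best_f1, best_threshold = f1, t
    return best_threshold


# Illustrative validation set: unbounded anomaly scores and binary labels.
val_scores = torch.tensor([0.12, 0.30, 0.25, 0.85, 0.90, 0.18])
val_labels = torch.tensor([0, 0, 1, 1, 1, 0])
print(f1_adaptive_threshold(val_scores, val_labels))  # tensor(0.2500)

Note that the labels passed to the search must contain anomalous samples (the 1s above), which is exactly the limitation this task aims to remove.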

The goal of this hackathon is to design a fully unsupervised thresholding mechanism that does not rely on anomalous samples.

Possible approaches

  • One possibility is to generate fake anomalous images for the validation set and then use Anomalib's existing adaptive thresholding mechanism. A simple version of this solution, based on random Perlin noise masks, is already implemented in Anomalib; different noise methods or generative models could possibly improve its results. (A simplified sketch of this idea follows this list.)
  • Use the characteristics of the anomaly score distribution of the normal training images to set a confidence interval.
  • Other, better, solutions may exist. Be creative!
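
As a rough sketch of the first idea above, the snippet below pastes a random, blob-shaped perturbation onto a normal image to produce a pseudo-anomalous sample and its mask. Upsampled uniform noise is used here as a simplified stand-in for the Perlin noise masks used in Anomalib, and all names and parameters are illustrative rather than Anomalib's actual API:

import torch
import torch.nn.functional as F


def synthetic_anomaly(image: torch.Tensor, mask_threshold: float = 0.6):
    """Blend a random perturbation into a normal (C, H, W) image in [0, 1]
    and return the augmented image together with its binary anomaly mask."""
    _, h, w = image.shape
    # Low-resolution noise upsampled to image size yields smooth blobs
    # (a simplified stand-in for a Perlin noise mask).
    noise = torch.rand(1, 1, h // 8, w // 8)
    noise = F.interpolate(noise, size=(h, w), mode="bilinear", align_corners=False)[0, 0]
    mask = (noise > mask_threshold).float()          # binary anomaly mask
    texture = torch.rand_like(image)                 # random "defect" appearance
    augmented = image * (1 - mask) + texture * mask  # paste the defect onto the image
    return augmented, mask


# Usage: turn some normal validation images into pseudo-anomalies, then reuse
# the existing F1-based adaptive thresholding on the resulting validation set.
normal_image = torch.rand(3, 256, 256)  # placeholder for a real normal image
fake_image, fake_mask = synthetic_anomaly(normal_image)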

Additional context

No response

@SimonB97

SimonB97 commented Apr 24, 2023

Hi!

First, I have a question regarding the current threshold calculation:
Is it true that the current approach is very prone to class imbalance, since the threshold relies on the F1 score and the F1 score itself is sensitive to class imbalance?

Second, I asked GPT-4 to come up with an idea for this task. This is the suggestion it gave:

  1. Modify the AnomalyScoreThreshold class to add a new method called compute_unsupervised_threshold.
import torch
from torch import Tensor
from torchmetrics import PrecisionRecallCurve


class AnomalyScoreThreshold(PrecisionRecallCurve):
    ...

    def compute_unsupervised_threshold(self, anomaly_scores: Tensor) -> Tensor:
        """Compute the unsupervised threshold based on the anomaly scores of normal training images.

        Args:
            anomaly_scores: Anomaly scores of the normal training images.

        Returns:
            Unsupervised threshold value.
        """
        mean = torch.mean(anomaly_scores)
        std = torch.std(anomaly_scores)
        confidence_interval = 3  # number of standard deviations above the mean (three-sigma rule); adjust for the desired confidence level
        threshold = mean + confidence_interval * std
        self.value = threshold
        return self.value
  2. To use this new method, pass the anomaly scores of the normal training images to compute_unsupervised_threshold after training the anomaly detection model and before using it for inference.
# Assuming you have the anomaly scores of the normal training images in a tensor called 'train_anomaly_scores'
threshold = AnomalyScoreThreshold()
unsupervised_threshold = threshold.compute_unsupervised_threshold(train_anomaly_scores)

I haven't tried this yet, but it seems reasonable to me. It might be worth giving this (or at least the underlying idea) a try.

@OscarHo1999

anomalib_team01 will be working on this

@samet-akcay
Copy link
Contributor Author

Duplicate of #1028

samet-akcay marked this as a duplicate of #1028 on Aug 7, 2023
@rishabh-akridata

rishabh-akridata commented Sep 11, 2024

@SimonB97 Have you tried the thresholding mechanism? Does it work as expected?

@SimonB97

@SimonB97 Have you tried the thresholding mechanism? Does it work as expected?

I'm not sure if I remember correctly, but I don't think I used the mechanism suggested by GPT-4 (as is).

Back then I was working on a proprietary system that I no longer have access to, so I can't check whether I used it, but I'm fairly confident I didn't use this approach.
Still, I used Anomalib to successfully implement the system we needed, so I'd still recommend this library to anyone working on visual anomaly detection!

@rishabh-akridata

Okay, thanks.
