[Task]: Anomalib's Thresholding Mechanism Upgrade: Achieving Full Unsupervised Pipeline #1029
Comments
Hi! First, I have a question regarding the current threshold calculation: Second, I asked GPT-4 to come up with an idea for this Task. This is the suggestion it gave:
I haven't tried this yet, but to me this seems reasonable. Maybe it would be worth giving this (or at least the idea) a try.
anomalib_team01 will be working on this
Duplicate of #1028
@SimonB97 Have you tried the thresholding mechanism? Does it work as expected?
I'm not sure if I remember correctly, but I don't think I used the mechanism suggested by GPT-4 (as is). Back then I worked on a proprietary system that I no longer have access to, so I can't check, but I'm pretty confident I didn't use it.
Okay, thanks.
What is the motivation for this task?
Background
Anomaly Detection
Anomaly detection is the process of identifying data points, events, or observations within a dataset that significantly deviate from the normal or expected behavior. Anomalies may be caused by a variety of factors, including failures, fraud, or unusual behavior.
Anomaly detection is an essential task in numerous industries, including cybersecurity, finance, healthcare, and manufacturing. It can be used to detect fraud in financial transactions, to identify anomalies in medical data for early disease diagnosis, to detect flaws in manufacturing processes, and to monitor traffic for security threats.
Anomalib
Anomalib is a deep learning library that aims to collect the best anomaly detection algorithms for testing on both public and private datasets. Anomalib offers several ready-to-use implementations of anomaly detection algorithms described in recent research, as well as a set of tools that make it easier to build and use custom models. The library has a strong focus on image-based anomaly detection, where the goal of the algorithm is to identify anomalous images or anomalous pixel regions within images in a dataset.
The thresholding problem
Anomaly detection models in Anomalib are trained only on normal images. During inference, the models are tasked with distinguishing anomalous samples from normal samples. The task is similar to a classical binary classification problem, but instead of generating a class label and a confidence score, Anomalib models generate an anomaly score, which quantifies the distance of the sample to the distribution of normal samples seen during training. The range of possible anomaly score values is unbounded and may differ widely between models and/or datasets, which makes it challenging to set a good threshold for mapping the raw anomaly scores to a binary class label (normal vs. anomalous).
Describe the solution you'd like
Anomalib currently has an adaptive thresholding mechanism that aims to address the thresholding problem. It computes the F1 score over a validation set for a range of candidate thresholds and selects the threshold that yields the highest F1 score. A major drawback of this approach is that the validation set must contain anomalous samples, which are not always available in real-world anomaly detection problems.
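The F1-based search described above can be sketched in a few lines. This is a minimal illustration in plain Python, not Anomalib's actual implementation: the function names `f1_at_threshold` and `best_f1_threshold` are hypothetical, and using the observed scores themselves as candidate thresholds is an assumption made for simplicity.

```python
from typing import Sequence, Tuple


def f1_at_threshold(scores: Sequence[float], labels: Sequence[int], threshold: float) -> float:
    """F1 score when every sample with score >= threshold is predicted anomalous.

    labels: 1 = anomalous, 0 = normal.
    """
    tp = fp = fn = 0
    for score, label in zip(scores, labels):
        predicted_anomalous = score >= threshold
        if predicted_anomalous and label == 1:
            tp += 1
        elif predicted_anomalous and label == 0:
            fp += 1
        elif not predicted_anomalous and label == 1:
            fn += 1
    denominator = 2 * tp + fp + fn
    return 2 * tp / denominator if denominator else 0.0


def best_f1_threshold(scores: Sequence[float], labels: Sequence[int]) -> Tuple[float, float]:
    """Return (threshold, f1) for the candidate threshold with the highest F1.

    Assumption: candidates are the distinct observed scores, tried in
    ascending order; ties keep the lowest threshold.
    """
    if not scores:
        raise ValueError("scores must not be empty")
    candidates = sorted(set(scores))
    best_threshold, best_f1 = candidates[0], -1.0
    for threshold in candidates:
        f1 = f1_at_threshold(scores, labels, threshold)
        if f1 > best_f1:
            best_threshold, best_f1 = threshold, f1
    return best_threshold, best_f1


# Validation set with anomalous samples: the search finds a useful threshold.
threshold, f1 = best_f1_threshold([0.1, 0.2, 0.3, 0.8, 0.9], [0, 0, 0, 1, 1])

# Validation set with only normal samples: F1 is 0 for every candidate,
# so the selected threshold carries no information.
_, f1_normal_only = best_f1_threshold([0.1, 0.2, 0.3], [0, 0, 0])
```

The second call illustrates the drawback: without anomalous samples there are no true positives, so F1 is zero at every threshold and the maximization has nothing to optimize.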
The goal of this hackathon is to design a fully unsupervised thresholding mechanism that does not rely on anomalous samples.
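To make concrete what "does not rely on anomalous samples" could look like, here is one simple baseline. It is an assumption for illustration only, not an approach proposed in this issue or an existing Anomalib API: the threshold is set at a high quantile of the anomaly scores of normal validation images. The function name `quantile_threshold`, the nearest-rank quantile definition, and the default `quantile=0.99` are all hypothetical choices.

```python
import math
from typing import Sequence


def quantile_threshold(normal_scores: Sequence[float], quantile: float = 0.99) -> float:
    """Threshold derived only from anomaly scores of normal samples.

    Uses the nearest-rank definition: the smallest observed score such that
    at least `quantile` of the normal scores are less than or equal to it.
    """
    if not normal_scores:
        raise ValueError("normal_scores must not be empty")
    if not 0.0 < quantile <= 1.0:
        raise ValueError("quantile must be in (0, 1]")
    ordered = sorted(normal_scores)
    rank = math.ceil(quantile * len(ordered))  # 1-based rank
    return ordered[rank - 1]


def is_anomalous(score: float, threshold: float) -> bool:
    """Flag a sample whose score lies strictly above the threshold."""
    return score > threshold


# Hypothetical scores from normal validation images only.
normal_validation_scores = [3.0, 1.0, 4.0, 2.0]
threshold = quantile_threshold(normal_validation_scores, quantile=1.0)
flagged = is_anomalous(7.5, threshold)
```

The quantile only controls the share of normal validation samples that end up above the threshold (roughly `1 - quantile`). It says nothing about how many true anomalies are caught, which is the part a fully unsupervised mechanism would still have to address.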
Possible approaches