Zero-Shot Anomaly Segmentation by DOT Prompt is a model designed to detect anomalies in images, precisely identify their locations, and restore abnormal images to normal using LLM prompting and zero-shot segmentation techniques.
This codebase uses Anaconda to manage environment dependencies. Please follow these steps to set up the environment:
- Download Anaconda: Download and install Anaconda from the official website.
- Clone the Repository:
Clone the repository using the following command.
git clone https://github.com/sybeam27/DOT-Prompt
- Install Requirements:
- Navigate to the cloned repository:
cd DOT-Prompt-ZSAS
- Create a Conda environment from the provided environment.yaml file:
conda env create -f environment.yaml
- Activate the Conda environment:
conda activate dot_as
This will set up the environment required to run the codebase.
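As a quick sanity check after the steps above, the following minimal sketch reports whether the `dot_as` environment is active. It relies on the `CONDA_DEFAULT_ENV` variable that Conda sets on activation; the helper names `active_conda_env` and `is_expected_env` are illustrative and not part of this repository.

```python
import os

# Environment name taken from the activation command above.
EXPECTED_ENV = "dot_as"


def active_conda_env(environ=None):
    """Return the name of the active Conda environment, or None if none is active."""
    environ = os.environ if environ is None else environ
    # Conda exports CONDA_DEFAULT_ENV when an environment is activated.
    return environ.get("CONDA_DEFAULT_ENV") or None


def is_expected_env(environ=None, expected=EXPECTED_ENV):
    """Return True only if the active environment matches the expected name."""
    return active_conda_env(environ) == expected


if __name__ == "__main__":
    name = active_conda_env()
    if is_expected_env():
        print(f"OK: Conda environment '{name}' is active.")
    else:
        print(f"Expected '{EXPECTED_ENV}' but found {name!r}; run: conda activate {EXPECTED_ENV}")
```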
Below are the details and download links for datasets used in our experiments:
- MVTec-AD (Download): The MVTec AD dataset comprises approximately 5,000 images across 15 classes, including texture-related categories such as fabric and wood.
- KSDD1 (Download): The KSDD1 dataset includes 347 normal images and 52 abnormal images, specifically for detecting micro-defects on metal surfaces.
- MTD (Download): The MTD dataset contains images of magnetic tiles featuring various types of defects.
These datasets are known for their high-resolution, texture-rich images, which makes them well suited to texture anomaly segmentation.
Replace <dataset> with one of the following options: mvtec, ksdd, mtd.
Replace <model> with one of the following options: base, dot_zsas.
python test_zsas.py --dataset <dataset> --model <model>
This command executes our proposed model for zero-shot anomaly segmentation (ZSAS) on the specified dataset using the selected model, with the best configurations loaded, running 10 epochs each.
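To run every dataset and model combination in one go, a small driver script can assemble the command shown above for each pair. This is a minimal sketch, not part of the repository: `build_command` and `run_all` are hypothetical helpers, and it assumes `test_zsas.py` is in the current working directory. By default it only prints the commands.

```python
import itertools
import subprocess
import sys

# Options listed above for the two placeholders.
DATASETS = ["mvtec", "ksdd", "mtd"]
MODELS = ["base", "dot_zsas"]


def build_command(dataset, model, script="test_zsas.py"):
    """Build the argument list for one evaluation run, rejecting unknown options."""
    if dataset not in DATASETS:
        raise ValueError(f"unknown dataset: {dataset!r}")
    if model not in MODELS:
        raise ValueError(f"unknown model: {model!r}")
    return [sys.executable, script, "--dataset", dataset, "--model", model]


def run_all(dry_run=True):
    """Print (dry run) or execute the command for every dataset/model pair."""
    commands = [build_command(d, m) for d, m in itertools.product(DATASETS, MODELS)]
    for cmd in commands:
        if dry_run:
            print(" ".join(cmd))
        else:
            # check=True stops the sweep as soon as one run fails.
            subprocess.run(cmd, check=True)
    return commands


if __name__ == "__main__":
    run_all(dry_run=True)
```

Call `run_all(dry_run=False)` from inside the activated environment to actually launch the runs.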
Ablation study on the MVTec-AD texture dataset.
python test_ablation.py --image True --prompt True --filter True
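The ablation command takes three on/off switches (`--image`, `--prompt`, `--filter`), so a full study covers eight configurations. The sketch below enumerates them as command strings. It assumes the script accepts the literal values `True` and `False` for each flag, which the example above only confirms for `True`; `ablation_commands` is a hypothetical helper.

```python
import itertools

# The three components toggled by the ablation command above.
COMPONENTS = ["image", "prompt", "filter"]


def ablation_commands(script="test_ablation.py"):
    """Return one command string per on/off combination of the components."""
    commands = []
    for values in itertools.product([True, False], repeat=len(COMPONENTS)):
        args = []
        for name, value in zip(COMPONENTS, values):
            # Assumption: the script parses the strings "True" and "False".
            args += [f"--{name}", str(value)]
        commands.append(" ".join(["python", script] + args))
    return commands


if __name__ == "__main__":
    for command in ablation_commands():
        print(command)
```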
Command-line arguments:
--gpu: GPU number
--dataset: dataset name
--model: model name
--box_threshold: GroundingSAM box threshold
--text_threshold: GroundingSAM text threshold
--size_threshold: bounding-box size threshold
--iou_threshold: IoU threshold
--random_img_num: number of randomly extracted images
--eval_resolution: evaluation resolution
--exp_idx: experiment index
--version: evaluation version
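For orientation, the arguments listed above could be declared with `argparse` roughly as follows. This is only a sketch: the flag names come from the list, but the types, the absence of defaults, and the `build_parser` helper are assumptions, and the actual scripts may define them differently.

```python
import argparse


def build_parser():
    """Declare the documented flags; types are assumed, defaults are left unset."""
    parser = argparse.ArgumentParser(description="Sketch of the documented arguments")
    parser.add_argument("--gpu", type=int, help="GPU number")
    parser.add_argument("--dataset", type=str, choices=["mvtec", "ksdd", "mtd"],
                        help="dataset name")
    parser.add_argument("--model", type=str, choices=["base", "dot_zsas"],
                        help="model name")
    parser.add_argument("--box_threshold", type=float, help="GroundingSAM box threshold")
    parser.add_argument("--text_threshold", type=float, help="GroundingSAM text threshold")
    parser.add_argument("--size_threshold", type=float, help="bounding-box size threshold")
    parser.add_argument("--iou_threshold", type=float, help="IoU threshold")
    parser.add_argument("--random_img_num", type=int,
                        help="number of randomly extracted images")
    parser.add_argument("--eval_resolution", type=int, help="evaluation resolution")
    parser.add_argument("--exp_idx", type=int, help="experiment index")
    parser.add_argument("--version", type=str, help="evaluation version")
    return parser


if __name__ == "__main__":
    # The values below are illustrative only, not recommended settings.
    args = build_parser().parse_args(["--gpu", "0", "--dataset", "mvtec",
                                      "--model", "dot_zsas"])
    print(args)
```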
We extend our gratitude to the authors of the following libraries for generously sharing their source code and datasets:
RAM, Llama3, Grounding DINO, SAM, SAA+. Your contributions are greatly appreciated.