<center><img src="https://user-images.githubusercontent.com/17668390/164736543-0ef58c66-dfb0-47c5-8e6e-26774dbc6fd3.gif" alt="Paris" class="center"></center>

---

[Anomalib](https://github.com/openvinotoolkit/anomalib): Anomalib is a deep learning library that aims to collect state-of-the-art anomaly detection algorithms for benchmarking on both public and private datasets. Anomalib provides several ready-to-use implementations of anomaly detection algorithms described in the recent literature, as well as a set of tools that facilitate the development and implementation of custom models. The library has a strong focus on image-based anomaly detection, where the goal of the algorithm is to identify anomalous images, or anomalous pixel regions within images in a dataset.

It supports [`MVTec AD`](https://www.mvtec.com/company/research/datasets/mvtec-ad) (CC BY-NC-SA 4.0) and [`BeanTech`](https://paperswithcode.com/dataset/btad) (CC-BY-SA) for **benchmarking**, and `folder` for **training/inference** on custom datasets. In this notebook, we will explore `anomalib` with the `MVTec AD` dataset.

## Benchmarking Dataset: MVTec AD



**MVTec AD** is a dataset for benchmarking anomaly detection methods with a focus on industrial inspection. It contains over **5000** high-resolution images divided into **fifteen** different object and texture categories. Each category comprises a set of defect-free training images and a test set of images with various kinds of defects as well as images without defects. The dataset is available on Kaggle, [HERE](https://www.kaggle.com/datasets/ipythonx/mvtec-ad), so we can use it directly from the Kaggle environment.

For a custom dataset, it is good practice to arrange the data in the **MVTec AD** format. The data structure of **MVTec AD** for [each object](https://www.kaggle.com/code/ipythonx/mvtec-ad-anomaly-detection-with-anomalib-library/data?scriptVersionId=93610841&select=bottle) is as follows:

```yaml 
category (i.e. 'bottle')
  ground_truth
    defect_type_1_mask
    defect_type_2_mask
    ...
  test
    defect_type_1
    defect_type_2
    ...
    good
  train
    good
```
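
Before training on a custom dataset, it can help to check that a category folder follows this layout. The snippet below is a minimal sketch, not part of `anomalib`: the helper name `check_mvtec_layout` is ours, and it assumes one `ground_truth` sub-folder per defect type, named either after the defect or with a `_mask` suffix as listed in the tree above.

```python
from pathlib import Path
import tempfile

def check_mvtec_layout(category_dir):
    """Return a list of problems found in an MVTec-style category folder."""
    root = Path(category_dir)
    problems = []
    # training data must sit under train/good (defect-free images only)
    if not (root / "train" / "good").is_dir():
        problems.append("missing train/good")
    # the test split holds one sub-folder per defect type, plus `good`
    test_dir = root / "test"
    if not test_dir.is_dir():
        problems.append("missing test")
        return problems
    for defect_dir in sorted(p for p in test_dir.iterdir() if p.is_dir()):
        if defect_dir.name == "good":
            continue  # defect-free test images have no masks
        # assumption: masks live in ground_truth/<defect> or ground_truth/<defect>_mask
        candidates = (root / "ground_truth" / defect_dir.name,
                      root / "ground_truth" / f"{defect_dir.name}_mask")
        if not any(c.is_dir() for c in candidates):
            problems.append(f"missing ground_truth/{defect_dir.name}")
    return problems

# tiny demo on a throwaway directory tree
with tempfile.TemporaryDirectory() as tmp:
    category = Path(tmp) / "bottle"
    for sub in ("train/good", "test/good", "test/broken_large", "ground_truth/broken_large"):
        (category / sub).mkdir(parents=True)
    print(check_mvtec_layout(category))  # prints []
```

An empty list means the folder matches the assumed layout; otherwise each entry names a missing directory.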

# Installation 

In [3]:
import os
DATA_DIR = '/kaggle/working/anomalib/anomalib'

# clone and install the repo if it does not exist yet
if not os.path.exists(DATA_DIR):
    !git clone https://github.com/openvinotoolkit/anomalib.git
    %cd anomalib
    !pip install -e . -q

# Imports

In [4]:
import numpy as np
import matplotlib.pyplot as plt
import os, pprint, yaml, warnings, math, glob, cv2, random, logging

def warn(*args, **kwargs):
    pass
warnings.warn = warn
warnings.filterwarnings('ignore')
logger = logging.getLogger("anomalib")

import anomalib
from pytorch_lightning import Trainer, seed_everything
from anomalib.config import get_configurable_parameters
from anomalib.data import get_datamodule
from anomalib.models import get_model
from anomalib.utils.callbacks import LoadModelCallback, get_callbacks
from anomalib.utils.loggers import configure_logger, get_experiment_logger

In [5]:
import torch
print(torch.version.cuda)
print(torch.backends.cudnn.version())
print(torch.cuda.is_available())
print(torch.cuda.device_count())
# the device queries below raise an error on a CPU-only machine
if torch.cuda.is_available():
    print(torch.cuda.current_device())
    print(torch.cuda.device(0))
    print(torch.cuda.get_device_name(0))

# Model Config Path

Currently, there are **7** anomaly detection models available in the `anomalib` library:

- [Patchcore](https://arxiv.org/pdf/2106.08265.pdf)
- [Padim](https://arxiv.org/pdf/2011.08785.pdf)
- [DFKDE](https://github.com/openvinotoolkit/anomalib/tree/development/anomalib/models/dfkde)
- [DFM](https://arxiv.org/pdf/1909.11786.pdf)
- [CFlow](https://arxiv.org/pdf/2107.12571v1.pdf)
- [Ganomaly](https://arxiv.org/abs/1805.06725)
- [STFPM](https://arxiv.org/pdf/2103.04257.pdf)


Now, let's get their config paths from the respective folders.

In [6]:
CONFIG_PATHS = '/kaggle/working/anomalib/anomalib/models'
MODEL_CONFIG_PAIRS = {
    'patchcore': f'{CONFIG_PATHS}/patchcore/config.yaml',
    'padim':     f'{CONFIG_PATHS}/padim/config.yaml',
    'cflow':     f'{CONFIG_PATHS}/cflow/config.yaml',
    'dfkde':     f'{CONFIG_PATHS}/dfkde/config.yaml',
    'dfm':       f'{CONFIG_PATHS}/dfm/config.yaml',
    'ganomaly':  f'{CONFIG_PATHS}/ganomaly/config.yaml',
    'stfpm':     f'{CONFIG_PATHS}/stfpm/config.yaml',
}
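
The dictionary above spells out each path by hand. As a minimal sketch (the helper names `build_config_pairs` and `missing_configs` are ours, not part of `anomalib`), the same mapping can be built from a list of model names, together with a check for config files that are absent, which may happen if the repository layout differs between versions.

```python
import os

def build_config_pairs(config_root, model_names):
    """Map each model name to <config_root>/<name>/config.yaml."""
    return {name: os.path.join(config_root, name, "config.yaml") for name in model_names}

def missing_configs(pairs):
    """Return the model names whose config file is not on disk."""
    return [name for name, path in pairs.items() if not os.path.isfile(path)]

pairs = build_config_pairs(
    "/kaggle/working/anomalib/anomalib/models",
    ["patchcore", "padim", "cflow", "dfkde", "dfm", "ganomaly", "stfpm"],
)
# lists the models whose config.yaml was not found (empty if all are present)
print(missing_configs(pairs))
```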

## Quick Look

In this demonstration, we choose the `patchcore` model, which is the first entry in the dictionary above. Let's take a quick look at its config file.

In [7]:
MODEL = 'patchcore' # other options: 'padim', 'cflow', 'dfkde', 'dfm', 'ganomaly', 'stfpm'
with open(MODEL_CONFIG_PAIRS[MODEL], 'r') as f:
    print(f.read())

# Update Config 

In order to train on the **MVTec AD** dataset, which is hosted on Kaggle [HERE](https://www.kaggle.com/datasets/ipythonx/mvtec-ad), we may need to update some parameters in the configuration file, for example `dataset.path`. We may also wish to tweak other parameters, e.g. `image_size`, `train_batch_size`, etc.

In [8]:
new_update = {
    "path": '/kaggle/input/mvtec-ad',
    'category': 'hazelnut', 
    'image_size': 256,
    'train_batch_size':48,
    'seed': 101
}

In the above dictionary, the keys (`path`, `category`, `image_size`, `train_batch_size`, `seed`) already exist in the model's config file; we only want to override their default values. In the following cell, we write a simple function that does the job. Note that in the config file, `path` is a nested key under both `dataset` and `project`. For now, we only need to update `dataset.path`, not `project.path`.

In [9]:
# update yaml key's value
def update_yaml(old_yaml, new_yaml, new_update):
    # load the original yaml
    with open(old_yaml) as f:
        old = yaml.safe_load(f)

    def set_state(node, key, value, parent=None):
        # walk the nested dict and overwrite every matching key;
        # `parent` is the name of the section that contains `node`
        if isinstance(node, dict):
            for k, v in node.items():
                if k == key:
                    if parent == 'project' and k == 'path':
                        # right now, we don't want to change `project.path`
                        continue
                    node[k] = value
                elif isinstance(v, dict):
                    set_state(v, key, value, parent=k)

    # iterate over the new update key-value pairs
    for key, value in new_update.items():
        set_state(old, key, value)

    # save the updated / modified yaml file
    with open(new_yaml, 'w') as f:
        yaml.safe_dump(old, f, default_flow_style=False)
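
Matching on the bare key name is what forces the special case for `project.path`. One alternative, sketched below on a plain in-memory dictionary, is to address each value by a dotted key such as `dataset.path`, so the target section is explicit. The helper `set_by_dotted_key` and the default values in `toy` are hypothetical and serve only as illustration; they are not taken from the `anomalib` config.

```python
import copy

def set_by_dotted_key(config, dotted_key, value):
    """Return a copy of `config` with the value at e.g. 'dataset.path' replaced."""
    updated = copy.deepcopy(config)  # leave the caller's dict untouched
    node = updated
    *parents, leaf = dotted_key.split(".")
    for part in parents:
        node = node[part]            # KeyError if a section is missing
    if leaf not in node:
        raise KeyError(dotted_key)   # refuse to create new keys silently
    node[leaf] = value
    return updated

# toy config that mirrors only the two `path` keys discussed above
toy = {
    "dataset": {"path": "./datasets/MVTec", "category": "bottle"},
    "project": {"path": "./results"},
}
patched = set_by_dotted_key(toy, "dataset.path", "/kaggle/input/mvtec-ad")
print(patched["dataset"]["path"], patched["project"]["path"])
# prints: /kaggle/input/mvtec-ad ./results
```

Raising on an unknown key is a deliberate choice here: a typo in the key name fails loudly instead of adding an unused entry to the config.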

In [10]:
# set the location of the new config file, named after the chosen model
new_yaml = f'{CONFIG_PATHS}/{MODEL}_new.yaml'

# run the update yaml method to update desired key's values
update_yaml(MODEL_CONFIG_PAIRS[MODEL], new_yaml, new_update)

In [11]:
with open(new_yaml) as f:
    new = yaml.safe_load(f)
pprint.pprint(new) # check if it's updated

# Prepare Model, Dataloader, Callbacks with `config`

The config file now holds the values we want, so we can start model training with it.

In [12]:
if new['project']['seed'] != 0:
    print(new['project']['seed'])
    seed_everything(new['project']['seed'])

In [13]:
# returns the configurable parameters as a DictConfig object
config = get_configurable_parameters(
    model_name=new['model']['name'],
    config_path=new_yaml
)

In [14]:
# pass the config file to model, logger, callbacks and datamodule
model      = get_model(config)
experiment_logger = get_experiment_logger(config)
callbacks  = get_callbacks(config)
datamodule = get_datamodule(config)

In [15]:
# start training
trainer = Trainer(**config.trainer, logger=experiment_logger, callbacks=callbacks)
trainer.fit(model=model, datamodule=datamodule)

In [16]:
# load best model from checkpoint before evaluating
load_model_callback = LoadModelCallback(
    weights_path=trainer.checkpoint_callback.best_model_path
)
trainer.callbacks.insert(0, load_model_callback)
trainer.test(model=model, datamodule=datamodule)

# Visualization 

In [17]:
RESULT_PATH = os.path.join(
    new['project']['path'],
    new['model']['name'],
    new['dataset']['format'], 
    new['dataset']['category']
)
RESULT_PATH

In [18]:
# a simple function to visualize the model's prediction (anomaly heatmap)
def vis(paths, n_images, is_random=True, figsize=(16, 16)):
    for i in range(n_images):
        # pick a random image, or the i-th one in order
        image_name = random.choice(paths) if is_random else paths[i]
        img = cv2.imread(image_name)[:,:,::-1]  # BGR -> RGB
        
        # expected layout: .../<category>/images/<defect_type>/<file>.png
        parts = image_name.split('/')
        category_type = parts[-4]
        defected_type = parts[-2]
        
        plt.figure(figsize=figsize)
        plt.imshow(img)
        plt.title(
            f"Category : {category_type} and Defect Type : {defected_type}", 
            fontdict={'fontsize': 20, 'fontweight': 'medium'}
        )
        plt.xticks([])
        plt.yticks([])
        plt.tight_layout()
    plt.show()

In [19]:
for content in os.listdir(RESULT_PATH):
    if content == 'images':
        full_path = glob.glob(os.path.join(RESULT_PATH, content, '**',  '*.png'), recursive=True)
        print('Total Image ', len(full_path))
        print(full_path[0].split('/'))
        print(full_path[0].split('/')[-2:-1:])
        print(full_path[0].split('/')[-4:-3:])
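
The prints above show that the category and the defect type sit at fixed positions in each result path. As a small sketch of the same idea, the helper below (the name `describe_result_image` and the example path are hypothetical) pulls both fields out, assuming the layout `.../<category>/images/<defect_type>/<file>.png`.

```python
from pathlib import PurePosixPath

def describe_result_image(image_path):
    """Split a result path of the form .../<category>/images/<defect_type>/<file>.png."""
    parts = PurePosixPath(image_path).parts
    return {"category": parts[-4], "defect_type": parts[-2], "file": parts[-1]}

# hypothetical result path, for illustration only
example = "results/patchcore/mvtec/hazelnut/images/crack/000.png"
print(describe_result_image(example))
# prints: {'category': 'hazelnut', 'defect_type': 'crack', 'file': '000.png'}
```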

In [20]:
vis(full_path, 25, is_random=True, figsize=(30, 30))

# Inference

In [21]:
SOME_TEST_IMAGES = [
    '/kaggle/input/mvtec-ad/bottle/test/broken_large/000.png',
    '/kaggle/input/mvtec-ad/bottle/test/broken_large/004.png',
    '/kaggle/input/mvtec-ad/bottle/test/broken_small/002.png',
]

In [22]:
# for infer_img_path in SOME_TEST_IMAGES:
#     !python tools/inference.py \
#         --config anomalib/models/{new['model']['name']}/config.yaml \
#         --weight_path {trainer.checkpoint_callback.best_model_path} \
#         --image_path {infer_img_path} \
#         --save_path ./infer_results
    
#     infer_results = "/kaggle/working/anomalib/infer_results/"
#     print(os.listdir(infer_results))
    
# for img in os.listdir(infer_results):
#     read_img = cv2.imread(os.path.join(infer_results,img))
#     plt.imshow(read_img[:,:,::-1])
#     plt.title(str(read_img.shape))
#     plt.show()
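
The commented cell above calls `tools/inference.py` once per image. The sketch below only assembles those command lines as argument lists, without running them. The script path and flags are copied from the commented cell and may differ between `anomalib` versions; the helper name `build_inference_commands` and the checkpoint path are hypothetical placeholders.

```python
import shlex

def build_inference_commands(image_paths, config_path, weight_path, save_path="./infer_results"):
    """Return one argument list per image for tools/inference.py (nothing is executed)."""
    commands = []
    for image_path in image_paths:
        commands.append([
            "python", "tools/inference.py",
            "--config", config_path,
            "--weight_path", weight_path,
            "--image_path", image_path,
            "--save_path", save_path,
        ])
    return commands

images = ['/kaggle/input/mvtec-ad/bottle/test/broken_large/000.png']
cmds = build_inference_commands(
    images,
    config_path="anomalib/models/patchcore/config.yaml",
    weight_path="path/to/best.ckpt",  # hypothetical placeholder, replace with a real checkpoint
)
# show the first command as a single shell-quoted string
print(shlex.join(cmds[0]))
```

Each list could then be passed to `subprocess.run(cmd, check=True)` from the repository root, which avoids quoting problems with paths that contain spaces.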