
## In this notebook, you will see all the steps sequentially performed to be able to utilize the complete functionality of OAB framework. The steps are as follows :
0. SETUP
1. DATA
2. DATA SELECTION
3. PREPROCESSING
4. SAMPLING
5. ALGORITHM TRAINING AND TESTING
6. EVALUATION
7. SHOW BENCHMARK RESULTS
8. REPRODUCIBILTY
9. EXTENDING THE BENCHMARK(with own Algorithm)

This notebook focuses on <b>Semisupervised Image Data</b>. Let's begin!

# **0. SETUP**

`oab` framework can be integrated in your Python environment  as a `PyPi package`  using the following command:

In [None]:
#ID 1(0)

#%%capture
# pip install oab
!pip install example-pkg-jd-kiel --extra-index-url=https://test.pypi.org/simple/

`Cloning` the repository:

`oab` is an open-source framework which can be accessed at https://github.com/ISDM-CAU-Kiel/oab. To use this .ipynb notebook successfully, the formerly mentioned repository needs to be cloned with the following command and this notebook must be run(if this is not the case already) within the cloned repository from the path:

<b>/oab/notebooks/benchmark_image/Semisupervised_Anomaly_Detection_on_Benchmark_Image_Data_v0.3.ipynb</b>

In [None]:
#ID 2(0)
!git clone https://github.com/ISDM-CAU-Kiel/oab.git

Now, importing the necessary functions and internal variables :

In [3]:
#ID 3(0)

import sys
import os
from datetime import datetime 
from pathlib import Path         
sys.path.append('../..')           

%load_ext autoreload
%autoreload 2

# necessary imports for loading datasets as well as information from recipe files
from oab.data.semisupervised import SemisupervisedAnomalyDataset
from oab.data.load_image_dataset import _load_image_dataset
from oab.data.load_dataset import load_dataset
from oab.data.utils_image import image_datasets
from oab.data.load_recipe_functions import *


# necessary imports for algorithm comparisons and defining seeds
from oab.evaluation import EvaluationObject, ComparisonObject,all_metrics
import tensorflow as tf
import numpy as np
import random
import torch

The autoreload extension is already loaded. To reload it, use:
  %reload_ext autoreload


## **0.1 NOTEBOOK AND CELL STRUCTURE** 

In this notebook there are certain sections where the user is required to enter its own information which are marked as comments of the form :

<b>### ADD YOUR CODE ###</b>  , so <b>###</b> can be searched to know what are those sections.

All cells are assigned an ID, as a comment at the top of the cell,for example as: <b>#ID 10(5)</b>, where 10 denotes the cell ID and 5 denotes the Section.

## 0.2 DETAILS OF THIS BENCHMARK RUN

In [4]:
#ID 4(0)
### ADD YOUR BENCHMARK NAME HERE ###
benchmark_name="Paper_A" 


dataset_folder="datasets" # all dataset-folders are contained in this folder
#print(dataset_folder)

benchmark_type="ssi"     # benchmark run for semisupervised image datasets(ssi)
if not os.path.exists(benchmark_name): #creating directory for this benchhmark for storing recipes
    os.makedirs(benchmark_name)

    
time=datetime.now().strftime("%Y%m%d%H%M%S") # timestamp set for this run  
new_recipe_path=f"{benchmark_name}/{time}-{benchmark_name}-{benchmark_type}-recipe.yaml" # recipe path for new recipe created in this run   
print(f"{time}-{benchmark_name}-{benchmark_type}-recipe.yaml")

20211223090123-Paper_A-ssi-recipe.yaml


### For reproducing a previously-created recipe without adding new datasets and algorithms from this benchmark  ,  skip to :

### `#ID 20(5)`

# **1. DATA**

First of all, we will have a look at the Datasets that are pre-installed in OAB which can be used for benchmarking

In [5]:
#ID 5(1)
lst_all_datasetnames =image_datasets
for i,dataset in enumerate(lst_all_datasetnames):
    print(f"{i}.{dataset}")

0.mnist
1.fashion_mnist
2.cifar10
3.cifar100
4.mvtec_ad_carpet
5.mvtec_ad_grid
6.mvtec_ad_leather
7.mvtec_ad_tile
8.mvtec_ad_wood
9.mvtec_ad_bottle
10.mvtec_ad_cable
11.mvtec_ad_capsule
12.mvtec_ad_hazelnut
13.mvtec_ad_metal_nut
14.mvtec_ad_pill
15.mvtec_ad_screw
16.mvtec_ad_toothbrush
17.mvtec_ad_transistor
18.mvtec_ad_zipper
19.crack



`oab` provides a variety of image datasets that can easily be loaded, in either of the following ways: 

`1.` If a user is interested in using her own image dataset  loading **via local folder directory**, the following steps have to be followed: (1) Ensure that the format is readable by oab. This requires there to be a folder for the dataset with the subfolders `normal` and `anomaly`. Naturally, all normal images are in the folder `normal` and all images of anomalies in the folder `anomaly`. (2) Based on this folder structure, the dataset can be loaded.


"**Local folder structure - without URL usage**" :
```
dataset_name
        │
        ├── normal
        │    
        └── anomaly
``` 

`2.` If user's dataset is provided **via a URL**, then it would be downloaded and stored in the OAB's "datasets" folder, given that it is already formatted as per the required folder heirarchy, where the folder :

`good`(which should not be renamed) : contains normal images whereas 

`anomaly_folder_n`(can be renamed):  contains anomalous images



"**Uploaded folder structure - with URL usage**" :
```
dataset_name
        │
        ├── train
        │   ├── good
        │
        │  
        └── test
            ├── good
            ├── anomaly_folder_1
            ├── ...
            ├── ...
            ├── anomaly_folder_2
```
 
Note: Alternatively, it is of course possible to load the images into `numpy` arrays and treat them as if they were tabular data. If this approach is to be followed, please look at the notebooks for tabular data.


- Notes: Limited to 256x256

In [6]:
#ID 6(1)


#### ADD YOUR DATASETNAME(S) HERE ###

own_datasets_list=["myImageDataset"]  # More of user's own datasets can be added in this list
benchmark_datasets_list=['mnist','mvtec_ad_transistor']   # More of OAB's datasets can be added to this list


# 'myImageDataset' is the name of the Dataset(which the user loads for benchmarking as well as the name of the folder containing normal and anomaly folders
#  which further contain the respective images
#  and make sure folder structure is correct ({dataset_folder}/{name-without mvtec_ad_}/[normal/anomaly])




# calling the helper function to update internal OAB variables 
mvtec_ad_own_datasets_list=[]
for dataset_name in own_datasets_list:
 
  mvtec_ad_own_datasets_list.append(add_own_dataset(dataset_name))
print( f" Dataset(s) successfully added as :  {mvtec_ad_own_datasets_list}")




#Now,  we'll have a look at all the datasets again which are pre-installed in OAB,after adding own_datasets 
lst_all_datasetnames =image_datasets
print("Datasets of OAB:")
for i,dataset in enumerate(lst_all_datasetnames):
    print(f"{i}.{dataset}")

 Dataset(s) successfully added as :  ['mvtec_ad_myImageDataset']
Datasets of OAB:
0.mnist
1.fashion_mnist
2.cifar10
3.cifar100
4.mvtec_ad_carpet
5.mvtec_ad_grid
6.mvtec_ad_leather
7.mvtec_ad_tile
8.mvtec_ad_wood
9.mvtec_ad_bottle
10.mvtec_ad_cable
11.mvtec_ad_capsule
12.mvtec_ad_hazelnut
13.mvtec_ad_metal_nut
14.mvtec_ad_pill
15.mvtec_ad_screw
16.mvtec_ad_toothbrush
17.mvtec_ad_transistor
18.mvtec_ad_zipper
19.crack
20.mvtec_ad_myImageDataset


Now, If this dataset(s) was already stored in the dataset_folder, structured as mentioned in the initial description of this section above, then we have to create the file "applied_modification.txt" in Path(dataset_folder)/dataset_name/ "applied_modification.txt"). If this is not available in the location, then we download the dataset from the given URL(When data is downloaded for the URL, the orientation of the folders as well as the image resizing operation is performed and  information about that is stored in "applied_modification.txt")

In [7]:
#ID 7(1)
for dataset_name in own_datasets_list:
   open(Path(dataset_folder)/dataset_name/"applied_modification.txt", "w") 
   
  

# **2. DATA SELECTION**


Datasets can either be loaded directly as anomaly datasets or as classification datasets. In the former case, the dataset is automatically fully prepared and ready for sampling. In the latter case, further preprocessing is still possible and necessary.

Note that the automatic preprocessing for image datasets is to scale each value by `1/255`.

**After adding own dataset(s) in #ID 6(1),the user is able to load own dataset(s) using this method :**

In [8]:
#ID 8(2)


datasets={}  # contain dataset objects of own dataset names 


for dataset_name in mvtec_ad_own_datasets_list:  # loading own datasets 
    
    datasets[dataset_name[9:]]=_load_image_dataset(dataset_name ,anomaly_dataset=False,preprocess_classification_dataset=False,dataset_folder=dataset_folder)

### **2.1 Load anomaly detection datasets (with or without further preprocessing)**

In this section, we load some pre-installed data sets. This can be achieved using the `load_dataset` function. By default, it creates an anomaly dataset from which sampling is directly possible  but we can first create classifcation dataset and then anomaly dataset,either with the preprocessing applied (`preprocess_classification_dataset=True`) i.e. Scaling is performed on all values by the factor of 1/255, or without (`preprocess_classification_dataset=False`, default).

`In our case` we set have already imported own datasets with `anomaly_dataset=False ` and `preprocess_classification_dataset=False` in <b>#ID 8(2)</b> and we will also load the OAB datasets in the same way in <b>#ID 9(2)</b>


Note that as discussed in the paper, multiclass classification datasets like Cifar10 and MNIST are loaded with the class label `0` as normal label and all other labels as anomaly labels by default. (Alternatively, `oab` can automatically iterate through all classes as normal classes. This is not covered here.)

In [9]:
#ID 9(2)

#### ADD YOUR OWN NUMBER OF DATASETS AND FROM OAB FOR BENCHMARKING  ###

for dataset_name in benchmark_datasets_list:  # loading benchmark's datasets
    datasets[dataset_name]=load_dataset(dataset_name,anomaly_dataset=False,preprocess_classification_dataset=False,dataset_folder=dataset_folder)
print(datasets)

{'myImageDataset': <oab.data.classification_dataset.ClassificationDataset object at 0x7fe37e47c450>, 'mnist': <oab.data.classification_dataset.ClassificationDataset object at 0x7fe4402b8850>, 'mvtec_ad_transistor': <oab.data.classification_dataset.ClassificationDataset object at 0x7fe37e493590>}


# **3. PREPROCESSING**

The  resizing of images(only when dataset is downloaded using URL) and scaling of images has already been performed while loading the datasets as shown in previous the Sections.

Standard preprocessing steps like deleting columns, encoding categorical values differently, or removing missing values do not apply to image data. Therefore, these methods (as well as own preprocessing steps and how these are captured) are covered in the tabular dataset benchmarks.

Here, we only show two preprocessing steps that are applied to datasets stored in `datasets` dictionary(loaded in 2.2), which can also be performed individually depending upon requirement :
- `Scale` all values by `1/255`.
- `Transform the dataset into an anomaly dataset` for semisupervised anomaly detection by setting the class label `0` to normal and all other class labels to anomalous.

In [10]:
#ID 10(3)                            SCALING APPLIED

                                         
for dataset_name in datasets:
    
    datasets[dataset_name].scale(scaling_factor=1/255)
    operations=datasets[dataset_name].operations_performed
    dataset_info_store(dataset_name,new_recipe_path,info_type='standard_functions',content=operations) 
   


#print("Scaling performed on datasets!")    
#print(datasets)

The file <b>f"{time}-{benchmark_name}-{benchmark_type}-recipe.yaml"</b> now contains information about how to preprocess(i.e. perform scaling) the file 

In [11]:
#ID 11(3)

!cat {new_recipe_path}

myImageDataset:
- dataset
- standard_functions:
  - name: scale
    parameters:
      scaling_factor: 0.00392156862745098
mnist:
- dataset
- standard_functions:
  - name: scale
    parameters:
      scaling_factor: 0.00392156862745098
mvtec_ad_transistor:
- dataset
- standard_functions:
  - name: scale
    parameters:
      scaling_factor: 0.00392156862745098


In [12]:
#ID 12(3)                            ANOMALY-DATASET CONVERSION PERFORMED
 

datasets_ad={}    
    # for storing dataset objects converted to anomaly-dataset
for dataset_name in datasets:   
    
     datasets_ad[dataset_name]= SemisupervisedAnomalyDataset(classification_dataset=datasets[dataset_name],
                                                       normal_labels=0)  
   
     normal_labels=datasets_ad[dataset_name].normal_labels 
     dataset_info_store(dataset_name,new_recipe_path,info_type='anomaly_dataset',content=normal_labels)   
                                                                            
print("datasets after adding anomaly-conversion datasets: ")    
print(datasets_ad)

datasets after adding anomaly-conversion datasets: 
{'myImageDataset': <oab.data.semisupervised.SemisupervisedAnomalyDataset object at 0x7fe37e47c750>, 'mnist': <oab.data.semisupervised.SemisupervisedAnomalyDataset object at 0x7fe39877ba50>, 'mvtec_ad_transistor': <oab.data.semisupervised.SemisupervisedAnomalyDataset object at 0x7fe4402b86d0>}


In [13]:
#ID 13(3)

!cat {new_recipe_path}

myImageDataset:
- dataset
- standard_functions:
  - name: scale
    parameters:
      scaling_factor: 0.00392156862745098
- anomaly_dataset:
    arguments:
      normal_labels:
      - 0
      anomaly_labels:
mnist:
- dataset
- standard_functions:
  - name: scale
    parameters:
      scaling_factor: 0.00392156862745098
- anomaly_dataset:
    arguments:
      normal_labels:
      - 0
      anomaly_labels:
mvtec_ad_transistor:
- dataset
- standard_functions:
  - name: scale
    parameters:
      scaling_factor: 0.00392156862745098
- anomaly_dataset:
    arguments:
      normal_labels:
      - 0
      anomaly_labels:


# **4. SAMPLING**

Here, we define the sampling parameters to sample from the datasets

In [14]:
#ID 14(4)

### ADD YOUR OWN SAMPLING PARAMETERS ###

# sampling parameters
training_split = 0.7                 # the defined ratio of  training instances while sampling
max_contamination_rate = 0.5         # the defined ratio of anomalys while sampling  
n_steps = 1        # n_steps=10      # the number of times data is to be sampled

#These below are the possible sampling types to sample from datasets
sampling_types=['semisupervised_multiple','semisupervised_explicit_numbers_single','semisupervised_training_split_multiple','semisupervised_training_split_single']


sampling_type='semisupervised_training_split_multiple'  #by default for this run

sampling_params_current_run=[{'training_split':training_split,'max_contamination_rate':max_contamination_rate,'n_steps':n_steps,'flatten_images':False},sampling_type] 

sampling=[{sampling_type:sampling_params_current_run[0]}]
#print(sampling)


for dataset_name in datasets_ad:
    
    if not dataset_name[:9]=='mvtec_ad_':
        #storing sampling info to recipe
        dataset_info_store(dataset_name,new_recipe_path,'sampling',content=sampling)
        
    
    
    
# set for this run

The above sampling parameters are utilized in
<b>#ID 23(5)</b> 
for sampling the datasets(except the pre-installed mv_tec_ad_datasets) before training the algorithms.

In [15]:
#ID 15(4)

!cat {new_recipe_path}

myImageDataset:
- dataset
- standard_functions:
  - name: scale
    parameters:
      scaling_factor: 0.00392156862745098
- anomaly_dataset:
    arguments:
      normal_labels:
      - 0
      anomaly_labels:
- sampling:
    semisupervised_training_split_multiple:
      training_split: 0.7
      max_contamination_rate: 0.5
      n_steps: 1
      flatten_images: false
mnist:
- dataset
- standard_functions:
  - name: scale
    parameters:
      scaling_factor: 0.00392156862745098
- anomaly_dataset:
    arguments:
      normal_labels:
      - 0
      anomaly_labels:
- sampling:
    semisupervised_training_split_multiple:
      training_split: 0.7
      max_contamination_rate: 0.5
      n_steps: 1
      flatten_images: false
mvtec_ad_transistor:
- dataset
- standard_functions:
  - name: scale
    parameters:
      scaling_factor: 0.00392156862745098
- anomaly_dataset:
    arguments:
      normal_labels:
      - 0
      anomaly_labels:


Now, we will associate sampling information with each dataset loaded in the benchmark run:

In [16]:
#ID 16(4)
benchmarking_datasets={}

for (x,y) in datasets_ad.items():
    benchmarking_datasets[x]=[y,sampling_params_current_run]


print(benchmarking_datasets)    

{'myImageDataset': [<oab.data.semisupervised.SemisupervisedAnomalyDataset object at 0x7fe37e47c750>, [{'training_split': 0.7, 'max_contamination_rate': 0.5, 'n_steps': 1, 'flatten_images': False}, 'semisupervised_training_split_multiple']], 'mnist': [<oab.data.semisupervised.SemisupervisedAnomalyDataset object at 0x7fe39877ba50>, [{'training_split': 0.7, 'max_contamination_rate': 0.5, 'n_steps': 1, 'flatten_images': False}, 'semisupervised_training_split_multiple']], 'mvtec_ad_transistor': [<oab.data.semisupervised.SemisupervisedAnomalyDataset object at 0x7fe4402b86d0>, [{'training_split': 0.7, 'max_contamination_rate': 0.5, 'n_steps': 1, 'flatten_images': False}, 'semisupervised_training_split_multiple']]}


The above dictionary <b>benchmarking_datasets</b> will be used for the Benchmarking as it contains all the information:"
    
    
    1.dataset_name
    2.final_dataset_object(preprocessed and anomaly-converted)
    3.sampling_info



# **5. ALGORITHM TRAINING AND TESTING**

<b>To load own algorithm(s), refer to #ID 33(9) and #ID 34(9)</b> where an example algorithm is loaded, then 

come back to this cell  and <b>load own algorithm(s) details in #ID 17(5),#ID 18(5) AND #ID 19(5)</b> in the same way as benchmark algorithms.

We first download and import algorithms used for anomaly decection.

In [None]:
#ID 17(5)

import wget

wget.download('https://raw.githubusercontent.com/ISDM-CAU-Kiel/oab/master/notebooks/benchmark_image/cae_ocsvm.py','cae_ocsvm.py')
wget.download('https://raw.githubusercontent.com/ISDM-CAU-Kiel/oab/master/notebooks/benchmark_image/cae_iforest.py','cae_iforest.py')
wget.download('https://raw.githubusercontent.com/ISDM-CAU-Kiel/oab/master/notebooks/benchmark_image/conv_ae.py','conv_ae.py')

### ADD your algo import(s) below ###


In [18]:
#ID 18(5)



### ADD your algo import(s) below ###

#from module_name import class_name

from conv_ae import ConvAutoEncoder 
from cae_ocsvm import CAEOCSVM
from cae_iforest import CAEIForest


Firstly, we define hyperparameters for all algorithms and choose for benchmarking:

In [20]:
#ID 19(5)

   
### Extend Algos dictionary with own algorithm specifications as shown below for OAB algorithms ###   

### ADD YOUR OWN (HYPER)PARAMETERS AND THEIR VALUES FOR PRE-INSTALLED ALGOS###


lst_benchmark_algorithms = [
      
       {
       "algo_module_name": "cae_iforest",   
       "algo_class_name": "CAEIForest",
       "algo_name_in_result_table": "CAE v2",
       "algo_parameters": {"CAE_parameters": {'latent_dim': 100, 'epochs': 50, 'verbose': 0}, "IForest_parameters": {'random_state': 42} },
        "fit": {'method_name': 'fit', 'params': {}}, 
        "decision_function": {'method_name': 'decision_function', 'params': {}}
        }]
''' ### uncomment to use these algorithms for benchmarking ###
{
       "algo_module_name": "conv_ae"   ,
       "algo_class_name": "ConvAutoEncoder",
       "algo_name_in_result_table": "CAE v1",
       "algo_parameters":   {'latent_dim': 100, 'epochs': 50, 'verbose': 0},
        "fit": {'method_name': 'fit', 'params': {}}, 
       "decision_function": {'method_name': 'decision_function', 'params': {}}
       }  


    {   
       "algo_module_name": "cae_ocsvm" , 
       "algo_class_name": "CAEOCSVM",
       "algo_name_in_result_table": "CAE v3",
       "algo_parameters": {"CAE_parameters": {'latent_dim': 100, 'epochs': 50, 'verbose': 0},"OCSVM_parameters": {'degree': 3}},
       "fit": {'method_name': 'fit', 'params': {}}, 
       "decision_function": {'method_name': 'decision_function', 'params': {}}
    } 
]    
'''      
  

### ADD OWN ALGORITHMS  with algorithm specifications as shown above for OAB algorithms ###   

own_algorithms=[]  #add to this list e.g. { "algo_module_name": "own_algo" , "algo_class_name": "ownAlgoClass",.........."decision_function": {'method_name': 'decision_function', 'params': {}}} 

lst_benchmark_algorithms.extend(own_algorithms)


#seed defined for ths benchmark run for obtaining consistent results 
seed=42


<b>LOAD YOUR RECIPE</b> to be repdroduced and use it in the current benchmark run.

In [21]:
#ID 20(5)        # Execute this cell only when you already have a recipe file  to load from 

### ADD AN OPTIONAL RECIPE  PATH TO ADD TO THIS BENCHMARK RUN START ###   

# Note: recipes of type "semisupervised tabular(sst) " i.e. of the format: 
#               "timestamp-benchmark_name-sst-recipe.yaml"
# can only be used for benchmarking in this notebook.

recipe_path="Paper_B/20211210173740-Paper_B-ssi-recipe.yaml"

### ADD AN OPTIONAL RECIPE  PATH TO ADD TO THIS BENCHMARK RUN END ###
 
    
    
### UNCOMMENT ONLY IF NO NEW DATASETS WERE ADDED IN THE BENCHMARK EXCEPT FROM RECIPE START ###     

#benchmarking_datasets={}
#lst_benchmark_algorithms=[]

### UNCOMMENT ONLY IF NO NEW DATASETS WERE ADDED IN THE BENCHMARK EXCEPT FROM RECIPE END ###  

!cat {recipe_path} 

seed: 
  - 42
cifar10:
- dataset
- standard_functions:
  - name: scale
    parameters:
      scaling_factor: 0.00392156862745098 
- anomaly_dataset:
    arguments:
      normal_labels:
      - 0
      anomaly_labels: 
- sampling:
    semisupervised_training_split_multiple:
      training_split: 0.7
      max_contamination_rate: 0.5
      n_steps: 1
      flatten_images: false
myImageDataset2:
- dataset
- standard_functions:
  - name: scale
    parameters:
      scaling_factor: 0.00392156862745098 
- anomaly_dataset:
    arguments:
      normal_labels:
      - 0
      anomaly_labels: 
- sampling:
    semisupervised_training_split_multiple:
      training_split: 0.7
      max_contamination_rate: 0.5
      n_steps: 1
      flatten_images: false     
conv_ae:
- algo_name
- init:
    params:
      latent_dim: 100
      epochs: 50
      verbose: 0
  fit:
    method_name: fit
    params: {}
  decision_function:
    method_name: decision_function

In [22]:
#ID 21(5)    # Execute this cell only when you already have a recipe file  to load from 


recipe_algos=data_from_recipe('algos',recipe_path) # all algo names from recipe extracted
#print(f"recipe_algos:\n{recipe_algos}")

recipe_datasets=data_from_recipe('datasets',recipe_path) # all dataset info(anomaly dataset object/anomalydataset params/sampling params) is perfomed and obtained
#print(f"\nrecipe_datasets:\n{recipe_datasets}")
 
recipe_seed=data_from_recipe('seed',recipe_path)  # obtained seeds to feed in this benchmark 
seed=recipe_seed   # seed of current benchmark is overwritten by recipe seed

# adding recipe_datasets to benchmarking_datasets
for dataset_name in recipe_datasets:
    benchmarking_datasets[dataset_name]=recipe_datasets[dataset_name][:2]
#print(f"benchmarking_datasets: {benchmarking_datasets}") 
                                         
#adding algos from recipe_algos to lst_benchmarking_algos
for algo in recipe_algos:
    #print(algo)
    lst_benchmark_algorithms.append(algo)
   

conv_ae----


cifar10------
standard/custom preprocessing performed!
transformed to anomaly dataset!

myImageDataset2------
standard/custom preprocessing performed!
transformed to anomaly dataset!


In [23]:
#ID 22(5)

  
print("\nAll Datasets for this benchmark run:")    
for dataset_name in benchmarking_datasets:
    print(dataset_name)

    
 
print("\nAll algos for this benchmark run:")
for algo in lst_benchmark_algorithms:
    #print(algo)
    print(algo['algo_module_name'])




All Datasets for this benchmark run:
myImageDataset
mnist
mvtec_ad_transistor
cifar10
myImageDataset2

All algos for this benchmark run:
cae_iforest
conv_ae


Now, For every benchmark dataset , we sample from that dataset to train the algorithms and then predict the outcomes for each dataset with each algortihm and then store results in a evaluation object, which is then added to the comparison object to show the final Benchmarking results

In [26]:
#ID 23(5)


co = ComparisonObject()

for dataset_name in list(benchmarking_datasets.keys()):
    print(f'-------{dataset_name}-------') 
    
    
    for alg in lst_benchmark_algorithms:
        
        print("------"+alg["algo_module_name"])
        eval_obj = EvaluationObject(algorithm_name=alg["algo_name_in_result_table"])
        algo= getattr(__import__(alg["algo_module_name"]),alg["algo_class_name"])(**alg['algo_parameters'])  # Algo object imported from class 

        
        
        if dataset_name[:9]=='mvtec_ad_':  # if dataset is a pre-installed 'mv_tec' dataset
            

            (x_train, x_test, y_test), sample_config in benchmarking_datasets[dataset_name][0].sample_original_mvtec_split(flatten_images=False)
            torch.manual_seed(seed)
            random.seed(seed)
            tf.random.set_seed(seed)
            np.random.seed(seed) 
            os.environ['PYTHONHASHSEED'] = str(seed)
            torch.cuda.manual_seed(seed)
            torch.cuda.manual_seed_all(seed)
            torch.backends.cudnn.deterministic = True
            torch.backends.cudnn.benchmark = False
            torch.use_deterministic_algorithms(True)
            
            print('.', end='') # update to see progress
            getattr(algo,alg["fit"]["method_name"])(x_train, **alg["fit"]["params"])  # fitting algo
            pred= getattr(algo,alg["decision_function"]["method_name"])(x_test, **alg["decision_function"]["params"]) # decision functions
            eval_obj.add(ground_truth=y_test, prediction=pred, description=sample_config)  
                    
            
        else:
          sampling_type=benchmarking_datasets[dataset_name][1][1]
          sampling_params=benchmarking_datasets[dataset_name][1][0]
    
          for (x_train, x_test, y_test), sample_config in sample_v3(dataset_name,sampling_type,sampling_params,benchmarking_datasets[dataset_name][0]):
                
                torch.manual_seed(seed)
                random.seed(seed)
                tf.random.set_seed(seed)
                np.random.seed(seed) 
                os.environ['PYTHONHASHSEED'] = str(seed)
                torch.cuda.manual_seed(seed)
                torch.cuda.manual_seed_all(seed)
                torch.backends.cudnn.deterministic = True
                torch.backends.cudnn.benchmark = False
                torch.use_deterministic_algorithms(True)
                
                print('.', end='') # update to see progress 
                getattr(algo,alg["fit"]["method_name"])(x_train, **alg["fit"]["params"])  # fitting algo
                pred = getattr(algo,alg["decision_function"]["method_name"])(x_test, **alg["decision_function"]["params"]) # decision functions
                eval_obj.add(ground_truth=y_test, prediction=pred, description=sample_config)  
            
        eval_desc = eval_obj.evaluate(print=True, metrics=['roc_auc', 'adjusted_average_precision', 'precision_recall_auc'])
        co.add_evaluation(eval_desc)
        print("\n")
      

-------myImageDataset-------
------cae_iforest
.Evaluation on dataset mvtec_ad_myImageDataset with normal labels [0] and anomaly labels [1.0].
Total of 1 datasets. Per dataset:
28 training instances, 24 test instances, training contamination rate 0.0, test contamination rate 0.5.
Mean 	 Std_dev 	 Metric
0.278 	 0.000 		 roc_auc
-0.162 	 0.000 		 adjusted_average_precision
0.375 	 0.000 		 precision_recall_auc


------conv_ae
.Evaluation on dataset mvtec_ad_myImageDataset with normal labels [0] and anomaly labels [1.0].
Total of 1 datasets. Per dataset:
28 training instances, 24 test instances, training contamination rate 0.0, test contamination rate 0.5.
Mean 	 Std_dev 	 Metric
0.285 	 0.000 		 roc_auc
-0.182 	 0.000 		 adjusted_average_precision
0.375 	 0.000 		 precision_recall_auc


-------mnist-------
------cae_iforest
.Evaluation on dataset mnist with normal labels [0] and anomaly labels [1, 2, 3, 4, 5, 6, 7, 8, 9].
Total of 1 datasets. Per dataset:
4832 training instances, 4142 t

# **6. EVALUATION**

Here , we will see how different metrics can be selected when evaluating an algorithm's performance.

In previous section while creating an evalutation description,  we used all metrics for evaluation:

     eval_desc = eval_obj.evaluate(print=False, metrics=all_metrics)
    
    

In [27]:
#ID 24(6)

# to use a subset, first see which ones are available

print(all_metrics)

['roc_auc', 'average_precision', 'adjusted_average_precision', 'precision_n', 'adjusted_precision_n', 'precision_recall_auc']


In [28]:
#ID 25(6)

#### ADD YOUR OWN NUMBER OF METRICS ###

#Then we can  select an arbitrary subset
metrics=['roc_auc', 'precision_recall_auc']

# **7. SHOW BENCHMARK RESULTS**

We compare by printing, the results of the evaluations of different Algo-Dataset combinations.

\[Latex version: bold for highest, italics for second highest, ?\]

In [29]:
#ID 26(7)

# print results in easily readable format
co.print_results()

For roc_auc:
                 mvtec_ad_myImageDataset     mnist   cifar10  \
CAE v2                          0.277778  0.982190  0.625590   
ConvAutoEncoder                 0.284722  0.935276  0.713568   
Average                         0.281250  0.958733  0.669579   

                 mvtec_ad_myImageDataset2   Average  
CAE v2                           0.631944  0.629375  
ConvAutoEncoder                  0.583333  0.629225  
Average                          0.607639       NaN  
For adjusted_average_precision:
                 mvtec_ad_myImageDataset     mnist   cifar10  \
CAE v2                         -0.162009  0.960543  0.179836   
ConvAutoEncoder                -0.181918  0.832317  0.325453   
Average                        -0.171964  0.896430  0.252644   

                 mvtec_ad_myImageDataset2   Average  
CAE v2                           0.370228  0.337149  
ConvAutoEncoder                  0.179934  0.288946  
Average                          0.275081       NaN  
For preci

In [30]:
#ID 27(7)

# print results in easily readable format with standard deviations
co.print_results(include_stdevs=True)

For roc_auc:
                mvtec_ad_myImageDataset         mnist       cifar10  \
CAE v2                     0.278+-0.000  0.982+-0.000  0.626+-0.000   
ConvAutoEncoder            0.285+-0.000  0.935+-0.000  0.714+-0.000   
Average                           0.281         0.959         0.670   

                mvtec_ad_myImageDataset2   Average  
CAE v2                      0.632+-0.000  0.629375  
ConvAutoEncoder             0.583+-0.000  0.629225  
Average                            0.608       NaN  

For adjusted_average_precision:
                mvtec_ad_myImageDataset         mnist       cifar10  \
CAE v2                    -0.162+-0.000  0.961+-0.000  0.180+-0.000   
ConvAutoEncoder           -0.182+-0.000  0.832+-0.000  0.325+-0.000   
Average                          -0.172         0.896         0.253   

                mvtec_ad_myImageDataset2   Average  
CAE v2                      0.370+-0.000  0.337149  
ConvAutoEncoder             0.180+-0.000  0.288946  
Average      

In [31]:
# ID 28(7)

co.print_latex(include_stdevs=True)

For roc_auc:
\begin{center}
\begin{tabular}{  c c c c c c  }
  & mvtec\_ad\_myImageDataset & mnist & cifar10 & mvtec\_ad\_myImageDataset2 & Average \\
  CAE v2 & \textit{0.278$\pm$0.000} & \textbf{0.982$\pm$0.000} & \textit{0.626$\pm$0.000} & \textbf{0.632$\pm$0.000} & \textbf{0.629} \\
  ConvAutoEncoder & \textbf{0.285$\pm$0.000} & \textit{0.935$\pm$0.000} & \textbf{0.714$\pm$0.000} & \textit{0.583$\pm$0.000} & \textit{0.629} \\
  Average & 0.281 & 0.959 & 0.670 & 0.608 &    \\
\end{tabular}
\end{center}

For adjusted_average_precision:
\begin{center}
\begin{tabular}{  c c c c c c  }
  & mvtec\_ad\_myImageDataset & mnist & cifar10 & mvtec\_ad\_myImageDataset2 & Average \\
  CAE v2 & \textbf{-0.162$\pm$0.000} & \textbf{0.961$\pm$0.000} & \textit{0.180$\pm$0.000} & \textbf{0.370$\pm$0.000} & \textbf{0.337} \\
  ConvAutoEncoder & \textit{-0.182$\pm$0.000} & \textit{0.832$\pm$0.000} & \textbf{0.325$\pm$0.000} & \textit{0.180$\pm$0.000} & \textit{0.289} \\
  Average & -0.172 & 0.896 & 0.25

# **8. REPRODUCIBILITY**

 ## **8.1 Creating recipes**

This section shows **how `oab` can be used to make sampling results easily reproducible** .
 

`yaml` files play an integral role in making reproducibility work, as they store the operations and parameters performed on the data.

We will see how to produce a recipe(.yaml) of the Benchmarkrun already performed  in <b>#ID 23(5)</b>

In <b>#ID 10(3) #ID 12(3) #ID 14(4)</b>,  We already performed operations on own datasets and OAB's datasets, and then already stored the daasets information as we can see below: 

In [32]:
#ID 29(8)
!cat {new_recipe_path}

myImageDataset:
- dataset
- standard_functions:
  - name: scale
    parameters:
      scaling_factor: 0.00392156862745098
- anomaly_dataset:
    arguments:
      normal_labels:
      - 0
      anomaly_labels:
- sampling:
    semisupervised_training_split_multiple:
      training_split: 0.7
      max_contamination_rate: 0.5
      n_steps: 1
      flatten_images: false
mnist:
- dataset
- standard_functions:
  - name: scale
    parameters:
      scaling_factor: 0.00392156862745098
- anomaly_dataset:
    arguments:
      normal_labels:
      - 0
      anomaly_labels:
- sampling:
    semisupervised_training_split_multiple:
      training_split: 0.7
      max_contamination_rate: 0.5
      n_steps: 1
      flatten_images: false
mvtec_ad_transistor:
- dataset
- standard_functions:
  - name: scale
    parameters:
      scaling_factor: 0.00392156862745098
- anomaly_dataset:
    arguments:
      normal_labels:
      - 0
      anomaly_labels:


Now, we will store the information of  datasets and algorithms information from <b>Paper_B's</b> recipe
and only of the algorithms of this benchmark in the new recipe:

In [33]:
#ID 30(8)   # Execute this cell only when you already loaded datasets from a recipe file  
# adding datasets from recipe used in in benchmark run in #ID 23(5)
for dataset_name in recipe_datasets:
    
    #storing anomaly parameters
    dataset_info_store(dataset_name,new_recipe_path,info_type='anomaly_dataset',content=recipe_datasets[dataset_name][0].normal_labels)
    
    # storing preprocesing parameters
    dataset_info_store(dataset_name,new_recipe_path,info_type='standard_functions',content=recipe_datasets[dataset_name][3]) 
    #dataset_info_store(dataset_name,new_recipe_path,info_type='custom_functions',content=recipe_datasets[dataset_name][4]) 
    
    
    #storing sampling parameters
    if dataset_name[:9]!='mvtec_ad_':
      sampling_data=recipe_datasets[dataset_name][1]
      dataset_info_store(dataset_name,new_recipe_path,'sampling',content=sampling_data)
   
    

Now,we will store information about <b>Algorithms and their hyperparameters</b> in the recipe(.yaml) 

In [34]:
#ID 31(8)
for algo in lst_benchmark_algorithms:
    
    x=algo["algo_module_name"]
    y=['algo_name',
         
         {
         'init': 
          
               {

       'params':algo["algo_parameters"]
          
               },
        
        'fit':algo["fit"]   
        ,

        'decision_function':algo["decision_function"]
         },
         
         algo["algo_class_name"]
        
        ]
                 
     
    yaml=YAML(typ='rt')
    yaml_content = yaml.load(Path("./") / new_recipe_path)
    yaml_content[x]=y
    yaml_content['seed']=[seed]          # adding seed to new recipe
    yaml.dump(yaml_content, Path("./") /new_recipe_path)

In [35]:
#ID 32(8)
!cat {new_recipe_path}

myImageDataset:
- dataset
- standard_functions:
  - name: scale
    parameters:
      scaling_factor: 0.00392156862745098
- anomaly_dataset:
    arguments:
      normal_labels:
      - 0
      anomaly_labels:
- sampling:
    semisupervised_training_split_multiple:
      training_split: 0.7
      max_contamination_rate: 0.5
      n_steps: 1
      flatten_images: false
mnist:
- dataset
- standard_functions:
  - name: scale
    parameters:
      scaling_factor: 0.00392156862745098
- anomaly_dataset:
    arguments:
      normal_labels:
      - 0
      anomaly_labels:
- sampling:
    semisupervised_training_split_multiple:
      training_split: 0.7
      max_contamination_rate: 0.5
      n_steps: 1
      flatten_images: false
mvtec_ad_transistor:
- dataset
- standard_functions:
  - name: scale
    parameters:
      scaling_factor: 0.00392156862745098
- anomaly_dataset:
    arguments:
      normal_labels:
      - 0
      anomaly_labels:
cifar10:


In **f"{time}-{benchmark_name}-{benchmark_type}-recipe.yaml"**, we now see the sampling parameters, anomaly- dataset-conversion parameters, hyperparamters along with the algorithms for "semisupervised_multiple_with_training_split". If sampling is done in a different scenario, e.g., unsupervised multiple, this would also be stored in f"{benchmark_name}/{time}-{benchmark_name}-{benchmark_type}-recipe.yaml" using a different key in the sampling dict.



### 2. Reproducing the experiment

To reproduce the recipe created in the previous section,
we refer to <b>Section 5 #ID 20(5)</b> where we can reproduce the run as well as extend benchmarks!

# **9. EXTEND EXISTING BENCHMARK(own algorithm)**

To extend the existing benchmark here basically means to add  our own algorithm to the benchmark and to show the comparison results of pre-installed algorithms while also loading our own dataset.


1. We load the datasets. To know how to do that, we can refer to  **Section "1. Data" and "2. Data Selection"**
2. Then, load own algorithm as we will see in the next sub-section.

## **9.1 Loading own Algorithm**

In this subsection, you will see **how an own semisupervised anomaly detection algorithm** can easily be used within oab to be evaluated. We will see how a class representing an algorithm can be structured and how its performance is evaluated.

Of course, this is not the only way to use the functionality provided by oab. We do consider it to be the simplest way however.

In [36]:
#ID 33(9)

# download example algorithm and inspect content
import wget
wget.download('https://raw.githubusercontent.com/jandeller/test/main/RandomGuesserSemisupervised.py',"RandomGuesserSemisupervised.py")
!cat RandomGuesserSemisupervised.py

  0% [                                                              ]   0 / 291100% [..............................................................] 291 / 291import numpy as np

class RandomGuesserSemisupervised():

    def fit(self, X_train):
        pass
      
    def decision_function(self, X_test):
        "Assign a random number to each sample from the test set"
        n_samples = X_test.shape[0]
        return np.random.randn(n_samples)


The sample `RandomGuesser` algorithm shown here is - as the name suggests - a random guesser, i.e., it assigns random anomaly scores to the samples.

An algorithm used for semisupervised anomaly detection needs to specify a `fit(X_train)` method for training and a `decision_function(X_test)` method for inference that returns an anomaly score per data point in the test set.

It is of course possible to rename the method and field, use a method for accessing the anomaly scores, etc. Note that if this is done, the following code has to be changed accordingly. Adhering to the conventions described above (`fit(X_train)` and `decision_function(X_test)`) allows you to use the same interface as algorithms from [`PyOD`](https://pyod.readthedocs.io/en/latest/) as shown when [comparing algorithms using `oab`](https://colab.research.google.com/drive/1aV_itaYCJgzdZ1lQ7SUyHQ7z01xSPxDN?usp=sharing#scrollTo=QnAfCGTGL7xv).

In [37]:
#ID 34(9)
# used imports from #ID 3(0),#ID 18(5)
#used sampling parameters from #ID 14(4)

# and import the RandomGuesser
from RandomGuesserSemisupervised import RandomGuesserSemisupervised
    
own_algorithms=[{
    
       ### ADD YOUR OWN ALGO DETAILS IN THIS FORM ###
       "algo_module_name": "RandomGuesserSemisupervised",   
       "algo_class_name": "RandomGuesserSemisupervised",
       "algo_name_in_result_table": "RandomGuesserSemisupervised",
       "algo_parameters": {},
        "fit": {'method_name': 'fit', 'params': {}}, 
        "decision_function": {'method_name': 'decision_function', 'params': {}}
        }]




The `own_algorithms` list in the above cell #ID 34(9) can be added to `lst_benchmarking_algos` as mentioned in #ID 19(5) to use this algorithm in a benchmark run shown in #ID 23(5) along with other algorithms

In [38]:
#ID 35(9)
        

#  A comparison object is created for comparing the evaluations of different Algo-Dataset combinations
co = ComparisonObject()

for dataset_name in benchmarking_datasets:
  # evaluate the random guesser
  print(dataset_name)
  eval_obj = EvaluationObject(algorithm_name="RandomGuesser")
  for (X_train, X_test, y_test), settings in benchmarking_datasets[dataset_name][0].sample_multiple_with_training_split(training_split=training_split, 
                                                                  max_contamination_rate=max_contamination_rate, 
                                                                  n_steps=n_steps):
      print(".", end=" ") # update to see progress
      rg = RandomGuesserSemisupervised()
      rg.fit(X_train) # data is fitted to RandomGuesser
      pred = rg.decision_function(X_test) # and decision_scores_ is accessed
      eval_obj.add(y_test, pred, settings)
  print("\n")
  eval_desc = eval_obj.evaluate(metrics=['roc_auc', 'adjusted_average_precision', 'precision_recall_auc'])
  # added to comparison object
  co.add_evaluation(eval_desc)
  print("\n")


myImageDataset
. 

Evaluation on dataset mvtec_ad_myImageDataset with normal labels [0] and anomaly labels [1.0].
Total of 1 datasets. Per dataset:
28 training instances, 24 test instances, training contamination rate 0.0, test contamination rate 0.5.
Mean 	 Std_dev 	 Metric
0.542 	 0.000 		 roc_auc
0.248 	 0.000 		 adjusted_average_precision
0.606 	 0.000 		 precision_recall_auc


mnist
. 

Evaluation on dataset mnist with normal labels [0] and anomaly labels [1, 2, 3, 4, 5, 6, 7, 8, 9].
Total of 1 datasets. Per dataset:
4832 training instances, 4142 test instances, training contamination rate 0.0, test contamination rate 0.5.
Mean 	 Std_dev 	 Metric
0.501 	 0.000 		 roc_auc
0.006 	 0.000 		 adjusted_average_precision
0.503 	 0.000 		 precision_recall_auc


mvtec_ad_transistor
. 

Evaluation on dataset mvtec_ad_transistor with normal labels [0] and anomaly labels [1.0].
Total of 1 datasets. Per dataset:
191 training instances, 122 test instances, training contamination rate 0.0, test 

As in the above code, We store the evaluations of our own algorithm in evaluation object which is then added to comparison object.Similarly, we can create evaluation objects for other algorithms and add them to comparison object for final benchmarking  as shown in Section 5

Finally, we show below the benchmarking results of our algorithm as described in "**Section 7. Show Benchmarking Results**"

In [40]:
#ID 36(9)

# print results in easily readable format
co.print_results()

For roc_auc:
                 mvtec_ad_myImageDataset     mnist   cifar10  \
CAE v2                          0.354167  0.982190  0.659082   
ConvAutoEncoder                 0.555556  0.935276  0.713073   
RandomGuesser                   0.541667  0.500874  0.481623   
Average                         0.483796  0.806114  0.617926   

                 mvtec_ad_myImageDataset2  mvtec_ad_transistor   Average  
CAE v2                           0.500000                  NaN  0.623860  
ConvAutoEncoder                  0.420000                  NaN  0.655976  
RandomGuesser                    0.480000             0.426524  0.486138  
Average                          0.466667             0.426524       NaN  
For adjusted_average_precision:
                 mvtec_ad_myImageDataset     mnist   cifar10  \
CAE v2                         -0.048762  0.960543  0.241751   
ConvAutoEncoder                 0.131004  0.832317  0.324656   
RandomGuesser                   0.248323  0.006392 -0.019738   
Ave

In [41]:
#ID 37(9)
# print results in easily readable format with standard deviations
co.print_results(include_stdevs=True)

For roc_auc:
                mvtec_ad_myImageDataset         mnist       cifar10  \
CAE v2                     0.278+-0.000  0.982+-0.000  0.626+-0.000   
ConvAutoEncoder            0.285+-0.000  0.935+-0.000  0.714+-0.000   
RandomGuesser              0.542+-0.000  0.501+-0.000  0.482+-0.000   
Average                           0.368         0.806         0.607   

                mvtec_ad_myImageDataset2 mvtec_ad_transistor   Average  
CAE v2                      0.632+-0.000            nan+-nan  0.629375  
ConvAutoEncoder             0.583+-0.000            nan+-nan  0.629225  
RandomGuesser               0.535+-0.000        0.427+-0.000  0.497082  
Average                            0.583               0.427       NaN  

For adjusted_average_precision:
                mvtec_ad_myImageDataset         mnist        cifar10  \
CAE v2                    -0.162+-0.000  0.961+-0.000   0.180+-0.000   
ConvAutoEncoder           -0.182+-0.000  0.832+-0.000   0.325+-0.000   
RandomGuesser    

In [42]:
#ID 38(9)

co.print_latex(include_stdevs=True)

For roc_auc:
\begin{center}
\begin{tabular}{  c c c c c c c  }
  & mvtec\_ad\_myImageDataset & mnist & cifar10 & mvtec\_ad\_myImageDataset2 & mvtec\_ad\_transistor & Average \\
  CAE v2 & 0.278$\pm$0.000 & \textbf{0.982$\pm$0.000} & \textit{0.626$\pm$0.000} & \textbf{0.632$\pm$0.000} & \textit{nan$\pm$nan} & \textbf{0.629} \\
  ConvAutoEncoder & \textit{0.285$\pm$0.000} & \textit{0.935$\pm$0.000} & \textbf{0.714$\pm$0.000} & \textit{0.583$\pm$0.000} & nan$\pm$nan & \textit{0.629} \\
  RandomGuesser & \textbf{0.542$\pm$0.000} & 0.501$\pm$0.000 & 0.482$\pm$0.000 & 0.535$\pm$0.000 & \textbf{0.427$\pm$0.000} & 0.497 \\
  Average & 0.368 & 0.806 & 0.607 & 0.583 & 0.427 &    \\
\end{tabular}
\end{center}

For adjusted_average_precision:
\begin{center}
\begin{tabular}{  c c c c c c c  }
  & mvtec\_ad\_myImageDataset & mnist & cifar10 & mvtec\_ad\_myImageDataset2 & mvtec\_ad\_transistor & Average \\
  CAE v2 & \textit{-0.162$\pm$0.000} & \textbf{0.961$\pm$0.000} & \textit{0.180$\pm$0.000} & \t

So,This was our example algorithm. Other algorithms can be used to run and extend benchmarks,  Please refer  to <b>5. ALGORITHM TRAINING AND TESTING</b>.