
## In this notebook, you will walk through all the steps, performed in sequence, that are needed to use the full functionality of the OAB framework. The steps are as follows:
0. SETUP
1. DATA
2. DATA SELECTION
3. PREPROCESSING
4. SAMPLING
5. ALGORITHM TRAINING AND TESTING
6. EVALUATION
7. SHOW BENCHMARK RESULTS
8. REPRODUCIBILITY
9. EXTENDING THE BENCHMARK (with own algorithm)

This notebook focuses on <b>Unsupervised Tabular Data</b>. Let's begin!

# **0. SETUP**

The `oab` framework can be installed into your Python environment as a `PyPI package` using the following command:

In [None]:
#ID 1(0)

#%%capture
# pip install oab
!pip install example-pkg-jd-kiel --extra-index-url=https://test.pypi.org/simple/

`Cloning` the repository:

`oab` is an open-source framework available at https://github.com/ISDM-CAU-Kiel/oab. To use this .ipynb notebook successfully, clone the repository with the command in #ID 2(0) and run this notebook (if that is not already the case) from within the cloned repository at the path:

<b>/oab/notebooks/benchmark_image/Unsupervised_Anomaly_Detection_on_Benchmark_Tabular_Data_v0.2.ipynb</b>

In [None]:
#ID 2(0)
!git clone https://github.com/ISDM-CAU-Kiel/oab.git

Now we import the necessary functions and internal variables:

In [2]:
#ID 3(0)

import sys
import os
from datetime import datetime 
from pathlib import Path         
sys.path.append('../..')           


%load_ext autoreload
%autoreload 2

# necessary imports for loading datasets as well as information from recipe files
from oab.data.unsupervised import UnsupervisedAnomalyDataset
from oab.algorithms.unsupervised_wrapper import UnsupervisedWrapperToRecipe
from oab.data.load_image_dataset import _load_image_dataset
from oab.data.load_dataset import load_dataset
from oab.data.utils_image import image_datasets
from oab.data.utils import _append_to_yaml
from oab.data.load_recipe_functions import *

# necessary imports for algorithm comparisons and defining seeds
from oab.evaluation import EvaluationObject, ComparisonObject,all_metrics

import numpy as np
import random


The autoreload extension is already loaded. To reload it, use:
  %reload_ext autoreload


## **0.1 NOTEBOOK AND CELL STRUCTURE** 

In this notebook, sections where the user is required to enter their own information are marked with comments of the form:

<b>### ADD YOUR CODE ###</b>, so searching for <b>###</b> locates those sections.

All cells are assigned an ID as a comment at the top of the cell, for example <b>#ID 10(5)</b>, where 10 denotes the cell ID and 5 denotes the section.
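
The ID convention can also be read programmatically, for example to jump to a cell by its number. The sketch below is only an illustration: `parse_cell_tag` is a hypothetical helper and not part of `oab`, and it assumes the tag format described above.

```python
import re

# Matches tags such as "#ID 10(5)" or "# ID 6(1)" (a space after '#' is optional).
CELL_TAG = re.compile(r"#\s*ID\s+(\d+)\((\d+)\)")


def parse_cell_tag(line):
    """Return (cell_id, section) for a tagged line, or None if no tag is present."""
    match = CELL_TAG.search(line)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


print(parse_cell_tag("#ID 10(5)"))  # (10, 5)
```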

## **0.2 DETAILS OF THIS BENCHMARK RUN**

In [3]:
#ID 4(0)


#timestamp is set  for this run  

time=datetime.now().strftime("%Y%m%d%H%M%S")

### ADD YOUR OWN BENCHMARK NAME ###
benchmark_name="Paper_A"


benchmark_type="ust"   # benchmark run for unsupervised tabular datasets(ust)

dataset_folder="datasets"    # all files of datasets stored in this folder
datasets_yaml_path= Path(os.getcwd()).parent.parent/"oab"/"data"/"datasets.yaml"      # information about all datasets are stored in this file


# create a folder for this benchmark run, named after the current timestamp, to store its recipes
if not os.path.exists(f"{benchmark_name}/{time}-{benchmark_type}"):
    os.makedirs(f"{benchmark_name}/{time}-{benchmark_type}")

new_recipe_path=f"{benchmark_name}/{time}-{benchmark_type}/{time}-{benchmark_name}"
print(new_recipe_path)

Paper_A/20220301200056-ust/20220301200056-Paper_A
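
The printed value is a path prefix: every recipe file of this run starts with it. A minimal sketch of the same layout using `pathlib` follows; `make_recipe_prefix` is a hypothetical name introduced for illustration, and a temporary directory is used so that nothing is written into the repository.

```python
import tempfile
from pathlib import Path


def make_recipe_prefix(root, benchmark_name, benchmark_type, timestamp):
    """Create <root>/<name>/<timestamp>-<type>/ and return the recipe path prefix."""
    run_dir = Path(root) / benchmark_name / f"{timestamp}-{benchmark_type}"
    run_dir.mkdir(parents=True, exist_ok=True)  # no error if the folder already exists
    return run_dir / f"{timestamp}-{benchmark_name}"


with tempfile.TemporaryDirectory() as tmp:
    prefix = make_recipe_prefix(tmp, "Paper_A", "ust", "20220301200056")
    print(prefix.relative_to(tmp).as_posix())
    # Paper_A/20220301200056-ust/20220301200056-Paper_A
```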


### To reproduce previously created recipes of another benchmark run, without adding new datasets and algorithms from this benchmark, skip to:

### `#ID 20(5)`

# **1. DATA**

First, we look at the datasets that come pre-installed with OAB and can be used for benchmarking.

In [4]:
#ID 5(1)
for i in get_tabular_dataset_names(datasets_yaml_path):
    print(f"{i[0]}.{i[1]}")

0.NASA_ground_data
1.annthyroid
2.arrhytmia
3.boston
4.breastw
5.cardio
6.forest_cover
7.http
8.ionosphere
9.mammography
10.musk
11.myTabularDataset
12.myTabularDataset2
13.optdigits
14.page-blocks
15.pendigits
16.pima
17.pulsar_star
18.shuttle
19.smtp
20.spambase
21.thyroid
22.vertebral
23.wilt
24.wine



`oab` provides a variety of tabular datasets that can easily be loaded.

`1.` To use your own tabular dataset, follow these steps:

 (a) Ensure that the information about your `own` dataset(s) is stored in the file `datasets.yaml`, which is located at
<b>"/oab/data"</b> (this is done by executing #ID 7(1)).

 (b) The `folder` containing the dataset must be stored in the `datasets` folder of OAB, and its name must be the same as the dataset_name.

 (c) Your own dataset(s) can then be loaded just like the pre-installed OAB datasets.

 `2.` If your dataset is provided **via a URL**, it is downloaded and stored in OAB's "datasets" folder.

 The files in the `folder`, whether downloaded or stored manually in the "datasets" folder, can be in a variety of formats:

 1. 'csv'
 2. 'zip'
 3. 'gz_single_file'
 4. 'mat'
 5. 'mat_old'

 If the dataset is in one of these formats, or consists of multiple files, `oab` automatically combines it into a single file that is then used as input. Here we look at the case where the user loads a dataset from a local `folder directory`.
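
To picture what combining multiple files into one means for the CSV case, here is a minimal sketch using only the standard library. It is not the `oab` implementation: `concatenate_csv_files` is a hypothetical helper, and it assumes that every file starts with the same header row.

```python
import csv
import tempfile
from pathlib import Path


def concatenate_csv_files(paths, destination):
    """Write the rows of all CSV files in `paths` to `destination` with a single header."""
    header = None
    with open(destination, "w", newline="") as out:
        writer = csv.writer(out)
        for path in paths:
            with open(path, newline="") as src:
                reader = csv.reader(src)
                file_header = next(reader)
                if header is None:
                    header = file_header
                    writer.writerow(header)
                elif file_header != header:
                    raise ValueError(f"{path} has a different header")
                writer.writerows(reader)
    return destination


with tempfile.TemporaryDirectory() as tmp:
    train, test = Path(tmp) / "train_example.csv", Path(tmp) / "test_example.csv"
    train.write_text("f1,label\n1.0,0\n")
    test.write_text("f1,label\n2.0,1\n")
    merged = concatenate_csv_files([train, test], Path(tmp) / "XYZDataset.csv")
    print(merged.read_text().splitlines())  # ['f1,label', '1.0,0', '2.0,1']
```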

Here is how the information about the datasets is stored in `datasets.yaml`.

In [5]:
# ID 6(1)
!cat {datasets_yaml_path}

NASA_ground_data:
  class_labels: last
  credits: Sayyad Shirabad, J. and Menzies, T.J. (2005) The PROMISE Repository of
    Software Engineering Databases. School of Information Technology and Engineering,
    University of Ottawa, Canada.
  dataset_format: csv
  destination_filenames: [NASA_ground_data.csv]
  destination_yaml: NASA_ground_data_preprocessing.yaml
  filename_in_folder: NASA_ground_data.csv
  filenames_to_concatenate: null
  foldername: NASA_ground_data
  load_csv_arguments: {header: 0}
  name: NASA_ground_data
  short_description: null
  url_yaml: https://raw.githubusercontent.com/jandeller/test/main/test6.yaml
  urls_dataset: ['https://www.openml.org/data/get_csv/53950']
annthyroid:
  class_labels: last
  credits: "Dataset provided by OODS on http://odds.cs.stonybrook.edu/annthyroid-dataset/.
    If you use this for publications, please keep the publication policy in mind (http://odds.cs.stonybrook.edu/about-odds/).
    \n Shebuti Rayana (2016).  O

In [6]:
#ID 7(1)


### ADD OWN DATASET(S) DETAILS ###   see #ID 3(1) for exact parameters which are to be entered

own_datasets_info= [{
                        'dataset_name':'myTabularDataset',
                    }]


# 'myTabularDataset' is the name of the dataset (which the user loads for benchmarking)
### Add more dictionaries to the list `own_datasets_info` with dataset information, like the example below
                   #           {
                   #               'dataset_name': 'XYZDataset',
                   #               'filenames_to_concatenate': ['train_example.csv', 'test_example.csv'],
                   #               'dataset_format': 'csv',        (type of dataset file, default: "csv")
                   #               'class_labels': 'last',         (which column holds the labels, default: "last")
                   #               'csv_header': 0,                (default 0: the first row contains the column headers)
                   #               'urls_dataset': "https://mydataset-url.com",   (if the user wants to download the dataset via URL)
                   #               'destination_filenames': ['XYZDataset.mat']   (name(s) of the file(s) to download from the URL provided)
                   #           }
                
  



### ADD DATASETNAME(S) FROM OAB'S DATASETS HERE ###

benchmark_datasets_list=['annthyroid']   # More of OAB's datasets can be added to this list




## contains dataset objects of own_datasets_info as well as benchmark_datasets_list
datasets={}


# Adding and Loading own datasets
for dataset_details in own_datasets_info:
    dataset_details['datasets_yaml_path']=datasets_yaml_path
    dataset_details['dataset_folder']=dataset_folder
    datasets[dataset_details['dataset_name']]=load_own_tabular_dataset(**dataset_details)


print(datasets)


{'myTabularDataset': <oab.data.classification_dataset.ClassificationDataset object at 0x7fb000044190>}


# **2. DATA SELECTION**

Datasets can either be loaded directly as anomaly datasets or as classification datasets. In the former case, the dataset is automatically fully prepared and ready for sampling. In the latter case, further preprocessing is still possible and necessary.

**After adding and loading your own dataset(s) in #ID 7(1), you can now select further benchmarking datasets:**

We now look again at all datasets that are pre-installed in OAB, so that they can be chosen for the benchmark run.

In [7]:
#ID 8(2)


for i in get_tabular_dataset_names(datasets_yaml_path):
    print(f"{i[0]}.{i[1]}")

0.NASA_ground_data
1.annthyroid
2.arrhytmia
3.boston
4.breastw
5.cardio
6.forest_cover
7.http
8.ionosphere
9.mammography
10.musk
11.myTabularDataset
12.myTabularDataset2
13.optdigits
14.page-blocks
15.pendigits
16.pima
17.pulsar_star
18.shuttle
19.smtp
20.spambase
21.thyroid
22.vertebral
23.wilt
24.wine


### **2.1 Load anomaly detection datasets (with or without further preprocessing)**

In this section, we load some pre-installed datasets using the `load_dataset` function. By default, it creates an anomaly dataset from which sampling is directly possible. Alternatively, we can first create a classification dataset and then an anomaly dataset, either with preprocessing applied (`preprocess_classification_dataset=True`), i.e. standard or custom operations such as treat_missing_values, delete_columns, etc. are performed, or without (`preprocess_classification_dataset=False`, default).

`In our case`, we have already imported our own datasets with `anomaly_dataset=False` and `preprocess_classification_dataset=False` in <b>#ID 7(1)</b>, and we will load the OAB datasets in the same way in <b>#ID 9(2)</b>.

Note that, as discussed in the paper, multiclass classification datasets like `spambase` and `annthyroid` are loaded with the class label `0` as the normal label and all other labels as anomaly labels by default. (Alternatively, `oab` can automatically iterate through all classes as normal classes. This is not covered here.)
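
The default label convention can be illustrated in a few lines. This is only a sketch of the idea (label `0` is normal, everything else is an anomaly); `to_anomaly_labels` is a hypothetical helper, not an `oab` function.

```python
def to_anomaly_labels(class_labels, normal_labels=(0,)):
    """Map class labels to 0 (normal) or 1 (anomaly)."""
    normal = set(normal_labels)
    return [0 if label in normal else 1 for label in class_labels]


# Classes 1 and 2 both become anomalies when 0 is the only normal label.
print(to_anomaly_labels([0, 1, 2, 0, 2]))  # [0, 1, 1, 0, 1]
```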

In [8]:
#ID 9(2)

#### ADD YOUR OWN NUMBER OF DATASETS AND FROM OAB FOR BENCHMARKING  ###

for dataset_name in benchmark_datasets_list:  # loading benchmark's datasets
    datasets[dataset_name]=load_dataset(dataset_name,anomaly_dataset=False,preprocess_classification_dataset=False,dataset_folder=dataset_folder)


Credits: Dataset provided by OODS on http://odds.cs.stonybrook.edu/annthyroid-dataset/. If you use this for publications, please keep the publication policy in mind (http://odds.cs.stonybrook.edu/about-odds/). 
 Shebuti Rayana (2016).  ODDS Library [http://odds.cs.stonybrook.edu]. Stony Brook, NY: Stony Brook University, Department of Computer Science.


In [9]:
#ID 10(2)
print(datasets)

{'myTabularDataset': <oab.data.classification_dataset.ClassificationDataset object at 0x7fb000044190>, 'annthyroid': <oab.data.classification_dataset.ClassificationDataset object at 0x7faf8f2b3b90>}


# **3. PREPROCESSING**

Standard preprocessing steps (or custom preprocessing steps defined by the user), such as deleting columns, encoding categorical values differently, or removing missing values, can be applied to tabular data. These methods, as well as user-defined preprocessing steps and how they are captured, are covered in this section.

Here, we show only two preprocessing steps that are applied to the datasets in `preprocess_datasets` (loaded in 2.2); they can also be performed individually as required:
- Perform `Standard/Custom Preprocessing functions`
- `Transform the dataset into an anomaly dataset`

In [10]:
#ID 11(3)                            SCALING APPLIED

                                         
for dataset_name in datasets:
    
    datasets[dataset_name].treat_missing_values()
    datasets[dataset_name].normalize_columns()
    datasets[dataset_name].delete_duplicates()
    #NEW RECIPE CREATED
    datasets[dataset_name].write_operations_to_yaml(f"{new_recipe_path}-[{dataset_name}]-dataset-recipe.yaml")
#print("Scaling performed on datasets!")    
#print(datasets)

The individual recipe files now contain information about how to preprocess (i.e. perform scaling on) the datasets.

In [11]:
#ID 12(3)


for (i,dataset_name) in enumerate(datasets):
    print(f"{i+1}.{new_recipe_path}-[{dataset_name}]-dataset-recipe.yaml\n")
    !cat {new_recipe_path}-[{dataset_name}]-dataset-recipe.yaml

1.Paper_A/20220301200056-ust/20220301200056-Paper_A-[myTabularDataset]-dataset-recipe.yaml

standard_functions:
- name: treat_missing_values
  parameters:
    delete_attributes: true
    missing_value: np.nan
- name: normalize_columns
  parameters:
    cols_to_normalize: null
- name: delete_duplicates
  parameters: {}
2.Paper_A/20220301200056-ust/20220301200056-Paper_A-[annthyroid]-dataset-recipe.yaml

standard_functions:
- name: treat_missing_values
  parameters:
    delete_attributes: true
    missing_value: np.nan
- name: normalize_columns
  parameters:
    cols_to_normalize: null
- name: delete_duplicates
  parameters: {}
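
A recipe of this shape is an ordered list of named steps with parameters, so replaying it amounts to looking up each step by name and calling it. The sketch below illustrates that idea on plain Python lists. Everything in it is an assumption made for illustration: the two step functions are toy stand-ins (min-max scaling is assumed for `normalize_columns`), and `STEP_REGISTRY` and `apply_recipe` are hypothetical names, not the `oab` API.

```python
def delete_duplicates(rows):
    """Drop repeated rows, keeping the first occurrence."""
    seen, unique = set(), []
    for row in rows:
        key = tuple(row)
        if key not in seen:
            seen.add(key)
            unique.append(row)
    return unique


def normalize_columns(rows, cols_to_normalize=None):
    """Min-max scale the selected columns (all columns if None) to [0, 1]."""
    cols = range(len(rows[0])) if cols_to_normalize is None else cols_to_normalize
    scaled = [list(row) for row in rows]
    for c in cols:
        values = [row[c] for row in rows]
        low, high = min(values), max(values)
        span = (high - low) or 1.0  # constant columns would otherwise divide by zero
        for row in scaled:
            row[c] = (row[c] - low) / span
    return scaled


# Hypothetical registry mapping recipe step names to functions.
STEP_REGISTRY = {
    "delete_duplicates": delete_duplicates,
    "normalize_columns": normalize_columns,
}


def apply_recipe(rows, recipe):
    """Apply the steps listed under 'standard_functions' in order."""
    for step in recipe["standard_functions"]:
        rows = STEP_REGISTRY[step["name"]](rows, **step["parameters"])
    return rows


recipe = {
    "standard_functions": [
        {"name": "normalize_columns", "parameters": {"cols_to_normalize": None}},
        {"name": "delete_duplicates", "parameters": {}},
    ]
}
data = [[1.0, 10.0], [3.0, 30.0], [1.0, 10.0]]
print(apply_recipe(data, recipe))  # [[0.0, 0.0], [1.0, 1.0]]
```

Because the steps are applied in the recorded order, the same recipe applied to the same raw data yields the same preprocessed dataset.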


In [12]:
#ID 13(3)                            ANOMALY-DATASET CONVERSION PERFORMED
 

# for storing dataset objects converted to anomaly datasets
datasets_ad={}
for dataset_name in datasets:
    datasets_ad[dataset_name]= UnsupervisedAnomalyDataset(classification_dataset=datasets[dataset_name],
                                                          normal_labels=0,
                                                          yamlpath_append=f"{new_recipe_path}-[{dataset_name}]-dataset-recipe.yaml")

print("datasets after adding anomaly-conversion datasets: ")
print(datasets_ad)

datasets after adding anomaly-conversion datasets: 
{'myTabularDataset': <oab.data.unsupervised.UnsupervisedAnomalyDataset object at 0x7faf8ef52150>, 'annthyroid': <oab.data.unsupervised.UnsupervisedAnomalyDataset object at 0x7fb000020510>}


In [13]:
#ID 14(3)

for (i,dataset_name) in enumerate(datasets):
    print(f"{i+1}.{new_recipe_path}-[{dataset_name}]-dataset-recipe.yaml\n")
    !cat {new_recipe_path}-[{dataset_name}]-dataset-recipe.yaml

1.Paper_A/20220301200056-ust/20220301200056-Paper_A-[myTabularDataset]-dataset-recipe.yaml

standard_functions:
- name: treat_missing_values
  parameters:
    delete_attributes: true
    missing_value: np.nan
- name: normalize_columns
  parameters:
    cols_to_normalize:
- name: delete_duplicates
  parameters: {}
anomaly_dataset:
  arguments:
    normal_labels: 0
    anomaly_labels:
2.Paper_A/20220301200056-ust/20220301200056-Paper_A-[annthyroid]-dataset-recipe.yaml

standard_functions:
- name: treat_missing_values
  parameters:
    delete_attributes: true
    missing_value: np.nan
- name: normalize_columns
  parameters:
    cols_to_normalize:
- name: delete_duplicates
  parameters: {}
anomaly_dataset:
  arguments:
    normal_labels: 0
    anomaly_labels:


# **4. SAMPLING**

Here, we define the sampling parameters to sample from the datasets

In [14]:
#ID 15(4)


### ADD YOUR OWN SAMPLING PARAMETERS ###

# sampling parameters

n=25                             # Number of data points to sample
contamination_rate = 0.5         # Contamination rate when sampling, defaults to 0.05
n_steps = 10                     # Number of samples to take, i.e., number of times sampling is repeated, defaults to 10

# Possible sampling types for sampling from the datasets
sampling_types=['unsupervised_multiple','unsupervised_single','unsupervised_multiple_benchmark']


sampling_type='unsupervised_multiple'  #by default for this run


sampling_params={'n':n,'contamination_rate':contamination_rate,'n_steps':n_steps}



The sampling parameters above are used in
<b>#ID 23(5)</b>
to sample from the datasets (except the pre-installed mv_tec_ad_datasets) before training the algorithms.
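
To make the roles of `n`, `contamination_rate`, `n_steps` and the random seed concrete, here is a minimal sketch of repeated sampling with a fixed share of anomalies. It is not the `oab` sampler: `sample_indices` is a hypothetical function, and rounding the number of anomalies down is an assumption.

```python
import random


def sample_indices(labels, n, contamination_rate, n_steps, random_seed):
    """Yield n_steps index lists of size n with a fixed share of anomalies (label 1)."""
    rng = random.Random(random_seed)
    normals = [i for i, y in enumerate(labels) if y == 0]
    anomalies = [i for i, y in enumerate(labels) if y == 1]
    n_anomalies = int(n * contamination_rate)  # assumption: round down
    n_normals = n - n_anomalies
    for _ in range(n_steps):
        chosen = rng.sample(normals, n_normals) + rng.sample(anomalies, n_anomalies)
        rng.shuffle(chosen)
        yield chosen


labels = [0] * 80 + [1] * 20
for step, indices in enumerate(sample_indices(labels, n=20, contamination_rate=0.25,
                                              n_steps=3, random_seed=42)):
    print(step, len(indices), sum(labels[i] for i in indices))  # each sample: 20 points, 5 anomalies
```

Using a dedicated `random.Random(random_seed)` instance keeps the samples reproducible regardless of other code that touches the global random state.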

In [15]:
#ID 16(4)

benchmarking_datasets={}

for (x,y) in datasets_ad.items():
    benchmarking_datasets[x]=y

print(benchmarking_datasets)      

{'myTabularDataset': <oab.data.unsupervised.UnsupervisedAnomalyDataset object at 0x7faf8ef52150>, 'annthyroid': <oab.data.unsupervised.UnsupervisedAnomalyDataset object at 0x7fb000020510>}


The dictionary <b>benchmarking_datasets</b> above will be used for the benchmarking, as it contains all the required information:


    1. dataset_name
    2. final_dataset_object (preprocessed and anomaly-converted)



# **5. ALGORITHM TRAINING AND TESTING**

<b>To load your own algorithm(s), refer to #ID 30(9) and #ID 31(9)</b>, where an example algorithm is loaded. Then

come back to this cell and <b>load the details of your own algorithm(s) in #ID 17(5), #ID 18(5) and #ID 19(5)</b> in the same way as for the benchmark algorithms.

We first download and import the algorithms used for anomaly detection.

In [16]:
#ID 17(5)

import wget

wget.download('https://raw.githubusercontent.com/ISDM-CAU-Kiel/oab/master/notebooks/benchmark_tabular/ae_lof.py', "ae_lof.py")
from ae_lof import AELOF


  0% [                                                                      ]    0 / 2486100% [......................................................................] 2486 / 2486

In [17]:
#ID 18(5)



# import anomaly detection algorithms from pyod
from pyod.models.ocsvm import OCSVM # fit and decision_function
from pyod.models.iforest import IForest
from pyod.models.pca import PCA
from pyod.models.auto_encoder import AutoEncoder
from pyod.models.vae import VAE
### ADD your algo import here ###


First, we define the hyperparameters for all algorithms and choose which ones to benchmark:

In [18]:
#ID 19(5)

  
### Extend Algos dictionary with own algorithm specifications as shown below for OAB algorithms ###   

### ADD YOUR OWN (HYPER)PARAMETERS AND THEIR VALUES FOR PRE-INSTALLED ALGOS###

# KNN
knn_factor = 0.05
knn_minimum = 10


# LOF
lof_factor = 0.1
lof_minimum = 10

# ABOD
abod_factor = 0.01
abod_minimum = 10



lst_benchmark_algorithms =[
    {
       "algo_module_name": "pyod.models.abod" ,
       "algo_class_name": "ABOD",
       "algo_name_in_result_table": "ABOD",
       "algo_parameters": {'n_neighbors':{'abod_factor':0.1 ,'abod_minimum':10 }},
       "fit": {'method_name': 'fit', 'params': {}}, 
       "decision_scores": {'field_name': 'decision_scores_'}
    } 
    
    

]
''' ### uncomment to use these algos below for benchmarking ###
       ,
        
        {
       "algo_module_name": "pyod.models.iforest",   
       "algo_class_name": "IForest",
       "algo_name_in_result_table": "IForest",    
       "algo_parameters": {'random_state': 42} ,
        "fit": {'method_name': 'fit', 'params': {}}, 
        "decision_scores": {'field_name': 'decision_scores_'}
        },
        
        {
       "algo_module_name": "pyod.models.auto_encoder"   ,
       "algo_class_name": "AutoEncoder",
       "algo_name_in_result_table": "AutoEncoder",
       "algo_parameters":   {'hidden_neurons':[6,3,3,6], 'random_state': 42, 'verbose': 0},
        "fit": {'method_name': 'fit', 'params': {}}, 
       "decision_scores": {'field_name': 'decision_scores_'}
       } ,
       
    {   
       "algo_module_name": "pyod.models.knn" , 
       "algo_class_name": "KNN",
       "algo_name_in_result_table": "KNN",
       "algo_parameters": {'n_neighbors': {'knn_factor':0.05 ,'knn_minimum':10 }},
       "fit": {'method_name': 'fit', 'params': {}}, 
       "decision_scores": {'field_name': 'decision_scores_'}
    }  
    ,
    {   
       "algo_module_name": "pyod.models.lof" , 
       "algo_class_name": "LOF",
       "algo_name_in_result_table": "LOF",
       "algo_parameters":  {'n_neighbors': {'lof_factor':0.1 ,'lof_minimum':10 }},
       "fit": {'method_name': 'fit', 'params': {}}, 
       "decision_scores": {'field_name': 'decision_scores_'}
    }
    ,
    
    {
       "algo_module_name": "ae_lof" ,
       "algo_class_name": "AELOF",
       "algo_name_in_result_table": "AELOF", 
       "algo_parameters": {'AE_parameters':{'hidden_neurons':[6,3,3,6], 'random_state': 42, 'verbose': 0},
                            'LOF_parameters' : {'n_neighbors': {'lof_factor':0.1 ,'lof_minimum':10 }},
                           'random_state': 42},
    
       "fit": {'method_name': 'fit', 'params': {}}, 
       "decision_scores": {'field_name': 'decision_scores_'}
    }
    
    
    
]    
    
'''

### ADD OWN ALGORITHMS NAME(S) with algorithm specifications as shown above for OAB algorithms ###   

own_algorithms=[]  # add to this list, e.g. { "algo_module_name": "own_algo", "algo_class_name": "ownAlgoClass", ..., "decision_scores": {'field_name': 'decision_scores_'} }

lst_benchmark_algorithms.extend(own_algorithms)
recipe_datasets={}

# seed defined for this benchmark run to obtain consistent results
seed=42
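
The `n_neighbors` entries above are given as a factor and a minimum rather than as a fixed number, and the benchmark loop later passes the sample size to `algo_params_tabular`. One plausible reading, which is an assumption and not confirmed by this notebook, is that the neighbourhood size scales with the sample size but never drops below the minimum. The sketch below illustrates that reading only; `resolve_n_neighbors` is a hypothetical helper.

```python
def resolve_n_neighbors(spec, n_samples):
    """Assumed rule: n_neighbors = max(minimum, factor * n_samples), rounded down."""
    factor = next(v for k, v in spec.items() if k.endswith("_factor"))
    minimum = next(v for k, v in spec.items() if k.endswith("_minimum"))
    return max(minimum, int(factor * n_samples))


knn_spec = {"knn_factor": 0.05, "knn_minimum": 10}
print(resolve_n_neighbors(knn_spec, n_samples=25))    # 10 (the minimum applies)
print(resolve_n_neighbors(knn_spec, n_samples=1000))  # 50
```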

<b>LOAD YOUR RECIPE FOLDER CONTAINING ALL RECIPES</b> to be reproduced and use it in the current benchmark run.

In [19]:
#ID 20(5)        # Execute this cell only when you already have a recipe folder  to load from 

### ADD AN OPTIONAL RECIPE FOLDER CONTAINING ALL DATASETS AND ALGOS RECIPES PATH TO ADD TO THIS BENCHMARK RUN START ###   

# Note: only recipe folders of benchmarking type "unsupervised tabular (ust)", i.e. of the format
#               "timestamp-ust"
# can be used for benchmarking in this notebook.

recipes_folder= "Paper_B/20211222221219-ust" 

### ADD AN OPTIONAL RECIPE FOLDER CONTAINING ALL DATASETS AND ALGOS RECIPES PATH TO ADD TO THIS BENCHMARK RUN END ###
 
    
    
### UNCOMMENT ONLY IF NO NEW DATASETS WERE ADDED IN THE BENCHMARK EXCEPT FROM RECIPE FOLDER START ###     

#benchmarking_datasets={}
#lst_benchmark_algorithms=[]

### UNCOMMENT ONLY IF NO NEW DATASETS WERE ADDED IN THE BENCHMARK EXCEPT FROM RECIPE FOLDER END ###  

for i,file in enumerate(os.listdir(recipes_folder)):
    print(f"{i+1}.{file}\n")
    !cat {recipes_folder}/{file}
    print("\n")

1.20211222221219-Paper_B-[myTabularDataset2]-dataset-recipe.yaml

standard_functions:
- name: treat_missing_values
  parameters:
    delete_attributes: true
    missing_value: np.nan
- name: normalize_columns
  parameters:
    cols_to_normalize:
- name: delete_duplicates
  parameters: {}
anomaly_dataset:
  arguments:
    normal_labels: 0
    anomaly_labels:
seed: 42
sampling:
  unsupervised_multiple:
    n: 25
    n_steps: 10
    contamination_rate: 0.5
    shuffle: true
    random_seed: 42
    apply_random_seed: true
    keep_frequency_ratio_normals: false
    equal_frequency_normals: false
    keep_frequency_ratio_anomalies: false
    equal_frequency_anomalies: false
    include_description: true
    flatten_images: true


2.20211222221219-Paper_B-[pyod.models.knn][KNN]-algo-recipe.yaml

decision_scores:
  field_name: decision_scores_
fit:
  method_name: fit
  params: {}
init:
  params:
    n_neighbors:
      knn_factor: 0.05
      knn_minimum: 10
seed: 42


3.20211222221219-Paper_B-

In [20]:
#ID 21(5)    # Execute this cell only when you already have a recipe folder to load from

recipe_datasets,recipe_algos,seed=data_from_multiple_recipes(recipes_folder,new_recipe_path,dataset_folder,datasets_yaml_path) # get information of all datasets and algos from the files in the recipes_folder 

# adding recipe_datasets to benchmarking_datasets
for dataset_name in recipe_datasets:
    benchmarking_datasets[dataset_name]=recipe_datasets[dataset_name][0]
#print(f"benchmarking_datasets: {benchmarking_datasets}") 
                                         
# adding algos from recipe_algos to lst_benchmark_algorithms
for algo in recipe_algos:
    if algo not in lst_benchmark_algorithms:
        lst_benchmark_algorithms.append(algo)
    
#print(lst_benchmark_algorithms)   


myTabularDataset2-----

preprocessing performed!!
anomaly-dataset transformation performed!!

pyod.models.knn----


spambase-----

preprocessing performed!!
anomaly-dataset transformation performed!!


In [21]:
#ID 22(5)

  
print("\nAll Datasets for this benchmark run:")  
#print(benchmarking_datasets)
for dataset_name in benchmarking_datasets:
    print(dataset_name)

    
 
print("\nAll algos for this benchmark run:")
for algo in lst_benchmark_algorithms:
    #print(algo)
    print(algo['algo_module_name'])




All Datasets for this benchmark run:
myTabularDataset
annthyroid
myTabularDataset2
spambase

All algos for this benchmark run:
pyod.models.abod
pyod.models.knn


Now, for every benchmark dataset, we sample from that dataset to train the algorithms, predict the outcomes for each dataset with each algorithm, and store the results in an evaluation object. Each evaluation object is then added to the comparison object, which shows the final benchmarking results.

In [22]:
#ID 23(5)                # RUNNING THE BENCHMARK
co = ComparisonObject()
for dataset_name in benchmarking_datasets:
    print(f'-------{dataset_name}-------') 
    _append_to_yaml(f"{new_recipe_path}-[{dataset_name}]-dataset-recipe.yaml",'seed',seed)
    #print(mvtec_ad_own_datasets_list)
    for alg in lst_benchmark_algorithms:
        name=alg["algo_module_name"]
        c_name=alg["algo_class_name"]
        print(f"------{name}")
        eval_obj = EvaluationObject(algorithm_name=alg["algo_name_in_result_table"])
        steps_seed=seed
        
        if dataset_name in recipe_datasets:
          for (x,y), sample_config in benchmarking_datasets[dataset_name].sample_from_yaml(recipe_datasets[dataset_name][1]):
            
            w = UnsupervisedWrapperToRecipe()
            random.seed(steps_seed)
            np.random.seed(steps_seed) 
            os.environ['PYTHONHASHSEED'] = str(steps_seed)
            try:
                import tensorflow as tf
                tf.random.set_seed(steps_seed)
            except Exception as e:
                pass    
            try:
                import torch
                torch.cuda.manual_seed(steps_seed)
                torch.cuda.manual_seed_all(steps_seed)
                torch.backends.cudnn.deterministic = True
                torch.backends.cudnn.benchmark = False
                torch.use_deterministic_algorithms(True)
                torch.manual_seed(steps_seed)
            except Exception as e:
                pass
            print('.', end='') # update to see progress 
            algo_parameters= algo_params_tabular(alg,l=len(x))
            mod = __import__(alg["algo_module_name"], fromlist=[alg["algo_class_name"]])
            algo = getattr(mod, alg["algo_class_name"])      
            algo_initialized=algo(**algo_parameters)
            w.track_init(algo, params=alg['algo_parameters'])
            w.track_fit(x=x, obj=algo_initialized, params=alg['fit']['params'], fit_method=alg['fit']['method_name']) # the last parameter is the name of the method used for fitting
            pred = w.track_decision_scores(algo_initialized,field_name=alg['decision_scores']['field_name']) # the last parameter is the field name used to store anomaly scores by the model
            w.store_recipe(f"{new_recipe_path}-[{name}][{c_name}]-algo-recipe.yaml")
            _append_to_yaml(f"{new_recipe_path}-[{name}][{c_name}]-algo-recipe.yaml",'seed',seed)
            eval_obj.add(ground_truth=y, prediction=pred, description=sample_config)
            steps_seed+=1
        else:
            
            sampling_params['random_seed']=seed
            
            for (x,y), sample_config in sample_v2(dataset_name,sampling_type,sampling_params,new_recipe_path,benchmarking_datasets[dataset_name]):
                
                w = UnsupervisedWrapperToRecipe()
                random.seed(steps_seed)
                np.random.seed(steps_seed) 
                os.environ['PYTHONHASHSEED'] = str(steps_seed)
                try:
                    import tensorflow as tf
                    tf.random.set_seed(steps_seed)
                except Exception as e:
                    pass    
                try:
                    import torch
                    torch.cuda.manual_seed(steps_seed)
                    torch.cuda.manual_seed_all(steps_seed)
                    torch.backends.cudnn.deterministic = True
                    torch.backends.cudnn.benchmark = False
                    torch.use_deterministic_algorithms(True)
                    torch.manual_seed(steps_seed)
                except Exception as e:
                    pass
                print('.', end='') # update to see progress 
                algo_parameters=algo_params_tabular(alg,l=len(x))
                mod = __import__(alg["algo_module_name"], fromlist=[alg["algo_class_name"]])
                algo = getattr(mod, alg["algo_class_name"])      
                algo_initialized=algo(**algo_parameters)
                w.track_init(algo, params=alg['algo_parameters'])
                w.track_fit(x=x, obj=algo_initialized, params=alg['fit']['params'], fit_method=alg['fit']['method_name']) # the last parameter is the name of the method used for fitting
                pred = w.track_decision_scores(algo_initialized,field_name=alg['decision_scores']['field_name']) # the last parameter is the field name used to store anomaly scores by the model
                w.store_recipe(f"{new_recipe_path}-[{name}][{c_name}]-algo-recipe.yaml")
                _append_to_yaml(f"{new_recipe_path}-[{name}][{c_name}]-algo-recipe.yaml",'seed',seed)
                eval_obj.add(ground_truth=y, prediction=pred, description=sample_config)
                steps_seed+=1
                
        print("\n")    
        eval_desc = eval_obj.evaluate(print=True,metrics=['roc_auc', 'adjusted_average_precision', 'precision_recall_auc'])
        co.add_evaluation(eval_desc)
        print("\n")
      

-------myTabularDataset-------
------pyod.models.abod
..........

Evaluation on dataset myTabularDataset with normal labels [0] and anomaly labels [1, 2].
Total of 10 datasets. Per dataset:
25 instances, contamination_rate 0.5.
Mean 	 Std_dev 	 Metric
0.288 	 0.089 		 roc_auc
-0.148 	 0.115 		 adjusted_average_precision
0.417 	 0.056 		 precision_recall_auc


------pyod.models.knn
..........

Evaluation on dataset myTabularDataset with normal labels [0] and anomaly labels [1, 2].
Total of 10 datasets. Per dataset:
25 instances, contamination_rate 0.5.
Mean 	 Std_dev 	 Metric
0.227 	 0.075 		 roc_auc
-0.257 	 0.047 		 adjusted_average_precision
0.370 	 0.021 		 precision_recall_auc


-------annthyroid-------
------pyod.models.abod
..........

Evaluation on dataset annthyroid with normal labels [0] and anomaly labels [1.0].
Total of 10 datasets. Per dataset:
25 instances, contamination_rate 0.5.
Mean 	 Std_dev 	 Metric
0.549 	 0.109 		 roc_auc
0.233 	 0.138 		 adjusted_average_precision
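
The loop above builds each algorithm from the strings stored in its specification (`algo_module_name`, `algo_class_name`, `algo_parameters`). The same pattern can be written with `importlib`, shown here as a minimal sketch. `load_class` is a hypothetical helper, and a standard-library class stands in for a `pyod` model so that the example runs without extra packages.

```python
import importlib


def load_class(module_name, class_name):
    """Import `module_name` and return the attribute called `class_name`."""
    module = importlib.import_module(module_name)
    return getattr(module, class_name)


# Stand-in specification; in the notebook the module would be e.g. "pyod.models.knn".
spec = {
    "algo_module_name": "collections",
    "algo_class_name": "Counter",
    "algo_parameters": {},
}
cls = load_class(spec["algo_module_name"], spec["algo_class_name"])
instance = cls(**spec["algo_parameters"])
print(type(instance).__name__)  # Counter
```

Storing only strings and plain parameters is what allows an algorithm specification to be written to and read back from a YAML recipe.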


# **6. EVALUATION**

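Here, we will see how different metrics can be selected when evaluating an algorithm's performance.

In the previous section, the evaluation description was created with an explicit list of three metrics (`roc_auc`, `adjusted_average_precision`, `precision_recall_auc`). To evaluate with all available metrics instead, pass `all_metrics`:

     eval_desc = eval_obj.evaluate(print=False, metrics=all_metrics)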
    
    

In [23]:
#ID 24(6)

# to use a subset, first see which ones are available

print(all_metrics)

['roc_auc', 'average_precision', 'adjusted_average_precision', 'precision_n', 'adjusted_precision_n', 'precision_recall_auc']
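
As a reminder of what the first of these metrics measures: ROC AUC equals the probability that a randomly chosen anomaly receives a higher score than a randomly chosen normal point, with ties counted as one half. The sketch below computes it directly from that definition. It is for illustration only and is not how `oab` computes the metric.

```python
def roc_auc(ground_truth, scores):
    """Pairwise ROC AUC: share of (anomaly, normal) pairs ranked correctly."""
    positives = [s for y, s in zip(ground_truth, scores) if y == 1]
    negatives = [s for y, s in zip(ground_truth, scores) if y == 0]
    wins = 0.0
    for p in positives:
        for q in negatives:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


print(roc_auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))  # 0.75
```

A value below 0.5, as seen for some datasets above, means that normal points tend to receive higher anomaly scores than the anomalies.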


In [24]:
#ID 25(6)

#### ADD YOUR OWN NUMBER OF METRICS ###

# then we can select an arbitrary subset
metrics=['roc_auc', 'precision_recall_auc']

# **7. SHOW BENCHMARK RESULTS**

We compare the evaluation results of the different algorithm-dataset combinations by printing them.

\[Latex version: bold for highest, italics for second highest, ?\]
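
The bold and italics convention can be sketched for a single column of the result table as follows. `emphasize_column` is a hypothetical helper written for illustration; it is not how `co.print_latex` is implemented, and the tie-breaking rule is an assumption of this sketch.

```python
def emphasize_column(values):
    """Wrap the highest value in \\textbf and the second highest in \\textit.

    `values` maps a row label (algorithm name) to a score. Returns a dict
    with the same keys and LaTeX-formatted strings. Ties are broken by
    insertion order, which is an assumption of this sketch.
    """
    # sorted() is stable, so equal scores keep their original order
    ranked = sorted(values, key=values.get, reverse=True)
    formatted = {}
    for name, score in values.items():
        cell = f"{score:.3f}"
        if ranked and name == ranked[0]:
            cell = rf"\textbf{{{cell}}}"
        elif len(ranked) > 1 and name == ranked[1]:
            cell = rf"\textit{{{cell}}}"
        formatted[name] = cell
    return formatted


# roc_auc values of the annthyroid column, taken from the results in this section
column = {"ABOD": 0.549359, "pyod.models.knn": 0.525000}
for name, cell in emphasize_column(column).items():
    print(f"{name} & {cell}")
```

With more than two algorithms, every value below the second highest is left without emphasis.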

In [25]:
#ID 26(7)

# print results in easily readable format
co.print_results()

For roc_auc:
                 myTabularDataset  annthyroid  myTabularDataset2  spambase  \
ABOD                     0.288462    0.549359           0.288462  0.557051   
pyod.models.knn          0.227244    0.525000           0.227244  0.514423   
Average                  0.257853    0.537179           0.257853  0.535737   

                  Average  
ABOD             0.420833  
pyod.models.knn  0.373478  
Average               NaN  
For adjusted_average_precision:
                 myTabularDataset  annthyroid  myTabularDataset2  spambase  \
ABOD                    -0.147795    0.233184          -0.147795  0.138159   
pyod.models.knn         -0.256806    0.175870          -0.256806  0.071718   
Average                 -0.202301    0.204527          -0.202301  0.104938   

                  Average  
ABOD             0.018938  
pyod.models.knn -0.066506  
Average               NaN  
For precision_recall_auc:
                 myTabularDataset  annthyroid  myTabularDataset2  spambase  \
A

In [26]:
#ID 27(7)

# print results in easily readable format with standard deviations
co.print_results(include_stdevs=True)

For roc_auc:
                myTabularDataset    annthyroid myTabularDataset2  \
ABOD                0.288+-0.089  0.549+-0.109      0.288+-0.089   
pyod.models.knn     0.227+-0.075  0.525+-0.128      0.227+-0.075   
Average                    0.258         0.537             0.258   

                     spambase   Average  
ABOD             0.557+-0.090  0.420833  
pyod.models.knn  0.514+-0.127  0.373478  
Average                 0.536       NaN  

For adjusted_average_precision:
                myTabularDataset    annthyroid myTabularDataset2  \
ABOD               -0.148+-0.115  0.233+-0.138     -0.148+-0.115   
pyod.models.knn    -0.257+-0.047  0.176+-0.201     -0.257+-0.047   
Average                   -0.202         0.205            -0.202   

                     spambase   Average  
ABOD             0.138+-0.148  0.018938  
pyod.models.knn  0.072+-0.182 -0.066506  
Average                 0.105       NaN  

For precision_recall_auc:
                myTabularDataset    annthyroi

In [27]:
# ID 28(7)

co.print_latex(include_stdevs=True)

For roc_auc:
\begin{center}
\begin{tabular}{  c c c c c c  }
  & myTabularDataset & annthyroid & myTabularDataset2 & spambase & Average \\
  ABOD & \textbf{0.288$\pm$0.089} & \textbf{0.549$\pm$0.109} & \textbf{0.288$\pm$0.089} & \textbf{0.557$\pm$0.090} & \textbf{0.421} \\
  pyod.models.knn & \textit{0.227$\pm$0.075} & \textit{0.525$\pm$0.128} & \textit{0.227$\pm$0.075} & \textit{0.514$\pm$0.127} & \textit{0.373} \\
  Average & 0.258 & 0.537 & 0.258 & 0.536 &    \\
\end{tabular}
\end{center}

For adjusted_average_precision:
\begin{center}
\begin{tabular}{  c c c c c c  }
  & myTabularDataset & annthyroid & myTabularDataset2 & spambase & Average \\
  ABOD & \textbf{-0.148$\pm$0.115} & \textbf{0.233$\pm$0.138} & \textbf{-0.148$\pm$0.115} & \textbf{0.138$\pm$0.148} & \textbf{0.019} \\
  pyod.models.knn & \textit{-0.257$\pm$0.047} & \textit{0.176$\pm$0.201} & \textit{-0.257$\pm$0.047} & \textit{0.072$\pm$0.182} & \textit{-0.067} \\
  Average & -0.202 & 0.205 & -0.202 & 0.105 &    \\
\end{t

# **8. REPRODUCIBILITY**

## **8.1 Creating recipes**

This section shows **how `oab` can be used to make sampling results easily reproducible**.

`yaml` files play an integral role in making reproducibility work, as they store the operations applied to the data together with their parameters.
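
Why storing the seed and the sampling parameters is enough to reproduce a sample can be illustrated without `oab`. The sketch below draws `n` indices at a given `contamination_rate` using only the standard library. `sample_indices` and the toy labels are hypothetical; the parameter names mirror keys found under `sampling` in a dataset recipe, but `oab`'s actual sampling logic may differ.

```python
import random


def sample_indices(labels, n, contamination_rate, random_seed):
    """Draw n indices with the requested share of anomalies (label != 0).

    Hypothetical stand-in for a sampling step, for illustration only.
    """
    rng = random.Random(random_seed)  # local generator, no global state
    normals = [i for i, y in enumerate(labels) if y == 0]
    anomalies = [i for i, y in enumerate(labels) if y != 0]
    n_anomalies = int(n * contamination_rate)
    chosen = rng.sample(anomalies, n_anomalies) + rng.sample(normals, n - n_anomalies)
    rng.shuffle(chosen)
    return chosen


# Toy labels: 40 normal points (0) followed by 20 anomalies (1).
labels = [0] * 40 + [1] * 20
recipe_sampling = {"n": 25, "contamination_rate": 0.5, "random_seed": 42}

first = sample_indices(labels, **recipe_sampling)
second = sample_indices(labels, **recipe_sampling)
print(first == second)  # same parameters and seed give the same sample
```

Because the generator is created from the stored seed on every call, rerunning the function with the parameters read back from a recipe yields the same indices.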

We will see how to produce a recipe (`.yaml`) of the benchmark run already performed in <b>#ID 23(5)</b>.

In <b>#ID 11(3), #ID 13(3) and #ID 23(5)</b>, we already performed recipe-creating operations on our own datasets, OAB's datasets, the input recipe's datasets and all algorithms of this benchmark run. The structure of those recipes is shown below:


In [28]:
#ID 29(8)


for i,file in enumerate(os.listdir(f"{benchmark_name}/{time}-{benchmark_type}")):
    print(f"{i+1}.{file}\n")
    !cat {benchmark_name}/{time}-{benchmark_type}/{file}
    print("\n")

1.20220301200056-Paper_A-[pyod.models.knn][KNN]-algo-recipe.yaml

decision_scores:
  field_name: decision_scores_
fit:
  method_name: fit
  params: {}
init:
  params:
    n_neighbors:
      knn_factor: 0.05
      knn_minimum: 10
seed: 42


2.20220301200056-Paper_A-[myTabularDataset]-dataset-recipe.yaml

standard_functions:
- name: treat_missing_values
  parameters:
    delete_attributes: true
    missing_value: np.nan
- name: normalize_columns
  parameters:
    cols_to_normalize:
- name: delete_duplicates
  parameters: {}
anomaly_dataset:
  arguments:
    normal_labels: 0
    anomaly_labels:
seed: 42
sampling:
  unsupervised_multiple:
    n: 25
    n_steps: 10
    contamination_rate: 0.5
    shuffle: true
    random_seed: 42
    apply_random_seed: true
    keep_frequency_ratio_normals: false
    equal_frequency_normals: false
    keep_frequency_ratio_anomalies: false
    equal_frequency_anomalies: false
    include_description: true
    flatten_images: true


3.20220301200056-Paper_A-[

## **8.2 Reproducing the experiment**

To reproduce the run from the recipes created in the previous section,
we refer to <b>Section 5, #ID 20(5)</b>, where we can reproduce the run as well as extend benchmarks.

# **9. EXTEND EXISTING BENCHMARK(own algorithm)**

Extending the existing benchmark here means adding our own algorithm to the benchmark and comparing its results with those of the pre-installed algorithms, while also loading our own dataset.


1. We load the datasets, as described in **Sections "1. Data" and "2. Data Selection"**.
2. Then, we load our own algorithm, as shown in the next subsection.

## **9.1 Loading own Algorithm**

In this subsection, you will see **how your own unsupervised anomaly detection algorithm** can easily be evaluated within oab. We will see how a class representing an algorithm can be structured and how its performance is evaluated.

Of course, this is not the only way to use the functionality provided by oab. We do, however, consider it the simplest way.

In [29]:
#ID 30(9)

# download example algorithm and inspect content
import wget
wget.download('https://raw.githubusercontent.com/jandeller/test/main/RandomGuesser.py',"RandomGuesser.py")
!cat RandomGuesser.py

import numpy as np

class RandomGuesser():

    def fit(self, X):
        "Assign a random number to each sample"
        n_samples = X.shape[0]
        self.decision_scores_ = np.random.randn(n_samples)


The sample `RandomGuesser` algorithm shown here is - as the name suggests - a random guesser, i.e., it assigns random anomaly scores to the samples.

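An algorithm used for unsupervised anomaly detection needs to specify a `fit(X_train)` method for training and a `decision_function(X_test)` method for inference that returns an anomaly score per data point in the test set. Note that the sample `RandomGuesser` above only defines `fit(X)` and stores its scores in the field `decision_scores_`, which is what #ID 32(9) reads.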

It is of course possible to rename the method and field, use a method for accessing the anomaly scores, etc. Note that if this is done, the following code has to be changed accordingly. Adhering to the conventions described above (`fit(X_train)` and `decision_function(X_test)`) allows you to use the same interface as algorithms from [`PyOD`](https://pyod.readthedocs.io/en/latest/) as shown when [comparing algorithms using `oab`](https://colab.research.google.com/drive/1aV_itaYCJgzdZ1lQ7SUyHQ7z01xSPxDN?usp=sharing#scrollTo=QnAfCGTGL7xv).
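
To make the convention concrete, here is a minimal sketch of a detector that offers both a `fit` method with a `decision_scores_` field (as `RandomGuesser` does) and a `decision_function` method. `DistanceToMeanDetector` is a hypothetical example class and not part of `oab` or `PyOD`; it scores each sample by its Euclidean distance from the feature-wise mean.

```python
import numpy as np


class DistanceToMeanDetector:
    """Toy anomaly detector (hypothetical, for illustration only)."""

    def fit(self, X):
        """Learn the feature-wise mean and score the fitted samples."""
        X = np.asarray(X, dtype=float)
        self.mean_ = X.mean(axis=0)
        # scores of the data passed to fit; higher means more anomalous
        self.decision_scores_ = self.decision_function(X)
        return self

    def decision_function(self, X):
        """Return one anomaly score per row of X."""
        X = np.asarray(X, dtype=float)
        return np.linalg.norm(X - self.mean_, axis=1)


X = np.array([[0.0, 0.0], [0.1, -0.1], [-0.1, 0.1], [5.0, 5.0]])
detector = DistanceToMeanDetector().fit(X)
print(detector.decision_scores_.argmax())  # index of the most anomalous row
```

A class structured this way can be used either through its `decision_scores_` field after `fit` or through `decision_function`, so the calling code does not need to change when switching between the two access styles.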

In [30]:
#ID 31(9)
# uses imports from #ID 3(0) and #ID 18(5)
# uses sampling parameters from #ID 14(4)

# and import the RandomGuesser
from RandomGuesser import RandomGuesser
    
own_algorithms = [{
    ### ADD YOUR OWN ALGO DETAILS IN THIS FORM ###
    "algo_module_name": "RandomGuesser",
    "algo_class_name": "RandomGuesser",
    "algo_name_in_result_table": "RandomGuesser",
    "algo_parameters": {},
    "fit": {"method_name": "fit", "params": {}},
    "decision_function": {"method_name": "decision_function", "params": {}},
}]




The `own_algorithms` list from cell #ID 31(9) can be added to `lst_benchmarking_algos`, as mentioned in #ID 19(5), to use this algorithm alongside the other algorithms in a benchmark run as shown in #ID 23(5).

In [31]:
#ID 32(9)
        

#  A comparison object is created for comparing the evaluations of different Algo-Dataset combinations
co = ComparisonObject()

for dataset_name in benchmarking_datasets:
  print(f"-------{dataset_name}-------")
  eval_obj = EvaluationObject(algorithm_name="RandomGuesser")
  print("-------RandomGuesser")  
  for (x, y), settings in benchmarking_datasets[dataset_name].sample_multiple(
          n=n, contamination_rate=contamination_rate, n_steps=n_steps):
      print(".", end=" ")  # update to see progress
      rg = RandomGuesser()
      rg.fit(x)  # fit RandomGuesser to the sampled data
      pred = rg.decision_scores_  # access the anomaly scores stored by fit
      eval_obj.add(y, pred, settings)
  print("\n")
  eval_desc = eval_obj.evaluate(metrics=['roc_auc', 'adjusted_average_precision', 'precision_recall_auc'])
  # added to comparison object
  co.add_evaluation(eval_desc)
  print("\n")




-------myTabularDataset-------
-------RandomGuesser
. . . . . . . . . . 

Evaluation on dataset myTabularDataset with normal labels [0] and anomaly labels [1, 2].
Total of 10 datasets. Per dataset:
25 instances, contamination_rate 0.5.
Mean 	 Std_dev 	 Metric
0.476 	 0.118 		 roc_auc
0.093 	 0.221 		 adjusted_average_precision
0.541 	 0.114 		 precision_recall_auc


-------annthyroid-------
-------RandomGuesser
. . . . . . . . . . 

Evaluation on dataset annthyroid with normal labels [0] and anomaly labels [1.0].
Total of 10 datasets. Per dataset:
25 instances, contamination_rate 0.5.
Mean 	 Std_dev 	 Metric
0.457 	 0.112 		 roc_auc
0.066 	 0.176 		 adjusted_average_precision
0.521 	 0.093 		 precision_recall_auc


-------myTabularDataset2-------
-------RandomGuesser
. . . . . . . . . . 

Evaluation on dataset myTabularDataset2 with normal labels [0] and anomaly labels [1, 2].
Total of 10 datasets. Per dataset:
25 instances, contamination_rate 0.5.
Mean 	 Std_dev 	 Metric
0.476 	 0.118

As in the code above, we store the evaluations of our own algorithm in an evaluation object, which is then added to the comparison object. Similarly, we can create evaluation objects for other algorithms and add them to the comparison object for the final benchmarking, as shown in Section 5.
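
The summary lines printed above (mean and standard deviation per metric) can be pictured as follows: one metric value is computed per sampled dataset, and the values are then aggregated. The sketch below does this for ROC AUC using only the standard library. `roc_auc` and `summarize` are hypothetical helpers and not `oab`'s implementation; whether `oab` reports the population or the sample standard deviation is not stated in this notebook, so the use of `statistics.pstdev` is an assumption.

```python
import statistics


def roc_auc(y_true, scores):
    """ROC AUC as the probability that an anomaly (label 1) receives a
    higher score than a normal point (label 0); ties count as 0.5."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one anomaly and one normal point")
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


def summarize(values):
    """Mean and (population) standard deviation over the sampled datasets."""
    return statistics.mean(values), statistics.pstdev(values)


# Two toy samples: (labels, anomaly scores)
samples = [
    ([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]),
    ([0, 1, 0, 1], [0.2, 0.9, 0.3, 0.7]),
]
per_sample = [roc_auc(y, s) for y, s in samples]
mean, std = summarize(per_sample)
print(f"{mean:.3f} \t {std:.3f} \t\t roc_auc")
```

A random scorer is expected to land near 0.5 on this metric, which is roughly what the `RandomGuesser` results above show.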

Finally, we show below the benchmarking results of our algorithm as described in "**Section 7. Show Benchmarking Results**"

In [32]:
#ID 33(9)

# print results in easily readable format
co.print_results()

For roc_auc:
                 myTabularDataset  annthyroid  myTabularDataset2  spambase  \
ABOD                     0.288462    0.549359           0.288462  0.557051   
pyod.models.knn          0.227244    0.525000           0.227244  0.514423   
RandomGuesser            0.476282    0.457051           0.476282  0.539103   
Average                  0.330662    0.510470           0.330662  0.536859   

                  Average  
ABOD             0.420833  
pyod.models.knn  0.373478  
RandomGuesser    0.487179  
Average               NaN  
For adjusted_average_precision:
                 myTabularDataset  annthyroid  myTabularDataset2  spambase  \
ABOD                    -0.147795    0.233184          -0.147795  0.138159   
pyod.models.knn         -0.256806    0.175870          -0.256806  0.071718   
RandomGuesser            0.093162    0.065774           0.093162  0.181948   
Average                 -0.103813    0.158276          -0.103813  0.130608   

                  Average  
ABOD 

In [33]:
#ID 34(9)
# print results in easily readable format with standard deviations
co.print_results(include_stdevs=True)

For roc_auc:
                myTabularDataset    annthyroid myTabularDataset2  \
ABOD                0.288+-0.089  0.549+-0.109      0.288+-0.089   
pyod.models.knn     0.227+-0.075  0.525+-0.128      0.227+-0.075   
RandomGuesser       0.476+-0.118  0.457+-0.112      0.476+-0.118   
Average                    0.331         0.510             0.331   

                     spambase   Average  
ABOD             0.557+-0.090  0.420833  
pyod.models.knn  0.514+-0.127  0.373478  
RandomGuesser    0.539+-0.088  0.487179  
Average                 0.537       NaN  

For adjusted_average_precision:
                myTabularDataset    annthyroid myTabularDataset2  \
ABOD               -0.148+-0.115  0.233+-0.138     -0.148+-0.115   
pyod.models.knn    -0.257+-0.047  0.176+-0.201     -0.257+-0.047   
RandomGuesser       0.093+-0.221  0.066+-0.176      0.093+-0.221   
Average                   -0.104         0.158            -0.104   

                     spambase   Average  
ABOD             0.1

In [34]:
#ID 35(9)

co.print_latex(include_stdevs=True)

For roc_auc:
\begin{center}
\begin{tabular}{  c c c c c c  }
  & myTabularDataset & annthyroid & myTabularDataset2 & spambase & Average \\
  ABOD & \textit{0.295$\pm$0.103} & \textbf{0.628$\pm$0.051} & \textit{0.295$\pm$0.103} & \textbf{0.615$\pm$0.019} & \textit{0.458} \\
  pyod.models.knn & 0.220$\pm$0.043 & \textit{0.619$\pm$0.003} & 0.220$\pm$0.043 & \textit{0.561$\pm$0.106} & 0.405 \\
  RandomGuesser & \textbf{0.500$\pm$0.135} & 0.500$\pm$0.058 & \textbf{0.500$\pm$0.135} & 0.471$\pm$0.010 & \textbf{0.493} \\
  Average & 0.338 & 0.582 & 0.338 & 0.549 &    \\
\end{tabular}
\end{center}

For adjusted_average_precision:
\begin{center}
\begin{tabular}{  c c c c c c  }
  & myTabularDataset & annthyroid & myTabularDataset2 & spambase & Average \\
  ABOD & \textit{-0.139$\pm$0.134} & \textbf{0.371$\pm$0.108} & \textit{-0.139$\pm$0.134} & \textbf{0.280$\pm$0.053} & \textbf{0.093} \\
  pyod.models.knn & -0.265$\pm$0.029 & \textit{0.327$\pm$0.059} & -0.265$\pm$0.029 & \textit{0.164$\pm$0.206

This was our example algorithm. Other algorithms can be used to run and extend benchmarks as well; please refer to <b>5. ALGORITHM TRAINING AND TESTING</b>.