## Introduction
This notebook explores the paper ["Anomaly Detection for Tabular Data with Internal Contrastive Learning"](https://openreview.net/forum?id=_hszZbt46bT) (ICL). The paper proposes an approach to anomaly detection that identifies out-of-class samples in tabular data, particularly when the structural characteristics of the data are not well understood.

## Masking in ICL

The image below illustrates how ICL masks a feature vector for contrastive learning. Given a **sample vector** $x_i$, the authors consider a **subvector** $a_i^3$ and its **complement** $b_i^3$, which holds the remaining features. The networks are trained to produce similar embeddings for this pair, while pushing the embedding of $a_i^{j^{\prime}}$, for every $j^{\prime} \neq 3$, away from that of $b_i^3$.

<center>
<img src="https://drive.google.com/uc?export=view&id=1WOfi9boxWG4ET3AKochRS8OaGtDjxmZe" width="500" align="center">
</center>
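
The masking scheme described above can be sketched in a few lines of plain Python. This is a minimal illustration, not the authors' or DeepOD's implementation: the helper names `split_windows`, `cosine`, and `contrastive_loss` are hypothetical, the temperature `tau=0.1` is an illustrative value, and the embeddings are passed in directly instead of being produced by trained networks. It assumes that subvectors are windows of `k` consecutive features, which appears consistent with the `kernel size` and `in_features` values printed in the training log further down (21 features with kernel size 2 give 19 complement features).

```python
import math


def split_windows(x, k):
    """Return (a_j, b_j) pairs: a_j holds k consecutive features, b_j the rest."""
    if not 1 <= k < len(x):
        raise ValueError("k must satisfy 1 <= k < len(x)")
    pairs = []
    for j in range(len(x) - k + 1):
        a_j = x[j:j + k]            # subvector (the masked window)
        b_j = x[:j] + x[j + k:]     # complement (all remaining features)
        pairs.append((a_j, b_j))
    return pairs


def cosine(u, v):
    """Cosine similarity of two non-zero vectors given as lists of floats."""
    dot = sum(p * q for p, q in zip(u, v))
    norm_u = math.sqrt(sum(p * p for p in u))
    norm_v = math.sqrt(sum(q * q for q in v))
    return dot / (norm_u * norm_v)


def contrastive_loss(emb_a, emb_b, j, tau=0.1):
    """Cross-entropy over similarities to emb_b[j].

    emb_a[j] is the positive; every emb_a[l] with l != j acts as a negative.
    tau is an illustrative temperature, not a value taken from the notebook.
    """
    logits = [cosine(e, emb_b[j]) / tau for e in emb_a]
    top = max(logits)  # subtract the maximum for a numerically stable log-sum-exp
    log_norm = top + math.log(sum(math.exp(z - top) for z in logits))
    return log_norm - logits[j]


# Usage: a 21-feature sample with window size 2.
pairs = split_windows([float(i) for i in range(21)], k=2)
print(len(pairs), len(pairs[0][0]), len(pairs[0][1]))
```

The loss is small when the embedding of a subvector is closest to the embedding of its own complement and large when another subvector is closer, which is the behaviour the training objective rewards on normal data.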


In [1]:
# ruff: noqa
# type: ignore
"""Install the required packages and import the libraries used in this notebook."""

%pip install pandas
%pip install copulas
%pip install pyod
import os
import pandas as pd
import warnings

warnings.filterwarnings("ignore")
from data_generator import DataGenerator
from myutils import Utils
from deepod.models.icl import ICL
from pyod_base import PYOD

Defaulting to user installation because normal site-packages is not writeable
Note: you may need to restart the kernel to use updated packages.
Defaulting to user installation because normal site-packages is not writeable
Note: you may need to restart the kernel to use updated packages.
Defaulting to user installation because normal site-packages is not writeable
Note: you may need to restart the kernel to use updated packages.


## Dataset
In this part, we work with the ["Outlier Detection DataSets (ODDS)"](https://odds.cs.stonybrook.edu/) collection, a widely used set of benchmark datasets for evaluating outlier detection algorithms. The collection covers a diverse range of data types, structures, and characteristics, which makes it well suited to assessing anomaly detection methods. The next cell lists all the datasets available locally.

In [2]:
datagenerator = DataGenerator()  # data generator
utils = Utils()  # utils function
os.listdir("datasets/Classical")

['3_backdoor.npz',
 '12_fault.npz',
 '9_census.npz',
 '41_Waveform.npz',
 '36_speech.npz',
 '21_Lymphography.npz',
 '23_mammography.npz',
 '15_Hepatitis.npz',
 '44_Wilt.npz',
 '29_Pima.npz',
 '35_SpamBase.npz',
 '26_optdigits.npz',
 '13_fraud.npz',
 '18_Ionosphere.npz',
 '34_smtp.npz',
 '8_celeba.npz',
 '22_magic.gamma.npz',
 '6_cardio.npz',
 '1_ALOI.npz',
 '10_cover.npz',
 '20_letter.npz',
 '47_yeast.npz',
 '24_mnist.npz',
 '46_WPBC.npz',
 '42_WBC.npz',
 '2_annthyroid.npz',
 '39_vertebral.npz',
 '28_pendigits.npz',
 '30_satellite.npz',
 '43_WDBC.npz',
 '31_satimage-2.npz',
 '27_PageBlocks.npz',
 '16_http.npz',
 '33_skin.npz',
 '4_breastw.npz',
 '32_shuttle.npz',
 '11_donors.npz',
 '25_musk.npz',
 '19_landsat.npz',
 '45_wine.npz',
 '14_glass.npz',
 '17_InternetAds.npz',
 '40_vowels.npz',
 '38_thyroid.npz',
 '5_campaign.npz',
 '7_Cardiotocography.npz',
 '37_Stamps.npz']

In [3]:
dataset_list = [
    "6_cardio",
    "13_fraud",
    "24_mnist",
    "42_WBC",
    "43_WDBC",
    "4_breastw",
    "25_musk",
    "40_vowels",
    "38_thyroid",
]  # choosing the datasets
model_dict = {"ICL": PYOD}  # choosing the model

# save the results
df_AUCROC = pd.DataFrame(data=None, index=dataset_list, columns=model_dict.keys())
df_AUCPR = pd.DataFrame(data=None, index=dataset_list, columns=model_dict.keys())

In [4]:
# seed for reproducible results
seed = 42

for dataset in dataset_list:
    # import the dataset
    datagenerator.dataset = dataset  # specify the dataset name
    data = datagenerator.generator(
        la=0.1, realistic_synthetic_mode=None, noise_type=None
    )  # only 10% labeled anomalies are available

    for name, clf in model_dict.items():
        # model initialization
        clf = clf(seed=seed, model_name=name)

        # training, for unsupervised models the y label will be discarded
        clf = clf.fit(X_train=data["X_train"], y_train=data["y_train"])

        # output predicted anomaly score on testing set
        score = clf.predict_score(data["X_test"])

        # evaluation
        result = utils.metric(y_true=data["y_test"], y_score=score)

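        # save results
        df_AUCROC.loc[dataset, name] = result["aucroc"]
        # assumes utils.metric also returns an "aucpr" entry; .get() stores None otherwise
        df_AUCPR.loc[dataset, name] = result.get("aucpr")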

current noise type: None
{'Samples': 1831, 'Features': 21, 'Anomalies': 176, 'Anomalies Ratio(%)': 9.61}
best param: None
Start Training...
ensemble size: 4
kernel size: 2
ICLNet(
  (enc_f_net): MLPnet(
    (network): Sequential(
      (0): LinearBlock(
        (linear): Linear(in_features=19, out_features=100, bias=False)
        (act_layer): Tanh()
        (bn_layer): BatchNorm1d(20, eps=1e-05, momentum=0.1, affine=False, track_running_stats=True)
      )
      (1): LinearBlock(
        (linear): Linear(in_features=100, out_features=50, bias=False)
        (act_layer): LeakyReLU(negative_slope=0.01)
        (bn_layer): BatchNorm1d(20, eps=1e-05, momentum=0.1, affine=False, track_running_stats=True)
      )
      (2): LinearBlock(
        (linear): Linear(in_features=50, out_features=128, bias=False)
        (act_layer): LeakyReLU(negative_slope=0.01)
        (bn_layer): BatchNorm1d(20, eps=1e-05, momentum=0.1, affine=False, track_running_stats=True)
      )
    )
  )
  (enc_g_net): M

epoch 10, training loss: 2.456641, time: 0.1s
epoch 20, training loss: 1.718342, time: 0.2s
epoch 30, training loss: 1.969649, time: 0.1s
epoch 40, training loss: 1.387178, time: 0.1s
epoch 50, training loss: 1.321773, time: 0.1s
epoch 60, training loss: 1.184589, time: 0.1s
epoch 70, training loss: 1.262119, time: 0.1s
epoch 80, training loss: 1.176873, time: 0.1s
epoch 90, training loss: 1.132189, time: 0.1s
epoch100, training loss: 1.130423, time: 0.1s
Start Inference on the training data...


testing: 100%|██████████| 21/21 [00:00<00:00, 367.72it/s]
testing: 100%|██████████| 21/21 [00:00<00:00, 377.23it/s]
testing: 100%|██████████| 21/21 [00:00<00:00, 376.37it/s]
testing: 100%|██████████| 21/21 [00:00<00:00, 380.26it/s]
testing: 100%|██████████| 9/9 [00:00<00:00, 353.30it/s]
testing: 100%|██████████| 9/9 [00:00<00:00, 366.42it/s]
testing: 100%|██████████| 9/9 [00:00<00:00, 295.28it/s]
testing: 100%|██████████| 9/9 [00:00<00:00, 322.63it/s]


subsampling for dataset 13_fraud...
current noise type: None
{'Samples': 10000, 'Features': 29, 'Anomalies': 16, 'Anomalies Ratio(%)': 0.16}
best param: None
Start Training...
ensemble size: 3
kernel size: 2
ICLNet(
  (enc_f_net): MLPnet(
    (network): Sequential(
      (0): LinearBlock(
        (linear): Linear(in_features=27, out_features=100, bias=False)
        (act_layer): Tanh()
        (bn_layer): BatchNorm1d(28, eps=1e-05, momentum=0.1, affine=False, track_running_stats=True)
      )
      (1): LinearBlock(
        (linear): Linear(in_features=100, out_features=50, bias=False)
        (act_layer): LeakyReLU(negative_slope=0.01)
        (bn_layer): BatchNorm1d(28, eps=1e-05, momentum=0.1, affine=False, track_running_stats=True)
      )
      (2): LinearBlock(
        (linear): Linear(in_features=50, out_features=128, bias=False)
        (act_layer): LeakyReLU(negative_slope=0.01)
        (bn_layer): BatchNorm1d(28, eps=1e-05, momentum=0.1, affine=False, track_running_stats=True

testing: 100%|██████████| 110/110 [00:00<00:00, 363.80it/s]
testing: 100%|██████████| 110/110 [00:00<00:00, 364.06it/s]
testing: 100%|██████████| 110/110 [00:00<00:00, 361.36it/s]
testing: 100%|██████████| 47/47 [00:00<00:00, 335.04it/s]
testing: 100%|██████████| 47/47 [00:00<00:00, 362.56it/s]
testing: 100%|██████████| 47/47 [00:00<00:00, 351.99it/s]


current noise type: None
{'Samples': 7603, 'Features': 100, 'Anomalies': 700, 'Anomalies Ratio(%)': 9.21}
best param: None
Start Training...
ensemble size: 1
kernel size: 10
ICLNet(
  (enc_f_net): MLPnet(
    (network): Sequential(
      (0): LinearBlock(
        (linear): Linear(in_features=90, out_features=100, bias=False)
        (act_layer): Tanh()
        (bn_layer): BatchNorm1d(91, eps=1e-05, momentum=0.1, affine=False, track_running_stats=True)
      )
      (1): LinearBlock(
        (linear): Linear(in_features=100, out_features=50, bias=False)
        (act_layer): LeakyReLU(negative_slope=0.01)
        (bn_layer): BatchNorm1d(91, eps=1e-05, momentum=0.1, affine=False, track_running_stats=True)
      )
      (2): LinearBlock(
        (linear): Linear(in_features=50, out_features=128, bias=False)
        (act_layer): LeakyReLU(negative_slope=0.01)
        (bn_layer): BatchNorm1d(91, eps=1e-05, momentum=0.1, affine=False, track_running_stats=True)
      )
    )
  )
  (enc_g_net):

testing: 100%|██████████| 84/84 [00:00<00:00, 334.60it/s]
testing: 100%|██████████| 36/36 [00:00<00:00, 335.46it/s]


generating duplicate samples for dataset 42_WBC...
current noise type: None
{'Samples': 1000, 'Features': 9, 'Anomalies': 44, 'Anomalies Ratio(%)': 4.4}
best param: None
Start Training...
ensemble size: 7
kernel size: 2
ICLNet(
  (enc_f_net): MLPnet(
    (network): Sequential(
      (0): LinearBlock(
        (linear): Linear(in_features=7, out_features=100, bias=False)
        (act_layer): Tanh()
        (bn_layer): BatchNorm1d(8, eps=1e-05, momentum=0.1, affine=False, track_running_stats=True)
      )
      (1): LinearBlock(
        (linear): Linear(in_features=100, out_features=50, bias=False)
        (act_layer): LeakyReLU(negative_slope=0.01)
        (bn_layer): BatchNorm1d(8, eps=1e-05, momentum=0.1, affine=False, track_running_stats=True)
      )
      (2): LinearBlock(
        (linear): Linear(in_features=50, out_features=128, bias=False)
        (act_layer): LeakyReLU(negative_slope=0.01)
        (bn_layer): BatchNorm1d(8, eps=1e-05, momentum=0.1, affine=False, track_running_st

epoch 10, training loss: 7.460176, time: 0.1s
epoch 20, training loss: 4.514318, time: 0.1s
epoch 30, training loss: 3.253325, time: 0.1s
epoch 40, training loss: 2.597877, time: 0.1s
epoch 50, training loss: 2.775661, time: 0.1s
epoch 60, training loss: 2.399912, time: 0.1s
epoch 70, training loss: 2.188711, time: 0.1s
epoch 80, training loss: 2.147900, time: 0.1s
epoch 90, training loss: 1.657577, time: 0.1s
epoch100, training loss: 1.558133, time: 0.1s
kernel size: 2
ICLNet(
  (enc_f_net): MLPnet(
    (network): Sequential(
      (0): LinearBlock(
        (linear): Linear(in_features=7, out_features=100, bias=False)
        (act_layer): Tanh()
        (bn_layer): BatchNorm1d(8, eps=1e-05, momentum=0.1, affine=False, track_running_stats=True)
      )
      (1): LinearBlock(
        (linear): Linear(in_features=100, out_features=50, bias=False)
        (act_layer): LeakyReLU(negative_slope=0.01)
        (bn_layer): BatchNorm1d(8, eps=1e-05, momentum=0.1, affine=False, track_running_st

testing: 100%|██████████| 11/11 [00:00<00:00, 386.95it/s]
testing: 100%|██████████| 11/11 [00:00<00:00, 398.80it/s]
testing: 100%|██████████| 11/11 [00:00<00:00, 316.58it/s]
testing: 100%|██████████| 11/11 [00:00<00:00, 374.18it/s]
testing: 100%|██████████| 11/11 [00:00<00:00, 412.07it/s]
testing: 100%|██████████| 11/11 [00:00<00:00, 404.08it/s]
testing: 100%|██████████| 11/11 [00:00<00:00, 381.30it/s]
testing: 100%|██████████| 5/5 [00:00<00:00, 363.41it/s]
testing: 100%|██████████| 5/5 [00:00<00:00, 357.80it/s]
testing: 100%|██████████| 5/5 [00:00<00:00, 370.80it/s]
testing: 100%|██████████| 5/5 [00:00<00:00, 346.85it/s]
testing: 100%|██████████| 5/5 [00:00<00:00, 363.06it/s]
testing: 100%|██████████| 5/5 [00:00<00:00, 351.98it/s]
testing: 100%|██████████| 5/5 [00:00<00:00, 398.50it/s]


generating duplicate samples for dataset 43_WDBC...
current noise type: None
{'Samples': 1000, 'Features': 30, 'Anomalies': 33, 'Anomalies Ratio(%)': 3.3}
best param: None
Start Training...
ensemble size: 3
kernel size: 2
ICLNet(
  (enc_f_net): MLPnet(
    (network): Sequential(
      (0): LinearBlock(
        (linear): Linear(in_features=28, out_features=100, bias=False)
        (act_layer): Tanh()
        (bn_layer): BatchNorm1d(29, eps=1e-05, momentum=0.1, affine=False, track_running_stats=True)
      )
      (1): LinearBlock(
        (linear): Linear(in_features=100, out_features=50, bias=False)
        (act_layer): LeakyReLU(negative_slope=0.01)
        (bn_layer): BatchNorm1d(29, eps=1e-05, momentum=0.1, affine=False, track_running_stats=True)
      )
      (2): LinearBlock(
        (linear): Linear(in_features=50, out_features=128, bias=False)
        (act_layer): LeakyReLU(negative_slope=0.01)
        (bn_layer): BatchNorm1d(29, eps=1e-05, momentum=0.1, affine=False, track_runn

testing: 100%|██████████| 11/11 [00:00<00:00, 337.40it/s]
testing: 100%|██████████| 11/11 [00:00<00:00, 328.56it/s]
testing: 100%|██████████| 11/11 [00:00<00:00, 354.18it/s]
testing: 100%|██████████| 5/5 [00:00<00:00, 315.07it/s]
testing: 100%|██████████| 5/5 [00:00<00:00, 323.37it/s]
testing: 100%|██████████| 5/5 [00:00<00:00, 316.77it/s]

generating duplicate samples for dataset 4_breastw...
current noise type: None
{'Samples': 1000, 'Features': 9, 'Anomalies': 360, 'Anomalies Ratio(%)': 36.0}
best param: None
Start Training...
ensemble size: 7
kernel size: 2
ICLNet(
  (enc_f_net): MLPnet(
    (network): Sequential(
      (0): LinearBlock(
        (linear): Linear(in_features=7, out_features=100, bias=False)
        (act_layer): Tanh()
        (bn_layer): BatchNorm1d(8, eps=1e-05, momentum=0.1, affine=False, track_running_stats=True)
      )
      (1): LinearBlock(
        (linear): Linear(in_features=100, out_features=50, bias=False)
        (act_layer): LeakyReLU(negative_slope=0.01)
        (bn_layer): BatchNorm1d(8, eps=1e-05, momentum=0.1, affine=False, track_running_stats=True)
      )
      (2): LinearBlock(
        (linear): Linear(in_features=50, out_features=128, bias=False)
        (act_layer): LeakyReLU(negative_slope=0.01)
        (bn_layer): BatchNorm1d(8, eps=1e-05, momentum=0.1, affine=False, track_runni




epoch  1, training loss: 67.960076, time: 0.1s
epoch 10, training loss: 5.922819, time: 0.1s
epoch 20, training loss: 4.512699, time: 0.1s
epoch 30, training loss: 2.826114, time: 0.1s
epoch 40, training loss: 2.110901, time: 0.1s
epoch 50, training loss: 1.642380, time: 0.1s
epoch 60, training loss: 1.608773, time: 0.1s
epoch 70, training loss: 1.415740, time: 0.1s
epoch 80, training loss: 1.408265, time: 0.1s
epoch 90, training loss: 1.284295, time: 0.1s
epoch100, training loss: 1.084434, time: 0.1s
kernel size: 2
ICLNet(
  (enc_f_net): MLPnet(
    (network): Sequential(
      (0): LinearBlock(
        (linear): Linear(in_features=7, out_features=100, bias=False)
        (act_layer): Tanh()
        (bn_layer): BatchNorm1d(8, eps=1e-05, momentum=0.1, affine=False, track_running_stats=True)
      )
      (1): LinearBlock(
        (linear): Linear(in_features=100, out_features=50, bias=False)
        (act_layer): LeakyReLU(negative_slope=0.01)
        (bn_layer): BatchNorm1d(8, eps=1e-0

epoch 10, training loss: 7.010805, time: 0.1s
epoch 20, training loss: 4.063835, time: 0.1s
epoch 30, training loss: 3.150791, time: 0.1s
epoch 40, training loss: 2.122366, time: 0.1s
epoch 50, training loss: 2.116168, time: 0.1s
epoch 60, training loss: 1.786091, time: 0.1s
epoch 70, training loss: 1.661699, time: 0.1s
epoch 80, training loss: 1.374450, time: 0.1s
epoch 90, training loss: 1.266468, time: 0.1s
epoch100, training loss: 1.407918, time: 0.1s
kernel size: 2
ICLNet(
  (enc_f_net): MLPnet(
    (network): Sequential(
      (0): LinearBlock(
        (linear): Linear(in_features=7, out_features=100, bias=False)
        (act_layer): Tanh()
        (bn_layer): BatchNorm1d(8, eps=1e-05, momentum=0.1, affine=False, track_running_stats=True)
      )
      (1): LinearBlock(
        (linear): Linear(in_features=100, out_features=50, bias=False)
        (act_layer): LeakyReLU(negative_slope=0.01)
        (bn_layer): BatchNorm1d(8, eps=1e-05, momentum=0.1, affine=False, track_running_st

testing: 100%|██████████| 11/11 [00:00<00:00, 379.59it/s]
testing: 100%|██████████| 11/11 [00:00<00:00, 397.16it/s]
testing: 100%|██████████| 11/11 [00:00<00:00, 410.16it/s]
testing: 100%|██████████| 11/11 [00:00<00:00, 391.41it/s]
testing: 100%|██████████| 11/11 [00:00<00:00, 395.87it/s]
testing: 100%|██████████| 11/11 [00:00<00:00, 406.41it/s]
testing: 100%|██████████| 11/11 [00:00<00:00, 405.49it/s]
testing: 100%|██████████| 5/5 [00:00<00:00, 246.03it/s]
testing: 100%|██████████| 5/5 [00:00<00:00, 386.74it/s]
testing: 100%|██████████| 5/5 [00:00<00:00, 407.21it/s]
testing: 100%|██████████| 5/5 [00:00<00:00, 356.39it/s]
testing: 100%|██████████| 5/5 [00:00<00:00, 394.17it/s]
testing: 100%|██████████| 5/5 [00:00<00:00, 371.68it/s]
testing: 100%|██████████| 5/5 [00:00<00:00, 257.04it/s]


current noise type: None
{'Samples': 3062, 'Features': 166, 'Anomalies': 97, 'Anomalies Ratio(%)': 3.17}
best param: None
Start Training...
ensemble size: 1
kernel size: 16
ICLNet(
  (enc_f_net): MLPnet(
    (network): Sequential(
      (0): LinearBlock(
        (linear): Linear(in_features=150, out_features=100, bias=False)
        (act_layer): Tanh()
        (bn_layer): BatchNorm1d(151, eps=1e-05, momentum=0.1, affine=False, track_running_stats=True)
      )
      (1): LinearBlock(
        (linear): Linear(in_features=100, out_features=50, bias=False)
        (act_layer): LeakyReLU(negative_slope=0.01)
        (bn_layer): BatchNorm1d(151, eps=1e-05, momentum=0.1, affine=False, track_running_stats=True)
      )
      (2): LinearBlock(
        (linear): Linear(in_features=50, out_features=128, bias=False)
        (act_layer): LeakyReLU(negative_slope=0.01)
        (bn_layer): BatchNorm1d(151, eps=1e-05, momentum=0.1, affine=False, track_running_stats=True)
      )
    )
  )
  (enc_g_ne

testing: 100%|██████████| 34/34 [00:00<00:00, 293.51it/s]
testing: 100%|██████████| 15/15 [00:00<00:00, 304.79it/s]

current noise type: None
{'Samples': 1456, 'Features': 12, 'Anomalies': 50, 'Anomalies Ratio(%)': 3.43}
best param: None
Start Training...
ensemble size: 6
kernel size: 2
ICLNet(
  (enc_f_net): MLPnet(
    (network): Sequential(
      (0): LinearBlock(
        (linear): Linear(in_features=10, out_features=100, bias=False)
        (act_layer): Tanh()
        (bn_layer): BatchNorm1d(11, eps=1e-05, momentum=0.1, affine=False, track_running_stats=True)
      )
      (1): LinearBlock(
        (linear): Linear(in_features=100, out_features=50, bias=False)
        (act_layer): LeakyReLU(negative_slope=0.01)
        (bn_layer): BatchNorm1d(11, eps=1e-05, momentum=0.1, affine=False, track_running_stats=True)
      )
      (2): LinearBlock(
        (linear): Linear(in_features=50, out_features=128, bias=False)
        (act_layer): LeakyReLU(negative_slope=0.01)
        (bn_layer): BatchNorm1d(11, eps=1e-05, momentum=0.1, affine=False, track_running_stats=True)
      )
    )
  )
  (enc_g_net): ML




epoch  1, training loss: 33.162752, time: 0.1s
epoch 10, training loss: 1.450437, time: 0.1s
epoch 20, training loss: 1.168082, time: 0.1s
epoch 30, training loss: 1.022032, time: 0.1s
epoch 40, training loss: 0.913861, time: 0.1s
epoch 50, training loss: 0.858695, time: 0.1s
epoch 60, training loss: 0.832699, time: 0.1s
epoch 70, training loss: 0.792271, time: 0.1s
epoch 80, training loss: 0.711186, time: 0.1s
epoch 90, training loss: 0.691919, time: 0.1s
epoch100, training loss: 0.697396, time: 0.1s
kernel size: 2
ICLNet(
  (enc_f_net): MLPnet(
    (network): Sequential(
      (0): LinearBlock(
        (linear): Linear(in_features=10, out_features=100, bias=False)
        (act_layer): Tanh()
        (bn_layer): BatchNorm1d(11, eps=1e-05, momentum=0.1, affine=False, track_running_stats=True)
      )
      (1): LinearBlock(
        (linear): Linear(in_features=100, out_features=50, bias=False)
        (act_layer): LeakyReLU(negative_slope=0.01)
        (bn_layer): BatchNorm1d(11, eps=1

epoch 10, training loss: 1.771345, time: 0.1s
epoch 20, training loss: 1.277593, time: 0.1s
epoch 30, training loss: 1.009374, time: 0.1s
epoch 40, training loss: 0.939300, time: 0.1s
epoch 50, training loss: 0.930740, time: 0.1s
epoch 60, training loss: 0.801810, time: 0.1s
epoch 70, training loss: 0.846673, time: 0.1s
epoch 80, training loss: 0.733163, time: 0.1s
epoch 90, training loss: 0.694137, time: 0.1s
epoch100, training loss: 0.683217, time: 0.1s
kernel size: 2
ICLNet(
  (enc_f_net): MLPnet(
    (network): Sequential(
      (0): LinearBlock(
        (linear): Linear(in_features=10, out_features=100, bias=False)
        (act_layer): Tanh()
        (bn_layer): BatchNorm1d(11, eps=1e-05, momentum=0.1, affine=False, track_running_stats=True)
      )
      (1): LinearBlock(
        (linear): Linear(in_features=100, out_features=50, bias=False)
        (act_layer): LeakyReLU(negative_slope=0.01)
        (bn_layer): BatchNorm1d(11, eps=1e-05, momentum=0.1, affine=False, track_running

testing: 100%|██████████| 16/16 [00:00<00:00, 405.22it/s]
testing: 100%|██████████| 16/16 [00:00<00:00, 354.22it/s]
testing: 100%|██████████| 16/16 [00:00<00:00, 383.62it/s]
testing: 100%|██████████| 16/16 [00:00<00:00, 390.31it/s]
testing: 100%|██████████| 16/16 [00:00<00:00, 364.83it/s]
testing: 100%|██████████| 16/16 [00:00<00:00, 391.62it/s]
testing: 100%|██████████| 7/7 [00:00<00:00, 388.56it/s]
testing: 100%|██████████| 7/7 [00:00<00:00, 377.82it/s]
testing: 100%|██████████| 7/7 [00:00<00:00, 380.24it/s]
testing: 100%|██████████| 7/7 [00:00<00:00, 367.36it/s]
testing: 100%|██████████| 7/7 [00:00<00:00, 374.43it/s]
testing: 100%|██████████| 7/7 [00:00<00:00, 321.34it/s]


current noise type: None
{'Samples': 3772, 'Features': 6, 'Anomalies': 93, 'Anomalies Ratio(%)': 2.47}
best param: None
Start Training...
ensemble size: 8
kernel size: 2
ICLNet(
  (enc_f_net): MLPnet(
    (network): Sequential(
      (0): LinearBlock(
        (linear): Linear(in_features=4, out_features=100, bias=False)
        (act_layer): Tanh()
        (bn_layer): BatchNorm1d(5, eps=1e-05, momentum=0.1, affine=False, track_running_stats=True)
      )
      (1): LinearBlock(
        (linear): Linear(in_features=100, out_features=50, bias=False)
        (act_layer): LeakyReLU(negative_slope=0.01)
        (bn_layer): BatchNorm1d(5, eps=1e-05, momentum=0.1, affine=False, track_running_stats=True)
      )
      (2): LinearBlock(
        (linear): Linear(in_features=50, out_features=128, bias=False)
        (act_layer): LeakyReLU(negative_slope=0.01)
        (bn_layer): BatchNorm1d(5, eps=1e-05, momentum=0.1, affine=False, track_running_stats=True)
      )
    )
  )
  (enc_g_net): MLPnet(

epoch  1, training loss: 36.904547, time: 0.3s
epoch 10, training loss: 1.104519, time: 0.2s
epoch 20, training loss: 0.607233, time: 0.2s
epoch 30, training loss: 0.453211, time: 0.2s
epoch 40, training loss: 0.424869, time: 0.3s
epoch 50, training loss: 0.385977, time: 0.2s
epoch 60, training loss: 0.357614, time: 0.2s
epoch 70, training loss: 0.294661, time: 0.2s
epoch 80, training loss: 0.334352, time: 0.2s
epoch 90, training loss: 0.266011, time: 0.2s
epoch100, training loss: 0.230215, time: 0.3s
kernel size: 2
ICLNet(
  (enc_f_net): MLPnet(
    (network): Sequential(
      (0): LinearBlock(
        (linear): Linear(in_features=4, out_features=100, bias=False)
        (act_layer): Tanh()
        (bn_layer): BatchNorm1d(5, eps=1e-05, momentum=0.1, affine=False, track_running_stats=True)
      )
      (1): LinearBlock(
        (linear): Linear(in_features=100, out_features=50, bias=False)
        (act_layer): LeakyReLU(negative_slope=0.01)
        (bn_layer): BatchNorm1d(5, eps=1e-0

epoch  1, training loss: 21.067305, time: 0.2s
epoch 10, training loss: 0.737990, time: 0.2s
epoch 20, training loss: 0.385057, time: 0.2s
epoch 30, training loss: 0.329420, time: 0.2s
epoch 40, training loss: 0.295736, time: 0.2s
epoch 50, training loss: 0.328127, time: 0.2s
epoch 60, training loss: 0.231093, time: 0.2s
epoch 70, training loss: 0.239335, time: 0.2s
epoch 80, training loss: 0.238083, time: 0.2s
epoch 90, training loss: 0.222905, time: 0.2s
epoch100, training loss: 0.205138, time: 0.2s
Start Inference on the training data...


testing: 100%|██████████| 42/42 [00:00<00:00, 468.70it/s]
testing: 100%|██████████| 42/42 [00:00<00:00, 463.48it/s]
testing: 100%|██████████| 42/42 [00:00<00:00, 475.17it/s]
testing: 100%|██████████| 42/42 [00:00<00:00, 433.58it/s]
testing: 100%|██████████| 42/42 [00:00<00:00, 460.87it/s]
testing: 100%|██████████| 42/42 [00:00<00:00, 466.67it/s]
testing: 100%|██████████| 42/42 [00:00<00:00, 426.78it/s]
testing: 100%|██████████| 42/42 [00:00<00:00, 441.52it/s]
testing: 100%|██████████| 18/18 [00:00<00:00, 452.84it/s]
testing: 100%|██████████| 18/18 [00:00<00:00, 444.04it/s]
testing: 100%|██████████| 18/18 [00:00<00:00, 460.25it/s]
testing: 100%|██████████| 18/18 [00:00<00:00, 457.02it/s]
testing: 100%|██████████| 18/18 [00:00<00:00, 398.96it/s]
testing: 100%|██████████| 18/18 [00:00<00:00, 424.02it/s]
testing: 100%|██████████| 18/18 [00:00<00:00, 425.99it/s]
testing: 100%|██████████| 18/18 [00:00<00:00, 411.14it/s]


## Metrics and Results
The area under the receiver operating characteristic curve (AUROC, stored as `AUCROC` in the code) is a standard metric for evaluating anomaly detection models. It measures how well a model separates normal from anomalous instances across all decision thresholds, without committing to a single one. Equivalently, it is the probability that a randomly chosen anomaly receives a higher score than a randomly chosen normal sample: 0.5 corresponds to random ranking and 1.0 to perfect separation. The table below shows the AUCROC values obtained on each dataset.

In [5]:
df_AUCROC

| Dataset | ICL |
|---|---|
| 6_cardio | 0.793364 |
| 13_fraud | 0.867379 |
| 24_mnist | 0.892911 |
| 42_WBC | 0.952828 |
| 43_WDBC | 0.974138 |
| 4_breastw | 0.826051 |
| 25_musk | 0.999109 |
| 40_vowels | 0.850237 |
| 38_thyroid | 0.864163 |
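
To make the AUROC definition concrete, the sketch below computes it directly as a pairwise ranking probability. It is a dependency-free illustration with a hypothetical helper name `auroc`, not the code behind `utils.metric`, and its quadratic pairwise loop is only practical for small inputs. Labels are assumed to be 1 for anomalies and 0 for normal samples, with higher scores meaning "more anomalous".

```python
def auroc(y_true, y_score):
    """Fraction of (anomaly, normal) pairs ranked correctly; ties count as 0.5."""
    pos = [s for y, s in zip(y_true, y_score) if y == 1]  # anomaly scores
    neg = [s for y, s in zip(y_true, y_score) if y == 0]  # normal scores
    if not pos or not neg:
        raise ValueError("AUROC needs at least one sample of each class")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Usage: three of the four (anomaly, normal) pairs are ordered correctly.
print(auroc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))  # 0.75
```

Because the metric depends only on the ordering of the scores, it is unaffected by any monotonic rescaling of the anomaly scores, which makes it convenient for comparing detectors whose score ranges differ.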
