# Anomaly Detection (Binary 3D)

Discuss the dataset here, including its shape. Also discuss which dataset was specifically chosen.

In [1]:
%load_ext autoreload
%autoreload 2

import pickle
import sys
sys.path.insert(0, "supporting/anomaly")

from supporting.anomaly.preprocessing import load_anomaly_data
import supporting.anomaly.settings as anomaly_settings
import pyMAISE as mai



## pyMAISE Initialization

Discuss initialization: settings file, use of GPU for LSTM/GRU.

In [2]:
_ = mai.init(
    problem_type=anomaly_settings.problem_type,
    verbosity=anomaly_settings.verbosity,
    random_state=anomaly_settings.random_state,
    cuda_visible_devices="1",  # Use GPU 1
)

Num GPUs Available:  1
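
The `cuda_visible_devices="1"` argument and the single GPU reported above are consistent with the standard `CUDA_VISIBLE_DEVICES` mechanism, in which only the listed physical devices are exposed to the process and are renumbered from zero. The sketch below illustrates that mechanism only. That `mai.init` sets this variable internally is an assumption, `visible_gpu_ids` is a hypothetical helper, and the parser assumes integer device ids rather than UUIDs.

```python
import os


def visible_gpu_ids(env=None):
    """Return the physical GPU ids listed in CUDA_VISIBLE_DEVICES, or None if unset."""
    env = os.environ if env is None else env
    raw = env.get("CUDA_VISIBLE_DEVICES")
    if raw is None:
        return None  # no mask: every GPU is visible to the process
    # Assumes a comma-separated list of integer ids (hypothetical simplification)
    return [int(token) for token in raw.split(",") if token.strip()]


# The mask must be set before the deep learning framework initializes CUDA
os.environ["CUDA_VISIBLE_DEVICES"] = "1"
print(visible_gpu_ids())
```

Inside the process, the one visible device is then addressed as device 0, regardless of its physical index.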



Discuss the train/test split, input data scaling, and the external file we used. Also discuss the removal of some positive cases and the frequency plot.

![Frequency of positive/negative values in anomaly detection data.](supporting/anomaly/figs/bc1_frequency.png)

In [3]:
xtrain, xtest, ytrain, ytest, xscaler = load_anomaly_data(
    stack_series=False,
    multiclass=False,
    test_size=anomaly_settings.test_size,
    non_faulty_frac=anomaly_settings.non_faulty_frac,
    timestep_step=1,
)

xtrain shape: (239, 4500, 14)
xtest shape: (103, 4500, 14)
ytrain shape: (239, 2)
ytest shape: (103, 2)


In [4]:
xtrain

In [5]:
ytrain

## Model Initialization and Hyperparameter Tuning

Discuss the models we assessed, their hyperparameter search spaces, and the dumping/loading of pickled configurations. Also discuss the convergence plot below.

![Convergence of Bayesian optimizer to best hyperparameter configuration.](supporting/anomaly/figs/bc1_convergence.png)

In [6]:
with open("supporting/anomaly/configs/binary_case_1.pkl", "rb") as f:
    configs = pickle.load(f)
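
As a minimal sketch of the dump/load round trip behind the cell above: the dictionary below is a hypothetical stand-in, since the structure of the object stored in `binary_case_1.pkl` is produced by pyMAISE and is not shown in this notebook. `dump_configs` and `load_configs` are illustrative helper names, not pyMAISE functions.

```python
import os
import pickle
import tempfile

# Hypothetical stand-in for the tuned configurations (structure assumed)
configs_example = {"LSTM": {"batch_size": 16}, "GRU": {"batch_size": 32}}


def dump_configs(obj, path):
    """Serialize an object to disk with pickle."""
    with open(path, "wb") as f:
        pickle.dump(obj, f)


def load_configs(path):
    """Load a pickled object from disk."""
    with open(path, "rb") as f:
        return pickle.load(f)


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "configs_example.pkl")
    dump_configs(configs_example, path)
    restored = load_configs(path)

print(restored == configs_example)
```

Saving the tuned configurations this way avoids rerunning the hyperparameter search each time the notebook is executed. Pickle files should only be loaded from trusted sources, because unpickling can execute arbitrary code.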


## Model Postprocessing

Discuss postprocessing, including the increase in training epochs for both models from 5 to 100.

In [7]:
postprocessor = mai.PostProcessor(
    data=(xtrain, xtest, ytrain, ytest),
    model_configs=[configs],
    new_model_settings={
        "LSTM": {"fitting_params": {"epochs": 100}},
        "GRU": {"fitting_params": {"epochs": 100}},
    },
)

Epoch 1/100
Loaded cuDNN version 8904
StreamExecutor device (0): NVIDIA GeForce RTX 4090, Compute Capability 8.9
[remaining per-epoch training output omitted]

Discuss performance metrics. Include equations and definitions of each metric.
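
For reference, the standard binary-classification definitions of the four reported metrics are given below, where $TP$, $TN$, $FP$, and $FN$ are the counts of true positives, true negatives, false positives, and false negatives. How pyMAISE aggregates per-class values into the single numbers it reports is not stated in this notebook, so these should be read as the per-class building blocks rather than the exact reported quantities.

$$
\begin{aligned}
\text{Accuracy} &= \frac{TP + TN}{TP + TN + FP + FN} \\
\text{Precision} &= \frac{TP}{TP + FP} \\
\text{Recall} &= \frac{TP}{TP + FN} \\
F_1 &= \frac{2 \cdot \text{Precision} \cdot \text{Recall}}{\text{Precision} + \text{Recall}} = \frac{2\,TP}{2\,TP + FP + FN}
\end{aligned}
$$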

In [10]:
postprocessor.metrics()

| | Model Types | Parameter Configurations | Train Accuracy | Train Recall | Train Precision | Train F1 | Test Accuracy | Test Recall | Test Precision | Test F1 |
|---|---|---|---|---|---|---|---|---|---|---|
| 3 | GRU | `{'LSTM_input_0_units': 59, 'LSTM_num_layers': ...` | 0.728033 | 0.718969 | 0.790736 | 0.707025 | 0.718447 | 0.670455 | 0.835227 | 0.655598 |
| 1 | LSTM | `{'LSTM_input_0_units': 29, 'LSTM_num_layers': ...` | 0.669456 | 0.657153 | 0.781533 | 0.62287 | 0.68932 | 0.639253 | 0.790128 | 0.61603 |
| 0 | LSTM | `{'LSTM_input_0_units': 59, 'LSTM_num_layers': ...` | 0.631799 | 0.617391 | 0.792453 | 0.559188 | 0.650485 | 0.590909 | 0.810526 | 0.536963 |
| 2 | GRU | `{'LSTM_input_0_units': 29, 'LSTM_num_layers': ...` | 0.518828 | 0.5 | 0.259414 | 0.341598 | 0.572816 | 0.5 | 0.286408 | 0.364198 |


Discuss the results and information provided in the performance metrics.
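
One observation that may be worth discussing: the lowest-ranked GRU row (index 2) has a recall of exactly 0.5 and a precision equal to half its accuracy. That pattern is what macro-averaged metrics produce when a model predicts the same class for every sample. The sketch below checks this against the training columns, assuming macro averaging with undefined per-class ratios counted as zero and a 124/115 class split over the 239 training samples. Both are inferences from the reported numbers, not facts stated in the notebook, and the choice of class 0 as the majority is arbitrary.

```python
def macro_metrics(y_true, y_pred, classes=(0, 1)):
    """Accuracy plus macro-averaged recall, precision, and F1.

    Per-class ratios with a zero denominator are counted as 0.
    """
    pairs = list(zip(y_true, y_pred))
    accuracy = sum(t == p for t, p in pairs) / len(pairs)
    recalls, precisions, f1s = [], [], []
    for c in classes:
        tp = sum(t == c and p == c for t, p in pairs)
        fp = sum(t != c and p == c for t, p in pairs)
        fn = sum(t == c and p != c for t, p in pairs)
        recalls.append(tp / (tp + fn) if tp + fn else 0.0)
        precisions.append(tp / (tp + fp) if tp + fp else 0.0)
        f1s.append(2 * tp / (2 * tp + fp + fn) if tp + fp + fn else 0.0)
    k = len(classes)
    return {
        "accuracy": accuracy,
        "recall": sum(recalls) / k,
        "precision": sum(precisions) / k,
        "f1": sum(f1s) / k,
    }


# Assumed split: 124 samples of class 0 and 115 of class 1 (124 / 239 = 0.518828)
y_true = [0] * 124 + [1] * 115
y_pred = [0] * len(y_true)  # constant predictor: always the majority class

result = {name: round(value, 6) for name, value in macro_metrics(y_true, y_pred).items()}
print(result)
# {'accuracy': 0.518828, 'recall': 0.5, 'precision': 0.259414, 'f1': 0.341598}
```

These values match the training columns of that row, which suggests the configuration learned nothing beyond the majority class. The test columns follow the same pattern with a 59/44 split of the 103 test samples.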

In [11]:
for model in ["LSTM", "GRU"]:
    for key, value in postprocessor.get_params(model_type=model).to_dict().items():
        print(f"{key}: {value[0]}")
    print()

Model Types: LSTM
LSTM_input_0_units: 29
LSTM_num_layers: 3
LSTM_output_0_units: 133
LSTM_output_0_activation: sigmoid
Dense_num_layers: 1
Adam_learning_rate: 0.0003141707794552247
Adam_clipnorm: 1.1666214434511002
Adam_clipvalue: 0.4380990281818533
LSTM_0_units: 25
LSTM_0_activation: tanh
LSTM_1_units: 25
LSTM_1_activation: tanh
LSTM_2_units: 25
LSTM_2_activation: tanh
Dense_0_units: 25
batch_size: 16

Model Types: GRU
LSTM_input_0_units: 59
LSTM_num_layers: 0
LSTM_output_0_units: 29
LSTM_output_0_activation: tanh
Dense_num_layers: 1
Adam_learning_rate: 0.00010554374535855709
Adam_clipnorm: 0.8772480895916048
Adam_clipvalue: 0.3793316821470448
LSTM_0_units: 41
LSTM_0_activation: sigmoid
LSTM_1_units: 45
LSTM_1_activation: sigmoid
LSTM_2_units: 101
LSTM_2_activation: sigmoid
Dense_0_units: 63
batch_size: 32



Discuss anything notable about the hyperparameter configurations, if applicable.

In [12]:
postprocessor.confusion_matrix(model_type="LSTM")

ValueError: multilabel-indicator is not supported

In [None]:
postprocessor.confusion_matrix(model_type="GRU")
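
The `ValueError` above matches the message scikit-learn raises when `confusion_matrix` receives 2D indicator arrays, and the targets here are two-column arrays (`ytrain shape: (239, 2)`). A common workaround is to collapse each one-hot row to a class index before building the matrix. The pure-Python sketch below illustrates the idea on toy data. `to_class_indices` and `confusion_counts` are hypothetical helpers, and whether this conversion has to be applied by the user or inside pyMAISE is an assumption left open here.

```python
def to_class_indices(rows):
    """Collapse one-hot (or score) rows to the index of the largest entry."""
    return [max(range(len(row)), key=row.__getitem__) for row in rows]


def confusion_counts(y_true, y_pred, n_classes=2):
    """Return matrix[i][j] = number of samples with true class i predicted as j."""
    matrix = [[0] * n_classes for _ in range(n_classes)]
    for t, p in zip(y_true, y_pred):
        matrix[t][p] += 1
    return matrix


# Toy data standing in for one-hot targets and model output scores
y_true_onehot = [[1, 0], [0, 1], [0, 1], [1, 0]]
y_pred_scores = [[0.9, 0.1], [0.4, 0.6], [0.7, 0.3], [0.8, 0.2]]

matrix = confusion_counts(
    to_class_indices(y_true_onehot), to_class_indices(y_pred_scores)
)
print(matrix)  # [[2, 0], [1, 1]]
```

With NumPy arrays, the same conversion is `y.argmax(axis=1)`.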

Discuss confusion matrix results.

In [None]:
postprocessor.nn_learning_plot(model_type="LSTM")

In [None]:
postprocessor.nn_learning_plot(model_type="GRU")

Discuss neural network learning curves.

![pyMAISE Logo](../docs/source/_images/pyMAISElogo.png)