# Autoencoder (AE) Implementation with Nature-Inspired Optimization
### MSC/DSA/134

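This notebook implements an Autoencoder (AE) for fraud detection and tunes its architecture with a nature-inspired optimizer.
Optimization Goal: Find the best architecture (Encoder Layers, Decoder Layers, Latent Size, Units, Dropout) for anomaly detection. The optimizer output reports the objective as 1 - AUPRC, so the search maximizes AUPRC; F1 and ROC-AUC are logged alongside it for reference.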
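
Swarm optimizers such as PSO work on continuous vectors, so an architecture search like this one typically decodes each candidate vector into a hyperparameter dictionary before training. The sketch below shows one possible decoding. It is only an illustration: `decode_position`, `scale_int`, the bounds, and the latent-size choices are hypothetical and are not taken from `globals.ae_runner`.

```python
# Hypothetical decoding of a continuous position vector into AE hyperparameters.
# Bounds and choices are illustrative assumptions, not the notebook's real search space.
MAX_LAYERS = 4
LATENT_CHOICES = [4, 8, 16, 32]


def scale_int(x, low, high):
    """Map x in [0, 1] to an integer in [low, high]."""
    x = min(max(x, 0.0), 1.0)  # clip out-of-range optimizer values
    return low + int(round(x * (high - low)))


def decode_position(position):
    """Decode a flat vector of 4 + 2 * MAX_LAYERS values in [0, 1]."""
    n_enc = scale_int(position[0], 1, MAX_LAYERS)
    n_dec = scale_int(position[1], 1, MAX_LAYERS)
    latent = LATENT_CHOICES[scale_int(position[2], 0, len(LATENT_CHOICES) - 1)]
    dropout = 0.5 * min(max(position[3], 0.0), 1.0)  # dropout in [0, 0.5]
    # Units are multiples of 16 between 16 and 1024; unused layer slots are ignored.
    enc_slots = position[4:4 + MAX_LAYERS]
    dec_slots = position[4 + MAX_LAYERS:4 + 2 * MAX_LAYERS]
    return {
        "n_encoder_layers": n_enc,
        "n_decoder_layers": n_dec,
        "latent_size": latent,
        "encoder_units": [16 * scale_int(v, 1, 64) for v in enc_slots[:n_enc]],
        "decoder_units": [16 * scale_int(v, 1, 64) for v in dec_slots[:n_dec]],
        "dropout_rate": dropout,
    }


hp = decode_position([1.0, 0.5, 0.4, 0.1, 0.0, 0.5, 1.0, 0.2, 0.3, 0.9, 0.6, 0.0])
print(hp)
```

A fixed-length vector with ignored slots keeps the search space rectangular, which is what most swarm optimizers expect, at the cost of some dimensions having no effect on a given candidate.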

In [1]:
# import libraries and dependencies

import globals.torch_gpu_processing as torch_gpu_processing
import pandas as pd
import globals.ae_runner as ae_runner
import globals.data_utils as data_utils
import joblib

In [2]:
# import datasets
data_base_path = "data/processed/null_value_option_1_with_validation_set/scaled_only"

X_train = pd.read_csv(f"{data_base_path}/unified_transaction_data_option2_x_train_scaled.csv")
X_validation = pd.read_csv(f"{data_base_path}/unified_transaction_data_option2_x_validation_scaled.csv")
X_test = pd.read_csv(f"{data_base_path}/unified_transaction_data_option2_x_test_scaled.csv")

y_train = pd.read_csv(f"{data_base_path}/unified_transaction_data_option2_y_train.csv")
y_validation = pd.read_csv(f"{data_base_path}/unified_transaction_data_option2_y_validation.csv")
y_test = pd.read_csv(f"{data_base_path}/unified_transaction_data_option2_y_test.csv")

print("X_train:", X_train.shape)
print("X_validation:", X_validation.shape)
print("X_test:", X_test.shape)

X_train: (354305, 26)
X_validation: (118102, 26)
X_test: (118102, 26)


In [3]:
x_sam, y_sam = data_utils.get_stratified_sample(X_validation, y_validation, 10000)
data_utils.show_class_distribution(x_sam, y_sam.to_numpy().ravel(), "Test validation data sample")



Test validation data sample:
  Total samples: 10000
  Y df samples:  [0 0 0 ... 0 0 0]
  Class 0 (non-fraud): 9650 (96.50%)
  Class 1 (fraud): 350 (3.50%)
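
The 3.50% fraud share in the sample matches the share reported for the downsampled validation data later in the notebook, which is what a stratified sample is meant to achieve. As a rough sketch of the idea (not the actual `data_utils.get_stratified_sample` implementation, which is not shown here), the hypothetical helper below allocates each class a quota proportional to its frequency.

```python
import random
from collections import defaultdict


def stratified_sample_indices(labels, n_samples, seed=42):
    """Pick about n_samples indices while keeping each class's share close to the original."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    total = len(labels)
    chosen = []
    for label, indices in sorted(by_class.items()):
        # Allocate a quota proportional to the class frequency.
        quota = round(n_samples * len(indices) / total)
        chosen.extend(rng.sample(indices, min(quota, len(indices))))
    rng.shuffle(chosen)
    return chosen


# Toy labels with a 3.5% positive rate, mirroring the distribution printed above.
labels = [1] * 350 + [0] * 9650
picked = stratified_sample_indices(labels, 1000)
n_fraud = sum(labels[i] for i in picked)
print(len(picked), n_fraud)
```

Because each quota is rounded independently, the total can differ from `n_samples` by a row or two when the class shares do not divide evenly.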


In [4]:
torch_gpu_processing.test_direct_ml_processing()

DirectML device: privateuseone:0
Test operation successful: [2. 4.]


True

In [4]:
# get a sample for optimization (Non-Fraud Only!)
sample_size = 100000
seed = 42

X_train_sample, y_train_sample = ae_runner.get_sampling_data(
    X_train,
    y_train,
    sample_size=sample_size,
    seed=seed
)

print(f"Optimization Sample Size (Non-Fraud Only): {len(X_train_sample)}")

Optimization Sample Size (Non-Fraud Only): 100000
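
Training on non-fraud rows only is the usual setup for autoencoder-based anomaly detection: the model learns to reconstruct normal transactions, and rows it reconstructs poorly receive a high anomaly score. The sketch below illustrates that scoring step with a per-row mean squared error. `toy_reconstruct` is a hypothetical stand-in for a trained model, and the choice of MSE is an assumption rather than something confirmed by the notebook's helper modules.

```python
def reconstruction_errors(rows, reconstruct):
    """Return the mean squared reconstruction error for each row."""
    errors = []
    for row in rows:
        rebuilt = reconstruct(row)
        errors.append(sum((a - b) ** 2 for a, b in zip(row, rebuilt)) / len(row))
    return errors


def toy_reconstruct(row):
    """Hypothetical stand-in for a trained autoencoder: it clips every feature
    to [-1, 1], so rows inside that range are reproduced exactly."""
    return [max(-1.0, min(1.0, value)) for value in row]


normal_row = [0.2, -0.5, 0.9]
unusual_row = [0.2, -0.5, 3.0]
scores = reconstruction_errors([normal_row, unusual_row], toy_reconstruct)
print(scores)  # the unusual row gets the larger score
```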


In [9]:
# optimization settings
param_optimizer_algorithm = "PSO"  # one of: FA, PSO, GWO
population = 15
iterations = 6
epochs_for_evaluation = 20  # not passed to run_optimization below, which uses epochs=10
batch_size = 1024
early_stopping = 4
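
For orientation, the mealpy log further down reports `OriginalPSO(epoch=6, pop_size=15, c1=2.05, c2=2.05, w=0.4)`. The sketch below is a textbook global-best PSO using those coefficient values on a toy objective. It is not mealpy's implementation, and `pso_minimize` and `toy_objective` are hypothetical names; in the notebook, each objective call would instead train and score an autoencoder.

```python
import random


def pso_minimize(objective, dim, pop_size=15, iterations=6,
                 w=0.4, c1=2.05, c2=2.05, seed=42):
    """Textbook global-best PSO on the box [0, 1]^dim (illustrative only)."""
    rng = random.Random(seed)
    positions = [[rng.random() for _ in range(dim)] for _ in range(pop_size)]
    velocities = [[0.0] * dim for _ in range(pop_size)]
    personal_best = [p[:] for p in positions]
    personal_fit = [objective(p) for p in positions]
    best_idx = min(range(pop_size), key=personal_fit.__getitem__)
    global_best = personal_best[best_idx][:]
    global_fit = personal_fit[best_idx]
    history = [global_fit]

    for _ in range(iterations):
        for i in range(pop_size):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                # Inertia + pull toward personal best + pull toward global best.
                velocities[i][d] = (
                    w * velocities[i][d]
                    + c1 * r1 * (personal_best[i][d] - positions[i][d])
                    + c2 * r2 * (global_best[d] - positions[i][d])
                )
                # Move the particle and keep it inside the search box.
                positions[i][d] = min(1.0, max(0.0, positions[i][d] + velocities[i][d]))
            fit = objective(positions[i])
            if fit < personal_fit[i]:
                personal_fit[i], personal_best[i] = fit, positions[i][:]
                if fit < global_fit:
                    global_fit, global_best = fit, positions[i][:]
        history.append(global_fit)
    return global_best, global_fit, history


def toy_objective(position):
    """Cheap stand-in for the expensive 'train an AE and score it' evaluation."""
    return sum((x - 0.3) ** 2 for x in position)


best, best_fit, history = pso_minimize(toy_objective, dim=3)
print(best_fit, history)
```

With `population = 15` and `iterations = 6`, a loop of this shape needs on the order of a hundred objective evaluations, which is why the per-candidate training budget dominates the total runtime.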

In [6]:
best_hp = ae_runner.run_optimization(
    X_train_sample,
    y_train_sample,
    X_validation,
    y_validation,
    algorithm=param_optimizer_algorithm,
    population=population,
    iterations=iterations,
    batch_size=batch_size,
    epochs=10  # the logged run used 10 epochs; epochs_for_evaluation is not used here
)

Starting AE Optimization using PSO...
Settings: Pop=15, Iter=6, Batch=1024, Epochs=10
Training samples (non-fraud): 100000
Validation (non-fraud): 113971, full: 118102
DirectML: Using optimized DataLoader + clipping
Optimizer using DEVICE: privateuseone:0


2026/01/27 10:04:45 AM, INFO, mealpy.swarm_based.PSO.OriginalPSO: OriginalPSO(epoch=6, pop_size=15, c1=2.05, c2=2.05, w=0.4)



Downsampled validation data:
  Total samples: 20000
  Y df samples:  [0 0 0 ... 0 0 0]
  Class 0 (non-fraud): 19300 (96.50%)
  Class 1 (fraud): 700 (3.50%)
......... [AUPRC: 0.0585 (baseline: 0.0350) | F1: 0.1024 | ROC: 0.6488].......... [AUPRC: 0.0551 (baseline: 0.0350) | F1: 0.0949 | ROC: 0.6231].......... [AUPRC: 0.0609 (baseline: 0.0350) | F1: 0.1023 | ROC: 0.6528].......... [AUPRC: 0.0589 (baseline: 0.0350) | F1: 0.0949 | ROC: 0.6353].......... [AUPRC: 0.0589 (baseline: 0.0350) | F1: 0.0988 | ROC: 0.6352].......... [AUPRC: 0.0558 (baseline: 0.0350) | F1: 0.0984 | ROC: 0.6234]........ [AUPRC: 0.0555 (baseline: 0.0350) | F1: 0.0947 | ROC: 0.6301].......... [AUPRC: 0.0527 (baseline: 0.0350) | F1: 0.0890 | ROC: 0.5651].......... [AUPRC: 0.0564 (baseline: 0.0350) | F1: 0.0964 | ROC: 0.6244].......... [AUPRC: 0.0585 (baseline: 0.0350) | F1: 0.0937 | ROC: 0.5925]....... [AUPRC: 0.0615 (baseline: 0.0350) | F1: 0.1074 | ROC: 0.6607].......... [AUPRC: 0.0550 (baseline: 0.0350) | F1: 0.0892

2026/01/27 02:09:37 PM, INFO, mealpy.swarm_based.PSO.OriginalPSO: >>>Problem: P, Epoch: 1, Current best: 0.9350251439892338, Global best: 0.9350251439892338, Runtime: 6502.30326 seconds


.......... [AUPRC: 0.0565 (baseline: 0.0350) | F1: 0.0989 | ROC: 0.6010].......... [AUPRC: 0.0635 (baseline: 0.0350) | F1: 0.1048 | ROC: 0.6527].......... [AUPRC: 0.0591 (baseline: 0.0350) | F1: 0.0972 | ROC: 0.6252].......... [AUPRC: 0.0532 (baseline: 0.0350) | F1: 0.0925 | ROC: 0.6055].......... [AUPRC: 0.0607 (baseline: 0.0350) | F1: 0.0958 | ROC: 0.6258].......... [AUPRC: 0.0592 (baseline: 0.0350) | F1: 0.0993 | ROC: 0.6190].......... [AUPRC: 0.0632 (baseline: 0.0350) | F1: 0.0987 | ROC: 0.6338].......... [AUPRC: 0.0561 (baseline: 0.0350) | F1: 0.0988 | ROC: 0.6447].......... [AUPRC: 0.0622 (baseline: 0.0350) | F1: 0.0964 | ROC: 0.6151].......... [AUPRC: 0.0530 (baseline: 0.0350) | F1: 0.0944 | ROC: 0.6101].......... [AUPRC: 0.0562 (baseline: 0.0350) | F1: 0.1049 | ROC: 0.6253].......... [AUPRC: 0.0637 (baseline: 0.0350) | F1: 0.1167 | ROC: 0.6591].......... [AUPRC: 0.0590 (baseline: 0.0350) | F1: 0.0936 | ROC: 0.6181].......... [AUPRC: 0.0631 (baseline: 0.0350) | F1: 0.1151 | ROC:

2026/01/27 04:02:25 PM, INFO, mealpy.swarm_based.PSO.OriginalPSO: >>>Problem: P, Epoch: 2, Current best: 0.9350251439892338, Global best: 0.9350251439892338, Runtime: 6768.34020 seconds


.......... [AUPRC: 0.0643 (baseline: 0.0350) | F1: 0.0946 | ROC: 0.6314].......... [AUPRC: 0.0512 (baseline: 0.0350) | F1: 0.0892 | ROC: 0.5638].......... [AUPRC: 0.0509 (baseline: 0.0350) | F1: 0.0917 | ROC: 0.6059]......... [AUPRC: 0.0535 (baseline: 0.0350) | F1: 0.1015 | ROC: 0.6050].......... [AUPRC: 0.0585 (baseline: 0.0350) | F1: 0.0922 | ROC: 0.6213].......... [AUPRC: 0.0508 (baseline: 0.0350) | F1: 0.0914 | ROC: 0.6122].......... [AUPRC: 0.0525 (baseline: 0.0350) | F1: 0.0898 | ROC: 0.5995]......... [AUPRC: 0.0576 (baseline: 0.0350) | F1: 0.1018 | ROC: 0.6389].......... [AUPRC: 0.0616 (baseline: 0.0350) | F1: 0.1034 | ROC: 0.6272].......... [AUPRC: 0.0630 (baseline: 0.0350) | F1: 0.1028 | ROC: 0.6453].......... [AUPRC: 0.0525 (baseline: 0.0350) | F1: 0.1029 | ROC: 0.6204].......... [AUPRC: 0.0586 (baseline: 0.0350) | F1: 0.0926 | ROC: 0.6300].......... [AUPRC: 0.0530 (baseline: 0.0350) | F1: 0.0922 | ROC: 0.6250].......... [AUPRC: 0.0601 (baseline: 0.0350) | F1: 0.0976 | ROC: 0

2026/01/27 06:20:54 PM, INFO, mealpy.swarm_based.PSO.OriginalPSO: >>>Problem: P, Epoch: 3, Current best: 0.9350251439892338, Global best: 0.9350251439892338, Runtime: 8309.73587 seconds


.......... [AUPRC: 0.0566 (baseline: 0.0350) | F1: 0.0961 | ROC: 0.6299].......... [AUPRC: 0.0595 (baseline: 0.0350) | F1: 0.0997 | ROC: 0.6339].......... [AUPRC: 0.0681 (baseline: 0.0350) | F1: 0.0935 | ROC: 0.6108].......... [AUPRC: 0.0544 (baseline: 0.0350) | F1: 0.1069 | ROC: 0.6434].......... [AUPRC: 0.0621 (baseline: 0.0350) | F1: 0.0961 | ROC: 0.6243].......... [AUPRC: 0.0661 (baseline: 0.0350) | F1: 0.0963 | ROC: 0.6333].......... [AUPRC: 0.0615 (baseline: 0.0350) | F1: 0.0927 | ROC: 0.6217].......... [AUPRC: 0.0524 (baseline: 0.0350) | F1: 0.0934 | ROC: 0.5983].......... [AUPRC: 0.0610 (baseline: 0.0350) | F1: 0.0970 | ROC: 0.6140].......... [AUPRC: 0.0544 (baseline: 0.0350) | F1: 0.0959 | ROC: 0.6242].......... [AUPRC: 0.0572 (baseline: 0.0350) | F1: 0.1026 | ROC: 0.6326].......... [AUPRC: 0.0661 (baseline: 0.0350) | F1: 0.1020 | ROC: 0.6416].......... [AUPRC: 0.0540 (baseline: 0.0350) | F1: 0.0888 | ROC: 0.5929]........ [AUPRC: 0.0570 (baseline: 0.0350) | F1: 0.0897 | ROC: 0

2026/01/27 08:15:13 PM, INFO, mealpy.swarm_based.PSO.OriginalPSO: >>>Problem: P, Epoch: 4, Current best: 0.9319211123319088, Global best: 0.9319211123319088, Runtime: 6859.27364 seconds


..... [AUPRC: 0.0561 (baseline: 0.0350) | F1: 0.1196 | ROC: 0.6643]....... [AUPRC: 0.0636 (baseline: 0.0350) | F1: 0.1243 | ROC: 0.6756].......... [AUPRC: 0.0610 (baseline: 0.0350) | F1: 0.0971 | ROC: 0.6066].......... [AUPRC: 0.0546 (baseline: 0.0350) | F1: 0.0966 | ROC: 0.6123].......... [AUPRC: 0.0649 (baseline: 0.0350) | F1: 0.1020 | ROC: 0.6481].......... [AUPRC: 0.0533 (baseline: 0.0350) | F1: 0.0942 | ROC: 0.6146].......... [AUPRC: 0.0807 (baseline: 0.0350) | F1: 0.1415 | ROC: 0.6934].......... [AUPRC: 0.0550 (baseline: 0.0350) | F1: 0.1115 | ROC: 0.6329]......... [AUPRC: 0.0550 (baseline: 0.0350) | F1: 0.0961 | ROC: 0.6202].......... [AUPRC: 0.0555 (baseline: 0.0350) | F1: 0.0981 | ROC: 0.6347].......... [AUPRC: 0.0659 (baseline: 0.0350) | F1: 0.1183 | ROC: 0.6717].......... [AUPRC: 0.0598 (baseline: 0.0350) | F1: 0.0967 | ROC: 0.6362].......... [AUPRC: 0.0646 (baseline: 0.0350) | F1: 0.1057 | ROC: 0.6309].......... [AUPRC: 0.0521 (baseline: 0.0350) | F1: 0.0875 | ROC: 0.5991].

2026/01/27 10:14:16 PM, INFO, mealpy.swarm_based.PSO.OriginalPSO: >>>Problem: P, Epoch: 5, Current best: 0.919291132890992, Global best: 0.919291132890992, Runtime: 7142.69410 seconds


.......... [AUPRC: 0.0613 (baseline: 0.0350) | F1: 0.1263 | ROC: 0.6438].......... [AUPRC: 0.0598 (baseline: 0.0350) | F1: 0.0944 | ROC: 0.6109].......... [AUPRC: 0.0565 (baseline: 0.0350) | F1: 0.0970 | ROC: 0.6202].......... [AUPRC: 0.0517 (baseline: 0.0350) | F1: 0.0915 | ROC: 0.5744].......... [AUPRC: 0.0575 (baseline: 0.0350) | F1: 0.0938 | ROC: 0.6076].......... [AUPRC: 0.0620 (baseline: 0.0350) | F1: 0.0988 | ROC: 0.6304].......... [AUPRC: 0.0586 (baseline: 0.0350) | F1: 0.0962 | ROC: 0.6286]........ [AUPRC: 0.0519 (baseline: 0.0350) | F1: 0.0898 | ROC: 0.6071].......... [AUPRC: 0.0650 (baseline: 0.0350) | F1: 0.1085 | ROC: 0.6628].......... [AUPRC: 0.0592 (baseline: 0.0350) | F1: 0.1007 | ROC: 0.6378].......... [AUPRC: 0.0537 (baseline: 0.0350) | F1: 0.0929 | ROC: 0.6022].......... [AUPRC: 0.0585 (baseline: 0.0350) | F1: 0.0947 | ROC: 0.6057].......... [AUPRC: 0.0609 (baseline: 0.0350) | F1: 0.0922 | ROC: 0.5988].......... [AUPRC: 0.0599 (baseline: 0.0350) | F1: 0.0960 | ROC: 0

2026/01/27 11:50:35 PM, INFO, mealpy.swarm_based.PSO.OriginalPSO: >>>Problem: P, Epoch: 6, Current best: 0.919291132890992, Global best: 0.919291132890992, Runtime: 5779.19855 seconds



Best Objective: 0.919291 => AUPRC: 0.080709
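
The summary line is consistent with an objective of 1 - AUPRC (1 - 0.080709 = 0.919291), and the "baseline" printed with every evaluation equals the fraud rate (0.0350), which is roughly the AUPRC a random ranking would achieve. The sketch below computes average precision, a common estimator of AUPRC, in plain Python and converts it to a minimization objective. It assumes no tied scores and is not necessarily how `ae_runner` computes the metric.

```python
def average_precision(labels, scores):
    """Average precision: mean of precision@k over the ranks k of the positives.

    Assumes no tied scores; with ties, results can differ from library versions.
    """
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    n_positive = sum(labels)
    hits = 0
    precision_sum = 0.0
    for rank, (_, label) in enumerate(ranked, start=1):
        if label == 1:
            hits += 1
            precision_sum += hits / rank
    return precision_sum / n_positive


labels = [0, 1, 0, 0, 1, 0, 0, 0]
scores = [0.10, 0.90, 0.80, 0.30, 0.60, 0.20, 0.05, 0.40]  # e.g. reconstruction errors
auprc = average_precision(labels, scores)
objective = 1.0 - auprc               # value a minimizer would receive
baseline = sum(labels) / len(labels)  # positive rate, the random-ranking reference
print(auprc, objective, baseline)
```

Read this way, the best run's AUPRC of 0.0807 is about 2.3 times the 0.0350 baseline.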


In [6]:
# temporarily hard-code hyperparameters so the cells below can run without re-running the optimization
best_hp = {
    'n_encoder_layers': 4,
    'n_decoder_layers': 3,
    'latent_size': 8,
    'encoder_units': [384, 672, 144, 544],
    'encoder_activations': ['elu', 'elu', 'relu', 'selu'],
    'decoder_units': [112, 144, 1008],
    'decoder_activations': ['elu', 'silu', 'elu'],
    'dropout_rate': 0.01177586505224017,
    'batch_norm': True,
}

In [7]:
# show best hyperparameters
print("Hyperparameters To Train:")
print(best_hp)


Hyperparameters To Train:
{'n_encoder_layers': 4, 'n_decoder_layers': 3, 'latent_size': 8, 'encoder_units': [384, 672, 144, 544], 'encoder_activations': ['elu', 'elu', 'relu', 'selu'], 'decoder_units': [112, 144, 1008], 'decoder_activations': ['elu', 'silu', 'elu'], 'dropout_rate': 0.01177586505224017, 'batch_norm': True}
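
To get a feel for the size of this architecture, the sketch below lists the layer widths and counts the weights of the fully connected layers. It assumes a plain stack (26 input features, the encoder layers, the latent layer, the decoder layers, then a 26-unit output) and ignores batch-norm parameters; the real layout is defined in `torch_gpu_processing`, which is not shown here, so treat the numbers as an estimate. `hp_subset` is a hypothetical copy of the relevant entries.

```python
# Relevant entries copied from the hyperparameters printed above.
hp_subset = {
    "latent_size": 8,
    "encoder_units": [384, 672, 144, 544],
    "decoder_units": [112, 144, 1008],
}
n_features = 26  # number of columns in X_train

# Assumed layout: input -> encoder layers -> latent -> decoder layers -> output.
layer_sizes = (
    [n_features]
    + hp_subset["encoder_units"]
    + [hp_subset["latent_size"]]
    + hp_subset["decoder_units"]
    + [n_features]
)

# Each fully connected layer has fan_in * fan_out weights plus fan_out biases.
linear_params = [
    fan_in * fan_out + fan_out
    for fan_in, fan_out in zip(layer_sizes, layer_sizes[1:])
]
total = sum(linear_params)
print(layer_sizes)
print(total)
```

Under these assumptions the model has 638,914 trainable parameters in its linear layers, most of them in the 384-to-672 and 144-to-1008 connections.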


In [10]:
# final model training with best hyperparameters

print("Using validation set for early stopping, test set for final evaluation only.")
model, metrics = torch_gpu_processing.train_final_ae_model(
    best_hp,
    X_train.to_numpy(),
    y_train.to_numpy(),
    X_validation.to_numpy(),  # Validation set for early stopping
    y_validation.to_numpy(),
    X_test.to_numpy(),  # Test set for final evaluation only
    y_test.to_numpy(),
    batch_size=batch_size,
    max_epochs=100  # Increased for better convergence
)

print("\nFinal Test Set Metrics:")
print("=" * 60)


if metrics.get('optimal_auprc') is not None:
    fraud_rate = y_test.to_numpy().flatten().mean()
    auprc = metrics.get('optimal_auprc')
    print("PRIMARY METRIC (Optimization Objective):")
    print(f"  Test AUPRC:       {auprc:.4f}")
    print(f"  Baseline (random): {fraud_rate:.4f}")
    print(f"  Improvement:       {auprc/fraud_rate:.2f}x over random")
    print()

if metrics.get('optimal_roc_auc') is not None:
    print(f"  Test ROC-AUC:     {metrics.get('optimal_roc_auc'):.4f}")
    print()

print("Threshold-Dependent Metrics:")
print(f"  Optimal Threshold: {metrics.get('optimal_threshold'):.6f}")
print(f"  Optimal F1:        {metrics.get('optimal_f1'):.4f}")
print(f"  Optimal Precision: {metrics.get('optimal_precision'):.4f}")
print(f"  Optimal Recall:    {metrics.get('optimal_recall'):.4f}")
print("=" * 60)

Using validation set for early stopping, test set for final evaluation only.
FINAL AE TRAINING (Max Epochs: 100)
Training on device: privateuseone:0

Filtering Training Data:
  Original size: 354305
  Fraud samples removed: 12394
  Final training size (non-fraud only): 341911

Validation Data Split:
  For early stopping (non-fraud only): 113971
  For AUPRC evaluation (full set): 118102 (4131 fraud samples)

Training Configuration:
  Noise std: 0.1
  Batch size: 1024
  Early stopping patience: 10
  Optimizer: Adam (lr=0.001, weight_decay=1e-5)

Starting training...
Epoch 1/100: Train Loss=0.096155, Val Loss (clean)=0.139647
Epoch 6/100: Train Loss=0.029589, Val Loss (clean)=0.035414
Epoch 11/100: Train Loss=0.020532, Val Loss (clean)=0.128904
Epoch 16/100: Train Loss=0.021285, Val Loss (clean)=0.134629
Epoch 21/100: Train Loss=0.022649, Val Loss (clean)=0.324141

Early stopping triggered at epoch 24
Loaded best model (Val Loss: 0.032786)

VALIDATION SET EVALUATION
Validation AUPRC:     
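
The threshold-dependent metrics printed by the cell above (`optimal_threshold`, `optimal_f1`, `optimal_precision`, `optimal_recall`) suggest that a cut-off on the anomaly score is chosen to maximize F1. How `train_final_ae_model` does this is not shown, so the sketch below is only one plausible approach: scan every distinct score as a candidate threshold and keep the one with the highest F1. `best_f1_threshold` is a hypothetical helper.

```python
def best_f1_threshold(labels, scores):
    """Scan every distinct score as a threshold and keep the best F1."""
    best = {"threshold": None, "f1": -1.0, "precision": 0.0, "recall": 0.0}
    n_positive = sum(labels)
    for threshold in sorted(set(scores)):
        # Rows scoring at or above the threshold are flagged as fraud.
        predicted = [1 if s >= threshold else 0 for s in scores]
        tp = sum(1 for p, y in zip(predicted, labels) if p == 1 and y == 1)
        fp = sum(1 for p, y in zip(predicted, labels) if p == 1 and y == 0)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / n_positive if n_positive else 0.0
        f1 = (2 * precision * recall / (precision + recall)
              if precision + recall else 0.0)
        if f1 > best["f1"]:
            best = {"threshold": threshold, "f1": f1,
                    "precision": precision, "recall": recall}
    return best


labels = [0, 0, 1, 0, 1, 0, 1, 0]
scores = [0.02, 0.03, 0.09, 0.04, 0.07, 0.08, 0.01, 0.05]  # anomaly scores
result = best_f1_threshold(labels, scores)
print(result)
```

A threshold tuned on the same data it is reported on gives an optimistic F1, so the safer convention is to pick it on the validation set and apply it unchanged to the test set.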

In [11]:

# define model export path
fitted_models_base = "models/deep_learning/"
model_name = "ae_model.joblib"
joblib.dump(model, fitted_models_base + model_name)

['models/deep_learning/ae_model.joblib']