This repository contains the experimental code used in the following article:
Lee, K., Lee, J., Kong, I., et al. (2025).
On the use of supervised anomaly detection algorithms for extremely imbalanced data.
Journal of the Korean Statistical Society.
DOI: https://doi.org/10.1007/s42952-025-00347-x
The goal is to compare and analyze anomaly detection methods against imbalanced binary classification methods under extremely imbalanced settings.
Data loading → train/validation/test split → (optional) adjustment of minority class size → over/under-sampling → MinMax scaling → model training and score generation → metric storage.
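The pipeline above can be sketched roughly as follows. This is a minimal NumPy illustration on synthetic data, not the repository's actual code: the split ratios, the distance-based scorer, and all variable names are assumptions, and the sampling step is omitted for brevity.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical synthetic data standing in for a repository dataset:
# 1000 normal points (y=0) and 10 anomalies (y=1).
X_norm = rng.normal(0.0, 1.0, size=(1000, 5))
X_anom = rng.normal(4.0, 1.0, size=(10, 5))
X = np.vstack([X_norm, X_anom])
y = np.concatenate([np.zeros(1000, int), np.ones(10, int)])

# Shuffle, then split 60/20/20 into train/validation/test
# (assumed ratios; the paper's actual split may differ).
idx = rng.permutation(len(y))
X, y = X[idx], y[idx]
n_tr, n_va = int(0.6 * len(y)), int(0.2 * len(y))
X_tr, y_tr = X[:n_tr], y[:n_tr]
X_va, y_va = X[n_tr:n_tr + n_va], y[n_tr:n_tr + n_va]
X_te, y_te = X[n_tr + n_va:], y[n_tr + n_va:]

# MinMax scaling fitted on the training split only.
lo, hi = X_tr.min(axis=0), X_tr.max(axis=0)
scale = np.where(hi > lo, hi - lo, 1.0)
X_tr_s = (X_tr - lo) / scale
X_te_s = (X_te - lo) / scale

# Toy stand-in for "model training and score generation": distance
# from the normal-class training mean, so higher = more anomalous.
center = X_tr_s[y_tr == 0].mean(axis=0)
scores = np.linalg.norm(X_te_s - center, axis=1)
```

A real run would replace the toy scorer with one of the models listed below and persist the resulting metrics.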
python experiment.py -d 35 -m 0 -s 0 -l 3 -mi 3 -se 0 -g 0 --epochs 50 --lr 0.001 --batch_size 128 --thresholdq 0.95
- -d: Dataset ID
- -m: Model (0: mlp, 1: lr, 9: rf, 10: svm, 4–7: anomaly methods)
- -s: Sampling method (none/smote/adasyn, etc.)
- -l: Loss function (focal, etc.)
- -mi: Number of minority (anomaly) samples
- -se: Seed preset
- -g: GPU
- --epochs / --lr / --batch_size: Training configuration
- --thresholdq: Quantile threshold (q)
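For orientation, the flag list can be mirrored with a standard argparse parser. This is a hypothetical re-creation for illustration only; the real definitions live in experiment.py and may use different types, defaults, or choices.

```python
import argparse

# Hypothetical parser mirroring the flags documented above; the
# actual argument definitions in experiment.py may differ.
parser = argparse.ArgumentParser(description="Anomaly-detection experiment")
parser.add_argument("-d", type=int, help="dataset ID")
parser.add_argument("-m", type=int,
                    help="model (0: mlp, 1: lr, 9: rf, 10: svm, 4-7: anomaly)")
parser.add_argument("-s", type=int, help="sampling method")
parser.add_argument("-l", type=int, help="loss function")
parser.add_argument("-mi", type=int, help="number of minority samples")
parser.add_argument("-se", type=int, help="seed preset")
parser.add_argument("-g", type=int, help="GPU")
parser.add_argument("--epochs", type=int, default=50)
parser.add_argument("--lr", type=float, default=1e-3)
parser.add_argument("--batch_size", type=int, default=128)
parser.add_argument("--thresholdq", type=float, default=0.95)

# Parse the example command from this README.
args = parser.parse_args(
    "-d 35 -m 0 -s 0 -l 3 -mi 3 -se 0 -g 0 "
    "--epochs 50 --lr 0.001 --batch_size 128 --thresholdq 0.95".split()
)
```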
Anomaly: y=1, Normal: y=0. Scores are assumed to follow “higher = more anomalous.”
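Under this convention, the --thresholdq quantile can be turned into a binary decision as sketched below. The scores and variable names are illustrative, not taken from the repository.

```python
import numpy as np

rng = np.random.default_rng(1)

# Illustrative scores: 990 low (normal-like) and 10 high (anomalous),
# following "higher = more anomalous".
scores = np.concatenate([rng.normal(0, 1, 990), rng.normal(6, 1, 10)])

q = 0.95  # value passed via --thresholdq
threshold = np.quantile(scores, q)

# Flag everything above the q-quantile as an anomaly (y=1).
y_pred = (scores > threshold).astype(int)
```

With q = 0.95, roughly the top 5% of scores are flagged, which here captures the high-scoring anomalous tail.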