This repository contains the code for the following work:
Giuseppina Andresini, Annalisa Appice, Francesco Paolo Caforio, Donato Malerba
Improving Cyber-Threat Detection by Moving the Boundary around the Normal Samples
Please cite our work if you find it useful for your research and work.
@Inbook{Andresini2021,
author="Andresini, Giuseppina
and Appice, Annalisa
and Caforio, Francesco Paolo
and Malerba, Donato",
editor="Maleh, Yassine
and Shojafar, Mohammad
and Alazab, Mamoun
and Baddi, Youssef",
title="Improving Cyber-Threat Detection by Moving the Boundary Around the Normal Samples",
bookTitle="Machine Intelligence and Big Data Analytics for Cybersecurity Applications",
year="2021",
publisher="Springer International Publishing",
address="Cham",
pages="105--127",
doi="10.1007/978-3-030-57024-8_5",
url="https://doi.org/10.1007/978-3-030-57024-8_5"
}
The code requires Python 3.6+.
The required packages are:
The datasets used for the experiments are accessible from DATASETS. The original dataset is recast as a binary classification problem with the classes "threat" and "normal" (_oneCls files). The repository contains the original dataset (folder: "original") and the dataset after the preprocessing phase (folder: "numeric").
The preprocessing phase maps categorical features to numeric values and applies Min-Max scaling.
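The preprocessing steps described above can be sketched as follows. This is a minimal, dependency-free illustration, not the repository's actual code: the helper names (to_binary_label, encode_categorical, min_max_scale), the toy rows, and the choice of integer codes for categorical values are all assumptions made for the example.

```python
def to_binary_label(label, normal_name="normal"):
    """Collapse a multi-class label into 'normal' or 'threat'."""
    return "normal" if label.strip().lower() == normal_name else "threat"


def encode_categorical(values):
    """Map each distinct category to an integer code (sorted for determinism)."""
    codes = {v: i for i, v in enumerate(sorted(set(values)))}
    return [codes[v] for v in values]


def min_max_scale(values):
    """Rescale numbers to [0, 1]; a constant column maps to all zeros."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


# Toy rows (hypothetical, not taken from the repository's datasets).
protocol = ["tcp", "udp", "tcp", "icmp"]
duration = [0, 5, 10, 20]
labels = ["normal", "neptune", "normal", "smurf"]

protocol_scaled = min_max_scale(encode_categorical(protocol))
duration_scaled = min_max_scale(duration)
binary_labels = [to_binary_label(lab) for lab in labels]

print(protocol_scaled)
print(duration_scaled)
print(binary_labels)
```

In practice the minimum and maximum would normally be computed on the training set only and then reused for the test set, which is what a fitted scaler such as scikit-learn's MinMaxScaler does.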
The repository contains the scripts for all experiments included in the paper:
- main.py : script to run THEODORA
- mindful.py : script to run MINDFUL model
The repository also contains the models and datasets used for the experiments in the work.
To replicate the experiments reported in the work, use the models and datasets stored in the folders of the same name. Global variables are set in the THEODORA.conf file:
N_CLASSES = 2
PREPROCESSING1 = 0 #if set to 1, the code runs the preprocessing phase on the original data
LOAD_AUTOENCODER_ADV = 1 #if 1, the autoencoder for attack samples is loaded from the models folder
LOAD_AUTOENCODER_NORMAL = 1 #if 1, the autoencoder for normal samples is loaded from the models folder
LOAD_CNN = 1 #if 1, the classifier is loaded from the models folder
VALIDATION_SPLIT #the percentage of the data used as the validation set when training the models
CHANGE_CLASS_SVC = 1 #if set to 1, the boundary re-positioning is performed
LOAD_SVC = 1 #if 1, the SVM model for the decision boundary is loaded
THRESHOLD = 0.70 #threshold for changing the normal class
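To show how settings in this "KEY = value #comment" style could be read, here is a small sketch. It is an assumption about the file layout, not the loader used by main.py: the function name parse_conf is hypothetical, section headers in square brackets are skipped in case the real file has them, and a key listed without a value is kept as None rather than guessed.

```python
def parse_conf(text):
    """Parse 'KEY = value  # comment' lines into a dict.

    Integers and floats are converted; keys with no value map to None.
    """
    settings = {}
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop the trailing comment
        if not line or line.startswith("["):  # skip blanks and section headers
            continue
        key, sep, value = line.partition("=")
        key, value = key.strip(), value.strip()
        if not sep or not value:
            settings[key] = None  # key present, value not given
            continue
        try:
            settings[key] = int(value)
        except ValueError:
            try:
                settings[key] = float(value)
            except ValueError:
                settings[key] = value  # keep anything else as a string
    return settings


# Sample text mirroring a few of the settings listed above.
sample = """
N_CLASSES = 2
LOAD_SVC = 1 #if 1, the SVM model is loaded
VALIDATION_SPLIT #no value given here
THRESHOLD = 0.70 #threshold
"""
conf = parse_conf(sample)
print(conf)
```

With the flags set to 1, as in the listing above, the stored models would be loaded instead of retrained, which is the setup the text suggests for replicating the reported experiments.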
All models and plots produced for the decision boundary re-positioning experiment can be downloaded here.