XAI-IDS: Towards Proposing an Explainable Artificial Intelligence Framework for Enhancing Network Intrusion Detection Systems

Abstract

The exponential growth of intrusions on networked systems has inspired new research on advanced artificial intelligence (AI) techniques for intrusion detection systems (IDS). Relying on AI for IDS raises several challenges, including the performance of the AI models and the lack of explainability of their decisions, whose outputs are not understandable by the human security analyst. To close this research gap, we propose an end-to-end explainable AI (XAI) framework that makes AI models for network intrusion detection tasks easier to understand. We first benchmark eight black-box AI models on two real-world network intrusion datasets with different characteristics. We then generate local and global explanations using different XAI models. We also generate model-specific and intrusion-specific important features, as well as the common important features that affect different AI models. Our framework provides different levels of explanation that can help network security analysts make more informed decisions. We release our source code so that the community can use it as a baseline XAI framework and build on it with new datasets and models.

Performance

Overall performances for AI models with top 15 features for the RoEduNet-SIMARGL2021 dataset.

Overall performances for AI models with top 15 features for the CICIDS-2017 dataset.

Low-Level XAI Pipeline Components

Loading Intrusion Database: Load one of the datasets.
Feature Extraction: Select either the top 15 features or all features.
Redundancy Elimination and Randomizing Rows: Remove duplicate rows and shuffle the data.
Data Balancing: An oversampling technique is used.
Feature Normalization: All features are normalized to a scale from 0 to 1.
Black-box AI Models: The model used in each program (SVM, DNN, MLP, KNN, LightGBM, XGBoost, ADA, Random Forest).
Black-box AI Evaluation: The metrics used are accuracy (ACC), precision (Prec), recall (Rec), F1-score (F1), Matthews correlation coefficient (MCC), balanced accuracy (BACC), and the area under the ROC curve (AUCROC).
XAI Global Explanations: SHAP generates the Global Summary and Beeswarm Plot.
XAI Local Explanations: SHAP and LIME generate single-sample explanations.
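
The preprocessing steps listed above (duplicate removal, shuffling, oversampling, and 0-to-1 normalization) can be illustrated with a small dependency-free sketch. It does not reproduce the repository's scripts: the function name `preprocess`, the toy rows, and the choice of plain random oversampling are assumptions made for illustration, since the README does not say which oversampling technique is used.

```python
import random

def preprocess(rows, labels, seed=0):
    """Deduplicate, shuffle, oversample minority classes, and min-max scale.

    rows   : list of equal-length lists of numbers (one list per flow record)
    labels : list of class labels, same length as rows
    """
    rng = random.Random(seed)

    # 1. Redundancy elimination: keep the first occurrence of each (row, label) pair.
    seen, data = set(), []
    for row, label in zip(rows, labels):
        key = (tuple(row), label)
        if key not in seen:
            seen.add(key)
            data.append((list(row), label))

    # 2. Randomize the row order.
    rng.shuffle(data)

    # 3. Data balancing (assumed: random oversampling). Each class is resampled
    #    with replacement until it matches the size of the largest class.
    by_class = {}
    for row, label in data:
        by_class.setdefault(label, []).append(row)
    target = max(len(members) for members in by_class.values())
    balanced = []
    for label, members in by_class.items():
        extra = [rng.choice(members) for _ in range(target - len(members))]
        balanced.extend((row, label) for row in members + extra)
    rng.shuffle(balanced)

    # 4. Min-max normalization of every feature column to [0, 1].
    n_features = len(balanced[0][0])
    lo = [min(r[j] for r, _ in balanced) for j in range(n_features)]
    hi = [max(r[j] for r, _ in balanced) for j in range(n_features)]
    X = [
        [(r[j] - lo[j]) / (hi[j] - lo[j]) if hi[j] > lo[j] else 0.0
         for j in range(n_features)]
        for r, _ in balanced
    ]
    y = [label for _, label in balanced]
    return X, y

# Toy data: two made-up features, one duplicated row, imbalanced classes.
rows = [[10, 200], [10, 200], [30, 400], [50, 100], [20, 300]]
labels = ["normal", "normal", "normal", "attack", "normal"]
X, y = preprocess(rows, labels)
```

After the call, both classes have the same number of rows and every feature value lies between 0 and 1. Constant columns are mapped to 0.0 here to avoid division by zero; that is a choice of this sketch, not something the README specifies.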

How to use the programs:

For Global Explanations.

Download one of the datasets.

RoEduNet-SIMARGL2021: https://www.kaggle.com/datasets/7f91274fa3074d53e983f6eb7a7b24ad1dca136ca967ad0ebe48955e246c24ee

CICIDS-2017: https://www.kaggle.com/datasets/cicdataset/cicids2017

NSL-KDD: https://www.unb.ca/cic/datasets/nsl.html

Each program is standalone and runs one AI model on one feature set. For example, DNN_final.py in the CICIDS-2017 folder runs the DNN model with the top 15 features for that dataset, whereas DNN_all_final.py runs the DNN model with all features for the same dataset.
Each program outputs a confusion matrix, metric scores (accuracy (ACC), precision (Prec), recall (Rec), F1-score (F1), Matthews correlation coefficient (MCC), balanced accuracy (BACC), and the area under the ROC curve (AUCROC)), and the Global Summary/Beeswarm Plot.
Extra: there is a standalone example program RF_example.ipynb in the RoEduNet-SIMARGL2021 folder.
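
To make the relationship between the confusion matrix and the reported scores concrete, here is a minimal sketch that derives ACC, Prec, Rec, F1, MCC, and BACC from binary confusion-matrix counts. The function name `binary_metrics` and the counts are made up for illustration, and the binary (attack versus normal) setting is an assumption; the datasets contain several traffic classes, and how the programs average across classes is not stated in this README. AUCROC is left out because it needs per-sample scores rather than counts.

```python
import math

def binary_metrics(tp, fp, tn, fn):
    """Compute scalar metrics from binary confusion-matrix counts."""
    def ratio(num, den):
        # Return 0.0 instead of failing when a denominator is zero.
        return num / den if den else 0.0

    acc = ratio(tp + tn, tp + fp + tn + fn)
    prec = ratio(tp, tp + fp)
    rec = ratio(tp, tp + fn)    # true positive rate
    tnr = ratio(tn, tn + fp)    # true negative rate
    f1 = ratio(2 * prec * rec, prec + rec)
    bacc = (rec + tnr) / 2
    mcc = ratio(tp * tn - fp * fn,
                math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)))
    return {"ACC": acc, "Prec": prec, "Rec": rec,
            "F1": f1, "MCC": mcc, "BACC": bacc}

# Hypothetical counts, not results from the paper.
scores = binary_metrics(tp=90, fp=10, tn=80, fn=20)
for name, value in scores.items():
    print(f"{name}: {value:.4f}")
```

MCC and BACC are useful next to plain accuracy because they stay informative when one traffic class dominates the data.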

For Local Explanations.

Download one of the datasets.

RoEduNet-SIMARGL2021: https://www.kaggle.com/datasets/7f91274fa3074d53e983f6eb7a7b24ad1dca136ca967ad0ebe48955e246c24ee

CICIDS-2017: https://www.kaggle.com/datasets/cicdataset/cicids2017

NSL-KDD: https://www.unb.ca/cic/datasets/nsl.html

Run the example Python notebook "RF_LIME_SHAP.ipynb" in a Python notebook environment.
The program outputs one local SHAP waterfall explanation and one local LIME explanation for the same sample, using the Random Forest model.
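
As background for reading a local SHAP waterfall plot, the sketch below computes exact Shapley values for a single sample by enumerating feature coalitions. It does not call the `shap` or `lime` libraries and is not the notebook's code. The scoring function `toy_score`, the three feature names, and the single reference point `baseline` are hypothetical; SHAP normally averages over a background dataset rather than using one reference row.

```python
from itertools import combinations
from math import factorial

def shapley_values(predict, x, baseline):
    """Exact Shapley values for one sample by enumerating feature coalitions.

    Features outside a coalition take their baseline value. The cost grows
    exponentially with the number of features, so this only suits toy inputs.
    """
    n = len(x)

    def value(coalition):
        mixed = [x[i] if i in coalition else baseline[i] for i in range(n)]
        return predict(mixed)

    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for size in range(n):
            # Shapley weight for coalitions of this size that exclude feature i.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                phi[i] += weight * (value(s | {i}) - value(s))
    return phi

# Hypothetical scoring function over three made-up flow features.
def toy_score(f):
    packets, duration, syn_flag = f
    return 0.1 * packets + 0.5 * syn_flag + 0.2 * packets * syn_flag - 0.05 * duration

sample = [4.0, 2.0, 1.0]
baseline = [1.0, 1.0, 0.0]
phi = shapley_values(toy_score, sample, baseline)
base_value = toy_score(baseline)

# The contributions add up from the base value to the model output,
# which is the decomposition a waterfall plot draws bar by bar.
print(base_value, phi, base_value + sum(phi), toy_score(sample))
```

LIME takes a different route: it fits a simple surrogate model around the sample, so its feature weights need not match the Shapley values for the same prediction.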

Visualization results

Global Summary/Beeswarm plots with SHAP.

Results example for Random Forest using RoEduNet-SIMARGL2021.

Local Explanation with LIME and SHAP.

Results using SHAP and LIME for the same Random Forest prediction for a normal traffic sample from the CICIDS-2017 dataset.

XAI-Framework.

Preprocessing File:

To begin using our framework, run the preprocessing file. This file is designed to process your raw data, encompassing steps like normalization, encoding, and feature selection. Upon completion, it will output four key datasets: X_train, X_test, Y_train, and Y_test. These datasets are crucial for feeding into the subsequent model training and testing phases.
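
The final step described above, producing X_train, X_test, Y_train, and Y_test, can be sketched as a shuffled split. This is only an illustration: the helper name `split_train_test`, the 30% test fraction, and the fixed seed are assumptions, since the README does not state the split ratio or how the preprocessing file performs the split.

```python
import random

def split_train_test(X, y, test_fraction=0.3, seed=0):
    """Shuffle row indices and split features and labels into train and test parts."""
    idx = list(range(len(X)))
    random.Random(seed).shuffle(idx)
    n_test = int(round(len(X) * test_fraction))
    test_idx, train_idx = idx[:n_test], idx[n_test:]
    X_train = [X[i] for i in train_idx]
    X_test = [X[i] for i in test_idx]
    Y_train = [y[i] for i in train_idx]
    Y_test = [y[i] for i in test_idx]
    return X_train, X_test, Y_train, Y_test

# Toy data: ten single-feature rows with alternating labels.
X = [[float(i)] for i in range(10)]
y = [i % 2 for i in range(10)]
X_train, X_test, Y_train, Y_test = split_train_test(X, y)
print(len(X_train), len(X_test))
```

Every row ends up in exactly one of the two parts, and each label stays aligned with its feature row.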

All_Model File:

For users looking to test our architecture with their data, the All_Model file is your next step. This interactive file allows you to choose from seven different machine learning algorithms, depending on your specific needs or experimental setup. The options range from ensemble methods like AdaBoost and Random Forest to neural networks, KNN, MLP, and SGD. Simply run the All_Model file, select your desired algorithm, and it will automatically apply it to the datasets generated from your preprocessing step.

Framework Folder:

A pivotal component of our project is the XAI_Framework folder, focusing on Explainable AI (XAI). This folder contains tools and functions for analyzing and visualizing the decision-making process of the models. It is designed to be modified or extended as per your requirements, offering insights into how and why specific model predictions are made.

Modularity and Flexibility:

Our project's architecture is intentionally modular and user-friendly. Each component - preprocessing, model training, and XAI - is independent yet seamlessly integrated. This design ensures that small changes in one part do not impact the overall functionality, offering a flexible and adaptable environment for users. Whether you're conducting academic research or applying it in industry, our framework is equipped to cater to a wide range of applications and user scenarios.

Extra: Xplique toolbox comparison.

We want to cite the following pages for this section.

IGTD github: https://github.com/zhuyitan/IGTD
Xplique github: https://github.com/deel-ai/xplique

To generate the results below, please go to the Xplique folder and:

Convert the tabular dataset into images using tabconversion_cic.py
Run the metrics_cic.ipynb
Run the Attributions_Regression_CIC.ipynb

Feature importance from Xplique on the CICIDS-2017 dataset for SHAP (left) and LIME (right).

Xplique metrics (Deletion, Insertion, MuFidelity, and Stability) and their corresponding results for SHAP and LIME on our three datasets.

Citation:

Please cite this work if it was useful to you :)

https://www.mdpi.com/2076-3417/14/10/4170

ogarreche/XAI_NIDS
