aaaastark/Intrusion-Detection-System

Attack Detection, Parameter Optimization and Performance Analysis in Enterprise Networks (ML Networks) for Intrusion Detection Systems (IDS).

Get Started with Relevant Project Implementation

  • If you're looking for assistance with a project implementation that aligns with your needs, feel free to get in touch with us via LinkedIn.
  • To get in touch with us and discuss your project implementation needs, please send an email to 4444stark@gmail.com.
  • Thank you for considering our services. We look forward to working with you!

An Intrusion Detection System (IDS) is one of the most trusted security layers an organization can use to defend against cyber attacks, and such systems are now ubiquitous. In the proposed thesis, we present an experimental analysis to empirically find the optimal parameter settings of 4 classification techniques from 2 machine learning families. To investigate the studied algorithms, we choose 2 widely used IDS datasets (CIC-IDS2017, CSE-CIC-IDS2018) that resemble real-world network traffic for both benign and malicious activities. The CatBoost and LightGBM algorithms work well for both binary classification and multiclass classification of malicious traffic into several attack groups. On these datasets we conduct experiments on binary classification, i.e., given an IDS log, we predict whether the log is benign (Normal) or malicious (Attacks). We then go a step further and analyze the algorithms' performance on multiclass classification, where the aim is to detect the type of attack in each dataset, such as DoS/DDoS, PortScan, BruteForce, WebAttack, Bot and Infiltration.
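As an illustration of the two tasks described above, the following minimal sketch derives a binary target and a multiclass attack group from a raw flow label. The raw label spellings, the keyword rules, and the "Other" fallback group are assumptions made for this example and are not taken from the project's code.

```python
# Hypothetical label handling: the keyword rules below are assumptions,
# not the project's actual mapping.
BENIGN_LABEL = "BENIGN"

# (keyword, group) pairs checked in order; the first match wins.
GROUP_RULES = [
    ("web attack", "WebAttack"),
    ("ddos", "DoS/DDoS"),
    ("dos", "DoS/DDoS"),
    ("portscan", "PortScan"),
    ("patator", "BruteForce"),
    ("brute", "BruteForce"),
    ("bot", "Bot"),
    ("infiltration", "Infiltration"),
]


def to_binary(raw_label: str) -> int:
    """Binary target: 0 for benign (Normal), 1 for any attack."""
    return 0 if raw_label.strip().upper() == BENIGN_LABEL else 1


def to_attack_group(raw_label: str) -> str:
    """Multiclass target: collapse a raw label into one attack group."""
    text = raw_label.strip().lower()
    if text == BENIGN_LABEL.lower():
        return "Normal"
    for keyword, group in GROUP_RULES:
        if keyword in text:
            return group
    return "Other"  # assumed fallback for labels outside the listed groups


for label in ["BENIGN", "DoS Hulk", "FTP-Patator", "Web Attack - Brute Force"]:
    print(label, to_binary(label), to_attack_group(label))
```

The "web attack" rule is placed first so that a web brute-force label is grouped as WebAttack rather than BruteForce.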

Research Paper: Attack Detection in Enterprise Networks by Machine Learning Methods

Research Paper: Parameter Optimization and Performance Analysis of State-of-the-Art Machine Learning Techniques for Intrusion Detection System (IDS)

Loading the dataset (Canadian Institute for Cybersecurity) Intrusion Detection Evaluation Dataset (CIC-IDS2017)

The CIC-IDS2017 dataset contains benign traffic and the most up-to-date common attacks, and it resembles true real-world data (PCAPs). It also includes the results of the network traffic analysis using CICFlowMeter, with flows labeled based on the timestamp, source and destination IPs, source and destination ports, protocols and attack (CSV files).
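A minimal sketch of reading such a labeled flow CSV is shown below, using only the Python standard library. The sample rows are synthetic, and the column names, the padded headers, and the presence of `Infinity`/`NaN` values are assumptions for illustration; the real files should be inspected before relying on any of them.

```python
import csv
import io
import math

# Tiny synthetic sample, NOT real dataset rows; column names are assumptions.
SAMPLE_CSV = """ Destination Port, Flow Duration,Flow Bytes/s, Label
80,1200,350.5,BENIGN
443,0,Infinity,BENIGN
22,900,NaN,SSH-Patator
80,45,12000.0,DoS Hulk
"""


def load_clean_rows(handle, label_column="Label"):
    """Return (features, labels), skipping rows with unusable feature values."""
    features, labels = [], []
    for raw in csv.DictReader(handle):
        # Strip whitespace padding from header names and cell values.
        row = {key.strip(): (value or "").strip() for key, value in raw.items()}
        label = row.pop(label_column)
        try:
            values = {name: float(text) for name, text in row.items()}
        except ValueError:
            continue  # non-numeric or empty feature cell
        # Drop rows containing infinite or NaN feature values.
        if all(math.isfinite(v) for v in values.values()):
            features.append(values)
            labels.append(label)
    return features, labels


features, labels = load_clean_rows(io.StringIO(SAMPLE_CSV))
print(len(features), labels)
```

With pandas, which the project lists among its libraries, the same idea would typically be expressed with `read_csv` followed by replacing infinite values and dropping missing rows.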

Loading the dataset (Canadian Institute for Cybersecurity) Communications Security Establishment (CSE) & the Canadian Institute for Cybersecurity (CIC) CSE-CIC-IDS2018 on AWS

The CSE-CIC-IDS2018 dataset uses the notion of profiles to generate data in a systematic manner. These profiles contain detailed descriptions of intrusions and abstract distribution models for applications, protocols, or lower-level network entities.

Workflow used in this Project

  • Data Processing (ETL, Wrangling)
  • Data Normalization
  • Binary Class Classification
  • Multi Class Classification
  • Feature Extraction (BC and MC)
  • LightGBM and CatBoost (Gradient Boosting)
  • Visualization (Classification Report and Confusion Matrix Plot)
  • Accuracy, Mean Absolute Error, Mean Squared Error, Root Mean Squared Error, and R2 Score
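The evaluation steps at the end of the workflow above can be sketched as follows. This is a dependency-free illustration under the assumption that class labels are integer-encoded; the function names and the toy predictions are hypothetical and do not reproduce the project's results.

```python
import math


def confusion_matrix(y_true, y_pred, n_classes):
    """Rows are true classes, columns are predicted classes."""
    matrix = [[0] * n_classes for _ in range(n_classes)]
    for t, p in zip(y_true, y_pred):
        matrix[t][p] += 1
    return matrix


def evaluate(y_true, y_pred):
    """Accuracy, MAE, MSE, RMSE and R2 on integer-encoded labels."""
    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]
    ss_res = sum(e * e for e in errors)
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    mse = ss_res / n
    return {
        "accuracy": sum(e == 0 for e in errors) / n,
        "mae": sum(abs(e) for e in errors) / n,
        "mse": mse,
        "rmse": math.sqrt(mse),
        # R2 is undefined when every true label is identical.
        "r2": 1.0 - ss_res / ss_tot if ss_tot else float("nan"),
    }


# Toy labels for three classes (illustrative only).
y_true = [0, 0, 1, 1, 2, 2]
y_pred = [0, 0, 1, 2, 2, 2]

matrix = confusion_matrix(y_true, y_pred, n_classes=3)
scores = evaluate(y_true, y_pred)
print(matrix)
print(scores)
```

In practice `sklearn.metrics` offers `accuracy_score`, `mean_absolute_error`, `mean_squared_error`, `r2_score` and `confusion_matrix` for the same quantities. Note that the error-based scores depend on how classes are mapped to integers, so for multiclass results they are best read alongside accuracy and the confusion matrix.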

APIs that are used in this Project

  • tensorflow
  • sklearn
  • keras
  • lightgbm
  • catboost
  • matplotlib
  • numpy
  • pandas

Paper vs Proposed Results: Attack Detection, Parameter Optimization and Performance Analysis in Enterprise Networks (ML Networks) for Intrusion Detection Systems (IDS).

Binary Class Classification (Proposed Results): LightGBM (Gradient Boosting)

Binary Class Classification (Proposed Results): CatBoost (Gradient Boosting)

Multi Class Classification (Proposed Results): LightGBM (Gradient Boosting)

Multi Class Classification (Proposed Results): CatBoost (Gradient Boosting)