readysetgit24/CS418-Final-Project

Network Anomaly Detection in Cybersecurity

Anomaly detection has been a main focus of many researchers because of its potential to detect novel attacks. However, its adoption in real-world applications has been hampered by system complexity, as these systems require substantial testing, evaluation, and tuning before deployment.

Our project aims to help both cybersecurity experts and non-experts by notifying them of potential malicious activity and its nature. The dataset we use was generated by the Canadian Institute for Cybersecurity (CIC) and the Communications Security Establishment (CSE) to support the use of anomaly detection techniques for detecting network intrusion.

The attacking infrastructure includes 50 machines, and the victim organization has 5 departments with 420 machines and 30 servers. The dataset includes the captured network traffic and system logs of each machine, along with 80 features extracted from the captured traffic using CICFlowMeter-V3. [1]

We use AWS SageMaker for pre-processing, exploratory data analysis (EDA), model training, and testing.

The dataset used in this project is "A Realistic Cyber Defense Dataset (CSE-CIC-IDS2018)". It was accessed on 14 Nov 2021 from https://registry.opendata.aws/cse-cic-ids2018

Concatenating the dataset

We load the CSV files into a single pandas DataFrame and reduce its memory usage. This is done in the Preprocessing for Pickling file.
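The concatenation step can be sketched roughly as follows. This is a minimal illustration, not the project's notebook code: the helper names `downcast_numeric` and `load_and_concat` and the directory argument are hypothetical, and downcasting numeric dtypes is only one plausible way to reduce memory.

```python
from pathlib import Path

import numpy as np
import pandas as pd


def downcast_numeric(df: pd.DataFrame) -> pd.DataFrame:
    """Return a copy whose numeric columns use the smallest dtype that fits."""
    out = df.copy()
    for col in out.select_dtypes(include=np.integer).columns:
        out[col] = pd.to_numeric(out[col], downcast="integer")
    for col in out.select_dtypes(include=np.floating).columns:
        # Float downcasting goes to float32 and may lose some precision.
        out[col] = pd.to_numeric(out[col], downcast="float")
    return out


def load_and_concat(csv_dir: str) -> pd.DataFrame:
    """Read every CSV in csv_dir, shrink each frame, then stack them row-wise."""
    frames = [
        downcast_numeric(pd.read_csv(path))
        for path in sorted(Path(csv_dir).glob("*.csv"))
    ]
    return pd.concat(frames, ignore_index=True)
```

The combined frame could then be saved with `DataFrame.to_pickle` so later steps do not need to re-parse the CSV files.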

Cleaning the dataset

We remove outliers and invalid entries from the dataset in Unpickle, Clean and Drop.
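As a rough sketch of what such a cleaning pass might look like, the example below treats infinite and missing values as invalid entries and uses an interquartile-range (IQR) fence to flag outliers. Both choices, the function name `clean`, and the multiplier `k` are assumptions made for illustration; the notebook may use different criteria.

```python
import numpy as np
import pandas as pd


def clean(df: pd.DataFrame, k: float = 3.0) -> pd.DataFrame:
    """Drop rows with inf/NaN values, then rows outside the IQR fences."""
    # Treat +/-inf as missing so a single dropna() removes all invalid rows.
    out = df.replace([np.inf, -np.inf], np.nan).dropna()

    # Compute per-column fences on the numeric features only.
    features = out.select_dtypes(include=np.number)
    q1 = features.quantile(0.25)
    q3 = features.quantile(0.75)
    iqr = q3 - q1

    # Keep a row only if every numeric feature lies within its fences.
    in_range = ((features >= q1 - k * iqr) & (features <= q3 + k * iqr)).all(axis=1)
    return out[in_range].reset_index(drop=True)
```

Note that an IQR fence is aggressive on columns that are mostly constant (where the IQR is zero), so in practice the rule would likely be applied to selected columns only.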

Exploratory Data Analysis

We examine correlation heatmaps, both per class label and for the data as a whole, in EDA.
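The matrices behind such heatmaps can be computed along these lines. This is a sketch under assumptions: the label column name `Label` and the function name `correlations_by_label` are illustrative, and the label column is assumed to be non-numeric.

```python
import pandas as pd


def correlations_by_label(df: pd.DataFrame, label_col: str = "Label") -> dict:
    """Return Pearson correlation matrices for the whole data and per label."""
    numeric_cols = df.select_dtypes(include="number").columns
    result = {"all": df[numeric_cols].corr()}
    for label, group in df.groupby(label_col):
        result[str(label)] = group[numeric_cols].corr()
    return result
```

Each returned matrix can be passed to a plotting function such as `seaborn.heatmap` to draw the corresponding heatmap.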

Baseline model

We train and evaluate a baseline model in LogisticRegression.
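A baseline of this kind could look roughly like the sketch below, which uses scikit-learn. The synthetic data from `make_classification` stands in for the processed dataset, and the scaling step, split ratio, and function name `train_baseline` are assumptions rather than the notebook's actual settings.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler


def train_baseline(X, y, seed: int = 0):
    """Fit a scaled logistic regression and return (model, held-out accuracy)."""
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.25, stratify=y, random_state=seed
    )
    model = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
    model.fit(X_train, y_train)
    return model, accuracy_score(y_test, model.predict(X_test))


# Synthetic stand-in for the processed flow features and labels.
X, y = make_classification(
    n_samples=500, n_features=10, n_informative=5,
    n_clusters_per_class=1, class_sep=2.0, random_state=0,
)
baseline_model, baseline_accuracy = train_baseline(X, y)
```

On heavily imbalanced intrusion data, accuracy alone can be misleading, so a per-class metric such as macro F1 may be a better yardstick for later models.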

Model selection and hyperparameter tuning

We evaluate 3 ensemble models on a sample of the processed data and then perform hyperparameter tuning on the full dataset in Model Selection.
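The two-stage workflow described here (compare candidates on a sample, then tune on the full data) can be sketched as below. The three ensemble classifiers, the parameter grids, the macro-F1 scoring, and the name `select_and_tune` are all placeholders chosen for illustration; the source does not say which models or grids were used.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import (
    ExtraTreesClassifier,
    GradientBoostingClassifier,
    RandomForestClassifier,
)
from sklearn.model_selection import GridSearchCV, cross_val_score


def select_and_tune(X, y, sample_size: int = 300, seed: int = 0):
    """Pick the best candidate on a random sample, then grid-search it on all data."""
    rng = np.random.default_rng(seed)
    idx = rng.choice(len(X), size=min(sample_size, len(X)), replace=False)
    X_s, y_s = X[idx], y[idx]

    # Placeholder candidates and grids (assumptions, not the project's choices).
    candidates = {
        "random_forest": RandomForestClassifier(n_estimators=25, random_state=seed),
        "extra_trees": ExtraTreesClassifier(n_estimators=25, random_state=seed),
        "gradient_boosting": GradientBoostingClassifier(n_estimators=25, random_state=seed),
    }
    grids = {
        "random_forest": {"max_depth": [4, None]},
        "extra_trees": {"max_depth": [4, None]},
        "gradient_boosting": {"learning_rate": [0.05, 0.1]},
    }

    # Stage 1: cheap cross-validated comparison on the sample.
    scores = {
        name: cross_val_score(model, X_s, y_s, cv=3, scoring="f1_macro").mean()
        for name, model in candidates.items()
    }
    best = max(scores, key=scores.get)

    # Stage 2: hyperparameter search for the winner on the full data.
    search = GridSearchCV(candidates[best], grids[best], cv=3, scoring="f1_macro")
    search.fit(X, y)
    return best, scores, search
```

Comparing on a sample keeps the first stage cheap; only the selected model pays the cost of a full-data search.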

References

[1] https://www.unb.ca/cic/datasets/ids-2018.html
