MSCS 640 · Artificial Intelligence and Cyber Security · Spring 2026
An ML-based NIDS that classifies raw network flows as Benign or a specific attack type (DDoS, PortScan, Bot, DoS, Infiltration, Web Attack, Brute Force) using the CIC-IDS2017 benchmark dataset.
```
nids_project/
├── main.py              # Training entry point (run this first)
├── app.py               # Streamlit SOC dashboard
├── requirements.txt
├── README.md
└── src/
    ├── config.py        # All paths, hyper-params, label maps, colours
    ├── data_pipeline.py # Load → clean → SMOTE → split → scale
    ├── model.py         # XGBoost / MLP build, train, save, load
    └── evaluate.py      # Confusion matrix, ROC curves, F1 report
```
```bash
pip install -r requirements.txt
```

```bash
# Recommended (XGBoost + SMOTE resampling):
python main.py

# MLP neural network alternative:
python main.py --model mlp

# Skip SMOTE (faster on low-memory machines):
python main.py --no-resample
```

Training artefacts are saved to `models/` and evaluation reports to `reports/`.

```bash
streamlit run app.py
```

Open http://localhost:8501 and upload any CIC-IDS2017 CSV to get an instant threat report.
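The flags above suggest a small command-line interface in `main.py`. A minimal sketch of how such a parser could look, assuming `argparse` is used; the function name `build_parser` and the defaults are hypothetical, not taken from the actual code:

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    # Hypothetical parser mirroring the flags shown above;
    # the real main.py may wire these up differently.
    parser = argparse.ArgumentParser(description="Train the NIDS classifier")
    parser.add_argument(
        "--model", choices=["xgboost", "mlp"], default="xgboost",
        help="Classifier to train (default: XGBoost baseline)")
    parser.add_argument(
        "--no-resample", action="store_true",
        help="Skip SMOTE resampling (faster on low-memory machines)")
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args()
    print(f"model={args.model} resample={not args.no_resample}")
```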
| Property | Value |
|---|---|
| Dataset | CIC-IDS2017 |
| Source | Canadian Institute for Cybersecurity |
| Files | 8 daily traffic captures |
| Features | ~80 packet-flow statistics |
| Target | Multi-class (Benign + 7 attack families) |
| Category | Raw labels mapped |
|---|---|
| BENIGN | BENIGN |
| DDoS | DDoS |
| PortScan | PortScan |
| Bot | Bot |
| Infiltration | Infiltration |
| Web Attack | Web Attack – Brute Force / XSS / Sql Injection |
| DoS | DoS Hulk / GoldenEye / slowloris / Slowhttptest / Heartbleed |
| Brute Force | FTP-Patator / SSH-Patator |
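The mapping table above can be expressed as a plain dictionary. This is a hypothetical sketch; the real map lives in `src/config.py` and may use different names or handle the raw labels differently:

```python
# Hypothetical canonical label map mirroring the table above.
LABEL_MAP = {
    "BENIGN": "BENIGN",
    "DDoS": "DDoS",
    "PortScan": "PortScan",
    "Bot": "Bot",
    "Infiltration": "Infiltration",
    "Web Attack – Brute Force": "Web Attack",
    "Web Attack – XSS": "Web Attack",
    "Web Attack – Sql Injection": "Web Attack",
    "DoS Hulk": "DoS",
    "DoS GoldenEye": "DoS",
    "DoS slowloris": "DoS",
    "DoS Slowhttptest": "DoS",
    "Heartbleed": "DoS",
    "FTP-Patator": "Brute Force",
    "SSH-Patator": "Brute Force",
}


def canonical_label(raw: str) -> str:
    # Fall back to the (stripped) raw label if it is not in the map.
    return LABEL_MAP.get(raw.strip(), raw.strip())
```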
- Strip column-name whitespace (CIC-IDS2017 quirk)
- Replace ±∞ with NaN, drop rows with missing values
- Remove exact duplicates
- Map raw labels to canonical attack categories
- Imbalance handling: RandomUnderSampler (majority class) → SMOTE (minority classes)
- Standardise with `StandardScaler` (fit on train, transform both sets)
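The steps above can be sketched with pandas and scikit-learn. This is a minimal sketch, not the project's `data_pipeline.py`: the column name `Label` is an assumption, and the imbalanced-learn resampling step is only indicated in a comment:

```python
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler


def clean(df: pd.DataFrame) -> pd.DataFrame:
    # Strip the stray whitespace CIC-IDS2017 leaves in column names.
    df = df.rename(columns=str.strip)
    # Replace +/-inf with NaN, then drop incomplete rows and exact duplicates.
    return df.replace([np.inf, -np.inf], np.nan).dropna().drop_duplicates()


def split_and_scale(df: pd.DataFrame, label_col: str = "Label"):
    X = df.drop(columns=[label_col])
    y = df[label_col]
    # Stratified 80/20 split keeps class proportions in both sets.
    X_tr, X_te, y_tr, y_te = train_test_split(
        X, y, test_size=0.2, stratify=y, random_state=42)
    # In the full pipeline, RandomUnderSampler + SMOTE (imbalanced-learn)
    # would rebalance (X_tr, y_tr) here, before scaling.
    scaler = StandardScaler().fit(X_tr)  # fit on train only: no leakage
    return scaler.transform(X_tr), scaler.transform(X_te), y_tr, y_te
```

Fitting the scaler on the training split only is what keeps the held-out test set untouched, per the guardrails below.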
| Stage | Architecture | Primary Metric |
|---|---|---|
| Baseline | XGBoost (hist tree method) | Macro F1-Score |
| Advanced | MLP (256 → 128 → 64, ReLU, Adam) | Macro F1-Score |
Why Macro F1? Plain accuracy is misleading in highly imbalanced datasets (predicting BENIGN always would give >99% accuracy). Macro F1 treats every class equally regardless of support.
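A tiny illustration of the point, using scikit-learn on made-up numbers (99 benign flows, 1 attack, and a degenerate classifier that always predicts BENIGN):

```python
from sklearn.metrics import accuracy_score, f1_score

y_true = ["BENIGN"] * 99 + ["DDoS"]
y_pred = ["BENIGN"] * 100  # always-benign classifier

print(accuracy_score(y_true, y_pred))             # 0.99 -- looks great
print(f1_score(y_true, y_pred, average="macro"))  # ~0.50 -- exposes the miss
```

Macro F1 averages the per-class F1 scores, so the zero score on the missed DDoS class drags the metric down even though that class has almost no support.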
Streamlit dashboard with dark SOC-terminal UI:
- CSV upload → real-time classification
- Summary metrics, donut chart, bar chart, confidence timeline
- Colour-coded malicious-flow table
- One-click CSV export of the full threat report
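The summary metrics feeding those widgets reduce to simple pandas aggregations over the prediction table. A hedged sketch with a hypothetical helper name and column name (`Prediction`); the real `app.py` may structure this differently:

```python
import pandas as pd


def threat_summary(preds: pd.DataFrame, label_col: str = "Prediction") -> dict:
    # Hypothetical helper: per-class counts plus the share of malicious
    # (non-BENIGN) flows, as shown in the dashboard's metric cards.
    counts = preds[label_col].value_counts().to_dict()
    malicious = (preds[label_col] != "BENIGN").mean()
    return {
        "total_flows": len(preds),
        "class_counts": counts,
        "malicious_fraction": float(malicious),
    }


# The one-click export maps to a plain DataFrame-to-CSV dump:
# preds.to_csv("threat_report.csv", index=False)
```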
| File | Description |
|---|---|
| `classification_report.txt` | Per-class precision, recall, F1, macro averages |
| `confusion_matrix.png` | Labelled confusion-matrix heatmap |
| `roc_curves.png` | Per-class ROC curves with AUC scores |
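A minimal sketch of how `evaluate.py` could produce the text report and the raw confusion matrix (the heatmap and ROC plots would be rendered from these arrays); the toy labels here are made up for illustration:

```python
from sklearn.metrics import classification_report, confusion_matrix

y_true = ["BENIGN", "DDoS", "BENIGN", "PortScan"]
y_pred = ["BENIGN", "DDoS", "DDoS", "PortScan"]

# Per-class precision/recall/F1 plus macro averages, as a plain string
# that can be written to classification_report.txt.
report = classification_report(y_true, y_pred, zero_division=0)

# Row = true class, column = predicted class, in the given label order.
cm = confusion_matrix(y_true, y_pred, labels=["BENIGN", "DDoS", "PortScan"])

print(report)
print(cm)
```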
- No data leakage — validation strictly on held-out test set (stratified 80/20 split)
- Open-source libraries only (scikit-learn, XGBoost, imbalanced-learn, Streamlit)
- All team members must commit to the Git repository
| Name | Role |
|---|---|
| Member 1 | Data Engineering, Model Training |
| Member 2 | Dashboard Development, Evaluation |