atk239/ARGUS


NIDS — Network Intrusion Detection System

MSCS 640 · Artificial Intelligence and Cyber Security · Spring 2026

An ML-based NIDS that classifies raw network flows as Benign or a specific attack type (DDoS, PortScan, Bot, DoS, Infiltration, Web Attack, Brute Force) using the CIC-IDS2017 benchmark dataset.


Project Structure

nids_project/
├── main.py               # Training entry point (run this first)
├── app.py                # Streamlit SOC dashboard
├── requirements.txt
├── README.md
└── src/
    ├── config.py         # All paths, hyper-params, label maps, colours
    ├── data_pipeline.py  # Load → clean → SMOTE → split → scale
    ├── model.py          # XGBoost / MLP build, train, save, load
    └── evaluate.py       # Confusion matrix, ROC curves, F1 report

Quick Start

1 · Install dependencies

pip install -r requirements.txt

2 · Train the model

# Recommended (XGBoost + SMOTE resampling):
python main.py

# MLP neural network alternative:
python main.py --model mlp

# Skip SMOTE (faster on low-memory machines):
python main.py --no-resample

Training artefacts are saved to models/ and evaluation reports to reports/.

3 · Launch the dashboard

streamlit run app.py

Open http://localhost:8501 and upload any CIC-IDS2017 CSV to get an instant threat report.


Dataset

| Property | Value |
|----------|-------|
| Dataset  | CIC-IDS2017 |
| Source   | Canadian Institute for Cybersecurity |
| Files    | 8 CSV files covering five days of captured traffic |
| Features | ~80 packet-flow statistics |
| Target   | Multi-class (Benign + 7 attack families) |

Attack categories

| Category     | Raw labels mapped |
|--------------|-------------------|
| BENIGN       | BENIGN |
| DDoS         | DDoS |
| PortScan     | PortScan |
| Bot          | Bot |
| Infiltration | Infiltration |
| Web Attack   | Web Attack – Brute Force / XSS / Sql Injection |
| DoS          | DoS Hulk / GoldenEye / slowloris / Slowhttptest / Heartbleed |
| Brute Force  | FTP-Patator / SSH-Patator |
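
The mapping in the table above could be expressed as a small lookup, sketched below. The names `CATEGORY_BY_RAW_LABEL` and `to_category` are hypothetical and are not taken from `src/config.py`. The prefix match for Web Attack labels is an assumption made here because the dash in those labels can come through mis-encoded, depending on how the CSV is read.

```python
# Hypothetical lookup mirroring the category table; not the project's config.
CATEGORY_BY_RAW_LABEL = {
    "BENIGN": "BENIGN",
    "DDoS": "DDoS",
    "PortScan": "PortScan",
    "Bot": "Bot",
    "Infiltration": "Infiltration",
    "DoS Hulk": "DoS",
    "DoS GoldenEye": "DoS",
    "DoS slowloris": "DoS",
    "DoS Slowhttptest": "DoS",
    "Heartbleed": "DoS",
    "FTP-Patator": "Brute Force",
    "SSH-Patator": "Brute Force",
}


def to_category(raw_label: str) -> str:
    """Map one raw CIC-IDS2017 label to its canonical category."""
    label = raw_label.strip()
    # Match Web Attack variants on the prefix so a mis-encoded dash
    # between "Web Attack" and the sub-type does not break the lookup.
    if label.startswith("Web Attack"):
        return "Web Attack"
    try:
        return CATEGORY_BY_RAW_LABEL[label]
    except KeyError:
        # Fail loudly instead of silently dropping an unknown label.
        raise ValueError(f"Unmapped label: {raw_label!r}") from None
```

Raising on an unknown label is a deliberate choice in this sketch: a silently unmapped class would distort the per-class metrics later on.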

Methodology

Phase 1 — Data Engineering

  • Strip column-name whitespace (CIC-IDS2017 quirk)
  • Replace ±∞ with NaN, drop rows with missing values
  • Remove exact duplicates
  • Map raw labels to canonical attack categories
  • Imbalance handling: RandomUnderSampler (majority class) → SMOTE (minority classes)
  • Standardise with StandardScaler (fit on train, transform both sets)
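
The first three cleaning steps above can be sketched with pandas as follows. This is an illustration only, not the code in `src/data_pipeline.py`; the function name `clean_flows` and the three-column toy frame are assumptions made for the example.

```python
import numpy as np
import pandas as pd


def clean_flows(df: pd.DataFrame) -> pd.DataFrame:
    """Strip column names, drop rows with inf/NaN, then drop exact duplicates."""
    df = df.copy()
    # Remove stray leading/trailing whitespace from the column names.
    df.columns = df.columns.str.strip()
    # Treat +inf and -inf as missing, then drop every row with a missing value.
    df = df.replace([np.inf, -np.inf], np.nan).dropna()
    # Remove rows that are identical across all columns.
    return df.drop_duplicates().reset_index(drop=True)


# Toy frame (illustrative values only): one valid row, one row with inf,
# one exact duplicate of the first row, and one row with NaN.
raw = pd.DataFrame({
    " Flow Duration": [10, 0, 10, 25],
    "Flow Bytes/s": [120.0, np.inf, 120.0, np.nan],
    " Label": ["BENIGN", "DDoS", "BENIGN", "Bot"],
})
cleaned = clean_flows(raw)
print(list(cleaned.columns), len(cleaned))
```

Only the first row survives: the second is dropped for the infinite value, the third as a duplicate, and the fourth for the missing value.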

Phase 2 — Model Development

| Stage    | Architecture                      | Primary Metric |
|----------|-----------------------------------|----------------|
| Baseline | XGBoost (hist tree method)        | Macro F1-Score |
| Advanced | MLP (256 → 128 → 64, ReLU, Adam)  | Macro F1-Score |
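
As a rough illustration of the MLP row, the layer sizes, activation, and optimiser can be written with scikit-learn's `MLPClassifier`. Whether `src/model.py` uses scikit-learn or another framework is not stated here, so treat `build_mlp`, the `max_iter` value, and the synthetic data as assumptions.

```python
import numpy as np
from sklearn.neural_network import MLPClassifier


def build_mlp(random_state: int = 42) -> MLPClassifier:
    """Hypothetical builder: three hidden layers (256, 128, 64), ReLU, Adam."""
    return MLPClassifier(
        hidden_layer_sizes=(256, 128, 64),
        activation="relu",
        solver="adam",
        max_iter=50,  # kept small for the demo; may emit a convergence warning
        random_state=random_state,
    )


# Synthetic two-class data standing in for scaled flow features.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 10))
y = (X[:, 0] + X[:, 1] > 0).astype(int)

model = build_mlp().fit(X, y)
predictions = model.predict(X)
```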

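Why Macro F1? Plain accuracy is misleading on a dataset this imbalanced: a classifier that always predicts BENIGN would still score roughly 80% accuracy on CIC-IDS2017, where benign flows make up about four-fifths of the records, while detecting no attacks at all. Macro F1 averages the per-class F1 scores with equal weight, so rare classes count as much as common ones.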
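
The gap between the two metrics can be shown on a toy example. The 95/3/2 class split below is invented for illustration and is not the dataset's real distribution; both helper functions are written from the standard definitions rather than taken from `src/evaluate.py`.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1; a class with no hits scores 0."""
    labels = sorted(set(y_true) | set(y_pred))
    scores = []
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(t == label and p == label for t, p in pairs)
        fp = sum(t != label and p == label for t, p in pairs)
        fn = sum(t == label and p != label for t, p in pairs)
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the denominator is 0.
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Toy labels: 95 benign flows, 3 DDoS, 2 Bot; the model predicts BENIGN always.
y_true = ["BENIGN"] * 95 + ["DDoS"] * 3 + ["Bot"] * 2
y_pred = ["BENIGN"] * 100

print(f"accuracy = {accuracy(y_true, y_pred):.3f}")  # accuracy = 0.950
print(f"macro F1 = {macro_f1(y_true, y_pred):.3f}")  # macro F1 = 0.325
```

The always-BENIGN predictor looks strong on accuracy, but its macro F1 collapses because two of the three classes contribute an F1 of zero.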

Phase 3 — Deployment

Streamlit dashboard with dark SOC-terminal UI:

  • CSV upload → real-time classification
  • Summary metrics, donut chart, bar chart, confidence timeline
  • Colour-coded malicious-flow table
  • One-click CSV export of the full threat report

Evaluation Outputs (reports/)

| File                      | Description |
|---------------------------|-------------|
| classification_report.txt | Per-class precision, recall, F1, macro averages |
| confusion_matrix.png      | Labelled confusion-matrix heatmap |
| roc_curves.png            | Per-class ROC curves with AUC scores |
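
The numbers behind these three files can be computed with scikit-learn's metrics module, as in the sketch below. The labels and probabilities are made up, and the plotting and file-writing steps are left out; how `src/evaluate.py` actually renders and saves the outputs is not shown here.

```python
import numpy as np
from sklearn.metrics import classification_report, confusion_matrix, roc_auc_score

# Made-up example: three classes, six flows, predicted class probabilities.
labels = ["BENIGN", "DDoS", "PortScan"]
y_true = np.array([0, 0, 0, 1, 1, 2])
proba = np.array([
    [0.8, 0.1, 0.1],
    [0.6, 0.3, 0.1],
    [0.3, 0.6, 0.1],  # a benign flow misclassified as DDoS
    [0.2, 0.7, 0.1],
    [0.1, 0.8, 0.1],
    [0.1, 0.2, 0.7],
])
y_pred = proba.argmax(axis=1)

# Text report with per-class precision/recall/F1 plus macro averages.
report = classification_report(y_true, y_pred, target_names=labels, digits=3)

# Rows are true classes, columns are predicted classes.
cm = confusion_matrix(y_true, y_pred)

# One-vs-rest ROC AUC, averaged over classes with equal weight.
auc = roc_auc_score(y_true, proba, multi_class="ovr", average="macro")

print(report)
print(cm)
```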

Rules Compliance

  • No data leakage — validation strictly on held-out test set (stratified 80/20 split)
  • Open-source libraries only (scikit-learn, XGBoost, imbalanced-learn, Streamlit)
  • All team members must commit to the Git repository
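
The no-leakage rule in the first bullet can be illustrated with a stratified 80/20 split followed by scaling fitted on the training portion only. This is a sketch on synthetic data, not the project's pipeline. It also assumes that any resampling step (undersampling or SMOTE) runs after the split and only on the training portion, since resampling before the split would let synthetic copies of test rows influence training.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

# Synthetic imbalanced data: 900 samples of class 0, 100 of class 1.
rng = np.random.default_rng(0)
X = rng.normal(loc=5.0, scale=2.0, size=(1000, 4))
y = np.array([0] * 900 + [1] * 100)

# Stratified 80/20 split keeps the class ratio identical in both parts.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42
)

# Assumption: resampling, if used, would be applied here to
# X_train / y_train only, never to the held-out test set.

# Fit the scaler on training data only, then transform both sets.
scaler = StandardScaler().fit(X_train)
X_train_scaled = scaler.transform(X_train)
X_test_scaled = scaler.transform(X_test)
```

Because the scaler never sees the test rows, the test set's scaled mean is close to zero but not exactly zero, which is the expected behaviour.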

Team

| Name     | Role |
|----------|------|
| Member 1 | Data Engineering, Model Training |
| Member 2 | Dashboard Development, Evaluation |
