MSCS 640 · Artificial Intelligence and Cyber Security · Spring 2026
An ML-based NIDS that classifies raw network flows as Benign or a specific attack type (DDoS, PortScan, Bot, DoS, Infiltration, Web Attack, Brute Force) using the CIC-IDS2017 benchmark dataset.
```
nids_project/
├── main.py              # Training entry point (run this first)
├── app.py               # Streamlit SOC dashboard
├── requirements.txt
├── README.md
└── src/
    ├── config.py        # All paths, hyper-params, label maps, colours
    ├── data_pipeline.py # Load → clean → SMOTE → split → scale
    ├── model.py         # XGBoost / MLP build, train, save, load
    └── evaluate.py      # Confusion matrix, ROC curves, F1 report
```
```bash
pip install -r requirements.txt
```

```bash
# Recommended (XGBoost + SMOTE resampling):
python main.py

# MLP neural network alternative:
python main.py --model mlp

# Skip SMOTE (faster on low-memory machines):
python main.py --no-resample
```

Training artefacts are saved to `models/` and evaluation reports to `reports/`.

```bash
streamlit run app.py
```

Open http://localhost:8501 and upload any CIC-IDS2017 CSV to get an instant threat report.
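The flags above suggest a small command-line interface in `main.py`. A minimal sketch of how such a parser could look, assuming `argparse` is used; the function name `build_parser` and the defaults are hypothetical, not taken from the actual code:

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    # Hypothetical parser mirroring the flags shown above;
    # the real main.py may wire these up differently.
    parser = argparse.ArgumentParser(description="Train the NIDS classifier")
    parser.add_argument(
        "--model", choices=["xgboost", "mlp"], default="xgboost",
        help="Classifier to train (default: XGBoost baseline)")
    parser.add_argument(
        "--no-resample", action="store_true",
        help="Skip SMOTE resampling (faster on low-memory machines)")
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args()
    print(f"model={args.model} resample={not args.no_resample}")
```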
| Property | Value |
|---|---|
| Dataset | CIC-IDS2017 |
| Source | Canadian Institute for Cybersecurity |
| Files | 8 daily traffic captures |
| Features | ~80 packet-flow statistics |
| Target | Multi-class (Benign + 7 attack families) |
| Category | Raw labels mapped |
|---|---|
| BENIGN | BENIGN |
| DDoS | DDoS |
| PortScan | PortScan |
| Bot | Bot |
| Infiltration | Infiltration |
| Web Attack | Web Attack – Brute Force / XSS / Sql Injection |
| DoS | DoS Hulk / GoldenEye / slowloris / Slowhttptest / Heartbleed |
| Brute Force | FTP-Patator / SSH-Patator |
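The mapping table above can be expressed as a plain dictionary. This is a hypothetical sketch; the real map lives in `src/config.py` and may use different names or handle the raw labels differently:

```python
# Hypothetical canonical label map mirroring the table above.
LABEL_MAP = {
    "BENIGN": "BENIGN",
    "DDoS": "DDoS",
    "PortScan": "PortScan",
    "Bot": "Bot",
    "Infiltration": "Infiltration",
    "Web Attack – Brute Force": "Web Attack",
    "Web Attack – XSS": "Web Attack",
    "Web Attack – Sql Injection": "Web Attack",
    "DoS Hulk": "DoS",
    "DoS GoldenEye": "DoS",
    "DoS slowloris": "DoS",
    "DoS Slowhttptest": "DoS",
    "Heartbleed": "DoS",
    "FTP-Patator": "Brute Force",
    "SSH-Patator": "Brute Force",
}


def canonical_label(raw: str) -> str:
    # Fall back to the (stripped) raw label if it is not in the map.
    return LABEL_MAP.get(raw.strip(), raw.strip())
```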
- Strip column-name whitespace (CIC-IDS2017 quirk)
- Replace ±∞ with NaN, drop rows with missing values
- Remove exact duplicates
- Map raw labels to canonical attack categories
- Imbalance handling: RandomUnderSampler (majority class) → SMOTE (minority classes)
- Standardise with `StandardScaler` (fit on train, transform both sets)
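The steps above can be sketched with pandas and scikit-learn. This is a minimal sketch, not the project's `data_pipeline.py`: the column name `Label` is an assumption, and the imbalanced-learn resampling step is only indicated in a comment:

```python
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler


def clean(df: pd.DataFrame) -> pd.DataFrame:
    # Strip the stray whitespace CIC-IDS2017 leaves in column names.
    df = df.rename(columns=str.strip)
    # Replace +/-inf with NaN, then drop incomplete rows and exact duplicates.
    return df.replace([np.inf, -np.inf], np.nan).dropna().drop_duplicates()


def split_and_scale(df: pd.DataFrame, label_col: str = "Label"):
    X = df.drop(columns=[label_col])
    y = df[label_col]
    # Stratified 80/20 split keeps class proportions in both sets.
    X_tr, X_te, y_tr, y_te = train_test_split(
        X, y, test_size=0.2, stratify=y, random_state=42)
    # In the full pipeline, RandomUnderSampler + SMOTE (imbalanced-learn)
    # would rebalance (X_tr, y_tr) here, before scaling.
    scaler = StandardScaler().fit(X_tr)  # fit on train only: no leakage
    return scaler.transform(X_tr), scaler.transform(X_te), y_tr, y_te
```

Fitting the scaler on the training split only is what keeps the held-out test set untouched, per the guardrails below.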
| Stage | Architecture | Primary Metric |
|---|---|---|
| Baseline | XGBoost (hist tree method) | Macro F1-Score |
| Advanced | MLP (256 → 128 → 64, ReLU, Adam) | Macro F1-Score |
Why Macro F1? Plain accuracy is misleading in highly imbalanced datasets (predicting BENIGN always would give >99% accuracy). Macro F1 treats every class equally regardless of support.
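A tiny illustration of the point, using scikit-learn on made-up numbers (99 benign flows, 1 attack, and a degenerate classifier that always predicts BENIGN):

```python
from sklearn.metrics import accuracy_score, f1_score

y_true = ["BENIGN"] * 99 + ["DDoS"]
y_pred = ["BENIGN"] * 100  # always-benign classifier

print(accuracy_score(y_true, y_pred))             # 0.99 -- looks great
print(f1_score(y_true, y_pred, average="macro"))  # ~0.50 -- exposes the miss
```

Macro F1 averages the per-class F1 scores, so the zero score on the missed DDoS class drags the metric down even though that class has almost no support.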
Streamlit dashboard with dark SOC-terminal UI:
- CSV upload → real-time classification
- Summary metrics, donut chart, bar chart, confidence timeline
- Colour-coded malicious-flow table
- One-click CSV export of the full threat report
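The summary metrics feeding those widgets reduce to simple pandas aggregations over the prediction table. A hedged sketch with a hypothetical helper name and column name (`Prediction`); the real `app.py` may structure this differently:

```python
import pandas as pd


def threat_summary(preds: pd.DataFrame, label_col: str = "Prediction") -> dict:
    # Hypothetical helper: per-class counts plus the share of malicious
    # (non-BENIGN) flows, as shown in the dashboard's metric cards.
    counts = preds[label_col].value_counts().to_dict()
    malicious = (preds[label_col] != "BENIGN").mean()
    return {
        "total_flows": len(preds),
        "class_counts": counts,
        "malicious_fraction": float(malicious),
    }


# The one-click export maps to a plain DataFrame-to-CSV dump:
# preds.to_csv("threat_report.csv", index=False)
```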
| File | Description |
|---|---|
| `classification_report.txt` | Per-class precision, recall, F1, macro averages |
| `confusion_matrix.png` | Labelled confusion-matrix heatmap |
| `roc_curves.png` | Per-class ROC curves with AUC scores |
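A minimal sketch of how `evaluate.py` could produce the text report and the raw confusion matrix (the heatmap and ROC plots would be rendered from these arrays); the toy labels here are made up for illustration:

```python
from sklearn.metrics import classification_report, confusion_matrix

y_true = ["BENIGN", "DDoS", "BENIGN", "PortScan"]
y_pred = ["BENIGN", "DDoS", "DDoS", "PortScan"]

# Per-class precision/recall/F1 plus macro averages, as a plain string
# that can be written to classification_report.txt.
report = classification_report(y_true, y_pred, zero_division=0)

# Row = true class, column = predicted class, in the given label order.
cm = confusion_matrix(y_true, y_pred, labels=["BENIGN", "DDoS", "PortScan"])

print(report)
print(cm)
```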
- No data leakage — validation strictly on held-out test set (stratified 80/20 split)
- Open-source libraries only (scikit-learn, XGBoost, imbalanced-learn, Streamlit)
- All team members must commit to the Git repository
| Name | Role |
|---|---|
| Member 1 | Data Engineering, Model Training |
| Member 2 | Dashboard Development, Evaluation |