A cybersecurity tool that maps IDS alerts to CVE vulnerabilities using TF-IDF ranked retrieval, Snort rule integration, CVSS severity scoring, and CWE-based remediation guidance.
Authors: Shahed Alabdulrahman & Germaine Hounakey
- Parses real NVD CVE dataset (compressed JSON, 43,943 entries)
- Extracts CVE IDs, descriptions, CVSS scores, and CWE weakness types
- Supports:
  - Direct CVE lookup (e.g., CVE-2025-XXXX)
  - Natural language IDS alerts
  - Snort-style alerts (e.g., ET WEB_SERVER SQL Injection Attempt)
- TF-IDF weighted ranking with length normalisation
- CVSS severity boosting (CRITICAL ×1.30 / HIGH ×1.15 / MEDIUM ×1.05)
- Snort .rules file integration with CVE reference and token overlap matching
- CWE-based recommendation engine (19 CWE types, 16 keyword fallback rules)
- Normalised confidence scoring
- Graceful fallback when CVE is not found in dataset
- Built-in evaluation suite (Hit@3, Precision@1)
- Structured timestamped output logging
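
To make the "TF-IDF weighted ranking with length normalisation" feature concrete, here is a minimal sketch. It is not the code in `IAE_main.py`: the function names (`tokenise`, `build_index`, `rank`), the smoothed IDF formula, and the choice to normalise by the document vector norm are all assumptions made for illustration.

```python
import math
import re
from collections import Counter


def tokenise(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(docs):
    """Build per-document term counts and smoothed IDF weights.

    `docs` maps a document ID (e.g. a CVE ID) to its description.
    """
    term_counts = {doc_id: Counter(tokenise(text)) for doc_id, text in docs.items()}
    doc_freq = Counter()
    for counts in term_counts.values():
        doc_freq.update(counts.keys())
    n_docs = len(docs)
    # Smoothed IDF (an assumption): never zero, never divides by zero.
    idf = {t: math.log((1 + n_docs) / (1 + df)) + 1.0 for t, df in doc_freq.items()}
    return term_counts, idf


def rank(query, term_counts, idf, top_k=3):
    """Return the top_k (doc_id, score) pairs for the query."""
    query_terms = set(tokenise(query))
    scores = {}
    for doc_id, counts in term_counts.items():
        # Length normalisation: divide by the document's TF-IDF vector norm
        # so long descriptions do not win purely by containing more words.
        norm = math.sqrt(sum((tf * idf[t]) ** 2 for t, tf in counts.items()))
        if norm == 0:
            continue
        raw = sum(counts[t] * idf[t] for t in query_terms if t in counts)
        if raw > 0:
            scores[doc_id] = raw / norm
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)[:top_k]


# Placeholder documents, not real NVD entries.
docs = {
    "DOC-A": "SQL injection in login form allows remote attackers to run SQL commands",
    "DOC-B": "Buffer overflow in SSH authentication module allows remote code execution",
    "DOC-C": "Cross-site scripting in comment field",
}
term_counts, idf = build_index(docs)
results = rank("buffer overflow in SSH authentication module", term_counts, idf)
```

In this toy corpus `DOC-B` ranks first because it shares every query term, while the other two documents match only on the common word "in".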
Run `python IAE_main.py`. Ensure `CVE_data.json.gz` is in the same directory.
Optionally place snort_rules.rules in the same directory to enable Snort integration.
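
For readers unfamiliar with the rules file, the sketch below shows one way the fields needed for matching could be pulled out of a Snort rule line. It is an assumption about how such parsing might look, not the project's parser: `parse_rule` and the regular expressions are hypothetical, and the sample rule is constructed from the SID, message, and class type shown in this README's sample output.

```python
import re

# Standard Snort rule options: msg, sid, classtype, reference:cve,<year-id>.
MSG = re.compile(r'msg\s*:\s*"([^"]*)"')
SID = re.compile(r"\bsid\s*:\s*(\d+)")
CLASSTYPE = re.compile(r"classtype\s*:\s*([\w-]+)")
CVE_REF = re.compile(r"reference\s*:\s*cve\s*,\s*([0-9]{4}-[0-9]+)", re.IGNORECASE)


def parse_rule(line):
    """Parse one rule line into a dict, or return None for comments,
    blank lines, and lines without both a msg and a sid."""
    line = line.strip()
    if not line or line.startswith("#"):
        return None
    msg = MSG.search(line)
    sid = SID.search(line)
    if not msg or not sid:
        return None
    classtype = CLASSTYPE.search(line)
    return {
        "sid": int(sid.group(1)),
        "msg": msg.group(1),
        "classtype": classtype.group(1) if classtype else None,
        "cve_refs": ["CVE-" + ref for ref in CVE_REF.findall(line)],
    }


# Illustrative rule, assembled from the sample output in this README.
SAMPLE_RULE = (
    'alert tcp any any -> any 80 (msg:"ET WEB_SERVER SQL Injection Attempt"; '
    "reference:cve,2025-3456; classtype:web-application-attack; sid:1001; rev:1;)"
)
rule = parse_rule(SAMPLE_RULE)
```

A parsed rule exposes its CVE references for direct matching and its `msg` text for token overlap matching.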
| Input | Action |
|---|---|
| `CVE-2024-1234` | Direct CVE lookup |
| Any alert text | TF-IDF + Snort retrieval |
| `eval` | Run evaluation suite |
| `exit` | Quit |
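
The input handling above could be routed roughly as follows. This is a sketch under assumptions: `classify_input`, its return values, and the CVE ID pattern are hypothetical and may differ from what `IAE_main.py` actually does.

```python
import re

# CVE IDs have the form CVE-<4-digit year>-<4 or more digits>.
CVE_ID = re.compile(r"\bCVE-\d{4}-\d{4,}\b", re.IGNORECASE)


def classify_input(text):
    """Return (action, payload) for one line of user input."""
    stripped = text.strip()
    command = stripped.lower()
    if command in ("exit", "eval"):
        return command, None
    match = CVE_ID.search(stripped)
    if match:
        # A CVE ID anywhere in the input triggers a direct lookup.
        return "cve_lookup", match.group(0).upper()
    # Everything else is treated as free-text alert input.
    return "alert", stripped
```

For example, `classify_input("CVE-2024-1234")` yields a direct lookup, while a Snort-style message without a CVE ID falls through to the retrieval path.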
CVE-2025-3456
ET WEB_SERVER SQL Injection Attempt -- select from
buffer overflow in SSH authentication module
The system generates:
- CVE ID
- Severity (with CVSS score)
- CWE type
- Confidence score
- Snort rules matched (if applicable)
- Vulnerability description
- CWE-based security recommendation
┌─ RESULT #1 (Direct CVE Match)
│ CVE ID : CVE-2025-3456
│ Severity : HIGH (CVSS 8.1)
│ CWE : CWE-89
│ Confidence: 100%
│
│ Snort Rules Matched:
│ SID 1001 [web-application-attack] — ET WEB_SERVER SQL Injection Attempt
│ match=CVE ref cve_refs=CVE-2025-3456
│
│ Summary:
│ ...
│
│ Recommendations:
│ • Use parameterised queries and validate all database inputs.
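
The recommendation line in the sample output can be thought of as a lookup keyed by CWE ID, with keyword rules as a fallback when the CWE is unknown. The sketch below is illustrative only: the `CWE-89` text comes from the sample output above, while the other entries, the keyword rule, the default message, and the `recommend` function are assumptions and do not reproduce the project's 19 CWE types or 16 keyword rules.

```python
CWE_RECOMMENDATIONS = {
    # Taken from the sample output in this README.
    "CWE-89": "Use parameterised queries and validate all database inputs.",
    # Illustrative entry (assumption).
    "CWE-79": "Encode output and sanitise user-supplied HTML.",
}

# Illustrative keyword fallback rules (assumption), checked in order.
KEYWORD_FALLBACKS = [
    ("overflow", "Apply vendor patches and enable memory-safety mitigations."),
]

DEFAULT_RECOMMENDATION = "Review the vendor advisory and apply available patches."


def recommend(cwe_id, description):
    """Prefer the CWE mapping, then keyword rules, then a generic default."""
    if cwe_id in CWE_RECOMMENDATIONS:
        return CWE_RECOMMENDATIONS[cwe_id]
    text = description.lower()
    for keyword, advice in KEYWORD_FALLBACKS:
        if keyword in text:
            return advice
    return DEFAULT_RECOMMENDATION
```

Keeping the CWE mapping ahead of the keyword rules means a precise weakness classification always takes priority over a looser text match.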
- Load CVE dataset and build TF-IDF index
- Load Snort rules file (if present)
- Match alert against Snort rules (CVE ref or token overlap)
- Extract CVE from alert (if present), otherwise tokenise alert
- Rank CVEs by TF-IDF score × CVSS boost × Snort boost
- Compute normalised confidence score
- Generate CWE-based recommendation
- Print structured explanation and append to output.txt
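
The ranking and confidence steps above can be sketched as follows. The CVSS multipliers are the ones listed in the features; everything else is an assumption made for illustration: the `SNORT_BOOST` value, the `combine` function, and the choice to express confidence relative to the top-ranked score.

```python
# CVSS multipliers as listed in this README; other severities get 1.0.
CVSS_BOOST = {"CRITICAL": 1.30, "HIGH": 1.15, "MEDIUM": 1.05}

# Assumed multiplier for CVEs referenced by a matched Snort rule.
SNORT_BOOST = 1.20


def combine(candidates, snort_cves):
    """Combine scores and attach a confidence percentage.

    `candidates` is a list of (cve_id, tfidf_score, severity) tuples;
    `snort_cves` is the set of CVE IDs referenced by matched Snort rules.
    Returns (cve_id, final_score, confidence) sorted best first.
    """
    scored = []
    for cve_id, tfidf, severity in candidates:
        score = tfidf * CVSS_BOOST.get(severity, 1.0)
        if cve_id in snort_cves:
            score *= SNORT_BOOST
        scored.append((cve_id, score))
    scored.sort(key=lambda pair: pair[1], reverse=True)
    top = scored[0][1] if scored else 0.0
    # Confidence normalised against the best score (an assumption).
    return [
        (cve_id, score, round(100 * score / top) if top > 0 else 0)
        for cve_id, score in scored
    ]


# Placeholder candidates, not real CVEs.
ranked = combine(
    [("ID-A", 0.50, "MEDIUM"), ("ID-B", 0.48, "CRITICAL"), ("ID-C", 0.40, "LOW")],
    snort_cves=set(),
)
```

Here `ID-B` overtakes `ID-A` despite a lower TF-IDF score, because the CRITICAL multiplier outweighs the small difference in text similarity.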
Running `eval` tests the system against 10 real-world alert patterns:
| Metric | Result |
|---|---|
| Hit@3 (≥1 relevant result in top 3) | 10/10 (100%) |
| Precision@1 (rank-1 result is relevant) | 7/10 (70%) |
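
For reference, the two metrics in the table can be computed as in the sketch below. The function names and the test-case structure (a ranked list of IDs paired with a set of relevant IDs) are assumptions, and the sample cases are placeholders rather than the project's 10 evaluation alerts.

```python
def hit_at_k(ranked, relevant, k=3):
    """True if at least one relevant ID appears in the top k results."""
    return any(item in relevant for item in ranked[:k])


def precision_at_1(ranked, relevant):
    """True if the rank-1 result is relevant."""
    return bool(ranked) and ranked[0] in relevant


def evaluate(cases, k=3):
    """Average both metrics over (ranked_ids, relevant_ids) cases."""
    n = len(cases)
    if n == 0:
        return {"hit_at_k": 0.0, "precision_at_1": 0.0}
    hits = sum(hit_at_k(ranked, relevant, k) for ranked, relevant in cases)
    firsts = sum(precision_at_1(ranked, relevant) for ranked, relevant in cases)
    return {"hit_at_k": hits / n, "precision_at_1": firsts / n}


# Placeholder cases: relevant at rank 1, relevant at rank 2, no relevant result.
cases = [
    (["A", "B", "C"], {"A"}),
    (["B", "A", "C"], {"A"}),
    (["B", "C", "D"], {"A"}),
]
metrics = evaluate(cases)
```

Hit@3 is always at least as high as Precision@1, since any case with a relevant rank-1 result also has a relevant result in the top 3.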
intrusion_explainer/
├── IAE_main.py ← main script
├── CVE_data.json.gz ← NVD CVE dataset
├── snort_rules.rules ← Snort rules file (optional)
├── output.txt ← auto-generated alert log
└── README.md
This project is an educational cybersecurity information retrieval system inspired by real-world IDS alert analysis workflows.