A compiler-inspired NLP tool that parses CVE descriptions, maps them to known vulnerability patterns, and enriches each finding with CWE IDs, mitigations, and code-construct linkage.
Security teams deal with hundreds of CVE descriptions daily. Reading each one manually to determine the vulnerability class, root cause, affected component, severity, and appropriate mitigation is slow and error-prone.
This tool applies compiler design principles (lexical analysis, parsing, semantic analysis, and IR generation) to automate that workflow:
Parse raw CVE text using NLP tokenisation
Pattern-match to 11 known vulnerability categories
Semantically extract root cause, impact, and affected component
Enrich records with CWE IDs, mitigations, and linked code constructs
Output structured JSON + a formatted terminal report
cve_parser/
│
├── main.py # Entry point: orchestrates the full pipeline
├── lexer.py # Tokenisation and text normalisation
├── parser.py # Vulnerability pattern detection
├── semantic.py # Root cause / impact / component / severity extraction
├── ir.py # Intermediate Representation builder
├── patterns.py # Vulnerability keyword patterns + CWE metadata
├── utils.py # File I/O, statistics, terminal report printer
│
├── data/
│   └── cves.txt # Input file: one CVE description per line
│
├── output.json # Generated output (created on first run)
└── README.md # This file
Setup & Installation
Requirements
Python 3.10+ (uses str | None union syntax)
No third-party packages required: pure standard library
Clone / Download
git clone <repo-url>
cd cve_parser
Prepare Input Data
Edit data/cves.txt and add your CVE descriptions, one per line:
# Lines starting with '#' are comments and will be ignored
A buffer overflow vulnerability exists in the input validation function...
SQL injection vulnerability in the login module exposes sensitive database...
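A loader for this input format might look like the sketch below. The function names `parse_descriptions` and `load_descriptions` are hypothetical, and skipping blank lines is an assumption; the format above only specifies that lines starting with '#' are ignored.

```python
from pathlib import Path


def parse_descriptions(text: str) -> list[str]:
    """Return one description per line, dropping '#' comment lines.

    Assumption: blank lines are skipped as well.
    """
    descriptions = []
    for line in text.splitlines():
        stripped = line.strip()
        if not stripped or stripped.startswith("#"):
            continue
        descriptions.append(stripped)
    return descriptions


def load_descriptions(path: str = "data/cves.txt") -> list[str]:
    """Read the input file and parse it into a list of descriptions."""
    return parse_descriptions(Path(path).read_text(encoding="utf-8"))
```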
Usage
Basic Run
python main.py
Reads data/cves.txt, writes output.json, and prints a full terminal report.
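Once output.json exists, it can be inspected from a short script. The sketch below assumes the file holds a JSON array of record objects and that each record has a field named `category`; both are assumptions, since the output schema is not described here, and `count_by_field` is a hypothetical helper.

```python
import json
from collections import Counter
from pathlib import Path


def count_by_field(records: list, field: str) -> Counter:
    """Count records by one field; a missing field counts as 'unknown'."""
    return Counter(
        str(record.get(field, "unknown"))
        for record in records
        if isinstance(record, dict)
    )


if __name__ == "__main__":
    path = Path("output.json")
    if path.exists():
        data = json.loads(path.read_text(encoding="utf-8"))
        # Assumption: the top level is a list of record objects.
        if isinstance(data, list):
            for value, count in count_by_field(data, "category").most_common():
                print(f"{value}: {count}")
```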