A full-stack vulnerability management pipeline that parses scanner reports (OpenVAS & Nessus), enriches findings with threat intelligence data (CISA KEV, EPSS, NVD CVSS, Exploit-DB), persists everything into a SQLite database, and serves real-time Engineer and Executive web dashboards.
- How It Works
- Prerequisites
- Installation
- Usage Guide
- Web Dashboards
- REST API Reference
- CLI Tools Reference
- Project Structure
- Output Schema
- Scoring Models
- Data Sources
- Testing
- Configuration & Environment Variables
- Branching Strategy
- Lab Environment Setup
The pipeline follows a four-stage workflow:
┌─────────────────┐ ┌───────────────┐ ┌──────────────────────┐ ┌─────────────────┐
│ Scanner Report │ ──▶ │ Parse & │ ──▶ │ Enrich & Score │ ──▶ │ Dashboard & │
│ (XML file) │ │ Normalize │ │ (KEV/EPSS/NVD/ │ │ REST API │
│ OpenVAS/Nessus │ │ → JSON → DB │ │ Exploit-DB) │ │ (Flask) │
└─────────────────┘ └───────────────┘ └──────────────────────┘ └─────────────────┘
- Parse — OpenVAS or Nessus XML reports are converted into a normalized JSON format (defined by `schema.json`). Vulnerabilities are grouped by asset (IP/hostname) and severity (critical/high/medium/low).
- Ingest — The normalized JSON is ingested into a SQLite database via SQLAlchemy. Duplicate scans (same `scan_id`) are automatically detected and skipped.
- Enrich — The enrichment worker queries live threat intelligence APIs to augment each vulnerability:
  - CISA KEV — Is this vulnerability known to be actively exploited?
  - EPSS — What is the probability of exploitation in the next 30 days?
  - NVD CVSS — What is the official severity score?
  - Exploit-DB — Does a public exploit exist?
  - These scores are used to compute the Exposure Imminence Score (EIS) and the per-asset Exposure Score.
- Visualize — A Flask web server exposes:
  - An Engineer Dashboard — interactive vulnerability table with filters, search, and pagination
  - An Executive Dashboard — summary cards, severity breakdown bar, and top risky assets
  - A REST API — programmatic access to all data
- Python 3.9+ (tested with 3.11)
- pip package manager
- Internet access (for enrichment API calls; optional if using cached data)
```bash
# Clone the repository
git clone https://github.com/pizn-01/parse.git
cd parse/python

# Install dependencies
pip install -r requirements.txt
```

Dependencies installed:

| Package | Version | Purpose |
|---|---|---|
| `jsonschema` | ≥ 4.0 | JSON schema validation for parser output |
| `requests` | ≥ 2.28 | HTTP client for KEV/EPSS/NVD/Exploit-DB APIs |
| `flask` | ≥ 3.0 | REST API server & dashboard hosting |
| `sqlalchemy` | ≥ 2.0 | ORM & SQLite database layer |
| `pytest` | ≥ 7.0 | Test runner |
All commands are run from the `python/` directory:

```bash
cd parse/python
```

Convert your scanner's XML output into normalized JSON.

OpenVAS:

```bash
python parse_openvas.py \
    --input scan_report.xml \
    --kev sample_kev.csv \
    --epss sample_epss.csv \
    --output openvas_output.json
```

Nessus:

```bash
python parse_nessus.py \
    --input scan_report.nessus \
    --kev sample_kev.csv \
    --epss sample_epss.csv \
    --output nessus_output.json
```

| Argument | Required | Description |
|---|---|---|
| `--input`, `-i` | ✅ | Path to the scanner XML file |
| `--kev` | Optional | CSV file mapping CVE IDs to KEV status |
| `--epss` | Optional | CSV file with EPSS scores (header: `cve,epss`) |
| `--output`, `-o` | Optional | Output file path (default: `<scanner>_output.json`) |
Note: If you don't have KEV/EPSS CSV files, the parsers will still work — those fields will simply be
null. The enrichment worker (Step 3) can populate them later from live APIs.
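For reference, these CSVs can be as small as a header line plus one row per CVE. The `cve,epss` header is stated above; the KEV column layout shown here (`cve,known_exploited`) is an assumption — check each parser's `--help` for the exact columns it expects.

```csv
cve,epss
CVE-2023-12345,0.87
CVE-2021-44228,0.97
```

```csv
cve,known_exploited
CVE-2023-12345,true
CVE-2021-44228,true
```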
Load the parsed JSON into the SQLite database:

```bash
python db.py --input openvas_output.json
```

What happens:

- Creates `vulndb.db` in the current directory (if it doesn't exist)
- Creates the `scans`, `assets`, and `vulnerabilities` tables
- Inserts all scan data
- If a scan with the same `scan_id` already exists → skipped (duplicate protection)
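The duplicate-protection idea can be sketched in a few lines. This is an illustrative version using the stdlib `sqlite3` module — the real `db.py` uses SQLAlchemy, and the table/column names here are assumptions:

```python
import sqlite3

def ingest_scan(conn: sqlite3.Connection, scan: dict) -> str:
    """Insert a scan unless its scan_id is already present (duplicate-skip)."""
    scan_id = scan["scan_metadata"]["scan_id"]
    # Check for an existing row with the same scan_id before inserting
    if conn.execute("SELECT 1 FROM scans WHERE scan_id = ?", (scan_id,)).fetchone():
        return "duplicate"
    conn.execute("INSERT INTO scans (scan_id) VALUES (?)", (scan_id,))
    conn.commit()
    return "ingested"

# In-memory demo
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE scans (scan_id TEXT PRIMARY KEY)")
scan = {"scan_metadata": {"scan_id": "scan-12345"}}
print(ingest_scan(conn, scan))  # ingested
print(ingest_scan(conn, scan))  # duplicate
```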
You can ingest multiple scan files:

```bash
python db.py --input openvas_output.json
python db.py --input nessus_output.json
```

Run the enrichment worker to pull live data from threat intelligence APIs:
```bash
python db_enrichment_worker.py --cache-dir cache
```

What happens:
- Reads all CVE IDs from the database
- Fetches/refreshes from CISA KEV feed (known exploited vulnerabilities)
- Fetches/refreshes EPSS scores (exploitation probability)
- Fetches/refreshes CVSS scores from NVD (severity)
- Checks Exploit-DB for public exploits
- Updates every vulnerability row with fresh scores
- Recalculates the Exposure Imminence Score (EIS) for each vulnerability
- Recalculates the Asset Exposure Score for each host
| Argument | Default | Description |
|---|---|---|
| `--cache-dir` | `cache/` | Directory for caching API responses |
| `--force` | `false` | Force-refresh all caches (ignores freshness) |

Caching: API responses are cached locally with a 24-hour freshness window. Subsequent runs only fetch new data if the cache is stale. Use `--force` to bypass caching.
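The freshness check described above can be sketched as follows. This is a minimal illustration, not the worker's actual code; file layout and function names are assumptions:

```python
import json
import time
from pathlib import Path

FRESH_SECONDS = 24 * 60 * 60  # 24-hour freshness window

def is_fresh(path: Path, force: bool = False) -> bool:
    """A cache file is usable if it exists, is under 24h old, and --force is off."""
    if force or not path.exists():
        return False
    return (time.time() - path.stat().st_mtime) < FRESH_SECONDS

def cached_fetch(cache_dir: str, name: str, fetch, force: bool = False):
    """Return cached JSON when fresh; otherwise call fetch() and rewrite the cache."""
    path = Path(cache_dir) / f"{name}.json"
    if is_fresh(path, force):
        return json.loads(path.read_text())
    data = fetch()
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(data))
    return data
```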
```bash
python api.py
```

The server starts at http://localhost:5000 with these pages:

| URL | Description |
|---|---|
| http://localhost:5000/ | Engineer Dashboard — vulnerability table with filters |
| http://localhost:5000/executive | Executive Dashboard — summary cards and top risky assets |
A full-featured vulnerability table designed for security engineers:
- Search — filter by CVE ID or vulnerability title
- Severity Filter — show only Critical, High, Medium, or Low
- KEV Filter — show only known-exploited vulnerabilities
- Asset IP Filter — narrow down to a specific host
- Pagination — 50 results per page with page navigation
- Live data — all data fetched in real-time from the REST API
Columns displayed: CVE ID, Title, Severity, CVSS, EPSS, EIS Score, KEV status, Public Exploit status, Asset IP
A high-level summary view for management and stakeholders:
- Summary Cards — Total assets, total vulnerabilities, counts by severity, exploited count, KEV count, average EPSS
- Severity Distribution Bar — visual proportional bar showing critical/high/medium/low breakdown
- Top Risky Assets — table of the most exposed hosts ranked by exposure score
All endpoints return JSON. Base URL: http://localhost:5000
Paginated list of all vulnerabilities with optional filters.
| Query Param | Type | Example | Description |
|---|---|---|---|
| `severity` | string | `critical` | Filter by severity level |
| `asset_ip` | string | `192.168.1.10` | Filter by asset IP |
| `is_kev` | bool | `true` | Filter by KEV status |
| `search` | string | `CVE-2023` | Search in CVE ID or title |
| `page` | int | `1` | Page number (default: 1) |
| `per_page` | int | `50` | Results per page (default: 50) |
Response:

```json
{
  "total": 142,
  "page": 1,
  "per_page": 50,
  "data": [
    {
      "id": 1,
      "vulnerability_id": "CVE-2023-12345",
      "title": "Critical Remote Code Execution",
      "severity": "critical",
      "cvss_score": 9.8,
      "epss_score": 0.87,
      "eis_score": 92.0,
      "is_kev": true,
      "is_exploited": true,
      "has_public_exploit": true,
      "asset_ip": "192.168.1.100",
      "asset_hostname": "server-01",
      "description": "...",
      "remediation": "Upgrade to version 2.4.52 or later"
    }
  ]
}
```

List all assets sorted by exposure score (descending).
Response:

```json
{
  "total": 5,
  "data": [
    {
      "id": 1,
      "asset_id": "asset_001",
      "ip": "192.168.1.100",
      "hostname": "server-01",
      "os": "Linux",
      "exposure_score": 78.5,
      "vuln_count": 12
    }
  ]
}
```

Single asset detail including all its vulnerabilities.
Executive summary computed from live database data.
Response:

```json
{
  "total_assets": 5,
  "total_vulnerabilities": 142,
  "by_severity": { "critical": 8, "high": 23, "medium": 67, "low": 44 },
  "exploited_vulns": 12,
  "kev_vulns": 15,
  "avg_epss": 0.3421,
  "top_risky_assets": [ ... ]
}
```

Upload and ingest a parsed scan JSON directly via API.
Request: Send a JSON body or a multipart file upload (field: `file`).

```bash
# Using curl
curl -X POST http://localhost:5000/api/ingest \
  -H "Content-Type: application/json" \
  -d @openvas_output.json
```

Response:

```json
{ "status": "ingested", "scan_id": "scan-12345" }   // 201 Created
{ "status": "duplicate", "scan_id": "scan-12345" }  // 200 OK (already exists)
```

All tools support `--help` for detailed usage.
| Tool | Command | Description |
|---|---|---|
| OpenVAS Parser | `python parse_openvas.py -i scan.xml -o out.json` | Convert OpenVAS XML to normalized JSON |
| Nessus Parser | `python parse_nessus.py -i scan.nessus -o out.json` | Convert Nessus XML to normalized JSON |
| DB Ingestion | `python db.py --input out.json` | Load parsed JSON into SQLite database |
| Enrichment Worker | `python db_enrichment_worker.py --cache-dir cache` | Enrich DB with live KEV/EPSS/NVD/Exploit-DB |
| API Server | `python api.py` | Start Flask dashboard on port 5000 |
| Executive Summary | `python executive_summary.py -i out.json -f table` | CLI table/JSON summary (no server needed) |
| Standalone Enrichment | `python enrichment.py --cves CVE-2023-1234` | Fetch enrichment data for specific CVEs |
| Deduplication | `python dedup.py -e old.json -n new.json -o merged.json` | Merge two scan JSONs, removing duplicates |
| Node.js Parser | `node index.js --input scan.xml --output out.json` | Node.js alternative for OpenVAS parsing |
parse/
├── .gitignore
├── README.md
├── python/
│ ├── requirements.txt # Python dependencies
│ ├── schema.json # JSON schema for normalized output
│ ├── sample_openvas.xml # Sample OpenVAS report for testing
│ ├── sample_kev.csv # Sample KEV data
│ ├── sample_epss.csv # Sample EPSS data
│ │
│ ├── parse_openvas.py # OpenVAS XML → normalized JSON
│ ├── parse_nessus.py # Nessus XML → normalized JSON
│ ├── utils.py # Shared utilities (severity, EIS, exposure scoring)
│ ├── enrichment.py # KEV/EPSS/NVD/Exploit-DB connectors + caching
│ ├── dedup.py # Scan deduplication & merge
│ │
│ ├── db.py # SQLAlchemy models + ingestion pipeline
│ ├── db_enrichment_worker.py # DB enrichment worker
│ ├── api.py # Flask REST API + dashboard server
│ ├── executive_summary.py # CLI executive summary generator
│ │
│ ├── templates/
│ │ ├── dashboard.html # Engineer dashboard (vuln table + filters)
│ │ └── executive.html # Executive dashboard (summary cards)
│ │
│ ├── tests/
│ │ ├── test_parser_schema.py # Parser output & schema validation (25 tests)
│ │ ├── test_db.py # Database layer tests (12 tests)
│ │ ├── test_api.py # REST API endpoint tests (17 tests)
│ │ ├── test_dedup.py # Deduplication logic tests (11 tests)
│ │ ├── test_enrichment.py # Enrichment & EIS scoring tests (13 tests)
│ │ └── test_executive_summary.py # Executive summary tests (9 tests)
│ │
│ ├── cache/ # Auto-created: cached API responses
│ └── vulndb.db # Auto-created: SQLite database
│
└── node/
├── package.json
└── index.js # Node.js OpenVAS parser
Both parsers produce identical JSON conforming to `schema.json`:

```json
{
  "scan_metadata": {
    "scan_id": "scan-12345",
    "scan_start": "2024-01-15T10:30:00Z",
    "scan_end": "2024-01-15T11:45:00Z"
  },
  "assets": [
    {
      "asset_id": "asset_001",
      "ip": "192.168.1.100",
      "hostname": "server-01",
      "os": "Linux",
      "exposure_score": 78.5,
      "vulnerabilities_by_severity": {
        "critical": [ { "vulnerability_id": "CVE-2023-12345", "cvss_score": 9.8, ... } ],
        "high": [],
        "medium": [ ... ],
        "low": [ ... ]
      }
    }
  ]
}
```

Each vulnerability record includes:
| Field | Type | Description |
|---|---|---|
| `vulnerability_id` | string | CVE ID or scanner OID |
| `nvt_oid` | string | Scanner-specific test identifier |
| `title` | string | Human-readable vulnerability name |
| `cvss_score` | float \| null | CVSS v3.x base score (0–10) |
| `epss_score` | float \| null | EPSS exploitation probability (0–1) |
| `eis_score` | float \| null | Exposure Imminence Score (0–100) |
| `is_kev` | boolean | Listed in CISA KEV catalog |
| `is_exploited` | boolean | Known to be actively exploited |
| `has_public_exploit` | boolean \| null | Public exploit exists on Exploit-DB |
| `description` | string | Vulnerability description |
| `remediation` | string | Recommended fix or patch |
| `affected_software` | string | Impacted software/version |
| `qod` | integer | Quality of detection (scanner confidence %) |
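Records like this can be checked with the `jsonschema` package already listed in the requirements. The schema fragment below is illustrative only — the authoritative contract lives in `schema.json`:

```python
from jsonschema import ValidationError, validate

# Illustrative fragment; the real, full contract is schema.json
VULN_SCHEMA = {
    "type": "object",
    "required": ["vulnerability_id", "title", "is_kev"],
    "properties": {
        "vulnerability_id": {"type": "string"},
        "title": {"type": "string"},
        "cvss_score": {"type": ["number", "null"], "minimum": 0, "maximum": 10},
        "epss_score": {"type": ["number", "null"], "minimum": 0, "maximum": 1},
        "is_kev": {"type": "boolean"},
    },
}

record = {
    "vulnerability_id": "CVE-2023-12345",
    "title": "Critical Remote Code Execution",
    "cvss_score": 9.8,
    "epss_score": 0.87,
    "is_kev": True,
}
validate(instance=record, schema=VULN_SCHEMA)  # raises ValidationError on mismatch
```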
Per-vulnerability risk score combining three signals:
EIS = ( 0.5 × EPSS + 0.3 × CVSS/10 + 0.2 × KEV_flag ) × 100
| Factor | Weight | Range | Source |
|---|---|---|---|
| EPSS | 50% | 0–1 | FIRST.org API |
| CVSS | 30% | 0–10 (normalized to 0–1) | NVD API |
| KEV | 20% | 0 or 1 | CISA KEV catalog |
Result: 0–100 score. Higher = more imminent threat.
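The formula above translates directly to code. A minimal sketch, assuming missing EPSS/CVSS signals count as 0 (the actual worker in `utils.py` may handle nulls and rounding differently):

```python
from typing import Optional

def eis(epss: Optional[float], cvss: Optional[float], is_kev: bool) -> float:
    """EIS = (0.5*EPSS + 0.3*CVSS/10 + 0.2*KEV_flag) * 100, on a 0-100 scale."""
    e = epss or 0.0              # EPSS probability, 0-1
    c = cvss or 0.0              # CVSS base score, 0-10 (normalized below)
    k = 1.0 if is_kev else 0.0   # KEV membership flag
    return round((0.5 * e + 0.3 * (c / 10.0) + 0.2 * k) * 100.0, 1)

print(eis(0.87, 9.8, True))   # → 92.9
print(eis(None, None, False)) # → 0.0
```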
Per-asset aggregate risk combining EIS breadth and depth:
Asset Score = (avg_EIS) × log₂(vuln_count + 1)
The logarithmic term rewards breadth of exposure while damping runaway growth: an asset with many moderate-risk vulnerabilities outranks one with a single moderate finding, but doubling the vulnerability count adds only one log step. The result is capped at 100.
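A sketch of the asset formula, with the cap applied; the exact null handling and rounding in `utils.py` may differ:

```python
import math
from typing import Sequence

def asset_exposure(eis_scores: Sequence[float]) -> float:
    """Asset Score = avg(EIS) * log2(vuln_count + 1), capped at 100."""
    if not eis_scores:
        return 0.0
    avg = sum(eis_scores) / len(eis_scores)
    return round(min(avg * math.log2(len(eis_scores) + 1), 100.0), 1)

print(asset_exposure([50.0]))      # → 50.0  (one finding: 50 * log2(2))
print(asset_exposure([50.0] * 7))  # → 100.0 (50 * log2(8) = 150, capped)
```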
| Source | URL | Data Provided |
|---|---|---|
| CISA KEV | cisa.gov/known-exploited-vulnerabilities-catalog | Known exploited vulnerability catalog |
| FIRST EPSS | api.first.org/data/v1/epss | Exploitation probability scores |
| NVD | services.nvd.nist.gov/rest/json/cves/2.0 | CVSS base scores, vulnerability metadata |
| Exploit-DB | exploit-db.com | Public exploit availability |
All API responses are cached locally in the cache/ directory with a 24-hour freshness window. Use --force to bypass cache.
Run the full test suite (85 tests):

```bash
cd python
python -m pytest tests/ -v
```

| Test File | Tests | What It Covers |
|---|---|---|
| `test_parser_schema.py` | 25 | Both parsers produce valid JSON against `schema.json`; cross-parser consistency |
| `test_db.py` | 12 | Table creation, scan ingestion, duplicate rejection, query filters |
| `test_api.py` | 17 | All REST endpoints, HTML pages, ingestion + duplicate handling |
| `test_dedup.py` | 11 | Vulnerability dedup, scan merging, asset matching |
| `test_enrichment.py` | 13 | EIS calculation, cache freshness, enrichment connectors |
| `test_executive_summary.py` | 9 | Summary metrics (severity counts, top risky assets, EPSS avg) |
Run a specific test file:

```bash
python -m pytest tests/test_api.py -v
```

| Variable | Default | Description |
|---|---|---|
| `VULN_DB_PATH` | `python/vulndb.db` | Path to the SQLite database file |
The database path can also be set via CLI: `python db.py --input scan.json --db /path/to/custom.db`

The cache directory is configured per run via `--cache-dir` (default: `cache/`).
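The usual precedence for settings like this is CLI flag over environment variable over built-in default. A hypothetical sketch (the function name and exact precedence in `db.py` are assumptions):

```python
import os
from typing import Optional

def resolve_db_path(cli_db: Optional[str] = None) -> str:
    """Assumed precedence: explicit --db flag, then VULN_DB_PATH, then the default."""
    return cli_db or os.environ.get("VULN_DB_PATH", "python/vulndb.db")

print(resolve_db_path("custom.db"))  # → custom.db
```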
We follow a lightweight Git branching workflow:
| Branch | Purpose |
|---|---|
| `main` | Stable, production-ready code. Only merge from `develop` after testing. |
| `develop` | Integration branch for current sprint work |
| `feature/<name>` | Individual feature branches (e.g., `feature/db-pipeline`, `feature/dashboard`) |
| `bugfix/<name>` | Bug fix branches |
Workflow:

- Create a feature branch from `develop`: `git checkout -b feature/my-feature develop`
- Work on the feature, commit frequently with descriptive messages
- Open a Pull Request to merge into `develop`
- After sprint review and CI passes, merge `develop` → `main`
To set up vulnerable targets for running real scans:

Create a `docker-compose.lab.yml`:

```yaml
version: '3.8'
services:
  dvwa:
    image: vulnerables/web-dvwa
    ports:
      - "8080:80"
    environment:
      MYSQL_PASS: dvwa
    restart: unless-stopped

  metasploitable:
    image: tleemcjr/metasploitable2
    ports:
      - "2222:22"
      - "8081:80"
      - "445:445"
    restart: unless-stopped
```

```bash
docker-compose -f docker-compose.lab.yml up -d
```

Targets will be available at:

- DVWA: http://localhost:8080 (default creds: `admin` / `password`)
- Metasploitable: `ssh -p 2222 msfadmin@localhost` (creds: `msfadmin` / `msfadmin`)
- Metasploitable 2 — Download from SourceForge. Import the `.vmdk` into VirtualBox/VMware with a Host-only network adapter.
- DVWA — Download from GitHub. Deploy on any LAMP/WAMP stack and configure `config/config.inc.php`.
```bash
# 1. Scan the lab targets with OpenVAS/Nessus
#    Export results as XML

# 2. Parse → Ingest → Enrich → View
cd python
python parse_openvas.py -i lab_scan.xml -o lab_output.json
python db.py --input lab_output.json
python db_enrichment_worker.py
python api.py

# 3. Open http://localhost:5000/ in your browser
```

This project was developed as part of a university cybersecurity course.