
🛡️ Vulnerability Management Dashboard

A full-stack vulnerability management pipeline that parses scanner reports (OpenVAS & Nessus), enriches findings with threat intelligence data (CISA KEV, EPSS, NVD CVSS, Exploit-DB), persists everything into a SQLite database, and serves real-time Engineer and Executive web dashboards.




How It Works

The pipeline follows a four-stage workflow:

┌──────────────────┐     ┌───────────────┐     ┌─────────────────────┐     ┌─────────────────┐
│  Scanner Report  │ ──▶ │   Parse &     │ ──▶ │  Enrich & Score     │ ──▶ │  Dashboard &    │
│  (XML file)      │     │  Normalize    │     │  (KEV/EPSS/NVD/     │     │  REST API       │
│  OpenVAS/Nessus  │     │  → JSON → DB  │     │   Exploit-DB)       │     │  (Flask)        │
└──────────────────┘     └───────────────┘     └─────────────────────┘     └─────────────────┘
  1. Parse — OpenVAS or Nessus XML reports are converted into a normalized JSON format (defined by schema.json). Vulnerabilities are grouped by asset (IP/hostname) and severity (critical/high/medium/low).

  2. Ingest — The normalized JSON is ingested into a SQLite database via SQLAlchemy. Duplicate scans (same scan_id) are automatically detected and skipped.

  3. Enrich — The enrichment worker queries live threat intelligence APIs to augment each vulnerability:

    • CISA KEV — Is this vulnerability known to be actively exploited?
    • EPSS — What is the probability of exploitation in the next 30 days?
    • NVD CVSS — What is the official severity score?
    • Exploit-DB — Does a public exploit exist?
    • Scores are used to compute the Exposure Imminence Score (EIS) and per-asset Exposure Score.
  4. Visualize — A Flask web server exposes:

    • An Engineer Dashboard — interactive vulnerability table with filters, search, and pagination
    • An Executive Dashboard — summary cards, severity breakdown bar, and top risky assets
    • A REST API — programmatic access to all data

Prerequisites

  • Python 3.9+ (tested with 3.11)
  • pip package manager
  • Internet access (for enrichment API calls; optional if using cached data)

Installation

# Clone the repository
git clone https://github.com/pizn-01/parse.git
cd parse/python

# Install dependencies
pip install -r requirements.txt

Dependencies installed:

Package      Version   Purpose
jsonschema   ≥ 4.0     JSON schema validation for parser output
requests     ≥ 2.28    HTTP client for KEV/EPSS/NVD/Exploit-DB APIs
flask        ≥ 3.0     REST API server & dashboard hosting
sqlalchemy   ≥ 2.0     ORM & SQLite database layer
pytest       ≥ 7.0     Test runner

Usage Guide

All commands are run from the python/ directory:

cd parse/python

Step 1 — Parse a Scan Report

Convert your scanner's XML output into normalized JSON.

OpenVAS:

python parse_openvas.py \
  --input scan_report.xml \
  --kev sample_kev.csv \
  --epss sample_epss.csv \
  --output openvas_output.json

Nessus:

python parse_nessus.py \
  --input scan_report.nessus \
  --kev sample_kev.csv \
  --epss sample_epss.csv \
  --output nessus_output.json

Argument      Required   Description
--input, -i   Yes        Path to the scanner XML file
--kev         No         CSV file mapping CVE IDs to KEV status
--epss        No         CSV file with EPSS scores (header: cve,epss)
--output, -o  No         Output file path (default: <scanner>_output.json)

Note: If you don't have KEV/EPSS CSV files, the parsers will still work — those fields will simply be null. The enrichment worker (Step 3) can populate them later from live APIs.
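
For reference, here is a minimal sketch of the documented cve,epss CSV format and how a parser might load it into a lookup table. The dict-building code is illustrative, not the project's actual implementation:

```python
import csv
import io

# Minimal EPSS CSV in the documented "cve,epss" format.
epss_csv = """cve,epss
CVE-2023-12345,0.87
CVE-2021-44228,0.97
"""

# Build a CVE -> EPSS score lookup table from the CSV rows.
epss_scores = {
    row["cve"]: float(row["epss"])
    for row in csv.DictReader(io.StringIO(epss_csv))
}

print(epss_scores["CVE-2023-12345"])  # 0.87
```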

Step 2 — Ingest Into Database

Load the parsed JSON into the SQLite database:

python db.py --input openvas_output.json

What happens:

  • Creates vulndb.db in the current directory (if it doesn't exist)
  • Creates the scans, assets, and vulnerabilities tables
  • Inserts all scan data
  • If a scan with the same scan_id already exists → skipped (duplicate protection)

You can ingest multiple scan files:

python db.py --input openvas_output.json
python db.py --input nessus_output.json
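
The duplicate protection can be illustrated with a small stdlib sketch. The project itself uses SQLAlchemy, but the idea is the same: scan_id acts as a unique key, so ingesting the same scan twice inserts it only once:

```python
import sqlite3

# In-memory stand-in for vulndb.db; illustrative only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE scans (scan_id TEXT PRIMARY KEY, scan_start TEXT)")

def ingest(scan_id, scan_start):
    # INSERT OR IGNORE skips rows whose scan_id already exists.
    cur = conn.execute(
        "INSERT OR IGNORE INTO scans (scan_id, scan_start) VALUES (?, ?)",
        (scan_id, scan_start),
    )
    return "ingested" if cur.rowcount == 1 else "duplicate"

print(ingest("scan-12345", "2024-01-15T10:30:00Z"))  # ingested
print(ingest("scan-12345", "2024-01-15T10:30:00Z"))  # duplicate
```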

Step 3 — Enrich With Threat Intelligence

Run the enrichment worker to pull live data from threat intelligence APIs:

python db_enrichment_worker.py --cache-dir cache

What happens:

  1. Reads all CVE IDs from the database
  2. Fetches/refreshes from CISA KEV feed (known exploited vulnerabilities)
  3. Fetches/refreshes EPSS scores (exploitation probability)
  4. Fetches/refreshes CVSS scores from NVD (severity)
  5. Checks Exploit-DB for public exploits
  6. Updates every vulnerability row with fresh scores
  7. Recalculates the Exposure Imminence Score (EIS) for each vulnerability
  8. Recalculates the Asset Exposure Score for each host

Argument      Default   Description
--cache-dir   cache/    Directory for caching API responses
--force       false     Force-refresh all caches (ignores freshness)

Caching: API responses are cached locally with a 24-hour freshness window. Subsequent runs only fetch new data if the cache is stale. Use --force to bypass caching.
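
A minimal sketch of this freshness check, assuming the cache is keyed by file modification time (the function and constant names here are illustrative, not the worker's actual API):

```python
import os
import time

FRESHNESS_SECONDS = 24 * 60 * 60  # the 24-hour window described above

def is_stale(path, force=False, now=None):
    """Return True when a cached response should be re-fetched."""
    if force or not os.path.exists(path):
        return True
    age = (now or time.time()) - os.path.getmtime(path)
    return age > FRESHNESS_SECONDS

# A missing cache file is always stale; force=True (--force) bypasses the check.
print(is_stale("cache/kev.json", force=True))  # True
```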

Step 4 — Launch Dashboard

python api.py

The server starts at http://localhost:5000 with these pages:

URL                               Description
http://localhost:5000/            Engineer Dashboard — vulnerability table with filters
http://localhost:5000/executive   Executive Dashboard — summary cards and top risky assets

Web Dashboards

Engineer Dashboard (/)

A full-featured vulnerability table designed for security engineers:

  • Search — filter by CVE ID or vulnerability title
  • Severity Filter — show only Critical, High, Medium, or Low
  • KEV Filter — show only known-exploited vulnerabilities
  • Asset IP Filter — narrow down to a specific host
  • Pagination — 50 results per page with page navigation
  • Live data — all data fetched in real-time from the REST API

Columns displayed: CVE ID, Title, Severity, CVSS, EPSS, EIS Score, KEV status, Public Exploit status, Asset IP

Executive Dashboard (/executive)

A high-level summary view for management and stakeholders:

  • Summary Cards — Total assets, total vulnerabilities, counts by severity, exploited count, KEV count, average EPSS
  • Severity Distribution Bar — visual proportional bar showing critical/high/medium/low breakdown
  • Top Risky Assets — table of the most exposed hosts ranked by exposure score

REST API Reference

All endpoints return JSON. Base URL: http://localhost:5000

GET /api/vulnerabilities

Paginated list of all vulnerabilities with optional filters.

Query Param   Type     Example        Description
severity      string   critical       Filter by severity level
asset_ip      string   192.168.1.10   Filter by asset IP
is_kev        bool     true           Filter by KEV status
search        string   CVE-2023       Search in CVE ID or title
page          int      1              Page number (default: 1)
per_page      int      50             Results per page (default: 50)

Response:

{
  "total": 142,
  "page": 1,
  "per_page": 50,
  "data": [
    {
      "id": 1,
      "vulnerability_id": "CVE-2023-12345",
      "title": "Critical Remote Code Execution",
      "severity": "critical",
      "cvss_score": 9.8,
      "epss_score": 0.87,
      "eis_score": 92.0,
      "is_kev": true,
      "is_exploited": true,
      "has_public_exploit": true,
      "asset_ip": "192.168.1.100",
      "asset_hostname": "server-01",
      "description": "...",
      "remediation": "Upgrade to version 2.4.52 or later"
    }
  ]
}
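
The documented filters can be composed into a request URL with the standard library; the server from Step 4 must be running before the URL actually returns data:

```python
from urllib.parse import urlencode

BASE = "http://localhost:5000/api/vulnerabilities"

# Combine several documented query parameters into one request URL.
params = {"severity": "critical", "is_kev": "true", "page": 1, "per_page": 50}
url = f"{BASE}?{urlencode(params)}"

print(url)
# http://localhost:5000/api/vulnerabilities?severity=critical&is_kev=true&page=1&per_page=50
```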

GET /api/assets

List all assets sorted by exposure score (descending).

Response:

{
  "total": 5,
  "data": [
    {
      "id": 1,
      "asset_id": "asset_001",
      "ip": "192.168.1.100",
      "hostname": "server-01",
      "os": "Linux",
      "exposure_score": 78.5,
      "vuln_count": 12
    }
  ]
}

GET /api/assets/<id>

Single asset detail including all its vulnerabilities.

GET /api/summary

Executive summary computed from live database data.

Response:

{
  "total_assets": 5,
  "total_vulnerabilities": 142,
  "by_severity": { "critical": 8, "high": 23, "medium": 67, "low": 44 },
  "exploited_vulns": 12,
  "kev_vulns": 15,
  "avg_epss": 0.3421,
  "top_risky_assets": [ ... ]
}

POST /api/ingest

Upload and ingest a parsed scan JSON directly via API.

Request: Send JSON body or multipart file upload (field: file).

# Using curl
curl -X POST http://localhost:5000/api/ingest \
  -H "Content-Type: application/json" \
  -d @openvas_output.json

Response:

{ "status": "ingested", "scan_id": "scan-12345" }    // 201 Created
{ "status": "duplicate", "scan_id": "scan-12345" }   // 200 OK (already exists)

CLI Tools Reference

All tools support --help for detailed usage.

Tool                    Command                                                 Description
OpenVAS Parser          python parse_openvas.py -i scan.xml -o out.json         Convert OpenVAS XML to normalized JSON
Nessus Parser           python parse_nessus.py -i scan.nessus -o out.json       Convert Nessus XML to normalized JSON
DB Ingestion            python db.py --input out.json                           Load parsed JSON into SQLite database
Enrichment Worker       python db_enrichment_worker.py --cache-dir cache        Enrich DB with live KEV/EPSS/NVD/Exploit-DB
API Server              python api.py                                           Start Flask dashboard on port 5000
Executive Summary       python executive_summary.py -i out.json -f table        CLI table/JSON summary (no server needed)
Standalone Enrichment   python enrichment.py --cves CVE-2023-1234               Fetch enrichment data for specific CVEs
Deduplication           python dedup.py -e old.json -n new.json -o merged.json  Merge two scan JSONs, removing duplicates
Node.js Parser          node index.js --input scan.xml --output out.json        Node.js alternative for OpenVAS parsing

Project Structure

parse/
├── .gitignore
├── README.md
├── python/
│   ├── requirements.txt          # Python dependencies
│   ├── schema.json               # JSON schema for normalized output
│   ├── sample_openvas.xml        # Sample OpenVAS report for testing
│   ├── sample_kev.csv            # Sample KEV data
│   ├── sample_epss.csv           # Sample EPSS data
│   │
│   ├── parse_openvas.py          # OpenVAS XML → normalized JSON
│   ├── parse_nessus.py           # Nessus XML → normalized JSON
│   ├── utils.py                  # Shared utilities (severity, EIS, exposure scoring)
│   ├── enrichment.py             # KEV/EPSS/NVD/Exploit-DB connectors + caching
│   ├── dedup.py                  # Scan deduplication & merge
│   │
│   ├── db.py                     # SQLAlchemy models + ingestion pipeline
│   ├── db_enrichment_worker.py   # DB enrichment worker
│   ├── api.py                    # Flask REST API + dashboard server
│   ├── executive_summary.py      # CLI executive summary generator
│   │
│   ├── templates/
│   │   ├── dashboard.html        # Engineer dashboard (vuln table + filters)
│   │   └── executive.html        # Executive dashboard (summary cards)
│   │
│   ├── tests/
│   │   ├── test_parser_schema.py # Parser output & schema validation (25 tests)
│   │   ├── test_db.py            # Database layer tests (12 tests)
│   │   ├── test_api.py           # REST API endpoint tests (17 tests)
│   │   ├── test_dedup.py         # Deduplication logic tests (11 tests)
│   │   ├── test_enrichment.py    # Enrichment & EIS scoring tests (13 tests)
│   │   └── test_executive_summary.py  # Executive summary tests (9 tests)
│   │
│   ├── cache/                    # Auto-created: cached API responses
│   └── vulndb.db                 # Auto-created: SQLite database
│
└── node/
    ├── package.json
    └── index.js                  # Node.js OpenVAS parser

Output Schema

Both parsers produce identical JSON conforming to schema.json:

{
  "scan_metadata": {
    "scan_id": "scan-12345",
    "scan_start": "2024-01-15T10:30:00Z",
    "scan_end": "2024-01-15T11:45:00Z"
  },
  "assets": [
    {
      "asset_id": "asset_001",
      "ip": "192.168.1.100",
      "hostname": "server-01",
      "os": "Linux",
      "exposure_score": 78.5,
      "vulnerabilities_by_severity": {
        "critical": [ { "vulnerability_id": "CVE-2023-12345", "cvss_score": 9.8, ... } ],
        "high": [],
        "medium": [ ... ],
        "low": [ ... ]
      }
    }
  ]
}
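
As a quick sanity check before ingesting, the top-level shape above can be verified with plain Python. schema.json remains the authoritative contract; this helper is only an illustrative sketch:

```python
REQUIRED_TOP = {"scan_metadata", "assets"}
REQUIRED_META = {"scan_id", "scan_start", "scan_end"}
SEVERITIES = {"critical", "high", "medium", "low"}

def looks_normalized(doc):
    """Cheap structural check mirroring the shape above (not full schema validation)."""
    if not REQUIRED_TOP <= doc.keys():
        return False
    if not REQUIRED_META <= doc["scan_metadata"].keys():
        return False
    # Every asset must carry all four severity buckets.
    return all(
        SEVERITIES <= asset.get("vulnerabilities_by_severity", {}).keys()
        for asset in doc["assets"]
    )

doc = {
    "scan_metadata": {"scan_id": "scan-12345",
                      "scan_start": "2024-01-15T10:30:00Z",
                      "scan_end": "2024-01-15T11:45:00Z"},
    "assets": [{"asset_id": "asset_001",
                "vulnerabilities_by_severity":
                    {"critical": [], "high": [], "medium": [], "low": []}}],
}
print(looks_normalized(doc))  # True
```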

Each vulnerability record includes:

Field                Type             Description
vulnerability_id     string           CVE ID or scanner OID
nvt_oid              string           Scanner-specific test identifier
title                string           Human-readable vulnerability name
cvss_score           float | null     CVSS v3.x base score (0–10)
epss_score           float | null     EPSS exploitation probability (0–1)
eis_score            float | null     Exposure Imminence Score (0–100)
is_kev               boolean          Listed in CISA KEV catalog
is_exploited         boolean          Known to be actively exploited
has_public_exploit   boolean | null   Public exploit exists on Exploit-DB
description          string           Vulnerability description
remediation          string           Recommended fix or patch
affected_software    string           Impacted software/version
qod                  integer          Quality of detection (scanner confidence %)

Scoring Models

Exposure Imminence Score (EIS)

Per-vulnerability risk score combining three signals:

EIS = ( 0.5 × EPSS + 0.3 × CVSS/10 + 0.2 × KEV_flag ) × 100

Factor   Weight   Range                      Source
EPSS     50%      0–1                        FIRST.org API
CVSS     30%      0–10 (normalized to 0–1)   NVD API
KEV      20%      0 or 1                     CISA KEV catalog

Result: 0–100 score. Higher = more imminent threat.
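
The formula translates directly into a small function. Treating a missing EPSS or CVSS score as 0 is an assumption here, not documented behavior:

```python
def eis(epss, cvss, is_kev):
    """Exposure Imminence Score: (0.5*EPSS + 0.3*CVSS/10 + 0.2*KEV) * 100."""
    epss = epss or 0.0  # assumption: a null score contributes nothing
    cvss = cvss or 0.0
    return (0.5 * epss + 0.3 * cvss / 10 + 0.2 * (1 if is_kev else 0)) * 100

# A KEV-listed finding with EPSS 0.87 and CVSS 9.8:
print(round(eis(0.87, 9.8, True), 1))  # 92.9
```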

Asset Exposure Score

Per-asset aggregate risk combining EIS breadth and depth:

Asset Score = (avg_EIS) × log₂(vuln_count + 1)

The logarithmic weighting rewards breadth of exposure — an asset with many moderate-risk vulnerabilities scores higher than one with a single moderate finding. Capped at 100.
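
A sketch of this aggregate (scoring an asset with no vulnerabilities as 0 is an assumption):

```python
import math

def asset_exposure(eis_scores):
    """avg(EIS) * log2(n + 1), capped at 100."""
    if not eis_scores:
        return 0.0  # assumption: no findings means no exposure
    avg = sum(eis_scores) / len(eis_scores)
    return min(100.0, avg * math.log2(len(eis_scores) + 1))

print(asset_exposure([30.0]))              # 30.0  (log2(2) = 1, no amplification)
print(asset_exposure([60.0, 60.0, 60.0]))  # 100.0 (60 * log2(4) = 120, capped)
```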


Data Sources

Source       URL                                                Data Provided
CISA KEV     cisa.gov/known-exploited-vulnerabilities-catalog   Known exploited vulnerability catalog
FIRST EPSS   api.first.org/data/v1/epss                         Exploitation probability scores
NVD          services.nvd.nist.gov/rest/json/cves/2.0           CVSS base scores, vulnerability metadata
Exploit-DB   exploit-db.com                                     Public exploit availability

All API responses are cached locally in the cache/ directory with a 24-hour freshness window. Use --force to bypass cache.


Testing

Run the full test suite (87 tests):

cd python
python -m pytest tests/ -v

Test File                   Tests   What It Covers
test_parser_schema.py       25      Both parsers produce valid JSON against schema.json; cross-parser consistency
test_db.py                  12      Table creation, scan ingestion, duplicate rejection, query filters
test_api.py                 17      All REST endpoints, HTML pages, ingestion + duplicate handling
test_dedup.py               11      Vulnerability dedup, scan merging, asset matching
test_enrichment.py          13      EIS calculation, cache freshness, enrichment connectors
test_executive_summary.py   9       Summary metrics (severity counts, top risky assets, EPSS avg)

Run a specific test file:

python -m pytest tests/test_api.py -v

Configuration & Environment Variables

Variable       Default            Description
VULN_DB_PATH   python/vulndb.db   Path to the SQLite database file

The database path can also be set via CLI: python db.py --input scan.json --db /path/to/custom.db

Cache directory is configured per-run via --cache-dir (default: cache/).
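
The resolution order sketched below (explicit --db flag first, then VULN_DB_PATH, then the documented default) is an assumption based on the text above, not verified against db.py:

```python
import os

def resolve_db_path(cli_db=None):
    """Pick the database path: CLI flag wins, then the env var, then the default."""
    return cli_db or os.environ.get("VULN_DB_PATH", "python/vulndb.db")

print(resolve_db_path())                  # python/vulndb.db (when the env var is unset)
print(resolve_db_path("/tmp/custom.db"))  # /tmp/custom.db
```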


Branching Strategy

We follow a lightweight Git branching workflow:

Branch           Purpose
main             Stable, production-ready code. Only merge from develop after testing.
develop          Integration branch for current sprint work
feature/<name>   Individual feature branches (e.g., feature/db-pipeline, feature/dashboard)
bugfix/<name>    Bug fix branches

Workflow:

  1. Create a feature branch from develop: git checkout -b feature/my-feature develop
  2. Work on the feature, commit frequently with descriptive messages
  3. Open a Pull Request to merge into develop
  4. After sprint review and CI passes, merge develop into main

Lab Environment Setup

To set up vulnerable targets for running real scans:

Option A: Docker Compose (Recommended)

Create a docker-compose.lab.yml:

version: '3.8'
services:
  dvwa:
    image: vulnerables/web-dvwa
    ports:
      - "8080:80"
    environment:
      MYSQL_PASS: dvwa
    restart: unless-stopped

  metasploitable:
    image: tleemcjr/metasploitable2
    ports:
      - "2222:22"
      - "8081:80"
      - "445:445"
    restart: unless-stopped
docker-compose -f docker-compose.lab.yml up -d

Targets will be available at:

  • DVWA: http://localhost:8080 (default creds: admin / password)
  • Metasploitable: ssh -p 2222 msfadmin@localhost (creds: msfadmin / msfadmin)

Option B: Manual VM Setup

  1. Metasploitable 2 — Download from SourceForge. Import .vmdk into VirtualBox/VMware with a Host-only network adapter.
  2. DVWA — Download from GitHub. Deploy on any LAMP/WAMP stack and configure config/config.inc.php.

End-to-End Lab Workflow

# 1. Scan the lab targets with OpenVAS/Nessus
#    Export results as XML

# 2. Parse → Ingest → Enrich → View
cd python
python parse_openvas.py -i lab_scan.xml -o lab_output.json
python db.py --input lab_output.json
python db_enrichment_worker.py
python api.py

# 3. Open http://localhost:5000/ in your browser

License

This project was developed as part of a university cybersecurity course.
