Metadata-only CVE-to-PoC collection and a public sanitized dashboard for tracking the gap between CVE publication, PoC availability, and confirmed exploitation signals.
- The collector stores public metadata only.
- It does not clone, download, mirror, execute, or archive PoC repositories.
- Public dashboard links go directly to external GitHub repositories and are labeled as untrusted.
- Raw code snippets, workflow logs, API tokens, and internal analyst notes are not written to api/.
- Local hooks and GitHub Actions run a strict sensitive-data scan before commits, pushes, deploys, and collector-generated API updates.
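The metadata-only rule above can be illustrated with a field allowlist. This is a minimal sketch, not the collector's actual code: `ALLOWED_FIELDS`, `sanitize_repo`, and the `untrusted` flag are hypothetical names. The field names themselves are ones returned by GitHub's repository search API.

```python
# Hypothetical allowlist: only these public metadata fields are kept.
ALLOWED_FIELDS = (
    "full_name",
    "html_url",
    "description",
    "stargazers_count",
    "created_at",
    "pushed_at",
)


def sanitize_repo(item: dict) -> dict:
    """Return a new record holding only allowlisted metadata fields.

    Anything else in `item` (clone URLs, file contents, and so on) is
    dropped, so it can never reach a published artifact.
    """
    record = {key: item.get(key) for key in ALLOWED_FIELDS}
    # Assumption: the dashboard uses a flag like this to label external links.
    record["untrusted"] = True
    return record
```

An allowlist fails closed: a field the upstream API adds later is excluded until someone decides to publish it.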
- CVE List V5 for CVE records.
- NVD 2.0 for publication windows, CVSS, CWE, and CPE enrichment.
- CISA KEV for confirmed known-exploited status.
- FIRST EPSS for exploit probability and percentile.
- GitHub Search for repository metadata matching CVE IDs.
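As an illustration of how two of these sources can be queried, the sketch below builds request URLs for an NVD 2.0 publication window and a FIRST EPSS lookup. The helper names are hypothetical and the collector may build its requests differently. The endpoints, the `pubStartDate`/`pubEndDate` and `cve` parameters, and NVD's 120-day limit on date ranges come from the public API documentation. Sending timestamps in UTC without an explicit offset is an assumption of this sketch.

```python
from datetime import datetime, timedelta, timezone
from urllib.parse import urlencode

NVD_CVE_API = "https://services.nvd.nist.gov/rest/json/cves/2.0"
EPSS_API = "https://api.first.org/data/v1/epss"
MAX_NVD_WINDOW_DAYS = 120  # NVD rejects date ranges longer than this


def nvd_window_url(days: int, now: datetime) -> str:
    """Build an NVD 2.0 URL for CVEs published in the last `days` days."""
    if not 0 < days <= MAX_NVD_WINDOW_DAYS:
        raise ValueError(f"days must be between 1 and {MAX_NVD_WINDOW_DAYS}")
    end = now.astimezone(timezone.utc)
    start = end - timedelta(days=days)
    fmt = "%Y-%m-%dT%H:%M:%S.000"  # extended ISO-8601, as in the NVD examples
    params = {"pubStartDate": start.strftime(fmt), "pubEndDate": end.strftime(fmt)}
    return f"{NVD_CVE_API}?{urlencode(params)}"


def epss_url(cve_id: str) -> str:
    """Build a FIRST EPSS URL for a single CVE ID."""
    return f"{EPSS_API}?{urlencode({'cve': cve_id})}"
```

A window longer than 120 days would have to be split into several requests, which this sketch does not attempt.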
- api/index.json: totals, source freshness, top timeline metrics.
- api/cves/{CVE-ID}.json: normalized CVE detail, PoC candidates, and timelines.
- api/pocs/latest.json: latest medium/high confidence PoC candidates.
- api/timelines/summary.json: leadership timeline metrics and research table rows.
- api/sources/status.json: source health, record counts, and collection errors.
Schema references live in schemas/index.schema.json and schemas/cve.schema.json.
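The timeline metrics in these artifacts measure gaps between dates. The sketch below shows one way such gaps could be derived. The field names (`published`, `first_poc_seen`, `kev_added`) and the output keys are hypothetical; the real field names are defined in the schema files.

```python
from datetime import date
from typing import Optional


def gap_days(start: Optional[str], end: Optional[str]) -> Optional[int]:
    """Days from `start` to `end` (ISO dates), or None if either is missing."""
    if not start or not end:
        return None
    return (date.fromisoformat(end) - date.fromisoformat(start)).days


def timeline_gaps(record: dict) -> dict:
    """Compute publication-to-PoC and publication-to-KEV gaps for one CVE."""
    published = record.get("published")
    return {
        "publication_to_poc_days": gap_days(published, record.get("first_poc_seen")),
        "publication_to_kev_days": gap_days(published, record.get("kev_added")),
    }
```

Returning `None` rather than 0 for a missing date keeps "no PoC observed" distinct from "PoC on publication day". A negative value means the event preceded publication.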
Run tests:
python -m unittest discover -s tests
Run a light metadata refresh:
python scripts/collect.py --mode light --days 14 --max-cves 40
Run a fuller refresh:
python scripts/collect.py --mode full --days 120 --max-cves 200 --github-pages 2
Run all recent CVE enrichment without GitHub PoC searching:
python scripts/collect.py --mode light --days 30 --max-cves 0 --github-pages 0
Index the 2026 publication year while limiting GitHub PoC candidate searches:
python scripts/collect.py --mode full --year 2026 --max-cves 0 --github-pages 1 --poc-search-limit 250 --stale-poc-days 30
Run a bounded daily-style refresh for newly published CVEs:
python scripts/collect.py --mode light --days 3 --max-cves 300 --github-pages 1 --poc-search-limit 120 --stale-poc-days 1
Refresh an explicit CVE:
python scripts/collect.py --mode light --days 0 --cve CVE-2024-6387 --max-cves 1
Optional environment variables:
- GITHUB_TOKEN: increases GitHub API/search limits.
- NVD_API_KEY: improves NVD API throughput.
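A minimal sketch of how these optional variables might be turned into request headers. The function names are hypothetical and this is not necessarily how the collector does it. The header formats are the documented ones: GitHub accepts `Authorization: Bearer <token>`, and NVD reads its key from an `apiKey` header.

```python
import os
from typing import Mapping


def github_headers(env: Mapping[str, str] = os.environ) -> dict:
    """Headers for GitHub REST calls; authenticated only if a token is set."""
    headers = {"Accept": "application/vnd.github+json"}
    token = env.get("GITHUB_TOKEN")
    if token:
        headers["Authorization"] = f"Bearer {token}"
    return headers


def nvd_headers(env: Mapping[str, str] = os.environ) -> dict:
    """Headers for NVD 2.0 calls; empty when no API key is configured."""
    key = env.get("NVD_API_KEY")
    return {"apiKey": key} if key else {}
```

Both helpers fall back to unauthenticated requests when the variable is unset, which matches the variables being optional.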
Install the repository's local git hooks:
git config core.hooksPath .githooks
The pre-commit hook scans staged files. The pre-push hook scans tracked and untracked non-ignored files. You can run the same scanner manually:
python scripts/scan_sensitive.py --all
python scripts/scan_sensitive.py --history
Open the dashboard through a static server:
python -m http.server 8000
Then visit http://localhost:8000.
- config/blacklist.yml: owners, repos, and keywords that should reduce confidence.
- config/allowlist.yml: trusted public research owners/repos/domains that receive a modest confidence boost.
- config/seeds.txt: optional CVE IDs or URLs containing CVE IDs to refresh outside the recent NVD window.
The YAML parser intentionally supports only simple top-level lists so the collector has no runtime dependency on PyYAML.
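To show what "simple top-level lists" can mean in practice, here is a rough sketch of a dependency-free parser for that subset. It is an illustration, not the project's parser: the function name `parse_simple_lists` and the exact accepted syntax (a `key:` line followed by `- item` lines, with `#` comments) are assumptions.

```python
def parse_simple_lists(text: str) -> dict:
    """Parse `key:` lines followed by `- item` lines into {key: [items]}.

    Only this subset is accepted; anything else raises ValueError.
    Note: `#` always starts a comment, so values cannot contain `#`.
    """
    data = {}
    current = None
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].rstrip()
        stripped = line.strip()
        if not stripped:
            continue  # blank line or comment-only line
        if stripped.startswith("- "):
            if current is None:
                raise ValueError(f"list item before any key: {raw!r}")
            # Drop optional surrounding quotes from the item.
            data[current].append(stripped[2:].strip().strip("'\""))
        elif stripped.endswith(":") and not line[0].isspace():
            current = stripped[:-1].strip()
            data[current] = []
        else:
            raise ValueError(f"unsupported line: {raw!r}")
    return data
```

Rejecting unsupported lines, rather than silently skipping them, makes a config file that drifts beyond the subset fail visibly.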
.github/workflows/collect.yml runs:
- Bounded recent-CVE refreshes after pushes to collector/config/test code.
- Daily new-CVE refresh.
- Weekly 2026 backfill with a bounded GitHub PoC candidate-search budget.
- Tests before collection.
- Sensitive-data scans before collection and before committing generated api/ artifacts.
- Commits changed api/ artifacts back to the repository.
- Deploys the refreshed static site to GitHub Pages after collection.
- Restores a local HTTP cache between runs to reduce pressure on NVD, CVE List V5, FIRST, CISA, and GitHub APIs.
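The HTTP cache mentioned in the last item could work roughly as sketched below: a file-backed cache keyed by a hash of the URL, with a maximum age. The names (`cached_fetch`, `max_age_seconds`) and the on-disk layout are hypothetical. The `fetch` callable is injected so the sketch itself performs no network access.

```python
import hashlib
import json
import time
from pathlib import Path
from typing import Callable


def cached_fetch(
    url: str,
    fetch: Callable[[str], str],
    cache_dir: Path,
    max_age_seconds: float = 3600,
    now: Callable[[], float] = time.time,
) -> str:
    """Return the body for `url`, reusing a cached copy while it is fresh."""
    cache_dir.mkdir(parents=True, exist_ok=True)
    key = hashlib.sha256(url.encode("utf-8")).hexdigest()
    path = cache_dir / f"{key}.json"
    if path.exists():
        entry = json.loads(path.read_text(encoding="utf-8"))
        if now() - entry["fetched_at"] <= max_age_seconds:
            return entry["body"]  # fresh enough: skip the upstream call
    body = fetch(url)
    entry = {"url": url, "fetched_at": now(), "body": body}
    path.write_text(json.dumps(entry), encoding="utf-8")
    return body
```

Keeping the cache as plain files in one directory is what lets a CI cache step restore it between runs.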
.github/workflows/pages.yml deploys the static dashboard to GitHub Pages on pushes to main.
.github/workflows/security.yml scans repository files and reachable git history on push, pull request, and manual runs.
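For a sense of what a sensitive-data scan checks, here is a simplified sketch of a line-based pattern scan. It does not reproduce scripts/scan_sensitive.py. The pattern set and the `scan_text` name are assumptions, and a stricter scanner would cover many more cases. The token shapes shown (GitHub `ghp_`-style tokens, AWS `AKIA` key IDs, PEM private-key headers) are widely documented formats.

```python
import re

# Illustrative patterns only; a real scanner would carry a longer list.
PATTERNS = {
    "github_token": re.compile(r"\bgh[pousr]_[A-Za-z0-9]{36}\b"),
    "aws_access_key_id": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    "private_key": re.compile(r"-----BEGIN [A-Z ]*PRIVATE KEY-----"),
}


def scan_text(text: str) -> list:
    """Return (line_number, pattern_name) for every suspicious line."""
    findings = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for name, pattern in PATTERNS.items():
            if pattern.search(line):
                findings.append((lineno, name))
    return findings
```

A hook or workflow step would call something like this on each file and exit non-zero when any finding is returned.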
For private repositories, GitHub-hosted Actions minutes and storage may be billed once plan quotas are exhausted. Broad GitHub searches can also hit API rate limits; keep --max-cves and --github-pages conservative unless you are using an appropriate token.
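A rough way to judge whether a run fits within GitHub's search rate limits is sketched below. It assumes one search request per CVE per result page, which may not match the collector's actual request pattern, and the function name is hypothetical. The per-minute limits used are GitHub's documented search limits: 30 requests with a token, 10 without.

```python
import math
from typing import Optional

# Documented GitHub search API limits, in requests per minute.
SEARCH_LIMIT_PER_MINUTE = {"authenticated": 30, "anonymous": 10}


def estimate_search_minutes(
    cve_count: int,
    github_pages: int,
    authenticated: bool,
    poc_search_limit: Optional[int] = None,
) -> int:
    """Lower-bound minutes of search traffic for a run.

    Assumes one request per searched CVE per page (an assumption of this
    sketch); `poc_search_limit` caps how many CVEs are searched at all.
    """
    searched = cve_count if poc_search_limit is None else min(cve_count, poc_search_limit)
    requests = searched * github_pages
    limit = SEARCH_LIMIT_PER_MINUTE["authenticated" if authenticated else "anonymous"]
    return math.ceil(requests / limit)
```

Under these assumptions, 200 CVEs at 2 pages each is 400 requests: at least 14 minutes with a token and 40 without.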
For the public site, prefer a single Pages deployment from the source repository. The temporary evilbotnet.github.io repo is no longer required once ilovemalware.com is serving this repository's GitHub Pages deployment.