Check whether a CVE or GHSA actually affects your application code — not just whether a dependency appears in a lockfile.
- Go backend with embedded React UI
- OSV / NVD for advisory data
- Static analysis for imports and vulnerable API usage
- OpenAI-compatible LLM for final impact judgment
- Go 1.22+
- Node.js 18+ (build UI once)
- A local copy of the repo to scan (download or extract yourself; the tool does not clone remotes)
cp .env.example .env
# Edit REPO_PATH, REPO_BRANCH, LLM_API_KEY, LLM_MODEL
make build
make run

Open http://localhost:8080, enter a CVE or GHSA ID (e.g. CVE-2025-64718), and click Analyze.
The UI supports English and Русский (language switcher in the top-right).
Build and run with your repository mounted at /repo:
cp .env.example .env
# Set LLM_API_KEY, REPO_BRANCH, etc. REPO_PATH is overridden in compose to /repo
export REPO_HOST_PATH=/absolute/path/to/your/project
docker compose up --build

Or build the image only:
docker build -t cve-analyzer .
docker run --rm -p 8080:8080 --env-file .env \
-e REPO_PATH=/repo \
-v /absolute/path/to/your/project:/repo:ro \
cve-analyzer

git is included in the image for optional branch checkout when .git exists in the mounted repo.
REPO_PATH=/path/to/your/project
REPO_BRANCH=main
LLM_BASE_URL=https://api.openai.com/v1
LLM_API_KEY=sk-...
LLM_MODEL=gpt-4o-mini

Corporate gateways: set LLM_BASE_URL to your OpenAI-compatible endpoint.
- Point REPO_PATH at the project root (must contain go.mod, package.json, etc.).
- Enter CVE-… or GHSA-….
- Read the results table: affected?, locations, explanation, fix version.
- Progress logs (SSE) show each pipeline step.
CLI test (server running):
./scripts/test-analyze.sh CVE-2025-64718

| Variable | Required | Description |
|---|---|---|
| `REPO_PATH` | Yes | Absolute path to local repository |
| `REPO_BRANCH` | Yes | Branch name (git switch if .git exists) |
| `LLM_BASE_URL` | No | Default https://api.openai.com/v1 |
| `LLM_API_KEY` | Yes | API key / bearer token |
| `LLM_MODEL` | No | Default gpt-4o-mini |
| `LLM_TIMEOUT_SEC` | No | Default 120 |
| `EXCLUDE_DIRS` | No | Comma-separated dirs to skip when scanning |
| `SERVER_ADDR` | No | Default :8080 |
| `MAX_SNIPPET_FILES` | No | Max files sent as snippets to LLM (default 20) |
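
As an illustration of how the variables in the table could be read with their documented defaults, here is a minimal Go sketch. The `Config` struct, its field names, and `LoadConfig` are hypothetical and are not taken from the tool's source; only the variable names, required flags, and default values come from the table.

```go
package main

import (
	"errors"
	"fmt"
	"os"
	"strconv"
	"strings"
)

// Config mirrors the variables in the table. The struct and its field
// names are hypothetical, not the tool's actual types.
type Config struct {
	RepoPath        string
	RepoBranch      string
	LLMBaseURL      string
	LLMAPIKey       string
	LLMModel        string
	LLMTimeoutSec   int
	ExcludeDirs     []string
	ServerAddr      string
	MaxSnippetFiles int
}

// LoadConfig reads settings through getenv so it can be exercised
// without touching the real process environment.
func LoadConfig(getenv func(string) string) (Config, error) {
	str := func(key, def string) string {
		if v := strings.TrimSpace(getenv(key)); v != "" {
			return v
		}
		return def
	}
	num := func(key string, def int) (int, error) {
		v := strings.TrimSpace(getenv(key))
		if v == "" {
			return def, nil
		}
		n, err := strconv.Atoi(v)
		if err != nil || n <= 0 {
			return 0, fmt.Errorf("%s must be a positive integer, got %q", key, v)
		}
		return n, nil
	}

	cfg := Config{
		RepoPath:   str("REPO_PATH", ""),
		RepoBranch: str("REPO_BRANCH", ""),
		LLMBaseURL: str("LLM_BASE_URL", "https://api.openai.com/v1"),
		LLMAPIKey:  str("LLM_API_KEY", ""),
		LLMModel:   str("LLM_MODEL", "gpt-4o-mini"),
		ServerAddr: str("SERVER_ADDR", ":8080"),
	}
	if cfg.RepoPath == "" || cfg.RepoBranch == "" || cfg.LLMAPIKey == "" {
		return Config{}, errors.New("REPO_PATH, REPO_BRANCH and LLM_API_KEY are required")
	}
	var err error
	if cfg.LLMTimeoutSec, err = num("LLM_TIMEOUT_SEC", 120); err != nil {
		return Config{}, err
	}
	if cfg.MaxSnippetFiles, err = num("MAX_SNIPPET_FILES", 20); err != nil {
		return Config{}, err
	}
	// EXCLUDE_DIRS is comma-separated; empty entries are ignored.
	for _, d := range strings.Split(getenv("EXCLUDE_DIRS"), ",") {
		if d = strings.TrimSpace(d); d != "" {
			cfg.ExcludeDirs = append(cfg.ExcludeDirs, d)
		}
	}
	return cfg, nil
}

func main() {
	cfg, err := LoadConfig(os.Getenv)
	if err != nil {
		fmt.Println("config error:", err)
		return
	}
	fmt.Printf("scanning %s on branch %s via %s\n", cfg.RepoPath, cfg.RepoBranch, cfg.LLMModel)
}
```

Passing `getenv` as a parameter is a design choice of this sketch: it keeps the loader free of global state, so required-variable and default handling can be checked with a plain map.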
| Endpoint | Method | Description |
|---|---|---|
| `/api/v1/config` | GET | Repo path, branch, exclude dirs |
| `/api/v1/analyze` | POST | `{"alias":"CVE-..."}`; returns the JSON result |
| `/api/v1/analyze/stream` | POST | Same body; SSE progress + result |
| `/health` | GET | Liveness |
# Backend only (after make build-frontend once)
go run ./cmd/server
# UI hot reload (proxies API to :8080)
make dev-frontend # terminal 2

make test
make tidy

The analyzer is a six-stage pipeline. The LLM is used only for advisory interpretation, in two calls: parsing the advisory and reasoning about the final verdict. All code scanning is deterministic: your repository content never travels to the LLM as source code, only as a small structured evidence report.
┌─────────────────────────────────────────────────────────────────────────┐
│ 1. FETCH ADVISORY │
│ OSV → fallback OSV aliases → NVD (for CVEs) → enrich GHSA text │
│ Output: domain.Vulnerability { Summary, Details, Affected[], … } │
└─────────────────────────────────────────────────────────────────────────┘
│
┌─────────────────────────────────────────────────────────────────────────┐
│ 2. BUILD ADVISORY EXCERPT (structural, library-agnostic) │
│ • Keeps every H1–H6 section, every fenced code block, every line │
│ with backticked tokens. │
│ • Drops Credits / References / table separators / URL-only lines. │
│ • Caps at ~12k runes for cheap, focused prompts. │
└─────────────────────────────────────────────────────────────────────────┘
│
┌─────────────────────────────────────────────────────────────────────────┐
│ 3. EXTRACT SIGNALS (LLM with JSON-Schema → heuristic backstop) │
│ Schema-constrained output: │
│ signals[]: { name, kind, scan, confidence } │
│ ecosystems[] │
│ guidance: markdown checklist │
│ Kinds: function | method | property | type | path | concept │
│ scan=true ONLY for direct-call CVEs with concrete callable names. │
│ Server-side gate strips: HTTP verbs, config keys, defensive checks, │
│ runtime built-ins, file-ext / domain fragments, tokens <4 chars. │
│ If LLM fails twice → regex heuristic produces the SAME shape. │
└─────────────────────────────────────────────────────────────────────────┘
│
┌─────────────────────────────────────────────────────────────────────────┐
│ 4. RESOLVE DEPENDENCIES │
│ Parses go.mod, go.sum, package.json, package-lock.json, yarn.lock, │
│ pnpm-lock.yaml, Pipfile, requirements.txt, … │
│ Output: map[ecosystem:name] → installed version. │
└─────────────────────────────────────────────────────────────────────────┘
│
┌─────────────────────────────────────────────────────────────────────────┐
│ 5. STATIC SCAN (binding-aware, ecosystem-aware, no LLM) │
│ For each affected package present in the repo: │
│ a. Walk source tree, skip node_modules / vendor / .venv / … │
│ b. Parse imports per language: │
│ Go — AST import paths, alias names │
│ JS/TS — default / named / namespace / require, with binding │
│ Python — import … as, from … import … │
│ c. For each file that imports the package, build local bindings │
│ (qualified namespace + direct named imports). │
│ d. For each scan symbol, match either │
│ <binding>.<symbol>( OR bare <symbol>( │
│ depending on how the file imported the package. │
│ Output: EvidencePackage { imports, api_usages, observed_apis } │
└─────────────────────────────────────────────────────────────────────────┘
│
┌─────────────────────────────────────────────────────────────────────────┐
│ 6. VERDICT (deterministic rule first, LLM organizes the result) │
│ Decision rules (in priority order): │
│ • package not in repo → Not affected (high) │
│ • version not in advisory range → Not affected (high) │
│ • imported but no scan symbols extracted → Manual review (low) │
│ • imported AND scan symbols found in source → Likely affected │
│ • imported AND scan symbols, none found → Not affected (high) │
│ Then a schema-constrained LLM call enriches the verdict with │
│ natural-language summary and per-package recommendation. │
└─────────────────────────────────────────────────────────────────────────┘
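
The stage 6 decision rules can be read as a single ordered switch. The sketch below is one way to express them in Go; the `Evidence` and `Verdict` types are hypothetical. Two points are assumptions rather than documented behaviour: the diagram gives no confidence level for "Likely affected", so that field is left empty, and it lists no rule for a package that is installed and in range but never imported, so the sketch falls back to manual review.

```go
package main

import "fmt"

// Evidence is a hypothetical summary of what stages 4 and 5 found
// for one affected package.
type Evidence struct {
	PackageInRepo  bool
	VersionInRange bool
	Imported       bool
	ScanSymbols    int // symbols the advisory parse marked scan=true
	UsagesFound    int // of those, how many were matched in source
}

// Verdict carries the status and, where the rules state one, a confidence.
type Verdict struct {
	Status     string
	Confidence string
}

// Decide applies the rules in the priority order shown in the diagram.
func Decide(e Evidence) Verdict {
	switch {
	case !e.PackageInRepo:
		return Verdict{"Not affected", "high"}
	case !e.VersionInRange:
		return Verdict{"Not affected", "high"}
	case e.Imported && e.ScanSymbols == 0:
		return Verdict{"Manual review", "low"}
	case e.Imported && e.UsagesFound > 0:
		// The diagram states no confidence for this rule.
		return Verdict{"Likely affected", ""}
	case e.Imported:
		return Verdict{"Not affected", "high"}
	default:
		// Assumption: no rule covers "in range but never imported".
		return Verdict{"Manual review", "low"}
	}
}

func main() {
	cases := []Evidence{
		{PackageInRepo: false},
		{PackageInRepo: true, VersionInRange: true, Imported: true, ScanSymbols: 0},
		{PackageInRepo: true, VersionInRange: true, Imported: true, ScanSymbols: 2, UsagesFound: 1},
		{PackageInRepo: true, VersionInRange: true, Imported: true, ScanSymbols: 2, UsagesFound: 0},
	}
	for _, c := range cases {
		fmt.Printf("%+v -> %+v\n", c, Decide(c))
	}
}
```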
| Package | Role |
|---|---|
| `internal/domain` | Pure data types and ports (VulnFetcher, CodeScanner, VulnAnalyzer, Signal, EvidencePackage, …). No I/O. |
| `internal/usecase` | Orchestration. analyze.go runs the six stages. advisory_signals.go drives LLM-with-heuristic-backstop. advisory_excerpt.go, heuristic_extract.go, ecosystem.go are pure logic. |
| `internal/adapter/osv` | OSV vulnerability API client. |
| `internal/adapter/nvd` | NVD fallback client for CVEs not in OSV. |
| `internal/adapter/vuln` | Composite fetcher that chains OSV → NVD and enriches with GHSA. |
| `internal/adapter/deps` | Manifest / lockfile parsing per ecosystem. |
| `internal/adapter/scanner` | Binding-aware static scanner. One file per language for import/binding extraction; api_usage.go does the symbol matching. |
| `internal/adapter/llm` | OpenAI-compatible client. JSON-Schema response_format with json_object fallback. Two prompts: advisory parse + result organizer. |
| `internal/adapter/http` | HTTP handlers, SSE streaming, terminal logging of progress events. |
| `internal/adapter/repo` | Optional git switch to the configured branch. |
| `internal/web` | Embedded React UI. |
| `frontend/` | React + TypeScript source; built once into internal/web. |
- No package name is hardcoded anywhere in production code (axios, lodash, js-yaml exist only in test fixtures).
- Per-ecosystem behaviour is dispatched by file extension and OSV ecosystem label.
- Adding a new ecosystem means adding one file: import detection + binding resolver.
- LLM down / 4xx / 5xx → 2 retries with timing logs, then regex heuristic produces the same AdvisoryParseResult shape. Verdict still runs on static evidence.
- OSV miss for a CVE → automatic NVD fallback.
- NVD miss too → analysis fails fast with a clear error.
- Repo has no manifests → 0 evidence packages, single "advisory-only" row in the result table.
- Indirect-trigger CVEs (prototype pollution, env injection, race conditions) → zero scan targets is the correct outcome; verdict says "Cannot prove safety statically — manual review required" instead of false-negative "Not affected."
| Destination | Content |
|---|---|
| Browser UI (Progress panel) | Full SSE event stream — per-step labels, advisory parse timing, scan counts, verdict. |
| Terminal (server stdout) | Same events, one line each, prefixed with [analyze] and the CVE alias. Useful when the JSON /api/v1/analyze endpoint is used (no SSE) or when running unattended. |
Important rule: an import does not mean affected. The static scanner looks for the specific APIs named in the advisory and flags only files that import the package and call those APIs. For indirect-trigger CVEs (e.g., prototype pollution), the result is "manual review" rather than a false "Not affected".
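
A rough sketch of that rule: a symbol counts as used only when it is called through a binding the file actually created for the package. The `Bindings` type, the regexes, and the `somelib` / `parse` names are hypothetical, and plain regex matching is a simplification of whatever the real scanner does.

```go
package main

import (
	"fmt"
	"regexp"
)

// Bindings records how one file imported the package.
type Bindings struct {
	Namespaces []string // e.g. "lib" from: import lib from "somelib"
	Named      []string // e.g. "parse" from: import { parse } from "somelib"
}

// UsesSymbol reports whether src calls symbol through one of the bindings.
func UsesSymbol(src string, b Bindings, symbol string) bool {
	q := regexp.QuoteMeta(symbol)
	// Qualified form: <binding>.<symbol>(
	for _, ns := range b.Namespaces {
		re := regexp.MustCompile(`\b` + regexp.QuoteMeta(ns) + `\s*\.\s*` + q + `\s*\(`)
		if re.MatchString(src) {
			return true
		}
	}
	// Bare form: <symbol>( but only if the file imported it by name, and
	// only when it is not a member access on some other object.
	for _, name := range b.Named {
		if name != symbol {
			continue
		}
		re := regexp.MustCompile(`(^|[^.\w$])` + q + `\s*\(`)
		if re.MatchString(src) {
			return true
		}
	}
	return false
}

func main() {
	src := `import lib from "somelib"; const out = lib.parse(input);`
	fmt.Println(UsesSymbol(src, Bindings{Namespaces: []string{"lib"}}, "parse"))

	other := `import { safe } from "somelib"; JSON.parse(input);`
	fmt.Println(UsesSymbol(other, Bindings{Named: []string{"safe"}}, "parse"))
}
```

The second call in `main` shows the point of binding awareness: the file imports the package and contains `parse(`, but the call belongs to `JSON`, so it is not counted.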
See docs/DEVELOPERS.md for further pipeline detail.
Results are guidance, not a guarantee. The tool combines OSV version ranges, LLM-parsed advisory signals (with regex fallback), static scans, and a second LLM pass for impact. Any step can be wrong or incomplete. Always verify critical findings (upgrade paths, pentest scope, compliance) with your own review.
| Topic | What to expect |
|---|---|
| Import ≠ affected | A package in package.json and even imported in source does not automatically mean the CVE applies. The tool checks advisory APIs (e.g. js-yaml merge vs your use of safeLoadAll). |
| Version vs exploitability | You may be on a vulnerable version while the UI says Not affected (no matching API calls in your code). You should still upgrade if OSV lists your version as affected. |
| Opposite case | The UI may say Likely affected when only the version matches and the scanner or LLM over-counts usage; confirm before treating it as a confirmed vulnerability in production. |
| Prototype pollution CVEs | Many axios/js issues need Object.prototype polluted by another dependency. The tool cannot prove that chain exists in your runtime—only that axios is present and in range. |
| Browser vs Node | Some axios CVEs target the Node HTTP adapter (NO_PROXY, toFormData, http.js). A frontend-only app may have lower real risk but the same vulnerable package version. |
| Advisory parsing | “Vulnerable APIs” are guessed from CVE text. Words like request, headers, or test helpers (it, true) can appear in Locations as noise; prefer Explanation and version/fix columns. |
| LLM | Needs a working API key and network. On failure, analysis falls back to static rules. Model output can disagree with static evidence or miss nuance. |
| Coverage | No full type-checker or dataflow; reflection, dynamic imports, and generated code may be missed. node_modules / vendor are never scanned. |
| Unknown CVEs | If OSV and NVD have no record, analysis fails. |
For more detail and examples from manual testing, see docs/DEVELOPERS.md#limitations-and-accuracy.
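
The advisory-parsing noise mentioned in the table is what the server-side gate in stage 3 tries to reduce. Below is a minimal sketch of three of the listed filters (tokens under 4 characters, HTTP verbs, file-extension fragments). The function names and the exact checks are assumptions; config keys, defensive checks, and runtime built-ins are not modelled.

```go
package main

import (
	"fmt"
	"strings"
	"unicode/utf8"
)

// httpVerbs lists method names that are never useful scan targets.
var httpVerbs = map[string]bool{
	"GET": true, "HEAD": true, "POST": true, "PUT": true,
	"PATCH": true, "DELETE": true, "OPTIONS": true,
}

// KeepSignal reports whether a candidate name survives the gate.
func KeepSignal(name string) bool {
	n := strings.TrimSpace(name)
	switch {
	case utf8.RuneCountInString(n) < 4:
		return false // too short to be a meaningful symbol
	case httpVerbs[strings.ToUpper(n)]:
		return false // HTTP verb, not an API of the package
	case strings.HasPrefix(n, "."):
		return false // file-extension fragment such as ".json"
	}
	return true
}

// FilterSignals returns the names that pass KeepSignal, in order.
func FilterSignals(names []string) []string {
	var kept []string
	for _, n := range names {
		if KeepSignal(n) {
			kept = append(kept, n)
		}
	}
	return kept
}

func main() {
	fmt.Println(FilterSignals([]string{"it", "POST", ".json", "safeLoadAll", "get"}))
}
```

A gate like this cannot catch every noisy token: a four-letter word such as `true` passes all three checks, which matches the caveat in the table.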
| ID | Typical result | Why |
|---|---|---|
| CVE-2026-42043 | Affected | axios in range + used in source |
| CVE-2025-64718 | Not affected | js-yaml imported; uses safeLoadAll, not merge |
| CVE-2021-23337 | Not affected | lodash patched version |
| CVE-2021-44228 | Not affected | Log4j not in repo |
Internal / project use — adjust as needed.