CVE Analyzer

Check whether a CVE or GHSA actually affects your application code — not just whether a dependency appears in a lockfile.

Go backend with embedded React UI
OSV / NVD for advisory data
Static analysis for imports and vulnerable API usage
OpenAI-compatible LLM for final impact judgment

Quick start

Prerequisites

Go 1.22+
Node.js 18+ (needed only to build the UI once)
A local copy of the repo to scan (download or extract yourself; the tool does not clone remotes)

Setup

cp .env.example .env
# Edit REPO_PATH, REPO_BRANCH, LLM_API_KEY, LLM_MODEL
make build
make run

Open http://localhost:8080, enter a CVE or GHSA ID (e.g. CVE-2025-64718), click Analyze.

The UI supports English and Русский (language switcher in the top-right).

Docker

Build and run with your repository mounted at /repo:

cp .env.example .env
# Set LLM_API_KEY, REPO_BRANCH, etc. REPO_PATH is overridden in compose to /repo

export REPO_HOST_PATH=/absolute/path/to/your/project
docker compose up --build

Or build the image only:

docker build -t cve-analyzer .
docker run --rm -p 8080:8080 --env-file .env \
  -e REPO_PATH=/repo \
  -v /absolute/path/to/your/project:/repo:ro \
  cve-analyzer

git is included in the image for optional branch checkout when .git exists in the mounted repo.

Example `.env`

REPO_PATH=/path/to/your/project
REPO_BRANCH=main
LLM_BASE_URL=https://api.openai.com/v1
LLM_API_KEY=sk-...
LLM_MODEL=gpt-4o-mini

Corporate gateways: set LLM_BASE_URL to your OpenAI-compatible endpoint.

Usage

Point REPO_PATH at the project root (must contain go.mod, package.json, etc.).
Enter CVE-… or GHSA-….
Read the results table: affected?, locations, explanation, fix version.
Progress logs (SSE) show each pipeline step.

CLI test (server running):

./scripts/test-analyze.sh CVE-2025-64718

Environment variables

Variable	Required	Description
`REPO_PATH`	Yes	Absolute path to local repository
`REPO_BRANCH`	Yes	Branch to analyze (checked out with `git switch` if `.git` exists)
`LLM_BASE_URL`	No	Default `https://api.openai.com/v1`
`LLM_API_KEY`	Yes	API key / bearer token
`LLM_MODEL`	No	Default `gpt-4o-mini`
`LLM_TIMEOUT_SEC`	No	Default `120`
`EXCLUDE_DIRS`	No	Comma-separated dirs to skip when scanning
`SERVER_ADDR`	No	Default `:8080`
`MAX_SNIPPET_FILES`	No	Max files sent as snippets to LLM (default `20`)
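
As a rough illustration of how the table above could translate into startup code, the sketch below reads the same variables and applies the documented defaults. The `Config` struct, its field names, and `loadConfig` are assumptions made for this example, not the project's actual types.

```go
package main

import (
	"fmt"
	"os"
	"strconv"
	"strings"
)

// Config mirrors the variables in the table above. The struct and its field
// names are illustrative assumptions, not the project's actual types.
type Config struct {
	RepoPath, RepoBranch            string
	LLMBaseURL, LLMAPIKey, LLMModel string
	LLMTimeoutSec, MaxSnippetFiles  int
	ExcludeDirs                     []string
	ServerAddr                      string
}

// loadConfig takes a getenv function (os.Getenv in practice) so it can be
// exercised with a fake environment.
func loadConfig(getenv func(string) string) (Config, error) {
	get := func(key, def string) string {
		if v := strings.TrimSpace(getenv(key)); v != "" {
			return v
		}
		return def
	}
	getInt := func(key string, def int) (int, error) {
		n, err := strconv.Atoi(get(key, strconv.Itoa(def)))
		if err != nil {
			return 0, fmt.Errorf("%s must be an integer: %w", key, err)
		}
		return n, nil
	}

	// Variables marked "Yes" in the Required column.
	for _, key := range []string{"REPO_PATH", "REPO_BRANCH", "LLM_API_KEY"} {
		if get(key, "") == "" {
			return Config{}, fmt.Errorf("%s is required", key)
		}
	}
	cfg := Config{
		RepoPath:   get("REPO_PATH", ""),
		RepoBranch: get("REPO_BRANCH", ""),
		LLMBaseURL: get("LLM_BASE_URL", "https://api.openai.com/v1"),
		LLMAPIKey:  get("LLM_API_KEY", ""),
		LLMModel:   get("LLM_MODEL", "gpt-4o-mini"),
		ServerAddr: get("SERVER_ADDR", ":8080"),
	}
	var err error
	if cfg.LLMTimeoutSec, err = getInt("LLM_TIMEOUT_SEC", 120); err != nil {
		return Config{}, err
	}
	if cfg.MaxSnippetFiles, err = getInt("MAX_SNIPPET_FILES", 20); err != nil {
		return Config{}, err
	}
	// EXCLUDE_DIRS is comma-separated; empty entries are dropped.
	for _, dir := range strings.Split(get("EXCLUDE_DIRS", ""), ",") {
		if dir = strings.TrimSpace(dir); dir != "" {
			cfg.ExcludeDirs = append(cfg.ExcludeDirs, dir)
		}
	}
	return cfg, nil
}

func main() {
	cfg, err := loadConfig(os.Getenv)
	if err != nil {
		fmt.Fprintln(os.Stderr, "config error:", err)
		return
	}
	// The API key is deliberately left out of the output.
	fmt.Printf("repo=%s branch=%s model=%s addr=%s\n",
		cfg.RepoPath, cfg.RepoBranch, cfg.LLMModel, cfg.ServerAddr)
}
```

Passing `getenv` as a parameter keeps the function free of global state, which makes the defaults easy to check in a unit test.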

API

Endpoint	Method	Description
`/api/v1/config`	GET	Repo path, branch, exclude dirs
`/api/v1/analyze`	POST	`{"alias":"CVE-..."}` — JSON result
`/api/v1/analyze/stream`	POST	Same body — SSE progress + result
`/health`	GET	Liveness
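
For scripted use, the JSON endpoint can be called from any HTTP client. The sketch below is a minimal Go client, assuming the server listens on the default `:8080`; `newAnalyzeRequest` is a helper invented for this example. The response schema is not documented in this README, so the body is printed as-is rather than decoded into a struct.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"io"
	"net/http"
	"os"
	"strings"
	"time"
)

// newAnalyzeRequest builds the POST /api/v1/analyze request from the table
// above. The helper name is an assumption made for this sketch.
func newAnalyzeRequest(baseURL, alias string) (*http.Request, error) {
	body, err := json.Marshal(map[string]string{"alias": alias})
	if err != nil {
		return nil, err
	}
	url := strings.TrimRight(baseURL, "/") + "/api/v1/analyze"
	req, err := http.NewRequest(http.MethodPost, url, bytes.NewReader(body))
	if err != nil {
		return nil, err
	}
	req.Header.Set("Content-Type", "application/json")
	return req, nil
}

func main() {
	req, err := newAnalyzeRequest("http://localhost:8080", "CVE-2025-64718")
	if err != nil {
		fmt.Fprintln(os.Stderr, "build request:", err)
		return
	}
	// Analysis includes LLM calls, so allow a generous timeout.
	client := &http.Client{Timeout: 5 * time.Minute}
	resp, err := client.Do(req)
	if err != nil {
		fmt.Fprintln(os.Stderr, "request failed:", err)
		return
	}
	defer resp.Body.Close()
	fmt.Println("status:", resp.Status)
	// Print the raw JSON; its exact fields are not specified here.
	io.Copy(os.Stdout, resp.Body)
}
```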

Development

# Backend only (after make build-frontend once)
go run ./cmd/server

# UI hot reload (proxies API to :8080)
make dev-frontend   # terminal 2

make test
make tidy

Architecture

The analyzer is a six-stage pipeline. The LLM is called twice: once to parse the advisory and once to organize the final verdict. All code scanning is deterministic. Your repository is never sent to the LLM wholesale; the model sees only a small structured evidence report (`MAX_SNIPPET_FILES` caps how many files contribute snippets).

┌─────────────────────────────────────────────────────────────────────────┐
│ 1. FETCH ADVISORY                                                       │
│    OSV  →  fallback OSV aliases  →  NVD (for CVEs)  →  enrich GHSA text │
│    Output: domain.Vulnerability { Summary, Details, Affected[], … }     │
└─────────────────────────────────────────────────────────────────────────┘
                                  │
┌─────────────────────────────────────────────────────────────────────────┐
│ 2. BUILD ADVISORY EXCERPT (structural, library-agnostic)                │
│    • Keeps every H1–H6 section, every fenced code block, every line     │
│      with backticked tokens.                                            │
│    • Drops Credits / References / table separators / URL-only lines.    │
│    • Caps at ~12k runes for cheap, focused prompts.                     │
└─────────────────────────────────────────────────────────────────────────┘
                                  │
┌─────────────────────────────────────────────────────────────────────────┐
│ 3. EXTRACT SIGNALS  (LLM with JSON-Schema → heuristic backstop)         │
│    Schema-constrained output:                                           │
│       signals[]:   { name, kind, scan, confidence }                     │
│       ecosystems[]                                                      │
│       guidance: markdown checklist                                      │
│    Kinds: function | method | property | type | path | concept          │
│    scan=true ONLY for direct-call CVEs with concrete callable names.    │
│    Server-side gate strips: HTTP verbs, config keys, defensive checks,  │
│    runtime built-ins, file-ext / domain fragments, tokens <4 chars.     │
│    If LLM fails twice → regex heuristic produces the SAME shape.        │
└─────────────────────────────────────────────────────────────────────────┘
                                  │
┌─────────────────────────────────────────────────────────────────────────┐
│ 4. RESOLVE DEPENDENCIES                                                 │
│    Parses go.mod, go.sum, package.json, package-lock.json, yarn.lock,   │
│    pnpm-lock.yaml, Pipfile, requirements.txt, …                         │
│    Output: map[ecosystem:name] → installed version.                     │
└─────────────────────────────────────────────────────────────────────────┘
                                  │
┌─────────────────────────────────────────────────────────────────────────┐
│ 5. STATIC SCAN  (binding-aware, ecosystem-aware, no LLM)                │
│    For each affected package present in the repo:                       │
│      a. Walk source tree, skip node_modules / vendor / .venv / …        │
│      b. Parse imports per language:                                     │
│           Go      — AST import paths, alias names                       │
│           JS/TS   — default / named / namespace / require, with binding │
│           Python  — import … as, from … import …                        │
│      c. For each file that imports the package, build local bindings    │
│         (qualified namespace + direct named imports).                   │
│      d. For each scan symbol, match either                              │
│           <binding>.<symbol>(    OR    bare <symbol>(                   │
│         depending on how the file imported the package.                 │
│    Output: EvidencePackage { imports, api_usages, observed_apis }       │
└─────────────────────────────────────────────────────────────────────────┘
                                  │
┌─────────────────────────────────────────────────────────────────────────┐
│ 6. VERDICT (deterministic rule first, LLM organizes the result)         │
│    Decision rules (in priority order):                                  │
│      • package not in repo                       → Not affected (high)  │
│      • version not in advisory range             → Not affected (high)  │
│      • imported but no scan symbols extracted    → Manual review (low)  │
│      • imported AND scan symbols found in source → Likely affected      │
│      • imported AND scan symbols, none found     → Not affected (high)  │
│    Then a schema-constrained LLM call enriches the verdict with         │
│    natural-language summary and per-package recommendation.             │
└─────────────────────────────────────────────────────────────────────────┘
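
The stage 6 decision rules can be read as an ordered switch. The sketch below is one way to express them, assuming a flattened `Evidence` struct invented for this example; the real `EvidencePackage` type is richer. The diagram gives no confidence for "Likely affected" and does not cover a package that is in range but never imported, so both are marked as assumptions in the code.

```go
package main

import "fmt"

// Evidence is a hypothetical, flattened view of what stages 3 to 5 produce
// for one affected package.
type Evidence struct {
	InRepo         bool // package found in a manifest or lockfile
	VersionInRange bool // installed version falls in the advisory range
	Imported       bool // at least one source file imports the package
	ScanSymbols    int  // symbols the advisory parse marked scan=true
	UsagesFound    int  // call sites matched by the static scan
}

// Verdict holds the status label and, where the diagram states one, the
// confidence level.
type Verdict struct {
	Status     string
	Confidence string
}

// decide applies the rules in the priority order shown in the diagram.
func decide(e Evidence) Verdict {
	switch {
	case !e.InRepo:
		return Verdict{"Not affected", "high"}
	case !e.VersionInRange:
		return Verdict{"Not affected", "high"}
	case e.Imported && e.ScanSymbols == 0:
		return Verdict{"Manual review", "low"}
	case e.Imported && e.UsagesFound > 0:
		// The diagram does not state a confidence for this rule.
		return Verdict{"Likely affected", ""}
	case e.Imported:
		// Scan symbols exist but none were found in source.
		return Verdict{"Not affected", "high"}
	default:
		// In range but never imported: not covered by the diagram.
		// Falling back to manual review is an assumption of this sketch.
		return Verdict{"Manual review", "low"}
	}
}

func main() {
	e := Evidence{InRepo: true, VersionInRange: true, Imported: true, ScanSymbols: 2}
	fmt.Printf("%+v\n", decide(e))
}
```

Because the cases are checked top to bottom, a package that is out of range is settled before import or usage evidence is considered, which matches the priority order in the diagram.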

Component layout

Package	Role
`internal/domain`	Pure data types and ports (`VulnFetcher`, `CodeScanner`, `VulnAnalyzer`, `Signal`, `EvidencePackage`, …). No I/O.
`internal/usecase`	Orchestration. `analyze.go` runs the six stages. `advisory_signals.go` drives LLM-with-heuristic-backstop. `advisory_excerpt.go`, `heuristic_extract.go`, `ecosystem.go` are pure logic.
`internal/adapter/osv`	OSV vulnerability API client.
`internal/adapter/nvd`	NVD fallback client for CVEs not in OSV.
`internal/adapter/vuln`	Composite fetcher that chains OSV → NVD and enriches with GHSA.
`internal/adapter/deps`	Manifest / lockfile parsing per ecosystem.
`internal/adapter/scanner`	Binding-aware static scanner. One file per language for import/binding extraction; `api_usage.go` does the symbol matching.
`internal/adapter/llm`	OpenAI-compatible client. JSON-Schema response_format with `json_object` fallback. Two prompts: advisory parse + result organizer.
`internal/adapter/http`	HTTP handlers, SSE streaming, terminal logging of progress events.
`internal/adapter/repo`	Optional `git switch` to the configured branch.
`internal/web`	Embedded React UI.
`frontend/`	React + TypeScript source; built once into `internal/web`.
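
To give a feel for the pure-logic pieces listed above, here is a simplified sketch of the stage 2 excerpt rules: keep headings, fenced code, and lines with backticked tokens; drop URL-only lines and table separators; cap the result at a rune limit. `buildExcerpt` and the regular expressions are assumptions written for this example. The sketch keeps heading lines rather than whole sections and does not drop Credits or References sections.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// fence is three backticks, written with escapes so this listing does not
// contain a literal Markdown fence.
const fence = "\x60\x60\x60"

var (
	headingRe  = regexp.MustCompile(`^#{1,6}\s`)
	urlOnlyRe  = regexp.MustCompile(`^\s*(?:[-*]\s*)?<?https?://\S+>?\s*$`)
	tableSepRe = regexp.MustCompile(`^[\s|:-]*-{3,}[\s|:-]*$`)
)

// buildExcerpt filters advisory Markdown down to the lines most likely to
// name vulnerable APIs, then truncates to maxRunes.
func buildExcerpt(md string, maxRunes int) string {
	var kept []string
	inFence := false
	for _, line := range strings.Split(md, "\n") {
		trimmed := strings.TrimSpace(line)
		switch {
		case strings.HasPrefix(trimmed, fence):
			// Opening or closing fence: keep it and flip the state.
			inFence = !inFence
			kept = append(kept, line)
		case inFence:
			kept = append(kept, line)
		case urlOnlyRe.MatchString(line), tableSepRe.MatchString(line):
			// Dropped: carries no API names.
		case headingRe.MatchString(trimmed), strings.Contains(line, "`"):
			kept = append(kept, line)
		}
	}
	out := []rune(strings.Join(kept, "\n"))
	if len(out) > maxRunes {
		out = out[:maxRunes]
	}
	return string(out)
}

func main() {
	md := "# Impact\nPlain prose.\nAvoid `merge` on untrusted input.\nhttps://example.com/advisory"
	fmt.Println(buildExcerpt(md, 12000))
}
```

Truncating by runes rather than bytes avoids cutting a multi-byte character in half when the advisory contains non-ASCII text.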

Library-agnostic by design

No package name is hardcoded anywhere in production code (axios, lodash, js-yaml exist only in test fixtures).
Per-ecosystem behaviour is dispatched by file extension and OSV ecosystem label.
Adding a new ecosystem means adding one file: import detection + binding resolver.

Failure modes that don't crash the pipeline

LLM down / 4xx / 5xx → 2 retries with timing logs, then regex heuristic produces the same AdvisoryParseResult shape. Verdict still runs on static evidence.
OSV miss for a CVE → automatic NVD fallback.
NVD miss too → analysis fails fast with a clear error.
Repo has no manifests → 0 evidence packages, single "advisory-only" row in the result table.
Indirect-trigger CVEs (prototype pollution, env injection, race conditions) → zero scan targets is the correct outcome; verdict says "Cannot prove safety statically — manual review required" instead of false-negative "Not affected."
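
The first failure mode (retry the LLM, then fall back to a regex heuristic that returns the same shape) follows a common pattern, sketched below. `ParseResult`, `withFallback`, and `heuristicSignals` are names invented for this example, and the regex is far cruder than a production heuristic would be. Treating "2 retries" as three attempts in total is an interpretation, not something the README states.

```go
package main

import (
	"errors"
	"fmt"
	"regexp"
)

// ParseResult is a hypothetical stand-in for AdvisoryParseResult.
type ParseResult struct {
	Signals []string
	Source  string // "llm" or "heuristic"
}

var errLLMDown = errors.New("llm unavailable")

// backtickedRe captures backticked identifiers of at least four characters,
// echoing the "tokens <4 chars" gate described in stage 3.
var backtickedRe = regexp.MustCompile("`([A-Za-z_][A-Za-z0-9_]{3,})`")

// heuristicSignals extracts candidate API names from advisory text.
func heuristicSignals(text string) []string {
	var names []string
	for _, m := range backtickedRe.FindAllStringSubmatch(text, -1) {
		names = append(names, m[1])
	}
	return names
}

// withFallback tries primary up to attempts times, logging each failure,
// then returns the fallback result in the same shape.
func withFallback(attempts int, primary func() (ParseResult, error), fallback func() ParseResult) ParseResult {
	for i := 1; i <= attempts; i++ {
		res, err := primary()
		if err == nil {
			res.Source = "llm"
			return res
		}
		fmt.Printf("advisory parse attempt %d failed: %v\n", i, err)
	}
	res := fallback()
	res.Source = "heuristic"
	return res
}

func main() {
	advisory := "Calling `merge` on untrusted input is unsafe."
	res := withFallback(3,
		func() (ParseResult, error) { return ParseResult{}, errLLMDown },
		func() ParseResult { return ParseResult{Signals: heuristicSignals(advisory)} },
	)
	fmt.Printf("%+v\n", res)
}
```

Because both paths return the same type, later stages do not need to know which one produced the signals.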

Where logs go

Destination	Content
Browser UI (Progress panel)	Full SSE event stream — per-step labels, advisory parse timing, scan counts, verdict.
Terminal (server stdout)	Same events, one line each, prefixed with `[analyze]` and the CVE alias. Useful when the JSON `/api/v1/analyze` endpoint is used (no SSE) or when running unattended.

Important rule: an import does not mean affected. The static scanner looks for the specific APIs named in the advisory and flags only matching call sites in your source. For indirect-trigger CVEs (e.g., prototype pollution), the result is "Manual review" rather than a false "Not affected".
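
The difference between "imported" and "uses the vulnerable API" can be illustrated with a small binding-aware matcher. The sketch below assumes the import parser has already produced a `FileBindings` value for the file; that struct and `findUsages` are hypothetical names, and a line-based regex is a simplification of whatever the real scanner does.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// FileBindings is a hypothetical summary of how one file imported the package.
type FileBindings struct {
	Namespaces []string          // local names for the whole package, e.g. "yaml"
	Named      map[string]string // local name -> exported symbol, e.g. "m" -> "merge"
}

// findUsages returns the 1-based line numbers where symbol is called through
// one of the file's bindings.
func findUsages(src string, b FileBindings, symbol string) []int {
	var patterns []*regexp.Regexp
	// Qualified form: <binding>.<symbol>(
	for _, ns := range b.Namespaces {
		patterns = append(patterns, regexp.MustCompile(
			`\b`+regexp.QuoteMeta(ns)+`\s*\.\s*`+regexp.QuoteMeta(symbol)+`\s*\(`))
	}
	// Bare form: <local>( for direct named imports, not preceded by a dot,
	// so calls on unrelated objects are ignored.
	for local, exported := range b.Named {
		if exported == symbol {
			patterns = append(patterns, regexp.MustCompile(
				`(?:^|[^.\w$])`+regexp.QuoteMeta(local)+`\s*\(`))
		}
	}
	var lines []int
	for i, line := range strings.Split(src, "\n") {
		for _, p := range patterns {
			if p.MatchString(line) {
				lines = append(lines, i+1)
				break
			}
		}
	}
	return lines
}

func main() {
	src := "import yaml from \"js-yaml\"\n" +
		"const docs = yaml.safeLoadAll(text)\n" +
		"const out = yaml.merge(a, b)\n" +
		"other.merge(a, b)"
	b := FileBindings{Namespaces: []string{"yaml"}}
	fmt.Println(findUsages(src, b, "merge")) // prints [3]
}
```

Only line 3 is reported: `yaml.safeLoadAll` is a different symbol, and `other.merge` goes through an object that is not bound to the package.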

See docs/DEVELOPERS.md for further pipeline detail.

Limitations and accuracy

Results are guidance, not a guarantee. The tool combines OSV version ranges, LLM-parsed advisory signals (with regex fallback), static scans, and a second LLM pass for impact. Any step can be wrong or incomplete. Always verify critical findings (upgrade paths, pentest scope, compliance) with your own review.

Topic	What to expect
Import ≠ affected	A package in `package.json` and even imported in source does not automatically mean the CVE applies. The tool checks advisory APIs (e.g. js-yaml `merge` vs your use of `safeLoadAll`).
Version vs exploitability	You may be on a vulnerable version while the UI says Not affected (no matching API calls in your code). You should still upgrade if OSV lists your version as affected.
Opposite case	The UI may say Likely affected when only the version matches and the scanner or LLM over-counts usage; confirm before treating it as a confirmed vulnerability in production.
Prototype pollution CVEs	Many axios and other JavaScript issues require `Object.prototype` to be polluted by another dependency first. The tool cannot prove that chain exists in your runtime, only that axios is present and in range.
Browser vs Node	Some axios CVEs target the Node HTTP adapter (`NO_PROXY`, `toFormData`, `http.js`). A frontend-only app may have lower real risk but the same vulnerable package version.
Advisory parsing	“Vulnerable APIs” are guessed from CVE text. Words like `request`, `headers`, or test helpers (`it`, `true`) can appear in Locations as noise; prefer Explanation and version/fix columns.
LLM	Needs a working API key and network. On failure, analysis falls back to static rules. Model output can disagree with static evidence or miss nuance.
Coverage	No full type-checker or dataflow; reflection, dynamic imports, and generated code may be missed. `node_modules` / `vendor` are never scanned.
Unknown CVEs	If OSV and NVD have no record, analysis fails.

For more detail and examples from manual testing, see docs/DEVELOPERS.md#limitations-and-accuracy.

Sample manual tests

ID	Typical result	Why
`CVE-2026-42043`	Affected	axios in range + used in source
`CVE-2025-64718`	Not affected	js-yaml imported; uses `safeLoadAll`, not `merge`
`CVE-2021-23337`	Not affected	lodash patched version
`CVE-2021-44228`	Not affected	Log4j not in repo

License

Internal / project use — adjust as needed.
