Patch Finder is an LLM-assisted workflow that locates upstream fix commits for CVEs by orchestrating public data sources such as GitHub Security Advisories, OSV, and NVD. The agent relies on a local vLLM deployment of the `openai/gpt-oss-20b` model and enriches LLM output with targeted web and HTML scraping helpers.
The agent uses a multi-step approach to find CVE fix commits:
- Bootstrap Phase: Automatically fetches key sources (NVD, CVE.org, OSV, GHSA, Debian tracker) and extracts initial evidence including bug IDs, references, and potential commit URLs.
- Iterative Search: The LLM agent analyzes the evidence and performs targeted web searches and URL fetches to locate:
  - Official bug tracker entries (GitHub Issues, Chromium bugs, kernel.org)
  - Security advisories and commit references
  - Source repository commits (GitHub, chromium.googlesource.com, git.kernel.org)
- Verification: Cross-references findings across multiple authoritative sources to ensure accuracy.
- Extraction: Extracts the full 40-character commit SHA-1 hash and constructs the complete commit URL (see the sketch after this list).
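As a rough sketch of what the bootstrap and extraction steps do (the agent uses its own internal helpers; the commands below merely approximate the idea against the public OSV and NVD APIs):

```bash
# Pull reference URLs for a CVE from OSV and NVD (public, unauthenticated endpoints)
curl -s "https://api.osv.dev/v1/vulns/CVE-2025-50182" | jq -r '.references[].url'
curl -s "https://services.nvd.nist.gov/rest/json/cves/2.0?cveId=CVE-2025-50182" \
  | jq -r '.vulnerabilities[].cve.references[].url'

# Extract candidate 40-character commit SHA-1 hashes from any fetched page
curl -s "<candidate advisory or commit URL>" | grep -oE '\b[0-9a-f]{40}\b' | sort -u
```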
The agent can handle CVEs from various ecosystems:
- GitHub-based projects (most npm, PyPI, and RubyGems packages)
- Chromium/Chrome (googlesource.com repositories)
- Linux Kernel (git.kernel.org, kernel mirrors)
- OpenSSL and other projects (git.openssl.org, GitLab, etc.)
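For reference, fix commits in these ecosystems typically live at URLs of the following shapes (placeholders, not an exhaustive list):

- GitHub: `https://github.com/<owner>/<repo>/commit/<sha>`
- Chromium: `https://chromium.googlesource.com/chromium/src/+/<sha>`
- Linux kernel: `https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=<sha>`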
Success is not guaranteed and depends on several factors:
- Information Availability: The fix commit must be publicly documented in at least one of the checked sources (NVD, CVE.org, OSV, GHSA, vendor advisories, or search results).
- AI Model Behavior: The agent is a thin client around an AI model with probabilistic behavior. The same CVE may succeed or fail across different runs depending on:
  - Information available on the internet
  - Quality of the search results returned by Google
  - Tool-calling decisions made by the model
  - Hallucinations and other mistakes the model can make
- Data Quality Issues:
  - CVE records may lack commit references
  - Bug trackers may not link to commits
  - Commits may not mention the CVE identifier
  - Proprietary fixes may not be publicly disclosed
- Network Dependencies: Requires access to external services (NVD, Google Custom Search API, GitHub, etc.). Network issues or rate limits may cause failures.
- Complex Repository Structures: Some projects use non-standard workflows, mirrors, or private security fixes that are difficult to trace automatically.
Recommendation: Always manually verify the returned commit by reviewing the diff and confirming it addresses the vulnerability described in the CVE.
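For instance (a manual check outside the agent; `<owner>`, `<repo>`, and `<commit-sha>` are placeholders):

```bash
# Inspect the patch in a local clone; most hosts, including GitHub, allow fetching by full SHA
git fetch origin <commit-sha>
git show <commit-sha>

# For GitHub-hosted projects the raw diff can also be fetched directly
curl -sL "https://github.com/<owner>/<repo>/commit/<commit-sha>.diff"
```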
- Python 3.12+ with dependencies installed from `requirements.txt`
- A local vLLM server hosting `openai/gpt-oss-20b` (requires at least an RTX 4090 or a more powerful GPU)
- A Google Cloud project with the Custom Search API enabled (for web search queries); a limited number of requests is available free of charge every day
- Create a new Python environment and install all needed dependencies:
```bash
uv venv --python 3.12
source .venv/bin/activate  # On Windows: .venv\Scripts\activate
uv pip install -r requirements.txt
python -m playwright install chromium
```

- Set the required environment variables:
```bash
export OPENAI_API_KEY="local"                      # arbitrary value; vLLM ignores it but the SDK requires one
export OPENAI_BASE_URL="http://localhost:8000/v1"  # matches the vLLM OpenAI-compatible endpoint
export GOOGLE_CSE_ID="<your Google CSE ID>"        # you can find it at https://programmablesearchengine.google.com/controlpanel/all
export GOOGLE_API_KEY="<your Google API key>"      # Google Cloud API key for your project
export GITLAB_TOKEN="<your GitLab personal access token>"  # GitLab personal access token with read_api scope
```

Without these variables the agent won't work correctly!
Optional overrides:
- `PATCH_FINDER_MODEL` (default: `openai/gpt-oss-20b`)
- `PATCH_FINDER_MAX_STEPS` (default: 60)
- `PATCH_FINDER_MAX_TOKENS` (default: 1024)
- `PATCH_FINDER_TOP_P` (default: 1.0)
- `PATCH_FINDER_MAX_CONTEXT_CHARS` (default: 48000)
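For example, to give a run a larger step budget and context window (illustrative values):

```bash
export PATCH_FINDER_MAX_STEPS=80
export PATCH_FINDER_MAX_CONTEXT_CHARS=64000
```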
- Launch the vLLM OpenAI-compatible server so the agent can reach it through the standard OpenAI SDK. Example command (run in a WSL or Linux shell):
```bash
vllm serve openai/gpt-oss-20b \
  --dtype auto \
  --enforce-eager \
  --host 0.0.0.0 \
  --port 8000 \
  --tool-call-parser openai \
  --reasoning-parser openai_gptoss \
  --enable-auto-tool-choice
```

Adjust `--host`/`--port` as needed and ensure the server matches the URL set in `OPENAI_BASE_URL`.
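Once the server is up, you can confirm it is reachable and serving the expected model:

```bash
curl -s http://localhost:8000/v1/models
```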
The provided `requirements.txt` already includes all needed dependencies. If you want to host the vLLM server on a different machine, install the server dependencies there:
```bash
uv pip install vllm openai httpx python-dotenv readability-lxml beautifulsoup4 lxml rapidfuzz pydantic tiktoken
```

- Run the agent and request a patch for a vulnerability by its CVE identifier:

```bash
python agent_runner.py CVE-2025-50182 --debug
```

- `--steps` controls the maximum number of LLM/tool interaction rounds (defaults to 60).
- `--debug` prints detailed tool and retry diagnostics; omit it for quieter output.
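For hard-to-trace CVEs you can raise the step budget, e.g.:

```bash
python agent_runner.py CVE-2025-50182 --steps 100
```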
The agent prints either a validated success payload (`SuccessOut`) with commit coordinates or a structured error payload (`ErrorOut`).
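The exact field names are defined by the `SuccessOut` and `ErrorOut` Pydantic models in the code; purely as a hypothetical illustration, a success payload might look like:

```json
{
  "cve_id": "CVE-2025-50182",
  "commit_sha": "0123456789abcdef0123456789abcdef01234567",
  "commit_url": "https://github.com/<owner>/<repo>/commit/0123456789abcdef0123456789abcdef01234567"
}
```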