fix: add SSRF guard to Anthropic/Gemini PDF providers and move Gemini API key to header by cdxiaodong · Pull Request #46377 · openclaw/openclaw

cdxiaodong · 2026-03-14T16:44:15Z

Summary

Both anthropicAnalyzePdf() and geminiAnalyzePdf() in src/agents/tools/pdf-native-providers.ts used raw fetch() with a user-controlled baseUrl parameter. An attacker could set baseUrl to an internal/private IP address, causing the server to make requests to internal services (SSRF).
The Anthropic function leaked the x-api-key header to any attacker-controlled destination.
The Gemini function passed the API key as a URL query parameter (?key=...), exposing it in server logs, proxy logs, and HTTP Referer headers (CWE-598).

Changes

Replace raw fetch() with fetchWithSsrFGuard(withStrictGuardedFetchMode(...)) in both functions, which validates the resolved hostname/IP against the SSRF blocklist before connecting.
For Gemini, move the API key from URL query parameter to the x-goog-api-key HTTP header to prevent credential leakage.
Add proper release() cleanup in finally blocks for both functions.
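
As a rough illustration of the Gemini change, the sketch below builds a request target and header set with the key carried only in `x-goog-api-key`. The helper name `buildGeminiRequest` and the `/v1beta/models/...:generateContent` path are assumptions for illustration; the PR text does not show the exact URL that `geminiAnalyzePdf` builds.

```typescript
// Hypothetical helper (not from the PR): keeps the API key out of the URL.
interface GeminiRequestParts {
  url: string;
  headers: Record<string, string>;
}

function buildGeminiRequest(
  baseUrl: string,
  modelId: string,
  apiKey: string,
): GeminiRequestParts {
  // Trim trailing slashes so the joined path has exactly one separator.
  const root = baseUrl.replace(/\/+$/, "");
  // Assumed endpoint shape; note there is no `?key=` query parameter.
  const url = `${root}/v1beta/models/${encodeURIComponent(modelId)}:generateContent`;
  return {
    url,
    headers: {
      "Content-Type": "application/json",
      // The credential travels in a header, so it stays out of URL-based logs.
      "x-goog-api-key": apiKey,
    },
  };
}
```

With this shape, anything that records only the request URL (access logs, proxy logs) never sees the key.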

Test plan

Verify Anthropic PDF analysis still works with default https://api.anthropic.com base URL
Verify Gemini PDF analysis still works with default https://generativelanguage.googleapis.com base URL
Confirm that setting baseUrl to a private/internal IP (e.g. http://169.254.169.254) is blocked by the SSRF guard
Confirm Gemini API key no longer appears in URL query parameters
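
The blocked-`baseUrl` item in the test plan can be pictured with a much-simplified check like the one below. It is only a sketch: `isBaseUrlBlocked` and its helpers are hypothetical, they handle IPv4 literals only, and the real guard described in this PR resolves hostnames before deciding, which this code does not do.

```typescript
// Hypothetical, simplified check for IPv4 literals; not the repository's guard.
function parseIPv4(host: string): number[] | null {
  const parts = host.split(".");
  if (parts.length !== 4) return null;
  const octets = parts.map((p) => (/^\d{1,3}$/.test(p) ? Number(p) : NaN));
  return octets.every((o) => Number.isInteger(o) && o <= 255) ? octets : null;
}

function isPrivateOrInternalIPv4(host: string): boolean {
  const octets = parseIPv4(host);
  // Not an IPv4 literal: a real guard would resolve DNS and check the result.
  if (octets === null) return false;
  const a = octets[0] ?? -1;
  const b = octets[1] ?? -1;
  return (
    a === 0 || // "this network"
    a === 10 || // 10.0.0.0/8
    a === 127 || // loopback
    (a === 169 && b === 254) || // link-local, including cloud metadata endpoints
    (a === 172 && b >= 16 && b <= 31) || // 172.16.0.0/12
    (a === 192 && b === 168) // 192.168.0.0/16
  );
}

function isBaseUrlBlocked(baseUrl: string): boolean {
  return isPrivateOrInternalIPv4(new URL(baseUrl).hostname);
}
```

Under these assumptions, `isBaseUrlBlocked("http://169.254.169.254")` is true while the default provider hosts pass through.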

greptile-apps · 2026-03-14T16:46:12Z

Greptile Summary

This PR fixes two concrete security vulnerabilities — SSRF exposure and API credential leakage — in anthropicAnalyzePdf and geminiAnalyzePdf by replacing raw fetch() calls with fetchWithSsrFGuard(withStrictGuardedFetchMode(...)), moving the Gemini API key from a URL query parameter to a request header, and adding proper release() cleanup in finally blocks.

SSRF guard applied — both functions now use fetchWithSsrFGuard with STRICT mode, which resolves the target hostname before connecting and rejects private/internal IPs, preventing a user-controlled baseUrl from reaching internal services.
Gemini credential moved to header — the API key is sent via x-goog-api-key instead of appending it to the URL, preventing exposure in server logs, proxy logs, and HTTP Referer headers (CWE-598).
Credentials stripped on cross-origin redirects — fetchWithSsrFGuard already strips non-safe headers on cross-origin redirects, so sensitive headers are not forwarded to unintended hosts.
release() cleanup added — both functions now call release() in a finally block, ensuring the pinned dispatcher is closed after every request outcome.
Style note: Neither call provides an auditContext, so blocked SSRF attempts will appear in logs as the generic "url-fetch" context rather than "anthropic-pdf" / "gemini-pdf", reducing observability. The try blocks also open after the await fetchWithSsrFGuard(...) assignment rather than wrapping the full acquisition, which is functionally safe but unconventional.
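
To make the cross-origin redirect point concrete, here is a minimal sketch of header stripping on redirect. The function name `headersForRedirect` and the `SAFE_HEADERS` list are assumptions; the summary above does not say which headers `fetchWithSsrFGuard` treats as safe.

```typescript
// Hypothetical allowlist; the real guard's notion of "safe" headers may differ.
const SAFE_HEADERS = new Set(["accept", "accept-language", "content-type", "user-agent"]);

function headersForRedirect(
  fromUrl: string,
  toUrl: string,
  headers: Record<string, string>,
): Record<string, string> {
  // Same origin: the destination is the host we already trusted with credentials.
  if (new URL(fromUrl).origin === new URL(toUrl).origin) {
    return { ...headers };
  }
  // Cross origin: forward only headers that cannot carry credentials.
  const kept: Record<string, string> = {};
  for (const [name, value] of Object.entries(headers)) {
    if (SAFE_HEADERS.has(name.toLowerCase())) {
      kept[name] = value;
    }
  }
  return kept;
}
```

In this sketch a redirect from `https://api.anthropic.com` to another origin would drop `x-api-key` and keep `Content-Type`.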

Confidence Score: 4/5

This PR is safe to merge — the security fixes are correct and well-implemented, with only minor style improvements remaining.
The SSRF guard is applied correctly with strict mode, the Gemini API key leakage via query parameter is fixed, and release() cleanup is handled in finally blocks. The fetchWithSsrFGuard implementation also ensures credentials are stripped on cross-origin redirects. Minor issues: no auditContext set (reduces security log observability) and the try block doesn't cover the resource acquisition scope (safe today but unconventional).
No files require special attention beyond the two style suggestions noted above.

Prompt To Fix All With AI

This is a comment left during a code review.
Path: src/agents/tools/pdf-native-providers.ts
Line: 68-86

Comment:
**Consider adding `auditContext` for better SSRF log attribution**

Neither provider specifies `auditContext` in the `fetchWithSsrFGuard` call. When an SSRF attempt is blocked, the guard logs a warning using the value of `auditContext ?? "url-fetch"`, so all blocked attempts from these functions will appear as the generic `"url-fetch"` context rather than something like `"anthropic-pdf"` or `"gemini-pdf"`. Adding it would make security incident investigations substantially easier.

For `anthropicAnalyzePdf`:
```suggestion
  const { response: res, release } = await fetchWithSsrFGuard(
    withStrictGuardedFetchMode({
      url: fetchUrl,
      auditContext: "anthropic-pdf",
      init: {
```

A similar change for `geminiAnalyzePdf` (line ~159–173) would set `auditContext: "gemini-pdf"`.

How can I resolve this? If you propose a fix, please make it concise.

---

This is a comment left during a code review.
Path: src/agents/tools/pdf-native-providers.ts
Line: 87-118

Comment:
**`try` block doesn't cover resource acquisition**

The `try/finally` block opens *after* the `await fetchWithSsrFGuard(...)` call, meaning the `release` returned by a successful call is not guarded against an exception thrown between the awaited result assignment and the `try`. In practice this is safe today because `fetchWithSsrFGuard` calls its internal cleanup before re-throwing any error, making the returned `release` a no-op if it were skipped. However the pattern is fragile and may confuse future maintainers into thinking the `try` covers the full resource lifecycle.

A more conventional pattern that makes the intent explicit:

```typescript
let release: (() => Promise<void>) | undefined;
try {
  const result = await fetchWithSsrFGuard(withStrictGuardedFetchMode({ ... }));
  const res = result.response;
  release = result.release;

  // ... response handling
  return text.trim();
} finally {
  await release?.();
}
```

The same applies to the `geminiAnalyzePdf` function (lines 174–202).

How can I resolve this? If you propose a fix, please make it concise.

Last reviewed commit: 970a85d

greptile-apps · 2026-03-14T16:46:16Z

+  const { response: res, release } = await fetchWithSsrFGuard(
+    withStrictGuardedFetchMode({
+      url: fetchUrl,
+      init: {
+        method: "POST",
+        headers: {
+          "Content-Type": "application/json",
+          "x-api-key": apiKey,
+          "anthropic-version": "2023-06-01",
+          "anthropic-beta": "pdfs-2024-09-25",
+        },
+        body: JSON.stringify({
+          model: params.modelId,
+          max_tokens: params.maxTokens ?? 4096,
+          messages: [{ role: "user", content }],
+        }),
+      },
    }),
-  });
+  );


Consider adding auditContext for better SSRF log attribution

Neither provider specifies auditContext in the fetchWithSsrFGuard call. When an SSRF attempt is blocked, the guard logs a warning using the value of auditContext ?? "url-fetch", so all blocked attempts from these functions will appear as the generic "url-fetch" context rather than something like "anthropic-pdf" or "gemini-pdf". Adding it would make security incident investigations substantially easier.

For anthropicAnalyzePdf:

Suggested change

-  const { response: res, release } = await fetchWithSsrFGuard(
-    withStrictGuardedFetchMode({
-      url: fetchUrl,
-      init: {
-        method: "POST",
-        headers: {
-          "Content-Type": "application/json",
-          "x-api-key": apiKey,
-          "anthropic-version": "2023-06-01",
-          "anthropic-beta": "pdfs-2024-09-25",
-        },
-        body: JSON.stringify({
-          model: params.modelId,
-          max_tokens: params.maxTokens ?? 4096,
-          messages: [{ role: "user", content }],
-        }),
-      },
-    }),
-  });
-  );
+  const { response: res, release } = await fetchWithSsrFGuard(
+    withStrictGuardedFetchMode({
+      url: fetchUrl,
+      auditContext: "anthropic-pdf",
+      init: {

A similar change for geminiAnalyzePdf (line ~159–173) would set auditContext: "gemini-pdf".
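
The fallback described in this comment can be sketched as follows. Only the `auditContext ?? "url-fetch"` expression comes from the review text; the option type, the function name `describeBlockedRequest`, and the log line format are hypothetical.

```typescript
// Hypothetical option shape; only `auditContext` and its fallback come from the review.
interface GuardedFetchOptions {
  url: string;
  auditContext?: string;
}

function describeBlockedRequest(opts: GuardedFetchOptions): string {
  // Without an explicit context, every caller is reported under the same label.
  const context = opts.auditContext ?? "url-fetch";
  // Assumed log format, for illustration only.
  return `[ssrf-guard] blocked request (context=${context}) to ${new URL(opts.url).host}`;
}
```

Passing `auditContext: "anthropic-pdf"` changes the label in the message, which is what makes blocked attempts attributable to a specific provider call.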

greptile-apps · 2026-03-14T16:46:17Z

+  try {

  if (!res.ok) {
    const body = await res.text().catch(() => "");
    throw new Error(
      `Anthropic PDF request failed (${res.status} ${res.statusText})${body ? `: ${body.slice(0, 400)}` : ""}`,
    );
  }

  const json = (await res.json().catch(() => null)) as unknown;
  if (!isRecord(json)) {
    throw new Error("Anthropic PDF response was not JSON.");
  }

  const responseContent = json.content as AnthropicResponseContent | undefined;
  if (!Array.isArray(responseContent)) {
    throw new Error("Anthropic PDF response missing content array.");
  }

  const text = responseContent
    .filter((block) => block.type === "text" && typeof block.text === "string")
    .map((block) => block.text!)
    .join("");

  if (!text.trim()) {
    throw new Error("Anthropic PDF returned no text.");
  }

  return text.trim();
+  } finally {
+    await release();
+  }


try block doesn't cover resource acquisition

The try/finally block opens after the await fetchWithSsrFGuard(...) call, meaning the release returned by a successful call is not guarded against an exception thrown between the awaited result assignment and the try. In practice this is safe today because fetchWithSsrFGuard calls its internal cleanup before re-throwing any error, making the returned release a no-op if it were skipped. However the pattern is fragile and may confuse future maintainers into thinking the try covers the full resource lifecycle.

A more conventional pattern that makes the intent explicit:

let release: (() => Promise<void>) | undefined;
try {
  const result = await fetchWithSsrFGuard(withStrictGuardedFetchMode({ ... }));
  const res = result.response;
  release = result.release;

  // ... response handling
  return text.trim();
} finally {
  await release?.();
}

The same applies to the geminiAnalyzePdf function (lines 174–202).
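
A runnable version of that pattern, using a stand-in for the guarded fetch, might look like the sketch below. `fakeGuardedFetch` and `analyze` are hypothetical names and the response shape is simplified; the point is only that `release` runs exactly once whether response handling returns or throws.

```typescript
interface GuardedResult {
  response: { ok: boolean; text(): Promise<string> };
  release: () => Promise<void>;
}

// Stand-in for the guarded fetch; it performs no network I/O.
async function fakeGuardedFetch(onRelease: () => void, ok: boolean): Promise<GuardedResult> {
  return {
    response: { ok, text: async () => " extracted text " },
    release: async () => {
      onRelease();
    },
  };
}

async function analyze(onRelease: () => void, ok: boolean): Promise<string> {
  let release: (() => Promise<void>) | undefined;
  try {
    // Acquisition happens inside the try, so the finally covers the whole lifecycle.
    const result = await fakeGuardedFetch(onRelease, ok);
    release = result.release;
    if (!result.response.ok) {
      throw new Error("request failed");
    }
    return (await result.response.text()).trim();
  } finally {
    // Runs on both the success path and the error path.
    await release?.();
  }
}
```

If the acquisition itself throws, `release` is still `undefined` and the optional call is skipped, so nothing is released twice.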

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 970a85dea9

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

Open a pull request for review
Mark a draft as ready
Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

chatgpt-codex-connector · 2026-03-14T16:50:44Z

-      messages: [{ role: "user", content }],
+  const fetchUrl = `${baseUrl}/v1/messages`;
+  const { response: res, release } = await fetchWithSsrFGuard(
+    withStrictGuardedFetchMode({


Keep guarded PDF fetches compatible with env proxy setups

Using withStrictGuardedFetchMode here forces fetchWithSsrFGuard down strict mode, which creates a pinned direct dispatcher and bypasses the global env-proxy dispatcher. In deployments that require HTTP(S)_PROXY for outbound access (no direct egress), native PDF calls will now fail for Anthropic/Gemini while the rest of model traffic can still work through the global proxy path, so this change introduces a production regression for proxy-only environments.
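
One way to picture the concern is a mode selector that honors proxy environment variables, sketched below. This is not the repository's logic: the mode names, the function `selectGuardedFetchMode`, and the set of environment variables checked are all assumptions made for illustration.

```typescript
// Hypothetical mode names, loosely following the terms used in this thread.
type GuardedFetchMode = "strict" | "trusted-env-proxy";

const PROXY_VARS = ["HTTPS_PROXY", "https_proxy", "HTTP_PROXY", "http_proxy"];

function selectGuardedFetchMode(env: Record<string, string | undefined>): GuardedFetchMode {
  // If any proxy variable is set to a non-empty value, assume egress must go
  // through that proxy rather than through a pinned direct connection.
  const hasProxy = PROXY_VARS.some((name) => (env[name] ?? "").trim() !== "");
  return hasProxy ? "trusted-env-proxy" : "strict";
}
```

A call that always picks the strict mode, as this diff does, would behave like the no-proxy branch here even in deployments where `HTTPS_PROXY` is the only route out.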

clawsweeper · 2026-04-28T07:05:33Z

Codex review: needs real behavior proof before merge.

Workflow note: Future ClawSweeper reviews update this same comment in place.

How this review workflow works

ClawSweeper keeps one durable marker-backed review comment per issue or PR.
Re-runs edit this comment so the latest verdict, findings, and automation markers stay together instead of adding duplicate bot comments.
A fresh review can be triggered by eligible @clawsweeper re-review comments, exact-item GitHub events, scheduled/background review runs, or manual workflow dispatch.
PR/issue authors and users with repository write access can comment @clawsweeper re-review or @clawsweeper re-run on an open PR or issue to request a fresh review only.
Maintainers can also comment @clawsweeper review to request a fresh review only.
Fresh-review commands do not start repair, autofix, rebase, CI repair, or automerge.
Maintainer-only repair and merge flows require explicit commands such as @clawsweeper autofix, @clawsweeper automerge, @clawsweeper fix ci, or @clawsweeper address review.
Maintainers can comment @clawsweeper explain to ask for more context, or @clawsweeper stop to stop active automation.

Summary
The PR wraps native Anthropic/Gemini PDF provider requests in strict SSRF-guarded fetches, moves Gemini auth to a header, and adds guarded-fetch release cleanup.

Reproducibility: yes, at source level. Current main passes model.baseUrl into the native PDF helpers, and those helpers still perform raw provider fetches against derived URLs; I did not run live provider credentials.

PR rating
Overall: 🧂 unranked krab
Proof: 🧂 unranked krab
Patch quality: 🦐 gold shrimp
Summary: The PR targets a real security issue, but missing real behavior proof and a provider-transport regression make it not quality-ready.

Rank-up moves:

Rework native PDF requests through a provider-aware guarded fetch path that preserves env-proxy/dispatcher policy.
Add redacted real behavior proof for default Anthropic and Gemini native PDF calls plus a blocked private/internal baseUrl.

What the crustacean ranks mean

🦀 challenger crab: rare, exceptional readiness with strong proof, clean implementation, and convincing validation.
🦞 diamond lobster: very strong readiness with only minor maintainer review expected.
🐚 platinum hermit: good normal PR, likely mergeable with ordinary maintainer review.
🦐 gold shrimp: useful signal, but proof or patch confidence is still limited.
🦪 silver shellfish: thin signal; proof, validation, or implementation needs work.
🧂 unranked krab: not merge-ready because proof is missing/unusable or there are serious correctness or safety concerns.
🌊 off-meta tidepool: rating does not apply to this item.

Shiny media proof means a screenshot, video, or linked artifact directly shows the changed behavior. Runtime, network, CSP, and security claims still need visible diagnostics.

PR egg
🎁 Pass real behavior proof to wake the egg and unlock a hatchable treat.

Where did the egg go?

The egg game starts only after the PR passes the real-behavior proof check.
Before that, no creature, rarity, or ASCII portrait is rolled. The treat waits for real proof.
This is still just collectible flavor: proof affects review readiness, not creature quality.

Real behavior proof
Needs real behavior proof before merge: The PR body only has unchecked test-plan items; the contributor needs redacted live terminal/log/screenshot/recording proof for successful provider calls and a blocked private baseUrl, then should update the PR body for re-review.

Risk before merge
Why this matters:

Merging as-is can break Anthropic/Gemini native PDF calls in proxy-only deployments because the PR bypasses the provider transport path that selects trusted env-proxy mode.
Direct strict guarded fetch can route provider traffic outside the same controlled egress policy used by other model requests.
The branch is conflicting with current main, where Gemini header auth and provider base URL normalization have already changed.
No redacted live proof shows successful default Anthropic/Gemini native PDF calls or blocked private/internal baseUrl behavior after the patch.

Maintainer options:

Rework Through Provider Transport (recommended)
Route native PDF fetches through a provider-aware guarded model fetch helper so SSRF blocking and provider proxy/dispatcher policy are both preserved.
Pause Until Security Proof Exists
Keep the PR open but unmerged until a maintainer reviews the egress-policy shape and the contributor posts redacted real provider proof.

Next step before merge
Needs security owner review, conflict resolution, and contributor-supplied real behavior proof; automation cannot supply the contributor's live provider proof for this external PR.

Security
Needs attention: The patch attempts useful SSRF hardening but introduces a provider egress-policy regression by bypassing provider transport routing.

Review findings

[P1] Preserve provider transport routing — src/agents/tools/pdf-native-providers.ts:68-69

Review details

Best possible solution:

Introduce a provider-transport-aware guarded fetch path for native PDF analysis that preserves SSRF blocking, provider proxy/dispatcher policy, native PDF timeouts, Gemini header auth, audit context, and focused regression coverage.

Do we have a high-confidence way to reproduce the issue?

Yes, at source level. Current main passes model.baseUrl into the native PDF helpers, and those helpers still perform raw provider fetches against derived URLs; I did not run live provider credentials.

Is this the best way to solve the issue?

No. The security direction is right, but direct strict guarded fetch is not the best fix because native PDF provider traffic needs to preserve the same provider transport policy as other model traffic.

Label justifications:

P1: The PR concerns SSRF and provider credential handling in an agent tool, but the proposed fix can regress real provider egress setups.
merge-risk: 🚨 compatibility: Strict guarded fetch can break existing proxy-only deployments that rely on provider transport env-proxy routing.
merge-risk: 🚨 auth-provider: The diff changes provider request routing and credential placement for native PDF provider calls.
merge-risk: 🚨 security-boundary: Bypassing provider transport can route model-provider traffic outside the intended controlled egress boundary.

Full review comments:

[P1] Preserve provider transport routing — src/agents/tools/pdf-native-providers.ts:68-69
Using withStrictGuardedFetchMode directly sends native PDF POSTs outside buildGuardedModelFetch, where provider calls switch to trusted env-proxy mode when needed. This can break or bypass controlled egress in proxy-only deployments; the same issue applies to the Gemini call later in the file.
Confidence: 0.9

Overall correctness: patch is incorrect
Overall confidence: 0.91

Security concerns:

[medium] Strict guard bypasses provider proxy policy — src/agents/tools/pdf-native-providers.ts:68
The new native PDF calls use strict guarded fetch directly, which can fall outside provider transport's trusted env-proxy behavior and break or bypass expected controlled egress for provider requests.
Confidence: 0.88

Acceptance criteria:

node scripts/run-vitest.mjs src/agents/tools/pdf-native-providers.test.ts
node scripts/run-vitest.mjs src/agents/provider-transport-fetch.test.ts
Redacted live proof for default Anthropic and Gemini native PDF calls plus a blocked private/internal baseUrl

What I checked:

Current Anthropic native PDF path still uses raw fetch: Current main derives the Anthropic request URL from params.baseUrl and calls raw fetch with the x-api-key header. (src/agents/tools/pdf-native-providers.ts:65, 57028585538c)
Current Gemini native PDF path still uses raw fetch: Current main has already moved Gemini auth to x-goog-api-key, but still posts to a baseUrl-derived URL through raw fetch. (src/agents/tools/pdf-native-providers.ts:156, 57028585538c)
Native PDF helpers receive configured model baseUrl: The PDF tool passes model.baseUrl into both native provider helpers, so the URL-controlled provider fetch path is source-reproducible. (src/agents/tools/pdf-tool.ts:193, 57028585538c)
Provider transport owns env-proxy routing: buildGuardedModelFetch switches provider requests to trusted env-proxy guarded mode when no explicit dispatcher policy is configured. (src/agents/provider-transport-fetch.ts:574, 57028585538c)
Strict guarded mode is a separate transport choice: withStrictGuardedFetchMode forces strict mode, while trusted env-proxy mode is a distinct guarded-fetch preset used by provider transport. (src/infra/net/fetch-guard.ts:108, 57028585538c)
PR diff bypasses provider transport: The PR imports fetchWithSsrFGuard and withStrictGuardedFetchMode directly into the native PDF helper instead of extending or reusing the provider-aware fetch path. (src/agents/tools/pdf-native-providers.ts:68, 970a85dea93c)

Likely related people:

tyler6204: Authored and merged the PDF analysis tool with native provider support, including the native PDF provider helper. (role: introduced behavior; confidence: high; commits: d0ac1b019517; files: src/agents/tools/pdf-native-providers.ts, src/agents/tools/pdf-tool.ts)
steipete: Recent commits changed provider policy hooks, Google Generative AI normalization, guarded fetch modes, and the release snapshot containing the current native PDF helper behavior. (role: recent area contributor; confidence: high; commits: c973b053a5e2, 5cdb50abe6c5, d042192c7c9c; files: src/agents/tools/pdf-native-providers.ts, src/agents/provider-transport-fetch.ts, src/infra/net/fetch-guard.ts)
0xsline: Authored the merged Gemini PDF URL normalization fix, which is adjacent to the Gemini provider URL surface touched here. (role: adjacent bug-fix contributor; confidence: medium; commits: bfeea5d23fc6; files: src/agents/tools/pdf-native-providers.ts)

Codex review notes: model gpt-5.5, reasoning high; reviewed against 57028585538c.

fix: add SSRF guard to Anthropic/Gemini PDF providers and move Gemini…

970a85d

… API key to header

openclaw-barnacle Bot added agents Agent runtime and tooling size: S labels Mar 14, 2026

greptile-apps Bot reviewed Mar 14, 2026

chatgpt-codex-connector Bot reviewed Mar 14, 2026

BingqingLyu mentioned this pull request Apr 27, 2026

fix: add SSRF guard to Anthropic/Gemini PDF providers and move Gemini API key to header BingqingLyu/openclaw#704

Open

4 tasks

clawsweeper Bot added the P1 High-priority user-facing bug, regression, or broken workflow. label May 16, 2026

openclaw-barnacle Bot added triage: needs-real-behavior-proof Candidate: external PR needs after-fix proof from a real setup. triage: refactor-only Candidate: refactor/cleanup-only PR without maintainer context. labels May 16, 2026

fix: add SSRF guard to Anthropic/Gemini PDF providers and move Gemini API key to header #46377
cdxiaodong wants to merge 1 commit into openclaw:main from cdxiaodong:fix/pdf-provider-ssrf-guard
