Skip to content

fix: use SSRF-guarded fetch for TTS provider requests#46333

Closed
cdxiaodong wants to merge 1 commit into
openclaw:mainfrom
cdxiaodong:fix/tts-ssrf-guard
Closed

fix: use SSRF-guarded fetch for TTS provider requests#46333
cdxiaodong wants to merge 1 commit into
openclaw:mainfrom
cdxiaodong:fix/tts-ssrf-guard

Conversation

@cdxiaodong
Copy link
Copy Markdown

Vulnerability: TTS Provider BaseUrl SSRF with API Key Exfiltration

Severity: High (CVSS 7.7)

Root Cause

ElevenLabs and OpenAI TTS functions in tts-core.ts use raw fetch() with user-controlled baseUrl from configuration, without SSRF validation. An attacker with operator.admin scope can set baseUrl to an attacker-controlled server or internal network endpoint (AWS IMDS 169.254.169.254, Redis, K8s API), causing API keys (xi-api-key, Authorization: Bearer) to be leaked in request headers.

Fix

Replace raw fetch() calls in both elevenLabsTTS() and openaiTTS() with fetchWithSsrFGuard(withStrictGuardedFetchMode(...)), which enforces DNS pinning and blocks requests to private/internal IP ranges.

Affected File

  • src/tts/tts-core.ts

ElevenLabs and OpenAI TTS functions used raw fetch() with user-controlled
baseUrl parameters. An attacker with operator.admin scope could redirect
baseUrl to exfiltrate API keys (xi-api-key, Authorization Bearer) and
enable SSRF against internal services.

Replace raw fetch() with fetchWithSsrFGuard(withStrictGuardedFetchMode())
which enforces DNS pinning and private IP blocking.
@greptile-apps
Copy link
Copy Markdown
Contributor

greptile-apps Bot commented Mar 14, 2026

Greptile Summary

This PR closes an SSRF vulnerability in src/tts/tts-core.ts by replacing raw fetch() calls in elevenLabsTTS and openaiTTS with fetchWithSsrFGuard(withStrictGuardedFetchMode(...)), which enforces DNS pinning, blocks requests to private/internal IP ranges, and strips sensitive auth headers (xi-api-key, Authorization) on cross-origin redirects — all of which were possible attack vectors when an operator-level user controlled baseUrl.

  • The core security fix is correct and complete: all fetch() call sites in the file have been migrated; no raw fetch() remains.
  • Cross-origin redirect credential stripping is handled correctly by retainSafeHeadersForCrossOriginRedirect inside the guard.
  • The release() dispatcher cleanup is properly placed in finally blocks to prevent resource leaks.
  • Minor design inconsistency: controller.signal is passed inside init rather than as the top-level signal field of GuardedFetchOptions. This works today but is fragile — if a top-level timeoutMs is ever added to these call sites, the inner signal would be silently overridden.
  • Neither call passes an auditContext, so SSRF block events will log the generic string "url-fetch" rather than a provider-specific identifier, reducing observability.

Confidence Score: 4/5

  • This PR is safe to merge — it introduces a clear security improvement with no functional regressions.
  • The SSRF guard is applied correctly to all affected fetch() call sites, auth headers are stripped on cross-origin redirects, and release() is called unconditionally. The two flagged issues (signal placement and missing auditContext) are both style-level concerns that do not affect correctness or security today; they are worth addressing in a follow-up but do not block merging.
  • No files require special attention — the single changed file (src/tts/tts-core.ts) implements the fix correctly.
Prompt To Fix All With AI
This is a comment left during a code review.
Path: src/tts/tts-core.ts
Line: 599-626

Comment:
**Consider passing `signal` at the top-level and adding `auditContext`**

`fetchWithSsrFGuard` has a dedicated top-level `signal` field on `GuardedFetchOptions` which is processed by `buildAbortSignal` (handling event-listener cleanup and chaining). Placing it inside `init` instead works today (the inner signal is preserved when `params.signal` and `params.timeoutMs` are both absent, since `buildAbortSignal` returns `undefined`), but it is fragile — if `timeoutMs` is ever added to these call sites at the `fetchWithSsrFGuard` level the inner `init.signal` would be silently overridden by the composed signal.

Additionally, passing `auditContext` provides far more useful context in security logs when a request is SSRF-blocked (currently logs will show the generic `"url-fetch"` string).

Suggested refactor for `elevenLabsTTS`:
```ts
const { response, release } = await fetchWithSsrFGuard(
  withStrictGuardedFetchMode({
    url: url.toString(),
    signal: controller.signal,        // ← top-level, not inside init
    auditContext: "elevenlabs-tts",   // ← named context in SSRF block logs
    init: {
      method: "POST",
      headers: {
        "xi-api-key": apiKey,
        "Content-Type": "application/json",
        Accept: "audio/mpeg",
      },
      body: JSON.stringify({ ... }),
      // signal removed from here
    },
  }),
);
```
And analogously for `openaiTTS` with `auditContext: "openai-tts"`.

How can I resolve this? If you propose a fix, please make it concise.

---

This is a comment left during a code review.
Path: src/tts/tts-core.ts
Line: 668-688

Comment:
**Same `signal`-in-`init` / missing `auditContext` concern**

Same issue as in `elevenLabsTTS` above: `signal: controller.signal` is buried inside `init` rather than passed as the top-level `signal` field, and no `auditContext` is provided to identify this call site in SSRF audit logs. Consider moving `signal` to the top level and adding `auditContext: "openai-tts"` for consistency and safety.

How can I resolve this? If you propose a fix, please make it concise.

Last reviewed commit: 6fbc4c2

Comment thread src/tts/tts-core.ts
Comment on lines +599 to +626
const { response, release } = await fetchWithSsrFGuard(
withStrictGuardedFetchMode({
url: url.toString(),
init: {
method: "POST",
headers: {
"xi-api-key": apiKey,
"Content-Type": "application/json",
Accept: "audio/mpeg",
},
body: JSON.stringify({
text,
model_id: modelId,
seed: normalizedSeed,
apply_text_normalization: normalizedNormalization,
language_code: normalizedLanguage,
voice_settings: {
stability: voiceSettings.stability,
similarity_boost: voiceSettings.similarityBoost,
style: voiceSettings.style,
use_speaker_boost: voiceSettings.useSpeakerBoost,
speed: voiceSettings.speed,
},
}),
signal: controller.signal,
},
}),
signal: controller.signal,
});
);
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Consider passing signal at the top-level and adding auditContext

fetchWithSsrFGuard has a dedicated top-level signal field on GuardedFetchOptions which is processed by buildAbortSignal (handling event-listener cleanup and chaining). Placing it inside init instead works today (the inner signal is preserved when params.signal and params.timeoutMs are both absent, since buildAbortSignal returns undefined), but it is fragile — if timeoutMs is ever added to these call sites at the fetchWithSsrFGuard level the inner init.signal would be silently overridden by the composed signal.

Additionally, passing auditContext provides far more useful context in security logs when a request is SSRF-blocked (currently logs will show the generic "url-fetch" string).

Suggested refactor for elevenLabsTTS:

const { response, release } = await fetchWithSsrFGuard(
  withStrictGuardedFetchMode({
    url: url.toString(),
    signal: controller.signal,        // ← top-level, not inside init
    auditContext: "elevenlabs-tts",   // ← named context in SSRF block logs
    init: {
      method: "POST",
      headers: {
        "xi-api-key": apiKey,
        "Content-Type": "application/json",
        Accept: "audio/mpeg",
      },
      body: JSON.stringify({ ... }),
      // signal removed from here
    },
  }),
);

And analogously for openaiTTS with auditContext: "openai-tts".

Prompt To Fix With AI
This is a comment left during a code review.
Path: src/tts/tts-core.ts
Line: 599-626

Comment:
**Consider passing `signal` at the top-level and adding `auditContext`**

`fetchWithSsrFGuard` has a dedicated top-level `signal` field on `GuardedFetchOptions` which is processed by `buildAbortSignal` (handling event-listener cleanup and chaining). Placing it inside `init` instead works today (the inner signal is preserved when `params.signal` and `params.timeoutMs` are both absent, since `buildAbortSignal` returns `undefined`), but it is fragile — if `timeoutMs` is ever added to these call sites at the `fetchWithSsrFGuard` level the inner `init.signal` would be silently overridden by the composed signal.

Additionally, passing `auditContext` provides far more useful context in security logs when a request is SSRF-blocked (currently logs will show the generic `"url-fetch"` string).

Suggested refactor for `elevenLabsTTS`:
```ts
const { response, release } = await fetchWithSsrFGuard(
  withStrictGuardedFetchMode({
    url: url.toString(),
    signal: controller.signal,        // ← top-level, not inside init
    auditContext: "elevenlabs-tts",   // ← named context in SSRF block logs
    init: {
      method: "POST",
      headers: {
        "xi-api-key": apiKey,
        "Content-Type": "application/json",
        Accept: "audio/mpeg",
      },
      body: JSON.stringify({ ... }),
      // signal removed from here
    },
  }),
);
```
And analogously for `openaiTTS` with `auditContext: "openai-tts"`.

How can I resolve this? If you propose a fix, please make it concise.

Comment thread src/tts/tts-core.ts
Comment on lines +668 to +688
const { response, release } = await fetchWithSsrFGuard(
withStrictGuardedFetchMode({
url: `${baseUrl}/audio/speech`,
init: {
method: "POST",
headers: {
Authorization: `Bearer ${apiKey}`,
"Content-Type": "application/json",
},
body: JSON.stringify({
model,
input: text,
voice,
response_format: responseFormat,
...(speed != null && { speed }),
...(effectiveInstructions != null && { instructions: effectiveInstructions }),
}),
signal: controller.signal,
},
}),
signal: controller.signal,
});
);
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same signal-in-init / missing auditContext concern

Same issue as in elevenLabsTTS above: signal: controller.signal is buried inside init rather than passed as the top-level signal field, and no auditContext is provided to identify this call site in SSRF audit logs. Consider moving signal to the top level and adding auditContext: "openai-tts" for consistency and safety.

Prompt To Fix With AI
This is a comment left during a code review.
Path: src/tts/tts-core.ts
Line: 668-688

Comment:
**Same `signal`-in-`init` / missing `auditContext` concern**

Same issue as in `elevenLabsTTS` above: `signal: controller.signal` is buried inside `init` rather than passed as the top-level `signal` field, and no `auditContext` is provided to identify this call site in SSRF audit logs. Consider moving `signal` to the top level and adding `auditContext: "openai-tts"` for consistency and safety.

How can I resolve this? If you propose a fix, please make it concise.

Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 6fbc4c209a

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

Comment thread src/tts/tts-core.ts
Comment on lines +669 to +670
withStrictGuardedFetchMode({
url: `${baseUrl}/audio/speech`,
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Badge Preserve custom TTS endpoints on private networks

Using withStrictGuardedFetchMode here makes every non-public baseUrl fail SSRF checks, so OpenAI-compatible TTS backends on localhost/RFC1918 (which this file explicitly supports via OPENAI_TTS_BASE_URL, e.g. http://localhost:8880/v1) now fail at runtime with SsrFBlockedError. This is a regression from the previous behavior and breaks documented self-hosted/local TTS setups unless a private-network policy escape hatch is added for trusted operator-configured endpoints.

Useful? React with 👍 / 👎.

@steipete
Copy link
Copy Markdown
Contributor

Closing this as implemented after Codex review.

Current main already routes the OpenAI and ElevenLabs TTS requests through the shared SSRF-guarded fetch path in their plugin-owned implementations, and the PR's target file no longer owns those network calls.

What I checked:

  • Current OpenAI TTS uses guarded fetch: openaiTTS() now calls fetchWithSsrFGuard(...) with ssrfPolicyFromHttpBaseUrlAllowedHostname(baseUrl) and auditContext: "openai-tts". (extensions/openai/tts.ts:107, 0bef73d151bd)
  • Current ElevenLabs TTS uses guarded fetch: elevenLabsTTS() now calls fetchWithSsrFGuard(...) with ssrfPolicyFromHttpBaseUrlAllowedHostname(normalizedBaseUrl) and auditContext: "elevenlabs.tts". (extensions/elevenlabs/tts.ts:83, 0bef73d151bd)
  • PR target file no longer owns these call sites: Current src/tts/tts-core.ts is only 159 lines and no longer contains openaiTTS, elevenLabsTTS, or raw/fetch-guarded provider request logic; the implementations live under bundled provider plugins. (src/tts/tts-core.ts:1, 0bef73d151bd)
  • Latest release still had raw fetch: In release v2026.4.23, both provider files still used raw fetch(...) for the TTS request path, so this appears implemented on main after that release rather than already shipped there. (extensions/openai/tts.ts:149, a9797214338b)
  • Self-hosted/local TTS support still exists on main: Docs still advertise custom OpenAI TTS baseUrl / OPENAI_TTS_BASE_URL, and config validation still accepts a localhost OpenAI TTS base URL (http://localhost:8880/v1) for voice-call config. (src/config/config.plugin-validation.test.ts:461, 0bef73d151bd)
  • Hostname-scoped SSRF policy helper is exported and tested: The plugin SDK exports ssrfPolicyFromHttpBaseUrlAllowedHostname, and SSRF tests cover building an allowed-hostname policy from HTTP base URLs. (src/plugin-sdk/ssrf-runtime.ts:4, 0bef73d151bd)

So I’m closing this as already implemented rather than keeping a duplicate issue open.

Review notes: reviewed against 0bef73d151bd.

@steipete steipete closed this Apr 25, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants