Skip to content

improve: add empty-text guard and retry for Edge TTS#47232

Closed
Huntterxx wants to merge 20 commits intoopenclaw:mainfrom
Huntterxx:improve/edge-tts-followup
Closed

improve: add empty-text guard and retry for Edge TTS#47232
Huntterxx wants to merge 20 commits intoopenclaw:mainfrom
Huntterxx:improve/edge-tts-followup

Conversation

@Huntterxx
Copy link
Copy Markdown
Contributor

Follow-up to PR #43385

Implements reviewer suggestions:

  • Add pre-check for empty or whitespace-only text
  • Retry once if Edge TTS produces a zero-byte file
  • Include file size in error message for debugging

Summary

Describe the problem and fix in 2–5 bullets:

  • Problem: Edge TTS could be called with empty text or produce a zero-byte file due to transient failures.
  • Why it matters: This wastes API calls and may still fail permanently even when the issue is temporary.
  • What changed: Added an empty-text guard before TTS invocation and a single retry if a zero-byte file is produced, including file size in the error message.
  • What did NOT change (scope boundary): No changes to provider logic, configuration, or external integrations.

Change Type (select all)

  • Bug fix
  • Feature
  • Refactor
  • Docs
  • Security hardening
  • Chore/infra

Scope (select all touched areas)

  • Gateway / orchestration
  • Skills / tool execution
  • Auth / tokens
  • Memory / storage
  • Integrations
  • API / contracts
  • UI / DX
  • CI/CD / infra

Linked Issue/PR

User-visible / Behavior Changes

Improves robustness of Edge TTS.
Empty or whitespace-only text now fails fast instead of invoking TTS.

Security Impact (required)

  • New permissions/capabilities? (No)
  • Secrets/tokens handling changed? (No)
  • New/changed network calls? (No)
  • Command/tool execution surface changed? (No)
  • Data access scope changed? (No)
  • If any Yes, explain risk + mitigation:

Repro + Verification

Environment

  • OS: N/A
  • Runtime/container: Node.js
  • Model/provider: Edge TTS
  • Integration/channel (if any): N/A
  • Relevant config (redacted): N/A

Steps

  1. Use default Edge TTS configuration
  2. Call edgeTTS with empty or whitespace-only text
  3. Simulate zero-byte output from TTS

Expected

  • Empty text throws immediately
  • Zero-byte output triggers retry and throws only if retry also fails

Actual

  • Matches expected behavior

Evidence

Attach at least one:

  • Failing test/log before + passing after
  • Trace/log snippets
  • Screenshot/recording
  • Perf numbers (if relevant)

Human Verification (required)

Verified empty-text guard throws before invoking TTS.
Verified retry executes when zero-byte file is produced.
Verified error includes file size when failure persists.

Review Conversations

  • I replied to or resolved every bot review conversation I addressed in this PR.
  • I left unresolved only the conversations that still need reviewer or maintainer judgment.

If a bot review conversation is addressed by this PR, resolve that conversation yourself. Do not leave bot review conversation cleanup for maintainers.

Compatibility / Migration

  • Backward compatible? (Yes)
  • Config/env changes? (No)
  • Migration needed? (No)
  • If yes, exact upgrade steps:

Failure Recovery (if this breaks)

  • Revert this commit.

Risks and Mitigations

List only real risks for this PR. Add/remove entries as needed. If none, write None.

  • Risk: Additional TTS invocation due to retry may slightly increase latency.
  • Mitigation: Retry occurs only on zero-byte output, which indicates failure.

Follow-up to PR openclaw#43385

Implements reviewer suggestions:

- Add pre-check for empty or whitespace-only text
- Retry once if Edge TTS produces a zero-byte file
- Include file size in error message for debugging
@greptile-apps
Copy link
Copy Markdown
Contributor

greptile-apps Bot commented Mar 15, 2026

Greptile Summary

This PR improves robustness of Edge TTS by adding an empty/whitespace-only text guard and a single retry when the output file is zero bytes. The changes are minimal and well-scoped, but there are two issues worth addressing before merging.

  • Guard ordering (logic): The empty-text check at line 720 is placed after the EdgeTTS object is constructed (lines 707–717). The guard should be moved to the top of the function body, before any TTS initialization, to ensure no resources are allocated on invalid input.
  • Vacuous debug info (style): The error message size=${size} on line 734 will always print size=0 because the branch is only reachable when the retry also produces a zero-byte file. The interpolation adds no diagnostic value and should either be removed or replaced with a more meaningful message (e.g., "after retry").
  • The retry logic itself is sound — it re-uses the same tts instance and re-stats the output file before re-throwing.

Confidence Score: 3/5

  • Mostly safe to merge; the guard ordering issue is a logic smell rather than a runtime failure path, but it should be fixed for correctness.
  • The overall intent and structure are correct. The primary concern is that the empty-text guard fires after EdgeTTS construction rather than before, which is a logic ordering issue. The misleading size=0 message is a minor style issue. Neither causes a crash or data loss, but the guard order should be corrected before merging.
  • src/tts/tts-core.ts — guard placement (lines 707–722) and error message (line 734)
Prompt To Fix All With AI
This is a comment left during a code review.
Path: src/tts/tts-core.ts
Line: 707-722

Comment:
**Empty-text guard placed after `EdgeTTS` construction**

The empty-text guard (line 720) is checked *after* `new EdgeTTS(...)` is already instantiated (lines 707–717). If constructing `EdgeTTS` allocates resources, opens connections, or has any other side effects, those happen before the guard can short-circuit. The guard should be the very first thing in the function body to prevent unnecessary initialization.

```suggestion
  const { text, outputPath, config, timeoutMs } = params;

  if (!text || text.trim().length === 0) {
    throw new Error("TTS text cannot be empty");
  }

  const tts = new EdgeTTS({
    voice: config.voice,
    lang: config.lang,
    outputFormat: config.outputFormat,
    saveSubtitles: config.saveSubtitles,
    proxy: config.proxy,
    rate: config.rate,
    pitch: config.pitch,
    volume: config.volume,
    timeout: config.timeoutMs ?? timeoutMs,
  });
```

How can I resolve this? If you propose a fix, please make it concise.

---

This is a comment left during a code review.
Path: src/tts/tts-core.ts
Line: 734

Comment:
**`size` in error message is always `0`**

The PR description states the goal is to "include file size in error message for debugging," but `size` at this point is always `0` — we only reach this line inside a branch guarded by `if (size === 0)`. The interpolated value provides no additional diagnostic information beyond what the phrase "empty audio file" already conveys.

If the intent is future-proofing for a condition where the file is non-zero but suspiciously small, that threshold should be made explicit. Otherwise, the `(size=0)` suffix can be dropped to keep the message clean.

```suggestion
      throw new Error("Edge TTS produced empty audio file after retry");
```

How can I resolve this? If you propose a fix, please make it concise.

Last reviewed commit: e179a32

Comment thread src/tts/tts-core.ts Outdated
Comment thread src/tts/tts-core.ts Outdated
Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: e7fdf9716c

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread src/tts/tts-core.ts Outdated
@chatgpt-codex-connector
Copy link
Copy Markdown

To use Codex here, create an environment for this repo.

@Huntterxx
Copy link
Copy Markdown
Contributor Author

@BunsDev This pull request is ready for review and merge. All changes have been completed, tested, and no errors were found. Please let me know if any adjustments are needed.

@steipete
Copy link
Copy Markdown
Contributor

Maintainer review update against current main: keep open, but this PR is stale and not mergeable as-is.

The underlying robustness gap still exists, but the implementation target has moved. Current main has Microsoft/Edge TTS in extensions/microsoft/tts.ts, not the old src/tts/tts-core.ts path this PR edits.

Current source evidence:

  • text is passed directly into tts.ttsPromise(text, outputPath) with no empty/whitespace guard.
  • zero-byte output throws Edge TTS produced empty audio file immediately after one stat; there is no retry.
  • tests exist in extensions/microsoft/tts.test.ts, so the right regression target is there.

Minimal good rebase:

  • move the empty-text guard to the top of textToSpeechWithEdge, before dependency loading / EdgeTTS construction;
  • retry ttsPromise once when the output file exists but has size 0;
  • keep the final error message explicit (after retry is more useful than size=0);
  • add/adjust tests in extensions/microsoft/tts.test.ts for whitespace input and one successful retry after an empty first write.

So: valid fix idea, but needs a rebase onto the Microsoft speech plugin layout before we should merge it.

@steipete
Copy link
Copy Markdown
Contributor

Thanks @Huntterxx. I ported the useful part of this PR to the current Microsoft provider path and landed it on main in 7a85c1a822.

The shipped fix is in extensions/microsoft/tts.ts now:

  • rejects blank/whitespace text before constructing Edge TTS
  • retries once when Edge TTS resolves with a zero-byte or missing output file
  • keeps provider-thrown errors single-attempt so real upstream failures are not hidden
  • adds regression coverage in extensions/microsoft/tts.test.ts

Validation: pnpm check:changed plus OpenAI live TTS smoke for the same TTS surface. Closing this stale branch since the fix is now on main.

@steipete steipete closed this Apr 26, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants