
feat(tts): add provider personas#70748

Merged
obviyus merged 6 commits into openclaw:main from barronlroth:feat/tts-personas-speech-core
Apr 26, 2026

Conversation

@barronlroth
Contributor

Summary

AI-assisted: Yes.
Testing degree: targeted local validation on the rebased head; broader full-gate runs existed earlier in the branch, but I did not rerun the entire pnpm build && pnpm check && pnpm test sequence after the final rebase before opening this draft.
codex review --base upstream/main is installed locally but could not run in this environment because ~/.codex/sessions is permission-blocked.

  • Problem: OpenClaw had no deterministic, reusable way to describe a TTS persona once and map it across different speech providers.
  • Why it matters: multi-provider TTS setups could not reliably preserve provider-specific voice guidance, especially for Gemini-style speech prompting, and there was no first-class persona selection surface in config, CLI, or gateway APIs.
  • What changed: added TTS persona config + runtime resolution, provider-specific synthesis preparation hooks, Gemini/OpenAI provider mappings, CLI/chat/gateway persona controls, status/docs updates, and regression coverage for the new seams.
  • What did NOT change (scope boundary): this PR does not add model-generated intra-line audio tags as a deterministic OpenClaw feature, and it does not introduce provider-specific hardcoded narration beyond persona prompt/instruction shaping.

Change Type (select all)

  • Bug fix
  • Feature
  • Refactor required for the fix
  • Docs
  • Security hardening
  • Chore/infra

Scope (select all touched areas)

  • Gateway / orchestration
  • Skills / tool execution
  • Auth / tokens
  • Memory / storage
  • Integrations
  • API / contracts
  • UI / DX
  • CI/CD / infra

Linked Issue/PR

Root Cause (if applicable)

  • Root cause: N/A
  • Missing detection / guardrail: N/A
  • Contributing context (if known): N/A

Regression Test Plan (if applicable)

N/A for the feature framing, but the PR adds targeted seam/integration coverage for persona resolution and provider-specific synthesis preparation.

  • Coverage level that should have caught this:
    • Unit test
    • Seam / integration test
    • End-to-end test
    • Existing coverage already sufficient
  • Target test or file: extensions/speech-core/src/tts.test.ts, extensions/google/speech-provider.test.ts, extensions/openai/speech-provider.test.ts, src/auto-reply/reply/commands-tts.test.ts, src/gateway/server-methods/tts.test.ts, src/cli/capability-cli.test.ts, src/config/zod-schema.tts.test.ts, test/helpers/plugins/tts-contract-suites.ts
  • Scenario the test should lock in: deterministic persona selection, provider fallback behavior, provider-specific synthesis prompt shaping, and persona control surfaces.
  • Why this is the smallest reliable guardrail: the behavior spans config schema, speech-core runtime resolution, provider hooks, and external control surfaces.
  • Existing test that already covers this (if any): N/A
  • If no new test is added, why not: N/A

User-visible / Behavior Changes

  • New messages.tts.persona and messages.tts.personas config surfaces.
  • New /tts persona [id|off] chat command.
  • New CLI support for listing/selecting personas.
  • New gateway methods for listing personas and setting the active persona.
  • Gemini and OpenAI speech providers now consume resolved persona guidance deterministically through provider-specific synthesis preparation.
  • TTS status output now includes the active persona.

Diagram (if applicable)

Before:
[user/provider config] -> [raw TTS prompt only] -> [provider-specific behavior ad hoc]

After:
[persona id + provider bindings]
  -> [speech-core persona resolution]
  -> [provider prepareSynthesis hook]
  -> [provider-specific prompt/instruction shaping]
  -> [deterministic synthesized output path]
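
The "After" flow can be sketched in a few lines. Only the hook name `prepareSynthesis` comes from this PR; the interfaces, the `prepare` function, and the example provider below are assumptions chosen to show the division of labor, not the actual speech-core signatures.

```typescript
// Sketch only: the hook name prepareSynthesis comes from the PR; every type
// and signature below is an assumption made for illustration.
interface ResolvedPersona {
  id: string;
  prompt?: { style?: string; pacing?: string };
}

interface SynthesisRequest {
  text: string;
  instructions?: string;
}

interface SpeechProvider {
  id: string;
  // Optional hook: lets a provider reshape the request before synthesis.
  prepareSynthesis?(req: SynthesisRequest, persona?: ResolvedPersona): SynthesisRequest;
}

// Core stays generic: it passes the resolved persona and delegates shaping.
function prepare(provider: SpeechProvider, text: string, persona?: ResolvedPersona): SynthesisRequest {
  const base: SynthesisRequest = { text };
  return provider.prepareSynthesis ? provider.prepareSynthesis(base, persona) : base;
}

// Hypothetical provider that maps persona style into an instructions field.
const exampleProvider: SpeechProvider = {
  id: "example",
  prepareSynthesis(req, persona) {
    const style = persona?.prompt?.style;
    return style ? { ...req, instructions: `Speak in a ${style} style.` } : req;
  },
};
```

Providers without the hook receive the request unchanged, so the same persona and text always take the same path for a given provider.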

Security Impact (required)

  • New permissions/capabilities? (Yes/No): Yes
  • Secrets/tokens handling changed? (Yes/No): No
  • New/changed network calls? (Yes/No): No
  • Command/tool execution surface changed? (Yes/No): No
  • Data access scope changed? (Yes/No): No
  • If any Yes, explain risk + mitigation: this adds persona selection/read APIs and command surfaces only within the existing TTS feature area. Persona ids are schema-validated, provider mapping stays inside the provider/runtime seam, and there are no new secret or network paths.

Repro + Verification

Environment

  • OS: macOS
  • Runtime/container: Node 22 / pnpm workspace
  • Model/provider: Google Gemini speech, OpenAI speech, generic speech-core persona resolution
  • Integration/channel (if any): gateway + CLI + auto-reply TTS surfaces
  • Relevant config (redacted): messages.tts.personas, messages.tts.persona, provider TTS config

Steps

  1. Configure one or more TTS personas with provider-specific bindings.
  2. Select a persona through config, /tts persona, CLI, or gateway method.
  3. Synthesize speech through the configured provider.
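
As an illustration of step 2, a parser for the chat command might look like the sketch below. The syntax `/tts persona [id|off]` comes from this PR; the result type, the `parsePersonaCommand` name, and the choice to treat a bare `/tts persona` as "show current" are assumptions.

```typescript
// Sketch only: the command syntax "/tts persona [id|off]" comes from the PR;
// the parser and its result type are assumptions.
type PersonaCommand =
  | { kind: "show" } // "/tts persona" with no argument (assumed behavior)
  | { kind: "off" } // "/tts persona off"
  | { kind: "set"; id: string } // "/tts persona <id>" with a known id
  | { kind: "unknown"; id: string }; // "/tts persona <id>" with an unknown id

function parsePersonaCommand(input: string, knownIds: string[]): PersonaCommand | undefined {
  const match = /^\/tts\s+persona(?:\s+(\S+))?\s*$/.exec(input.trim());
  if (!match) return undefined; // not a persona command
  const arg = match[1];
  if (arg === undefined) return { kind: "show" };
  if (arg === "off") return { kind: "off" };
  return knownIds.indexOf(arg) >= 0 ? { kind: "set", id: arg } : { kind: "unknown", id: arg };
}
```

Returning a distinct `unknown` result, rather than silently ignoring the id, keeps unknown-persona handling explicit at the call site.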

Expected

  • Persona resolution is deterministic.
  • Provider fallback is explicit.
  • Gemini/OpenAI receive the correct provider-specific synthesis guidance.
  • Status and control surfaces report the active persona.

Actual

  • Targeted automated coverage passed for the added config/runtime/provider/control-surface paths.
  • I did not rerun a full live provider/manual synthesis pass after the final rebase.

Evidence

Attach at least one:

  • Failing test/log before + passing after
  • Trace/log snippets
  • Screenshot/recording
  • Perf numbers (if relevant)

Validation run on this branch included:

  • pnpm plugin-sdk:api:check
  • pnpm config:docs:check
  • pnpm tsgo
  • pnpm tsgo:test
  • pnpm check
  • targeted TTS/config/provider/gateway/CLI tests

Local prep also surfaced an unrelated gateway socket-teardown regression outside this PR; the follow-up for it was validated locally but intentionally left uncommitted.

Human Verification (required)

What you personally verified (not just CI), and how:

  • Verified scenarios: persona config schema wiring, persona resolution/fallback behavior, Gemini prompt shaping, OpenAI instruction mapping fallback, CLI/chat/gateway persona control surfaces, status output.
  • Edge cases checked: unknown persona handling, provider fallback behavior, provider-specific config overlays, active persona off/reset flow.
  • What you did not verify: a fresh full-suite rerun and a final live manual speech turn on the rebased head.

Review Conversations

  • I replied to or resolved every bot review conversation I addressed in this PR.
  • I left unresolved only the conversations that still need reviewer or maintainer judgment.

If a bot review conversation is addressed by this PR, resolve that conversation yourself. Do not leave bot review conversation cleanup for maintainers.

Compatibility / Migration

  • Backward compatible? (Yes/No): Yes
  • Config/env changes? (Yes/No): Yes
  • Migration needed? (Yes/No): No
  • If yes, exact upgrade steps: optional config only; existing TTS setups continue without defining personas.

Risks and Mitigations

  • Risk: provider-specific persona shaping could drift across speech providers over time.
    • Mitigation: provider-specific prepareSynthesis hooks plus targeted provider contract tests.
  • Risk: persona selection surfaces could diverge across config, CLI, gateway, and chat commands.
    • Mitigation: shared speech-core resolution path and targeted gateway/CLI/command tests.
  • Risk: SDK/runtime seam expansion could drift from docs or generated baselines.
    • Mitigation: updated docs plus regenerated config/plugin-sdk baselines checked in with the feature.

@openclaw-barnacle openclaw-barnacle Bot added the docs, gateway, cli, extensions: openai, and size: XL labels Apr 23, 2026
@barronlroth barronlroth force-pushed the feat/tts-personas-speech-core branch 2 times, most recently from b7c4525 to 17a5dce Compare April 23, 2026 20:17
@barronlroth barronlroth marked this pull request as ready for review April 23, 2026 20:20
@greptile-apps
Contributor

greptile-apps Bot commented Apr 23, 2026

Greptile Summary

This PR adds first-class TTS persona support to OpenClaw: persona config schema, runtime resolution, provider-specific synthesis preparation hooks (Gemini audio-profile-v1 wrapping, OpenAI instructions injection), CLI/chat command/gateway control surfaces, and status reporting. The overall design is sound — persona resolution is deterministic, the prepareSynthesis hook correctly threads through both regular and telephony synthesis paths, and config schemas use .strict() to prevent silent drift.

One minor telemetry accuracy issue: in the synthesis loops, skipped-provider attempt records hardcode personaBinding: persona ? "missing" : "none" regardless of why the provider was skipped (e.g., no_provider_registered or unsupported_for_telephony). A persona could have a binding for that provider yet the attempt would still be logged as missing, producing misleading /tts status output.

Note also that Google TTS persona prompt fields (persona.prompt.*) are only embedded in the request when the Google provider config explicitly opts in via promptTemplate: "audio-profile-v1", while OpenAI applies persona instructions automatically. This intentional asymmetry is worth documenting clearly for operators setting up multi-provider personas.
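
The asymmetry can be made concrete with a small sketch. Only the opt-in versus automatic behavior is taken from the review; the function names, the config shapes, and the rendered guidance format are hypothetical and do not reflect the real `audio-profile-v1` template.

```typescript
// Sketch only: function names, config shapes, and the rendered prompt text are
// assumptions; only the opt-in vs automatic behavior comes from the review.
interface PersonaPrompt {
  style?: string;
  accent?: string;
}

// Flatten the persona prompt fields into one guidance string.
function renderGuidance(prompt: PersonaPrompt): string {
  const parts: string[] = [];
  if (prompt.style) parts.push(`style: ${prompt.style}`);
  if (prompt.accent) parts.push(`accent: ${prompt.accent}`);
  return parts.join("; ");
}

// Google-style: persona prompt fields are embedded only on explicit opt-in.
function prepareGoogle(text: string, prompt: PersonaPrompt, cfg: { promptTemplate?: string }): string {
  if (cfg.promptTemplate !== "audio-profile-v1") return text;
  return `[${renderGuidance(prompt)}]\n${text}`;
}

// OpenAI-style: persona prompt fields map into instructions automatically.
function prepareOpenAi(text: string, prompt: PersonaPrompt): { input: string; instructions?: string } {
  const guidance = renderGuidance(prompt);
  return guidance ? { input: text, instructions: guidance } : { input: text };
}
```

With the same persona, `prepareGoogle` leaves the text untouched unless the provider config opts in, whereas `prepareOpenAi` always carries the guidance. An operator who configures a persona for both providers would hear the difference only on one of them unless the opt-in is set.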

Confidence Score: 5/5

Safe to merge; all remaining findings are P2 style/telemetry concerns that do not affect synthesis correctness or security.

No P0 or P1 findings. The one substantive issue (incorrect personaBinding label in skipped-attempt telemetry) is a cosmetic inaccuracy in status output and does not affect actual TTS behavior, provider selection, or data integrity. The persona resolution logic, fallback policy enforcement, provider hooks, and all control surfaces are correctly wired.

extensions/speech-core/src/tts.ts — the skipped-provider personaBinding labeling in both synthesis loops.

Prompt To Fix All With AI
This is a comment left during a code review.
Path: extensions/speech-core/src/tts.ts
Line: 1037-1044

Comment:
**`personaBinding` incorrectly hardcoded to `"missing"` on skip**

When a provider is skipped due to `no_provider_registered` or `unsupported_for_telephony`, the code sets `personaBinding: persona ? "missing" : "none"` unconditionally. But a persona could have a provider-specific binding configured for that provider even when it was skipped for an unrelated reason. The same issue appears in the telephony loop. This produces misleading telemetry — users inspecting `/tts status` attempt details would see `persona=...:missing` even when the persona binding was not the skip cause.

The real binding state is only knowable after calling `mergeProviderConfigWithPersona`, which doesn't run on the skip path. The simplest fix is to omit `personaBinding` entirely from skipped-provider attempt records, or to explicitly note the binding as `"unknown"`.

```suggestion
      attempts.push({
        provider,
        outcome: "skipped",
        reasonCode: resolvedProvider.reasonCode,
        persona: persona?.id,
        personaBinding: persona ? "unknown" : "none",
        error: resolvedProvider.message,
      });
```

How can I resolve this? If you propose a fix, please make it concise.

Reviews (1): Last reviewed commit: "TTS: add provider personas" | Re-trigger Greptile

Comment thread extensions/speech-core/src/tts.ts
@barronlroth barronlroth force-pushed the feat/tts-personas-speech-core branch from 17a5dce to d568af4 Compare April 23, 2026 20:30

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: d568af45e5

Comment thread extensions/speech-core/src/tts.ts
Comment thread src/tts/status-config.ts
@barronlroth barronlroth force-pushed the feat/tts-personas-speech-core branch 2 times, most recently from db06b68 to 6b52167 Compare April 23, 2026 21:04
Contributor

@obviyus obviyus left a comment

Thanks for the work here. Can you please make these changes before we land it?

  • Drop the docs/.generated/*.sha256 changes from this PR.
  • Rebase on latest main to clear the dirty merge state.
  • Clarify fallbackPolicy: "fail" semantics, ideally with a small test: should it fail the whole persona request, or only skip providers without persona bindings?

@steipete
Contributor

Maintainer deep review against current main: this is not closeable as done on main, but it should not be landed in the current shape.

Current state:

  • main does not have first-class TTS personas (messages.tts.personas, /tts persona, tts.setPersona, provider prepareSynthesis, etc.), so #68323 (feat(tts): add provider-agnostic TTS personas in speech-core) remains a real feature request.
  • This PR is CONFLICTING and already has CHANGES_REQUESTED from @obviyus.
  • The touched surface is large: config schema/docs/generated API baselines, Plugin SDK speech/TTS seams, speech-core runtime, Google/OpenAI/XAI providers, chat commands, CLI, gateway methods/scopes, status output, and contract suites.

Blockers before this can be reconsidered:

  • Rebase on latest main; do not ask reviewers to reason over a dirty merge state.
  • Drop docs/.generated/*.sha256 churn unless it is regenerated by the documented repo command and remains required after the rebase. Baseline changes must be source-driven, not hand-carried.
  • Clarify and test fallbackPolicy: "fail": does it fail the whole persona synthesis when the selected provider lacks a persona binding, or only skip that provider and continue? The current semantics need to be explicit in code, docs, and status/attempt reporting.
  • Fix attempt/status accuracy for skipped providers: personaBinding: "missing" must not be reported for providers skipped because they are unregistered, unsupported for telephony, etc. That would make /tts status misleading.
  • Keep provider behavior consistent or document the intentional asymmetry: Google only applies prompt fields with promptTemplate: "audio-profile-v1", while OpenAI appears to auto-map persona prompt fields into instructions.
  • Add a fresh live/manual verification note after rebase for at least one .profile-backed provider path. For this feature, a provider request-shaping unit test is useful but not enough to prove the user-visible TTS behavior still works.

This is a substantial API/product feature, not a narrow repair. Once rebased and narrowed around the resolved semantics, it can be reviewed as the canonical #68323 implementation.

@barronlroth
Contributor Author

@steipete @obviyus

Appreciate the reviews. I'll make those changes now.

Contributor Author

Fresh rebase/update pushed in 50b713ca3.

Addressed the requested follow-ups:

  • Rebased on latest main available at push time and cleared the dirty merge state; GitHub now reports the PR as mergeable.
  • Regenerated docs/.generated/config-baseline.sha256 and docs/.generated/plugin-sdk-api-baseline.sha256 via the documented generators; the matching check lanes pass, so the remaining SHA changes are source-driven.
  • Clarified fallbackPolicy: "fail": it skips the unbound provider attempt with reasonCode: "not_configured" and personaBinding: "missing", still tries fallback providers, and only fails the whole request if every attempt fails/skips.
  • Documented the Google/OpenAI persona prompt asymmetry in docs/tools/tts.md.
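
The clarified fallbackPolicy: "fail" semantics can be sketched as a fallback loop. The `reasonCode: "not_configured"` and `personaBinding: "missing"` values are quoted from the comment above; the types, the `"applied"` label, and the `synthesizeWithFallback` name are assumptions, and the real loop in extensions/speech-core/src/tts.ts may differ.

```typescript
// Sketch only: reasonCode "not_configured" and personaBinding "missing" are
// quoted from the comment above; all types and names here are assumptions.
interface Attempt {
  provider: string;
  outcome: "success" | "skipped" | "failed";
  reasonCode?: string;
  personaBinding?: "applied" | "missing" | "none"; // "applied" is a made-up label
}

interface Persona {
  id: string;
  fallbackPolicy?: string;
  providers: Record<string, object>; // provider id -> binding
}

type Synthesize = (provider: string) => boolean; // true when synthesis succeeds

function synthesizeWithFallback(order: string[], persona: Persona | undefined, synth: Synthesize): Attempt[] {
  const attempts: Attempt[] = [];
  for (const provider of order) {
    const bound = persona ? provider in persona.providers : false;
    if (persona?.fallbackPolicy === "fail" && !bound) {
      // Skip only this provider; later providers in the chain are still tried.
      attempts.push({ provider, outcome: "skipped", reasonCode: "not_configured", personaBinding: "missing" });
      continue;
    }
    const personaBinding = persona ? (bound ? "applied" : "missing") : "none";
    const ok = synth(provider);
    attempts.push({ provider, outcome: ok ? "success" : "failed", personaBinding });
    if (ok) return attempts;
  }
  // Every attempt failed or was skipped, so the whole request fails.
  throw new Error(`TTS failed after ${attempts.length} attempt(s)`);
}
```

Under this reading, "fail" is scoped to the individual provider attempt: the request as a whole fails only when the chain is exhausted.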

Validation after the final rebase:

  • pnpm plugin-sdk:api:check
  • pnpm config:schema:check
  • pnpm config:docs:check
  • targeted oxfmt on directly edited TS files
  • OPENCLAW_VITEST_MAX_WORKERS=1 pnpm test extensions/speech-core/src/tts.test.ts extensions/google/speech-provider.test.ts extensions/openai/speech-provider.test.ts src/auto-reply/reply/commands-tts.test.ts src/cli/capability-cli.test.ts src/config/zod-schema.tts.test.ts src/gateway/server-methods/tts.test.ts src/tts/status-config.test.ts
  • live/manual .profile-backed provider check: source ~/.profile provided XAI_API_KEY, then OPENCLAW_LIVE_TEST=1 OPENCLAW_VITEST_MAX_WORKERS=1 pnpm test:live extensions/xai/xai.live.test.ts -t "synthesizes TTS through the registered speech provider" passed (1 passed, 4 skipped).

Contributor

@steipete steipete left a comment

Deep reviewed the refreshed head 50b713ca3. This is closer, but I still would not land it yet.

Blocking findings:

  1. Rebase is needed again. Current main has moved since the PR head; local merge proof shows conflicts in docs/.generated/config-baseline.sha256 and docs/.generated/plugin-sdk-api-baseline.sha256 (git merge-tree $(git merge-base main refs/tmp/pr-70748) main refs/tmp/pr-70748). GitHub still reports mergeable: UNKNOWN and reviewDecision: CHANGES_REQUESTED. Please rebase on current main and regenerate/check the baselines after the rebase.

  2. persona.rewrite is a public no-op config surface. The PR adds and exports TtsPersonaRewriteConfig in src/config/types.tts.ts, accepts it in TtsPersonaRewriteSchema / TtsPersonaSchema, exports it through the plugin SDK, and even accepts it in src/config/zod-schema.tts.test.ts, but runtime grep shows no implementation path consuming it (git grep -n "TtsPersonaRewrite\|rewrite:" refs/tmp/pr-70748 -- src extensions docs/tools/tts.md). This means users can set messages.tts.personas.<id>.rewrite.enabled=true and it silently does nothing. Please either implement the rewrite behavior with tests/docs, or remove the rewrite schema/types/export until it exists.
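
The "remove it until it exists" option relies on strict-object validation, which the Greptile summary notes the config schemas already use via `.strict()`. The sketch below is a dependency-free stand-in for that behavior, not the project's zod schema; the allowed-key list and the `unknownPersonaKeys` helper are assumptions based on the fields named in this thread.

```typescript
// Sketch only: a dependency-free stand-in for a strict object schema. The PR
// itself uses zod; the allowed-key list follows the fields named in this thread.
const allowedPersonaKeys = ["label", "description", "provider", "fallbackPolicy", "prompt", "providers"];

// Return the unrecognized keys; an empty result means the persona object passes.
function unknownPersonaKeys(persona: Record<string, unknown>): string[] {
  return Object.keys(persona).filter((key) => allowedPersonaKeys.indexOf(key) < 0);
}
```

Once `rewrite` is no longer an allowed key, a config that sets it is reported at validation time instead of being accepted and ignored.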

Non-blocking note:

  • Google TTS model naming is still awkward. Official Google Gemini TTS docs currently list Gemini 2.5 Flash/Pro Preview TTS as supported; the PR docs/example and catalog include gemini-3.1-flash-tts-preview. My .profile live Google TTS smoke on current main passed, so I am not treating this as a blocker, but the PR should avoid presenting an undocumented model as the recommended example unless we have an explicit source or live proof attached.

Validation I ran locally while reviewing:

  • pnpm docs:list
  • OPENCLAW_LIVE_TEST=1 OPENCLAW_VITEST_MAX_WORKERS=1 pnpm test:live extensions/google/google.live.test.ts -t "synthesizes speech through the registered provider" -> passed
  • local merge-tree conflict check against current main

@barronlroth
Contributor Author

Acknowledged - on it!

@barronlroth barronlroth force-pushed the feat/tts-personas-speech-core branch from 50b713c to 4a0a473 Compare April 25, 2026 18:52
@barronlroth
Contributor Author

@steipete addressed your latest review on refreshed head 4a0a473f48.

  1. Rebase / generated baseline conflicts
    • Rebased again on current origin/main (9bd348fdec).
    • Verified the branch now merges cleanly with git merge-tree --write-tree origin/main HEAD.
    • Regenerated and checked the generated baselines after the rebase:
      • pnpm config:schema:check
      • pnpm config:docs:check
      • pnpm plugin-sdk:api:check
  2. persona.rewrite public no-op surface
    • Removed the unused persona.rewrite config surface instead of implementing new rewrite behavior in this PR.
    • Removed TtsPersonaRewriteConfig, the zod schema field, and the plugin SDK export.
    • Added a schema regression test that rejects messages.tts.personas.<id>.rewrite until real runtime behavior exists.
  3. Google TTS model naming note

Validation after the final rebase:

  • pnpm config:schema:check
  • pnpm config:docs:check
  • pnpm plugin-sdk:api:check
  • pnpm exec oxfmt --check --threads=1 src/config/types.tts.ts src/config/zod-schema.core.ts src/config/zod-schema.tts.test.ts src/plugin-sdk/config-runtime.ts src/tts/status-config.test.ts
  • OPENCLAW_VITEST_MAX_WORKERS=1 pnpm test src/config/zod-schema.tts.test.ts src/tts/status-config.test.ts src/gateway/server-methods/tts.test.ts src/cli/capability-cli.test.ts src/auto-reply/reply/commands-tts.test.ts -- --reporter=verbose --testTimeout=5000 --hookTimeout=5000
  • OPENCLAW_VITEST_MAX_WORKERS=1 pnpm test extensions/speech-core/src/tts.test.ts extensions/google/speech-provider.test.ts extensions/openai/speech-provider.test.ts -- --reporter=verbose --testTimeout=5000 --hookTimeout=5000

One unrelated local gate note: pnpm check:changed still fails in unchanged src/cli/plugins-cli-test-helpers.ts with TS2322/TS2345, so I did not modify it in this PR.

@barronlroth barronlroth force-pushed the feat/tts-personas-speech-core branch from 4a0a473 to 29bbb70 Compare April 25, 2026 19:47
@barronlroth
Contributor Author

Follow-up CI fix pushed in 29bbb70044.

Addressed the failing checks from the prior head:

  • check-lint: fixed the oxlint complaint in src/config/zod-schema.tts.test.ts by using a normal rewrite property in the negative schema test.
  • checks-node-agentic-plugin-sdk: relaxed the brittle timing assertion in src/plugin-sdk/channel-entry-contract.test.ts; the test still verifies the built-artifact sidecar/profile path and guards against invalid negative timings, but no longer requires exact 0.0ms timing text.
  • checks-node-agentic-plugins: latest main already included the node:child_process partial mock fix for src/plugins/bundled-runtime-deps.test.ts; I preserved that while rebasing.

Rebased again on latest main; git merge-tree --write-tree origin/main HEAD passes locally.

Validation after the rebase:

  • pnpm config:schema:check
  • pnpm config:docs:check
  • pnpm plugin-sdk:api:check
  • pnpm lint:core --threads=1
  • OPENCLAW_VITEST_MAX_WORKERS=1 pnpm test src/config/zod-schema.tts.test.ts src/plugin-sdk/channel-entry-contract.test.ts src/plugins/bundled-runtime-deps.test.ts -- --reporter=verbose --testTimeout=10000 --hookTimeout=10000

@barronlroth barronlroth force-pushed the feat/tts-personas-speech-core branch from 29bbb70 to 218bc56 Compare April 26, 2026 01:39
@openclaw-barnacle openclaw-barnacle Bot added the plugin: azure-speech label Apr 26, 2026
@steipete
Contributor

Maintainer re-review after the per-agent TTS work landed on main.

This PR is still solving a real remaining feature: provider personas are not covered by agents.list[].tts. Current main has per-agent provider/voice/model/style overrides, but it still does not have messages.tts.persona, messages.tts.personas.*, persona rewrite semantics, or provider persona mapping.

This is not ready to land yet:

  • The branch is behind current main and overlaps with the newly landed per-agent TTS config/runtime/docs work (0ca952cdd5, 9b4f0779ce, 69e7e499b1). Please rebase and make sure persona resolution composes with agents.list[].tts instead of duplicating its merge logic.
  • Current checks show failures in checks-node-core-fast-support, checks-node-agentic-commands, checks-node-core, and the parity gate. Those need fresh logs after rebase and fixes before this can be considered.
  • The public contract needs to be crisp: precedence should be documented and tested across global messages.tts, active agents.list[].tts, local /tts prefs, active persona selection, and inline [[tts:...]] directives.
  • Keep provider-specific behavior behind provider hooks. The core persona config can be generic, but Gemini/OpenAI-specific prompt/instruction shaping should stay in those provider plugins and contract tests.
  • Please add/refresh focused tests for the composed case: global persona + per-agent TTS override, per-agent selected persona if supported, /tts persona status output, and fallback provider behavior.
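
The precedence requirement can be illustrated with a generic layered merge. The thread asks for the order to be documented and tested but does not state it, so the sketch takes the order as caller-supplied input; the `TtsSettings` fields, the layer names, and `resolveLayers` are all hypothetical.

```typescript
// Sketch only: layer names and their order are illustrative assumptions; the
// thread asks for precedence to be documented but does not state it.
interface TtsSettings {
  provider?: string;
  voice?: string;
  persona?: string;
}

interface Layer {
  source: string;
  settings: TtsSettings;
}

// Merge layers from lowest to highest precedence and record which layer
// supplied each effective value, so status output can explain the result.
function resolveLayers(layers: Layer[]): { effective: TtsSettings; origin: Record<string, string> } {
  const effective: TtsSettings = {};
  const origin: Record<string, string> = {};
  for (const layer of layers) {
    for (const key of Object.keys(layer.settings) as (keyof TtsSettings)[]) {
      const value = layer.settings[key];
      if (value !== undefined) {
        effective[key] = value;
        origin[key] = layer.source;
      }
    }
  }
  return { effective, origin };
}
```

Routing every surface through one such function is what lets a composed case, for example a global persona plus a per-agent provider override, be covered by a single focused test.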

Best path to land:

  1. Rebase on current main.
  2. Reduce/confirm the write surface after the rebase so this PR only owns the persona contract and provider mappings, not already-landed per-agent TTS plumbing.
  3. Fix the failing CI lanes.
  4. Run at least the focused config/runtime/provider/command/gateway tests plus pnpm check:changed.
  5. If you have provider keys, live-test one Gemini/OpenAI persona synthesis path; otherwise call out exactly which key is missing.

Keeping this open because #68323 remains a valid gap, but it needs the above cleanup before merge.

@obviyus obviyus force-pushed the feat/tts-personas-speech-core branch 2 times, most recently from c0e8037 to e932fc5 Compare April 26, 2026 03:46
@obviyus
Contributor

obviyus commented Apr 26, 2026

Maintainer patch pushed in e932fc55ec.

Addressed the re-review items: rebased on current main, kept persona resolution composed with agents.list[].tts, kept provider-specific shaping in provider plugins, and added focused composed persona/status tests.

Local proof:

  • pnpm test extensions/speech-core/src/tts.test.ts src/tts/status-config.test.ts src/auto-reply/reply/commands-tts.test.ts
  • pnpm config:schema:check && pnpm config:docs:check && pnpm plugin-sdk:api:check
  • pnpm build

pnpm check:changed currently stops on an unrelated current-main typecheck error in src/plugins/bundle-commands.test.ts (TS1117, duplicate object key); this PR does not touch that file.

@obviyus obviyus force-pushed the feat/tts-personas-speech-core branch from e932fc5 to ac8aa08 Compare April 26, 2026 03:49
@obviyus obviyus self-assigned this Apr 26, 2026
@obviyus obviyus force-pushed the feat/tts-personas-speech-core branch from ac8aa08 to ab2d31c Compare April 26, 2026 04:09
Contributor

@obviyus obviyus left a comment

Verified provider personas compose with per-agent TTS config instead of duplicating the agent merge path, including /tts persona status and composed provider override coverage.

Maintainer follow-up: rebased onto current main, removed the generated sha256 baseline churn, and moved the changelog entry into the active release block.

Local gate: pnpm config:schema:check, focused TTS command/status/speech-core tests, and pnpm build.

@obviyus obviyus merged commit bcc9fc4 into openclaw:main Apr 26, 2026
60 of 62 checks passed
@obviyus
Contributor

obviyus commented Apr 26, 2026

Landed on main.

Thanks @barronlroth.

vincentkoc added a commit that referenced this pull request Apr 26, 2026
The TTS doc had grown to 1008 lines with 11 separate flat 'X primary'
config blocks, a 100-line dense 'Notes on fields' bullet list, and
the new provider-personas feature (#70748) buried near the bottom.
Restructure for readability and feature visibility:

- Lead with a Steps-based 'Quick start' so first-time readers can
  enable TTS in 4 explicit steps.
- Replace the 13-bullet provider list with a single 'Supported
  providers' table that names auth env vars and per-provider notes
  inline. Add a Warning callout for the Microsoft/edge legacy alias.
- Collapse the 11 'X primary' config blocks into one Tabs component
  ('OpenAI + ElevenLabs', 'Google Gemini', 'Azure Speech',
  'Microsoft (no key)', 'MiniMax', 'Inworld', 'xAI', 'Volcengine',
  'Xiaomi MiMo', 'OpenRouter', 'Gradium', 'Local CLI') so users see
  one preset at a time and the page is scannable.
- Promote 'Personas' to its own top-level section with two examples
  (minimal and the Alfred provider-neutral persona), and add a new
  'How providers use persona prompts' AccordionGroup covering Google
  (promptTemplate audio-profile-v1, personaPrompt), OpenAI
  (instructions auto-mapping), and Other providers, plus a fallback
  policy table.
- Note that agents.list[].tts.persona overrides global persona
  per-agent (covers the recent feat(tts) per-agent voice-override
  work).
- Convert the 100-line 'Notes on fields' wall into a per-provider
  AccordionGroup using ParamField, so the field reference is
  scannable and field types/defaults are visually distinct.
- Sentence-case headings, drop redundant body H1, fold the flow
  diagram inline with Auto-TTS behavior, and refresh the Output
  formats section to a table-first layout.
- Schema fields (label/description/provider/fallbackPolicy/prompt
  with profile/scene/sampleContext/style/accent/pacing/constraints
  and providers map) verified against src/config/types.tts.ts; all
  defaults and env-var fallbacks preserved verbatim.

Net diff: 585 insertions, 684 deletions across the same surface
area.

Labels

cli, docs, extensions: openai, extensions: tts-local-cli, gateway, plugin: azure-speech, size: XL

Projects

None yet

Development

Successfully merging this pull request may close these issues.

feat(tts): add provider-agnostic TTS personas in speech-core

3 participants