feat: server-backed Edge TTS for voice output #19025

@dzianisv

Feature Request

Problem

The current voice output relies on the browser's built-in speechSynthesis API, which has limited voice quality and behaves inconsistently across browsers. Some browsers (especially on Linux) have poor TTS support or none at all.

Proposed Solution

Add server-backed Edge TTS synthesis using Microsoft's Edge TTS engine via node-edge-tts. This provides high-quality, natural-sounding voices that work consistently across all browsers.

Implementation overview:

  • Server route POST /tts/edge — accepts { text, voice?, rate?, pitch?, volume? }, returns audio/mpeg
  • Config — optional voice.edge section with enabled, voice, rate, pitch, volume fields; works out of the box with sensible defaults (no config required)
  • App fallback — the speak() function in message-timeline.tsx tries the server endpoint first, falls back to browser speechSynthesis on failure
  • Type declarations — ambient types for node-edge-tts module
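The server-first fallback described above could look roughly like the sketch below. It is not the code from the reference implementation: the `SpeakDeps` seams, the `SpeakResult` return value, and the assumption that `rate`, `pitch`, and `volume` are strings are all hypothetical, chosen so the logic can run outside a browser.

```typescript
// Body accepted by the proposed POST /tts/edge route.
// Assumption: rate, pitch and volume are strings; the issue does not give their types.
interface EdgeTtsRequest {
  text: string
  voice?: string
  rate?: string
  pitch?: string
  volume?: string
}

// Hypothetical seams (not from the issue) so the flow is testable without a DOM.
interface SpeakDeps {
  // Calls the server route and resolves with the audio/mpeg bytes; rejects on failure.
  fetchAudio: (req: EdgeTtsRequest) => Promise<ArrayBuffer>
  // Plays the returned audio; rejects if playback fails.
  playAudio: (audio: ArrayBuffer) => Promise<void>
  // Falls back to the browser's speechSynthesis.
  speakWithBrowser: (text: string) => void
}

type SpeakResult = "server" | "browser"

// Try the server endpoint first; on any failure use the browser engine instead.
async function speak(req: EdgeTtsRequest, deps: SpeakDeps): Promise<SpeakResult> {
  try {
    const audio = await deps.fetchAudio(req)
    await deps.playAudio(audio)
    return "server"
  } catch {
    deps.speakWithBrowser(req.text)
    return "browser"
  }
}
```

In the app, `fetchAudio` would presumably wrap a `fetch` call to `POST /tts/edge` that rejects on a non-2xx response, and `speakWithBrowser` would wrap `speechSynthesis.speak`. Catching failures from both the request and playback means a server outage or an undecodable response still produces speech.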

Files involved

  • packages/opencode/src/tts/edge.ts — synthesis wrapper
  • packages/opencode/src/server/routes/tts.ts — Hono route handler
  • packages/opencode/src/config/config.ts — voice.edge config schema
  • packages/app/src/pages/session/message-timeline.tsx — server-first speak()
  • packages/opencode/test/tts/ — unit and route tests
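As an illustration of how the optional voice.edge section can work without any configuration, here is a minimal sketch of default resolution. The field types, the `resolveEdgeConfig` helper, and every default value are assumptions made for this example; the issue only names the fields and does not state the actual defaults or schema.

```typescript
// Shape of the optional voice.edge section; field names come from the issue,
// the types are assumed.
interface EdgeVoiceConfig {
  enabled?: boolean
  voice?: string
  rate?: string
  pitch?: string
  volume?: string
}

// Assumed defaults, for illustration only.
const EDGE_DEFAULTS: Required<EdgeVoiceConfig> = {
  enabled: true,
  voice: "en-US-AriaNeural",
  rate: "default",
  pitch: "default",
  volume: "default",
}

// Hypothetical helper: fill every missing field from the defaults so callers
// always receive a complete config, even when voice.edge is absent.
function resolveEdgeConfig(user?: EdgeVoiceConfig): Required<EdgeVoiceConfig> {
  return {
    enabled: user?.enabled ?? EDGE_DEFAULTS.enabled,
    voice: user?.voice ?? EDGE_DEFAULTS.voice,
    rate: user?.rate ?? EDGE_DEFAULTS.rate,
    pitch: user?.pitch ?? EDGE_DEFAULTS.pitch,
    volume: user?.volume ?? EDGE_DEFAULTS.volume,
  }
}

// Usage: no config at all, then a partial override.
const defaults = resolveEdgeConfig()
const custom = resolveEdgeConfig({ voice: "en-GB-SoniaNeural", enabled: false })
```

Using `??` rather than `||` keeps an explicit `enabled: false` from being replaced by the default.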

Reference Implementation

A working implementation is available at https://github.com/dzianisv/opencode (PR will be linked).

Metadata

Labels

core: Anything pertaining to core functionality of the application (opencode server stuff)
