feat(chat): add voice input with real-time transcription by guitavano · Pull Request #3105 · decocms/studio

guitavano · 2026-04-13T01:56:51Z

Summary

- Adds a microphone button to the chat input (only shown when the browser supports SpeechRecognition)
- Real-time speech-to-text using the browser-native Web Speech API; no backend changes required
- Live waveform visualizer driven by the Web Audio API (AnalyserNode) animates while listening
- Transcribed text is appended to any existing editor content (not replaced), so the user can mix typing and dictation freely
- Cancel (×) and confirm (✓) controls match the existing input style
- Proper microphone permission handling: getUserMedia is called upfront to trigger the browser prompt; if denied, the mic button turns red with a descriptive tooltip and clicking it re-prompts
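The support check behind the mic button can be sketched as a small feature-detection helper. This is a minimal illustration, not the PR's code: `SpeechScope`, `getSpeechRecognitionCtor`, and `isVoiceInputSupported` are hypothetical names, and the fallback to the prefixed `webkitSpeechRecognition` global is an assumption about how the check might be written.

```typescript
// Minimal constructor shape; the real SpeechRecognition interface is much larger.
type RecognitionCtor = new () => object;

// Only the two globals this sketch probes (hypothetical type, not from the PR).
interface SpeechScope {
  SpeechRecognition?: RecognitionCtor;
  webkitSpeechRecognition?: RecognitionCtor;
}

// Returns the recognition constructor if the scope provides one, otherwise null.
function getSpeechRecognitionCtor(scope: SpeechScope): RecognitionCtor | null {
  return scope.SpeechRecognition ?? scope.webkitSpeechRecognition ?? null;
}

// The mic button would render only when this returns true.
function isVoiceInputSupported(scope: SpeechScope): boolean {
  return getSpeechRecognitionCtor(scope) !== null;
}
```

In a browser the scope would be the global object, for example `isVoiceInputSupported(window as unknown as SpeechScope)`. Passing the scope in as a parameter keeps the helper testable outside a browser.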

Test plan

- Click the mic button: the browser should ask for microphone permission
- Deny permission: the mic button should turn red with the tooltip "Microphone access denied — click to try again"
- Grant permission: the waveform overlay should appear and animate while speaking
- Speak: the interim transcript (dimmed) should appear live, and the final transcript (solid) should accumulate
- Press ✓: the transcribed text should be inserted at the cursor / appended to existing text in the editor
- Press ×: the overlay should dismiss with no changes to the editor
- Type something first, then use voice: the transcribed text should be appended after existing content
- Test on Firefox (Speech API unsupported): the mic button should not appear
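The deny and grant cases in the test plan can be modelled as a mapping from the `getUserMedia` outcome to a button state. The sketch below is an assumption about how that might look: `MicState`, `micStateFromError`, and `requestMic` are hypothetical names. The only browser behavior it relies on is that `getUserMedia` rejects with a `DOMException` whose `name` is `NotAllowedError` when permission is refused.

```typescript
// Hypothetical UI states for the mic button.
type MicState = "listening" | "denied" | "unavailable";

// Maps a getUserMedia rejection to a UI state by its exception name.
function micStateFromError(err: unknown): MicState {
  const name =
    typeof err === "object" && err !== null && "name" in err
      ? String((err as { name: unknown }).name)
      : "";
  // NotAllowedError: the user or a policy refused microphone access.
  if (name === "NotAllowedError" || name === "SecurityError") return "denied";
  // Anything else (e.g. NotFoundError: no input device) is treated as unavailable.
  return "unavailable";
}

// Injected so the sketch runs without a browser.
type GetUserMedia = (constraints: { audio: boolean }) => Promise<unknown>;

// Requests the microphone upfront; a later click can simply call this again.
async function requestMic(getUserMedia: GetUserMedia): Promise<MicState> {
  try {
    await getUserMedia({ audio: true });
    return "listening";
  } catch (err) {
    return micStateFromError(err);
  }
}
```

In a browser, the injected function would be `(c) => navigator.mediaDevices.getUserMedia(c)`. A `"denied"` result corresponds to the red mic button with the retry tooltip.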

Made with Cursor

Summary by cubic

Add voice input to chat with real-time transcription that types directly into the editor and shows a live waveform. During recording the editor locks; cancel restores previous text, confirm keeps the transcription.

New Features
- Mic button shows only when SpeechRecognition is available; hidden on unsupported browsers.
- Transcript appears live in the editor while recording; input is disabled to prevent edits.
- Bottom bar switches to waveform with ×/✓; cancel restores prior content, confirm keeps the text.
- Permission flow via getUserMedia; denied state turns the mic red with a tooltip and allows retry.
- New use-voice-input hook and VoiceWaveform; TiptapInput adds appendText(), syncVoiceText(), and restoreContent().
Bug Fixes
- Added Web Speech API global types in globals.d.ts for SpeechRecognition and related interfaces to fix TypeScript builds and noUncheckedIndexedAccess issues.
- Prevented losing last interim words on stop and ensured a space is added before appended voice text when baseline content exists.
- Removed unused VoiceInputOverlay export.
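The interim-word fix can be illustrated with a pure function that folds recognition result events into a running transcript. This is a sketch under stated assumptions rather than the hook's actual code: the `*Like` interfaces are a structural subset of the Web Speech API result types, and `applyResultEvent` and `textOnStop` are hypothetical names. The `result?.[0]` guard mirrors the kind of check that `noUncheckedIndexedAccess` requires.

```typescript
// Structural subset of the Web Speech API result types, so this runs without DOM typings.
interface AlternativeLike { transcript: string }
interface ResultLike extends ArrayLike<AlternativeLike> { isFinal: boolean }
interface ResultEventLike { resultIndex: number; results: ArrayLike<ResultLike> }

interface Transcript { final: string; interim: string }

// Folds one result event into the transcript: final results accumulate,
// interim text is rebuilt from scratch on every event.
function applyResultEvent(prev: Transcript, event: ResultEventLike): Transcript {
  let final = prev.final;
  let interim = "";
  for (let i = event.resultIndex; i < event.results.length; i++) {
    const result: ResultLike | undefined = event.results[i];
    const best = result?.[0]; // indexed access may be undefined
    if (!result || !best) continue;
    if (result.isFinal) final += best.transcript;
    else interim += best.transcript;
  }
  return { final, interim };
}

// On stop, commit whatever is still interim so the last words are not dropped.
function textOnStop(t: Transcript): string {
  return (t.final + t.interim).trim();
}
```

Returning the combined text from the stop path, instead of reading only the finalized part, is what keeps the last dictated words when the user confirms mid-phrase.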

Written for commit ec23fa4. Summary will update on new commits.

github-actions · 2026-04-13T01:57:05Z

🧪 Benchmark

Should we run the Virtual MCP strategy benchmark for this PR?

React with 👍 to run the benchmark.

| Reaction | Action |
| --- | --- |
| 👍 | Run quick benchmark (10 & 128 tools) |

Benchmark will run on the next push after you react.

github-actions · 2026-04-13T01:57:06Z

Release Options

Suggested: Minor (2.261.0) — based on feat: prefix

React with an emoji to override the release type:

| Reaction | Type | Next Version |
| --- | --- | --- |
| 👍 | Prerelease | `2.260.3-alpha.1` |
| 🎉 | Patch | `2.260.3` |
| ❤️ | Minor | `2.261.0` |
| 🚀 | Major | `3.0.0` |

Current version: 2.260.2

Note: If multiple reactions exist, the smallest bump wins. If no reactions, the suggested bump is used (default: patch).

cubic-dev-ai

1 issue found across 4 files

Prompt for AI agents (unresolved issues)


Check if these issues are valid — if so, understand the root cause of each and fix them. If appropriate, use sub-agents to investigate and fix each issue separately.


<file name="apps/mesh/src/web/components/chat/input.tsx">

<violation number="1" location="apps/mesh/src/web/components/chat/input.tsx:334">
P1: Transcription can be lost on confirm because `appendText` is called while `TiptapInput` is unmounted in recording mode.</violation>
</file>

Reply with feedback, questions, or to request a fix. Tag @cubic-dev-ai to re-run a review.

apps/mesh/src/web/components/chat/input.tsx

Transcript now appears live in the normal text area instead of a separate overlay. The bottom bar switches to waveform + accept/decline during recording. Waveform uses chart-2 color and reads low-mid frequency bins where speech energy lives.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
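Reading only the low-mid bins can be sketched as a pure function over the byte array that `AnalyserNode.getByteFrequencyData` fills. The commit does not state which frequencies it reads, so the 85 to 3000 Hz band below is an assumption, and `speechBars` is a hypothetical name rather than the `VoiceWaveform` implementation.

```typescript
// Assumed speech band in Hz; the commit only says "low-mid frequency bins".
const SPEECH_BAND_HZ = { low: 85, high: 3000 };

// Converts byte frequency data into `barCount` bar heights in the range 0..1,
// using only the bins inside the given band.
function speechBars(
  freqData: Uint8Array,
  sampleRate: number,
  barCount: number,
  band: { low: number; high: number } = SPEECH_BAND_HZ,
): number[] {
  // The analyser exposes fftSize / 2 bins covering 0 Hz up to sampleRate / 2.
  const hzPerBin = sampleRate / 2 / freqData.length;
  const first = Math.max(0, Math.floor(band.low / hzPerBin));
  const last = Math.min(freqData.length - 1, Math.ceil(band.high / hzPerBin));
  const span = last - first + 1;
  const bars: number[] = [];
  for (let b = 0; b < barCount; b++) {
    // Each bar averages a contiguous slice of the band (at least one bin).
    const start = first + Math.floor((b * span) / barCount);
    const end = Math.max(start + 1, first + Math.floor(((b + 1) * span) / barCount));
    let sum = 0;
    let count = 0;
    for (let i = Math.max(0, start); i < end && i < freqData.length; i++) {
      sum += freqData[i] ?? 0;
      count++;
    }
    bars.push(count === 0 ? 0 : sum / count / 255);
  }
  return bars;
}
```

In a browser, `freqData` would be a `Uint8Array` of length `analyser.frequencyBinCount`, refilled on each animation frame, and `sampleRate` would come from the `AudioContext`. Energy above the band, such as hiss, does not move the bars.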

cubic-dev-ai

2 issues found across 3 files (changes from recent commits).

Prompt for AI agents (unresolved issues)


Check if these issues are valid — if so, understand the root cause of each and fix them. If appropriate, use sub-agents to investigate and fix each issue separately.


<file name="apps/mesh/src/web/components/chat/input.tsx">

<violation number="1" location="apps/mesh/src/web/components/chat/input.tsx:337">
P1: Use the final text returned by `stopRecording()` when confirming, otherwise the last dictated words can be lost.</violation>

<violation number="2" location="apps/mesh/src/web/components/chat/input.tsx:354">
P2: Prefix dictated text with a space when baseline content is non-empty so transcription truly appends instead of concatenating words.</violation>
</file>

apps/mesh/src/web/components/chat/input.tsx

Use the final text returned by stopRecording() to guarantee interim words captured in the ref are committed. Add a space before voice text when baseline content is non-empty to avoid word concatenation.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
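The spacing rule and the confirm/cancel outcome can be sketched as two pure functions. The names `composeVoiceText` and `resolveRecording` are hypothetical, and `finalText` stands in for the value returned by `stopRecording()`. Skipping the extra space when the baseline already ends in whitespace is a choice made for this sketch, not something the commit states.

```typescript
// Joins the pre-recording editor text with the dictated text.
function composeVoiceText(baseline: string, voiceText: string): string {
  const spoken = voiceText.trim();
  if (spoken === "") return baseline; // nothing dictated: leave the editor as it was
  if (baseline === "") return spoken; // empty editor: no leading space
  // Insert a separator only if the baseline does not already end in whitespace.
  const needsSpace = !/\s$/.test(baseline);
  return baseline + (needsSpace ? " " : "") + spoken;
}

// Confirm keeps baseline plus dictation; cancel restores the baseline untouched.
function resolveRecording(
  action: "confirm" | "cancel",
  baseline: string,
  finalText: string,
): string {
  return action === "confirm" ? composeVoiceText(baseline, finalText) : baseline;
}
```

Keeping the baseline as a snapshot taken when recording starts means cancel needs no undo logic: the editor is reset to the stored string.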

viktormarinho

lgtm

* feat(chat): add voice input with real-time transcription and waveform

  Made-with: Cursor

* fix(chat): add Web Speech API global types and fix noUncheckedIndexedAccess

  Made-with: Cursor

* fix(chat): rework voice input to type directly into textarea

  Transcript now appears live in the normal text area instead of a separate overlay. The bottom bar switches to waveform + accept/decline during recording. Waveform uses chart-2 color and reads low-mid frequency bins where speech energy lives.

  Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(chat): prevent last words being lost and fix space before voice text

  Use the final text returned by stopRecording() to guarantee interim words captured in the ref are committed. Add a space before voice text when baseline content is non-empty to avoid word concatenation.

  Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(chat): remove unused VoiceInputOverlay export

  Made-with: Cursor

---------

Co-authored-by: rafavalls <valls@deco.cx>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>

feat(chat): add voice input with real-time transcription and waveform

6ca8111

Made-with: Cursor

fix(chat): add Web Speech API global types and fix noUncheckedIndexedAccess

76b5b64

Made-with: Cursor

cubic-dev-ai bot reviewed Apr 13, 2026

cubic-dev-ai bot reviewed Apr 13, 2026

rafavalls and others added 2 commits April 13, 2026 14:18

fix(chat): remove unused VoiceInputOverlay export

ec23fa4

Made-with: Cursor

viktormarinho approved these changes Apr 13, 2026

guitavano merged commit a4ac87e into main Apr 13, 2026
15 checks passed

guitavano deleted the feat/voice-input-chat branch April 13, 2026 18:09

guitavano merged 5 commits into main from feat/voice-input-chat