Naturalize floating bar voice streaming by kodjima33 · Pull Request #6259 · BasedHardware/omi

kodjima33 · 2026-04-01T19:29:32Z

Summary

switch the default ElevenLabs voice from Rachel to Sloane for a less generic release voice
make streaming wait for sentence-sized chunks before speaking, with a larger emergency cutoff to avoid stitched robot prosody
slightly retune ElevenLabs voice settings and align the settings copy with the shipped default voice

Verification

intended verification target was the Mac mini only
the Mac mini became unreachable over SSH during this pass, so I could not complete the remote compile/run loop before merging
root cause for the reported bad voice was verified from source and release tags: v0.11.214 already contains streaming playback, still defaults to generic voices without a custom voice id, and chunks aggressively enough to sound robotic

greptile-apps · 2026-04-01T19:34:18Z

Greptile Summary

This PR improves voice playback quality in the floating control bar by switching the default ElevenLabs voice from Rachel to Sloane, retuning voice settings for more natural delivery, and significantly reworking the text-chunking heuristic to wait for sentence-level boundaries before handing text off to the TTS API — reducing the "stitched robot prosody" caused by over-eager chunking.

Key changes:

Default voice ID changed to Sloane (BAMYoBHLZM7lJgJAmFz0) in all three places it appears (service, settings UI, placeholder text).
Chunk thresholds raised: minimum 48 → 85 chars, preferred 140 → 220 chars, and a new 360-char emergency ceiling added.
nextChunkBoundary now has three tiers: (1) sentence-ending punctuation within the preferred window, (2) sentence-ending punctuation extended to the emergency window, then (3) clause separators / whitespace / hard cut only when the emergency ceiling is reached.
ElevenLabs voice settings retuned (stability 0.42 → 0.34, similarity boost 0.82 → 0.88, style 0.22 → 0.12).
System-voice fallback preference list expanded with "Ava" and "Allison" ahead of "Samantha".
floatingBarVoiceAnswersEnabled doc-comment de-scoped from "development builds" to all builds.
Note: Per the PR description, the intended hardware verification target (Mac mini) became unreachable before merging. The logic change was reviewed from source, but a live compile/run cycle was not completed.
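For reference, the retuned ElevenLabs values listed in the key changes can be written down as a small `Codable` payload. This is only a sketch: the type name `VoiceSettings`, the constant `retunedSettings`, and the helper `encodeSettings` are hypothetical, and the snake_case key `similarity_boost` follows ElevenLabs' public `voice_settings` object as I understand it. The PR excerpt does not show how the service actually builds its request.

```swift
import Foundation

// Hypothetical container for the three values retuned in this PR.
struct VoiceSettings: Codable, Equatable {
    var stability: Double
    var similarityBoost: Double
    var style: Double

    // Assumed wire names; only similarityBoost needs remapping.
    enum CodingKeys: String, CodingKey {
        case stability
        case similarityBoost = "similarity_boost"
        case style
    }
}

// Values after the retune (previously 0.42, 0.82, 0.22).
let retunedSettings = VoiceSettings(stability: 0.34, similarityBoost: 0.88, style: 0.12)

// Encodes the settings as JSON, e.g. for embedding in a request body.
func encodeSettings(_ settings: VoiceSettings) throws -> Data {
    try JSONEncoder().encode(settings)
}
```

Lower stability and style with a higher similarity boost is the direction of the change reported above; whether that sounds more natural still depends on the live run that was not completed.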

Confidence Score: 5/5

Safe to merge — all findings are minor style suggestions with no functional impact.
The chunking logic is sound: all index arithmetic is bounded by min(text.count, limit), no infinite-loop risk exists in drainBufferedText, and the ElevenLabs error path already falls back to the system voice. The only finding is a P2 redundant character in the clause-separator set. The unverified hardware run is noted in the PR description as a known gap, not a regression introduced by this change.
No files require special attention beyond the one P2 style note in FloatingBarVoicePlaybackService.swift.
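The termination argument above can be illustrated with a minimal drain loop. The real `drainBufferedText` is not shown in this PR excerpt, so the signature below, including the `boundary` closure parameter, is an assumption made to keep the sketch self-contained.

```swift
// Hypothetical drain loop: repeatedly asks `boundary` how many characters to
// emit and removes them from the front of the buffer.
func drainBufferedText(_ buffer: inout String, boundary: (String) -> Int?) -> [String] {
    var chunks: [String] = []
    // The loop stops when the boundary function returns nil (keep buffering)
    // or a non-positive count, so every iteration removes at least one character.
    while let count = boundary(buffer), count > 0 {
        let cut = min(count, buffer.count) // never index past the buffer
        chunks.append(String(buffer.prefix(cut)))
        buffer.removeFirst(cut)
    }
    return chunks
}
```

Because the buffer strictly shrinks on each pass and the cut is clamped to `buffer.count`, the loop cannot spin forever or read out of bounds, which is the property the review relies on.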

Important Files Changed

| Filename | Overview |
| --- | --- |
| desktop/Desktop/Sources/FloatingControlBar/FloatingBarVoicePlaybackService.swift | Default ElevenLabs voice changed from Rachel to Sloane; chunk thresholds raised (min 48→85, preferred 140→220, new emergency 360); voice settings retuned; system voice preference list expanded. One minor redundancy in the clause-separator character set. |
| desktop/Desktop/Sources/FloatingControlBar/ShortcutSettings.swift | Doc-comment updated to remove "development builds" qualifier, accurately reflecting that voice answers are now a general feature. |
| desktop/Desktop/Sources/MainWindow/Pages/SettingsPage.swift | UI strings and placeholder voice ID updated from Rachel/`21m00Tcm4TlvDq8ikWAM` to Sloane/`BAMYoBHLZM7lJgJAmFz0` to match the new default. |
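The expanded system-voice preference order mentioned for `FloatingBarVoicePlaybackService.swift` amounts to a first-match lookup. The sketch below assumes the installed voice names are passed in as plain strings; `preferredSystemVoices` and `pickSystemVoice` are hypothetical names, and the list shows only the three voices named in the summary, not necessarily the full list in the source.

```swift
// Preference order named in the summary: "Ava" and "Allison" ahead of "Samantha".
// The real list may contain further entries that this excerpt does not show.
let preferredSystemVoices = ["Ava", "Allison", "Samantha"]

// Returns the first preferred voice that is installed, otherwise any installed
// voice, otherwise nil. Falling back to `installed.first` is an assumption.
func pickSystemVoice(from installed: [String],
                     preferences: [String] = preferredSystemVoices) -> String? {
    for name in preferences where installed.contains(name) {
        return name
    }
    return installed.first
}
```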

Flowchart

%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A[New streamed text arrives] --> B{text.count >= minimumChunkLength\n85 chars?}
    B -- No --> WAIT[Return nil — buffer more text]
    B -- Yes --> C[Search preferredSlice\n0..220 chars for '.!?\\n']
    C -- Found --> SPLIT1[Split after last sentence-ending punctuation]
    C -- Not found --> D{text.count >= preferredChunkLength\n220 chars?}
    D -- No --> WAIT
    D -- Yes --> E[Search emergencySlice\n0..360 chars for '.!?\\n']
    E -- Found --> SPLIT2[Split after punctuation in emergency window]
    E -- Not found --> F{text.count >= emergencyChunkLength\n360 chars?}
    F -- No --> WAIT
    F -- Yes --> G[Search emergencySlice for ',;:']
    G -- Found --> SPLIT3[Split after clause separator]
    G -- Not found --> H[Search emergencySlice for whitespace]
    H -- Found --> SPLIT4[Split at last whitespace]
    H -- Not found --> SPLIT5[Hard cut at emergencyLimit]
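Read as code, the flowchart above corresponds roughly to the following sketch. It is reconstructed from the flowchart and the thresholds in the summary, not copied from the PR: the function signature, the use of a character array, and the choice to include the matched separator or whitespace in the emitted chunk are all assumptions.

```swift
// Thresholds as reported in the summary.
let minimumChunkLength = 85
let preferredChunkLength = 220
let emergencyChunkLength = 360

/// Returns how many leading characters to speak next, or nil to keep buffering.
func nextChunkBoundary(in text: String) -> Int? {
    let chars = Array(text)
    guard chars.count >= minimumChunkLength else { return nil }

    let sentenceEnders: Set<Character> = [".", "!", "?", "\n"]

    // Tier 1: last sentence ender inside the preferred window.
    let preferredLimit = min(chars.count, preferredChunkLength)
    if let i = chars[..<preferredLimit].lastIndex(where: { sentenceEnders.contains($0) }) {
        return i + 1
    }
    guard chars.count >= preferredChunkLength else { return nil }

    // Tier 2: last sentence ender inside the emergency window.
    let emergencyLimit = min(chars.count, emergencyChunkLength)
    let emergencySlice = chars[..<emergencyLimit]
    if let i = emergencySlice.lastIndex(where: { sentenceEnders.contains($0) }) {
        return i + 1
    }
    guard chars.count >= emergencyChunkLength else { return nil }

    // Tier 3: clause separator, then whitespace, then a hard cut.
    if let i = emergencySlice.lastIndex(where: { ",;:".contains($0) }) {
        return i + 1
    }
    if let i = emergencySlice.lastIndex(where: { $0.isWhitespace }) {
        return i + 1
    }
    return emergencyLimit
}
```

Following the flowchart literally, a sentence ender found early in the preferred window still produces a split there once 85 characters are buffered; whether the shipped code guards against very short chunks is not visible in this excerpt.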

Reviews (1): Last reviewed commit: "Naturalize floating bar voice streaming"

greptile-apps · 2026-04-01T19:34:21Z

+
+    guard text.count >= emergencyChunkLength else { return nil }
+
+    if let clauseIndex = emergencySlice.lastIndex(where: { ",;:\n".contains($0) }) {


Redundant \n in clause separator set

At this point in the control flow, the preceding emergencySlice.lastIndex(where: { ".!?\n".contains($0) }) check on the same slice has already returned nil, which guarantees there is no \n character anywhere within emergencySlice. The \n in ",;:\n" can therefore never match on this path and is redundant.

Suggested change

-    if let clauseIndex = emergencySlice.lastIndex(where: { ",;:\n".contains($0) }) {
+    if let clauseIndex = emergencySlice.lastIndex(where: { ",;:".contains($0) }) {


Naturalize floating bar voice streaming

7a94f5d

kodjima33 merged commit fdbff29 into main Apr 1, 2026
2 checks passed

kodjima33 deleted the nik/naturalize-streaming-voice branch April 1, 2026 19:29

greptile-apps Bot reviewed Apr 1, 2026
