Per-session half: the per-session conversational button is removed and replaced by (a) composer-mic dictation (STT → composer, manual send, transcription reviewable before send — the ChatGPT dictate pattern) and (b) on-demand "summarize & read back" TTS of the last turn / important result. Speech transport only; no realtime turn-loop in a worker session.
Always-on inbox failure mode (persistent low-amplitude anxiety). Mitigations: aggressive bounded-deferral defaults (§9), no sounds for routine items, and a clear distinction between "Overseer wants you" and "Overseer is processing". Failure mode to avoid: "Clippy got into cocaine".
Auto-send hazard for dictation — a per-session message is an instruction a worker acts on; manual-send-after-readback is required, not optional.
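The manual-send requirement can be illustrated with a small sketch. Everything here is hypothetical (`DictationComposer`, `onTranscript`, `edit`, `send`, and the `deliver` callback are illustrative names, not taken from the codebase); the only point it makes is that speech-to-text output lands in an editable draft and nothing reaches the worker until the user explicitly sends.

```typescript
// Hypothetical sketch: none of these names come from the actual codebase.
type DeliverFn = (text: string) => void;

class DictationComposer {
  private draft = "";
  private deliver: DeliverFn;

  constructor(deliver: DeliverFn) {
    this.deliver = deliver;
  }

  // Called by the STT layer for each finalized transcript chunk.
  // Appends to the draft only; it never delivers anything.
  onTranscript(chunk: string): void {
    const trimmed = chunk.trim();
    if (trimmed.length === 0) return;
    this.draft = this.draft.length === 0 ? trimmed : `${this.draft} ${trimmed}`;
  }

  // The user may correct the transcription before sending.
  edit(text: string): void {
    this.draft = text;
  }

  get text(): string {
    return this.draft;
  }

  // Intended to be wired to an explicit user action (send button or key).
  // Returns false and delivers nothing when the draft is empty.
  send(): boolean {
    const message = this.draft.trim();
    if (message.length === 0) return false;
    this.deliver(message);
    this.draft = "";
    return true;
  }
}
```

In this sketch there is deliberately no code path from `onTranscript` to `deliver`, so an auto-send cannot happen by accident; whether the real composer enforces this structurally or by convention is not stated in the issue.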
Goal
The atomic "voice moves up a tier" switch: relocate the realtime conversational surface to application chrome (attached to the Overseer), and in the same move replace the per-session conversational voice button with lightweight dictation (composer mic) + on-demand read-back. One coherent moment — never an interim state with two ways to talk in a session.
Spec
docs/plans/2026-06-03-overseer-build-sequence.md: Step 5 (primary)
docs/plans/2026-06-03-overseer-framing.md: "The chrome-button insight", "Why voice belongs at the Overseer tier, not the worker tier"
docs/plans/2026-06-03-overseer-prioritization.md: §9 (attention budget)
The atomic switch (both halves fire together)
voiceFocus kind re-targets session → overseer, not a rewrite).
Why atomic: keeping the heavy per-session conversation alive until this moment preserves the habit-teacher (#21) and avoids any conversation-less gap on the daily driver; doing the per-session swap at the same instant avoids an interim "two ways to talk" UX. Conceptual basis: cross-session conversation is meaningful because work sessions are things-to-do with completion states (an inbox to arbitrate), unlike independent chat topics, so conversation belongs at the fleet/Overseer tier, and the session keeps only speech transport.
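One way to read "re-targets session → overseer, not a rewrite" is as a change of variant on an existing focus value. The sketch below assumes that `voiceFocus` is a discriminated union with a `kind` field; the exact shape, the `sessionId` field, and the helper names are guesses for illustration and may not match the real type.

```typescript
// Hypothetical shape: the real voiceFocus type may differ.
type VoiceFocus =
  | { kind: "session"; sessionId: string }
  | { kind: "overseer" };

// Re-target the realtime conversational surface at the Overseer.
// The surrounding voice pipeline is assumed to stay unchanged.
function retargetToOverseer(focus: VoiceFocus): VoiceFocus {
  return focus.kind === "overseer" ? focus : { kind: "overseer" };
}

// Small helper for logging or tests.
function describeFocus(focus: VoiceFocus): string {
  switch (focus.kind) {
    case "session":
      return `session:${focus.sessionId}`;
    case "overseer":
      return "overseer";
  }
}
```

If the focus really is modeled this way, the chrome-level switch would touch only the value passed to the voice pipeline, which fits the "not a rewrite" framing.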
Acceptance
Out of scope
Dependencies
Suggested PR breakdown
1-2 PRs: chrome Overseer voice button; per-session swap (remove conversational button + add composer-mic dictation + read-back); mobile-web smoke tests.
Risks