feat(overlay): show real-time interim ASR text during recording #6

Merged
missuo merged 3 commits into missuo:main from erning:feat/overlay-interim-text on Mar 25, 2026

erning commented Mar 25, 2026

Collaborator

Summary

  • Add on_interim_text FFI callback so the Obj-C frontend receives partial ASR results in real-time
  • Display interim recognition text in the floating overlay pill during recording, replacing the static "Listening…" label
  • Update README and DESIGN.md to document the overlay and interim text feature
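
For readers unfamiliar with the callback shape, here is a minimal Python sketch of the idea, not the PR's actual FFI code. Per the commit notes, the callback fires from two places: the audio streaming loop and the post-recording drain in wait_for_final. The `Callbacks` dataclass, `stream_audio`, and the argument lists are hypothetical stand-ins; only the names on_interim_text, SPCallbacks, and wait_for_final come from the PR.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List


@dataclass
class Callbacks:
    """Hypothetical stand-in for the PR's SPCallbacks struct."""
    on_interim_text: Callable[[str], None]


def stream_audio(callbacks: Callbacks, partials: Iterable[str]) -> None:
    # Call site 1: partial results that arrive while audio is still streaming.
    for text in partials:
        callbacks.on_interim_text(text)


def wait_for_final(callbacks: Callbacks, late_partials: Iterable[str],
                   final_text: str) -> str:
    # Call site 2: partial results drained after recording has stopped.
    for text in late_partials:
        callbacks.on_interim_text(text)
    return final_text


# The frontend side just records whatever it is handed (assumed behavior).
received: List[str] = []
cb = Callbacks(on_interim_text=received.append)
stream_audio(cb, ["hel", "hello"])
final = wait_for_final(cb, ["hello wor"], "hello world")
```

Routing both call sites through the same callback means the overlay does not need to know whether a partial arrived during or after recording.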

Details

  • Pill width grows with text but never shrinks within a session (animated, 0.15s ease-out)
  • Width clamped to screen minus 32px margins on each side
  • Long text is left-truncated showing the trailing portion with a gradient fade
  • Before the first interim result arrives, display is unchanged ("Listening…")
  • All other overlay states (connecting, recognizing, thinking, pasting, error) are unchanged
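
The width rules above can be sketched as follows. This is an illustrative Python model under stated assumptions, not the Obj-C implementation: the `PillWidth` class, its `min_width` parameter, and the idea of passing a pre-measured text width are all hypothetical. The 0.15s ease-out animation is not modeled; only the target width is computed.

```python
MARGIN = 32.0  # px reserved on each side of the screen, per the PR description


class PillWidth:
    """Tracks a pill width that grows with text but never shrinks in a session."""

    def __init__(self, screen_width: float, min_width: float) -> None:
        # Largest width allowed: screen width minus both margins.
        self.max_allowed = screen_width - 2 * MARGIN
        self.min_width = min_width
        self.current = min_width

    def update(self, text_width: float) -> float:
        # Clamp the desired width to the screen limit first...
        clamped = min(text_width, self.max_allowed)
        # ...then keep the larger of old and new so the pill never shrinks.
        self.current = max(self.current, clamped)
        return self.current

    def reset(self) -> None:
        # Called on a state transition so the next session starts small again.
        self.current = self.min_width


pill = PillWidth(screen_width=1000.0, min_width=120.0)
widths = [pill.update(200.0), pill.update(150.0), pill.update(5000.0)]
pill.reset()
```

Here the second update requests a narrower pill and is ignored, and the third is capped at 1000 - 2 * 32 = 936.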

Demo

interim-text.mp4

Test plan

  • Hold hotkey and speak — overlay shows real-time interim text
  • Pill grows but never shrinks during one session
  • Long text shows trailing portion with left gradient fade
  • Release hotkey — transitions normally through finalizing → correcting → pasting → idle
  • Next session starts fresh with "Listening…"
  • Tap-to-toggle mode works identically
  • Non-recording states display unchanged
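
The label-related checks in this plan can be expressed as a small model. This is a sketch only: it truncates by character count, whereas the real overlay presumably measures rendered text width, and the gradient fade is purely visual and not modeled. `RecordingSession`, `recording_label`, and `max_chars` are hypothetical names.

```python
LISTENING = "Listening…"  # label shown before the first interim result


def recording_label(interim: str, max_chars: int) -> str:
    """Return the text shown in the pill while recording."""
    if max_chars <= 0:
        raise ValueError("max_chars must be positive")
    if not interim:
        # No interim result yet: display is unchanged.
        return LISTENING
    # Keep the trailing portion; the head is what gets cut off.
    return interim[-max_chars:]


class RecordingSession:
    """One hotkey hold (or toggle); a new session starts with no interim text."""

    def __init__(self) -> None:
        self.interim = ""

    def on_interim_text(self, text: str) -> None:
        self.interim = text

    def label(self, max_chars: int) -> str:
        return recording_label(self.interim, max_chars)


session = RecordingSession()
before = session.label(9)
session.on_interim_text("the quick brown fox")
during = session.label(9)
fresh = RecordingSession().label(9)
```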

erning added 3 commits March 25, 2026 09:29

Add on_interim_text callback to SPCallbacks so the Obj-C frontend
receives partial ASR results in real-time during recording. The
callback is invoked from both the audio streaming loop and the
post-recording drain in wait_for_final.

Bridge, delegate, and AppDelegate forwarding are wired up. Overlay
declares updateInterimText: with a no-op stub for now.

Replace the static "Listening…" label with real-time interim
recognition text as it arrives from the ASR service.

- Pill width grows with text but never shrinks within a session
- Width clamped to screen minus margins (32px each side)
- Width transitions use 0.15s ease-out animation
- Long text is left-truncated showing the trailing portion with
  a gradient fade on the left edge
- State transitions clear interim text and reset max width
- Before first interim arrives, display is unchanged ("Listening…")

Update README and DESIGN.md to account for the floating status overlay
and real-time interim ASR text display during recording.

- Replace "No GUI at all" / "No visible GUI" with "Minimal GUI"
- Remove "No floating screen bubble" from non-goals
- Add Section 4.3 describing the floating overlay behavior
- Add interim text display step to both Path A and Path B flows
- Note overlay display in ASR Pipeline description

missuo commented Mar 25, 2026

Owner

Cool!

missuo merged commit 8381b99 into missuo:main Mar 25, 2026

missuo commented Mar 25, 2026

Owner

In my testing, the overlay works fine while you speak continuously, but if you pause for 2-3 seconds mid-speech, none of the subsequent text is displayed.

missuo commented Mar 25, 2026

Owner

Fixed in c6aa702

erning deleted the feat/overlay-interim-text branch March 25, 2026 09:45