ux(wasm): Thinking... indicator during prompt prefill #26

Merged
unamedkr merged 1 commit into main from ux/thinking-indicator on Apr 10, 2026

@unamedkr
Collaborator

Summary

Users reported that the WASM demo feels "hung" after sending the first message: the assistant bubble stays blank for several seconds before tokens start appearing.

Cause: Prompt prefill (tokenization plus processing all prompt tokens through 28 layers) takes 3-10 s in single-threaded WASM, and the UI gives no visual feedback during this phase.

Fix: Show a spinner and "Thinking..." in the assistant bubble immediately after sending. The first streamed token replaces the indicator. The stats bar shows "Processing prompt..." during prefill.
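
A minimal sketch of the placeholder logic described in the fix, kept DOM-free so it can run anywhere. The `AssistantBubble` class, the `BubbleView` type, and the method names are hypothetical and are not taken from this PR's diff; they only illustrate the rule "show the indicator until the first token arrives".

```typescript
// Hypothetical model of the assistant bubble; names are illustrative only.
type BubbleView = { showSpinner: boolean; text: string };

class AssistantBubble {
  private tokens: string[] = [];

  // What the bubble should display right now.
  view(): BubbleView {
    if (this.tokens.length === 0) {
      // Prefill is still running: no tokens yet, so show the placeholder.
      return { showSpinner: true, text: "Thinking..." };
    }
    // At least one token has streamed in: show the generated text only.
    return { showSpinner: false, text: this.tokens.join("") };
  }

  // Called once per streamed token; the first call replaces the placeholder.
  appendToken(token: string): void {
    this.tokens.push(token);
  }
}

const bubble = new AssistantBubble();
const beforeFirstToken = bubble.view(); // spinner + "Thinking..."
bubble.appendToken("Hello");
bubble.appendToken(", world");
const afterTokens = bubble.view(); // no spinner, text "Hello, world"
```

If prefill runs on the main thread, the placeholder has to be rendered and the browser given a chance to paint before the blocking call starts, otherwise the indicator never becomes visible. Whether the demo runs inference on the main thread or in a worker is not stated here.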

Test plan

  • Send message → see spinner + "Thinking..." immediately
  • First token replaces the indicator
  • Stats bar shows "Processing prompt..." then switches to tok/s
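
The stats-bar behavior in the last check can be sketched as a pure function. The function name `statsBarText`, its parameters, and the one-decimal tok/s format are assumptions for illustration; the PR does not show how the demo formats its stats.

```typescript
// Hypothetical formatter for the stats bar; not the demo's actual code.
function statsBarText(tokensGenerated: number, decodeSeconds: number): string {
  // No tokens yet means prefill is still in progress.
  if (tokensGenerated === 0 || decodeSeconds <= 0) {
    return "Processing prompt...";
  }
  // Once tokens are streaming, report decode throughput.
  const tokensPerSecond = tokensGenerated / decodeSeconds;
  return `${tokensPerSecond.toFixed(1)} tok/s`;
}

const duringPrefill = statsBarText(0, 0);
const duringDecode = statsBarText(20, 4);
```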

🤖 Generated with Claude Code

First-token latency can be several seconds on a 0.8B model in WASM
(processing the full prompt through 28 layers in single-threaded
WASM). Without feedback, users see a blank assistant bubble and
think the demo is broken.

Add a spinner + "Thinking..." message inside the assistant bubble
that appears immediately after sending. The first streamed token
replaces it. Also show "Processing prompt..." in the stats bar.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@unamedkr unamedkr merged commit 163affe into main Apr 10, 2026
@unamedkr unamedkr deleted the ux/thinking-indicator branch April 10, 2026 07:24