fix(wasm): remove prefill sleep — restores token streaming#33

Merged
unamedkr merged 1 commit into main from fix/wasm-remove-prefill-sleep on Apr 10, 2026

@unamedkr
Collaborator

Root cause

PR #30 added emscripten_sleep(0) inside the prefill loop in quant.h to keep the UI from hanging. But the call stack at that point (quant_generate → prefill → tq_forward → matmul → SIMD) is too deep for ASYNCIFY to unwind. This silently broke ASYNCIFY for the entire generate call, including the sleep in the token callback, which killed streaming.

Fix

Remove the prefill sleep. Accept that prefill blocks the browser for a few seconds, with "Thinking..." shown beforehand via requestAnimationFrame. Token streaming during generation works again.
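A minimal sketch of the structure this fix leaves in place, assuming a callback-driven generate loop. All names here (`generate_sketch`, `forward_stub`, `stream_token`, `YIELD_TO_BROWSER`) are hypothetical and are not taken from quant.h; only `emscripten_sleep` is the real Emscripten call. Following the PR's description, the prefill loop contains no yield, and the only yield sits in the per-token callback, outside the forward pass.

```c
#include <stddef.h>

#ifdef __EMSCRIPTEN__
#include <emscripten.h>
/* Yield to the browser event loop; requires a build with ASYNCIFY enabled. */
#define YIELD_TO_BROWSER() emscripten_sleep(0)
#else
/* Native build: there is no event loop to yield to. */
#define YIELD_TO_BROWSER() ((void)0)
#endif

/* Hypothetical callback type: invoked once per generated token. */
typedef void (*token_cb)(int token_id, void *user_data);

/* Hypothetical stand-in for the deep forward pass (tq_forward -> matmul -> SIMD). */
static int forward_stub(int token_id) {
    return token_id + 1;
}

/* Prefill, then generate. Returns the number of tokens passed to on_token. */
static int generate_sketch(const int *prompt, size_t n_prompt,
                           int max_new_tokens, token_cb on_token,
                           void *user_data) {
    int last = 0;

    /* Prefill: no yield in this loop, so the browser blocks until it ends. */
    for (size_t i = 0; i < n_prompt; i++) {
        last = forward_stub(prompt[i]);
    }

    /* Generation: the callback runs between forward passes, not inside one. */
    int produced = 0;
    for (int i = 0; i < max_new_tokens; i++) {
        on_token(last, user_data);
        produced++;
        last = forward_stub(last);
    }
    return produced;
}

/* Example callback: count the token, then yield so the page can repaint. */
static void stream_token(int token_id, void *user_data) {
    int *count = (int *)user_data;
    (void)token_id;
    (*count)++;
    YIELD_TO_BROWSER();
}
```

Calling `generate_sketch(prompt, n, max_new, stream_token, &count)` invokes the callback once per generated token. In a native build the yield macro compiles to a no-op, so the same code can be exercised outside the browser.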

Behavior after fix

  1. Enter pressed → "Thinking..." shows (via double-rAF)
  2. Prefill blocks the browser for 2-8 s (unavoidable without a step API)
  3. Tokens stream in real time ← this was broken, now fixed
  4. "X tokens, Y tok/s" updates live
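The live status line in step 4 comes down to one division. The sketch below is a hypothetical helper (`format_stats` is not a function from the project, and the PR does not say where the real computation lives) that shows one way to produce the "X tokens, Y tok/s" text from a token count and an elapsed time in milliseconds.

```c
#include <stdio.h>

/* Hypothetical helper: writes "N tokens, R tok/s" into buf.
 * Returns the value returned by snprintf. */
static int format_stats(char *buf, size_t size, int n_tokens, double elapsed_ms) {
    /* Guard against division by zero before any time has elapsed. */
    double rate = elapsed_ms > 0.0 ? (n_tokens * 1000.0) / elapsed_ms : 0.0;
    return snprintf(buf, size, "%d tokens, %.1f tok/s", n_tokens, rate);
}
```

For example, 50 tokens over 2000 ms formats as `50 tokens, 25.0 tok/s`. Calling this from the per-token callback would refresh the line after every token, assuming the callback has access to a start timestamp.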

🤖 Generated with Claude Code

The emscripten_sleep(0) added to quant.h's prefill loop (PR #30)
broke ASYNCIFY for the entire quant_generate call. The call stack
during tq_forward() is too deep (matmul → SIMD kernels) for
ASYNCIFY to unwind/rewind — it silently fails and the generation
callback's sleep stops working too.

Fix: remove prefill sleep entirely. The prefill blocks the browser
for a few seconds (unavoidable without a step-by-step API), but
"Thinking..." is shown before via requestAnimationFrame. Token
streaming during generation works again.

Also: pthreads removed (PR #32) to avoid a pthreads+ASYNCIFY
conflict; build.sh now uses single-thread SIMD + ASYNCIFY only.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@unamedkr unamedkr merged commit 01e0a2d into main Apr 10, 2026
@unamedkr unamedkr deleted the fix/wasm-remove-prefill-sleep branch April 10, 2026 11:18