Skip to content

fix: prevent JS stack overflow from crashing the process#15

Draft
wytzepiet wants to merge 2 commits into
fluttercandies:mainfrom
wytzepiet:feat/js-stack-safety
Draft

fix: prevent JS stack overflow from crashing the process#15
wytzepiet wants to merge 2 commits into
fluttercandies:mainfrom
wytzepiet:feat/js-stack-safety

Conversation

@wytzepiet
Copy link
Copy Markdown

Problem

QuickJS's stack overflow check is a soft limit: it only fires once JS grows past max_stack_size, and it does not know the real native thread stack. JS runs on flutter_rust_bridge's tokio workers, whose stack defaults to 2 MB. So:

  • If max_stack_size is set near or above the thread stack (or 0, which QuickJS treats as "no limit"), JS overflows the native stack first and the whole process aborts (SIGABRT) instead of throwing a catchable RangeError.
  • The 256 KB default is safe but very shallow for deep call trees (e.g. a recursive UI render).

Fix

  1. Clamp max_stack_size below the thread stack, mapping 0 to that ceiling, so overflow is always a catchable RangeError, never a crash.
  2. Run all JS on a dedicated runtime whose threads have an 8 MB stack, and default the budget to 6 MB. Deep-but-normal JS now runs; overflow still throws. QuickJS refreshes its overflow baseline on each entry, so spreading JS across these threads is safe.

No public API changes, no new bridged methods, no codegen. execute_pending_job's signature is unchanged.

Tests

Host-side tests (no device): eval and pumped jobs reach the same depth and both throw a catchable error; a bigger budget reaches deeper on both paths; the default budget is generous; an over-large budget is clamped instead of crashing a worker thread.

🤖 Generated with Claude Code

wytzepiet and others added 2 commits June 1, 2026 01:22
…hing

QuickJS's stack overflow check is a soft limit measured against the native
thread stack. If max_stack_size reaches the thread stack, JS overflows the
real stack first and the process aborts (SIGABRT) instead of throwing a
catchable RangeError. JS runs on flutter_rust_bridge's tokio workers (2 MB
stack, which fjs does not set), and `0` means "no limit" to QuickJS — both
overflow the native stack.

Clamp the requested budget to 3/4 of the thread stack, mapping 0 to that
ceiling, so overflow stays a catchable RangeError. Tests cover that eval and
pumped jobs reach the same depth, a bigger budget reaches deeper, and an
over-large budget no longer crashes a worker thread.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
flutter_rust_bridge runs Rust on tokio workers whose stack defaults to 2 MB,
which fjs cannot resize. That is far below a browser's ~8 MB JS stack, so
deep-but-normal JS (e.g. a recursive UI render) overflows the native stack and
aborts the process instead of throwing.

Run every JS entry (eval, call, module eval, job pump) on fjs's own runtime
whose threads have an 8 MB stack, via a small `with_js` helper and a global
`js_executor`. QuickJS refreshes its overflow baseline on each entry, so
spreading JS across these threads stays correct. Default the budget to 6 MB
(3/4 of the thread stack) and keep the clamp relative to it, so deep UIs render
while overflow stays a catchable RangeError. The sync runtime, which runs on the
caller's thread, keeps its conservative ceiling.

No API changes and no new bridged methods; `execute_pending_job`'s signature is
unchanged.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@iota9star
Copy link
Copy Markdown
Member

Review findings:

  • Blocking: JsAsyncRuntime::new() does not apply the new default MAX_SAFE_STACK_SIZE; only create() does. In libfjs/src/api/runtime.rs, new() constructs the runtime/loaders but never calls set_max_stack_size, while create() does. That means the Dart-facing JsAsyncRuntime() factory still gets the QuickJS 256 KB default instead of the intended 6 MB default.

    I verified this by comparing recursion depth through both constructors: new() failed around depth 121, while create() reached around depth 738.

    Suggested fix: set the max stack size in new() too, or factor runtime construction through a shared helper used by both constructors. Please add a regression test that proves both constructors get the same safer default.

  • Integration note with feat(runtime): background driver so async work (fetch, timers) resolves on its own #12: once the background driver PR is merged, the driver still needs to execute JS jobs on the dedicated JS executor thread. In a merged feat(runtime): background driver so async work (fetch, timers) resolves on its own #12 + fix: prevent JS stack overflow from crashing the process #15 checkout, start_drive() still uses tokio::spawn(self.rt.drive()), so async jobs pumped by the background driver run on tokio-rt-worker rather than fjs-js, bypassing the larger-stack protection. Consider using crate::js_executor::spawn(self.rt.drive()) or equivalent, plus a regression test for driver-pumped jobs.

Verification run on this PR: cargo test --manifest-path libfjs/Cargo.toml passed locally with 538 tests.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants