split call_function into 14 inline-never family dispatchers #504

Closed
danieljohnmorris wants to merge 3 commits into main from fix/call-function-decompose

@danieljohnmorris
Collaborator

Summary

  • call_function was a single ~4,500-line function with ~134 if builtin == ... arms. In debug builds, rustc reserves stack slots for every local in a frame simultaneously even across mutually exclusive branches, so each recursive user-fn call consumed several hundred KiB of stack.
  • PR #494 (crypto primitives: sha256, hmac-sha256, base64, hex, ct-eq; adds the sha2/hmac crates) and PR #500 (--allow-net/read/write/run CLI capability flags) both needed band-aids: a per-test 8 MiB thread wrapper and RUST_MIN_STACK=16777216 in CI, respectively.
  • This PR fixes the root cause by splitting the arms into 14 #[inline(never)] family dispatchers. The new call_function is a thin match router (<100 lines). Per-frame stack pressure is now O(one family's locals) regardless of how many builtins exist.
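The shape of the split can be sketched in a few lines. This is a minimal, hypothetical model, not the code in src/interpreter/mod.rs: the `Builtin` and `Value` types, the three builtins other than `Len`, and the error strings are all assumptions made for illustration. Only the `call_function` name, the `dispatch_*_builtins` naming pattern, and the `#[inline(never)]` attribute come from the description above.

```rust
// Hypothetical stand-ins for the interpreter's real types.
#[derive(Clone, Copy, Debug, PartialEq)]
enum Builtin {
    Len,
    Abs,
    Max,
    Upper,
}

#[derive(Clone, Debug, PartialEq)]
enum Value {
    Num(f64),
    Text(String),
}

// Each family lives in its own non-inlined function, so its locals occupy
// stack only while that family is actually executing.
#[inline(never)]
fn dispatch_math_builtins(builtin: Builtin, args: &[Value]) -> Result<Value, String> {
    match (builtin, args) {
        (Builtin::Abs, [Value::Num(x)]) => Ok(Value::Num(x.abs())),
        (Builtin::Max, [Value::Num(a), Value::Num(b)]) => Ok(Value::Num(a.max(*b))),
        _ => Err(format!("bad call to {:?}", builtin)),
    }
}

#[inline(never)]
fn dispatch_text_builtins(builtin: Builtin, args: &[Value]) -> Result<Value, String> {
    match (builtin, args) {
        (Builtin::Upper, [Value::Text(s)]) => Ok(Value::Text(s.to_uppercase())),
        _ => Err(format!("bad call to {:?}", builtin)),
    }
}

// Thin router: one match, no per-builtin locals of its own.
// `Len` is handled inline here, mirroring the PR's choice for a trivial,
// high-frequency builtin.
fn call_function(builtin: Builtin, args: &[Value]) -> Result<Value, String> {
    match builtin {
        Builtin::Len => match args {
            [Value::Text(s)] => Ok(Value::Num(s.chars().count() as f64)),
            _ => Err("len expects one text argument".to_string()),
        },
        Builtin::Abs | Builtin::Max => dispatch_math_builtins(builtin, args),
        Builtin::Upper => dispatch_text_builtins(builtin, args),
    }
}
```

Without `#[inline(never)]` the compiler would be free to fold the family bodies back into the router, which could recreate the large combined frame in optimised builds.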

Repro before / after

Before (debug build, default 8 MiB stack):

ilo "fib n:n>n;<=n 1 n;a=fib -n 1;b=fib -n 2;+a b" fib 30
# => SIGSEGV / thread stack overflow at around depth 22-25

After:

ilo "fib n:n>n;<=n 1 n;a=fib -n 1;b=fib -n 2;+a b" fib 30
# => 832040
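As a cross-check on the numbers above, here is a small Rust sketch. It assumes the ilo one-liner reads as the usual doubly-recursive Fibonacci (return n when n <= 1, otherwise fib(n-1) + fib(n-2)); that reading of the syntax is an interpretation, and `stack_per_level` is a hypothetical helper introduced only for the arithmetic.

```rust
// Assumed Rust equivalent of the ilo `fib` program.
fn fib(n: u64) -> u64 {
    if n <= 1 {
        n
    } else {
        fib(n - 1) + fib(n - 2)
    }
}

// Rough upper bound on stack bytes available per recursion level if a stack
// of `stack_bytes` is exhausted at `depth` nested calls (integer division).
fn stack_per_level(stack_bytes: usize, depth: usize) -> usize {
    stack_bytes / depth
}
```

With an 8 MiB stack (8,388,608 bytes), overflowing at depth 22 to 25 works out to roughly 330 to 370 KiB per ilo-level call, which matches the "several hundred KiB" figure in the summary. One ilo call may span more than one Rust frame, so this is a per-level budget rather than the size of a single frame.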

What's in the diff

Commit 1 - split call_function into 14 inline-never family dispatchers

  • dispatch_map_builtins, dispatch_linalg_builtins, dispatch_math_builtins, dispatch_datetime_builtins, dispatch_basic_builtins, dispatch_list_builtins, dispatch_io_builtins, dispatch_text_builtins, dispatch_fs_builtins, dispatch_json_builtins, dispatch_hof_builtins, dispatch_stat_builtins, dispatch_regex_builtins, dispatch_fft_builtins
  • resolve_fn_ref and closure_captures promoted from nested functions inside call_function to module-level items, shared by list and HOF dispatchers
  • HOF dispatchers (list, hof) take env: &mut Env because srt/rsrt key-fn and map/flt/fld call call_function recursively
  • New call_function is a thin match builtin { ... } router with Len inlined (high-frequency, trivial)
  • No functional change to any builtin's behaviour
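The split between dispatchers that need `env` and those that do not might look like the sketch below. Everything here is a simplified assumption: the `Value` and `Env` types, string-keyed builtin names, the `dbl` and `add` builtins, and the argument order of `map` and `fld` are all hypothetical. What it tries to show is the structure described above: a module-level `resolve_fn_ref`, a HOF dispatcher that takes `env: &mut Env` and re-enters `call_function`, and a non-HOF dispatcher with no `env` parameter.

```rust
// Hypothetical stand-ins for the interpreter's real types.
#[derive(Clone, Debug, PartialEq)]
enum Value {
    Num(f64),
    List(Vec<Value>),
    FnRef(String),
}

#[derive(Default)]
struct Env {
    calls: usize, // counts call_function entries, for illustration only
}

// Module-level helper, so any dispatcher can use it.
fn resolve_fn_ref(v: &Value) -> Result<&str, String> {
    match v {
        Value::FnRef(name) => Ok(name.as_str()),
        other => Err(format!("expected a function reference, got {other:?}")),
    }
}

// Non-HOF family: needs no interpreter state.
#[inline(never)]
fn dispatch_math_builtins(name: &str, args: &[Value]) -> Result<Value, String> {
    match (name, args) {
        ("dbl", [Value::Num(x)]) => Ok(Value::Num(x * 2.0)),
        ("add", [Value::Num(a), Value::Num(b)]) => Ok(Value::Num(a + b)),
        _ => Err(format!("bad call to {name}")),
    }
}

// HOF family: takes `env` because it calls back into call_function.
#[inline(never)]
fn dispatch_hof_builtins(env: &mut Env, name: &str, args: &[Value]) -> Result<Value, String> {
    match (name, args) {
        ("map", [f, Value::List(items)]) => {
            let callee = resolve_fn_ref(f)?;
            let mut out = Vec::with_capacity(items.len());
            for item in items {
                out.push(call_function(env, callee, std::slice::from_ref(item))?);
            }
            Ok(Value::List(out))
        }
        ("fld", [f, init, Value::List(items)]) => {
            let callee = resolve_fn_ref(f)?;
            let mut acc = init.clone();
            for item in items {
                acc = call_function(env, callee, &[acc, item.clone()])?;
            }
            Ok(acc)
        }
        _ => Err(format!("bad call to {name}")),
    }
}

// Thin router.
fn call_function(env: &mut Env, name: &str, args: &[Value]) -> Result<Value, String> {
    env.calls += 1;
    match name {
        "map" | "fld" => dispatch_hof_builtins(env, name, args),
        "dbl" | "add" => dispatch_math_builtins(name, args),
        _ => Err(format!("unknown builtin {name}")),
    }
}
```

In this model a `map` over two items enters `call_function` three times: once for `map` itself and once per element. Each nested entry pays only for the router frame plus one family frame.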

Commit 2 - regression tests (tests/regression_call_function_stack.rs)

  • fib30_no_stack_overflow_vm / fib30_no_stack_overflow_jit
  • fold_range_no_stack_overflow_vm / fold_range_no_stack_overflow_jit
  • fib30_cross_engine_parity
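The contents of tests/regression_call_function_stack.rs are not shown here, so the following is only a sketch of one way such a test could pin its stack budget. It runs the workload on a thread with an explicit stack size via `std::thread::Builder::stack_size`. The `run_with_stack` helper is hypothetical, and the plain Rust `fib` and `sum_range` functions stand in for evaluating the ilo programs on the VM or JIT engine.

```rust
use std::thread;

// Hypothetical helper: run `f` on a fresh thread with a fixed stack size and
// return its result. A panic in `f` surfaces as Err from join().
fn run_with_stack<T, F>(stack_bytes: usize, f: F) -> thread::Result<T>
where
    F: FnOnce() -> T + Send + 'static,
    T: Send + 'static,
{
    thread::Builder::new()
        .stack_size(stack_bytes)
        .spawn(f)
        .expect("failed to spawn test thread")
        .join()
}

// Stand-in workloads; a real test would call into the interpreter instead.
fn fib(n: u64) -> u64 {
    if n <= 1 {
        n
    } else {
        fib(n - 1) + fib(n - 2)
    }
}

fn sum_range(n: u64) -> u64 {
    (0..n).fold(0, |acc, x| acc + x)
}
```

A stack overflow aborts the whole process rather than unwinding, so a regression shows up as a crashed test binary, not as an `Err` from `join()`.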

Commit 3 - examples/deep-recursion.ilo

  • fib(10/20/30) and sumto(100/1000) with -- run: / -- out: assertions; exercises all example engines

Test plan

  • cargo build clean (no warnings)
  • cargo clippy -- -D warnings clean
  • cargo test --lib - 3274 passed / 0 failed (AOT tests need libilo.a via symlink)
  • cargo test - all integration tests pass including new regression_call_function_stack
  • cargo test --test examples_engines - deep-recursion.ilo assertions pass
  • fib(30) = 832040 on VM and JIT in debug build, no overflow

Follow-ups

The monolithic ~4500-line call_function with ~134 arms caused stack
overflows in debug builds because rustc reserves stack slots for every
local in a frame simultaneously, even across mutually exclusive branches.
PR #494 and PR #500 both needed band-aids (per-test thread wrapper and
RUST_MIN_STACK=16MiB) as a result.

Extract the arms into 14 #[inline(never)] family dispatchers:
dispatch_map_builtins, dispatch_linalg_builtins, dispatch_math_builtins,
dispatch_datetime_builtins, dispatch_basic_builtins, dispatch_list_builtins,
dispatch_io_builtins, dispatch_text_builtins, dispatch_fs_builtins,
dispatch_json_builtins, dispatch_hof_builtins, dispatch_stat_builtins,
dispatch_regex_builtins, dispatch_fft_builtins.

The new call_function is a thin match router (<100 lines). Each recursive
call now uses only the router frame plus the one family that executes, so
per-frame pressure is O(one family's locals) regardless of total builtin count.

HOF dispatchers (list, hof) receive env: &mut Env because srt/rsrt key-fn
and map/flt/fld/etc. call call_function recursively. Non-HOF dispatchers
are pure.

resolve_fn_ref and closure_captures are promoted from nested functions
inside call_function to module-level items, visible to all dispatchers.

fib(30) = 832040 and fold over range 0..1000 (sum = 499500) must complete
on the default 8 MiB thread stack in debug builds. Both tests run on VM
and JIT engines to cover all call_function dispatch paths.

Before the fix these tests would have hit SIGSEGV at around fib(22-25)
depending on platform.

Demonstrates fib(30) and sum via fld over range 0..1000 - the same
patterns that would have overflowed with the old monolithic call_function.
Acts as an in-context learning example for agents and an additional
regression harness via examples_engines.

@codecov

codecov Bot commented May 20, 2026

Codecov Report

❌ Patch coverage is 55.64516% with 110 lines in your changes missing coverage. Please review.
✅ All tests successful. No failed tests found.

Files with missing lines | Patch % | Lines
src/interpreter/mod.rs | 55.64% | 110 Missing ⚠️

@danieljohnmorris
Collaborator Author

Closing to unstick the merge queue. The keep-both rebase strategy can't handle this PR cleanly (it touches the same dispatch table every other PR appends to, producing broken-brace artifacts on rebase). Will reimplement against current main after the rest of the queue drains.

@danieljohnmorris
Collaborator Author

Superseded by fresh-impl PR (in flight on feature/-v2 branch). The original PR's keep-both rebase produced broken Rust that can't be unstuck without manual brace surgery. The v2 PR is a clean reimpl against current main.
