Deep recursion ~25% slower since rc.0 — cheapen per-call depth-limit bookkeeping

## Summary

The `string.format` / `table` / dispatcher performance pass landing in `v1.0.0-rc.1` was a net win across the benchmark suite, but the `fibonacci` workload (fib(30), pure recursion) **regressed ~25%** versus `v1.0.0-rc.0`:

* rc.0: 1.64 ips (608.6 ms), ±0.69%
* main/rc.1: 1.23 ips (814.2 ms), ±0.42%

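Both figures come from full Benchee runs executed sequentially on a single machine. In wall-clock terms, the workload takes about 34% longer per iteration. Luerl and PUC-Lua (identical code in both runs) served as drift controls and moved within ±3%, so net of machine drift the regression is still ≈ −22%. That is far outside the sub-1% deviation reported for each run, so the regression is real, not noise.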

## Cause

PR #283 ("configurable max call depth", commit `3a4a0c8`) added per-call bookkeeping to the executor's call/return paths: a `State.check_call_depth!/1` function call plus `call_depth + 1` / `- 1` state updates on **every** Lua function call. `fibonacci` is ~2.7M calls with almost no work per call, so it pays that overhead with nothing to amortize against.
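To put a rough number on that overhead, the sketch below counts the invocations of a naive recursive `fib(30)` and divides the measured time difference by that count. It assumes the benchmark uses the usual two-branch definition with `n < 2` as the base case (the actual workload source is not shown here), and it attributes the whole rc.0 to main difference to the bookkeeping, so the result is a ballpark figure only. `FibCalls` is a hypothetical helper name.

```elixir
defmodule FibCalls do
  # Invocations made by naive recursive fib(n), assuming:
  #   calls(n) = 1                              for n < 2
  #   calls(n) = 1 + calls(n - 1) + calls(n - 2) otherwise
  def count(n) when n < 2, do: 1

  def count(n) do
    # {a, b} holds {calls(i - 2), calls(i - 1)} before step i
    {_, calls} = Enum.reduce(2..n, {1, 1}, fn _, {a, b} -> {b, 1 + a + b} end)
    calls
  end
end

calls = FibCalls.count(30)

# Timings copied from the Summary section (milliseconds per iteration).
rc0_ms = 608.6
main_ms = 814.2

# Extra nanoseconds per Lua call if the entire delta is call bookkeeping.
extra_ns = (main_ms - rc0_ms) * 1.0e6 / calls

IO.puts("calls: #{calls}")                                     # calls: 2692537
IO.puts("extra per call: ~#{Float.round(extra_ns, 1)} ns")     # roughly 76 ns
```

Roughly 76 ns of added cost per call is plausible for a remote function call plus two state updates, which is consistent with the diagnosis above.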

Workloads that do real work per call are unaffected or faster (OOP +41%, closures +4%) because the dispatcher gains (#275, #277) dominate there. The regression is specific to call-dense, work-light code.

## Options to investigate

* Inline the depth check as a plain integer comparison rather than a function call into `State`.
* Check depth every Nth frame instead of on every call.
* Derive depth from the existing `call_stack` length instead of maintaining a separate counter.
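As a minimal illustration of the first option, the sketch below performs the depth check as a guard on the function head, so the happy path is a single integer comparison with no call into another module. It is a toy model, not the executor's code: `DepthSketch`, `call/4`, and the idea of threading `depth` as a plain argument are all assumptions made for the example. Passing the depth as an argument also means there is no `- 1` update on return, because the caller's value is simply still in scope.

```elixir
defmodule DepthSketch do
  # Hypothetical miniature call path; none of these names come from the real executor.

  # Depth check as a guard: one integer comparison per call.
  def call(fun, arg, depth, max) when depth < max do
    fun.(arg, depth + 1)
  end

  # Slow path, only reached once the limit is hit.
  def call(_fun, _arg, _depth, max) do
    raise RuntimeError, "call depth limit (#{max}) exceeded"
  end

  # Call-dense, work-light workload routed through call/4.
  def fib(n, _depth, _max) when n < 2, do: n

  def fib(n, depth, max) do
    call(&fib(&1, &2, max), n - 1, depth, max) +
      call(&fib(&1, &2, max), n - 2, depth, max)
  end
end

# Within the limit: behaves like plain fib.
IO.inspect(DepthSketch.fib(10, 0, 64))

# Limit too low for the recursion depth: the guard fails and the slow path raises.
result =
  try do
    DepthSketch.fib(10, 0, 5)
  rescue
    e in RuntimeError -> {:error, e.message}
  end

IO.inspect(result)
```

The limit is still enforced on every call here, so this shape stays within the acceptance criteria below. Whether it recovers the regression in the real executor would need to be measured.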

## Acceptance criteria

* Recover most of the regression (target: within ~5% of rc.0 on `fibonacci`) **without** weakening the call-depth limit.
* Verify with `mix lua.bench --workload fibonacci` before/after on the same machine.

## References

* PR #283 (the cause)
* Benchmark report: `benchmarks/results/2026-06-02-rc0-vs-main.md`
* Called out under "Known issues" in the `v1.0.0-rc.1` CHANGELOG; to be fixed before `1.0.0` final.

