
perf(vm): dispatcher coverage for table opcodes and :numeric_for #275

Merged
davydog187 merged 4 commits into main from perf/dispatcher-tables
May 28, 2026

@davydog187
Contributor

Dispatcher table opcodes — make table-heavy workloads bypass the interpreter

Plan: .agents/plans/B5b-v2-dispatcher-tables.md
Closes #271

Goal

Extend Lua.VM.Dispatcher and Lua.Compiler.Bytecode to lower the
table opcode family plus :numeric_for. After this PR, all four
table_ops benchmarks and the closures orchestrator (run_closures)
compile end-to-end and stay out of the interpreter fallback path.

Success criteria

  • Lua.Compiler.Bytecode learns encoders for :new_table,
    :get_table, :set_table, :set_field, :set_list (basic
    non-multi-return form), :length, and :numeric_for (body
    encoded recursively; falls back if body contains an uncovered
    opcode or :break).
  • Lua.VM.Dispatcher gains one case branch per new opcode
    plus the :cps_for continuation marker handled by
    finish_body/6.
  • :call with result_count == 0 (statement-call form) is
    covered via a new @op_call_zero tag — required so
    run_table_sort can compile (table.sort(t) is a statement
    call).
  • Lua.VM.Executor exposes six new dispatcher_* public
    bridges (dispatcher_get_table, dispatcher_set_table,
    dispatcher_set_field, dispatcher_length,
    dispatcher_coerce_numeric_for_controls,
    dispatcher_close_open_upvalues_at_or_above) — each wraps an
    existing defp helper so metamethod fidelity matches the
    interpreter for free.
  • mix compile --warnings-as-errors passes.
  • mix test: 1883 → 1902 tests, 0 failures, 25 skipped.
  • mix test --only lua53: 29 tests, 0 failures, 18 skipped.
  • test/lua/vm/leak_regression_test.exs: 3 tests, 0 failures.
  • All four table_ops benchmark functions compile to bytecode
    end-to-end (verified by a new dispatcher-test setup that asserts
    {:compiled_closure, _, _} for each global after install).
  • run_closures orchestrator compiles (make_counter still
    falls back because of :closure — that's B5c-v2).
  • No workload regresses by more than 10% vs. interpreter on the
    dispatcher_vs_interpreter-style A/B harness.
  • Soft target run_table_sum(1000) ≥1.5x interpreter — landed
    at 1.13x. Profile shows Table.put/3 allocation churn
    and setelement/3 register writes dominate; both are
    explicitly out of scope per the parent plan ("table-storage
    churn is the post-B5b ceiling").
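
The bridge pattern in the criteria above (a public dispatcher_* function that only forwards to an existing private helper) can be sketched as follows. This is a minimal illustration, not the code in Lua.VM.Executor: the module name, the table shape (`%{data: ..., meta: ...}`), and the one-level `__index` lookup are all assumptions made for the example.

```elixir
defmodule BridgeSketch do
  # Hypothetical stand-in for an executor module: one private helper owns the
  # lookup semantics, and two public entry points forward to it.

  # Public bridge for the dispatcher. It adds no logic of its own, so the
  # dispatcher cannot drift from the interpreter's behaviour.
  def dispatcher_get_table(table, key), do: get_table(table, key)

  # The interpreter path reaches the same helper.
  def interpret({:get_table, table, key}), do: get_table(table, key)

  # Private helper: raw lookup first, then a (greatly simplified) one-level
  # __index table. The table shape here is an assumption for the sketch.
  defp get_table(%{data: data, meta: meta}, key) do
    case Map.fetch(data, key) do
      {:ok, value} -> value
      :error -> meta |> Map.get(:__index, %{}) |> Map.get(key)
    end
  end
end

table = %{data: %{"x" => 1}, meta: %{__index: %{"y" => 2}}}
IO.inspect(BridgeSketch.dispatcher_get_table(table, "x"))
IO.inspect(BridgeSketch.dispatcher_get_table(table, "y"))
```

Because both entry points call the same private function, any metamethod handling added to the helper is picked up by both paths at once.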

Performance

Mini-bench (mix run -e, 100–200 iter median, warmed):

| Workload | Dispatcher | Interpreter | Speedup |
|---|---|---|---|
| run_table_build(500) | 90 µs | 106 µs | 1.18x |
| run_table_sum(500) | 121 µs | 129 µs | 1.06x |
| run_table_sum(1000) | 254 µs | 289 µs | 1.13x |
| run_table_map_reduce(500) | 241 µs | 245 µs | 1.02x |
| fib(22) | 12.2 ms | 18.7 ms | 1.54x |

fib lands at 1.54x — well above B5a-v2's 1.17x median for fib(25),
a sign the table-opcode additions did not push the arithmetic
path off any inlining cliff.
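
For reference, the speedup column and the 10% regression floor can be recomputed with a few lines of Elixir. This is a sketch under two assumptions: speedup is taken as interpreter time divided by dispatcher time, and "regresses by more than 10%" is read as the dispatcher taking more than 1.10x the interpreter's time. The module and function names are hypothetical. Ratios recomputed from the rounded timings in the table may differ from the reported column in the last digit.

```elixir
defmodule SpeedupSketch do
  # Speedup = interpreter time / dispatcher time (both in the same unit).
  def speedup(dispatcher, interpreter), do: interpreter / dispatcher

  # Assumed reading of the hard floor: the dispatcher may be at most
  # max_regression (10% by default) slower than the interpreter.
  def within_floor?(dispatcher, interpreter, max_regression \\ 0.10) do
    dispatcher <= interpreter * (1 + max_regression)
  end
end

# Timings in microseconds, copied from the mini-bench table.
rows = [
  {"run_table_build(500)", 90, 106},
  {"run_table_sum(1000)", 254, 289},
  {"run_table_map_reduce(500)", 241, 245}
]

for {name, dispatcher, interpreter} <- rows do
  ratio = Float.round(SpeedupSketch.speedup(dispatcher, interpreter), 2)
  ok? = SpeedupSketch.within_floor?(dispatcher, interpreter)
  IO.puts("#{name}: #{ratio}x, within floor: #{ok?}")
end
```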

Discoveries

  1. :call with result_count == 0 was needed. run_table_sort
    calls table.sort(t) at statement position, which lowers to
    {:call, _, _, 0, _}. Added @op_call_zero (reusing slot 25,
    freed by the removed @op_test_true). The dispatcher branch
    shares the same shape as :call_one, with a :discard sentinel
    in the frame's base slot signalling "throw the return value
    away."

  2. string_ops orchestrators do not fully compile. The
    issue described "orchestrators in closures and string_ops"
    as compiling. run_closures does, but both string_ops orchestrators
    end with return table.concat(...) / return string.format(...).
    These are multi-return shapes (:call with result_count = -1 plus
    :return_vararg), which are explicitly out of scope per the
    parent plan. Documented as a B5c-v2 follow-up.

  3. :break inside :numeric_for forces fallback. The
    interpreter's :break walks find_loop_exit/1 against the
    continuation stack. Reproducing that in the dispatcher requires
    mixing post-test markers with {:loop_exit, _} markers — the
    same machinery generic-for / while-loop / break all want, which
    is the domain of B5c-v2. The encoder rejects :numeric_for
    bodies containing :break upfront (walking recursively through
    nested :test branches) and the whole enclosing prototype falls
    back.

  4. The :cps_for continuation marker integrates cleanly. B5a's
    cont stack carried only {code, pc} resume points. Adding
    {:cps_for, base, loop_var, body_bc, code, pc + 1} markers
    expands finish_body/6 from two clauses to three. The marker
    stays on the stack across iterations (re-pushed when the body
    restarts), so nested numeric-fors compose naturally on the same
    stack.

  5. Soft perf target missed; hard floor met. setelement/3 +
    Table.put/3 allocation dominate the table workloads. The
    parent plan explicitly flagged this and ruled mutable storage
    out of scope; the next plan for closing the table-workload gap
    is mutable register/table storage, not more dispatch work.
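
To make Discovery 4 concrete, here is a toy model of a continuation stack with a numeric-for marker and a three-clause finish_body. It is a sketch, not the dispatcher's code: the instruction shapes, the `{:resume, code}` entry, the map-based registers, the fixed step of 1, and carrying the limit inside the marker are all assumptions chosen to keep the example small. What it aims to show is the marker being re-pushed on each iteration and nested loops sharing one stack.

```elixir
defmodule CpsForSketch do
  # Continuation stack entries in this toy model:
  #   {:resume, code}                      plain resume point
  #   {:cps_for, var, limit, body, rest}   numeric-for marker; `rest` is the
  #                                        code to run once the loop is done
  def run(code, regs \\ %{}), do: exec(code, regs, [])

  # End of the current code list: ask the continuation stack what comes next.
  defp exec([], regs, cont), do: finish_body(cont, regs)

  defp exec([{:op, fun} | rest], regs, cont), do: exec(rest, fun.(regs), cont)

  defp exec([{:if, pred, then_body} | rest], regs, cont) do
    if pred.(regs),
      do: exec(then_body, regs, [{:resume, rest} | cont]),
      else: exec(rest, regs, cont)
  end

  # Step is fixed at 1 to keep the sketch short.
  defp exec([{:numeric_for, var, start, limit, body} | rest], regs, cont) do
    if start > limit do
      exec(rest, regs, cont)
    else
      exec(body, Map.put(regs, var, start), [{:cps_for, var, limit, body, rest} | cont])
    end
  end

  # Clause 1: nothing left to resume, execution is finished.
  defp finish_body([], regs), do: regs

  # Clause 2: plain resume point.
  defp finish_body([{:resume, code} | cont], regs), do: exec(code, regs, cont)

  # Clause 3: loop marker. Advance the counter, then either re-push the same
  # marker and restart the body, or drop it and continue after the loop.
  defp finish_body([{:cps_for, var, limit, body, rest} = marker | cont], regs) do
    next = Map.fetch!(regs, var) + 1

    if next <= limit,
      do: exec(body, Map.put(regs, var, next), [marker | cont]),
      else: exec(rest, regs, cont)
  end
end

program = [
  {:op, &Map.put(&1, :sum, 0)},
  {:numeric_for, :i, 1, 3,
   [{:numeric_for, :j, 1, 2, [{:op, fn r -> %{r | sum: r.sum + r.i * r.j} end}]}]}
]

IO.inspect(CpsForSketch.run(program).sum)
```

In this model the inner loop's marker sits above the outer loop's marker on the same stack, so finishing the inner loop simply exposes the outer marker to the next finish_body call.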

Changes

 .agents/plans/B5b-v2-dispatcher-tables.md   |   ~233  (status: review, discoveries appended)
 lib/lua/compiler/bytecode.ex                |   +95  (7 new opcodes + :call_zero + accessors)
 lib/lua/vm/dispatcher.ex                    |  +231  (7 new branches + :cps_for + helpers)
 lib/lua/vm/executor.ex                      |   +73  (6 dispatcher_* bridges)
 test/lua/compiler/bytecode_test.exs         |   ~54  (flip 2 fallback tests, add 3)
 test/lua/vm/dispatcher_test.exs             |  +306  (21 new goldens: 7 table, 6 numeric_for,
                                              |        4 table_ops shapes, 1 :call_zero, plus
                                              |        the test_ops setup block)

Verification

mix format                                            ✓
mix compile --warnings-as-errors                      ✓
mix test                                              ✓ 1902 tests, 0 failures
mix test --only lua53                                 ✓ 29 tests, 0 failures
mix test test/lua/vm/dispatcher_test.exs              ✓ 48 tests (27 → 48)
mix test test/lua/compiler/bytecode_test.exs          ✓ 15 tests (14 → 15)
mix test test/lua/vm/leak_regression_test.exs         ✓ 3 tests

Out of scope (intentional)

  • :closure opcode and varargs → B5c-v2.
  • Multi-return :call (result_count = -1) and :return_vararg
    → still B5c-v2. This is why string_ops orchestrators
    don't fully compile.
  • :while_loop, :repeat_loop, :generic_for → defer to a future
    plan (not blockers for table_ops).
  • :break inside :numeric_for → fallback for now. The
    loop-exit continuation machinery lands with the rest of B5c-v2.
  • :set_list with {:multi, _} → fallback.
  • Mutable table storage → its own follow-up plan if the table-workload
    gap matters.
  • Line attribution for table errors → B5d-v2. All new bridges
    pass line: 0.
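
The upfront :break rejection (Discovery 3, and the :break bullet above) amounts to a recursive scan of the loop body before encoding. The sketch below illustrates that idea only. The instruction shapes (`:break` as a bare atom, `{:test, reg, then_body, else_body}`, `{:numeric_for, base, loop_var, body}`) and the `:fallback` / `{:ok, ...}` return values are assumptions, not the actual Lua.Compiler.Bytecode representation.

```elixir
defmodule BreakScanSketch do
  # True if the body contains :break, including inside the then/else
  # branches of nested :test instructions.
  def contains_break?(body) when is_list(body), do: Enum.any?(body, &break?/1)

  defp break?(:break), do: true

  defp break?({:test, _reg, then_body, else_body}),
    do: contains_break?(then_body) or contains_break?(else_body)

  defp break?(_other), do: false

  # Hypothetical encoder entry point: refuse to encode a loop whose body
  # breaks, so the caller can fall back to the interpreter.
  def encode_numeric_for({:numeric_for, base, loop_var, body}) do
    if contains_break?(body),
      do: :fallback,
      else: {:ok, {:op_numeric_for, base, loop_var, body}}
  end
end

plain = {:numeric_for, 0, 3, [{:add, 1, 2}]}
nested = {:numeric_for, 0, 3, [{:test, 4, [{:add, 1, 2}], [{:test, 5, [:break], []}]}]}

IO.inspect(BreakScanSketch.encode_numeric_for(plain))
IO.inspect(BreakScanSketch.encode_numeric_for(nested))
```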

Extend the v2 dispatcher and bytecode encoder to cover the table
opcode family (:new_table, :get_table, :set_table, :set_field,
:set_list non-multi-return form, :length), :numeric_for with a
new :cps_for continuation marker, and :call with result_count == 0
(the statement-call form like table.sort(t)). After this PR all four
table_ops benchmarks compile end-to-end and the run_closures
orchestrator joins them; only multi-return shapes (return f(...),
:return_vararg) still keep workloads on the interpreter.
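
As an illustration of the statement-call form, the :discard sentinel from Discovery 1 can be modelled as one extra function clause on the path that stores a call's result. The module name, the register tuple, and the store_result/3 function are hypothetical; the sketch only shows how a sentinel in the destination slot lets :call_zero reuse the :call_one shape.

```elixir
defmodule CallZeroSketch do
  # Hypothetical frame model: a register tuple plus a "base" slot naming where
  # the callee's single return value should land.

  # Statement-call form (e.g. table.sort(t)): drop the value and leave the
  # registers untouched.
  def store_result(regs, :discard, _value), do: regs

  # :call_one form: write the value into the destination register (0-based).
  def store_result(regs, base, value) when is_integer(base) do
    put_elem(regs, base, value)
  end
end

regs = {nil, nil, nil}
IO.inspect(CallZeroSketch.store_result(regs, 1, :sorted))
IO.inspect(CallZeroSketch.store_result(regs, :discard, :sorted))
```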

Mini-bench: fib(22) 1.54x interpreter (up from 1.17x in B5a-v2),
run_table_build(500) 1.18x, run_table_sum(1000) 1.13x,
run_table_map_reduce(500) 1.02x. The soft target (>=1.5x on
table_sum) is not met because Table.put/3 allocation churn and
setelement/3 register writes dominate the table workloads — both
flagged as out of scope by the parent plan. The hard floor (no
regression) is met across the board.

The dispatcher delegates the slow paths to new Executor.dispatcher_*
bridges (dispatcher_get_table, dispatcher_set_table,
dispatcher_set_field, dispatcher_length,
dispatcher_coerce_numeric_for_controls,
dispatcher_close_open_upvalues_at_or_above), each wrapping the
existing defp helpers so metamethod fidelity matches the
interpreter for free.

Plan: B5b-v2
Closes #271
- Tighten `:set_list` encoder guard to `count > 0`. The literal-count
  zero shape is the interpreter's multi-return sentinel (consumes
  `state.multi_return_count` trailing values); codegen never emits
  it from a literal constructor today, but encoding it as a no-op
  would silently diverge if that ever changed. Two new tests pin
  the fallback contract for both `count == 0` and `{:multi, _}`.

- Fix misleading comment about step == 0 in `:numeric_for` body
  completion. Neither the dispatcher nor the interpreter implements
  PUC-Lua's runtime check; both infinite-loop on step == 0. Parity
  is preserved; the comment now describes it accurately.

- Alias `Lua.VM.Table` at the top of `Lua.VM.Dispatcher` and use
  the short `Table.put/3` form in `set_list_into_table/6`,
  matching the other sibling-module aliases.

- Tighten `@spec` on `dispatcher_set_table` / `dispatcher_set_field`
  to `State.t() | no_return()` (the non-tref clauses always raise).

- Document the unused `_proto` parameter on `dispatcher_length` as
  forward-compat for B5d-v2 error attribution. The other table
  bridges already thread `proto.source` through their slow paths;
  `__len` does not yet, but will when error positions land for
  compiled prototypes.
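
The tightened `:set_list` guard from the first item can be pictured as a guard clause that only accepts a literal positive count, with everything else falling back. This is a sketch with an assumed instruction shape (`{:set_list, table, start, count, offset}`) and assumed return values. The real encoder is described above as signalling fallback by having `Bytecode.compile/1` return `bytecode == nil`, which this example does not reproduce.

```elixir
defmodule SetListGuardSketch do
  # Literal, positive count: safe to lower to bytecode.
  def encode({:set_list, table, start, count, offset})
      when is_integer(count) and count > 0 do
    {:ok, {:op_set_list, table, start, count, offset}}
  end

  # count == 0 (the multi-return sentinel) and {:multi, _} are both rejected,
  # so the enclosing prototype stays on the interpreter.
  def encode({:set_list, _table, _start, _count, _offset}), do: :fallback
end

IO.inspect(SetListGuardSketch.encode({:set_list, 0, 1, 3, 0}))
IO.inspect(SetListGuardSketch.encode({:set_list, 0, 1, 0, 0}))
IO.inspect(SetListGuardSketch.encode({:set_list, 0, 1, {:multi, 2}, 0}))
```

Rejecting the zero count explicitly, instead of encoding it as a no-op, keeps the encoder and interpreter in agreement even if codegen later starts emitting that shape.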
@davydog187
Contributor Author

Addressed the review feedback in 75e616d. Summary:

| # | Issue | Fix |
|---|---|---|
| 1 | Misleading step == 0 comment in :numeric_for | Rewrote to accurately note that both executors infinite-loop on step == 0. Parity is preserved; there is no PUC-style runtime check on either path. |
| 2 | Latent encoder/dispatcher divergence on :set_list count == 0 | Tightened the encoder guard to count > 0. Added two regression tests (count == 0 and {:multi, _}) that build synthetic prototypes and assert Bytecode.compile/1 returns bytecode == nil. Pins the contract so a future codegen change can't silently diverge. |
| 3 | Inconsistent aliasing of Lua.VM.Table | Added alias Lua.VM.Table at the top; set_list_into_table/6 now uses the short Table.put/3 form. |
| 4 | @spec overstates return for dispatcher_set_table/dispatcher_set_field | Tightened to State.t() \| no_return(). |
| 5 | Unused _proto on dispatcher_length | Added a comment documenting forward-compat intent for B5d-v2 error attribution. The other table bridges already thread proto.source; __len will join them when error positions land. |

Verification:

mix format --check-formatted                         ✓
mix compile --warnings-as-errors                     ✓
mix test                                             ✓ 1902 → 1904 tests, 0 failures
mix test --only lua53                                ✓ 29 tests, 0 failures
mix test test/lua/compiler/bytecode_test.exs         ✓ 17 → 19 tests
mix test test/lua/vm/dispatcher_test.exs             ✓ 48 tests
mix test test/lua/vm/leak_regression_test.exs        ✓ 3 tests

@davydog187
Contributor Author

Benchmarks

dave@dave-mac-mini ~/code/tvlabs/lua (perf/dispatcher-tables) $ mix lua.bench
==> closures
Compiling 16 files (.ex)
Generated lua app
Operating System: macOS
CPU Information: Apple M4
Number of Available Cores: 10
Available memory: 32 GB
Elixir 1.19.4
Erlang 27.3.4.7
JIT enabled: true

Benchmark suite executing with the following configuration:
warmup: 1 s
time: 3 s
memory time: 0 ns
reduction time: 0 ns
parallel: 1
inputs: none specified
Estimated total run time: 16 s
Excluding outliers: false

Benchmarking C Lua (luaport) ...
Benchmarking lua (chunk) ...
Benchmarking lua (eval) ...
Benchmarking luerl ...
Calculating statistics...
Formatting results...

Name                      ips        average  deviation         median         99th %
C Lua (luaport)       32.86 K       30.44 μs    ±16.72%          29 μs       43.13 μs
luerl                  2.62 K      381.93 μs     ±4.79%      379.50 μs      442.03 μs
lua (chunk)            2.38 K      420.66 μs     ±5.42%      415.88 μs      493.72 μs
lua (eval)             2.35 K      426.12 μs     ±5.49%      421.29 μs      498.47 μs

Comparison:
C Lua (luaport)       32.86 K
luerl                  2.62 K - 12.55x slower +351.50 μs
lua (chunk)            2.38 K - 13.82x slower +390.23 μs
lua (eval)             2.35 K - 14.00x slower +395.69 μs
==> dispatcher_vs_interpreter

--- closure tags ---
dispatcher: :compiled_closure
interpreter: :lua_closure

fib(20) dispatcher=6765 interpreter=6765 match=true

Operating System: macOS
CPU Information: Apple M4
Number of Available Cores: 10
Available memory: 32 GB
Elixir 1.19.4
Erlang 27.3.4.7
JIT enabled: true

Benchmark suite executing with the following configuration:
warmup: 1 s
time: 3 s
memory time: 0 ns
reduction time: 0 ns
parallel: 1
inputs: none specified
Estimated total run time: 8 s
Excluding outliers: false

Benchmarking dispatcher fib(25) ...
Benchmarking interpreter fib(25) ...
Calculating statistics...
Formatting results...

Name                          ips        average  deviation         median         99th %
dispatcher fib(25)          18.46       54.17 ms     ±1.31%       53.88 ms       56.02 ms
interpreter fib(25)         13.35       74.89 ms     ±1.13%       74.74 ms       76.68 ms

Comparison:
dispatcher fib(25)          18.46
interpreter fib(25)         13.35 - 1.38x slower +20.72 ms
==> fibonacci
Operating System: macOS
CPU Information: Apple M4
Number of Available Cores: 10
Available memory: 32 GB
Elixir 1.19.4
Erlang 27.3.4.7
JIT enabled: true

Benchmark suite executing with the following configuration:
warmup: 1 s
time: 3 s
memory time: 0 ns
reduction time: 0 ns
parallel: 1
inputs: none specified
Estimated total run time: 16 s
Excluding outliers: false

Benchmarking C Lua (luaport) ...
Benchmarking lua (chunk) ...
Benchmarking lua (eval) ...
Benchmarking luerl ...
Calculating statistics...
Formatting results...

Name                      ips        average  deviation         median         99th %
C Lua (luaport)         37.53       26.65 ms     ±1.32%       26.58 ms       28.03 ms
lua (eval)               1.63      614.51 ms     ±0.68%      614.71 ms      619.56 ms
lua (chunk)              1.62      617.95 ms     ±0.70%      617.21 ms      625.23 ms
luerl                    1.40      715.25 ms     ±1.30%      710.93 ms      725.36 ms

Comparison:
C Lua (luaport)         37.53
lua (eval)               1.63 - 23.06x slower +587.86 ms
lua (chunk)              1.62 - 23.19x slower +591.31 ms
luerl                    1.40 - 26.84x slower +688.60 ms
==> helpers
==> oop
Operating System: macOS
CPU Information: Apple M4
Number of Available Cores: 10
Available memory: 32 GB
Elixir 1.19.4
Erlang 27.3.4.7
JIT enabled: true

Benchmark suite executing with the following configuration:
warmup: 1 s
time: 3 s
memory time: 0 ns
reduction time: 0 ns
parallel: 1
inputs: none specified
Estimated total run time: 16 s
Excluding outliers: false

Benchmarking C Lua (luaport) ...
Benchmarking lua (chunk) ...
Benchmarking lua (eval) ...
Benchmarking luerl ...
Calculating statistics...
Formatting results...

Name                      ips        average  deviation         median         99th %
C Lua (luaport)       33.61 K       29.76 μs    ±14.35%       28.92 μs       39.67 μs
luerl                  8.16 K      122.51 μs    ±17.80%      115.88 μs      199.08 μs
lua (chunk)            7.68 K      130.28 μs    ±20.63%      122.46 μs      243.83 μs
lua (eval)             7.32 K      136.61 μs    ±18.15%      129.96 μs      231.50 μs

Comparison:
C Lua (luaport)       33.61 K
luerl                  8.16 K - 4.12x slower +92.75 μs
lua (chunk)            7.68 K - 4.38x slower +100.53 μs
lua (eval)             7.32 K - 4.59x slower +106.85 μs
==> string_ops

=== String Concatenation via table.concat (n=100) (mode: quick) ===

Operating System: macOS
CPU Information: Apple M4
Number of Available Cores: 10
Available memory: 32 GB
Elixir 1.19.4
Erlang 27.3.4.7
JIT enabled: true

Benchmark suite executing with the following configuration:
warmup: 1 s
time: 3 s
memory time: 0 ns
reduction time: 0 ns
parallel: 1
inputs: none specified
Estimated total run time: 16 s
Excluding outliers: false

Benchmarking C Lua (luaport) ...
Benchmarking lua (chunk) ...
Benchmarking lua (eval) ...
Benchmarking luerl ...
Calculating statistics...
Formatting results...

Name                      ips        average  deviation         median         99th %
C Lua (luaport)       55.68 K       17.96 μs    ±19.29%       17.38 μs       24.42 μs
luerl                 24.73 K       40.44 μs     ±8.25%       40.48 μs       53.21 μs
lua (chunk)           22.75 K       43.96 μs    ±16.99%       46.67 μs       56.51 μs
lua (eval)            20.78 K       48.13 μs    ±16.12%       49.63 μs       64.17 μs

Comparison:
C Lua (luaport)       55.68 K
luerl                 24.73 K - 2.25x slower +22.48 μs
lua (chunk)           22.75 K - 2.45x slower +26.00 μs
lua (eval)            20.78 K - 2.68x slower +30.17 μs

=== String Formatting via string.format (n=100) (mode: quick) ===

Operating System: macOS
CPU Information: Apple M4
Number of Available Cores: 10
Available memory: 32 GB
Elixir 1.19.4
Erlang 27.3.4.7
JIT enabled: true

Benchmark suite executing with the following configuration:
warmup: 1 s
time: 3 s
memory time: 0 ns
reduction time: 0 ns
parallel: 1
inputs: none specified
Estimated total run time: 16 s
Excluding outliers: false

Benchmarking C Lua (luaport) ...
Benchmarking lua (chunk) ...
Benchmarking lua (eval) ...
Benchmarking luerl ...
Calculating statistics...
Formatting results...

Name                      ips        average  deviation         median         99th %
C Lua (luaport)       30.07 K       33.25 μs     ±6.85%       32.25 μs       39.84 μs
luerl                  9.58 K      104.40 μs    ±10.41%      103.13 μs      138.01 μs
lua (chunk)            5.43 K      184.13 μs    ±16.87%      175.21 μs      307.68 μs
lua (eval)             5.32 K      187.88 μs    ±16.14%      179.13 μs      304.97 μs

Comparison:
C Lua (luaport)       30.07 K
luerl                  9.58 K - 3.14x slower +71.14 μs
lua (chunk)            5.43 K - 5.54x slower +150.88 μs
lua (eval)             5.32 K - 5.65x slower +154.62 μs
==> table_ops

=== Table Build (mode: quick) ===

Operating System: macOS
CPU Information: Apple M4
Number of Available Cores: 10
Available memory: 32 GB
Elixir 1.19.4
Erlang 27.3.4.7
JIT enabled: true

Benchmark suite executing with the following configuration:
warmup: 1 s
time: 3 s
memory time: 0 ns
reduction time: 0 ns
parallel: 1
inputs: medium (n=100)
Estimated total run time: 16 s
Excluding outliers: false

Benchmarking C Lua (luaport) with input medium (n=100) ...
Benchmarking lua (chunk) with input medium (n=100) ...
Benchmarking lua (eval) with input medium (n=100) ...
Benchmarking luerl with input medium (n=100) ...
Calculating statistics...
Formatting results...

##### With input medium (n=100) #####
Name                      ips        average  deviation         median         99th %
C Lua (luaport)      105.28 K        9.50 μs    ±21.11%        9.21 μs       13.54 μs
lua (chunk)           63.07 K       15.85 μs     ±9.24%       15.92 μs       19.92 μs
luerl                 54.89 K       18.22 μs    ±14.61%          18 μs       24.79 μs
lua (eval)            54.61 K       18.31 μs    ±12.32%       18.08 μs       25.67 μs

Comparison:
C Lua (luaport)      105.28 K
lua (chunk)           63.07 K - 1.67x slower +6.36 μs
luerl                 54.89 K - 1.92x slower +8.72 μs
lua (eval)            54.61 K - 1.93x slower +8.81 μs

=== Table Sort (mode: quick) ===

Operating System: macOS
CPU Information: Apple M4
Number of Available Cores: 10
Available memory: 32 GB
Elixir 1.19.4
Erlang 27.3.4.7
JIT enabled: true

Benchmark suite executing with the following configuration:
warmup: 1 s
time: 3 s
memory time: 0 ns
reduction time: 0 ns
parallel: 1
inputs: medium (n=100)
Estimated total run time: 16 s
Excluding outliers: false

Benchmarking C Lua (luaport) with input medium (n=100) ...
Benchmarking lua (chunk) with input medium (n=100) ...
Benchmarking lua (eval) with input medium (n=100) ...
Benchmarking luerl with input medium (n=100) ...
Calculating statistics...
Formatting results...

##### With input medium (n=100) #####
Name                      ips        average  deviation         median         99th %
C Lua (luaport)       55.81 K       17.92 μs    ±14.53%       17.54 μs       22.63 μs
luerl                 38.13 K       26.23 μs    ±36.19%       21.21 μs       46.04 μs
lua (eval)            27.94 K       35.79 μs    ±11.16%       35.54 μs       52.04 μs
lua (chunk)           26.83 K       37.27 μs    ±19.89%       34.38 μs       56.38 μs

Comparison:
C Lua (luaport)       55.81 K
luerl                 38.13 K - 1.46x slower +8.31 μs
lua (eval)            27.94 K - 2.00x slower +17.87 μs
lua (chunk)           26.83 K - 2.08x slower +19.35 μs

=== Table Iterate/Sum (mode: quick) ===

Operating System: macOS
CPU Information: Apple M4
Number of Available Cores: 10
Available memory: 32 GB
Elixir 1.19.4
Erlang 27.3.4.7
JIT enabled: true

Benchmark suite executing with the following configuration:
warmup: 1 s
time: 3 s
memory time: 0 ns
reduction time: 0 ns
parallel: 1
inputs: medium (n=100)
Estimated total run time: 16 s
Excluding outliers: false

Benchmarking C Lua (luaport) with input medium (n=100) ...
Benchmarking lua (chunk) with input medium (n=100) ...
Benchmarking lua (eval) with input medium (n=100) ...
Benchmarking luerl with input medium (n=100) ...
Calculating statistics...
Formatting results...

##### With input medium (n=100) #####
Name                      ips        average  deviation         median         99th %
C Lua (luaport)      104.66 K        9.56 μs    ±20.41%        9.33 μs       13.29 μs
lua (chunk)           44.04 K       22.70 μs    ±17.59%       21.79 μs       40.25 μs
lua (eval)            39.91 K       25.06 μs    ±13.24%       24.75 μs       35.25 μs
luerl                 35.65 K       28.05 μs     ±7.87%          28 μs       34.13 μs

Comparison:
C Lua (luaport)      104.66 K
lua (chunk)           44.04 K - 2.38x slower +13.15 μs
lua (eval)            39.91 K - 2.62x slower +15.50 μs
luerl                 35.65 K - 2.94x slower +18.50 μs

=== Table Map + Reduce (mode: quick) ===

Operating System: macOS
CPU Information: Apple M4
Number of Available Cores: 10
Available memory: 32 GB
Elixir 1.19.4
Erlang 27.3.4.7
JIT enabled: true

Benchmark suite executing with the following configuration:
warmup: 1 s
time: 3 s
memory time: 0 ns
reduction time: 0 ns
parallel: 1
inputs: medium (n=100)
Estimated total run time: 16 s
Excluding outliers: false

Benchmarking C Lua (luaport) with input medium (n=100) ...
Benchmarking lua (chunk) with input medium (n=100) ...
Benchmarking lua (eval) with input medium (n=100) ...
Benchmarking luerl with input medium (n=100) ...
Calculating statistics...
Formatting results...

##### With input medium (n=100) #####
Name                      ips        average  deviation         median         99th %
C Lua (luaport)       88.87 K       11.25 μs    ±20.19%       11.29 μs       15.38 μs
lua (eval)            20.87 K       47.91 μs    ±13.73%       46.58 μs          69 μs
lua (chunk)           19.37 K       51.63 μs    ±18.05%       52.63 μs       73.50 μs
luerl                 16.70 K       59.88 μs    ±29.83%          52 μs      113.21 μs

Comparison:
C Lua (luaport)       88.87 K
lua (eval)            20.87 K - 4.26x slower +36.66 μs
lua (chunk)           19.37 K - 4.59x slower +40.38 μs
luerl                 16.70 K - 5.32x slower +48.63 μs

@davydog187 davydog187 merged commit 262b9be into main May 28, 2026
5 checks passed
@davydog187 davydog187 deleted the perf/dispatcher-tables branch May 28, 2026 12:57

Development

Successfully merging this pull request may close these issues.

Dispatcher: cover table opcodes and :numeric_for (B5b-v2)

1 participant