Skip to content

[claude] bench: measure FlatBuffers verification cost for Layout and Array#8014

Open
joseph-isaacs wants to merge 2 commits into
developfrom
claude/flatbuffers-memory-safety-XKbWQ
Open

[claude] bench: measure FlatBuffers verification cost for Layout and Array#8014
joseph-isaacs wants to merge 2 commits into
developfrom
claude/flatbuffers-memory-safety-XKbWQ

Conversation

@joseph-isaacs
Copy link
Copy Markdown
Contributor

Adds a divan benchmark in vortex-layout/benches/flatbuffer_verify.rs
that compares root:: (checked), root_with_opts::, and
root_unchecked:: for representative Layout and Array shapes.

Results on this machine (medians, release):

Layout (per file open):
1 x 8 (2.0 KB) -> 726 ns checked, 2.6 ns unchecked
16 x 32 (34 KB) -> 33.3 us checked
128 x 32 (295 KB) -> 277 us checked
1024 x 32 (2.5 MB)-> 2.23 ms checked

Array (per SerializedArray decode and per buffer_lengths() call):
8 fields (2.0 KB) -> 636 ns checked
100 fields(6.0 KB) -> 5.8 us checked
1000 fields(44 KB) -> 56 us checked

Key findings:

  • root_unchecked is ~3 ns regardless of size; the verifier IS the cost.
  • root_with_opts is the same cost as root - the Vortex VerifierOptions
    knob is a DoS bound, not a perf knob.
  • Verifier walks roughly O(table count + buffer size), ~100 ns/KB.

Motivates dropping the redundant root:: re-verification inside
SerializedArray::buffer_lengths() (vortex-array/src/serde.rs:514), which
runs on an already-validated buffer.

Signed-off-by: Claude noreply@anthropic.com

claude added 2 commits May 17, 2026 22:38
Adds a divan benchmark in vortex-layout/benches/flatbuffer_verify.rs
that compares root::<T> (checked), root_with_opts::<T>, and
root_unchecked::<T> for representative Layout and Array shapes.

Results on this machine (medians, release):

  Layout (per file open):
    1 x 8   (2.0 KB)  -> 726 ns checked, 2.6 ns unchecked
    16 x 32 (34 KB)   -> 33.3 us checked
    128 x 32 (295 KB) -> 277 us checked
    1024 x 32 (2.5 MB)-> 2.23 ms checked

  Array (per SerializedArray decode and per buffer_lengths() call):
    8 fields  (2.0 KB) -> 636 ns checked
    100 fields(6.0 KB) -> 5.8 us checked
    1000 fields(44 KB) -> 56 us checked

Key findings:
- root_unchecked is ~3 ns regardless of size; the verifier IS the cost.
- root_with_opts is the same cost as root - the Vortex VerifierOptions
  knob is a DoS bound, not a perf knob.
- Verifier walks roughly O(table count + buffer size), ~100 ns/KB.

Motivates dropping the redundant root::<Array> re-verification inside
SerializedArray::buffer_lengths() (vortex-array/src/serde.rs:514), which
runs on an already-validated buffer.

Signed-off-by: Claude <noreply@anthropic.com>
…uffer_lengths

`SerializedArray::buffer_lengths()` was calling `root::<fba::Array>()` on every
invocation, which re-runs the full FlatBuffer verifier across the array tree.
The buffer is already verified once at construction time by `validate_array_tree`
(line 527), and the rest of `SerializedArray` already exploits this invariant -
see `from_flatbuffer_and_segment_with_overrides` at line 614, which uses
`fba::root_as_array_unchecked` with the same safety justification.

Switch `buffer_lengths` to `fba::root_as_array_unchecked` for consistency.

Measured on the same workload with the new buffer_lengths bench:

  n_fields  fixed    legacy(root::<Array>)  speedup
  1          26 ns    237 ns                  9x
  8          35 ns    772 ns                 22x
  32         59 ns    2.56 us                43x
  100       140 ns    7.75 us                55x
  1000      1.13 us  73.7 us                 65x

Post-fix cost grows with n_fields purely from the Vec<usize> allocation +
iteration to extract buffer descriptor lengths; the verifier overhead is gone.

Signed-off-by: Claude <noreply@anthropic.com>
@joseph-isaacs joseph-isaacs changed the title bench: measure FlatBuffers verification cost for Layout and Array [claude] bench: measure FlatBuffers verification cost for Layout and Array May 18, 2026
@codspeed-hq
Copy link
Copy Markdown

codspeed-hq Bot commented May 18, 2026

Merging this PR will not alter performance

⚠️ Unknown Walltime execution environment detected

Using the Walltime instrument on standard Hosted Runners will lead to inconsistent data.

For the most accurate results, we recommend using CodSpeed Macro Runners: bare-metal machines fine-tuned for performance measurement consistency.

⚡ 1 improved benchmark
❌ 1 regressed benchmark
✅ 1235 untouched benchmarks
🆕 10 new benchmarks

Warning

Please fix the performance issues or acknowledge them on CodSpeed.

Performance Changes

Mode Benchmark BASE HEAD Efficiency
🆕 Simulation buffer_lengths_fixed[1000] N/A 16.2 µs N/A
🆕 Simulation buffer_lengths_fixed[1] N/A 1.7 µs N/A
🆕 Simulation buffer_lengths_fixed[100] N/A 3 µs N/A
🆕 Simulation buffer_lengths_fixed[32] N/A 2.1 µs N/A
🆕 Simulation buffer_lengths_fixed[8] N/A 1.8 µs N/A
🆕 Simulation buffer_lengths_legacy_root[1000] N/A 336.3 µs N/A
🆕 Simulation buffer_lengths_legacy_root[100] N/A 39 µs N/A
🆕 Simulation buffer_lengths_legacy_root[1] N/A 6.4 µs N/A
🆕 Simulation buffer_lengths_legacy_root[32] N/A 16.6 µs N/A
🆕 Simulation buffer_lengths_legacy_root[8] N/A 8.8 µs N/A
Simulation chunked_varbinview_canonical_into[(100, 100)] 273.3 µs 308 µs -11.28%
Simulation chunked_varbinview_opt_canonical_into[(1000, 10)] 224.8 µs 187.7 µs +19.75%

Tip

Investigate this regression by commenting @codspeedbot fix this regression on this PR, or directly use the CodSpeed MCP with your agent.


Comparing claude/flatbuffers-memory-safety-XKbWQ (172e0c7) with develop (7b47788)

Open in CodSpeed

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants