Codex/splayed clickbench by singaraiona · Pull Request #212 · RayforceDB/rayforce

singaraiona · 2026-05-22T13:18:32Z

No description provided.

The fused top-count path (use_emit_filter, n_keys>1) open-addresses into one monolithic table sized to the row count. For a 10M-row composite group-by — q32 `by {WatchID, ClientIP}` — that table is ~16M slots across three arrays (~290 MB); every probe is a cache miss and the loop runs at memory latency. group.c already has a radix-partitioned group path (radix_phase1/2/3) that partitions rows into cache-sized chunks and groups each locally. Gate the monolithic path to cap <= 2^19 slots; above that, bail via the existing `goto skip_top_count_filter` to the radix path. ClickBench 10M splayed: q32 1020->247ms, q18 1293->418ms, q16 567->246ms. 2407/2408 tests pass.
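The gating rule described above can be sketched as follows. This is a minimal illustration, not the code in group.c: `table_capacity`, `use_monolithic_table`, and `MONO_MAX_SLOTS` are hypothetical names, and the power-of-two sizing policy is an assumption chosen because it reproduces the ~16M slots quoted for 10M rows.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical constant mirroring the 2^19-slot cap from the commit message. */
#define MONO_MAX_SLOTS ((uint64_t)1 << 19)

/* Assumed sizing policy: smallest power of two >= n_rows (minimum 16).
 * The real sizing in group.c may differ. */
static uint64_t table_capacity(uint64_t n_rows) {
    assert(n_rows <= ((uint64_t)1 << 62)); /* keep the shift from overflowing */
    uint64_t cap = 16;
    while (cap < n_rows) cap <<= 1;
    return cap;
}

/* true  -> keep the single open-addressed table (fits the cap);
 * false -> caller should fall through to the radix-partitioned path. */
static bool use_monolithic_table(uint64_t n_rows) {
    return table_capacity(n_rows) <= MONO_MAX_SLOTS;
}
```

Under these assumptions a 10M-row input sizes to 2^24 slots and is routed to the radix path, while inputs of up to 2^19 rows stay on the monolithic table.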

Two algorithmic fixes to exec_group:

1. strlen-on-SYM aggregates (avg/sum(strlen(sym_col))) resolved the symbol per row via ray_sym_str, which takes a spin-lock and does a dict lookup on each call. ClickBench q27 `avg(strlen URL)` over 10M rows spent 91% of its time in ray_sym_str/sym_lock. Borrow the sym→string snapshot once before the da_accum dispatch and index it lock-free in both da_accum_row paths (all-SUM fast path and the general path). q27: 1111 ms -> ~100 ms.

2. The single-key sparse_i64 top-count path open-addresses into a monolithic table that rehashes up to ~16M entries for a high-cardinality key (q15 `by UserID`), which thrashes the cache. Above 2^21 rows, bail to the radix-partitioned path. q15: 357 ms -> 230 ms.

2407/2408 tests pass.
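The first fix amounts to hoisting the symbol lookup out of the per-row loop. A minimal sketch of that idea, assuming a symbol table that can be viewed as a plain array of C strings indexed by symbol id: `sym_snapshot` and `avg_strlen` are hypothetical names for illustration and are not the rayforce API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical snapshot: symbol id -> NUL-terminated string.
 * Assumed to be borrowed once, under the lock, before the loop starts. */
typedef struct {
    const char *const *strs; /* strs[id] is the text of symbol id */
    size_t n;                /* number of symbols in the snapshot */
} sym_snapshot;

/* Average string length over a column of symbol ids.
 * The loop only reads the snapshot, so it takes no lock per row. */
static double avg_strlen(const sym_snapshot *snap,
                         const int64_t *sym_ids, size_t n_rows) {
    if (n_rows == 0) return 0.0;
    uint64_t total = 0;
    for (size_t i = 0; i < n_rows; i++) {
        assert(sym_ids[i] >= 0 && (size_t)sym_ids[i] < snap->n);
        total += strlen(snap->strs[sym_ids[i]]);
    }
    return (double)total / (double)n_rows;
}
```

The sketch assumes the snapshot stays valid for the duration of the loop; how the real code guarantees that is not stated in the commit message.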

…oup-by

The radix dispatch was gated `!rowsel`, so any composite group-by with a WHERE clause fell back to the monolithic hash, scanning all 10M rows into a row-count-sized table. q31 (`by {WatchID,ClientIP}` with `where SearchPhrase<>''`) ran 650 ms despite the filter cutting the input ~20x. radix_phase1_fn already honours c->rowsel (it skips non-passing rows during the scatter), so the partitioned data only ever holds passing rows; the gate was over-conservative. Dropping it routes filtered composite group-bys onto the cache-friendly radix path. q31: 650 ms -> 175 ms. 2407/2408 tests pass.
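To illustrate why a scatter that honours the row selection leaves only passing rows in the partitions, here is a minimal two-pass radix scatter. It is a sketch under stated assumptions, not radix_phase1_fn itself: the byte-per-row `rowsel` mask, the multiplicative hash, and the names `part_of` and `radix_scatter` are all hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define RADIX_BITS 2
#define N_PARTS (1u << RADIX_BITS)

/* Partition index from the top bits of a multiplicative hash (assumed). */
static unsigned part_of(uint64_t key) {
    return (unsigned)((key * 0x9E3779B97F4A7C15ULL) >> (64 - RADIX_BITS));
}

/* Scatter passing keys into `out`, grouped by partition.
 * rowsel: one byte per row, nonzero = row passes; NULL = all rows pass.
 * part_start[p]..part_start[p+1] delimits partition p in `out`.
 * Returns the number of rows scattered. */
static size_t radix_scatter(const uint64_t *keys, const uint8_t *rowsel,
                            size_t n, uint64_t *out,
                            size_t part_start[N_PARTS + 1]) {
    size_t counts[N_PARTS] = {0};
    /* Pass 1: histogram of passing rows only. */
    for (size_t i = 0; i < n; i++) {
        if (rowsel && !rowsel[i]) continue;
        counts[part_of(keys[i])]++;
    }
    /* Exclusive prefix sum gives each partition's start offset. */
    part_start[0] = 0;
    for (unsigned p = 0; p < N_PARTS; p++)
        part_start[p + 1] = part_start[p] + counts[p];
    /* Pass 2: scatter, skipping the same filtered rows. */
    size_t cursor[N_PARTS];
    for (unsigned p = 0; p < N_PARTS; p++) cursor[p] = part_start[p];
    for (size_t i = 0; i < n; i++) {
        if (rowsel && !rowsel[i]) continue;
        out[cursor[part_of(keys[i])]++] = keys[i];
    }
    return part_start[N_PARTS];
}
```

Because filtered rows are dropped in both passes, the later per-partition grouping would see only the surviving rows, so its working set shrinks with the filter's selectivity.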

hetoku added 6 commits May 22, 2026 08:41

Revert "perf(group): allow radix group path for filtered (WHERE) comp…

b1be92c

…osite group-by" This reverts commit 03004e9.

Revert "perf(group): route high-cardinality composite group-by to rad…

9ea03e1

…ix path" This reverts commit 22edfd4.

perf(clickbench): optimize splayed query runs

597f06c

singaraiona merged commit aefacf5 into master May 22, 2026
4 checks passed

singaraiona merged 6 commits into master from codex/splayed-clickbench
