perf: sort-merge join (SMJ) batch deferred filtering and move mark joins to specialized stream #21184
mbutrovich wants to merge 18 commits into apache:main from
Conversation
run benchmarks sort_merge_join

Note that the 2 queries I expect a speedup on in the
🤖 Criterion benchmark running (GKE)
Benchmark for this request failed. Last 20 lines of output: (collapsed)
Also, I'm now confused about where I should add benchmarks. #20464 added Criterion SMJ benchmarks for sort-merge join, but it's missing scenarios from dfbench's
run benchmark smj |
|
🤖 Benchmark running (GKE)
|
🤖 Benchmark completed (GKE); results for smj (base vs. branch) collapsed
Yeah, these are expected results. The queries that demonstrate this issue are new in this PR, so we don't get to compare against main.
|
Can you please create a PR with only the queries so we can run the benchmark on
|
I'm noticing we don't have a ton of test coverage with spilling. I'll try to shore that up. |
|
|
Hi @mbutrovich, thanks for the request (#21184 (comment)). Only whitelisted users can trigger benchmarks. Allowed users: Dandandan, Jefffrey, Omega359, adriangb, alamb, comphead, etseidl, gabotechs, geoffreyclaude, klion26, kosiew, rluvaton, xudong963, zhuqi-lucas.
|
run benchmark smj |
|
🤖 Benchmark running (GKE)
|
I'm running another 50 iterations of fuzz tests now that I added one that spills, so that'll take ~90 minutes. So far I'm through 12 iterations, so I'll check back in once it's done. |
|
🤖 Benchmark completed (GKE); results for smj (base vs. branch) collapsed
You love to see it. |
|
Damn, well done |
This finished without issue. |
|
run benchmark smj |
|
🤖 Benchmark running (GKE). Comparing simplify_smj_full_opt (2d2758b) to 37978e3 (merge-base) using: smj
|
🤖 Benchmark completed (GKE); results for smj (base vs. branch) collapsed
Which issue does this PR close?
Partially addresses #20910.
Rationale for this change
Sort-merge join with a filter on outer joins (LEFT/RIGHT/FULL) runs `process_filtered_batches()` on every key transition in the Init state. With near-unique keys (1:1 cardinality), this means running the full deferred filtering pipeline (concat + `get_corrected_filter_mask` + `filter_record_batch_by_join_type`) once per row, making filtered LEFT/RIGHT/FULL 55x slower than INNER for 10M unique keys.

Additionally, mark join logic in `SortMergeJoinStream` materializes full (streamed, buffered) pairs only to discard most of them via `get_corrected_filter_mask()`. Mark joins are structurally identical to semi joins (one output row per outer row with a boolean result) and belong in `SemiAntiMarkSortMergeJoinStream`, which avoids pair materialization entirely by using a per-outer-batch bitset.

What changes are included in this PR?
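The rationale's mark-vs-semi observation can be illustrated with a minimal, dependency-free sketch. This is not DataFusion's actual API: `mark_join` and `semi_join` are hypothetical stand-ins showing that both joins compute the same per-outer-row bitset and differ only in how they emit it (mark emits all rows plus the bitset as a boolean column; semi filters by it), so neither needs to materialize (streamed, buffered) pairs.

```rust
/// One bit per outer row: did any buffered (inner) row match?
/// The real stream computes this during the sorted merge; a linear
/// scan stands in for that here.
fn mark_join(outer_keys: &[i64], inner_keys: &[i64]) -> Vec<bool> {
    outer_keys.iter().map(|k| inner_keys.contains(k)).collect()
}

/// Semi join = filter the outer rows by the same bitset.
/// Mark join would instead emit every outer row with the bitset
/// attached as an extra boolean column.
fn semi_join(outer_keys: &[i64], marks: &[bool]) -> Vec<i64> {
    outer_keys
        .iter()
        .zip(marks.iter())
        .filter(|&(_, &m)| m)
        .map(|(&k, _)| k)
        .collect()
}

fn main() {
    let outer = [1_i64, 2, 3, 4];
    let inner = [2_i64, 4, 5];
    let marks = mark_join(&outer, &inner);
    assert_eq!(marks, vec![false, true, false, true]);
    assert_eq!(semi_join(&outer, &marks), vec![2, 4]);
}
```

The shared bitset is why routing `LeftMark`/`RightMark` into the semi/anti stream removes the pair-materialization cost.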
Three areas of improvement, building on the specialized semi/anti stream from #20806:

1. Move mark joins to `SemiAntiMarkSortMergeJoinStream`

- Rename `semi_anti_sort_merge_join` → `semi_anti_mark_sort_merge_join`
- `emit_outer_batch()` emits all rows with the match bitset as a boolean column (vs. semi's filter / anti's invert-and-filter)
- Route `LeftMark`/`RightMark` from `SortMergeJoinExec::execute()` to the renamed stream
- Remove mark-specific handling from `SortMergeJoinStream` (`mark_row_as_match`, `is_not_null` column generation, mark arms in filter correction)

2. Batch filter evaluation in `freeze_streamed()`

- Split `freeze_streamed()` into null-joined classification plus `freeze_streamed_matched()` for batched materialization (slice → take → interleave)
- Single `RecordBatch` construction and single `expression.evaluate()` per freeze instead of per chunk
- `append_filter_metadata()` uses builder `extend()` instead of a per-element loop

3. Batch deferred filtering in Init state (this is the big win for Q22 and Q23)

- Run `process_filtered_batches()` once accumulated rows reach `batch_size` instead of running on every Init entry
- Buffering stays bounded (`freeze_dequeuing_buffered`, one accumulating toward the next freeze); this does not reintroduce the unbounded buffering fixed by PR #20482 (fix: SortMergeJoin don't wait for all input before emitting)
- The `Exhausted` state flushes any remainder

Cleanup:

- `SortMergeJoinStream` now handles only Inner/Left/Right/Full; all semi/anti/mark branching removed
- `get_corrected_filter_mask()`: merge identical Left/Right/Full branches; add null-metadata passthrough for already-null-joined rows
- `filter_record_batch_by_join_type()`: rewrite from `filter(true) + filter(false) + concat` to `zip()` for in-place null-joining, which preserves row ordering and removes `create_null_joined_batch()` entirely
- `filter_record_batch_by_join_type()`: use `compute::filter()` directly on the `BooleanArray` instead of wrapping it in a temporary `RecordBatch`

Benchmarks

`cargo run --release --bin dfbench -- smj`

General workload (Q1-Q20, various join types/cardinalities/selectivities): no regressions.
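The batching idea behind area 3 can be sketched in isolation. This is a hypothetical simplification, not DataFusion code: `DeferredFilter` stands in for the stream's accumulation state, `run_filter` stands in for the expensive concat + mask-correction + filter pipeline (here it just keeps even rows so the effect is observable), and `finish` models the `Exhausted`-state flush. The point is that the pipeline runs roughly once per `batch_size` rows rather than once per key transition.

```rust
/// Hypothetical model of "accumulate, then filter in batches".
struct DeferredFilter {
    batch_size: usize,
    pending: Vec<i64>,    // rows awaiting deferred filtering
    filter_runs: usize,   // how many times the expensive pipeline ran
    output: Vec<i64>,
}

impl DeferredFilter {
    fn new(batch_size: usize) -> Self {
        Self { batch_size, pending: Vec::new(), filter_runs: 0, output: Vec::new() }
    }

    /// Called on every key transition; real work only at the threshold.
    fn push(&mut self, row: i64) {
        self.pending.push(row);
        if self.pending.len() >= self.batch_size {
            self.run_filter();
        }
    }

    /// Stand-in for the deferred filtering pipeline: keep even rows.
    fn run_filter(&mut self) {
        self.filter_runs += 1;
        self.output.extend(self.pending.drain(..).filter(|r| r % 2 == 0));
    }

    /// Exhausted state: flush whatever is left.
    fn finish(&mut self) {
        if !self.pending.is_empty() {
            self.run_filter();
        }
    }
}

fn main() {
    let mut f = DeferredFilter::new(4);
    for row in 0..10 {
        f.push(row);
    }
    f.finish();
    // 10 rows with batch_size 4: pipeline ran 3 times instead of 10.
    assert_eq!(f.filter_runs, 3);
    assert_eq!(f.output, vec![0, 2, 4, 6, 8]);
}
```

With near-unique keys, the per-key version pays the pipeline's fixed cost on almost every row, which is the 55x gap the rationale describes; batching amortizes it across `batch_size` rows.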
Are these changes tested?
Yes:
- `cargo test -p datafusion-physical-plan --lib joins::sort_merge_join`
- `cargo test -p datafusion-sqllogictest --test sqllogictests -- join`
- `cargo test -p datafusion-physical-plan --lib joins::semi_anti_mark_sort_merge_join`
- `cargo test -p datafusion --features extended_tests --test fuzz -- join_fuzz`

Are there any user-facing changes?
No.