feat: add sort pushdown benchmark and SLT tests #21213

zhuqi-lucas wants to merge 1 commit into apache:main
Conversation
Pull request overview
Adds a new dedicated benchmark to measure the impact of sort pushdown / sort elimination, enabling baseline vs “declared sorted” comparisons ahead of the core optimization work.
Changes:
- Introduce a new `sort-pushdown` benchmark module with four `ORDER BY l_orderkey` query variants (with/without `LIMIT`, narrow vs `SELECT *`).
- Register the new benchmark module and `dfbench` subcommand.
- Extend `bench.sh` with data/run entries for `sort_pushdown` and `sort_pushdown_sorted`.
Reviewed changes
Copilot reviewed 3 out of 4 changed files in this pull request and generated 4 comments.
| File | Description |
|---|---|
| `benchmarks/src/sort_pushdown.rs` | New benchmark implementation and CLI options for running the sort-pushdown scenarios |
| `benchmarks/src/lib.rs` | Exposes the new `sort_pushdown` module |
| `benchmarks/src/bin/dfbench.rs` | Adds the `sort-pushdown` subcommand wiring |
| `benchmarks/bench.sh` | Adds script commands to generate data and run the new benchmark variants |
```rust
//! Measured on 300k rows, 8 non-overlapping sorted parquet files, single partition,
//! debug build (results vary by hardware; relative speedup is the key metric):
```
The reference-results section claims the benchmark was run on "8 non-overlapping sorted parquet files" / single partition, but ./bench.sh data sort_pushdown currently just calls data_tpch (tpchgen-cli --parts=1) and ./bench.sh run sort_pushdown* doesn't force --partitions 1. Either adjust the reference-results text to match the default dataset/run configuration, or update the bench script/data generation to actually produce multi-part sorted inputs and set partitions explicitly.
```suggestion
//! Measured on approximately 300k rows using the default dataset prepared via
//! `./bench.sh data sort_pushdown`, debug build (results vary by hardware; the
//! relative speedup is the key metric):
```
```bash
sort_pushdown|sort_pushdown_sorted)
    # same data as for tpch
    data_tpch "1" "parquet"
    ;;
```
sort_pushdown|sort_pushdown_sorted reuses data_tpch "1" "parquet", which currently generates parquet with --parts=1. That means the benchmark will typically run against a single lineitem parquet file, not multiple non-overlapping files as described. If the intent is to benchmark cross-file sort elimination, consider generating TPCH parquet with multiple parts for this benchmark (and documenting that the files are expected to be sorted by l_orderkey).
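The cross-file premise this comment refers to can be illustrated in plain Rust (an illustration only, not DataFusion code; the names are made up here): if each file is sorted by the key and the per-file key ranges do not overlap, concatenating the files in range order is already globally sorted, which is exactly why a `SortExec` on top becomes redundant.

```rust
/// Concatenate per-"file" sorted runs in order of their minimum key.
/// Models non-overlapping sorted parquet files scanned back to back.
fn concat_in_range_order(mut files: Vec<Vec<i32>>) -> Vec<i32> {
    files.sort_by_key(|f| f.first().copied().unwrap_or(i32::MIN));
    files.into_iter().flatten().collect()
}

fn main() {
    // Each inner vec models one parquet file sorted by l_orderkey,
    // with non-overlapping key ranges across files.
    let files = vec![vec![4, 5], vec![1, 2, 3], vec![6, 7, 8]];
    let concat = concat_in_range_order(files);

    // The concatenation is already globally sorted: no Sort node needed.
    assert!(concat.windows(2).all(|w| w[0] <= w[1]));
    println!("{:?}", concat); // [1, 2, 3, 4, 5, 6, 7, 8]
}
```

With `--parts=1` there is only a single such run, so the multi-file case the docs describe is never exercised.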
```bash
RESULTS_FILE="${RESULTS_DIR}/sort_pushdown.json"
echo "Running sort pushdown benchmark (no WITH ORDER)..."
debug_run $CARGO_COMMAND --bin dfbench -- sort-pushdown --iterations 5 --path "${TPCH_DIR}" -o "${RESULTS_FILE}" ${QUERY_ARG} ${LATENCY_ARG}
}
```
The new sort-pushdown benchmark is described as single-partition in the module docs, but the bench runner doesn't pass --partitions here (so it will use DataFusion's default target partitions). If you want stable, comparable results and to match the documented plan shape (no SPM for single partition), consider passing --partitions 1 (or documenting the expected/default partitions) in these runner functions.
```rust
common: CommonOpt,

/// Sort pushdown query number. If not specified, runs all queries
#[arg(short, long)]
```
--query accepts any usize, but the code later does query_id - 1 / indexes into QUERIES, which will panic for --query 0 or any value > 4. Consider adding a clap value range (1..=4) and/or validating query_id in run() and returning a DataFusion error instead of panicking.
```suggestion
#[arg(short, long, value_parser = clap::value_parser!(usize).range(1..=4))]
```
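The other option the comment mentions, validating in `run()` and returning an error instead of panicking, can be sketched with std only (`QUERIES` and `validate_query` are illustrative names, not the PR's actual code):

```rust
// Hypothetical sketch: reject out-of-range --query values with an error
// rather than letting `QUERIES[query_id - 1]` panic.
const QUERIES: [&str; 4] = [
    "ORDER BY l_orderkey ASC",
    "ORDER BY l_orderkey ASC LIMIT 100",
    "SELECT * ORDER BY l_orderkey ASC",
    "SELECT * ORDER BY l_orderkey ASC LIMIT 100",
];

fn validate_query(query_id: usize) -> Result<&'static str, String> {
    if (1..=QUERIES.len()).contains(&query_id) {
        Ok(QUERIES[query_id - 1])
    } else {
        Err(format!("query must be in 1..={}, got {}", QUERIES.len(), query_id))
    }
}

fn main() {
    assert!(validate_query(1).is_ok());
    assert!(validate_query(0).is_err()); // would otherwise underflow on query_id - 1
    assert!(validate_query(5).is_err()); // would otherwise index out of bounds
}
```

In the actual benchmark this would return a DataFusion error type rather than `String`; the clap range constraint above additionally rejects bad values before `run()` is ever reached.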
Add benchmark and integration tests for sort pushdown optimization as a precursor to the core optimization PR (apache#21182).

Benchmark: new `sort-pushdown` subcommand with 4 queries testing sort elimination (ASC full scan, ASC LIMIT, wide full, wide LIMIT).

SLT tests (5 new groups):
- Test A: Non-overlapping files + WITH ORDER → Sort eliminated
- Test B: Overlapping files → SortExec retained
- Test C: LIMIT queries (ASC sort elimination + DESC reverse scan)
- Test D: target_partitions=2 → SPM + per-partition sort elimination
- Test E: Inferred ordering from Parquet metadata (no WITH ORDER)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Force-pushed from dc77327 to b41eddf
Which issue does this PR close?
Related to #17348
Precursor to #21182
Rationale for this change
Add benchmark and integration tests for sort pushdown optimization, split out from #21182 per reviewer request. This allows comparing benchmark results before and after the optimization lands, and the SLT diff in #21182 will clearly show which test expectations changed due to the optimization.
What changes are included in this PR?
Benchmark
New `sort-pushdown` benchmark subcommand with 4 queries testing sort elimination:
- `ORDER BY l_orderkey ASC` (full scan)
- `ORDER BY l_orderkey ASC LIMIT 100`
- `SELECT * ORDER BY l_orderkey ASC` (wide)
- `SELECT * ORDER BY l_orderkey ASC LIMIT 100` (wide)

Usage:
```bash
./bench.sh data sort_pushdown
./bench.sh run sort_pushdown # baseline
./bench.sh run sort_pushdown_sorted # with sort elimination
```
SLT Integration Tests (5 new groups)
Files Changed
Test plan
🤖 Generated with Claude Code