[Data] Add TPCH Q20, 21, 22 benchmark scripts to nightly tests by ryankert01 · Pull Request #62333 · ray-project/ray

ryankert01 · 2026-04-03T17:49:18Z

Description

As the title says, this PR adds TPC-H Q20, Q21, and Q22 benchmark scripts to the nightly tests. All three queries follow the established patterns from the existing TPC-H benchmark suite.


gemini-code-assist

Code Review

This pull request introduces Ray Data implementations for TPC-H queries 20, 21, and 22, along with their corresponding configurations in the nightly release test suite for scale factor 100. The review feedback highlights several performance optimization opportunities, specifically recommending against hardcoding low partition counts (like num_partitions=16) which can cause memory pressure at 100GB scale. Additionally, suggestions were made for Query 21 to push filters down before materialization and remove redundant filter steps to improve efficiency.
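For readers unfamiliar with the "filter before materialization" suggestion, here is a minimal plain-Python sketch of the idea. It is not the code from this PR and does not use the Ray Data API. The rows are made-up sample data, and the predicate is an assumption: the late-delivery condition (`l_receiptdate > l_commitdate`) from the TPC-H Q21 definition, since the thread does not show which filter the review referred to.

```python
from datetime import date

# Hypothetical miniature lineitem rows; column names follow the TPC-H schema.
lineitem = [
    {"l_orderkey": 1, "l_suppkey": 10,
     "l_commitdate": date(1995, 1, 10), "l_receiptdate": date(1995, 1, 12)},
    {"l_orderkey": 1, "l_suppkey": 11,
     "l_commitdate": date(1995, 1, 10), "l_receiptdate": date(1995, 1, 9)},
    {"l_orderkey": 2, "l_suppkey": 10,
     "l_commitdate": date(1995, 2, 1), "l_receiptdate": date(1995, 2, 1)},
    {"l_orderkey": 3, "l_suppkey": 12,
     "l_commitdate": date(1995, 3, 5), "l_receiptdate": date(1995, 3, 20)},
]


def is_late(row):
    """Assumed Q21-style predicate: the item arrived after its commit date."""
    return row["l_receiptdate"] > row["l_commitdate"]


# Option A: materialize every row, then filter the stored copy.
materialized_all = list(lineitem)
late_after = [row for row in materialized_all if is_late(row)]

# Option B: filter first, so only the surviving rows are ever stored.
materialized_late = [row for row in lineitem if is_late(row)]

print(len(materialized_all), len(materialized_late))
```

Both options produce the same late rows, but option B holds fewer rows in memory. The same reasoning is what makes an early filter attractive when an intermediate dataset is reused by several joins.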

gemini-code-assist · 2026-04-03T17:50:54Z

+        ps_forest = partsupp.join(
+            forest_parts,
+            join_type="left_semi",
+            num_partitions=16,


Hardcoding num_partitions=16 is likely too low for Scale Factor 100 (100GB), where tables like lineitem and partsupp contain hundreds of millions of rows. This can lead to excessively large partitions (several GBs each), causing memory pressure or underutilization of the cluster. It is generally better to let Ray Data automatically determine the number of partitions or set it to a much higher value (e.g., 200+) for this scale.
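A rough back-of-the-envelope sketch of the partition-size concern, in plain Python. The per-table row counts are the approximate scale-factor-1 cardinalities from the TPC-H specification, and linear scaling with the scale factor is an assumption (lineitem only scales approximately linearly). The helper name `rows_per_partition` is hypothetical and is not part of Ray Data.

```python
# Approximate TPC-H base-table cardinalities at scale factor 1.
ROWS_AT_SF1 = {
    "lineitem": 6_001_215,
    "partsupp": 800_000,
}


def rows_per_partition(table, scale_factor, num_partitions):
    """Estimate rows per partition, assuming rows scale linearly with SF
    and are spread evenly across partitions (no skew)."""
    total_rows = ROWS_AT_SF1[table] * scale_factor
    return total_rows / num_partitions


for table in ("lineitem", "partsupp"):
    for num_partitions in (16, 200):
        estimate = rows_per_partition(table, 100, num_partitions)
        print(f"{table:9s} num_partitions={num_partitions:3d} "
              f"-> ~{estimate:,.0f} rows per partition")
```

Under these assumptions, 16 partitions at SF100 means roughly 37.5 million lineitem rows or 5 million partsupp rows per partition, versus about 3 million and 400 thousand at 200 partitions. Actual memory use depends on row width and on how many rows survive earlier filters, which this sketch does not model.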

It looks like this is a convention across the existing tests.

ryankert01 · 2026-04-03T18:15:29Z

Fixed in 44f1bbe.

ryankert01 · 2026-04-03T19:07:23Z

cc @owenowenisme

Signed-off-by: ryankert01 <ryankert01@gmail.com>

owenowenisme

LGTM

…roject#62333) ## Description As title. All three queries follow the established patterns from the existing TPC-H benchmark suite ## Additional information > Optional: Add implementation details, API changes, usage examples, screenshots, etc. --------- Signed-off-by: ryankert01 <ryankert01@gmail.com> Co-authored-by: You-Cheng Lin <106612301+owenowenisme@users.noreply.github.com>

gemini-code-assist Bot reviewed Apr 3, 2026


ryankert01 force-pushed the tpch-q20-21-22 branch from e9620ce to 752350c Compare April 3, 2026 17:57

ryankert01 marked this pull request as ready for review April 3, 2026 17:57

ray-gardener Bot added the data (Ray Data-related issues) and community-contribution (Contributed by the community) labels Apr 3, 2026

ryankert01 added 2 commits April 7, 2026 14:04

add tpch q20, 21, 22

0ce9a6e

Signed-off-by: ryankert01 <ryankert01@gmail.com>

update

a884618

Signed-off-by: ryankert01 <ryankert01@gmail.com>

ryankert01 force-pushed the tpch-q20-21-22 branch from 44f1bbe to a884618 Compare April 7, 2026 06:04

Merge branch 'master' into tpch-q20-21-22

9bc1bbb

owenowenisme added the go (add ONLY when ready to merge, run all tests) label Apr 19, 2026

owenowenisme approved these changes Apr 20, 2026


goutamvenkat-anyscale merged commit c33b607 into ray-project:master Apr 20, 2026
8 checks passed

goutamvenkat-anyscale merged 3 commits into ray-project:master from ryankert01:tpch-q20-21-22


3 participants
