Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Gather partial TopN results #21761

Merged
merged 1 commit into from
May 2, 2024
Merged

Conversation

Dith3r
Copy link
Member

@Dith3r Dith3r commented Apr 30, 2024

Description

Gathering TopN avoids unnecessary network overhead, especially when both the number of splits and the TopN limit are big.

Additional context and related issues

Release notes

( ) This is not user-visible or is docs only, and no release notes are required.
( ) Release notes are required. Please propose a release note for me.
(x) Release notes are required, with the following suggested text:

# General
* Improve performance of ORDER BY queries with LIMIT on large data sets. ({issue}`21761`)

Copy link
Member

@lukasz-stec lukasz-stec left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm

@Dith3r Dith3r force-pushed the ke/gather-topn branch 3 times, most recently from f2fbc4f to 98b4c2a Compare April 30, 2024 11:02
@github-actions github-actions bot added the hive Hive connector label Apr 30, 2024
import static io.trino.sql.planner.plan.TopNNode.Step.PARTIAL;

/**
* Adds local round-robin and gathering exchange on top of partial TopN to limit the task output size.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How big of an issue is this? For a leaf task, the output is currently N * number_of_splits. With this change, it's N. But if N is small, this may not be such big of a deal in general.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

here are some numbers

trino:iceberg_small_files_tpch_sf1000_orc_part> explain analyze verbose select * from lineitem order by orderkey limit 10000;

Before

Output: 2947463228 rows (419.40GB) 
...
CPU Time: 18061.8s total,  332K rows/s, 9.04MB/s, 45% active
Per Node: 9.9 parallelism, 3.29M rows/s, 89.6MB/s
Parallelism: 69.4
Peak Memory: 1.07GB
4:20 [6B rows, 159GB] [23.1M rows/s, 627MB/s]


After

Output: 60000 rows (8.75MB)
...
CPU Time: 17139.7s total,  350K rows/s, 9.53MB/s, 34% active
Per Node: 17.7 parallelism,  6.2M rows/s,  169MB/s
Parallelism: 123.9
Peak Memory: 1.62GB
2:18 [6B rows, 159GB] [43.4M rows/s, 1.15GB/s]

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's a nice win 👍

Gathering TopN avoids unnecessary network overhead, especially when
both the number of splits and the TopN limit are big.

Co-authored-by: Kamil Endruszkiewicz <kamil.endruszkiewicz@starburstdata.com>
@raunaqmorarka raunaqmorarka merged commit a6474d8 into trinodb:master May 2, 2024
94 checks passed
@github-actions github-actions bot added this to the 447 milestone May 2, 2024
@Dith3r Dith3r deleted the ke/gather-topn branch May 21, 2024 07:50
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Development

Successfully merging this pull request may close these issues.

None yet

4 participants