feat: avoid prefetching all sst streams at once #1069

Merged (1 commit) on Jul 18, 2023

Conversation

ShiKaiWi
Member

@ShiKaiWi ShiKaiWi commented Jul 12, 2023

Rationale

Close #959

Currently, fetching of SST data starts immediately, which may lead to high memory consumption when a large number of concurrent queries arrive.

Detailed Changes

Introduce a prefetchable stream to replace the normal stream. It provides a way to trigger data fetching explicitly, so the caller can decide when fetching starts.
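To illustrate the idea, here is a minimal sketch of a stream whose fetching does not start until the caller asks for it. It is not the code from this PR: the names `Prefetchable`, `LazyPrefetcher`, and `read_chain` are hypothetical, and it uses std threads and a bounded channel in place of async streams so that it needs no external crates.

```rust
use std::sync::mpsc::{sync_channel, Receiver};
use std::thread;

/// Hypothetical trait: a stream whose background fetching starts only on request.
trait Prefetchable {
    type Item;
    /// Start fetching in the background; calling it again is a no-op.
    fn start_prefetch(&mut self);
    /// Return the next item, starting the fetch first if it has not begun.
    fn fetch_next(&mut self) -> Option<Self::Item>;
}

/// Wraps an iterator (standing in for an SST reader) and stays idle until triggered.
struct LazyPrefetcher<I: Iterator> {
    source: Option<I>,             // Some(..) while idle, None once started
    rx: Option<Receiver<I::Item>>, // set once the producer thread is running
    buffer: usize,                 // max items buffered ahead of the consumer
}

impl<I: Iterator> LazyPrefetcher<I> {
    fn new(source: I, buffer: usize) -> Self {
        Self { source: Some(source), rx: None, buffer }
    }

    fn is_started(&self) -> bool {
        self.rx.is_some()
    }
}

impl<I> Prefetchable for LazyPrefetcher<I>
where
    I: Iterator + Send + 'static,
    I::Item: Send + 'static,
{
    type Item = I::Item;

    fn start_prefetch(&mut self) {
        if let Some(source) = self.source.take() {
            // The bounded channel caps how much data is held in memory.
            let (tx, rx) = sync_channel(self.buffer);
            thread::spawn(move || {
                for item in source {
                    // Stop producing once the consumer has gone away.
                    if tx.send(item).is_err() {
                        break;
                    }
                }
            });
            self.rx = Some(rx);
        }
    }

    fn fetch_next(&mut self) -> Option<Self::Item> {
        self.start_prefetch();
        self.rx.as_ref().and_then(|rx| rx.recv().ok())
    }
}

/// Drains the streams in order, prefetching only the one after the current stream.
fn read_chain<S: Prefetchable>(mut streams: Vec<S>) -> Vec<S::Item> {
    let mut out = Vec::new();
    for i in 0..streams.len() {
        if i + 1 < streams.len() {
            streams[i + 1].start_prefetch();
        }
        while let Some(item) = streams[i].fetch_next() {
            out.push(item);
        }
    }
    out
}
```

With this shape, a caller that chains many streams keeps at most two of them fetching at any moment instead of all of them, which is one way the caller-controlled trigger could bound memory use.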

Test Plan

The memory usage and latency of the old and new versions will be attached here.

@ShiKaiWi ShiKaiWi force-pushed the feat-control-pull-record-batches branch from ee14206 to 568e135 Compare July 14, 2023 02:07
analytic_engine/src/row_iter/chain.rs (outdated, resolved)
analytic_engine/src/sst/parquet/async_reader.rs (outdated, resolved)
common_util/src/prefetchable_stream.rs (outdated, resolved)
@ShiKaiWi ShiKaiWi force-pushed the feat-control-pull-record-batches branch from 2a6e6c7 to 32ac0fe Compare July 18, 2023 02:28
Contributor

@Rachelint Rachelint left a comment

LGTM

@ShiKaiWi ShiKaiWi merged commit a26988b into apache:main Jul 18, 2023
6 checks passed
@ShiKaiWi ShiKaiWi deleted the feat-control-pull-record-batches branch July 18, 2023 06:50
Development

Successfully merging this pull request may close these issues.

OOM caused by query
2 participants