[Enhancement] add spilling capability for multi_cast_local_exchange operator #47982
Merged
silverbullet233 merged 10 commits into StarRocks:main from silverbullet233:mcast_local_exchange_limit on Jul 25, 2024
Conversation
silverbullet233 force-pushed the mcast_local_exchange_limit branch from 470b34c to 6f884d7 on July 11, 2024 08:10
silverbullet233 force-pushed the mcast_local_exchange_limit branch from 6f884d7 to 0ac7e79 on July 12, 2024 02:29
silverbullet233 changed the title from [Ignore][WIP]ignore me to [Enhancement] add spilling capability for multi_cast_local_exchange operator on Jul 12, 2024
satanson reviewed Jul 12, 2024
stdpain reviewed Jul 12, 2024
stdpain reviewed Jul 12, 2024
stdpain reviewed Jul 12, 2024
stdpain reviewed Jul 15, 2024
satanson previously approved these changes Jul 17, 2024
satanson previously approved these changes Jul 22, 2024
stdpain reviewed Jul 23, 2024
stdpain reviewed Jul 25, 2024
stdpain reviewed Jul 25, 2024
satanson previously approved these changes Jul 25, 2024
Quality Gate passed
dirtysalt approved these changes Jul 25, 2024
[FE Incremental Coverage Report] ✅ pass : 0 / 0 (0%)
[BE Incremental Coverage Report] ✅ pass : 417 / 455 (91.65%)
satanson approved these changes Jul 25, 2024
dujijun007 pushed a commit to dujijun007/starrocks that referenced this pull request on Jul 29, 2024:
…perator (StarRocks#47982) Signed-off-by: silverbullet233 <3675229+silverbullet233@users.noreply.github.com>
@mergify backport branch-3.3
✅ Backports have been created
silverbullet233 added a commit to silverbullet233/starrocks that referenced this pull request on Sep 18, 2024:
…perator (StarRocks#47982) Signed-off-by: silverbullet233 <3675229+silverbullet233@users.noreply.github.com> (cherry picked from commit ca962a5)
Why I'm doing:
When CTEs in a query are reused, the same data is sent to multiple downstream consumers through the MultiCastLocalExchange operator. In the current implementation, all unconsumed data is cached in memory, and how much accumulates depends on the gap between the fastest and slowest consumers, so memory usage is uncontrollable.
In this scenario, we cannot control memory by restricting the producer's writes, because there may be dependencies between the downstream consumers of a CTE. For example, when the build side and probe side of a hash join both consume the same CTE, the probe side must wait for the build side to complete; if the write side were throttled, the query could get stuck forever due to mutual waiting.
What I'm doing:
In this PR, I introduced a new implementation of MultiCastLocalExchanger called SpillableMultiCastLocalExchanger, which controls the memory usage of the operator by spilling data to disk. SpillableMultiCastLocalExchanger is implemented based on MemLimitedChunkQueue.
MemLimitedChunkQueue is an MPMC queue: multiple producers can write data from different threads, and multiple consumers can each consume the same data, similar to a message queue like Kafka. It controls memory usage by spilling data to disk while also keeping spill I/O efficient, since too many small I/Os can seriously hurt performance.
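The multi-consumer semantics can be illustrated with a small sketch. This is not the actual StarRocks implementation: the class name `MemLimitedChunkQueueSketch`, the flat `_chunks` vector, and the single mutex are simplifying assumptions (the real queue spills data instead of keeping every chunk in memory).

```cpp
#include <algorithm>
#include <cstddef>
#include <memory>
#include <mutex>
#include <vector>

// Placeholder for StarRocks' columnar Chunk type; details omitted.
struct Chunk {};
using ChunkPtr = std::shared_ptr<Chunk>;

// Simplified MPMC semantics: every consumer reads the whole stream
// independently, so the queue keeps one read position per consumer.
class MemLimitedChunkQueueSketch {
public:
    explicit MemLimitedChunkQueueSketch(size_t consumer_number)
            : _next_index(consumer_number, 0) {}

    // Called from any producer thread.
    void push(ChunkPtr chunk) {
        std::lock_guard<std::mutex> l(_mutex);
        _chunks.push_back(std::move(chunk));
    }

    // Called by consumer `consumer_id`; each consumer advances its own
    // position, so a slow consumer never loses data.
    ChunkPtr pop(int consumer_id) {
        std::lock_guard<std::mutex> l(_mutex);
        if (_next_index[consumer_id] >= _chunks.size()) return nullptr;
        return _chunks[_next_index[consumer_id]++];
    }

    // Data can only be released (or spilled and forgotten) once the
    // slowest consumer has moved past it.
    size_t slowest_consumer_position() {
        std::lock_guard<std::mutex> l(_mutex);
        return *std::min_element(_next_index.begin(), _next_index.end());
    }

private:
    std::mutex _mutex;
    std::vector<ChunkPtr> _chunks;     // the real queue spills; this sketch keeps everything
    std::vector<size_t> _next_index;   // per-consumer read position
};
```

Tracking the minimum read position across all consumers is what determines which data still has to be retained (in memory or on disk) and which can be released.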
MemLimitedChunkQueue organizes data internally as a linked list of blocks. Each node of the list is a Block, which contains multiple Cells, and each Cell holds one Chunk. Every time a chunk is pushed, a Cell is created and appended to the Block at the end of the list; when that Block's size exceeds a certain limit, a new Block is created and appended to the end of the list.
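A minimal sketch of that layout follows. Only `Block`, `Cell`, and `Chunk` are names taken from the description above; the fields, the `push_chunk` helper, and the 8 MB block-size limit are illustrative assumptions.

```cpp
#include <cstddef>
#include <memory>
#include <utility>
#include <vector>

struct Chunk {};                       // placeholder, as in the sketch above
using ChunkPtr = std::shared_ptr<Chunk>;

// One Cell wraps one pushed chunk.
struct Cell {
    ChunkPtr chunk;
};

// A Block is one node of the linked list and the unit of spill I/O.
struct Block {
    std::vector<Cell> cells;
    size_t memory_bytes = 0;           // bytes currently held by this block
    bool in_memory = true;             // becomes false once flushed to disk
    std::shared_ptr<Block> next;       // next node in the linked list
};

// Pushing a chunk appends a Cell to the tail block; once the tail block
// grows past the size limit, a fresh block is appended to the list.
// `chunk_bytes` and `max_block_bytes` are illustrative parameters.
void push_chunk(std::shared_ptr<Block>& tail, ChunkPtr chunk,
                size_t chunk_bytes, size_t max_block_bytes = 8UL << 20) {
    tail->cells.push_back(Cell{std::move(chunk)});
    tail->memory_bytes += chunk_bytes;
    if (tail->memory_bytes > max_block_bytes) {
        auto fresh = std::make_shared<Block>();
        tail->next = fresh;
        tail = fresh;                  // later pushes go to the new block
    }
}
```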
A Block is the smallest unit handled by a spill I/O task. On the write path, if the data held in memory exceeds a certain size, a flush I/O task is submitted to flush the oldest Block to disk.
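A rough sketch of that flush trigger, assuming a generic `submit_io_task` executor and a `FlushableBlock` stand-in for the real block type (all names here are illustrative, not the actual API):

```cpp
#include <cstddef>
#include <functional>
#include <memory>

// Minimal stand-in for a spillable block; only the fields used here.
struct FlushableBlock {
    size_t memory_bytes = 0;
    bool in_memory = true;
    bool flush_pending = false;
};

// When the bytes held in memory exceed the limit, submit an async task
// that writes the oldest in-memory block to disk so its memory can be
// released. `submit_io_task` stands in for an I/O thread pool.
void maybe_flush(const std::shared_ptr<FlushableBlock>& oldest_in_memory,
                 size_t total_in_memory_bytes, size_t memory_limit_bytes,
                 const std::function<void(std::function<void()>)>& submit_io_task) {
    if (total_in_memory_bytes <= memory_limit_bytes) return;
    if (oldest_in_memory->flush_pending) return;     // a flush is already queued
    oldest_in_memory->flush_pending = true;
    submit_io_task([block = oldest_in_memory]() {
        // write the block's cells to the spill file here (omitted)
        block->in_memory = false;                    // its memory can now be freed
        block->flush_pending = false;
    });
}
```

Flushing whole blocks rather than individual chunks is what keeps each spill write reasonably large, matching the point above about avoiding many small I/Os.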
On the read path, when a consumer finds that the Block it wants to read is no longer in memory, it submits a load I/O task to bring that Block back into memory.
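The read path could be sketched in the same spirit; `ReadableBlock`, `try_read`, and `submit_io_task` are again illustrative names, and returning `nullptr` stands for a "data not ready yet" signal to the pipeline engine rather than the real interface:

```cpp
#include <cstddef>
#include <functional>
#include <memory>
#include <vector>

struct Chunk {};                                     // placeholder, as before
using ChunkPtr = std::shared_ptr<Chunk>;

// Minimal stand-in for a block on the read path.
struct ReadableBlock {
    std::vector<ChunkPtr> chunks;                    // cells, simplified
    bool in_memory = true;
    bool load_pending = false;
};

// If the block holding the next chunk has been spilled, schedule a load
// task and report "not ready" so the pipeline engine can reschedule the
// consumer instead of blocking a worker thread.
ChunkPtr try_read(const std::shared_ptr<ReadableBlock>& block, size_t offset,
                  const std::function<void(std::function<void()>)>& submit_io_task) {
    if (!block->in_memory) {
        if (!block->load_pending) {
            block->load_pending = true;
            submit_io_task([block]() {
                // read the block back from the spill file here (omitted)
                block->in_memory = true;
                block->load_pending = false;
            });
        }
        return nullptr;                              // caller retries later
    }
    return block->chunks[offset];
}
```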
Through this approach, we can ensure that both producers and consumers can function properly while controlling memory usage.
Test
I constructed a query based on the TPC-H 100 GB dataset for testing.
The baseline run does not change any other session variables; the test run only enables force spill on the multi_cast_local_exchange operator.
Here are some metrics from the query profile:
MULTI_CAST_LOCAL_EXCHANG_SINK metrics (profile screenshots): baseline vs. with spillable_multi_cast_local_exchange
QueryPeakMemoryUsage and QueryTime
baseline: 78.354 GB, 48s952ms
with spillable_multi_cast_local_exchange: 67.408 GB, 1m27s
What type of PR is this:
Does this PR entail a change in behavior?
If yes, please specify the type of change:
Checklist:
Bugfix cherry-pick branch check: