[SPARK-34089][CORE] HybridRowQueue should respect the configured memory mode #31152
cc @tgravescs @mridulm @cloud-fan Please take a look, thanks! |
Kubernetes integration test starting |
So I'm a little confused by this, so I went back through the history. This seems to be used by Spillable, HashRelation, and RowQueue. It looks like at the time those were added, MemoryConsumer didn't support a memory mode.
Kubernetes integration test status failure |
Test build #133972 has finished for PR 31152 at commit
|
@tgravescs thanks for providing the context! I looked at the source code and I found one caller of this particular For |
hmm...but seems like
|
kindly ping @tgravescs |
All implementations of |
yeah it doesn't make sense to spill if you need off heap and all you have is on heap or vice versa. |
I see. Thanks for the explanation. |
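The point about spilling and memory modes can be sketched roughly as follows. This is a simplified, hypothetical model and not Spark's actual memory manager: `MemMode`, `SpillableSketch`, `ManagerSketch`, and `freeByModeAwareSpill` are all names invented for illustration. It only shows the idea that a request for memory in one mode is served by spilling consumers that hold memory in that same mode.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a memory mode; not Spark's own enum.
enum MemMode { ON_HEAP, OFF_HEAP }

// Hypothetical consumer that holds `used` bytes in a single mode.
class SpillableSketch {
    final String name;
    final MemMode mode;
    long used;

    SpillableSketch(String name, MemMode mode, long used) {
        this.name = name;
        this.mode = mode;
        this.used = used;
    }

    // Pretend to write everything to disk and report how many bytes were freed.
    long spill() {
        long freed = used;
        used = 0;
        return freed;
    }
}

// Hypothetical manager: spills only consumers whose mode matches the request.
class ManagerSketch {
    private final List<SpillableSketch> consumers = new ArrayList<>();

    void register(SpillableSketch c) {
        consumers.add(c);
    }

    // Try to free at least `required` bytes of `mode` memory.
    // Consumers in the other mode are skipped, since spilling them
    // would not release any memory of the kind that is needed.
    long freeByModeAwareSpill(long required, MemMode mode) {
        long freed = 0;
        for (SpillableSketch c : consumers) {
            if (freed >= required) {
                break;
            }
            if (c.mode == mode && c.used > 0) {
                freed += c.spill();
            }
        }
        return freed;
    }
}
```

Under this model, a consumer that reports the wrong mode would be skipped when memory of its real kind is requested, which is one reason the reported mode matters.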
Hi @tgravescs @cloud-fan Sorry for the late reply here. I've updated the PR according to our discussion. To summarize, the latest PR only fixes Besides, I've also checked all the other MemoryConsumers, which seem correct to me. Please take another look when you get a chance, thanks!
sql/core/src/main/scala/org/apache/spark/sql/execution/python/RowQueue.scala (outdated, resolved)
sql/core/src/main/scala/org/apache/spark/sql/execution/joins/HashedRelation.scala (outdated, resolved)
90ebfb9 to 9c29fbc
Kubernetes integration test starting |
Kubernetes integration test status failure |
Test build #136254 has finished for PR 31152 at commit
|
retest this please |
Kubernetes integration test starting |
Kubernetes integration test status failure |
Test build #136258 has finished for PR 31152 at commit
|
GA passed, merging to master, thanks! |
What changes were proposed in this pull request?
This PR fixes `HybridRowQueue` to respect the configured memory mode. Besides, this PR also refactors the constructor of `MemoryConsumer` to accept the memory mode explicitly.

Why are the changes needed?
`HybridRowQueue` supports both onHeap and offHeap manipulation. But it inherited the wrong `MemoryConsumer` constructor, which hard-coded the memory mode to `onHeap`.

Does this PR introduce any user-facing change?
No. (Possibly yes in some cases: a job that could not complete before may complete successfully after the fix, because `HybridRowQueue` is now able to spill under offHeap mode.)

How was this patch tested?
Updated the existing test to cover both offHeap and onHeap modes.
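
The constructor problem described above can be illustrated with a minimal sketch. These are not Spark's real classes or signatures: `MemoryModeSketch`, `ConsumerSketch`, `BuggyQueueSketch`, and `FixedQueueSketch` are hypothetical names, and the example assumes only what the description states, namely that a convenience constructor hard-coded the on-heap mode and that the fix passes the mode explicitly.

```java
// Hypothetical stand-in for the memory mode; not Spark's own enum.
enum MemoryModeSketch { ON_HEAP, OFF_HEAP }

// Hypothetical base class modelling a consumer with two constructors.
abstract class ConsumerSketch {
    private final MemoryModeSketch mode;

    // Explicit constructor: the caller must state the mode.
    protected ConsumerSketch(MemoryModeSketch mode) {
        this.mode = mode;
    }

    // Convenience constructor that silently hard-codes ON_HEAP.
    protected ConsumerSketch() {
        this(MemoryModeSketch.ON_HEAP);
    }

    MemoryModeSketch getMode() {
        return mode;
    }
}

// Before the fix (as modelled here): the configured mode is ignored
// because the subclass goes through the no-arg constructor.
class BuggyQueueSketch extends ConsumerSketch {
    BuggyQueueSketch(MemoryModeSketch configuredMode) {
        super(); // always ON_HEAP, regardless of configuredMode
    }
}

// After the fix (as modelled here): the configured mode is forwarded.
class FixedQueueSketch extends ConsumerSketch {
    FixedQueueSketch(MemoryModeSketch configuredMode) {
        super(configuredMode);
    }
}

class MemoryModeDemo {
    public static void main(String[] args) {
        // Exercise both modes, mirroring the idea of testing onHeap and offHeap.
        for (MemoryModeSketch m : MemoryModeSketch.values()) {
            System.out.println("configured=" + m
                + " buggy=" + new BuggyQueueSketch(m).getMode()
                + " fixed=" + new FixedQueueSketch(m).getMode());
        }
    }
}
```

Removing the no-arg constructor from the base class in a model like this turns the mistake into a compile error, which is one way to read the motivation for making the mode an explicit argument.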