Skip to content

[SPARK-34089][CORE] HybridRowQueue should respect the configured memory mode #31152

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 5 commits into from

Conversation

Ngone51
Copy link
Member

@Ngone51 Ngone51 commented Jan 12, 2021

What changes were proposed in this pull request?

This PR fixes the HybridRowQueue to respect the configured memory mode.

Besides, this PR also refactored the constructor of MemoryConsumer to accept the memory mode explicitly.

Why are the changes needed?

HybridRowQueue supports both onHeap and offHeap manipulation. But it inherited the wrong MemoryConsumer constructor, which hard-coded the memory mode to onHeap.

Does this PR introduce any user-facing change?

No. (Maybe yes in some cases where users can't complete the job before could complete successfully after the fix because of HybridRowQueue is able to spill under offHeap mode now. )

How was this patch tested?

Updated the existing test to make it test both offHeap and onHeap modes.

@github-actions github-actions bot added the CORE label Jan 12, 2021
@Ngone51
Copy link
Member Author

Ngone51 commented Jan 12, 2021

cc @tgravescs @mridulm @cloud-fan Please take a look, thanks!

@SparkQA
Copy link

SparkQA commented Jan 12, 2021

Kubernetes integration test starting
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/38559/

@tgravescs
Copy link
Contributor

tgravescs commented Jan 12, 2021

so I'm a little confused by this. so I went back to the history. This seems to be used by Spillable, HashRelation, and RowQueue. It looks like at the time those were added, the MemoryConsumer didn't support a memory mode.
It looks like when that memory mode was added spilling was only support to on heap by Spillable (looking at the comment ing): 8fb1d1c
Is that not the case now?
Looking at:
https://github.com/apache/spark/blob/master/core/src/main/scala/org/apache/spark/util/collection/Spillable.scala#L111
it seems like its only on heap.

@SparkQA
Copy link

SparkQA commented Jan 12, 2021

Kubernetes integration test status failure
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/38559/

@SparkQA
Copy link

SparkQA commented Jan 12, 2021

Test build #133972 has finished for PR 31152 at commit 0a3743b.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@cloud-fan
Copy link
Contributor

@tgravescs thanks for providing the context! I looked at the source code and I found one caller of this particular MemoryConsumer constructor, HybridRowQueue, actually supports off-heap.

For MemoryConsumer implementations, I think they are responsible to set the memory mode correctly. We should fix Spillable as it only supports on-heap.

@Ngone51
Copy link
Member Author

Ngone51 commented Jan 13, 2021

It looks like when that memory mode was added spilling was only support to on heap by Spillable

hmm...but seems like Spillable can still spill (by maybeSpill()) under off-heap mode if the spill is triggered by itself.

protected def maybeSpill(collection: C, currentMemory: Long): Boolean = {

@Ngone51
Copy link
Member Author

Ngone51 commented Jan 20, 2021

kindly ping @tgravescs

@cloud-fan
Copy link
Contributor

hmm...but seems like Spillable can still spill (by maybeSpill()) under off-heap mode if the spill is triggered by itself.

All implementations of Spillable use java collection to store data, so they can't be off-heap.

@tgravescs
Copy link
Contributor

yeah it doesn't make sense to spill if you need off heap and all you have is on heap or vice versa.

@Ngone51
Copy link
Member Author

Ngone51 commented Jan 20, 2021

I see. Thanks for the explanation.

@Ngone51 Ngone51 changed the title [SPARK-34089][CORE] MemoryConsumer should respect TaskMemoryManager's MemoryMode when the mode isn't explicitly set [SPARK-34089][CORE] HybridRowQueue should respect the configured memory mode Mar 16, 2021
@Ngone51
Copy link
Member Author

Ngone51 commented Mar 16, 2021

Hi @tgravescs @cloud-fan Sorry for the late here. I've updated the PR according to our discussion. To summarize, the latest PR only fixes HybridRowQueue to make it respect the configured memory mode instead of the hard-coded onHeap mode since it supports both modes.

Besides, I've also checked all the other MemoryConsumers which seems correct to me. Please take another look when you get a chance, thanks!

@Ngone51 Ngone51 force-pushed the fix-MemoryConsumer-memorymode branch from 90ebfb9 to 9c29fbc Compare March 19, 2021 09:13
@SparkQA
Copy link

SparkQA commented Mar 19, 2021

Kubernetes integration test starting
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/40836/

@SparkQA
Copy link

SparkQA commented Mar 19, 2021

Kubernetes integration test status failure
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/40836/

@SparkQA
Copy link

SparkQA commented Mar 19, 2021

Test build #136254 has finished for PR 31152 at commit 9c29fbc.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@cloud-fan
Copy link
Contributor

retest this please

@SparkQA
Copy link

SparkQA commented Mar 19, 2021

Kubernetes integration test starting
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/40840/

@SparkQA
Copy link

SparkQA commented Mar 19, 2021

Kubernetes integration test status failure
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/40840/

@SparkQA
Copy link

SparkQA commented Mar 19, 2021

Test build #136258 has finished for PR 31152 at commit 9c29fbc.

  • This patch fails PySpark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@cloud-fan
Copy link
Contributor

GA passed, merging to master, thanks!

@cloud-fan cloud-fan closed this in e4bb975 Mar 22, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants