Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[SPARK-4452][Core]Shuffle data structures can starve others on the same thread for memory #7130

Closed
wants to merge 14 commits into from

Conversation

lianhuiwang
Copy link
Contributor

Currently when there are 2 spillable objects in a thread, ShuffleMemoryManager does not work well. some time latter spillable just can ask 5MB.
when join on #5868#, because join has ExternalAppendOnlyMap and UnsafeShuffleExternalSorter’s page size is 128MB, UnsafeShuffleExternalSorter always cannot ask PAGE_SIZE memory and throw IOException("Unable to acquire " + PAGE_SIZE + " bytes of memory”).
in the future, when using #6444’s binary processing sort on unsafe-shuffle-sort, the problem is more serious.
in this PR, when previous spillable is finished,add it to ShuffleMemoryManager’s reserved list. when latter spillable ask memory,but ShuffleMemoryManager has no enough memory, it spill previous spillable and get its memory. that make latter spoilable get more memory.
@JoshRosen @andrewor14 @sryza i think you are interesting in this. and can you take a look? thanks.

@SparkQA
Copy link

SparkQA commented Jun 30, 2015

Test build #36152 has finished for PR 7130 at commit 5cc88ab.

  • This patch fails Scala style tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

import org.apache.spark.SparkEnv
import org.apache.spark.{Logging, SparkEnv, Spillable}

import scala.reflect.ClassTag
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: scala imports before spark imports

@SparkQA
Copy link

SparkQA commented Jun 30, 2015

Test build #36155 has finished for PR 7130 at commit 5280ef0.

  • This patch fails Scala style tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

val manager = new ShuffleMemoryManager(1000L)

val spill1 = new FakeSpillable()
val spill2 = new FakeSpillable()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: spill2 is unused

@SparkQA
Copy link

SparkQA commented Jun 30, 2015

Test build #36157 has finished for PR 7130 at commit c17c047.

  • This patch fails Scala style tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA
Copy link

SparkQA commented Jun 30, 2015

Test build #36162 has finished for PR 7130 at commit 2e0b237.

  • This patch fails Scala style tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA
Copy link

SparkQA commented Jun 30, 2015

Test build #36171 has finished for PR 7130 at commit 83040cf.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA
Copy link

SparkQA commented Jul 1, 2015

Test build #36270 has finished for PR 7130 at commit f4737bb.

  • This patch fails to build.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA
Copy link

SparkQA commented Jul 1, 2015

Test build #36281 has finished for PR 7130 at commit 907366b.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@JoshRosen
Copy link
Contributor

/bump @andrewor14 @rxin, I think that we should look at this before 1.5. I share @lianhuiwang's concerns that our new usages of ShuffleMemoryManager will make the longstanding starvation problem even worse.

@andrewor14, I think that you know a bit more of the history involved in this logic. Didn't we used to have the opposite problem where multiple collections in the same thread (or task, after #7734) may lead to endless spilling behavior?

@lianhuiwang lianhuiwang closed this Sep 1, 2015
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
4 participants