[SPARK-10341] [SQL] fix memory starving in unsafe SMJ #8511
Conversation
ping @JoshRosen
Test build #41759 has finished for PR 8511 at commit
Test build #1703 has finished for PR 8511 at commit
Test build #1704 has finished for PR 8511 at commit
Test build #1705 has finished for PR 8511 at commit
/cc @andrewor14
@@ -24,6 +24,8 @@ import org.apache.spark.{Partition, Partitioner, TaskContext}
/**
 * An RDD that applies a user provided function to every partition of the parent RDD, and
 * additionally allows the user to prepare each partition before computing the parent partition.
 *
 * TODO(davies): remove this once SPARK-10342 is fixed
Based on this comment, it is not clear what needs to be removed (this class, or the changes in this PR). Can you add a comment in SPARK-10342 about what needs to be removed?
It is also unclear whether we would want to remove this even with better memory management; you might still want it then. I would just remove this comment.
On Aug 29, 2015, at 1:37 PM, Yin Huai notifications@github.com wrote:
> In core/src/main/scala/org/apache/spark/rdd/MapPartitionsWithPreparationRDD.scala, #8511 (comment):
> Based on this comment, it is not clear what needs to be removed (this class, or the changes in this PR). Can you add a comment in SPARK-10342 about what needs to be removed?
Test build #41785 has finished for PR 8511 at commit
val preparedArgument = preparePartition()
prepare()
// The same RDD could be called multiple times in one task, each call of compute() should
// have sep
Incomplete comment?
retest this please
@@ -73,6 +73,13 @@ private[spark] abstract class ZippedPartitionsBaseRDD[V: ClassTag](
super.clearDependencies()
rdds = null
}

protected def tryPrepareChildren() {
This should have an explicit Unit return type. Also, can you add a javadoc to this:
/**
 * Call the prepare method of every child that has one.
 * This is needed for reserving execution memory in advance.
 */
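For context, the suggested shape could look like the following standalone sketch. The `WithPrepare`, `FakeRDD`, and `FakeZippedRDD` names are simplified stand-ins for illustration only, not Spark's real classes:

```scala
// Simplified stand-ins for illustration; not Spark's real RDD classes.
trait WithPrepare {
  def prepare(): Unit
}

class FakeRDD(val name: String) {
  var prepared = false
}

class FakePreparedRDD(name: String) extends FakeRDD(name) with WithPrepare {
  override def prepare(): Unit = { prepared = true }
}

class FakeZippedRDD(val rdds: Seq[FakeRDD]) {
  /**
   * Call the prepare method of every child that has one.
   * This is needed for reserving execution memory in advance.
   */
  protected def tryPrepareChildren(): Unit = {
    rdds.collect { case p: WithPrepare => p.prepare() }
  }

  def compute(): Unit = {
    tryPrepareChildren() // reserve memory for every child before any child computes
  }
}
```

The point of the pattern: the zipped RDD prepares all of its children before computing any of them, so no single child can claim resources the others still need.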
@davies I think the changes here fix the problem, but I find it a little arbitrary that
This essentially forces SMJ to call the
Will something like this work?
@andrewor14 Not only SortMergeJoin has this problem; SortMergeOuterJoin also has it, and any join that depends on TungstenAggregation or TungstenSort will have this problem. I think ZippedPartitionsRDD is the only place that could introduce multiple MapPartitionsWithPreparationRDDs in the same task. Correct me if there are others.
Test build #41797 has finished for PR 8511 at commit
Test build #41803 has finished for PR 8511 at commit
Test build #1706 has finished for PR 8511 at commit
Hm, my only concern with doing the
@andrewor14 I think getNarrowAncestors will not traverse the entire lineage; it only visits the RDDs within the current stage. It is still not cheap if the RDD is a union of thousands of RDDs. We could roll back to use
Test build #41838 has finished for PR 8511 at commit
LGTM, will merge once tests pass.
I'm going to optimistically merge this since most of the tests passed.
In SMJ, the first ExternalSorter could consume all the memory before spilling; then the second cannot even acquire its first page. Before we have a better memory allocator, SMJ should call prepare() before calling any compute() of its children.

cc rxin JoshRosen

Author: Davies Liu <davies@databricks.com>

Closes #8511 from davies/smj_memory.

(cherry picked from commit 540bdee)
Signed-off-by: Reynold Xin <rxin@databricks.com>
@@ -38,12 +39,28 @@ private[spark] class MapPartitionsWithPreparationRDD[U: ClassTag, T: ClassTag, M

override def getPartitions: Array[Partition] = firstParent[T].partitions

// In certain join operations, prepare can be called on the same partition multiple times.
// In this case, we need to ensure that each call to compute gets a separate prepare argument.
private[this] var preparedArguments: ArrayBuffer[M] = new ArrayBuffer[M]
I just noticed this can be a val. We can fix this in a follow-up patch.
Will fix this in a follow-up PR.
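The queue semantics being discussed can be sketched in isolation. This is a hypothetical simplification (the `PreparedPartition` class is invented for illustration, not Spark's actual MapPartitionsWithPreparationRDD): each prepare() enqueues one argument, and each compute() dequeues its own, so repeated prepare/compute calls on the same partition in one task do not share state:

```scala
import scala.collection.mutable.ArrayBuffer

// Hypothetical simplification of the prepare/compute handshake.
class PreparedPartition[M](preparePartition: () => M) {
  // One pending entry per prepare() call; can be a val, as the review notes.
  private[this] val preparedArguments: ArrayBuffer[M] = new ArrayBuffer[M]

  def prepare(): Unit = {
    preparedArguments += preparePartition()
  }

  def compute(): M = {
    if (preparedArguments.isEmpty) {
      preparePartition() // no prepare() was called first; prepare on demand
    } else {
      preparedArguments.remove(0) // consume the argument from the matching prepare()
    }
  }
}
```

With two prepare() calls followed by three compute() calls, the first two computes consume the two queued arguments in order, and the third falls back to preparing on demand.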
Test build #41839 has finished for PR 8511 at commit
In SMJ, the first ExternalSorter could consume all the memory before spilling; then the second cannot even acquire its first page.
Before we have a better memory allocator, SMJ should call prepare() before calling any compute() of its children.
cc @rxin @JoshRosen
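The starvation scenario above can be modeled with a toy memory manager. All names here (`ToyMemoryManager`, `ToySorter`) are hypothetical and this is not Spark's memory management code; it only illustrates why reserving a page per sorter up front prevents the second sorter from being starved:

```scala
// Toy model of the starvation scenario; not Spark's memory manager.
class ToyMemoryManager(var freePages: Int) {
  // Grant up to n pages, returning how many were actually granted.
  def acquire(n: Int): Int = {
    val granted = math.min(n, freePages)
    freePages -= granted
    granted
  }
}

class ToySorter(mm: ToyMemoryManager) {
  var pages = 0
  // The "prepare()" step: reserve the first page up front.
  def reserveFirstPage(): Unit = { pages += mm.acquire(1) }
  // The "compute()" step: grab every remaining free page, as a greedy sorter would.
  def sortGreedily(): Unit = { pages += mm.acquire(mm.freePages) }
}
```

With 8 free pages and no up-front reservation, the first sorter takes all 8 and the second gets none. If both reserve their first page before either sorts, the second is guaranteed at least one page, which is what calling prepare() on all children before any compute() buys SMJ.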