[SPARK-9747] [SQL] Avoid starving an unsafe operator in aggregation #8038

Closed
wants to merge 14 commits into from

andrewor14
Contributor

This is the sister patch to #8011, but for aggregation.

In a nutshell: create the TungstenAggregationIterator before computing the parent partition. Internally this creates a BytesToBytesMap which acquires a page in the constructor as of this patch. This ensures that the aggregation operator is not starved since we reserve at least 1 page in advance.
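
To illustrate the ordering described above, here is a minimal, self-contained sketch. It does not use Spark's classes: `PagePool`, `EagerAggBuffer`, and `StarvationDemo` are hypothetical stand-ins for the memory manager, a map that reserves one page in its constructor, and the operator that builds it. It only shows why reserving before the parent is computed avoids starvation, under the assumption that the parent may consume every remaining page.

```java
// Hypothetical stand-in for a shared, fixed-size pool of memory pages.
final class PagePool {
    private int freePages;

    PagePool(int freePages) {
        this.freePages = freePages;
    }

    // Returns true and takes one page if any is left, false otherwise.
    boolean tryAcquire() {
        if (freePages == 0) {
            return false;
        }
        freePages--;
        return true;
    }

    void release() {
        freePages++;
    }

    int freePages() {
        return freePages;
    }
}

// Hypothetical aggregation buffer that reserves one page in its constructor.
final class EagerAggBuffer {
    private final PagePool pool;

    EagerAggBuffer(PagePool pool) {
        // Fail fast if not even a single page can be reserved.
        if (!pool.tryAcquire()) {
            throw new IllegalStateException("unable to acquire memory");
        }
        this.pool = pool;
    }

    void free() {
        pool.release();
    }
}

final class StarvationDemo {
    // Simulates an upstream operator that grabs every remaining page.
    static int greedyParent(PagePool pool) {
        int taken = 0;
        while (pool.tryAcquire()) {
            taken++;
        }
        return taken;
    }

    // Build the buffer first, then compute the parent: the buffer keeps its page.
    static int reserveThenCompute(PagePool pool) {
        EagerAggBuffer buffer = new EagerAggBuffer(pool);
        int takenByParent = greedyParent(pool);
        buffer.free();
        return takenByParent;
    }

    // Compute the parent first: the buffer finds the pool empty and cannot be built.
    static boolean computeThenReserve(PagePool pool) {
        greedyParent(pool);
        try {
            new EagerAggBuffer(pool);
            return true;
        } catch (IllegalStateException e) {
            return false;
        }
    }
}
```

With a pool of four pages, `reserveThenCompute` leaves three pages for the parent and the buffer still gets its one; `computeThenReserve` returns `false` because the parent took everything first.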

@rxin @yhuai

@andrewor14 andrewor14 changed the title [SPARK-9747] Avoid starving an unsafe operator in aggregation [SPARK-9747] [SQL] Avoid starving an unsafe operator in aggregation Aug 7, 2015
resultExpressions,
newMutableProjection,
child.output,
testFallbackStartsAt)
Contributor

What will happen if there is no memory space left to reserve?

Contributor Author

We'll fail fast with an "unable to acquire memory" exception.

…emory-agg

Conflicts:
	core/src/test/java/org/apache/spark/unsafe/map/AbstractBytesToBytesMapSuite.java
	sql/core/src/main/scala/org/apache/spark/sql/execution/aggregate/TungstenAggregationIterator.scala
@SparkQA

SparkQA commented Aug 7, 2015

Test build #40190 has finished for PR 8038 at commit ca1b44c.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA

SparkQA commented Aug 8, 2015

Test build #40195 has finished for PR 8038 at commit 355a9bd.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
    • class JoinedRow extends InternalRow

@rxin
Contributor

rxin commented Aug 9, 2015

@andrewor14 can you bring this up to date?

Andrew Or added 2 commits August 10, 2015 13:08
…emory-agg

Conflicts:
	sql/core/src/main/scala/org/apache/spark/sql/execution/aggregate/TungstenAggregate.scala
	sql/core/src/main/scala/org/apache/spark/sql/execution/aggregate/TungstenAggregationIterator.scala
In TungstenAggregate, we fall back to sort-based aggregation if
the hash-based approach cannot request more memory. To do this,
we create a new sorter from an existing unsafe map destructively.
Because this is largely in place, we don't need to reserve a page
in the sorter's constructor.
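
A rough sketch of the fallback idea in that commit message, using only the Java standard library. `FallbackSum` and its `maxGroups` parameter are hypothetical: `maxGroups` stands in for "the hash map cannot get more memory". Unlike the conversion the commit describes as largely in place, this sketch copies the partial aggregates into a list for simplicity.

```java
import java.util.AbstractMap;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class FallbackSum {
    // Sums values per key. Starts hash-based and switches to sort-based
    // aggregation once the map would need more than maxGroups entries.
    static Map<String, Long> sumByKey(List<Map.Entry<String, Long>> rows, int maxGroups) {
        Map<String, Long> hash = new HashMap<>();
        int next = 0;
        for (; next < rows.size(); next++) {
            Map.Entry<String, Long> row = rows.get(next);
            boolean newGroup = !hash.containsKey(row.getKey());
            if (newGroup && hash.size() >= maxGroups) {
                break; // the map cannot grow any further: fall back
            }
            hash.merge(row.getKey(), row.getValue(), Long::sum);
        }
        if (next == rows.size()) {
            return hash; // everything fit in the hash map
        }

        // Fallback: move the partial aggregates out of the map (destroying it),
        // append the unprocessed rows, and sort everything by key.
        List<Map.Entry<String, Long>> sorter = new ArrayList<>();
        for (Map.Entry<String, Long> e : hash.entrySet()) {
            sorter.add(new AbstractMap.SimpleEntry<>(e.getKey(), e.getValue()));
        }
        hash.clear();
        sorter.addAll(rows.subList(next, rows.size()));
        sorter.sort(Map.Entry.comparingByKey());

        // Merge adjacent entries that share a key.
        Map<String, Long> result = new LinkedHashMap<>();
        String currentKey = null;
        long runningSum = 0L;
        for (Map.Entry<String, Long> e : sorter) {
            if (currentKey != null && !currentKey.equals(e.getKey())) {
                result.put(currentKey, runningSum);
                runningSum = 0L;
            }
            currentKey = e.getKey();
            runningSum += e.getValue();
        }
        if (currentKey != null) {
            result.put(currentKey, runningSum);
        }
        return result;
    }
}
```

Both paths produce the same sums; only the order of the returned entries differs, since the sort-based path emits keys in sorted order.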
@yhuai
Contributor

yhuai commented Aug 10, 2015

test this please.

aggregationIterator.free()
if (groupingExpressions.isEmpty) {
// This is a grouped aggregate and the input iterator is empty,
// so return an empty iterator.
Contributor

It seems this comment belongs in the else block. This branch is taken when there are no input rows and no grouping expressions.

Contributor Author

good catch.
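
For readers following the thread, the distinction between the two branches can be sketched as follows. This is not the patch's code: `EmptyInputAgg` is a hypothetical helper that models a `COUNT(*)` over an empty input, relying on the usual SQL behavior that an aggregate without grouping expressions still yields one row, while a grouped aggregate yields none.

```java
import java.util.Collections;
import java.util.List;

final class EmptyInputAgg {
    // Hypothetical helper: the rows produced by COUNT(*) when the input is empty.
    static List<Long> countOverEmptyInput(boolean hasGroupingExpressions) {
        if (!hasGroupingExpressions) {
            // No grouping expressions: emit a single row holding the initial value.
            return Collections.singletonList(0L);
        } else {
            // Grouped aggregate over empty input: there are no groups, so no rows.
            return Collections.emptyList();
        }
    }
}
```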

@yhuai
Contributor

yhuai commented Aug 10, 2015

LGTM. Let's wait for jenkins.

@SparkQA

SparkQA commented Aug 10, 2015

Test build #40328 has finished for PR 8038 at commit b4d3633.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@yhuai
Contributor

yhuai commented Aug 10, 2015

test this please.

@SparkQA

SparkQA commented Aug 10, 2015

Test build #40317 timed out for PR 8038 at commit 4d416d0 after a configured wait of 175m.

@yhuai
Contributor

yhuai commented Aug 10, 2015

test this please.

@SparkQA

SparkQA commented Aug 11, 2015

Test build #40334 timed out for PR 8038 at commit b4d3633 after a configured wait of 175m.


test("memory acquired on construction") {
// Needed for various things in SparkEnv
sc = new SparkContext("local", "testing")
Contributor

I suspect the SparkContext we create here is interfering with the tests that follow. How about we comment it out and try the PR builder?

Contributor

Actually, is it possible to create the taskMemoryManager and shuffleMemoryManager without creating a new SparkContext?

Contributor Author

yeah I can figure something out

Contributor Author

(this is why we shouldn't have singleton SQLContexts!)

@SparkQA

SparkQA commented Aug 11, 2015

Test build #1428 timed out for PR 8038 at commit b4d3633 after a configured wait of 175m.

@SparkQA

SparkQA commented Aug 11, 2015

Test build #40472 timed out for PR 8038 at commit b10a4f3 after a configured wait of 175m.

…emory-agg

Conflicts:
	sql/core/src/main/scala/org/apache/spark/sql/execution/aggregate/TungstenAggregate.scala
	sql/core/src/main/scala/org/apache/spark/sql/execution/aggregate/TungstenAggregationIterator.scala
@andrewor14
Contributor Author

retest this please

@SparkQA

SparkQA commented Aug 11, 2015

Test build #1454 has finished for PR 8038 at commit 94ca5de.

  • This patch fails Scala style tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@andrewor14
Contributor Author

retest this please

@yhuai
Contributor

yhuai commented Aug 11, 2015

I just triggered another build as a backup.

@SparkQA

SparkQA commented Aug 11, 2015

Test build #40525 has finished for PR 8038 at commit 94ca5de.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@rxin
Contributor

rxin commented Aug 11, 2015

compilation failed?

@yhuai
Contributor

yhuai commented Aug 11, 2015

[error] /home/jenkins/workspace/SparkPullRequestBuilder@2/sql/core/src/test/scala/org/apache/spark/sql/execution/aggregate/TungstenAggregationIteratorSuite.scala:42: not enough arguments for constructor TungstenAggregationIterator: (groupingExpressions: Seq[org.apache.spark.sql.catalyst.expressions.NamedExpression], nonCompleteAggregateExpressions: Seq[org.apache.spark.sql.catalyst.expressions.aggregate.AggregateExpression2], completeAggregateExpressions: Seq[org.apache.spark.sql.catalyst.expressions.aggregate.AggregateExpression2], initialInputBufferOffset: Int, resultExpressions: Seq[org.apache.spark.sql.catalyst.expressions.NamedExpression], newMutableProjection: (Seq[org.apache.spark.sql.catalyst.expressions.Expression], Seq[org.apache.spark.sql.catalyst.expressions.Attribute]) => () => org.apache.spark.sql.catalyst.expressions.MutableProjection, originalInputAttributes: Seq[org.apache.spark.sql.catalyst.expressions.Attribute], testFallbackStartsAt: Option[Int], numInputRows: org.apache.spark.sql.execution.metric.LongSQLMetric, numOutputRows: org.apache.spark.sql.execution.metric.LongSQLMetric)org.apache.spark.sql.execution.aggregate.TungstenAggregationIterator.
[error] Unspecified value parameters numInputRows, numOutputRows.
[error]       iter = new TungstenAggregationIterator(
[error]              ^
[info] Done updating.
[info] Compiling 2 Scala sources to /home/jenkins/workspace/SparkPullRequestBuilder@2/repl/target/scala-2.10/test-classes...
[error] one error found

@SparkQA

SparkQA commented Aug 11, 2015

Test build #1463 has finished for PR 8038 at commit 94ca5de.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@andrewor14
Contributor Author

OK, I fixed it. Jenkins, retest this please?

@SparkQA

SparkQA commented Aug 12, 2015

Test build #1470 has finished for PR 8038 at commit d4dc9ca.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
    • class SQLTransformer (override val uid: String) extends Transformer
    • case class EqualNullSafe(attribute: String, value: Any) extends Filter

@yhuai
Contributor

yhuai commented Aug 12, 2015

[error] /home/jenkins/workspace/NewSparkPullRequestBuilder/sql/core/src/test/scala/org/apache/spark/sql/execution/aggregate/TungstenAggregationIteratorSuite.scala:43: constructor LongSQLMetric in class LongSQLMetric cannot be accessed in class TungstenAggregationIteratorSuite
[error]       val dummyAccum = new LongSQLMetric("dummy")
[error]                        ^
[info] Compiling 2 Scala sources to /home/jenkins/workspace/NewSparkPullRequestBuilder/repl/target/scala-2.10/test-classes...
[error] one error found

@SparkQA

SparkQA commented Aug 12, 2015

Test build #40597 timed out for PR 8038 at commit 7ebf6b9 after a configured wait of 175m.

@SparkQA

SparkQA commented Aug 12, 2015

Test build #1477 timed out for PR 8038 at commit 7ebf6b9 after a configured wait of 175m.

@SparkQA

SparkQA commented Aug 12, 2015

Test build #1478 timed out for PR 8038 at commit 7ebf6b9 after a configured wait of 175m.

@andrewor14
Contributor Author

So weird... this finished running all python tests successfully too but still timed out somehow?

@rxin
Contributor

rxin commented Aug 12, 2015

I'm merging this, since the test timeout is most likely caused by a separate issue (all the tests did run).

asfgit pushed a commit that referenced this pull request Aug 12, 2015
This is the sister patch to #8011, but for aggregation.

In a nutshell: create the `TungstenAggregationIterator` before computing the parent partition. Internally this creates a `BytesToBytesMap` which acquires a page in the constructor as of this patch. This ensures that the aggregation operator is not starved since we reserve at least 1 page in advance.

rxin yhuai

Author: Andrew Or <andrew@databricks.com>

Closes #8038 from andrewor14/unsafe-starve-memory-agg.

(cherry picked from commit e011079)
Signed-off-by: Reynold Xin <rxin@databricks.com>
@asfgit asfgit closed this in e011079 Aug 12, 2015
@andrewor14 andrewor14 deleted the unsafe-starve-memory-agg branch August 12, 2015 17:45
@SparkQA

SparkQA commented Aug 12, 2015

Test build #1495 has finished for PR 8038 at commit 7ebf6b9.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

CodingCat pushed a commit to CodingCat/spark that referenced this pull request Aug 17, 2015
This is the sister patch to apache#8011, but for aggregation.

In a nutshell: create the `TungstenAggregationIterator` before computing the parent partition. Internally this creates a `BytesToBytesMap` which acquires a page in the constructor as of this patch. This ensures that the aggregation operator is not starved since we reserve at least 1 page in advance.

rxin yhuai

Author: Andrew Or <andrew@databricks.com>

Closes apache#8038 from andrewor14/unsafe-starve-memory-agg.
asfgit pushed a commit that referenced this pull request Oct 30, 2015
Since we do not need to preserve a page before calling compute(), MapPartitionsWithPreparationRDD is not needed anymore.

This PR basically reverts #8543, #8511, #8038, and #8011.

Author: Davies Liu <davies@databricks.com>

Closes #9381 from davies/remove_prepare2.