[SPARK-19605][DStream] Fail it if existing resource is not enough to run streaming job #16936
Conversation
Test build #72926 has finished for PR 16936 at commit
```scala
      ssc.conf.get(DYN_ALLOCATION_MAX_EXECUTORS)
    } else {
      val targetNumExecutors =
        sys.env.get("SPARK_EXECUTOR_INSTANCES").map(_.toInt).getOrElse(2)
```
here "2" refers to "YarnSparkHadoopUtil.DEFAULT_NUMBER_EXECUTORS"
```scala
    .intConf
    .createWithDefault(1)

  private[spark] val EXECUTOR_MEMORY_OVERHEAD =
    ConfigBuilder("spark.yarn.executor.memoryOverhead")
```
This causes a multi-definition error in ApplicationMaster.scala; remove it here since we add the same entry in core.
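A minimal, self-contained sketch of the conflict being described (hypothetical object names, not the actual Spark sources): once the entry is declared in core, keeping a second definition of the same name in the YARN config object makes the reference ambiguous wherever both are wildcard-imported, as in ApplicationMaster.scala.

```scala
object CoreConfig {
  // the entry now lives in core
  val EXECUTOR_MEMORY_OVERHEAD = "spark.yarn.executor.memoryOverhead"
}

object YarnConfig {
  // val EXECUTOR_MEMORY_OVERHEAD = ...  // duplicate removed; keeping it would make
  //                                     // the reference below ambiguous
}

object ApplicationMasterSketch {
  import CoreConfig._
  import YarnConfig._
  val overheadKey = EXECUTOR_MEMORY_OVERHEAD // resolves cleanly to the core entry
}
```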
```diff
   def test() {
-    val conf = new SparkConf().setMaster("local").setAppName("CreationSite test")
+    val conf = new SparkConf().setMaster("local[2]").setAppName("CreationSite test")
     val ssc = new StreamingContext(conf, Milliseconds(100))
```
The unit test fails here with master "local", so it is changed to "local[2]".
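For context, a brief sketch of why a single-threaded local master is not enough for a receiver-based test; the socket source below is only an illustrative input, not part of the PR:

```scala
import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Milliseconds, StreamingContext}

// "local[2]" gives one core for the receiver and one for batch processing;
// with plain "local" the receiver pins the only core and no batch can run.
val conf = new SparkConf().setMaster("local[2]").setAppName("CreationSite test")
val ssc = new StreamingContext(conf, Milliseconds(100))
val lines = ssc.socketTextStream("localhost", 9999) // receiver occupies one core
lines.print()                                        // output needs the other core
```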
Test build #72928 has finished for PR 16936 at commit
(There is no detail in the JIRA you point to.) I think we already discussed this in part? This duplicates some of the logic about how cluster resource management works. Does streaming really not work if not enough receivers can be scheduled? That seems like a condition to check somewhere else, e.g. log a warning that a batch can't operate or can't start because not all receivers were scheduled. This could happen even when enough resources are theoretically available, just not available to the app.
That's not what I want to express. What I mean is that the stream output cannot operate if there are not enough resources, i.e. the existing resources are only just enough, or not even enough, to schedule the receivers.
Hm, it just seems like the wrong approach to externally estimate whether, in theory, it won't schedule. It is certainly a problem if streaming doesn't work, though users would already notice that. The error check or message could be more explicit, but this seems like something the streaming machinery itself should know and warn about.
Let's ask @zsxwing for suggestions.
I agree with @srowen. Streaming should not need to know the details of each deploy mode, and when someone changes other modules, they won't know about this code inside streaming. I would like to see a cleaner solution. In addition, when there are not enough resources in a cluster, the user will still not be able to run the streaming job even if they set a large core number. Hence, users should always be aware of this limitation of Streaming receivers, and it does not seem worth fixing this issue.
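For reference, a hedged sketch of the kind of check being debated in this thread; the helper name and the way usable cores would be estimated are assumptions, not the actual patch:

```scala
// Before starting the streaming job, compare the cores the application can use
// against the number of receivers, and fail fast when no core would be left
// for batch processing. Estimating `totalUsableCores` per cluster manager is
// exactly the logic duplication the reviewers object to.
def validateReceiverResources(totalUsableCores: Int, numReceivers: Int): Unit = {
  require(
    totalUsableCores > numReceivers,
    s"Only $totalUsableCores core(s) available for $numReceivers receiver(s); " +
      "receivers would occupy every core and no batch could be processed.")
}
```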
What changes were proposed in this pull request?
For more detailed discussion, please review:
How was this patch tested?
Added a new unit test.