[SPARK-19522] Fix executor memory in local-cluster mode #16975

andrewor14 · 2017-02-17T16:53:34Z

What changes were proposed in this pull request?

bin/spark-shell --master local-cluster[2,1,2048]

is supposed to launch 2 executors, each with 2GB of memory. However, when I ran this on master, I only got executors with 1GB of memory. This patch fixes the problem.
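For readers unfamiliar with the master string, here is a minimal sketch of how `local-cluster[2,1,2048]` can be split into its three fields. It assumes the fields mean number of workers, cores per worker, and memory per worker in MB, which matches the reading given later in this thread that the last value is the worker's memory. The object `LocalClusterMaster`, the case class `Spec`, and the regex are hypothetical and written for illustration; they are not Spark's own code.

```scala
// Hypothetical helper for illustration only; not part of Spark.
object LocalClusterMaster {
  // Matches master strings such as local-cluster[2,1,2048],
  // allowing optional whitespace around each number.
  private val LocalCluster =
    """local-cluster\[\s*([0-9]+)\s*,\s*([0-9]+)\s*,\s*([0-9]+)\s*\]""".r

  // Assumed meaning of the three fields (see the lead-in above).
  final case class Spec(numWorkers: Int, coresPerWorker: Int, memoryPerWorkerMb: Int)

  // Returns the parsed fields, or None if the master is not a local-cluster string.
  def parse(master: String): Option[Spec] = master match {
    case LocalCluster(n, c, m) => Some(Spec(n.toInt, c.toInt, m.toInt))
    case _                     => None
  }
}
```

Under these assumptions, `LocalClusterMaster.parse("local-cluster[2,1,2048]")` yields a `Spec` whose `memoryPerWorkerMb` is 2048. The bug described above is that this value never reached the executor memory setting.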

How was this patch tested?

SparkSubmitSuite, manual tests.


SparkQA · 2017-02-17T19:38:10Z

Test build #73058 has finished for PR 16975 at commit b1a13dc.

This patch passes all tests.
This patch merges cleanly.
This patch adds no public classes.

vanzin · 2017-02-17T21:29:49Z

core/src/main/scala/org/apache/spark/deploy/SparkSubmit.scala

@@ -466,7 +466,7 @@ object SparkSubmit extends CommandLineUtils {
      // Other options
      OptionAssigner(args.executorCores, STANDALONE | YARN, ALL_DEPLOY_MODES,
        sysProp = "spark.executor.cores"),
-      OptionAssigner(args.executorMemory, STANDALONE | MESOS | YARN, ALL_DEPLOY_MODES,
+      OptionAssigner(args.executorMemory, ALL_CLUSTER_MGRS, ALL_DEPLOY_MODES,


Is the change in SparkContext needed? Seems like this should be all that's needed.

As far as I understand, the last value in the local-cluster master is the amount of memory the worker has available; you may, for whatever reason, want to run executors with less than that, which your change doesn't seem to allow.

If this were the only change, then specifying local-cluster[2,1,2048] wouldn't actually do anything, because we're not setting spark.executor.memory=2048mb anywhere. You could do --master local-cluster[2,1,2048] --conf spark.executor.memory=2048mb, but that's cumbersome, and now there are two ways to set the executor memory.

> You may, for whatever reason, want to run executors with less than that, which your change doesn't seem to allow.

Yeah, I thought about this long and hard but I just couldn't come up with a case where you would possibly want the worker size to be different from executor size in local-cluster mode. If you want to launch 5 workers (2GB), each with 2 executors (1GB), then you might as well just launch 10 executors (1GB) or run real standalone mode locally. I think it's better to fix the out-of-the-box case than to try to cover all potentially non-existent corner cases.

Well, it would make local-cluster[] work like any other master, where you have to explicitly set the executor memory. I understand the desire to simplify things, but this is doing it at the cost of being inconsistent with other cluster managers.

(e.g. the same command line with a different master would behave differently - you'd fall back to having 1g of memory for executors instead of whatever was defined in the local-cluster string.)

(Anyway, either way is probably fine, so go with your judgement. It just seems like a lot of code in SparkContext just to support that use case.)

The inconsistency is already inherent in the parameters of local-cluster[], so I'm not introducing it with this change. I personally think it's a really bad interface to force the user to set executor memory in two different places and to require that the two values match.

Also, we're talking about a net addition of 7 LOC in SparkContext.scala, about half of which are comments and warning logs. It's really not that much code.

vanzin · 2017-02-21T22:39:53Z

core/src/main/scala/org/apache/spark/SparkContext.scala

+      // In other modes, use the configured memory if it exists
+      master match {
+        case SparkMasterRegex.LOCAL_CLUSTER_REGEX(_, _, em) =>
+          if (configuredMemory.isDefined) {


Could you at least change this so that spark.executor.memory takes precedence if it's set? Then both use cases are possible. (Maybe someone is crazy enough to be trying dynamic allocation in local-cluster mode, or something else...)
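The precedence suggested here can be sketched as follows. This is only an illustration under stated assumptions, not the code in this PR: `ExecutorMemoryResolver`, `resolve`, and the regex are hypothetical names, `configuredMemoryMb` stands in for an already-parsed spark.executor.memory value, and the 1024 MB default reflects the 1g fallback mentioned earlier in the thread.

```scala
// Hypothetical sketch of the suggested precedence; not the PR's actual code.
object ExecutorMemoryResolver {
  // Matches master strings such as local-cluster[2,1,2048].
  private val LocalCluster =
    """local-cluster\[\s*([0-9]+)\s*,\s*([0-9]+)\s*,\s*([0-9]+)\s*\]""".r

  // Assumed default, mirroring the 1g fallback discussed in the thread.
  val DefaultExecutorMemoryMb: Int = 1024

  // 1. An explicitly configured executor memory always wins.
  // 2. Otherwise, in local-cluster mode, use the memory from the master string.
  // 3. Otherwise, fall back to the default.
  def resolve(master: String, configuredMemoryMb: Option[Int]): Int =
    configuredMemoryMb.getOrElse {
      master match {
        case LocalCluster(_, _, em) => em.toInt
        case _                      => DefaultExecutorMemoryMb
      }
    }
}
```

With this ordering, `local-cluster[2,1,2048]` alone gives 2048 MB executors out of the box, while setting spark.executor.memory explicitly still lets someone run executors smaller than the worker, so both use cases remain possible.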

HyukjinKwon · 2017-05-11T14:30:46Z

Hi @andrewor14, is this still active?

andrewor14 added 3 commits February 8, 2017 18:27

Fix propagation of executor memory in local-cluster mode

db3773c

Log warning if memory is explicitly set

1a5bdfe

Merge branch 'master' of github.com:apache/spark into local-cluster-e…

b1a13dc

…xecutor-mem

vanzin reviewed Feb 17, 2017


vanzin reviewed Feb 21, 2017


HyukjinKwon mentioned this pull request May 17, 2017

[INFRA] Close stale PRs #18017

Closed

asfgit closed this in 5d2750a May 18, 2017
