Conversation

jongyoul
Member

  • Defined executorCores from "spark.mesos.executor.cores"
  • Changed the amount of mesosExecutor's cores to executorCores.
  • Added new configuration option on running-on-mesos.md
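
As a rough illustration of the change described above, here is a minimal sketch of reading the new setting from SparkConf; the helper name and the Double type are assumptions for illustration, not necessarily the merged code.

```scala
import org.apache.spark.SparkConf

// Hypothetical helper: cores reserved for the Mesos executor process itself,
// independent of spark.task.cpus, which is requested per launched task.
def mesosExecutorCpus(conf: SparkConf): Double = {
  // Before this patch the Mesos executor reused spark.task.cpus (default 1);
  // with this change it reads its own, independently tunable setting.
  conf.getDouble("spark.mesos.executor.cores", 1.0)
}
```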

@SparkQA

SparkQA commented Mar 17, 2015

Test build #28706 has finished for PR 5063 at commit 4486a0e.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@jongyoul
Member Author

@elyast I submitted a PR based on #4170. @tnachen Could you please review my PR?

Contributor

I think there is already a spark.executor.cores configuration, why not just use that?

Member Author

I thought spark.executor.cores means the number of cores per executor in YARN mode, and it's confusing for one configuration to take on a different meaning depending on which cluster manager I choose. Do you think spark.executor.cores would be reasonable and clear here? I also agree that fewer configuration options are better; I'm just trying to avoid confusion when configuring settings.

Updated per @sryza's comment.

Contributor

@jongyoul @tnachen I think spark.executor.cores could be used; obviously we would need to update the docs to explain how this setting works in the context of each cluster manager.

Contributor

spark.executor.cores means the number of cores per executor in YARN mode.

@jongyoul
Member Author

@sryza Thanks. Do you think it is reasonable to use spark.executor.cores for the Mesos executor's cores? In fine-grained mode on Mesos, each task is offered its own cores, so the executor's own core count applies only while the executor is idle. Simply put, if one task is launched on an executor, the total number of cores used is the sum of spark.executor.cores and spark.task.cpus. I think that would make the configuration confusing. What do you think? @elyast Yes, if we use 'spark.executor.cores', we should update the docs, too.

@tnachen
Contributor

tnachen commented Mar 18, 2015

Thanks for the clarification @sryza; thinking about it more, it does have a different meaning in Mesos, since it's just the CPU for the Mesos executor.

I originally wanted to reuse it since the configuration flag name implies it's scheduler agnostic, but in reality it's only used in YARN.

I think we should stick with what @jongyoul is proposing with the spark.mesos prefix.

@sryza
Contributor

sryza commented Mar 18, 2015

Are you saying that the new configuration option controls the number of cores allocated to the executor by Mesos, not for use by tasks?

Or is it that the executor can go above the configured number but never below it?

In either case, I don't think it's a good idea to make this Mesos-specific. In theory we might want to add something similar for Spark on YARN later. If it's the former, maybe something like spark.executor.frameworkCores and if it's the latter, maybe something like spark.executor.minCores.

@jongyoul
Member Author

@sryza I mean the former. I agree with your suggestion of spark.executor.frameworkCores, and I have one more question: do you think we should document this new configuration option in configuration.md rather than treating it as Mesos-specific?

@tnachen
Contributor

tnachen commented Mar 18, 2015

It is the cores allocated for the custom executor that Spark launches with Mesos.
We can call it frameworkCores, but what would this mean outside of the Mesos context?

@jongyoul
Member Author

@sryza What do you think of @tnachen's point? To avoid this confusion, I added comments noting that this configuration is used in Mesos fine-grained mode. Is that enough, or is something else needed?

@SparkQA

SparkQA commented Mar 18, 2015

Test build #28787 has finished for PR 5063 at commit da3caa6.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@elyast
Contributor

elyast commented Mar 18, 2015

We could also consider a more descriptive config option, like spark.mesos.noOfAllocatedCoresPerExecutor ...

@jongyoul
Member Author

I don't think the parameter name is the critical issue. It looks like @sryza doesn't want a new configuration parameter that applies only to a specific cluster manager, but would rather have a new configuration that is more general. I honestly don't know which is the best option. @sryza, please give me your opinion.

@sryza
Contributor

sryza commented Mar 20, 2015

I'd like to better understand the reason this is needed in Mesos fine-grained mode. We've talked about eventually adding a fine-grained mode for YARN as well. So if we think this configuration property might apply there as well, it'd be nice to make it generic. Otherwise, putting Mesos in the name seems fine.

Why do we set aside a core for the framework? Is this just a quirk of Mesos or is the idea that the executor process actually needs a full core for its non-task work?

@elyast
Contributor

elyast commented Mar 20, 2015

Hi @sryza @jongyoul,
To illustrate: let's say I have 10 nodes with 64 cores each, 10 streaming jobs are running with a 1-minute window (so every minute they will launch tasks), and each of those 10 streaming jobs runs 10 executors.

Let's say processing each batch takes 10 seconds. At the end of the day you end up with a 100-core reservation on Mesos that stays there as long as the streaming applications are running (and the executors are most probably not doing any useful work most of the time). Currently there is no way to tweak how many cores are given to the executor; you might even want to assign just 0.1 CPU for better utilization of resources.
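
A rough sketch of the arithmetic in the comment above (all numbers come from it; the variable names are only illustrative):

```scala
// 10 streaming jobs x 10 executors each, with the current 1-CPU-per-executor minimum.
val streamingJobs = 10
val executorsPerJob = 10
val coresPerMesosExecutor = 1.0

// Cores that stay reserved on Mesos while the executors mostly sit idle.
val reservedCores = streamingJobs * executorsPerJob * coresPerMesosExecutor  // 100.0

// If the executor share could be set to 0.1 CPU, the standing reservation shrinks.
val reservedWithFraction = streamingJobs * executorsPerJob * 0.1             // 10.0
```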

Let me know what you think.

@jongyoul
Member Author

@elyast That's a good explanation of why we would want to set this property. @sryza This feature exists only in Mesos mode now. How about making this property Mesos-specific for now, and changing it later if any other cluster manager uses it? I don't think it's generic now, but it is absolutely needed in Mesos. I'd suggest the property name spark.mesos.executor.frameworkCores, or spark.mesos.executor.cores also looks fine.

@tnachen
Contributor

tnachen commented Mar 21, 2015

@sryza I believe this setting is a bit unique to Mesos. I think it's better to clarify the terminology here: what do framework and executor mean for Mesos?

In Mesos, a scheduler is the process that registers with the master to receive offers and uses those offers to launch tasks through the master and eventually the slave. An executor runs on the slaves and is responsible for actually launching tasks and reporting task statuses; it is either a custom executor, which fine-grained mode uses, or the default Mesos executor, which coarse-grained mode uses. A framework refers to the overall scheduler + executor.

So in the Mesos fine-grained case, we have a custom executor that is launched only once per slave, and that running executor takes up some cpu/mem itself.

IMO calling it spark.mesos.executor.cores is fine; I wouldn't put framework in there, as framework is not the right terminology.

@SparkQA

SparkQA commented Mar 22, 2015

Test build #28959 timed out for PR 5063 at commit f6e78db after a configured wait of 120m.

@sryza
Contributor

sryza commented Mar 22, 2015

I see. So frameworkCores isn't ideal because "framework" already has a meaning in the Mesos world. Also, "executor" already has some inherent ambiguity because Mesos uses it differently than Spark. So spark.mesos.executor.cores sounds reasonable to me.

I'm still somewhat confused by the need for such a property (and this might still stem from me not understanding how the Spark Mesos integration works). If my understanding is correct, the cores included in spark.mesos.executor.cores are never made available to tasks. If that's the case, when would the Mesos executor ever need more than a single core? Does anything the executor is doing require much CPU?

@elyast
Contributor

elyast commented Mar 22, 2015

@sryza I don't think it actually needs more than a single core; the issue is that you currently cannot give it less than 1 CPU.

@jongyoul
Member Author

Jenkins, retest this please

@jongyoul
Member Author

I rebased this from master

@jongyoul
Member Author

Jenkins, retest this please

@SparkQA

SparkQA commented Mar 23, 2015

Test build #28974 has finished for PR 5063 at commit 9f160f5.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA

SparkQA commented Mar 23, 2015

Test build #28971 has finished for PR 5063 at commit f6e78db.

  • This patch passes all tests.
  • This patch does not merge cleanly.
  • This patch adds no public classes.

@sryza
Contributor

sryza commented Mar 23, 2015

@elyast can you explain what you mean in a little more detail? Are you saying that the patch somehow enables users to request fractional cores?

@elyast
Contributor

elyast commented Mar 23, 2015

@sryza you can request a fraction of a CPU from Mesos; however, I hadn't realized that we have the wrong type in this patch. We should change it to Double instead of Int.

@tnachen
Contributor

tnachen commented Mar 23, 2015

@sryza When creating a Mesos task, one usually defines the resources required for the execution of the task and the resources required to run the Mesos executor. Again, the executor's role is to initiate execution of the task and report task statuses, but it can do anything else if it's a custom executor provided by the user. (You can skip defining the executor, in which case Mesos provides a default one and adds a default resource padding for it.)

In Spark fine-grained mode we do have a custom executor, org.apache.spark.executor.MesosExecutorBackend, and the cores assigned here are just for running this executor alone, which runs once per slave per app (it can run multiple Spark tasks).
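
To make the split between task resources and executor resources concrete, here is a hedged sketch using the Mesos protobuf API; the ids, the 0.5 value, and the elided launch command are placeholders for illustration, not what this PR does.

```scala
import org.apache.mesos.Protos._

// Build a scalar resource such as "cpus" -> 0.5.
def scalarResource(name: String, amount: Double): Resource =
  Resource.newBuilder()
    .setName(name)
    .setType(Value.Type.SCALAR)
    .setScalar(Value.Scalar.newBuilder().setValue(amount))
    .build()

val mesosExecutorCores = 0.5  // illustrative value for spark.mesos.executor.cores

// Resources attached to the ExecutorInfo belong to the long-lived executor process,
// separate from the resources requested by each Mesos task it runs.
val executorInfo = ExecutorInfo.newBuilder()
  .setExecutorId(ExecutorID.newBuilder().setValue("spark-executor-1"))
  .setCommand(CommandInfo.newBuilder().setValue("<launch command elided>"))
  .addResources(scalarResource("cpus", mesosExecutorCores))
  .build()
```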

@tnachen
Contributor

tnachen commented Mar 23, 2015

Fractional is definitely supported, since it's just cpu shares in the end. We should make it a double.

@jongyoul
Member Author

@tnachen I see, but it's a rather big change, because the other schedulers don't treat totalCores as a fractional value. We would have to change org.apache.spark.scheduler.WorkerOffer, which in turn changes TaskSchedulerImpl, and that class assumes cpus is an integer value. I don't think it's a good idea for now.
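
For reference, a simplified sketch of the integer assumption being described; WorkerOffer's real definition lives in org.apache.spark.scheduler, and the rounding shown here is only illustrative.

```scala
// Simplified stand-in for org.apache.spark.scheduler.WorkerOffer: cores is an Int,
// so a fractional Mesos cpu share cannot be passed through without rounding.
case class WorkerOffer(executorId: String, host: String, cores: Int)

val mesosCpusAvailable = 2.5  // a value a Mesos offer can legitimately contain
val offer = WorkerOffer("exec-1", "host-1", mesosCpusAvailable.toInt)  // loses the 0.5
```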

@andrewor14
Contributor

@sryza @tnachen just so I understand, doesn't spark.mesos.executor.cores refer to the number of cores used by the Spark executor launched to run a given task? Why is it necessary to differentiate between the number of cores used by the executor for running the task, and the number of cores used by it to do other things? What are the differences in semantics between this and spark.executor.cores exactly?

@sryza
Contributor

sryza commented Apr 14, 2015

My understanding based on the discussion here is that spark.mesos.executor.cores is the number of cores reserved by an executor that are not for use in running tasks. So if spark.mesos.executor.cores is 1, spark.task.cpus is 2, and 3 tasks are running, then a total of 7 cores are being occupied. The primary use case for setting it to a number other than 1 is that Mesos allows values smaller than 1. So, when running multiple executors per node, someone might set it to .1 in order to avoid sitting on a bunch of the node's cores.
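
A minimal sketch of that accounting, using the numbers from the paragraph above:

```scala
val mesosExecutorCores = 1.0  // spark.mesos.executor.cores
val taskCpus = 2              // spark.task.cpus
val runningTasks = 3

// Cores occupied on the node for this application's executor and its running tasks.
val occupiedCores = mesosExecutorCores + taskCpus * runningTasks  // 7.0
```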

Did you look at the documentation for the new property? If this wasn't clear, then we should probably update the doc with a better explanation or link to relevant Mesos doc.

@tnachen
Contributor

tnachen commented Apr 14, 2015

What @sryza said is correct. spark.mesos.executor.cores is different from spark.executor.cores, as it is only the cpu cores used for the Mesos executor, which is used to launch Spark executors; those executors are tied to tasks and have their own cpu allocation.

Contributor

This variable should be renamed mesosExecutorCores. In Spark, whenever we see the word "executor" we automatically think of Spark executors. Just having executorCores here is somewhat misleading.

Also, I know the rest of this file doesn't do this, but can you make this private[mesos] so it's clear that it's only read within this package?
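
A hedged sketch of what that suggestion could look like; the enclosing class name is made up for illustration, and the package declaration is there only so private[mesos] compiles.

```scala
package org.apache.spark.scheduler.cluster.mesos

import org.apache.spark.SparkConf

class MesosBackendSketch(conf: SparkConf) {
  // "mesos" in the name signals that these are the Mesos executor's own cores,
  // not a Spark executor's; private[mesos] keeps the field package-local.
  private[mesos] val mesosExecutorCores: Double =
    conf.getDouble("spark.mesos.executor.cores", 1.0)
}
```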

@andrewor14
Contributor

I talked to @sryza and @tnachen offline about the potential sources of confusion here. It seems that this code used to mistakenly use spark.task.cpus as the number of cores to give Mesos executors, which is incorrect but happens to be fine because spark.task.cpus defaults to 1.

I left a more detailed comment about why the name spark.mesos.executor.cores introduces another potential source of confusion. In general, when making changes in the Spark on Mesos part of the code base, we should be explicit about which kind of task and executor we are referring to, since these terms unfortunately have overloaded meanings at the intersection of these two projects. I have filed a JIRA to eventually amend all such references: https://issues.apache.org/jira/browse/SPARK-6920.

@jongyoul The intended change in behavior here LGTM. Once you address the wording / naming comments I left I will go ahead and merge this.

@jongyoul
Member Author

@andrewor14 Thanks for the overall review. I'll address the issues you raised.

jongyoul added 10 commits April 16, 2015 22:09
…ne-grained" mode

- Defined executorCores from "spark.mesos.executor.cores"
- Changed the amount of mesosExecutor's cores to executorCores.
- Added new configuration option on running-on-mesos.md
…ne-grained" mode

- Changed configuration name and description from "spark.mesos.executor.cores" to "spark.executor.frameworkCores"
…esos "fine-grained" mode"

This reverts commit da3caa6a062d7dd59875b3f8b83febe0c95f5c1e.
…eir classes

- Change available cpu resources to an integer value because WorkerOffer supports cpus as an integer value
…eir classes

- Fixed Mesos*Suite for supporting integer WorkerOffers
- Fixed Documentation
@jongyoul
Member Author

I've rebased it onto the current master first.

…ne-grained" mode

- Fixed docs
- Changed configuration name
- Made mesosExecutorCores private
@SparkQA

SparkQA commented Apr 16, 2015

Test build #30419 has finished for PR 5063 at commit 9238d6e.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.
  • This patch does not change any dependencies.

@SparkQA

SparkQA commented Apr 16, 2015

Test build #30418 has finished for PR 5063 at commit 2d41241.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.
  • This patch does not change any dependencies.

@jongyoul
Member Author

Jenkins, retest this please.

@SparkQA

SparkQA commented Apr 17, 2015

Test build #30448 has finished for PR 5063 at commit 9238d6e.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.
  • This patch does not change any dependencies.

@jongyoul
Member Author

@andrewor14 I've addressed the issues you raised. Please review and merge this.

@andrewor14
Contributor

LGTM. I'm merging this into master, finally. Thanks everyone.

asfgit closed this in 6fbeb82 on Apr 18, 2015