[SPARK-15917][CORE] Added support for number of executors in Standalone [WIP] #15405
Conversation
    val numExecutorsLaunched = app.executors.size
    // Check to see if we managed to launch the requested number of executors
    if(numUsable != 0 && numExecutorsLaunched != app.executorLimit &&
        numExecutorsScheduled != app.executorLimit) {
How are `numExecutorsLaunched` and `numExecutorsScheduled` related to each other? Also, here we probably want to do an inequality check just in case. Also style: need a space after `if`.
Another thing is, how noisy is this? Do we log this if dynamic allocation is turned on (we shouldn't)?
`numExecutorsLaunched` corresponds to the actual number of executors that have been launched so far (literally, those registered in the executors list in `ApplicationInfo`), whereas `numExecutorsScheduled` corresponds to the number of executors that have been scheduled/allocated by `scheduleExecutorsOnWorkers`. The check is needed because `scheduleExecutorsOnWorkers` is called multiple times while setting up the executors, and without it we would repeatedly log the same message with incorrect information (such as "0 executors launched" even though the executors were launched previously).
Tell me if that doesn't make sense; I went through a lot of trial and error before arriving at this condition.
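To make the interplay between the two counters concrete, here is a minimal, self-contained sketch of the warn-once guard. The names mirror the diff but this is not the actual `Master.scala` code, and it folds in the reviewer's suggestion of an inequality check instead of `!=`:

```scala
// Standalone model of the guard: warn only while fewer executors than
// requested have been launched/scheduled, and only if any worker was usable.
object WarnOnceSketch {
  def shouldWarn(numUsable: Int, launched: Int, scheduled: Int, limit: Int): Boolean =
    numUsable != 0 && launched < limit && scheduled < limit

  def main(args: Array[String]): Unit = {
    // First scheduling pass: nothing launched yet, only 1 of 3 scheduled -> warn.
    assert(shouldWarn(numUsable = 2, launched = 0, scheduled = 1, limit = 3))
    // Later pass: executors already launched up to the limit -> no repeat warning.
    assert(!shouldWarn(numUsable = 2, launched = 3, scheduled = 3, limit = 3))
    // No usable workers at all -> nothing to warn about on this pass.
    assert(!shouldWarn(numUsable = 0, launched = 0, scheduled = 0, limit = 3))
  }
}
```

Checking both counters against the limit is what prevents the same warning from firing on every scheduling pass.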
Regarding the noise produced, it should be minimal: when it isn't possible to launch the requested number of executors, just one warning is logged.
With dynamic allocation on, a message is logged when the initial number of executors is specified and couldn't be satisfied. I don't think that's much of a problem, since there currently isn't any warning at all for that case, but I can add a check to suppress the warning when dynamic allocation is enabled if you prefer.
Thanks for working on this. It's great to see how small the patch turned out to be!
Test build #66675 has finished for PR 15405 at commit
Test build #66681 has finished for PR 15405 at commit
Test build #3323 has finished for PR 15405 at commit
Are you still working on this? @JonathanTaws
Hi Jiang,
I've put this on hold as I wasn't getting updates from the admins on the next steps for this. I'd definitely like to move forward with this and contribute it to the codebase, as I believe it's still relevant nowadays.
Let me know!
I see this is WIP, when do you think it will be ready for review? Thanks!
My bad, I should have removed it. I'll check that it's working as expected this weekend and we can move forward on it!
ping @JonathanTaws Please let me know once this PR is ready for review, thanks!
@jiang Quite busy at the moment, I will take care of it as soon as possible. I'll ping you once it's done.
What changes were proposed in this pull request?
Currently in standalone mode it is not possible to set the number of executors by using the `--num-executors` or `spark.executor.instances` property. Instead, as many executors as possible are spawned based on the available resources and the properties set. This patch corrects that to support the number-of-executors property.
Here's the new behavior:
- If the `executor.cores` property isn't set, we try to spawn one executor on each worker, taking all of the cores available (as with the default value), while the number of workers is less than the number of executors requested. If we can't launch the specified number of executors, a warning is logged.
- If the `executor.cores` property is set (the same logic applies to `executor.memory`):
  - If `executor.instances` * `executor.cores` <= `cores.max`, then `executor.instances` executors will be spawned.
  - If `executor.instances` * `executor.cores` > `cores.max`, then as many executors as possible will be spawned (basically the previous behavior when only `executor.cores` was set), but we also log a warning saying we couldn't spawn the requested number of executors.
In the case where `executor.memory` is set, all constraints are taken into account based on the number of cores and memory assigned per worker (same logic as with the cores).
How was this patch tested?
I tested this patch by running a simple Spark app in standalone mode, specifying the `--num-executors` or `spark.executor.instances` property, and checking whether the number of executors was coherent with the available resources and the requested number of executors.
I plan on testing this patch further by adding tests in `MasterSuite` and running the usual `/dev/run-tests`.
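For reference, the core-count arithmetic from the proposal above can be sketched as a small standalone model (the method name `executorsToLaunch` is hypothetical; this is not the actual scheduling code in `Master.scala`):

```scala
// Model of the cap: at most cores.max / executor.cores executors fit under
// the core limit, so the launched count is the minimum of that and
// executor.instances.
object ExecutorCountSketch {
  def executorsToLaunch(requested: Int, executorCores: Int, maxCores: Int): Int =
    math.min(requested, maxCores / executorCores)

  def main(args: Array[String]): Unit = {
    // instances * cores <= cores.max: the full request is honored.
    assert(executorsToLaunch(requested = 4, executorCores = 2, maxCores = 8) == 4)
    // instances * cores > cores.max: only as many as fit (and a warning is logged).
    assert(executorsToLaunch(requested = 4, executorCores = 3, maxCores = 8) == 2)
  }
}
```

The same min-of-request-and-capacity shape would apply per constraint when `executor.memory` is also set.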