feat(jobserver): Replace manager_start with SparkLauncher #982
Conversation
* Move variables from manager_start and server_start to setenv
* Use set -a in server_start.sh to expose all the variables in env. These variables are then used to launch driver JVM.
* Move variables like jar file, conf, etc. to setenv to make it more configurable
* Replace spawning of manager_start in AkkaClusterSupervisor with Launcher
* Add abstract class Launcher to hold common variables and functions
* Add concrete implementation for manager with manager-specific properties
* Update documentation
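To make the launcher-based approach concrete, here is a minimal, illustrative sketch of launching the manager through SparkLauncher instead of shelling out to manager_start.sh. The environment variable names, main class, and file paths below are assumptions for illustration, not necessarily the exact ones this PR uses:

```scala
import java.io.File
import org.apache.spark.launcher.{SparkAppHandle, SparkLauncher}

object ManagerLauncherSketch {
  def main(args: Array[String]): Unit = {
    // Values that previously came from manager_start.sh / setenv.sh;
    // the variable names below are illustrative assumptions.
    val sparkHome  = sys.env("SPARK_HOME")
    val managerJar = sys.env.getOrElse("MANAGER_JAR_FILE", "spark-job-server.jar")
    val master     = sys.env.getOrElse("SPARK_MASTER", "local[4]")
    val deployMode = sys.env.getOrElse("SPARK_SUBMIT_DEPLOY_MODE", "client")

    val handle: SparkAppHandle = new SparkLauncher()
      .setSparkHome(sparkHome)
      .setAppResource(managerJar)
      .setMainClass("spark.jobserver.JobManager")   // assumed manager entry point
      .setMaster(master)
      .setDeployMode(deployMode)
      // Requires Spark 2.1.0+, hence the version note in this PR
      .redirectOutput(new File("manager.out"))
      .redirectError(new File("manager.err"))
      .startApplication()

    println(s"Driver submitted, initial state: ${handle.getState}")
  }
}
```

The redirectOutput/redirectError calls are what tie this approach to Spark 2.1.0 and above, as noted in the breaking changes further down.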
Codecov Report
@@ Coverage Diff @@
## master #982 +/- ##
=======================================
- Coverage 72.48% 70.49% -2%
=======================================
Files 77 78 +1
Lines 2410 2437 +27
Branches 128 212 +84
=======================================
- Hits 1747 1718 -29
- Misses 663 719 +56
Continue to review full report at Codecov.
@derSascha can you check this PR with Yarn?
I tried it for cluster mode, works for me.
README.md
Outdated
```
MANAGER_LOGGING_OPTS="-Dlog4j.configuration=$REMOTE_JOBSERVER_DIR/log4j-cluster.properties"
```

- Cluster mode for mesos/yarn
I think it's better to have a separate example for each of mesos and yarn
Ok. I will add separate examples.
This looks fine for the most part, thanks, and is a great change. What environments have you tested this under?
I have tested it with
Testing on Mesos/Yarn is required.
@velvia I added separate examples for Yarn/Mesos.
@derSascha Any chance you can take a look at this PR?
@bsikander thanks for the improvement. I don't have the time right now to test the cluster mode, but tested the yarn client mode. Some points I changed while testing around / suggestions / ideas:
Thanks for the pull request! Using the launcher seems to be a more stable way than the bash script workaround.
What about something like this in the context configs:
that is parsed and added to the arguments of the launcher.
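As an illustration of that idea, here is a rough sketch of forwarding a block of context config entries to the launcher; the `launcher` block name and the `spark.` prefixing convention are assumptions, not something settled in this thread:

```scala
import com.typesafe.config.{Config, ConfigFactory}
import org.apache.spark.launcher.SparkLauncher
import scala.collection.JavaConverters._

object LauncherConfigSketch {
  // Forward every key under a hypothetical "launcher" block as a Spark conf.
  def applyLauncherConfs(contextConfig: Config, launcher: SparkLauncher): SparkLauncher =
    if (contextConfig.hasPath("launcher")) {
      contextConfig.getConfig("launcher").entrySet().asScala.foldLeft(launcher) { (l, e) =>
        // SparkLauncher.setConf requires keys starting with "spark."
        l.setConf(s"spark.${e.getKey}", e.getValue.unwrapped().toString)
      }
    } else launcher

  def main(args: Array[String]): Unit = {
    // A made-up context config, e.g. as it might arrive with POST /contexts
    val contextConfig = ConfigFactory.parseString(
      """launcher {
        |  driver.memory = 4g
        |  executor.cores = 2
        |}""".stripMargin)

    val launcher = applyLauncherConfs(contextConfig, new SparkLauncher())
    // launcher.startApplication() would now submit with these extra --conf values
  }
}
```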
hrm... so I am torn on this... I think it is sort of hard to generalize on all the differences between how spark-standalone, yarn, and mesos work and Mesos is particularly challenging due to the way it operates. For example, when you are submitting a driver to be scheduled, the (i.e.
Another challenge is args like
There are also a number of options which I would want to dynamically configure beyond just
I think @derSascha is on the right path with
My proposal might be a new namespace of args:
Another example would be passing
Sorry for the wall of text, but for this to be flexible enough, I think we would need something like above. Assuming we get it right, I think it would be a lot better than manager_start.sh.
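A hedged sketch of what such a dedicated namespace might translate to on the SparkLauncher side, distinguishing plain `--conf` entries from settings that need their own spark-submit flag or an environment variable for the child process. The `spark.launcher.*` key names here are made up purely for illustration:

```scala
import com.typesafe.config.Config
import org.apache.spark.launcher.SparkLauncher
import scala.collection.JavaConverters._

object LauncherNamespaceSketch {
  // Build a launcher from a hypothetical spark.launcher.* namespace.
  def buildLauncher(contextConfig: Config): SparkLauncher = {
    // Environment for the spark-submit child process, e.g. SPARK_DAEMON_MEMORY
    val childEnv = new java.util.HashMap[String, String]()
    if (contextConfig.hasPath("spark.launcher.env.SPARK_DAEMON_MEMORY")) {
      childEnv.put("SPARK_DAEMON_MEMORY",
        contextConfig.getString("spark.launcher.env.SPARK_DAEMON_MEMORY"))
    }

    var launcher = new SparkLauncher(childEnv)

    // Some settings are not Spark confs at all and need their own submit flag
    if (contextConfig.hasPath("spark.launcher.proxy-user")) {
      launcher = launcher.addSparkArg("--proxy-user",
        contextConfig.getString("spark.launcher.proxy-user"))
    }

    // Everything else under spark.launcher.conf is forwarded as a plain --conf
    if (contextConfig.hasPath("spark.launcher.conf")) {
      contextConfig.getConfig("spark.launcher.conf").entrySet().asScala.foreach { e =>
        launcher = launcher.setConf(s"spark.${e.getKey}", e.getValue.unwrapped().toString)
      }
    }
    launcher
  }
}
```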
@derSascha thank you for the comments. 1- A new file named
In Launcher, I am already parsing this and adding the key/val as
6- Currently the YARN killed event is handled through SparkListener; we can also implement it via SparkAppHandle, but I don't see any benefit if I use SparkAppHandle.
@addisonj About the config part: I am not a pro at Scala config parsing and evaluation, but this sounds like a very important feature while removing manager_start.sh.
@addisonj @derSascha so let me summarize what you are saying to make sure I understand... The Spark arguments, even such as
Is that right? I kind of like
@velvia @derSascha @addisonj As a side note:
Let's finish this because other changes are being blocked by this change.
bin/setenv.sh
Outdated
MANAGER_EXTRA_SPARK_CONFS=
MANAGER_LOGGING_OPTS="-Dlog4j.configuration=file:$appdir/log4j-server.properties"
SPARK_LAUNCHER_VERBOSE=0
if [ -z "$MANAGER_JAR_FILE" ]; then
I believe you can do ${MANAGER_JAR_FILE:="$appdir/spark-job-server.jar"}
### Modifying the <environment>.sh script
Replace `MANAGER_*` variables with
```
MANAGER_JAR_FILE="$appdir/spark-job-server.jar"
```
I wonder if it might be good to put these alternatives in setenv.sh
as well, after all the Mesos one is there but YARN isn't.
Good point, but at this point I don't have any variable which tells me the current spark.submit.deployMode. I do have a path to the *.conf file, and somehow within bash I can try to parse this. What do you say?
Also @derSascha any comments?
@bsikander the changes look good to me. There is currently the contextconfig section for defaults. Why can't we just pull the spark settings from there for the Launcher? Or is it because we cannot tell which ones need to be preset before calling the launcher? Rather than creating a whole new config section, I'd rather create a "launcher" section for launcher specific configs. For example
(if that makes sense)
@velvia Ok, as far as I know, we can add the launcher configs in spark.context-settings. One thing which is not clear is whether I should keep the passthrough section (https://github.com/spark-jobserver/spark-jobserver/blob/master/job-server/src/main/resources/application.conf#L145), which is parsed like this (https://github.com/spark-jobserver/spark-jobserver/blob/master/job-server-api/src/main/scala/spark/jobserver/util/SparkJobUtils.scala#L98). Configs defined in the passthrough section are added to sparkConf when the JobManagerActor is already initialized (i.e. the driver JVM has already started); all of these settings can also be added as --conf during the spark-submit/SparkLauncher. Should I just remove the passthrough section and add a launcher section which contains all the static configs and can be overridden by POST /context?
I’m fine getting rid of the passthrough section. I was thinking about whether there were any settings which need to be initialized later, but can’t think of any. The simpler the better.
Actually the ideal for the config is that everything in spark.context-settings which is NOT explicitly parsed can just be passed onto the launcher. I don’t want to change things up too much though.
We’ll probably need to support the passthrough section for compatibility for a while.
Sorry for being away for some time, I was on holidays. So, I have added a new section in config named launcher.
Note: Now
A few things that we need to take care of in future regarding
All the above settings are added to SparkConf like this.
Now, we have a new section for Launcher and properties can be set under it.
Since sparkConf has higher precedence than spark-submit, configurations set in the launcher section can be overridden at runtime and the user can get confused.
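To make the precedence concern concrete, here is a small illustrative sketch (not code from this PR): a value submitted by the launcher as `--conf` is picked up by the driver's SparkConf, but is silently replaced if the same key is later set programmatically, e.g. from a passthrough-style section:

```scala
import org.apache.spark.SparkConf

object PrecedenceSketch {
  def main(args: Array[String]): Unit = {
    // Pretend the launcher submitted the driver with: --conf spark.executor.memory=2g
    // In client mode, spark-submit exposes this as a system property to the driver.
    System.setProperty("spark.executor.memory", "2g")

    // loadDefaults = true picks up the spark.* system properties set by spark-submit
    val conf = new SparkConf(loadDefaults = true)
    println(conf.get("spark.executor.memory"))   // 2g, the value from the launcher

    // A passthrough-style setting applied after the driver has started
    // silently replaces the launcher value, which is the confusing part.
    conf.set("spark.executor.memory", "4g")
    println(conf.get("spark.executor.memory"))   // 4g
  }
}
```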
@velvia @derSascha @addisonj can you have a look at my new commit?
Can someone please have a look at this change?
@velvia I have some other changes that I want to push on top of this change. Would be nice if you could have a look at this.
@bsikander the last commit looks fine, sorry I have been busy with work and away on holiday also. Going to merge it.
Thank you for merging :)
I encountered a context start failure when using this feature in yarn cluster mode. Do I need some more configuration?
Update: I finally modified the code in spark.jobserver.util.ManagerLauncher and an env variable to make it work in yarn cluster mode. Can anyone check whether this has side effects? Thank you.
Idea
* Use set -a in server_start.sh to expose all the variables in env. These variables are then used to launch driver JVM.
* Move variables like jar file, conf, etc. to setenv to make it more configurable
Testing
This change was tested with client and cluster mode (standalone) and seems to work fine.
Can somebody please verify it with Mesos/Yarn?
Pull Request checklist
Current behavior:
The manager_start.sh script is used to spawn driver JVMs. To get the current status of a job from Spark, the only way is to use the History Server, which is not a good approach.
New behavior:
With SparkLauncher, we can use the SparkAppHandle to access the current state (WAITING/KILLED, etc.) of the job.
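For illustration, a minimal sketch of observing that state through SparkAppHandle; the jar, main class, and master values are placeholders, not this PR's actual wiring:

```scala
import org.apache.spark.launcher.{SparkAppHandle, SparkLauncher}

object AppStateSketch {
  def main(args: Array[String]): Unit = {
    // Listener invoked on every driver state transition (CONNECTED, RUNNING, KILLED, ...)
    val listener = new SparkAppHandle.Listener {
      override def stateChanged(handle: SparkAppHandle): Unit =
        println(s"state=${handle.getState} appId=${handle.getAppId}")
      override def infoChanged(handle: SparkAppHandle): Unit = ()
    }

    val handle = new SparkLauncher()
      .setAppResource("spark-job-server.jar")      // placeholder values
      .setMainClass("spark.jobserver.JobManager")
      .setMaster("local[2]")
      .startApplication(listener)

    // The handle can also be polled, or used to stop/kill the driver
    if (handle.getState == SparkAppHandle.State.FAILED) handle.kill()
  }
}
```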
BREAKING CHANGES
This change works with Spark 2.1.0 and above. This is due to the `redirectOutput` and `redirectError` functions, which were introduced in 2.1.0.
Other information:
If this change gets through, then server_start.sh will also be replaced with the launcher, and another change will introduce accessing the current state of the job in Spark.
This change is