-
Notifications
You must be signed in to change notification settings - Fork 28.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[SPARK-12463][SPARK-12464][SPARK-12465][SPARK-10647][MESOS] Fix zookeeper dir with mesos conf and add docs. #10057
Conversation
@andrewor14 @dragos @JoshRosen PTAL |
Test build #46940 has finished for PR 10057 at commit
|
@@ -94,7 +94,7 @@ private[spark] class ZookeeperMesosClusterPersistenceEngine( | |||
conf: SparkConf) | |||
extends MesosClusterPersistenceEngine with Logging { | |||
private val WORKING_DIR = | |||
conf.get("spark.deploy.zookeeper.dir", "/spark_mesos_dispatcher") + "/" + baseDir | |||
conf.get("spark.mesos.deploy.zookeeper.dir", "/spark_mesos_dispatcher") + "/" + baseDir |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Up to you, but does it make sense to fall back to the original property as a default to preserve compatibility? Does that make sense for all these Mesos-specific override properties?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
was this documented?
This looks fine as code changes, but any fixes in cluster mode are untestable for me. #9752 might fix it. |
Why not just reuse the existing |
@andrewor14, one of the other configurations is already |
I think actually just using spark.deploy.* seems like a better choice, as in any case we don't really expect users to have different zookeepers deployed and cluster mode Mesos isn't used in conjunction with standalone mode too. I can update the code and the docs so it will be part of configuring Mesos doc as still. |
As long as we don't break backward compatibility anywhere then I think just reusing |
3cb36e8
to
2363ae8
Compare
@@ -50,7 +50,7 @@ private[mesos] class MesosClusterDispatcher( | |||
extends Logging { | |||
|
|||
private val publicAddress = Option(conf.getenv("SPARK_PUBLIC_DNS")).getOrElse(args.host) | |||
private val recoveryMode = conf.get("spark.mesos.deploy.recoveryMode", "NONE").toUpperCase() | |||
private val recoveryMode = conf.get("spark.deploy.recoveryMode", "NONE").toUpperCase() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is the theory here that we can change the config param because it wasn't documented before? OK by me. Otherwise seems like the old value would have to be supported.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm still wondering about this as well. I think having some backward compatible makes sense at least for a version. Let me add a warning message when it's set and use the value if it's set as well.
2363ae8
to
2ff38af
Compare
Test build #47464 has finished for PR 10057 at commit
|
2ff38af
to
b8fc74c
Compare
Test build #47497 has finished for PR 10057 at commit
|
@@ -50,7 +50,11 @@ private[mesos] class MesosClusterDispatcher( | |||
extends Logging { | |||
|
|||
private val publicAddress = Option(conf.getenv("SPARK_PUBLIC_DNS")).getOrElse(args.host) | |||
private val recoveryMode = conf.get("spark.mesos.deploy.recoveryMode", "NONE").toUpperCase() | |||
private val recoveryMode = conf.getOption("spark.mesos.deploy.recoverMode").map { mode => | |||
logWarning("spark.mesos.deploy.recoverMode is deprecated. Please configure " + |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this was never documented so we don't need to add deprecation warning here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
by the way if you want to change the config name in this PR you should file a separate issue and add it to the title of this PR
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see, should we still honor the setting as a deprecation cycle? I'm going to just remove the warning for now but add a TODO.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think it's even worth logging a warning. It wasn't documented so any user who was using the old config must have somehow found out about it from the code, knowing that it's not officially supported. I'd rather keep the code simpler than add a warning that I doubt will be useful for many.
@andrewor14 the cluster mode issue is fixed now, @dragos @mgummelt we can use this patch through our tests |
b8fc74c
to
2c29939
Compare
Test build #48121 has finished for PR 10057 at commit
|
</tr> | ||
</table> | ||
In order to enable this recovery mode, you can set SPARK_DAEMON_JAVA_OPTS in spark-env by configuring `spark.deploy.recoveryMode` and related spark.deploy.zookeeper.* configurations. | ||
For more information about these configurations please refer to the configurations (doc)[configurations.html#deploy] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we should also briefly mention this in the running-on-mesos.md
docs right?
2c29939
to
32a33ae
Compare
Test build #48157 has finished for PR 10057 at commit
|
32a33ae
to
4a56f6c
Compare
Jenkins, retest this please |
Test build #48170 has finished for PR 10057 at commit
|
@andrewor14 Can you take a look at this PR sometime this week? |
retest this please |
Test build #50289 has finished for PR 10057 at commit
|
retest this please |
4a56f6c
to
5ebaf04
Compare
Test build #50333 has finished for PR 10057 at commit
|
retest this please |
Test build #50352 has finished for PR 10057 at commit
|
Merged into master. By the way I might have mentioned this before but you probably don't need 3 different issues to rename 3 configs. |
Fix zookeeper dir configuration used in cluster mode, and also add documentation around these settings.