[SPARK-22294][Deploy] Reset spark.driver.bindAddress when starting a Checkpoint #19427

Closed · wants to merge 1 commit
@@ -51,6 +51,7 @@ class Checkpoint(ssc: StreamingContext, val checkpointTime: Time)
     "spark.yarn.app.id",
     "spark.yarn.app.attemptId",
     "spark.driver.host",
+    "spark.driver.bindAddress",
     "spark.driver.port",
     "spark.master",
     "spark.yarn.keytab",

@@ -62,6 +63,7 @@ class Checkpoint(ssc: StreamingContext, val checkpointTime: Time)

     val newSparkConf = new SparkConf(loadDefaults = false).setAll(sparkConfPairs)
       .remove("spark.driver.host")
+      .remove("spark.driver.bindAddress")
Member commented:

Do we have to remove this? It means we must drop spark.driver.bindAddress if it's not set in the new run.

Contributor (author) replied:

Yes. If it is not set in the new run, the old value would be meaningless anyway; this property only makes sense per invocation of spark-submit. Resuming from a checkpoint means we are re-submitting work, but possibly under a different cluster configuration, so we may want a different bindAddress, or the new configuration may prefer to fall back to spark.driver.host. In any case, keeping the old setting makes no sense. Even in a static configuration, removing it costs nothing, because the command line that re-launches the job can re-populate the property if it needs to stay the same.
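To make the re-launch scenario concrete, here is a minimal sketch of a driver that resumes from a checkpoint (the checkpoint path, app name, and socket source below are hypothetical). With this patch the restored conf no longer carries the old bindAddress, so the new run takes it from the current submission, e.g. `spark-submit --conf spark.driver.bindAddress=<address>`, or falls back to `spark.driver.host`:

```scala
import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}

object ResumableApp {
  // Hypothetical checkpoint location; any reliable filesystem path works.
  val checkpointDir = "hdfs:///checkpoints/resumable-app"

  def createContext(): StreamingContext = {
    val conf = new SparkConf().setAppName("ResumableApp")
    val ssc = new StreamingContext(conf, Seconds(10))
    ssc.checkpoint(checkpointDir)
    // Trivial placeholder computation so the context has an output operation.
    val lines = ssc.socketTextStream("localhost", 9999) // hypothetical source
    lines.count().print()
    ssc
  }

  def main(args: Array[String]): Unit = {
    // On restart, the context is rebuilt from the checkpoint; with this patch,
    // spark.driver.bindAddress is not restored from it, so the bind address
    // comes from the current run's configuration (or the spark.driver.host
    // fallback).
    val ssc = StreamingContext.getOrCreate(checkpointDir, createContext _)
    ssc.start()
    ssc.awaitTermination()
  }
}
```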

.remove("spark.driver.port")
val newReloadConf = new SparkConf(loadDefaults = true)
propertiesToReload.foreach { prop =>
Expand Down
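For context, a minimal sketch of the pattern this diff touches (not the actual Spark source; the reload list below is an illustrative subset): per-run network properties are stripped from the checkpointed conf, then selectively reloaded from the current environment, so a restarted driver never inherits a stale bind address.

```scala
import org.apache.spark.SparkConf

// Illustrative subset of the properties a restarted run should take from the
// current environment rather than from the checkpoint.
val propertiesToReload = Seq(
  "spark.driver.host",
  "spark.driver.bindAddress",
  "spark.driver.port",
  "spark.master")

def restoreConf(sparkConfPairs: Array[(String, String)]): SparkConf = {
  // Rebuild the conf from the checkpointed pairs, dropping per-run values.
  val newSparkConf = new SparkConf(loadDefaults = false).setAll(sparkConfPairs)
    .remove("spark.driver.host")
    .remove("spark.driver.bindAddress") // dropped by this patch
    .remove("spark.driver.port")
  // Re-read the dropped properties from the current run's defaults, keeping
  // them only if the new environment actually sets them.
  val newReloadConf = new SparkConf(loadDefaults = true)
  propertiesToReload.foreach { prop =>
    newReloadConf.getOption(prop).foreach { value =>
      newSparkConf.set(prop, value)
    }
  }
  newSparkConf
}
```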