
Conversation

@mwlon mwlon commented May 26, 2019

What changes were proposed in this pull request?

Retrieve the enableFetcherCache option from the submission conf rather than the dispatcher conf. This resolves confusing behavior where Spark drivers currently get this conf from the dispatcher while Spark executors get it from the submission. After this change, the conf only needs to be specified once.
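
A minimal sketch of the intended lookup change in MesosClusterScheduler (hypothetical surrounding code, using the master-branch config object `config.ENABLE_FETCHER_CACHE`; branch-2.4 reads the same key as a plain string). The exact patch is quoted later in this thread:

    // Before: only the dispatcher's SparkConf decided fetcher caching for driver URIs
    val useFetchCache = conf.get(config.ENABLE_FETCHER_CACHE)

    // After: the submission's conf is honored as well, matching executor behavior,
    // so the option only needs to be specified once
    val useFetchCache = desc.conf.get(config.ENABLE_FETCHER_CACHE) ||
      conf.get(config.ENABLE_FETCHER_CACHE)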

How was this patch tested?

With (updated) existing tests.

@mwlon mwlon changed the title Spark 26192 2.4 [SPARK-26192][MESOS][2.4] May 26, 2019

@HyukjinKwon HyukjinKwon left a comment

What does this PR target? We don't add new confs in backports.

@dongjoon-hyun

ok to test

@dongjoon-hyun

Hi, @mwlon.
Thank you for making a PR, but this is not a backport of 5fd4d74 (SPARK-26192). Please revert the other parts. Otherwise, we cannot proceed with this backport, per @HyukjinKwon's comment.

Hi, @HyukjinKwon.
I requested this backport of SPARK-26192 in JIRA (https://issues.apache.org/jira/browse/SPARK-26192). But, as you can see, this is not a correct backport. The original scope of SPARK-26192 is 5fd4d74, which retrieves the enableFetcherCache option from the submission conf rather than the dispatcher conf. There is no new configuration there.

@dongjoon-hyun dongjoon-hyun changed the title [SPARK-26192][MESOS][2.4] [SPARK-26192][MESOS][2.4] Retrieve enableFetcherCache option from submission for driver URIs May 27, 2019

SparkQA commented May 27, 2019

Test build #105815 has finished for PR 24713 at commit d8b4f1e.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@dongjoon-hyun

Er, wait. It seems I was confused by the commit history. I'll comment back here.

@dongjoon-hyun

The following is the full history.

  • SPARK-15994 added the configuration spark.mesos.fetcherCache.enable in Spark 2.1.0.
    However, it introduced a typo in MesosClusterScheduler.scala.
  • SPARK-26082 fixed the typo in Spark 2.4.1 (Spark 2.3.4 will also have the fix).
  • Now, SPARK-26192 fixes the configuration loading in Spark 3.0.0.

.createWithDefault("")

private[spark] val MAX_DRIVERS =
  ConfigBuilder("spark.mesos.maxDrivers").intConf.createWithDefault(200)
Member

This is not a part of SPARK-26192.

Author

It's part of later additions for SPARK-26082 that SPARK-26192 built on in master. I think using these config objects rather than magic strings is good practice, and something worth keeping; see the sketch below.
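
For illustration, the two styles under discussion (a sketch; the MAX_DRIVERS config object exists on master but not on branch-2.4):

    // branch-2.4 style: a "magic string" key with an inline default
    private val queuedCapacity = conf.getInt("spark.mesos.maxDrivers", 200)

    // master style: a typed config object declared once via ConfigBuilder
    private[spark] val MAX_DRIVERS =
      ConfigBuilder("spark.mesos.maxDrivers").intConf.createWithDefault(200)
    private val queuedCapacity = conf.get(config.MAX_DRIVERS)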

Member

@mwlon, backporting the improvement is not allowed. In addition, it's not good practice to silently fold one PR (SPARK-26082) into another PR (SPARK-26192).

Member

Although it's not crazy to backport small enhancements, I wouldn't backport this change. 3.0 changes almost all of the configs to use ConfigBuilder, but I don't think we want to change just a few of them in 2.4 for the sake of general consistency.

ConfigBuilder("spark.mesos.maxDrivers").intConf.createWithDefault(200)

private[spark] val RETAINED_DRIVERS =
  ConfigBuilder("spark.mesos.retainedDrivers").intConf.createWithDefault(200)
Member

ditto.

Author

ditto.

private[spark] val CLUSTER_RETRY_WAIT_MAX_SECONDS =
  ConfigBuilder("spark.mesos.cluster.retry.wait.max")
    .intConf
    .createWithDefault(60) // 1 minute
Member

ditto.

Author

ditto.

Member

ditto.

.doc("If set to true, all URIs (example: `spark.executor.uri`, `spark.mesos.uris`) will " +
"be cached by the Mesos Fetcher Cache.")
.booleanConf
.createWithDefault(false)
Member

Let's not add this. What we need is just the functional logic of SPARK-26192.

Author

So use magic strings instead of config objects?


@dongjoon-hyun dongjoon-hyun May 28, 2019

Of course; that's what a backport is. You should not pull in the master-branch code blindly.

val driverMemoryOverhead = sparkProperties.get(config.DRIVER_MEMORY_OVERHEAD.key)
val driverCores = sparkProperties.get("spark.driver.cores")
val name = request.sparkProperties.getOrElse("spark.app.name", mainClass)
val name = sparkProperties.getOrElse("spark.app.name", mainClass)
Member

Please revert all the changes in this file.

private val useFetchCache = conf.getBoolean("spark.mesos.fetcherCache.enable", false)
private val queuedCapacity = conf.get(config.MAX_DRIVERS)
private val retainedDrivers = conf.get(config.RETAINED_DRIVERS)
private val maxRetryWaitTime = conf.get(config.CLUSTER_RETRY_WAIT_MAX_SECONDS)
Member

Please revert the above.

}

private def getDriverUris(desc: MesosDriverDescription): List[CommandInfo.URI] = {
val useFetchCache = desc.conf.get(config.ENABLE_FETCHER_CACHE)
Member

This is completely wrong. The following is the Apache Spark patch.

    val useFetchCache = desc.conf.get(config.ENABLE_FETCHER_CACHE) ||
      conf.get(config.ENABLE_FETCHER_CACHE)
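
(Note: ORing the two lookups means fetcher caching is enabled if either the submission conf or the dispatcher conf sets it, so the submission conf is honored without breaking deployments that already set the option on the dispatcher.)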

Author

My mistake: I missed a couple of commits in my cherry-pick. Will fix.

Member

Thanks


@dongjoon-hyun dongjoon-hyun left a comment

@mwlon, I left a few comments. Could you address them?


mwlon commented May 28, 2019

@HyukjinKwon note that this doesn't add any new confs. It only moves existing confs into the relevant config objects rather than referencing them with magic strings.

@dongjoon-hyun

existing confs

Those confs don't exist in branch-2.4.


mwlon commented May 28, 2019

@dongjoon-hyun not sure what you mean - in 2.4 we have

private val queuedCapacity = conf.getInt("spark.mesos.maxDrivers", 200)
private val retainedDrivers = conf.getInt("spark.mesos.retainedDrivers", 200)
private val maxRetryWaitTime = conf.getInt("spark.mesos.cluster.retry.wait.max", 60) // 1 minute

in MesosClusterScheduler. So the confs already exist.


dongjoon-hyun commented May 28, 2019

What I mean is ConfigBuilder("spark.mesos.maxDrivers"). We didn't create the config with ConfigBuilder in branch-2.4; the configs are just plain strings there.


dongjoon-hyun commented May 28, 2019

Please follow the existing way of branch-2.4. Otherwise, I cannot help you.


mwlon commented May 29, 2019

OK, if we only want the functional part of SPARK-26192 and not the other associated changes, then close this PR and I'll make a new one later.

@dongjoon-hyun

Yep. Thanks.
