-
Couldn't load subscription status.
- Fork 28.9k
[SPARK-9782] [YARN] Support YARN application tags via SparkConf #8072
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Add a new test case in yarn/ClientSuite which checks how the various SparkConf and ClientArguments propagate into the ApplicationSubmissionContext.
|
Jenkins, test this please |
docs/running-on-yarn.md
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"when querying YARN apps"
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
Rename from spark.yarn.application.tags to simply spark.yarn.tags. Add a comment explaining the use of reflection. Misc rewording formatting fixes.
|
jenkins, retest this please |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: use s"Ignoring $CONF_SPARK_YARN_APPLICATION_TAGS because"...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
|
@tgravescs @vanzin @andrewor14 can I get one of you to sign off on my proposed property name Otherwise, this LGTM |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not a big deal, but I'd flip this around and do something like:
Try(appContext.getClass().getMethod("setApplicationTags", classOf[java.util.Set[String]])).foreach { setTagsMethod =>
...
}
That way you print the log message only once, lookup the method only once, and skip all the processing of the option if the method is not available.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
BTW if you do this you're probably better off doing:
Try(...) match {
case Success =>
case Failure =>
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the suggestion! I'm a bit rusty in Scala, so I could be misapplying some idioms here, but my intent on this line was to use the foreach, etc., only to access the "None or single element" of the Option, as indicated on http://www.scala-lang.org/api/current/index.html#scala.Option , so the "foreach" here is supposed to only iterate over the "single" element which happens to itself be a collection, as opposed to iterating over elements of the inner collection. So there shouldn't be any way to cause multiple log statements or multiple reflection-based lookups of the method.
I also wanted to err on the side of minimizing behavioral changes for existing setups, so that if the Option is None, then this chaining as a monad avoids ever needing to lookup a method, or even invoking any of the option-processing methods like StringUtils.getTrimmedStringCollection.
I could add a note to make it more clear that the map/filter/foreach is on the Option, as opposed to the Collection if that'd help.
Anyhow, I'll be happy to apply the reversal to start with the method as you suggest if you prefer, just wanted to hear your thoughts on this Option usage first.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, I see. setApplicationTags takes a Set[String], not a single string, so you're not calling it once per tag in the collection. My bad. Current code is fine.
|
Config name LGTM. Code too, just left a minor suggestion. |
|
I think the name is ok, but it might be better if we have put application in it: spark.yarn.application.tags. thoughts? If both of you prefer the current name I'm ok with it though. |
|
@tgravescs my thinking for just spark.yarn.tags was that it's redundant to include application, because configs are by definition per-application. We'd also be consistent with spark.yarn.queue and spark.yarn.jars. If we do include "application", I think it should be spark.yarn.application-tags, because then we're not adding a new "application" namespace. |
|
thats fine. we can keep it tags and if another config with tags in it comes along be more specific on that. |
|
Cool, in that case @dennishuo mind making the change that @vanzin suggested and then I'll merge this? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
using reflect make the code reading more difficult. generally not be recommanded.
Add a new test case in yarn/ClientSuite which checks how the various SparkConf
and ClientArguments propagate into the ApplicationSubmissionContext.