New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[SPARK-40747][CORE] Support setting driver log url using env vars on other resource managers #38205
Conversation
cc @HeartSaVioR, would you please take a look? BTW, it does not work on Yarn since |
…n Standalone mode
To make it flexible, I'm planning to add the variable substitution support to the log URL just like SPARK-26311 does(SHS only), and let the cluster manager exposes some attributes. For example, suppose exposing
|
Can one of the admins verify this patch? |
@dongjoon-hyun @holdenk what do think about the plan? is it the right direction to support the external log service on K8s? |
@@ -73,7 +75,12 @@ private[spark] trait SchedulerBackend { | |||
* Executors tab for the driver. | |||
* @return Map containing the log names and their respective URLs | |||
*/ | |||
def getDriverLogUrls: Option[Map[String, String]] = None | |||
def getDriverLogUrls: Option[Map[String, String]] = { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Shall we explicitly mention the target resource managers instead of Support setting driver log url using env vars on other resource managers
because YarnClusterSchedulerBackend
will not use this implementation?
Lines 38 to 40 in b9998cf
override def getDriverLogUrls: Option[Map[String, String]] = { | |
YarnContainerInfoHelper.getLogUrls(sc.hadoopConfiguration, container = None) | |
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It makes sense if we want to keep YARN as-is, another direction is to let Yarn support it then it works on all resource managers.
The pseudo-code would like
override def getDriverLogUrls: Option[Map[String, String]] = {
YarnContainerInfoHelper.getLogUrls(sc.hadoopConfiguration, container = None) ++
super.getDriverLogUrls.getOrElse(Map.empty))
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you happen to know any instances where the production YARN clusters use the external log services?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it was mentioned in SPARK-26311
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This PR seems to aiming only K8s because YARN is not using this and Mesos is not used in these days. May I ask why don't we have this in KubernetesClusterSchedulerBackend
only?
I think it could be a generic way for all cluster managers, and if it's overkill, it's fine to support this feature in |
Hi @dongjoon-hyun, in #38357, I implement the proposed idea, would you please take a look when you have time? PS: design doc https://docs.google.com/document/d/1MfB39LD4B4Rp7MDRxZbMKMbdNSe6V6mBmMQ-gkCnM-0/edit?usp=sharing |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM but needs consensus from @dongjoon-hyun
I tagged @tgravescs on #38357, assuming that is the version that was proposed - or is this what we are looking at ? |
@mridulm Yes, SPARK-40887(#38357) is the proposed version. I opened this PR mostly for collecting feedback in case the community has another idea. |
Please link the 2 and I would prefer to see this in draft if it isn't meant to be the real solution and just looking for feedback. The jira's should ideally be linked as well. |
Close and in favor #38357 |
What changes were proposed in this pull request?
This PR pulls out the
getDriverLogUrls
fromStandaloneSchedulerBackend
to superclassSchedulerBackend
, to make it support setting driver log url by env vars w/ prefixSPARK_DRIVER_LOG_URL_
on other resource managers, especially for K8s.Why are the changes needed?
Since it has such an ability for the executor on K8s mode, we want to align the ability for the driver.
The related code in
CoarseGrainedExecutorBackend
Does this PR introduce any user-facing change?
Yes, the user could set the log urls by env vars.
How was this patch tested?
Existing UT, if it's not sufficient and the approach is accepted, will add more ut later.
And MT in local mode.