[SPARK-38455][SPARK-38187][K8S] Support driver/executor PodGroup templates #35776
Conversation
```scala
priorityClassName.foreach(podGroup.editOrNewSpec().withPriorityClassName(_).endSpec())
var spec = pg.getSpec
if (spec == null) spec = new PodGroupSpec
queue.foreach(spec.setQueue(_))
```
I saw there is a queue in the PodGroup template. So will this overwrite it?
Yes, currently the `spark.kubernetes.job.queue` configuration will overwrite it. That's the behavior for the non-Volcano driver/executor pod templates, too.
Got it. Thanks.
Yea, I like this direction instead of adding many individual configurations.
From a rough look, it's a better choice because it avoids introducing more configurations. I will take a deeper look and test this today. @dongjoon-hyun Thanks!
```scala
priorityClassName.foreach(podGroup.editOrNewSpec().withPriorityClassName(_).endSpec())
var spec = pg.getSpec
if (spec == null) spec = new PodGroupSpec
```
If no template is given, do we need some default PodGroup spec?
Yes, it's the existing code's behavior at `PodGroupBuilder`.
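As a sketch of that existing behavior (assuming the fabric8 Volcano model classes `PodGroup`/`PodGroupSpec`; this is not the exact PR code), the builder falls back to an empty default spec when the template omits one:

```scala
// Sketch, assuming the fabric8 Volcano model classes; not the exact PR code.
// When no template (or a template without a `spec` section) is given,
// start from an empty default PodGroupSpec, then apply the Spark conf values.
var spec = pg.getSpec                      // null if the template omitted `spec`
if (spec == null) spec = new PodGroupSpec  // default: empty spec
queue.foreach(spec.setQueue(_))            // conf value overwrites the template
pg.setSpec(spec)                           // attach the (possibly new) spec back
```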
The idea looks good to me.
```scala
val template = if (kubernetesConf.isInstanceOf[KubernetesDriverConf]) {
  kubernetesConf.get(KUBERNETES_DRIVER_PODGROUP_TEMPLATE_FILE)
} else {
  kubernetesConf.get(KUBERNETES_EXECUTOR_PODGROUP_TEMPLATE_FILE)
}
```
Note that we don't support a separate PodGroup for executors yet; there is currently no ability to create pre-resources for executors in Spark on K8s. We only support a driver-side PodGroup or a job-level PodGroup (sharing the driver's PodGroup) for now.
Ya, this is added for feature parity with your existing contribution. `getAdditionalPreKubernetesResources` is used only by `KubernetesDriverBuilder`.
To fix the license linter error, the licenses are added. BTW, do you have any other concerns, @Yikun?
```scala
var spec = pg.getSpec
if (spec == null) spec = new PodGroupSpec
queue.foreach(spec.setQueue(_))
priorityClassName.foreach(spec.setPriorityClassName(_))
```
I have only one remaining concern, about the configuration overwrite priority for `priorityClassName`. Compared to pod templates, this seems like a reasonable overwrite order to me:
- Top 1: configuration value (like queue)
- Top 2: config template value (like the driver/executor template)
- Top 3: default value (like priorityClassName, which reuses the pod priority as the default)
So do you think we should change it to:
```scala
// Overwrite if the queue configuration is specified
queue.foreach(spec.setQueue(_))
// Reuse the Pod priority only if the priority is not set in the template
if (spec.getPriorityClassName == null) priorityClassName.foreach(spec.setPriorityClassName(_))
```
Then users could set `PriorityClassName` by specifying the priority in the template. WDYT?
I understand you are exploring all the possibilities. However, Apache Spark doesn't work like that. :) We prefer user-given configurations over any template files, and that is always simpler. So the current behavior is intentionally consistent with the existing one.
OK, I can understand that. This is not only about exploring all the possibilities; many users also want to specify priority more flexibly at the job level.
As an alternative way to help users set the priority conveniently, while keeping your current implementation, and considering that priority scheduling is a very common case, do you think we could introduce a configuration like `spark.kubernetes.driver.PriorityClassName` to simplify the config template? This would also help both the native default scheduler and custom schedulers specify Spark job priority easily.
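A hypothetical sketch of such an entry, using Spark's internal `ConfigBuilder` DSL (the name is only the proposal from this comment, not a merged configuration):

```scala
import org.apache.spark.internal.config.ConfigBuilder

// Hypothetical: this configuration is proposed in the discussion above
// and is not part of this PR.
val KUBERNETES_DRIVER_PRIORITY_CLASS_NAME =
  ConfigBuilder("spark.kubernetes.driver.PriorityClassName")
    .doc("PriorityClassName for the driver pod and its PodGroup. " +
      "When unset, the scheduler default is used.")
    .stringConf
    .createOptional
```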
```yaml
    cpu: "4"
    memory: "16Gi"
  priorityClassName: executor-priority
  queue: executor-queue
```
nit: new line
Sure.
Fixed now, @Yikun.
BTW, @Yikun, we can take advantage of this.
```
[info] VolcanoSuite:
[info] - Run SparkPi with volcano scheduler (10 seconds, 10 milliseconds)
[info] - SPARK-38188: Run SparkPi jobs with 2 queues (only 1 enabled) (14 seconds, 304 milliseconds)
[info] - SPARK-38188: Run SparkPi jobs with 2 queues (all enabled) (13 seconds, 252 milliseconds)
[info] - SPARK-38423: Run SparkPi Jobs with priorityClassName (15 seconds, 264 milliseconds)
[info] - SPARK-38423: Run driver job to validate priority order (16 seconds, 315 milliseconds)
- (coming soon) SPARK-38187: Run SparkPi Jobs with minCPU (28 seconds, 525 milliseconds)
- (coming soon) SPARK-38187: Run SparkPi Jobs with minMemory (26 seconds, 497 milliseconds)
```
Except for the priority concern, otherwise LGTM. Thanks. I also tested it in my env for the existing cases and the coming-soon minResources case.
Expected!
What changes were proposed in this pull request?
This PR aims to support driver/executor PodGroup templates like the following.
Why are the changes needed?
This is a simpler, more extensible and robust way to support Volcano future because we don't need to add new configurations like #35640 for all Volcano features.
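As an example of what such a template can contain, a driver-side PodGroup template might look like the following (illustrative values only; the fields follow the Volcano `PodGroup` v1beta1 API):

```yaml
# Illustrative driver PodGroup template; all values are examples only.
apiVersion: scheduling.volcano.sh/v1beta1
kind: PodGroup
spec:
  queue: driver-queue
  priorityClassName: driver-priority
  minMember: 1
  minResources:
    cpu: "4"
    memory: "16Gi"
```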
Does this PR introduce any user-facing change?
No because this is a new feature.
How was this patch tested?
Pass the CIs.