[SPARK-34783][K8S] Support remote template files #31877

Closed
wants to merge 5 commits into from

Conversation

Member

@dongjoon-hyun dongjoon-hyun commented Mar 18, 2021

What changes were proposed in this pull request?

This PR aims to support remote driver/executor template files.

Why are the changes needed?

Currently, KubernetesUtils.loadPodFromTemplate supports only local files.

With this PR, we can do the following.

bin/spark-submit \
...
-c spark.kubernetes.driver.podTemplateFile=s3a://dongjoon/driver.yml \
-c spark.kubernetes.executor.podTemplateFile=s3a://dongjoon/executor.yml \
...
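
For context, a fuller invocation might look like the sketch below. Only the two `podTemplateFile` settings come from the description above. The API server address, container image, and application jar are hypothetical placeholders, and reading `s3a://` paths assumes the S3A connector and credentials are already set up on the submitting client. The helper `print_submit_cmd` is an illustrative name; it only prints the command so it can be reviewed before running.

```shell
# Sketch only: prints a spark-submit command that uses remote pod templates.
# Placeholders (NOT from the PR): API server URL, container image, jar path.
# print_submit_cmd DRIVER_TEMPLATE_URI EXECUTOR_TEMPLATE_URI
print_submit_cmd() {
  driver_template=$1
  executor_template=$2
  # Unquoted heredoc: variables expand, and "\\" prints a single backslash.
  cat <<EOF
bin/spark-submit \\
  --master k8s://https://k8s.example.invalid:6443 \\
  --deploy-mode cluster \\
  --class org.apache.spark.examples.SparkPi \\
  -c spark.kubernetes.container.image=example/spark:latest \\
  -c spark.kubernetes.driver.podTemplateFile=${driver_template} \\
  -c spark.kubernetes.executor.podTemplateFile=${executor_template} \\
  local:///opt/spark/examples/jars/spark-examples.jar
EOF
}

print_submit_cmd "s3a://dongjoon/driver.yml" "s3a://dongjoon/executor.yml"
```

Printing rather than executing keeps the sketch safe to run on a machine without a cluster; the output can be pasted into a shell once the placeholders are replaced.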

Does this PR introduce any user-facing change?

Yes, this is an improvement.

How was this patch tested?

Manual testing.

@dongjoon-hyun
Member Author

Hi, @attilapiros . Could you review this PR?

Member

@viirya viirya left a comment

Looks like a useful feature.

Member

@viirya viirya left a comment

@dongjoon-hyun
Member Author

dongjoon-hyun commented Mar 18, 2021

@HyukjinKwon , @attilapiros , @viirya .
I tried to use DependencyUtils, but it fails at runtime because Utils.doFetchFile depends on SparkEnv.get. SparkSubmit uses this K8s submission code before SparkEnv is created, so it fails with java.lang.NullPointerException.

In this case, I believe this vanilla approach is the only way.

For the test case concern, I can still remove it but I prefer to have the test coverage for the new function for now. We can revise the test suite later with the withHttpServer refactoring.

@attilapiros
Contributor

For the test case concern, I can still remove it but I prefer to have the test coverage for the new function for now. We can revise the test suite later with the withHttpServer refactoring.

+1 for keeping the test and I will revise it in my PR

@dongjoon-hyun
Member Author

dongjoon-hyun commented Mar 18, 2021

Thank you, @attilapiros .
I'm still looking at the code to find some workaround.
I'll comment on this PR later if I find a better solution.

@viirya
Member

viirya commented Mar 18, 2021

Sounds ok to me.

@dongjoon-hyun
Member Author

dongjoon-hyun commented Mar 18, 2021

Hi, All.
Sorry for the back and forth. With the last commit, I was able to address all comments (@HyukjinKwon and @attilapiros ). The failure I hit on the first attempt was a false alarm. Since K8s submit will not use the spark: prefix, the SparkEnv code path is not used. I also manually tested this with the last commit.
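
To illustrate the point about the spark: prefix, here is a deliberately simplified model of the dispatch described in this thread. It is not Spark's implementation: the function name `fetch_path_for` and the two labels are made up for this sketch, and the `spark://` URI in the loop is a hypothetical example. It only encodes the claim above that URIs with the spark: prefix need a live SparkEnv, while template locations such as `s3a://...` or plain local paths do not.

```shell
# Simplified model (assumption, not Spark code) of which URIs need SparkEnv.
# fetch_path_for URI
#   prints "spark-env"    for URIs starting with the spark: prefix
#   prints "no-spark-env" for everything else (s3a://, https://, local paths)
fetch_path_for() {
  case "$1" in
    spark:*) echo "spark-env" ;;
    *)       echo "no-spark-env" ;;
  esac
}

# The first URI is from the PR description; the other two are hypothetical.
for uri in \
  "s3a://dongjoon/driver.yml" \
  "/opt/templates/driver.yml" \
  "spark://host.example.invalid:7077/files/driver.yml"
do
  printf '%s -> %s\n' "$uri" "$(fetch_path_for "$uri")"
done
```

Under this model, the template URIs passed at K8s submission time never take the branch that requires SparkEnv, which is consistent with the first failure being a false alarm.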

@dongjoon-hyun
Member Author

Also, the doc update is included. Thanks, @viirya .

Member

@HyukjinKwon HyukjinKwon left a comment

LGTM with the changes if tests pass

dongjoon-hyun and others added 2 commits March 18, 2021 18:16
…ark/deploy/k8s/KubernetesUtils.scala

Co-authored-by: Hyukjin Kwon <gurwls223@gmail.com>
…ark/deploy/k8s/features/PodTemplateConfigMapStep.scala

Co-authored-by: Hyukjin Kwon <gurwls223@gmail.com>
@dongjoon-hyun
Member Author

Thank you for the suggestion, @HyukjinKwon . I manually verified that the code works correctly and merged it.

@github-actions github-actions bot added the DOCS label Mar 19, 2021
Contributor

@attilapiros attilapiros left a comment

LGTM

@dongjoon-hyun
Member Author

Thank you for the review and approval, @HyukjinKwon and @attilapiros .

@SparkQA

SparkQA commented Mar 19, 2021

Kubernetes integration test starting
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/40811/

@SparkQA

SparkQA commented Mar 19, 2021

Kubernetes integration test status success
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/40811/

@dongjoon-hyun
Member Author

Thank you, @viirya !

@dongjoon-hyun
Member Author

The failure is an irrelevant one. Merged to master.

@dongjoon-hyun dongjoon-hyun deleted the SPARK-34783-2 branch March 19, 2021 15:54
flyrain pushed a commit to flyrain/spark that referenced this pull request Sep 21, 2021
This PR aims to support remote driver/executor template files.

Currently, `KubernetesUtils.loadPodFromTemplate` supports only local files.

With this PR, we can do the following.
```bash
bin/spark-submit \
...
-c spark.kubernetes.driver.podTemplateFile=s3a://dongjoon/driver.yml \
-c spark.kubernetes.executor.podTemplateFile=s3a://dongjoon/executor.yml \
...
```

Yes, this is an improvement.

Manual testing.

Closes apache#31877 from dongjoon-hyun/SPARK-34783-2.

Lead-authored-by: Dongjoon Hyun <dhyun@apple.com>
Co-authored-by: Dongjoon Hyun <dongjoon@apache.org>
Signed-off-by: Dongjoon Hyun <dhyun@apple.com>
(cherry picked from commit 2fa792a)
Signed-off-by: Dongjoon Hyun <dhyun@apple.com>