[SPARK-30949][K8S][CORE] Decouple requests and parallelism on drivers in K8s #27695
Conversation
ok to test
Hi, @onursatici. Thank you for making a PR, but the following claim seems to be outdated. Technically, it was fixed a long time ago (in January 2017, in JDK 8u121).
By using this, spark applications running on java 8 or older incorrectly get the total number of cores in the host, ignoring the cgroup limits set by kubernetes. Java 9 and newer runtimes do not have this problem.
Please see https://bugs.openjdk.java.net/browse/JDK-8173345 for the details.
Given the above, could you revise the PR description?
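For reference, a minimal sketch (not part of this PR) that exercises the JVM call being discussed; per the comment above, on JDK 8u121 and newer this reflects the container's cgroup CPU limits, while on JDK 8u120 and older it reports the host's core count:

```scala
// Minimal sketch, not from the PR: print the core count the JVM reports.
// On JDK 8u121+ inside a cgroup-limited container this respects the CPU limit;
// on JDK 8u120 and older it returns the host's total core count instead.
object AvailableCores {
  def main(args: Array[String]): Unit = {
    val cores = Runtime.getRuntime().availableProcessors()
    println(s"JVM-visible cores: $cores")
  }
}
```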
Kubernetes integration test starting
Kubernetes integration test status failure
Test build #118933 has finished for PR 27695 at commit
Ping @holdenk since the flakiness of
Gentle ping, @onursatici.
In my experience the K8s tests have all been flaky, but I’ll dig into them & decom as well this coming week.
IIUC this issue also affects Standalone cluster mode?
@dongjoon-hyun, what are the next steps? Does it look fine from your perspective?
@onursatici. One thing I'm thinking is the deprecation of
Since this also affects Standalone cluster, I'd suggest we only exclude the Mesos backend in the case match.
Hi, @jiangxb1987. What do you mean by this? This PR or this old JDK bug?
I mean the JDK bug mentioned in this PR.
@jiangxb1987 do you recommend doing that in this PR? I think changing the Standalone driver core count behaviour would broaden the scope of this PR such that it might warrant a separate discussion. Does this change make sense for k8s? Any blockers @dongjoon-hyun?
It's fine if you want to focus on the k8s behaviour here; I can submit another PR to fix the Standalone backend after this is merged.
LGTM
retest this please
Kubernetes integration test starting
Kubernetes integration test status success
Test build #120369 has finished for PR 27695 at commit
retest this please
Kubernetes integration test starting
Kubernetes integration test status failure
Test build #120398 has finished for PR 27695 at commit
retest this please
Kubernetes integration test starting
Kubernetes integration test status success
Test build #120434 has finished for PR 27695 at commit
@dongjoon-hyun do you mind taking a look at this? I have revised the PR description.
To merge this PR, it seems that we need to revise the `spark.kubernetes.driver.request.cores` documentation together. The following is the current one. Although it's correct, I guess we need to additionally mention the parallelism that this PR decouples from `spark.driver.cores`:
This takes precedence over `spark.driver.cores` for specifying the driver pod cpu request if set.
@onursatici. I'm still not sure about this approach.
The last possibility which I can guess is that you want to have bigger parallelism on a small container. Is that your case? Could you give us a more concrete example where this PR is beneficial?
What do you think about the above comment, @onursatici? I'd like to hear your opinion.
Gentle ping, @onursatici.
Hey @dongjoon-hyun, sorry for the late response. You are right that the feature I want with this PR is mainly to increase parallelism while keeping the cpu resource requests low in K8S.
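To illustrate that use case, a hedged sketch of how the two settings could be combined after this change; the specific values are hypothetical and not taken from the PR:

```scala
import org.apache.spark.SparkConf

// Hypothetical configuration, not from the PR: request only 1 CPU for the
// driver pod, while sizing the driver's internal parallelism (RpcEnv,
// MemoryManager, BlockManager thread pools) for 4 cores.
val conf = new SparkConf()
  .set("spark.kubernetes.driver.request.cores", "1") // driver pod CPU request
  .set("spark.driver.cores", "4")                    // driver-side parallelism
```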
Got it. Thanks, @onursatici. I'll revise the PR description a little and do the final review.
+1, LGTM. Thank you, @onursatici and @jiangxb1987.
Merged to master for Apache Spark 3.1.0.
… in K8s ### What changes were proposed in this pull request? `spark.driver.cores` configuration is used to set the amount of parallelism in kubernetes cluster mode drivers. Previously the amount of parallelism in the drivers were the number of cores in the host when running on JDK 8u120 or older, or the maximum of driver containers resource requests and limits when running on [JDK 8u121 or newer](https://bugs.openjdk.java.net/browse/JDK-8173345). This will enable users to specify `spark.driver.cores` to set parallelism, and specify `spark.kubernetes.driver.requests.cores` to limit the resource requests of the driver container, effectively decoupling the two ### Why are the changes needed? Drivers submitted in kubernetes cluster mode set the parallelism of various components like `RpcEnv`, `MemoryManager`, `BlockManager` from inferring the number of available cores by calling `Runtime.getRuntime().availableProcessors()`. By using this, spark applications running on JDK 8u120 or older incorrectly get the total number of cores in the host, [ignoring the cgroup limits set by kubernetes](https://bugs.openjdk.java.net/browse/JDK-6515172). JDK 8u121 and newer runtimes do not have this problem. Orthogonal to this, it is currently not possible to decouple resource limits on the driver container with the amount of parallelism of the various network and memory components listed above. ### Does this PR introduce any user-facing change? Yes. Previously the amount of parallelism in kubernetes cluster mode submitted drivers were the number of cores in the host when running on JDK 8u120 or older, or the maximum of driver containers resource requests and limits when running on JDK 8u121 or newer. Now the value of `spark.driver.cores` is used. ### How was this patch tested? happy to add tests if my proposal looks reasonable Closes apache#27695 from onursatici/os/decouple-requests-and-parallelism. Authored-by: Onur Satici <onursatici@gmail.com> Signed-off-by: Dongjoon Hyun <dongjoon@apache.org>
What changes were proposed in this pull request?
The `spark.driver.cores` configuration is used to set the amount of parallelism in Kubernetes cluster mode drivers. Previously the amount of parallelism in the drivers was the number of cores on the host when running on JDK 8u120 or older, or the maximum of the driver container's resource requests and limits when running on JDK 8u121 or newer. This will enable users to specify `spark.driver.cores` to set parallelism, and to specify `spark.kubernetes.driver.request.cores` to limit the resource requests of the driver container, effectively decoupling the two.
Why are the changes needed?
Drivers submitted in Kubernetes cluster mode set the parallelism of various components like `RpcEnv`, `MemoryManager`, and `BlockManager` by inferring the number of available cores via `Runtime.getRuntime().availableProcessors()`. Because of this, Spark applications running on JDK 8u120 or older incorrectly get the total number of cores on the host, ignoring the cgroup limits set by Kubernetes. JDK 8u121 and newer runtimes do not have this problem. Orthogonal to this, it is currently not possible to decouple the resource limits on the driver container from the amount of parallelism of the various network and memory components listed above.
Does this PR introduce any user-facing change?
Yes. Previously the amount of parallelism in drivers submitted in Kubernetes cluster mode was the number of cores on the host when running on JDK 8u120 or older, or the maximum of the driver container's resource requests and limits when running on JDK 8u121 or newer. Now the value of `spark.driver.cores` is used.
How was this patch tested?
Happy to add tests if my proposal looks reasonable.
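As a rough illustration of the idea described above (not the actual diff in this PR), a driver-side sketch of how the parallelism hint could be resolved in Kubernetes cluster mode, falling back to the JVM-reported core count elsewhere; the helper name and flag below are hypothetical:

```scala
import org.apache.spark.SparkConf

// Hypothetical helper, not the PR's actual code: use spark.driver.cores for
// the K8s cluster-mode driver's parallelism, otherwise fall back to the
// JVM-visible processor count (which is what availableProcessors() reports).
def driverParallelism(conf: SparkConf, isK8sClusterModeDriver: Boolean): Int = {
  if (isK8sClusterModeDriver && conf.contains("spark.driver.cores")) {
    conf.get("spark.driver.cores").toInt
  } else {
    Runtime.getRuntime().availableProcessors()
  }
}
```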