[SPARK-30949][K8S][CORE] Decouple requests and parallelism on drivers in K8s #27695

Conversation

onursatici
Contributor

@onursatici commented Feb 25, 2020

What changes were proposed in this pull request?

The spark.driver.cores configuration is used to set the amount of parallelism in Kubernetes cluster mode drivers. Previously, the amount of parallelism in the drivers was the number of cores in the host when running on JDK 8u120 or older, or the maximum of the driver container's resource requests and limits when running on JDK 8u121 or newer (https://bugs.openjdk.java.net/browse/JDK-8173345). This change enables users to specify spark.driver.cores to set parallelism and spark.kubernetes.driver.request.cores to limit the resource requests of the driver container, effectively decoupling the two (see the configuration sketch after this description).

Why are the changes needed?

Drivers submitted in Kubernetes cluster mode set the parallelism of various components like RpcEnv, MemoryManager, and BlockManager by inferring the number of available cores from Runtime.getRuntime().availableProcessors(). Because of this, Spark applications running on JDK 8u120 or older incorrectly get the total number of cores in the host, ignoring the cgroup limits set by Kubernetes (https://bugs.openjdk.java.net/browse/JDK-6515172). JDK 8u121 and newer runtimes do not have this problem.

Orthogonal to this, it is currently not possible to decouple the resource limits on the driver container from the amount of parallelism of the various network and memory components listed above.

Does this PR introduce any user-facing change?

Yes. Previously, the amount of parallelism in drivers submitted in Kubernetes cluster mode was the number of cores in the host when running on JDK 8u120 or older, or the maximum of the driver container's resource requests and limits when running on JDK 8u121 or newer. Now the value of spark.driver.cores is used.

How was this patch tested?

Happy to add tests if my proposal looks reasonable.
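
A minimal configuration sketch (illustrative values, not defaults) of the decoupling described above: spark.driver.cores sizes the parallelism-sensitive components, while spark.kubernetes.driver.request.cores caps what the driver pod actually requests from Kubernetes.

```scala
import org.apache.spark.SparkConf

// Illustrative values only: four threads' worth of driver-side parallelism,
// while the driver pod requests just half a CPU from Kubernetes.
val conf = new SparkConf()
  .set("spark.driver.cores", "4")                        // parallelism for RpcEnv, MemoryManager, BlockManager
  .set("spark.kubernetes.driver.request.cores", "500m")  // CPU actually requested for the driver pod
```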

@dongjoon-hyun
Member

ok to test

@dongjoon-hyun changed the title from "[SPARK-30949][K8S] decouple requests and parallelism on kubernetes drivers" to "[SPARK-30949][K8S][CORE] decouple requests and parallelism on kubernetes drivers" on Feb 25, 2020
Member

@dongjoon-hyun left a comment

Hi, @onursatici. Thank you for making a PR, but the following claim seems to be outdated. Technically, it was fixed a long time ago (in January 2017, in JDK 8u121).

By using this, spark applications running on java 8 or older incorrectly get the total number of cores in the host, ignoring the cgroup limits set by kubernetes. Java 9 and newer runtimes do not have this problem.

Please see https://bugs.openjdk.java.net/browse/JDK-8173345 for the details.

Given the above, could you revise the PR description?
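
For context, a minimal check (illustrative only, not part of this PR) of the JDK behaviour discussed above: run it inside a container whose cgroup restricts CPU.

```scala
// Illustrative check only (not part of this PR): print what the JVM thinks is available.
// Inside a CPU-restricted container, JDK 8u120 and older report the host's total cores;
// per JDK-8173345 referenced above, 8u121 and newer reflect the container's restriction.
object AvailableProcessorsCheck extends App {
  println(s"Runtime.availableProcessors = ${Runtime.getRuntime.availableProcessors()}")
}
```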

@SparkQA

SparkQA commented Feb 25, 2020

Kubernetes integration test starting
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/23681/

@SparkQA

SparkQA commented Feb 25, 2020

Kubernetes integration test status failure
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/23681/

@SparkQA

SparkQA commented Feb 26, 2020

Test build #118933 has finished for PR 27695 at commit 2b3ad5b.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@dongjoon-hyun
Member

Ping @holdenk since the flakiness of Test basic decommissioning is observed again.

- Test basic decommissioning *** FAILED ***
  The code passed to eventually never returned normally. Attempted 121 times over 2.01174358735 minutes. Last failure message: "++ id -u

@dongjoon-hyun
Member

Gentle ping, @onursatici.

@holdenk
Contributor

holdenk commented Feb 29, 2020

In my experience the K8s tests have all been flaky, but I'll dig into them & decom as well this coming week.

@jiangxb1987
Contributor

IIUC this issue also affects Standalone cluster mode?

@onursatici
Contributor Author

@dongjoon-hyun, what are the next steps? Does it look fine from your perspective?

@dongjoon-hyun
Member

dongjoon-hyun commented Mar 4, 2020

@onursatici. One thing I'm thinking about is the deprecation of 8u120 and older versions at 3.0.0. Until now, 3.0.0-preview2 has given an early warning (not an official deprecation yet) for 8u91 and older versions, like the following.

Java 8 prior to version 8u92 support is deprecated as of Spark 3.0.0

@jiangxb1987
Contributor

Since this also affects the Standalone cluster, I'd suggest we only exclude the Mesos backend in the case match.

@dongjoon-hyun
Member

dongjoon-hyun commented Mar 17, 2020

Hi, @jiangxb1987. What do you mean by this? This PR or the old JDK bug?

Since this also affects Standalone cluster, I'd suggest we only exclude Mesos backend in the case match.

@jiangxb1987
Contributor

I mean the JDK bug mentioned in this PR.

@onursatici
Contributor Author

@jiangxb1987 do you recommend doing that in this PR? I think changing the Standalone driver core count behaviour would broaden the scope of this PR such that it might warrant a separate discussion.

Does this change make sense for K8s? Any blockers, @dongjoon-hyun?

@jiangxb1987
Contributor

It's fine if you want to focus on the K8s behaviour here; I can submit another PR to fix the Standalone backend after this is merged.

Contributor

@jiangxb1987 left a comment

LGTM

@jiangxb1987
Contributor

retest this please

@SparkQA

SparkQA commented Mar 25, 2020

Kubernetes integration test starting
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/25078/

@SparkQA

SparkQA commented Mar 25, 2020

Kubernetes integration test status success
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/25078/

@SparkQA

SparkQA commented Mar 25, 2020

Test build #120369 has finished for PR 27695 at commit 2b3ad5b.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@jiangxb1987
Contributor

retest this please

@SparkQA

SparkQA commented Mar 26, 2020

Kubernetes integration test starting
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/25107/

@SparkQA

SparkQA commented Mar 26, 2020

Kubernetes integration test status failure
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/25107/

@SparkQA

SparkQA commented Mar 26, 2020

Test build #120398 has finished for PR 27695 at commit 2b3ad5b.

  • This patch fails due to an unknown error code, -9.
  • This patch merges cleanly.
  • This patch adds no public classes.

@jiangxb1987
Contributor

retest this please

@SparkQA

SparkQA commented Mar 26, 2020

Kubernetes integration test starting
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/25142/

@SparkQA

SparkQA commented Mar 26, 2020

Kubernetes integration test status success
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/25142/

@SparkQA

SparkQA commented Mar 26, 2020

Test build #120434 has finished for PR 27695 at commit 2b3ad5b.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@onursatici
Contributor Author

@dongjoon-hyun do you mind taking a look at this? I have revised the PR description.

Member

@dongjoon-hyun left a comment

To merge this PR, it seems that we need to revise the spark.kubernetes.driver.request.cores documentation as well. The following is the current text. Although it's correct, I guess we need to additionally mention the parallelism that is now decoupled and driven by spark.driver.cores after this PR.

This takes precedence over <code>spark.driver.cores</code>
for specifying the driver pod cpu request if set.

@dongjoon-hyun
Member

dongjoon-hyun commented Apr 15, 2020

@onursatici. I'm still not sure about this approach.

  • First, the K8s environment is different from an on-prem environment. I don't think users of Apache Spark 3.1.0 will use such old JDKs (JDK 8u120 or older). We reached the JDK 11 milestone at Apache Spark 3.0.0.
  • Second, this original YARN code was introduced by the following PR for Netty. In the K8s environment, the total number of available cores is the same as the requested amount.

The last possibility I can guess is that you want bigger parallelism on a small container. Is that your case? Could you give us a more concrete example of where this PR is beneficial?

@dongjoon-hyun
Member

What do you think about the above comment, @onursatici? I'm wondering about your opinion.

@dongjoon-hyun
Member

Gentle ping, @onursatici.

@onursatici
Contributor Author

Hey @dongjoon-hyun, sorry for the late response. You are right that the feature I want with this PR is mainly to increase parallelism while keeping the CPU resource requests low in K8s.

spark.driver.cores controls the number of threads in various components, including some network components that are mostly blocked on I/O. The ability to increase this while keeping the CPU requests the same would allow users to better utilise their K8s clusters, especially if their workloads are I/O bound.
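
A minimal sketch of this point (illustration only, not Spark internals; the object name and values are hypothetical): a thread pool sized from spark.driver.cores keeps many blocking, I/O-bound calls in flight even if the pod requests only a fraction of a CPU.

```scala
import java.util.concurrent.Executors
import scala.concurrent.duration._
import scala.concurrent.{Await, ExecutionContext, Future}

object IoBoundParallelismSketch extends App {
  // Hypothetical value taken from spark.driver.cores; the pod's CPU request can be much smaller.
  val driverCores = 8
  implicit val ec: ExecutionContext =
    ExecutionContext.fromExecutor(Executors.newFixedThreadPool(driverCores))

  // Eight "fetches" run concurrently; they mostly wait (simulated with sleep),
  // so a small CPU request is enough to keep them all in flight.
  val fetches = (1 to driverCores).map { i =>
    Future { Thread.sleep(100); s"block-$i" }
  }
  println(Await.result(Future.sequence(fetches), 5.seconds))
  sys.exit() // shut down the non-daemon thread pool
}
```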

@dongjoon-hyun
Member

dongjoon-hyun commented Apr 21, 2020

Got it. Thanks, @onursatici. I'll revise the PR description a little and do the final review.

@dongjoon-hyun changed the title from "[SPARK-30949][K8S][CORE] decouple requests and parallelism on kubernetes drivers" to "[SPARK-30949][K8S][CORE] Decouple requests and parallelism on kubernetes drivers" on Apr 21, 2020
@dongjoon-hyun changed the title from "[SPARK-30949][K8S][CORE] Decouple requests and parallelism on kubernetes drivers" to "[SPARK-30949][K8S][CORE] Decouple requests and parallelism on drivers in K8s" on Apr 21, 2020
Member

@dongjoon-hyun left a comment

+1, LGTM. Thank you, @onursatici and @jiangxb1987.
Merged to master for Apache Spark 3.1.0.

onursatici added a commit to palantir/spark that referenced this pull request Jun 25, 2020
rshkv pushed a commit to palantir/spark that referenced this pull request Jun 27, 2020