[SPARK-41188][CORE][ML] Set executorEnv OMP_NUM_THREADS to be spark.task.cpus by default for spark executor JVM processes #38699

WeichenXu123 · 2022-11-18T03:22:14Z

Signed-off-by: Weichen Xu <weichen.xu@databricks.com>

What changes were proposed in this pull request?

Set the executorEnv variable OMP_NUM_THREADS to spark.task.cpus by default for Spark executor JVM processes.
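As a rough illustration of the defaulting rule described above, here is a minimal sketch in plain Python. It is not the code from this PR: the function name `with_default_omp_threads` is hypothetical, the configuration is modeled as a simple dict, and the sketch assumes that an explicit `spark.executorEnv.OMP_NUM_THREADS` set by the user takes precedence over the default.

```python
from typing import Dict, Mapping

OMP_KEY = "spark.executorEnv.OMP_NUM_THREADS"
TASK_CPUS_KEY = "spark.task.cpus"


def with_default_omp_threads(conf: Mapping[str, str]) -> Dict[str, str]:
    """Return a copy of conf in which OMP_NUM_THREADS defaults to spark.task.cpus.

    Hypothetical helper for illustration only; Spark's real logic lives in its
    Scala code base and may differ in detail.
    """
    result = dict(conf)
    if OMP_KEY not in result:
        # spark.task.cpus defaults to 1 when the user has not set it.
        result[OMP_KEY] = result.get(TASK_CPUS_KEY, "1")
    return result


# No settings at all: the executor env gets OMP_NUM_THREADS=1.
print(with_default_omp_threads({}))

# An explicit user value is left untouched (assumed precedence).
print(with_default_omp_threads({TASK_CPUS_KEY: "2", OMP_KEY: "8"}))
```

Users who want a different thread count can still set `spark.executorEnv.OMP_NUM_THREADS` themselves, which is why the sketch only fills the value in when it is absent.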

Why are the changes needed?

This limits the number of threads used by OpenBLAS routines to the number of cores assigned to this executor, because some Spark ML algorithms call OpenBLAS via netlib-java,
e.g.:
Spark ALS estimator training calls the LAPACK API dppsv (which internally calls the BLAS library). If the underlying library is OpenBLAS, it will by default try to use all CPU cores. But Spark launches multiple tasks on a worker, each task might call dppsv at the same time, and each call internally creates multiple threads (as many threads as there are CPU cores). This causes CPU oversubscription.
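To make the oversubscription concrete, here is a small arithmetic sketch. The worker size (16 cores) and the task count are made-up numbers for illustration, and `total_blas_threads` is a hypothetical helper, not part of Spark or OpenBLAS.

```python
def total_blas_threads(concurrent_tasks: int, threads_per_call: int) -> int:
    """Threads competing for CPU when every task is inside a BLAS call at once."""
    return concurrent_tasks * threads_per_call


cores = 16             # hypothetical worker with 16 CPU cores
concurrent_tasks = 16  # one task per core, i.e. spark.task.cpus = 1

# Without a limit, each OpenBLAS call spawns one thread per core.
unbounded = total_blas_threads(concurrent_tasks, threads_per_call=cores)

# With OMP_NUM_THREADS = spark.task.cpus = 1, each call stays single-threaded.
bounded = total_blas_threads(concurrent_tasks, threads_per_call=1)

print(f"unbounded: {unbounded} threads on {cores} cores")
print(f"bounded:   {bounded} threads on {cores} cores")
```

In this hypothetical setup the unbounded case asks the scheduler to run 256 threads on 16 cores, while the bounded case matches the thread count to the core count.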

Does this PR introduce any user-facing change?

No.

How was this patch tested?

Manually.


LuciferYang

+1, LGTM

mridulm · 2022-11-18T17:06:00Z

If we are setting it in SparkContext, do we want to get rid of this from other places like PythonRunner.compute?

WeichenXu123 · 2022-11-19T04:13:51Z

> If we are setting it in SparkContext, do we want to get rid of this from other places like PythonRunner.compute?

I think we can remove the code in PythonRunner.compute.

Signed-off-by: Weichen Xu <weichen.xu@databricks.com>

…ask.cpus by default for spark executor JVM processes

Signed-off-by: Weichen Xu <weichen.xu@databricks.com>

### What changes were proposed in this pull request?

Set executorEnv OMP_NUM_THREADS to be spark.task.cpus by default for Spark executor JVM processes.

### Why are the changes needed?

This limits the number of threads used by OpenBLAS routines to the number of cores assigned to this executor, because some Spark ML algorithms call OpenBLAS via netlib-java, e.g.: Spark ALS estimator training calls the LAPACK API `dppsv` (which internally calls the BLAS library). If the underlying library is OpenBLAS, it will by default try to use all CPU cores. But Spark launches multiple tasks on a worker, each task might call `dppsv` at the same time, and each call internally creates multiple threads (as many threads as there are CPU cores). This causes CPU oversubscription.

### Does this PR introduce _any_ user-facing change?

No.

### How was this patch tested?

Manually.

Closes #38699 from WeichenXu123/SPARK-41188.

Authored-by: Weichen Xu <weichen.xu@databricks.com>
Signed-off-by: Weichen Xu <weichen.xu@databricks.com>
(cherry picked from commit 82a41d8)
Signed-off-by: Weichen Xu <weichen.xu@databricks.com>

WeichenXu123 · 2022-11-19T09:25:36Z

Merged to master / branch-3.3 / branch-3.2


jzhuge · 2023-02-27T07:54:29Z

> > If we are setting it in SparkContext, do we want to get rid of this from other places like PythonRunner.compute?
>
> I think we can remove the code in PythonRunner.compute.

Found an issue in YARN (SPARK-42596). Could you double check?

HyukjinKwon · 2023-02-28T00:33:32Z

Thanks for pointing it out and making a PR. I left a comment in your PR.


update

37d5e0c

Signed-off-by: Weichen Xu <weichen.xu@databricks.com>

github-actions bot added the CORE label Nov 18, 2022

WeichenXu123 requested a review from HyukjinKwon November 18, 2022 03:22

HyukjinKwon approved these changes Nov 18, 2022


LuciferYang approved these changes Nov 18, 2022


update

4066162

Signed-off-by: Weichen Xu <weichen.xu@databricks.com>

github-actions bot added MESOS PYTHON labels Nov 19, 2022

WeichenXu123 closed this in 82a41d8 Nov 19, 2022

HyukjinKwon mentioned this pull request Feb 28, 2023

[SPARK-42596][CORE][YARN] OMP_NUM_THREADS not set to number of executor cores by default #40199

Closed
