[SPARK-49760][YARN] Correct handling of `SPARK_USER` env variable override in app master by cnauroth · Pull Request #48214 · apache/spark

cnauroth · 2024-09-23T21:02:55Z

What changes were proposed in this pull request?

This patch corrects handling of a user-supplied SPARK_USER environment variable in the YARN app master. Currently, the user-supplied value gets appended to the default, like a classpath entry. The patch fixes it by using only the user-supplied value.

Why are the changes needed?

Overriding the SPARK_USER environment variable in the YARN app master with configuration property spark.yarn.appMasterEnv.SPARK_USER currently results in an incorrect value. Client#setupLaunchEnv first sets a default in the environment map using the Hadoop user. After that, YarnSparkHadoopUtil.addPathToEnvironment sees the existing value in the map and interprets the user-supplied value as needing to be appended like a classpath entry. The end result is the Hadoop user appended with the classpath delimiter and user-supplied value, e.g. cnauroth:overrideuser.

Does this PR introduce any user-facing change?

Yes, the app master now uses the user-supplied SPARK_USER if specified. (The default is still the Hadoop user.)

How was this patch tested?

Existing unit tests pass.
Added new unit tests covering default and overridden SPARK_USER for the app master. The override test fails without this patch, and then passes after the patch is applied.
Manually tested in a live YARN cluster as shown below.

Manual testing used the DFSReadWriteTest job with overrides of SPARK_USER:

spark-submit \
    --deploy-mode cluster \
    --files all-lines.txt \
    --class org.apache.spark.examples.DFSReadWriteTest \
    --conf spark.yarn.appMasterEnv.SPARK_USER=sparkuser_appMaster \
    --conf spark.driverEnv.SPARK_USER=sparkuser_driver \
    --conf spark.executorEnv.SPARK_USER=sparkuser_executor \
    /usr/lib/spark/examples/jars/spark-examples.jar \
    all-lines.txt /tmp/DFSReadWriteTest

Before the patch, we can see the app master's SPARK_USER mishandled by looking at the _SUCCESS file in HDFS:

hdfs dfs -ls -R /tmp/DFSReadWriteTest

drwxr-xr-x   - cnauroth:sparkuser_appMaster hadoop          0 2024-09-20 23:35 /tmp/DFSReadWriteTest/dfs_read_write_test
-rw-r--r--   1 cnauroth:sparkuser_appMaster hadoop          0 2024-09-20 23:35 /tmp/DFSReadWriteTest/dfs_read_write_test/_SUCCESS
-rw-r--r--   1 sparkuser_executor                      hadoop    2295080 2024-09-20 23:35 /tmp/DFSReadWriteTest/dfs_read_write_test/part-00000
-rw-r--r--   1 sparkuser_executor                      hadoop    2288718 2024-09-20 23:35 /tmp/DFSReadWriteTest/dfs_read_write_test/part-00001

After the patch, we can see it working correctly:

hdfs dfs -ls -R /tmp/DFSReadWriteTest
drwxr-xr-x   - sparkuser_appMaster hadoop          0 2024-09-23 17:13 /tmp/DFSReadWriteTest/dfs_read_write_test
-rw-r--r--   1 sparkuser_appMaster hadoop          0 2024-09-23 17:13 /tmp/DFSReadWriteTest/dfs_read_write_test/_SUCCESS
-rw-r--r--   1 sparkuser_executor  hadoop    2295080 2024-09-23 17:13 /tmp/DFSReadWriteTest/dfs_read_write_test/part-00000
-rw-r--r--   1 sparkuser_executor  hadoop    2288718 2024-09-23 17:13 /tmp/DFSReadWriteTest/dfs_read_write_test/part-00001

Was this patch authored or co-authored using generative AI tooling?

No.

…able override in app master. ### What changes were proposed in this pull request? This patch corrects handling of a user-supplied `SPARK_USER` environment variable in the YARN app master. Currently, the user-supplied value gets appended to the default, like a classpath entry. The patch fixes it by using only the user-supplied value. ### Why are the changes needed? Overriding the `SPARK_USER` environment variable in the YARN app master with configuration property `spark.yarn.appMasterEnv.SPARK_USER` currently results in an incorrect value. `Client#setupLaunchEnv` first sets a default in the environment map using the Hadoop user. After that, `YarnSparkHadoopUtil.addPathToEnvironment` sees the existing value in the map and interprets the user-supplied value as needing to be appended like a classpath entry. The end result is the Hadoop user appended with the classpath delimiter and user-supplied value, e.g. `cnauroth:overrideuser`. ### Does this PR introduce _any_ user-facing change? Yes, the app master now uses the user-supplied `SPARK_USER` if specified. (The default is still the Hadoop user.) ### How was this patch tested? * Existing unit tests pass. * Added new unit tests covering default and overridden `SPARK_USER` for the app master. The override test fails without this patch, and then passes after the patch is applied. * Manually tested in a live YARN cluster as shown below. Manual testing used the `DFSReadWriteTest` job with overrides of `SPARK_USER`: ``` spark-submit \ --deploy-mode cluster \ --files all-lines.txt \ --class org.apache.spark.examples.DFSReadWriteTest \ --conf spark.yarn.appMasterEnv.SPARK_USER=sparkuser_appMaster \ --conf spark.driverEnv.SPARK_USER=sparkuser_driver \ --conf spark.executorEnv.SPARK_USER=sparkuser_executor \ /usr/lib/spark/examples/jars/spark-examples.jar \ all-lines.txt /tmp/DFSReadWriteTest ``` Before the patch, we can see the app master's `SPARK_USER` mishandled by looking at the `_SUCCESS` file in HDFS: ``` hdfs dfs -ls -R /tmp/DFSReadWriteTest drwxr-xr-x - cnauroth:sparkuser_appMaster hadoop 0 2024-09-20 23:35 /tmp/DFSReadWriteTest/dfs_read_write_test -rw-r--r-- 1 cnauroth:sparkuser_appMaster hadoop 0 2024-09-20 23:35 /tmp/DFSReadWriteTest/dfs_read_write_test/_SUCCESS -rw-r--r-- 1 sparkuser_executor hadoop 2295080 2024-09-20 23:35 /tmp/DFSReadWriteTest/dfs_read_write_test/part-00000 -rw-r--r-- 1 sparkuser_executor hadoop 2288718 2024-09-20 23:35 /tmp/DFSReadWriteTest/dfs_read_write_test/part-00001 ``` After the patch, we can see it working correctly: ``` hdfs dfs -ls -R /tmp/DFSReadWriteTest drwxr-xr-x - sparkuser_appMaster hadoop 0 2024-09-23 17:13 /tmp/DFSReadWriteTest/dfs_read_write_test -rw-r--r-- 1 sparkuser_appMaster hadoop 0 2024-09-23 17:13 /tmp/DFSReadWriteTest/dfs_read_write_test/_SUCCESS -rw-r--r-- 1 sparkuser_executor hadoop 2295080 2024-09-23 17:13 /tmp/DFSReadWriteTest/dfs_read_write_test/part-00000 -rw-r--r-- 1 sparkuser_executor hadoop 2288718 2024-09-23 17:13 /tmp/DFSReadWriteTest/dfs_read_write_test/part-00001 ``` ### Was this patch authored or co-authored using generative AI tooling? No.

cnauroth · 2024-09-23T21:04:05Z

I'm interested in getting this into master and branch-3.5. It won't cherry-pick cleanly though, so I'll put together a separate branch-3.5 pull request if this gets approved.

dongjoon-hyun

+1, LGTM. Thank you, @cnauroth .

cc @mridulm , too.

dongjoon-hyun · 2024-09-23T21:17:29Z

BTW, according to the original code, this seems to be very old bug, right? Then, we need to backport this to branch-3.4 too.

cnauroth · 2024-09-23T22:44:23Z

BTW, according to the original code, this seems to be very old bug, right? Then, we need to backport this to branch-3.4 too.

Sounds good! I filed #48216 for branch-3.5. The same patch cherry-picks to branch-3.4, and I'm confirming tests on that branch now.

dongjoon-hyun · 2024-09-24T04:35:08Z

Thank you, @cnauroth and @yaooqinn .
Merged to master.

cnauroth · 2024-09-24T17:51:44Z

@dongjoon-hyun and @yaooqinn , thanks very much for the review and commit!

github-actions bot added the YARN label Sep 23, 2024

dongjoon-hyun changed the title ~~[SPARK-49760][YARN] Correct handling of SPARK_USER environment vari…~~ [SPARK-49760][YARN] Correct handling of SPARK_USER env variable override in app master Sep 23, 2024

dongjoon-hyun approved these changes Sep 23, 2024

View reviewed changes

cnauroth mentioned this pull request Sep 23, 2024

[SPARK-49760][YARN] Correct handling of SPARK_USER env variable override in app master #48216

Closed

yaooqinn approved these changes Sep 24, 2024

View reviewed changes

dongjoon-hyun closed this in 35e5d29 Sep 24, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[SPARK-49760][YARN] Correct handling of `SPARK_USER` env variable override in app master#48214

[SPARK-49760][YARN] Correct handling of `SPARK_USER` env variable override in app master#48214
cnauroth wants to merge 1 commit intoapache:masterfrom
cnauroth:SPARK-49760

cnauroth commented Sep 23, 2024 •

edited by dongjoon-hyun

Loading

Uh oh!

cnauroth commented Sep 23, 2024

Uh oh!

dongjoon-hyun left a comment

Uh oh!

dongjoon-hyun commented Sep 23, 2024

Uh oh!

cnauroth commented Sep 23, 2024

Uh oh!

dongjoon-hyun commented Sep 24, 2024

Uh oh!

cnauroth commented Sep 24, 2024

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

Conversation

cnauroth commented Sep 23, 2024 • edited by dongjoon-hyun Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

What changes were proposed in this pull request?

Why are the changes needed?

Does this PR introduce any user-facing change?

How was this patch tested?

Was this patch authored or co-authored using generative AI tooling?

Uh oh!

cnauroth commented Sep 23, 2024

Uh oh!

dongjoon-hyun left a comment

Choose a reason for hiding this comment

Uh oh!

dongjoon-hyun commented Sep 23, 2024

Uh oh!

cnauroth commented Sep 23, 2024

Uh oh!

dongjoon-hyun commented Sep 24, 2024

Uh oh!

cnauroth commented Sep 24, 2024

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

cnauroth commented Sep 23, 2024 •

edited by dongjoon-hyun

Loading