[SPARK-48651][DOC] Configuring different JDK for Spark on YARN #47010

Closed
pan3793 wants to merge 4 commits

pan3793 (Member) commented on Jun 18, 2024

What changes were proposed in this pull request?

This PR updates the Spark on YARN docs to guide users to configure a different JDK for Spark Applications.

Why are the changes needed?

As of today, the latest Apache Hadoop release, 3.4.0, does not yet support Java 17, while Spark 4.0.0 requires Java 17 or later. Users who want to run Spark on YARN must therefore configure a JDK for their Spark applications that differs from the one YARN itself runs on.

This was also asked on the mailing list: https://lists.apache.org/thread/ply807h0hht1h8o7x7g1s3j51mnot5dr

Does this PR introduce any user-facing change?

Yes, it changes the user docs.

How was this patch tested?

I verified the command in a YARN cluster.

The following command submits a Spark application that runs on a JDK 21 distributed through --archives:

JAVA_HOME=/opt/openjdk-21 spark-submit \
  --master=yarn \
  --deploy-mode=cluster \
  --archives ./openjdk-21.tar.gz \
  --conf spark.yarn.appMasterEnv.JAVA_HOME=./openjdk-21.tar.gz/openjdk-21 \
  --conf spark.executorEnv.JAVA_HOME=./openjdk-21.tar.gz/openjdk-21 \
  --class org.apache.spark.examples.SparkPi \
  spark-examples*.jar 1
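The two JAVA_HOME values in the command above are relative paths of the form ./<archive file name>/<top-level directory inside the archive>. This relies on YARN unpacking the archive and linking it into the container working directory under the archive's file name. A minimal sketch of that path construction follows. The helper name container_java_home is hypothetical and not part of Spark, and it assumes the tarball contains a single top-level directory named openjdk-21:

```shell
#!/bin/sh
# Hypothetical helper (not part of Spark): derive the container-relative
# JAVA_HOME from the archive path given to --archives and the name of the
# top-level directory inside the tarball (assumed to be known in advance).
container_java_home() {
  archive_path=$1   # e.g. ./openjdk-21.tar.gz
  jdk_dir=$2        # e.g. openjdk-21
  # Only the archive's file name matters inside the container, because the
  # unpacked archive is linked into the working directory under that name.
  printf './%s/%s\n' "$(basename "$archive_path")" "$jdk_dir"
}

java_home=$(container_java_home ./openjdk-21.tar.gz openjdk-21)

# Print the two --conf flags used in the spark-submit command above.
echo "--conf spark.yarn.appMasterEnv.JAVA_HOME=$java_home"
echo "--conf spark.executorEnv.JAVA_HOME=$java_home"
```

If the archive is stored elsewhere on the submitting machine, for example under /tmp/jdks, the container-side path should stay the same, since only the file name is used. Listing the tarball with tar -tzf before submitting is a quick way to confirm the top-level directory name.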

Was this patch authored or co-authored using generative AI tooling?

No.

github-actions bot added the DOCS label on Jun 18, 2024

pan3793 (Member, Author) commented on Jun 18, 2024

tgravescs (Contributor) left a comment

lgtm

Comment on lines +1066 to +1067
--conf spark.yarn.appMasterEnv.JAVA_HOME=./openjdk-21.tar.gz/openjdk-21 \
--conf spark.executorEnv.JAVA_HOME=./openjdk-21.tar.gz/openjdk-21 \
pan3793 (Member, Author):

@yaooqinn @tgravescs sorry for correcting this in 5bbe200 after your approval. I also updated the PR description to add the manual test result from a YARN cluster.

cloud-fan (Contributor):

thanks, merging to master!

cloud-fan closed this in b77caf7 on Jun 19, 2024