Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[SPARK-43370] Switch spark user only when run driver and executor #43

Closed
wants to merge 2 commits into from

Conversation

Yikun
Copy link
Member

@Yikun Yikun commented May 25, 2023

What changes were proposed in this pull request?

Switch spark user only when run driver and executor

Why are the changes needed?

Address doi comments: question 7 [1]

[1] docker-library/official-images#13089 (comment)
[2] docker-library/official-images#13089 (comment)

Does this PR introduce any user-facing change?

Yes

How was this patch tested?

  1. test mannuly
cd ~/spark-docker/3.4.0/scala2.12-java11-ubuntu
$ docker build . -t spark-test

$ docker run -ti spark-test bash
spark@afa78af05cf8:/opt/spark/work-dir$

$ docker run  --user root  -ti spark-test bash
root@095e0d7651fd:/opt/spark/work-dir#
  1. ci passed

Closes: #44

Signed-off-by: Yikun Jiang <yikunkero@gmail.com>
@Yikun
Copy link
Member Author

Yikun commented May 26, 2023

  1. spark as default USER (this PR)
  • Run directly: spark user will be used when enter docker image
$ docker run -ti spark-test bash
spark@afa78af05cf8:/opt/spark/work-dir$
  • Base image: If users want to build image based on base image, users will have to switch root to install extra apt pkg and then switch back to spark.
  • In previous docker (v3.3.0), docker run -ti apache/spark:v3.3.0 bash, 185 user will be used.
  • K8s: we pin to spark user.
  1. root as default USER ([SPARK-43370] Switch spark user only when run driver and executor and set root as default #44)
  • Run directly: root user will be used when enter docker image
$ docker run -ti spark-test bash
root@afa78af05cf8:/opt/spark/work-dir$
  • Base image: If users want to build image based on base image, they can install directly
  • K8s: we pin to spark user.

So, looks like root user is more convenient (but bring some behavior change compare to previous apache/spark image). Wait for feedback from DOI.

Comment on lines 113 to 114
CMD=("$@")
"${CMD[@]}"

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should have exec as well. Simplified, it could be exec "$@". (Applies here or to #44)

@Yikun
Copy link
Member Author

Yikun commented Jun 1, 2023

According the suggestion of docker-library/official-images#13089 (comment)

I recommend that if the entrypoint needs to do setup work as root before stepping down then using the default USER root is appropriate (like mysql), otherwise USER configured to an app-specific user in the Dockerfile is better from a security perspective.

For spark, root USER is not required by any setup work, so I think spark user is enough. So I am going to merge this PR, and close #44 .

@Yikun Yikun marked this pull request as ready for review June 1, 2023 03:36
@Yikun Yikun closed this in 2dc12d9 Jun 1, 2023
@Yikun
Copy link
Member Author

Yikun commented Jun 1, 2023

@HyukjinKwon @yosifkit Thanks for review.

Merged.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
3 participants