[SPARK-40746][INFRA] Switch to `pull_request_target` and fix workflow #5
Conversation
```
      - '.github/workflows/main.yml'

jobs:
  run-build:
    permissions:
      packages: write
```
Is this needed? This workflow just delegates to main.yml, where this is also specified.
It does not do any harm either!
It's needed; see also: apache/spark#38145
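For context, a hedged sketch of why the caller may need this: when a workflow delegates to a reusable workflow, GitHub caps the called workflow's `GITHUB_TOKEN` permissions at whatever the calling job grants, so `packages: write` has to appear in the caller as well. The job and file names below are illustrative, not the actual workflow:

```yaml
# Illustrative caller workflow (names assumed, not the real file).
jobs:
  run-build:
    permissions:
      packages: write   # the called workflow cannot exceed what the caller grants
    uses: ./.github/workflows/main.yml
```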
Co-authored-by: Martin Grigorov <martin-g@users.noreply.github.com>
```diff
@@ -98,7 +102,7 @@ jobs:
         with:
           context: ${{ env.IMAGE_PATH }}
           push: true
-          tags: ${{ env.TEST_REPO }}:${{ env.UNIQUE_IMAGE_TAG }}
+          tags: ${{ env.TEST_REPO }}/${{ env.IMAGE_NAME }}:${{ env.UNIQUE_IMAGE_TAG }}
```
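To illustrate the change, here is a sketch of how the full image reference is composed after the fix; the env values below are hypothetical stand-ins for what the workflow sets:

```shell
# Hypothetical values for illustration; the real ones come from the workflow env.
TEST_REPO=ghcr.io/apache/spark-docker
IMAGE_NAME=spark
UNIQUE_IMAGE_TAG=5-0a89805
# The old form (${TEST_REPO}:${UNIQUE_IMAGE_TAG}) omitted the image name;
# the new form namespaces the tag by image name:
echo "${TEST_REPO}/${IMAGE_NAME}:${UNIQUE_IMAGE_TAG}"
```

This prints a reference like `ghcr.io/apache/spark-docker/spark:5-0a89805`, so different images no longer compete for the same tag in one repository.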
Hmm, this might have issues when multiple concurrent jobs push the same tag. Of course, we could use Actions concurrency or add a random hash to the tag to protect against it.
But after some consideration, I think what we need to do in this PR is just remove the push (`push: true`) and change it to a local build in the minikube Docker daemon (no real push) when we add the K8s test.
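The local-build approach suggested above might look roughly like this; the image tag and build context are illustrative, and this assumes a running minikube cluster:

```shell
# Point the Docker CLI at minikube's internal daemon, then build locally.
# The image is then visible to the K8s cluster without any registry push.
eval "$(minikube docker-env)"
docker build -t spark:test .
# Alternatively, load a host-built image into minikube:
# minikube image load spark:test
```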
This PR is replaced by #7
### What changes were proposed in this pull request?

This patch:
- Adds the spark uid/gid in the Dockerfile (`useradd` and `groupadd`), used in the entrypoint. This approach is also used by [other DOIs](https://github.com/search?p=2&q=org%3Adocker-library+useradd&type=Code) and Apache DOIs (such as [zookeeper](https://github.com/31z4/zookeeper-docker/blob/master/3.8.0/Dockerfile#L17-L21), [solr](https://github.com/apache/solr-docker/blob/a20477ed123cd1a72132aebcc0742cee46b5f976/9.0/Dockerfile#L108-L110), [flink](https://github.com/apache/flink-docker/blob/master/1.15/scala_2.12-java11-ubuntu/Dockerfile#L55-L56)).
- Uses the `spark` user in `entrypoint.sh` rather than in the Dockerfile, to make sure the Spark process is executed as a non-root user.
- Removes the `USER` setting in the Dockerfile, to make sure images built on this base have permission to extend the Dockerfile, such as executing `apt update`.
- Chowns scripts to `spark:spark` instead of `root:root`, to avoid permission issues such as in standalone mode.
- Adds the `gosu` dependency, a `sudo` replacement recommended by [Docker](https://docs.docker.com/develop/develop-images/dockerfile_best-practices/#user) and the [Docker official images guidelines](https://github.com/docker-library/official-images/blob/9a4d54f1a42ea82970baa4e6f3d0bc75e98fc961/README.md#consistency), and also used by other DOI images.

This change also follows the rules for Docker official images; see also [consistency](https://github.com/docker-library/official-images/blob/9a4d54f1a42ea82970baa4e6f3d0bc75e98fc961/README.md#consistency) and [Dockerfile best practices about USER](https://docs.docker.com/develop/develop-images/dockerfile_best-practices/#user).

### Why are the changes needed?

The issues below are what I have found so far.

1. **Irregular login username.** The image's username is not very standard; running with the `185` username is a little weird.

```
$ docker run -ti apache/spark bash
185d88a24357413:/opt/spark/work-dir$
```

2. **Permission issues with Spark sbin scripts.** There are also permission issues when running some Spark scripts, such as standalone mode:

```
$ docker run -ti apache/spark /opt/spark/sbin/start-master.sh
mkdir: cannot create directory '/opt/spark/logs': Permission denied
chown: cannot access '/opt/spark/logs': No such file or directory
starting org.apache.spark.deploy.master.Master, logging to /opt/spark/logs/spark--org.apache.spark.deploy.master.Master-1-1c345a00e312.out
/opt/spark/sbin/spark-daemon.sh: line 135: /opt/spark/logs/spark--org.apache.spark.deploy.master.Master-1-1c345a00e312.out: No such file or directory
failed to launch: nice -n 0 /opt/spark/bin/spark-class org.apache.spark.deploy.master.Master --host 1c345a00e312 --port 7077 --webui-port 8080
tail: cannot open '/opt/spark/logs/spark--org.apache.spark.deploy.master.Master-1-1c345a00e312.out' for reading: No such file or directory
full log in /opt/spark/logs/spark--org.apache.spark.deploy.master.Master-1-1c345a00e312.out
```

3. **Using spark as a base image is not supported well**, due to the static `USER` set in the Dockerfile.

```
$ cat Dockerfile
FROM apache/spark
RUN apt update
$ docker build -t spark-test:1015 .
// ...
------
 > [2/2] RUN apt update:
#5 0.405 E: Could not open lock file /var/lib/apt/lists/lock - open (13: Permission denied)
#5 0.405 E: Unable to lock directory /var/lib/apt/lists/
------
executor failed running [/bin/sh -c apt update]: exit code: 100
```

### Does this PR introduce _any_ user-facing change?

Yes.

### How was this patch tested?

- CI passed: all K8s tests
- Regression tests:

```
# Username is set to spark rather than 185
docker run -ti spark:scala2.12-java11-python3-r-ubuntu bash
spark27bbfca0a581:/opt/spark/work-dir$
```

```
# start-master.sh no permission issue
$ docker run -ti spark:scala2.12-java11-python3-r-ubuntu bash
spark8d1118e26766:~/work-dir$ /opt/spark/sbin/start-master.sh
starting org.apache.spark.deploy.master.Master, logging to /opt/spark/logs/spark--org.apache.spark.deploy.master.Master-1-8d1118e26766.out
```

```
# Image as parent case
$ cat Dockerfile
FROM spark:scala2.12-java11-python3-r-ubuntu
RUN apt update
$ docker build -t spark-test:1015 .
[+] Building 7.8s (6/6) FINISHED
 => [1/2] FROM docker.io/library/spark:scala2.12-java11-python3-r-ubuntu 0.0s
 => [2/2] RUN apt update 7.7s
```

- Other tests:

```
# Test on pyspark
$ cd spark-docker/3.3.0/scala2.12-java11-python3-r-ubuntu
$ docker build -t spark:scala2.12-java11-python3-r-ubuntu .
$ docker run -p 4040:4040 -ti spark:scala2.12-java11-python3-r-ubuntu /opt/spark/bin/pyspark
```

```
# A simple test for `start-master.sh` (standalone mode)
$ docker run -ti spark:scala2.12-java11-python3-r-ubuntu bash
spark8d1118e26766:~/work-dir$ /opt/spark/sbin/start-master.sh
starting org.apache.spark.deploy.master.Master, logging to /opt/spark/logs/spark--org.apache.spark.deploy.master.Master-1-8d1118e26766.out
```

Closes #11 from Yikun/spark-user.

Authored-by: Yikun Jiang <yikunkero@gmail.com>
Signed-off-by: Yikun Jiang <yikunkero@gmail.com>
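The gosu pattern described above can be sketched as a small shell function. This is a simplified illustration of the idea, not the actual `entrypoint.sh`; `maybe_gosu` is a hypothetical helper, and the uid is passed in explicitly to keep the example deterministic:

```shell
# Illustrative sketch of the gosu privilege-drop pattern: when the container
# starts as root (uid 0), re-exec the command via gosu as the spark user;
# otherwise run it directly. Prints the command instead of exec'ing it.
maybe_gosu() {
  uid="$1"; shift
  if [ "$uid" -eq 0 ]; then
    echo "gosu spark $*"   # a real entrypoint would: exec gosu spark "$@"
  else
    echo "$*"              # already non-root, run directly
  fi
}

maybe_gosu 0 /opt/spark/bin/spark-class    # prints: gosu spark /opt/spark/bin/spark-class
maybe_gosu 185 /opt/spark/bin/spark-class  # prints: /opt/spark/bin/spark-class
```

Keeping the switch in the entrypoint (rather than a static `USER` in the Dockerfile) is what allows derived images to run root-only steps like `apt update` while the Spark process itself still runs as non-root.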
### What changes were proposed in this pull request?

- This patch changes the `build-args` approach to "patch in test" in the build and publish workflows, because Docker official images do not support **parameterized FROM** values. docker-library/official-images#13089 (comment)
- It also refactors the publish workflow: ![image](https://user-images.githubusercontent.com/1736354/236613626-96f8fbf6-7df7-4d10-b4fb-be4d57c56dce.png)

### Why are the changes needed?

This is the same change as the build workflow refactor, to avoid publish failures like:

```
#5 [linux/amd64 internal] load metadata for docker.io/library/spark:3.4.0-scala2.12-java11-ubuntu
#5 ERROR: pull access denied, repository does not exist or may require authorization: server message: insufficient_scope: authorization failed
------
 > [linux/amd64 internal] load metadata for docker.io/library/spark:3.4.0-scala2.12-java11-ubuntu:
------
Dockerfile:18
--------------------
  16 |     #
  17 |     ARG BASE_IMAGE=spark:3.4.0-scala2.12-java11-ubuntu
  18 | >>> FROM $BASE_IMAGE
  19 |
  20 |     RUN set -ex && \
--------------------
ERROR: failed to solve: spark:3.4.0-scala2.12-java11-ubuntu: pull access denied, repository does not exist or may require authorization: server message: insufficient_scope: authorization failed
Error: buildx failed with: ERROR: failed to solve: spark:3.4.0-scala2.12-java11-ubuntu: pull access denied, repository does not exist or may require authorization: server message: insufficient_scope: authorization failed
```

### Does this PR introduce _any_ user-facing change?

No

### How was this patch tested?

Publish test in my local fork:
- https://github.com/Yikun/spark-docker/actions/runs/5076986823/jobs/9120029759: skips the local base build and uses the [published base](https://github.com/Yikun/spark-docker/actions/runs/5076986823/jobs/9120029759#step:11:135) image: ![image](https://user-images.githubusercontent.com/1736354/236612540-2b454c14-e194-4d73-b859-0df001570d27.png)

```
#3 [linux/amd64 internal] load metadata for ghcr.io/yikun/spark-docker/spark:3.4.0-scala2.12-java11-ubuntu
#3 DONE 0.9s
#4 [linux/arm64 internal] load metadata for ghcr.io/yikun/spark-docker/spark:3.4.0-scala2.12-java11-ubuntu
#4 DONE 0.9s
```

- CI passed: does the local base build first and builds on top of the local build

Closes #39 from Yikun/publish-build.

Authored-by: Yikun Jiang <yikunkero@gmail.com>
Signed-off-by: Yikun Jiang <yikunkero@gmail.com>
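The "patch in test" idea can be sketched as rewriting the `FROM` line to a concrete base image instead of passing it as a build-arg. This is only an illustration of the approach, not the actual workflow step; the image names are taken from the logs above:

```shell
# Rewrite the parameterized FROM (not allowed by Docker official images)
# to a concrete, already-published base image before building.
patched=$(printf 'ARG BASE_IMAGE=spark:3.4.0-scala2.12-java11-ubuntu\nFROM $BASE_IMAGE\n' |
  sed 's|^FROM .*|FROM ghcr.io/yikun/spark-docker/spark:3.4.0-scala2.12-java11-ubuntu|')
echo "$patched"
```

The patched Dockerfile then pulls a real, resolvable image at metadata-load time, avoiding the `pull access denied` failure shown in the error log.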
### What changes were proposed in this pull request?

This patch is to make the workflow work in the apache repo:
- Switch `pull_request` to `pull_request_target` to make sure the PR has permission to push the test image to `ghcr.io/apache/spark-docker/spark:TAG`. This real image push will help us do K8s E2E tests in the future.
- Add `.github/workflows/build_3.3.0.yaml` and `3.3.0/**` to the trigger paths.
- Change `apache/spark-docker:TAG` to `ghcr.io/apache/spark-docker/spark:TAG`.

### Why are the changes needed?

To make the workflow work well in the apache repo. `pull_request_target` helps us ensure the permissions of the PR: https://github.com/apache/spark/search?q=pull_request_target

### Does this PR introduce _any_ user-facing change?

No

### How was this patch tested?

The CI will only work after this patch is merged; it won't be triggered in this PR (due to `pull_request_target`). We can't do a complete test before the merge, but:
- Tested `pull_request_target` with @martin-g to simulate the behavior of external contributors contributing to the main repository: Minor test Yikun/spark-docker#2