[SPARK-33927][BUILD] Fix Dockerfile for Spark release to work #30971

Closed
wants to merge 3 commits into from

HyukjinKwon
Member

@HyukjinKwon HyukjinKwon commented Dec 30, 2020

What changes were proposed in this pull request?

This PR proposes to fix the Dockerfile for Spark release.

  • Port b135db3 to Dockerfile
  • Upgrade Ubuntu 18.04 -> 20.04 (because of porting b135db3)
  • Remove Python 2 (because of Ubuntu upgrade)
  • Use built-in Python 3.8.5 (because of Ubuntu upgrade)
  • Node.js 11 -> 12 (because of Ubuntu upgrade)
  • Ruby 2.5 -> 2.7 (because of Ubuntu upgrade)
  • Upgrade Python dependencies and Jekyll + plugins to the latest versions, as used in the GitHub Actions build (unrelated to the issue itself)
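
One way to sanity-check the version bumps listed above is to compare the version strings reported inside the built image against the expected prefixes. The following is a minimal sketch, not part of this PR: `has_version_prefix` and `check_tool` are hypothetical helpers, and it assumes the image is tagged `spark-rm` (as in the test command) and that the tool being checked is on the image's `PATH`.

```shell
#!/usr/bin/env bash

# Hypothetical helper: succeed when the version output in $1 contains a
# version number that starts with the prefix in $2 (for example "3.8").
has_version_prefix() {
  local output="$1" prefix="$2"
  # Escape dots so that "3.8" is matched literally by the regex.
  local pattern="${prefix//./\\.}"
  printf '%s\n' "$output" | grep -Eq "(^|[^0-9.])v?${pattern}([^0-9]|\$)"
}

# Hypothetical wrapper: run a command inside the image and check the
# version it prints. Assumes an image tagged spark-rm already exists.
check_tool() {
  local prefix="$1"
  shift
  local output
  output="$(docker run --rm spark-rm "$@" 2>&1)" || return 1
  has_version_prefix "$output" "$prefix"
}

# Example call (requires the built image):
#   check_tool 3.8 python3 --version && echo "Python OK"
```

The string matching is kept separate from the `docker run` call so that it can be exercised without an image.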

Why are the changes needed?

To make a Spark release :-).

Does this PR introduce any user-facing change?

No, dev-only.

How was this patch tested?

Manually tested via:

cd dev/create-release/spark-rm
docker build -t spark-rm --build-arg UID=$UID .
...
Successfully built 516d7943634f
Successfully tagged spark-rm:latest

@dongjoon-hyun
Member

Thank you, @HyukjinKwon !

@HyukjinKwon
Member Author

cc @dongjoon-hyun, @sarutak, @cloud-fan FYI

# * Python (2.7.15/3.6.7)
# * R-base/R-base-dev (4.0.2)
# * Ruby 2.3 build utilities
# * Python (3.8.5)
Member

Does the doc generation work with Python 3.8, @HyukjinKwon ?

Member Author

@HyukjinKwon HyukjinKwon Dec 30, 2020

Yeah, I use Python 3.8 locally :-). I checked that it works for the PySpark documentation build. I will double-check when I do the actual release.

BTW, since we switched to Rouge in #26521, the other documentation builds no longer depend on Python, if I am not mistaken. So it should be fine.

Member

@dongjoon-hyun dongjoon-hyun left a comment

Since this has to be tested manually, I verified it locally. We may hit another issue when we run the release script based on this image. Let's merge this first and proceed, @HyukjinKwon .

$ docker build -t spark-rm --build-arg UID=$UID .
[+] Building 966.5s (9/9) FINISHED
 => [internal] load build definition from Dockerfile                                                          0.0s
 => => transferring dockerfile: 4.15kB                                                                        0.0s
 => [internal] load .dockerignore                                                                             0.0s
 => => transferring context: 2B                                                                               0.0s
 => [internal] load metadata for docker.io/library/ubuntu:20.04                                               1.8s
 => [auth] library/ubuntu:pull token for registry-1.docker.io                                                 0.0s
 => [1/4] FROM docker.io/library/ubuntu:20.04@sha256:c95a8e48bf88e9849f3e0f723d9f49fa12c5a00cfc6e60d2bc99d87  0.1s
 => => resolve docker.io/library/ubuntu:20.04@sha256:c95a8e48bf88e9849f3e0f723d9f49fa12c5a00cfc6e60d2bc99d87  0.0s
 => => sha256:c95a8e48bf88e9849f3e0f723d9f49fa12c5a00cfc6e60d2bc99d87555295e4c 1.20kB / 1.20kB                0.0s
 => => sha256:4e4bc990609ed865e07afc8427c30ffdddca5153fd4e82c20d8f0783a291e241 943B / 943B                    0.0s
 => => sha256:f643c72bc25212974c16f3348b3a898b1ec1eb13ec1539e10a103e6e217eb2f1 3.32kB / 3.32kB                0.0s
 => [2/4] RUN apt-get clean && apt-get update && apt-get install --no-install-recommends -y gnupg ca-certi  937.6s
 => [3/4] WORKDIR /opt/spark-rm/output                                                                        0.0s
 => [4/4] RUN useradd -m -s /bin/bash -p spark-rm -u 501 spark-rm                                             0.3s
 => exporting to image                                                                                       26.6s
 => => exporting layers                                                                                      26.6s
 => => writing image sha256:aa85b74a904e07d59f7ad4443a14212fc4aea4297148bc3424c5cf17f1c2ab5c                  0.0s
 => => naming to docker.io/library/spark-rm                                                                   0.0s
$ echo $status
0
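
The `$status` variable in the check above is the one used by shells such as fish; in bash or POSIX sh the equivalent is `$?`. A minimal bash sketch of the same verification follows. `run_and_report` is a hypothetical helper, not something from the Spark repository.

```shell
#!/usr/bin/env bash

# Hypothetical helper: run the given command, print its exit status,
# and return that status to the caller.
run_and_report() {
  local rc=0
  "$@" || rc=$?
  echo "exit status: $rc"
  return "$rc"
}

# Example call, mirroring the build command above:
#   run_and_report docker build -t spark-rm --build-arg UID=$UID .
```

Capturing the status with `|| rc=$?` keeps the helper usable in scripts that run with `set -e`.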

@HyukjinKwon
Member Author

Thank you @dongjoon-hyun for double checking!

@dongjoon-hyun
Member

Feel free to merge, @HyukjinKwon . Thank you always for this preparation!

@HyukjinKwon
Member Author

Merged to master and branch-3.1.

HyukjinKwon added a commit that referenced this pull request Dec 30, 2020
Closes #30971 from HyukjinKwon/SPARK-33927.

Lead-authored-by: Hyukjin Kwon <gurwls223@apache.org>
Co-authored-by: HyukjinKwon <gurwls223@apache.org>
Signed-off-by: HyukjinKwon <gurwls223@apache.org>
(cherry picked from commit 403bf55)
Signed-off-by: HyukjinKwon <gurwls223@apache.org>
@SparkQA

SparkQA commented Dec 30, 2020

Kubernetes integration test starting
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/38110/

@SparkQA

SparkQA commented Dec 30, 2020

Kubernetes integration test status success
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/38110/

@SparkQA

SparkQA commented Dec 30, 2020

Test build #133522 has finished for PR 30971 at commit 7bce81e.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@sarutak
Member

sarutak commented Dec 30, 2020

Sorry for the late reply. It LGTM!

okumin pushed a commit to zookage/spark that referenced this pull request Jan 3, 2021
Closes apache#30971 from HyukjinKwon/SPARK-33927.

Lead-authored-by: Hyukjin Kwon <gurwls223@apache.org>
Co-authored-by: HyukjinKwon <gurwls223@apache.org>
Signed-off-by: HyukjinKwon <gurwls223@apache.org>
@HyukjinKwon HyukjinKwon deleted the SPARK-33927 branch January 4, 2022 00:55