Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[SPARK-35463][BUILD] Skip checking checksum on a system without shasum #32613

Closed
wants to merge 3 commits into from
Closed

[SPARK-35463][BUILD] Skip checking checksum on a system without shasum #32613

wants to merge 3 commits into from

Conversation

dongjoon-hyun
Copy link
Member

@dongjoon-hyun dongjoon-hyun commented May 20, 2021

What changes were proposed in this pull request?

Not every build system has shasum. This PR aims to skip checksum checks on a system without shasum.

Why are the changes needed?

PREPARE

$ docker run -it --rm -v $PWD:/spark openjdk:11-slim /bin/bash
root@a0e001a6e50f:/# cd /spark/
root@a0e001a6e50f:/spark# apt-get update
root@a0e001a6e50f:/spark# apt-get install curl

BEFORE (Failure due to command not found)

root@a0e001a6e50f:/spark# build/mvn clean
exec: curl --silent --show-error -L https://downloads.lightbend.com/scala/2.12.10/scala-2.12.10.tgz
exec: curl --silent --show-error -L https://www.apache.org/dyn/closer.lua/maven/maven-3/3.6.3/binaries/apache-maven-3.6.3-bin.tar.gz?action=download
exec: curl --silent --show-error -L https://archive.apache.org/dist/maven/maven-3/3.6.3/binaries/apache-maven-3.6.3-bin.tar.gz.sha512
Veryfing checksum from /spark/build/apache-maven-3.6.3-bin.tar.gz.sha512
build/mvn: line 81: shasum: command not found
Bad checksum from https://archive.apache.org/dist/maven/maven-3/3.6.3/binaries/apache-maven-3.6.3-bin.tar.gz.sha512

AFTER

root@a0e001a6e50f:/spark# build/mvn clean
exec: curl --silent --show-error -L https://downloads.lightbend.com/scala/2.12.10/scala-2.12.10.tgz
Skipping checksum because shasum is not installed.
exec: curl --silent --show-error -L https://www.apache.org/dyn/closer.lua/maven/maven-3/3.6.3/binaries/apache-maven-3.6.3-bin.tar.gz?action=download
exec: curl --silent --show-error -L https://archive.apache.org/dist/maven/maven-3/3.6.3/binaries/apache-maven-3.6.3-bin.tar.gz.sha512
Skipping checksum because shasum is not installed.
Using `mvn` from path: /spark/build/apache-maven-3.6.3/bin/mvn

Does this PR introduce any user-facing change?

Yes, this will recover the build.

How was this patch tested?

Manually with the above process.

@dongjoon-hyun dongjoon-hyun changed the title [SPARK-35463][BUILD] Skip checking checksum on a system doesn't have shasum [SPARK-35463][BUILD] Skip checking checksum on a system without shasum May 20, 2021
@dongjoon-hyun dongjoon-hyun requested a review from srowen May 20, 2021 20:58
@dongjoon-hyun
Copy link
Member Author

I marked this as a release blocker for Apache Spark 3.1.2 because it breaks our downstream build system, @srowen , @HyukjinKwon , @viirya .

@srowen
Copy link
Member

srowen commented May 20, 2021

@Yikun raised a counter-point - do we really want to ignore this vs asking build envs to just have this package? I could go either way on it.

@dongjoon-hyun
Copy link
Member Author

dongjoon-hyun commented May 20, 2021

Sorry, but which one is it?

From release perspective, this is a regression for downstream and a kind of release blocker for building Apache Spark 3.1.2.

@Yikun raised a counter-point - do we really want to ignore this vs asking build envs to just have this package? I could go either way on it.

If you are mentioning the following, it doesn't fix this regression. Maybe, some other comments?

@dongjoon-hyun
Copy link
Member Author

dongjoon-hyun commented May 20, 2021

Technically, I prefer not to add another system requirement for Apache Spark build.

  • If we need to enforce checksum, we should add it as a migration guide at least.
  • I don't think it's worth of strong enforcement. For those who wants the enforcement, it's easy for them to install shasum.

@srowen
Copy link
Member

srowen commented May 20, 2021

Right, #32604 is not directly related to this question. If verifying the integrity of Maven is important, and it is, do we want to let people skip it, perhaps not realizing it's skipped?

Yes it sounds like it's going to cause a problem on some out-of-the-box distributions. On the one hand, it's reasonable to say that build envs need to include necessary tools and now shasum is needed. But I am OK with just printing a warning rather than blocking the build entirely.

@dongjoon-hyun
Copy link
Member Author

Got it. I'll add a warning there.

But I am OK with just printing a warning rather than blocking the build entirely.

@dongjoon-hyun
Copy link
Member Author

Here is the goal of this PR, @srowen .

  1. A legacy system without shasum will skip the checksum check completely.
  2. So, the new additional warning will be Skipping checksum instead of Bad checksum.
  3. All the existing commits (original your PR and @Yikun 's today's commit) are valid and meaningful as long as shasum exists.

@dongjoon-hyun
Copy link
Member Author

dongjoon-hyun commented May 20, 2021

I updated with an explicit warning Skipping checksum because shasum is not installed.
Could you check once more? PR description is also updated.

@dongjoon-hyun
Copy link
Member Author

Thank you so much! I verified this manually. Merged to master/3.1/3.0.

dongjoon-hyun added a commit that referenced this pull request May 20, 2021
### What changes were proposed in this pull request?

Not every build system has `shasum`. This PR aims to skip checksum checks on a system without `shasum`.

### Why are the changes needed?

**PREPARE**
```
$ docker run -it --rm -v $PWD:/spark openjdk:11-slim /bin/bash
roota0e001a6e50f:/# cd /spark/
roota0e001a6e50f:/spark# apt-get update
roota0e001a6e50f:/spark# apt-get install curl
roota0e001a6e50f:/spark# build/mvn clean
```

**BEFORE (Failure due to `command not found`)**
```
roota0e001a6e50f:/spark# build/mvn clean
exec: curl --silent --show-error -L https://downloads.lightbend.com/scala/2.12.10/scala-2.12.10.tgz
exec: curl --silent --show-error -L https://www.apache.org/dyn/closer.lua/maven/maven-3/3.6.3/binaries/apache-maven-3.6.3-bin.tar.gz?action=download
exec: curl --silent --show-error -L https://archive.apache.org/dist/maven/maven-3/3.6.3/binaries/apache-maven-3.6.3-bin.tar.gz.sha512
Veryfing checksum from /spark/build/apache-maven-3.6.3-bin.tar.gz.sha512
build/mvn: line 81: shasum: command not found
Bad checksum from https://archive.apache.org/dist/maven/maven-3/3.6.3/binaries/apache-maven-3.6.3-bin.tar.gz.sha512
```

**AFTER**
```
roota0e001a6e50f:/spark# build/mvn clean
exec: curl --silent --show-error -L https://downloads.lightbend.com/scala/2.12.10/scala-2.12.10.tgz
Skipping checksum because shasum is not installed.
exec: curl --silent --show-error -L https://www.apache.org/dyn/closer.lua/maven/maven-3/3.6.3/binaries/apache-maven-3.6.3-bin.tar.gz?action=download
exec: curl --silent --show-error -L https://archive.apache.org/dist/maven/maven-3/3.6.3/binaries/apache-maven-3.6.3-bin.tar.gz.sha512
Skipping checksum because shasum is not installed.
Using `mvn` from path: /spark/build/apache-maven-3.6.3/bin/mvn
```

### Does this PR introduce _any_ user-facing change?

Yes, this will recover the build.

### How was this patch tested?

Manually with the above process.

Closes #32613 from dongjoon-hyun/SPARK-35463.

Authored-by: Dongjoon Hyun <dhyun@apple.com>
Signed-off-by: Dongjoon Hyun <dhyun@apple.com>
(cherry picked from commit 8e13b8c)
Signed-off-by: Dongjoon Hyun <dhyun@apple.com>
dongjoon-hyun added a commit that referenced this pull request May 20, 2021
### What changes were proposed in this pull request?

Not every build system has `shasum`. This PR aims to skip checksum checks on a system without `shasum`.

### Why are the changes needed?

**PREPARE**
```
$ docker run -it --rm -v $PWD:/spark openjdk:11-slim /bin/bash
roota0e001a6e50f:/# cd /spark/
roota0e001a6e50f:/spark# apt-get update
roota0e001a6e50f:/spark# apt-get install curl
roota0e001a6e50f:/spark# build/mvn clean
```

**BEFORE (Failure due to `command not found`)**
```
roota0e001a6e50f:/spark# build/mvn clean
exec: curl --silent --show-error -L https://downloads.lightbend.com/scala/2.12.10/scala-2.12.10.tgz
exec: curl --silent --show-error -L https://www.apache.org/dyn/closer.lua/maven/maven-3/3.6.3/binaries/apache-maven-3.6.3-bin.tar.gz?action=download
exec: curl --silent --show-error -L https://archive.apache.org/dist/maven/maven-3/3.6.3/binaries/apache-maven-3.6.3-bin.tar.gz.sha512
Veryfing checksum from /spark/build/apache-maven-3.6.3-bin.tar.gz.sha512
build/mvn: line 81: shasum: command not found
Bad checksum from https://archive.apache.org/dist/maven/maven-3/3.6.3/binaries/apache-maven-3.6.3-bin.tar.gz.sha512
```

**AFTER**
```
roota0e001a6e50f:/spark# build/mvn clean
exec: curl --silent --show-error -L https://downloads.lightbend.com/scala/2.12.10/scala-2.12.10.tgz
Skipping checksum because shasum is not installed.
exec: curl --silent --show-error -L https://www.apache.org/dyn/closer.lua/maven/maven-3/3.6.3/binaries/apache-maven-3.6.3-bin.tar.gz?action=download
exec: curl --silent --show-error -L https://archive.apache.org/dist/maven/maven-3/3.6.3/binaries/apache-maven-3.6.3-bin.tar.gz.sha512
Skipping checksum because shasum is not installed.
Using `mvn` from path: /spark/build/apache-maven-3.6.3/bin/mvn
```

### Does this PR introduce _any_ user-facing change?

Yes, this will recover the build.

### How was this patch tested?

Manually with the above process.

Closes #32613 from dongjoon-hyun/SPARK-35463.

Authored-by: Dongjoon Hyun <dhyun@apple.com>
Signed-off-by: Dongjoon Hyun <dhyun@apple.com>
(cherry picked from commit 8e13b8c)
Signed-off-by: Dongjoon Hyun <dhyun@apple.com>
@SparkQA
Copy link

SparkQA commented May 20, 2021

Kubernetes integration test starting
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/43287/

@SparkQA
Copy link

SparkQA commented May 20, 2021

Kubernetes integration test status success
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/43287/

@github-actions github-actions bot added the BUILD label May 20, 2021
@dongjoon-hyun dongjoon-hyun deleted the SPARK-35463 branch May 20, 2021 22:46
@maropu
Copy link
Member

maropu commented May 20, 2021

late lgtm.

@Yikun
Copy link
Member

Yikun commented May 20, 2021

LGTM

@SparkQA
Copy link

SparkQA commented May 20, 2021

Test build #138765 has finished for PR 32613 at commit e3def4a.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@dongjoon-hyun
Copy link
Member Author

Thank you, @maropu and @Yikun !

flyrain pushed a commit to flyrain/spark that referenced this pull request Sep 21, 2021
### What changes were proposed in this pull request?

Not every build system has `shasum`. This PR aims to skip checksum checks on a system without `shasum`.

### Why are the changes needed?

**PREPARE**
```
$ docker run -it --rm -v $PWD:/spark openjdk:11-slim /bin/bash
roota0e001a6e50f:/# cd /spark/
roota0e001a6e50f:/spark# apt-get update
roota0e001a6e50f:/spark# apt-get install curl
roota0e001a6e50f:/spark# build/mvn clean
```

**BEFORE (Failure due to `command not found`)**
```
roota0e001a6e50f:/spark# build/mvn clean
exec: curl --silent --show-error -L https://downloads.lightbend.com/scala/2.12.10/scala-2.12.10.tgz
exec: curl --silent --show-error -L https://www.apache.org/dyn/closer.lua/maven/maven-3/3.6.3/binaries/apache-maven-3.6.3-bin.tar.gz?action=download
exec: curl --silent --show-error -L https://archive.apache.org/dist/maven/maven-3/3.6.3/binaries/apache-maven-3.6.3-bin.tar.gz.sha512
Veryfing checksum from /spark/build/apache-maven-3.6.3-bin.tar.gz.sha512
build/mvn: line 81: shasum: command not found
Bad checksum from https://archive.apache.org/dist/maven/maven-3/3.6.3/binaries/apache-maven-3.6.3-bin.tar.gz.sha512
```

**AFTER**
```
roota0e001a6e50f:/spark# build/mvn clean
exec: curl --silent --show-error -L https://downloads.lightbend.com/scala/2.12.10/scala-2.12.10.tgz
Skipping checksum because shasum is not installed.
exec: curl --silent --show-error -L https://www.apache.org/dyn/closer.lua/maven/maven-3/3.6.3/binaries/apache-maven-3.6.3-bin.tar.gz?action=download
exec: curl --silent --show-error -L https://archive.apache.org/dist/maven/maven-3/3.6.3/binaries/apache-maven-3.6.3-bin.tar.gz.sha512
Skipping checksum because shasum is not installed.
Using `mvn` from path: /spark/build/apache-maven-3.6.3/bin/mvn
```

### Does this PR introduce _any_ user-facing change?

Yes, this will recover the build.

### How was this patch tested?

Manually with the above process.

Closes apache#32613 from dongjoon-hyun/SPARK-35463.

Authored-by: Dongjoon Hyun <dhyun@apple.com>
Signed-off-by: Dongjoon Hyun <dhyun@apple.com>
(cherry picked from commit 8e13b8c)
Signed-off-by: Dongjoon Hyun <dhyun@apple.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
5 participants