Skip to content

Conversation

panbingkun
Copy link
Contributor

@panbingkun panbingkun commented Dec 6, 2023

What changes were proposed in this pull request?

The pr aims to skip maven daily testing as ivy uses some corrupted cache jar files.
This is a temporary workaround solution.
After SPARK-46400 and applying it to the already released Spark version, we should remove this logic.

Why are the changes needed?

Fix maven daily testing GA.
In our Maven daily testing, some UTs failed due to some corrupt jars in maven repo:
https://github.com/apache/spark/actions/runs/7019155617/job/19095991788#step:9:27263
image
image

Does this PR introduce any user-facing change?

No.

How was this patch tested?

  • Pass GA.
  • Manually check.
image

Was this patch authored or co-authored using generative AI tooling?

No.

@github-actions github-actions bot added the INFRA label Dec 6, 2023
@panbingkun panbingkun changed the title try fix maven test [Don't merge & review] try fix maven test Dec 6, 2023
@github-actions github-actions bot added the SQL label Dec 6, 2023
@github-actions github-actions bot removed the SQL label Dec 7, 2023
@panbingkun panbingkun changed the title [Don't merge & review] try fix maven test [Don't merge & review] [SPARK-46302][BUILD] Fix maven daily testing Dec 7, 2023
@panbingkun
Copy link
Contributor Author

panbingkun commented Dec 7, 2023

image

As seen above, missing hive-exec-2.3.9.jar and slf4j-api-1.7.30.jar.

@panbingkun
Copy link
Contributor Author

How about just add a filter at

val testingVersions: Seq[String] = if (isPythonVersionAvailable &&
SystemUtils.isJavaVersionAtMost(JavaVersion.JAVA_17)) {
import scala.io.Source
try Utils.tryWithResource(
Source.fromURL(s"$releaseMirror/spark")) { source =>
source.mkString
.split("\n")
.filter(_.contains("""<a href="spark-"""))
.filterNot(_.contains("preview"))
.map("""<a href="spark-(\d.\d.\d)/">""".r.findFirstMatchIn(_).get.group(1))
.filter(_ < org.apache.spark.SPARK_VERSION).toImmutableArraySeq

like:

val skipReleaseVersions =
    sys.env.getOrElse("SKIP_SPARK_RELEASE_VERSIONS", "").split(",").toSet
...

source.mkString
        .split("\n")
        .filter(_.contains("""<a href="spark-"""))
        .filterNot(_.contains("preview"))
        .map("""<a href="spark-(\d.\d.\d)/">""".r.findFirstMatchIn(_).get.group(1))
        .filter(_ < org.apache.spark.SPARK_VERSION)
        .filterNot(skipReleaseVersions.contains).toImmutableArraySeq

Done.

required: false
type: string
default: '{}'
default: '{"SKIP_SPARK_RELEASE_VERSIONS": "3.3.3,3.4.2,3.5.0,master"}'
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No need for master

Copy link
Contributor

@LuciferYang LuciferYang Dec 19, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

and default should be '{}' or '', we should pass it from build_maven.yml

HIVE_PROFILE: ${{ matrix.hive }}
SPARK_LOCAL_IP: localhost
GITHUB_PREV_SHA: ${{ github.event.before }}
SKIP_SPARK_RELEASE_VERSIONS: ${{ fromJSON(inputs.envs).SKIP_SPARK_RELEASE_VERSIONS }}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This option should not be needed, the Run tests section has already been deserialized for env

- name: Run tests
env: ${{ fromJSON(inputs.envs) }}

with:
envs: >-
{
"SKIP_SPARK_RELEASE_VERSIONS": "3.3.3,3.4.2,3.5.0"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
"SKIP_SPARK_RELEASE_VERSIONS": "3.3.3,3.4.2,3.5.0"
"SKIP_SPARK_RELEASE_VERSIONS": "3.3.4,3.4.2,3.5.0"

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

@LuciferYang
Copy link
Contributor

I think we should make #44343 as TODO of this one, and explicitly add TODO in the code. On the other hand, please update the PR title and PR description to better match the change of this PR.

@panbingkun panbingkun changed the title [SPARK-46302][BUILD] Fix maven daily testing [SPARK-46302][BUILD] Make maven's daily testing skip some released spark versions, where ivy uses corrupted cache jars Dec 19, 2023
@panbingkun panbingkun changed the title [SPARK-46302][BUILD] Make maven's daily testing skip some released spark versions, where ivy uses corrupted cache jars [SPARK-46302][TEST] Make maven's daily testing skip some released spark versions, where ivy uses corrupted cache jars Dec 19, 2023
@panbingkun panbingkun changed the title [SPARK-46302][TEST] Make maven's daily testing skip some released spark versions, where ivy uses corrupted cache jars [SPARK-46302][TESTS] Make maven's daily testing skip some released spark versions, where ivy uses corrupted cache jars Dec 19, 2023
@panbingkun panbingkun changed the title [SPARK-46302][TESTS] Make maven's daily testing skip some released spark versions, where ivy uses corrupted cache jars [SPARK-46302][TESTS] Skip maven daily testing as ivy uses some corrupted cache jar files Dec 19, 2023
@panbingkun
Copy link
Contributor Author

I think we should make #44343 as TODO of this one, and explicitly add TODO in the code. On the other hand, please update the PR title and PR description to better match the change of this PR.

Done.

npm install --save-dev
node --experimental-vm-modules node_modules/.bin/jest
maven-test:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fine to me, let's revert this change after test pass

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I will revert this change, because the maven test has passed and the verification is successful.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

}

object PROCESS_TABLES extends QueryTest with SQLTestUtils {
// TODO After SPARK-46400 and applying it to the already released Spark version,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hmm... I think maybe we can describe it like

In SPARK-46302, the env SKIP_SPARK_RELEASE_VERSIONS has been added to allow Maven tests to skip problematic release versions. Related issues will be fixed in SPARK-46400, and testing will be resumed after the fixed Spark 3.x version is released.

and I think just add comments here, no need to add them in line 247

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay, let's run this test first.

in addition, before we release 3. x, should we test test("backward compatibility") after the SPARK-46400 feature is fixed on the branch master?

Copy link
Contributor

@LuciferYang LuciferYang Dec 19, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think it has ever tested master...
I don't think it's necessary at the moment.

Copy link
Contributor

@LuciferYang LuciferYang Dec 19, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Before the fixed version released, Maven's daily tests can skip this case first.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay.

Copy link
Contributor

@LuciferYang LuciferYang left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1, LGTM

@LuciferYang
Copy link
Contributor

Merged into master. Thanks @panbingkun

Let's monitor next maven daily test ~

srowen pushed a commit that referenced this pull request Jan 31, 2024
…maven repo, skip this cache and try again

### What changes were proposed in this pull request?
The pr aims to
- fix potential bug(ie: #44208) and enhance user experience.
- make the code more compliant with standards

### Why are the changes needed?
We use the local maven repo as the first-level cache in ivy.  The original intention was to reduce the time required to parse and obtain the ar, but when there are corrupted files in the local maven repo,The above mechanism will be directly interrupted and the prompt is very unfriendly, which will greatly confuse the user.  Based on the original intention, we should skip the cache directly in similar situations.

### Does this PR introduce _any_ user-facing change?
No.

### How was this patch tested?
Manually test.

### Was this patch authored or co-authored using generative AI tooling?
No.

Closes #44343 from panbingkun/SPARK-46400.

Authored-by: panbingkun <pbk1982@gmail.com>
Signed-off-by: Sean Owen <srowen@gmail.com>
LuciferYang pushed a commit that referenced this pull request Feb 8, 2024
…ocal maven repo, skip this cache and try again

### What changes were proposed in this pull request?
The pr aims to
- fix potential bug(ie: #44208) and enhance user experience.
- make the code more compliant with standards

Backport above to branch 3.5.
Master branch pr: #44343

### Why are the changes needed?
We use the local maven repo as the first-level cache in ivy.  The original intention was to reduce the time required to parse and obtain the ar, but when there are corrupted files in the local maven repo,The above mechanism will be directly interrupted and the prompt is very unfriendly, which will greatly confuse the user.  Based on the original intention, we should skip the cache directly in similar situations.

### Does this PR introduce _any_ user-facing change?
No.

### How was this patch tested?
Manually test.

### Was this patch authored or co-authored using generative AI tooling?
No.

Closes #45017 from panbingkun/branch-3.5_SPARK-46400.

Authored-by: panbingkun <panbingkun@baidu.com>
Signed-off-by: yangjie01 <yangjie01@baidu.com>
dongjoon-hyun pushed a commit that referenced this pull request Feb 15, 2024
…ocal maven repo, skip this cache and try again

### What changes were proposed in this pull request?
The pr aims to
- fix potential bug(ie: #44208) and enhance user experience.
- make the code more compliant with standards

Backport above to branch 3.4.
Master branch pr: #44343

### Why are the changes needed?
We use the local maven repo as the first-level cache in ivy.  The original intention was to reduce the time required to parse and obtain the ar, but when there are corrupted files in the local maven repo,The above mechanism will be directly interrupted and the prompt is very unfriendly, which will greatly confuse the user.  Based on the original intention, we should skip the cache directly in similar situations.

### Does this PR introduce _any_ user-facing change?
No.

### How was this patch tested?
Manually test.

### Was this patch authored or co-authored using generative AI tooling?
No.

Closes #45018 from panbingkun/branch-3.4_SPARK-46400.

Authored-by: panbingkun <panbingkun@baidu.com>
Signed-off-by: Dongjoon Hyun <dhyun@apple.com>
snmvaughan pushed a commit to snmvaughan/spark that referenced this pull request Mar 26, 2024
…ocal maven repo, skip this cache and try again

### What changes were proposed in this pull request?
The pr aims to
- fix potential bug(ie: apache#44208) and enhance user experience.
- make the code more compliant with standards

Backport above to branch 3.4.
Master branch pr: apache#44343

### Why are the changes needed?
We use the local maven repo as the first-level cache in ivy.  The original intention was to reduce the time required to parse and obtain the ar, but when there are corrupted files in the local maven repo,The above mechanism will be directly interrupted and the prompt is very unfriendly, which will greatly confuse the user.  Based on the original intention, we should skip the cache directly in similar situations.

### Does this PR introduce _any_ user-facing change?
No.

### How was this patch tested?
Manually test.

### Was this patch authored or co-authored using generative AI tooling?
No.

Closes apache#45018 from panbingkun/branch-3.4_SPARK-46400.

Authored-by: panbingkun <panbingkun@baidu.com>
Signed-off-by: Dongjoon Hyun <dhyun@apple.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants