[SPARK-48137][INFRA] Run yarn test only in PR builders and Daily CIs
### What changes were proposed in this pull request?

We have been providing a dedicated test environment for the `yarn` and `connect` modules because they are flaky.
- #45107

However, they are still flaky. So, this PR aims to run `yarn` tests only in PR builders (if needed) and Daily CIs (always).
- Reduce the irrelevant re-tries by triggering `YARN CI` only when we need to test `YARN` module.
- Protect YARN CI from `connect` flakiness by providing an independent GitHub Action environment in PR Builders and Daily CIs.
- Lastly, the commit builder will offload YARN module tests to the Daily CIs.
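
The decision described above can be sketched in shell, mirroring the `precondition` step changed in `build_and_test.yml`. This is a minimal sketch, not the workflow's actual code: `decide_yarn` and its two arguments are hypothetical names, and it assumes the change detection reports `true` or `false` for the `yarn` module. Daily CIs are not modeled here because they set `"yarn": "true"` explicitly.

```shell
# Hypothetical helper; the name and arguments are illustrative, not from the workflow.
# $1: repository name ("apache/spark" or any other repository, such as a fork)
# $2: assumed "true"/"false" result of the change detection for the yarn module
decide_yarn() {
  repo="$1"
  yarn_changed="$2"
  if [ "$repo" != "apache/spark" ]; then
    # Not apache/spark (PR builder case): test YARN only when the module is affected.
    echo "$yarn_changed"
  else
    # Commit builder in apache/spark: leave YARN tests to the Daily CIs.
    echo "false"
  fi
}

decide_yarn "someone/spark" "true"   # prints: true
decide_yarn "apache/spark" "true"    # prints: false
```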

### Why are the changes needed?

- PR builders provide extensive test coverage, including YARN testing.
- Daily CIs with YARN tests
   - NON-ANSI CI: https://github.com/apache/spark/actions/workflows/build_non_ansi.yml (1AM)
   - Java 21 SBT CI: https://github.com/apache/spark/actions/workflows/build_java21.yml (4AM)
   - RocksDB UI CI: https://github.com/apache/spark/actions/workflows/build_rockdb_as_ui_backend.yml (6AM)
   - Maven Java 17 CI: https://github.com/apache/spark/actions/workflows/build_maven.yml (1PM)
   - Maven Java 21 CI: https://github.com/apache/spark/actions/workflows/build_maven_java21.yml (2PM)
   - Maven Java 21 on AppleSilicon CI: https://github.com/apache/spark/actions/workflows/build_maven_java21_macos14.yml (8PM every two days)

- YARN CI has been flaky in the GitHub Actions environment and very frequently requires irrelevant re-tries.
    - https://github.com/apache/spark/actions/runs/8962451417/job/24611353908 (2024-05-05)
    - https://github.com/apache/spark/actions/runs/8962440192/job/24611326971 (2024-05-05)

```
[info] *** 6 TESTS FAILED ***
[error] Failed tests:
[error] 	org.apache.spark.deploy.yarn.YarnClusterSuite
[error] (yarn / Test / test) sbt.TestsFailedException: Tests unsuccessful
```

  <img width="544" alt="Screenshot 2024-05-05 at 20 12 28" src="https://github.com/apache/spark/assets/9700541/cbf9fb03-fc4c-4513-b5e5-158c3c9a085a">

### Does this PR introduce _any_ user-facing change?

No.

### How was this patch tested?

Manual review.

### Was this patch authored or co-authored using generative AI tooling?

No.

Closes #46395 from dongjoon-hyun/SPARK-48137.

Authored-by: Dongjoon Hyun <dhyun@apple.com>
Signed-off-by: Dongjoon Hyun <dhyun@apple.com>
dongjoon-hyun committed May 6, 2024
1 parent 8294c59 commit 7c728b2
Showing 4 changed files with 15 additions and 4 deletions.
12 changes: 10 additions & 2 deletions .github/workflows/build_and_test.yml
@@ -80,12 +80,14 @@ jobs:
pyspark=`./dev/is-changed.py -m $pyspark_modules`
if [[ "${{ github.repository }}" != 'apache/spark' ]]; then
pandas=$pyspark
yarn=`./dev/is-changed.py -m yarn`
kubernetes=`./dev/is-changed.py -m kubernetes`
sparkr=`./dev/is-changed.py -m sparkr`
buf=true
ui=true
else
pandas=false
yarn=false
kubernetes=false
sparkr=false
buf=false
@@ -102,6 +104,7 @@ jobs:
\"tpcds-1g\": \"false\",
\"docker-integration-tests\": \"false\",
\"lint\" : \"true\",
\"yarn\" : \"$yarn\",
\"k8s-integration-tests\" : \"$kubernetes\",
\"buf\" : \"$buf\",
\"ui\" : \"$ui\",
@@ -155,8 +158,8 @@ jobs:
- >-
streaming, sql-kafka-0-10, streaming-kafka-0-10, streaming-kinesis-asl,
kubernetes, hadoop-cloud, spark-ganglia-lgpl, protobuf
- >-
yarn, connect
- yarn
- connect
# Here, we split Hive and SQL tests into some of slow ones and the rest of them.
included-tags: [""]
excluded-tags: [""]
@@ -194,6 +197,11 @@ jobs:
hive: hive2.3
excluded-tags: org.apache.spark.tags.ExtendedSQLTest,org.apache.spark.tags.SlowSQLTest
comment: "- other tests"
exclude:
# Always run if yarn == 'true', even infra-image is skip (such as non-master job)
# In practice, the build will run in individual PR, but not against the individual commit
# in Apache Spark repository.
- modules: ${{ fromJson(needs.precondition.outputs.required).yarn != 'true' && 'yarn' }}
env:
MODULES_TO_TEST: ${{ matrix.modules }}
EXCLUDED_TAGS: ${{ matrix.excluded-tags }}
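
The `exclude` entry in this hunk relies on the `&&` operator in GitHub Actions expressions returning one of its operands instead of a plain boolean. The shell sketch below models that behavior under this assumption. `exclude_value` and `should_run` are hypothetical helper names, not part of the workflow.

```shell
# Hypothetical model of: fromJson(...).yarn != 'true' && 'yarn'
# $1: the value of the "yarn" flag from the precondition output
exclude_value() {
  yarn_flag="$1"
  if [ "$yarn_flag" != "true" ]; then
    # Left side is true, so && yields its right operand: exclude the yarn entry.
    echo "yarn"
  else
    # Left side is false, so && yields false; no matrix entry has modules == false.
    echo "false"
  fi
}

# $1: matrix module name, $2: the "yarn" flag
should_run() {
  module="$1"
  yarn_flag="$2"
  if [ "$module" = "$(exclude_value "$yarn_flag")" ]; then
    echo "skip"
  else
    echo "run"
  fi
}

should_run "yarn" "true"      # prints: run
should_run "yarn" "false"     # prints: skip
should_run "connect" "false"  # prints: run
```

With this reading, only the `yarn` matrix entry is ever excluded, and only when the flag is not `'true'`. The other module groups are unaffected.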
1 change: 1 addition & 0 deletions .github/workflows/build_java21.yml
@@ -47,6 +47,7 @@ jobs:
"sparkr": "true",
"tpcds-1g": "true",
"docker-integration-tests": "true",
"yarn": "true",
"k8s-integration-tests": "true",
"buf": "true",
"ui": "true"
3 changes: 2 additions & 1 deletion .github/workflows/build_non_ansi.yml
@@ -44,5 +44,6 @@ jobs:
"pyspark": "true",
"sparkr": "true",
"tpcds-1g": "true",
"docker-integration-tests": "true"
"docker-integration-tests": "true",
"yarn": "true"
}
3 changes: 2 additions & 1 deletion .github/workflows/build_rockdb_as_ui_backend.yml
@@ -42,5 +42,6 @@ jobs:
{
"build": "true",
"pyspark": "true",
"sparkr": "true"
"sparkr": "true",
"yarn": "true"
}
