[SPARK-57069][INFRA] Share SBT precompile artifact with docker/k8s integration test CI jobs #56110

Open: zhengruifeng wants to merge 1 commit into apache:master from zhengruifeng:share-precompile-integration-tests-dev5


@zhengruifeng (Contributor)

What changes were proposed in this pull request?

This PR extends the SBT precompile-sharing pattern (parent: SPARK-56830; prior sub-tasks: SPARK-56768 pyspark, SPARK-56831 sparkr, SPARK-56943 JVM build) to the two remaining SBT-compiling jobs in .github/workflows/build_and_test.yml that still run their own full Spark compile:

  • docker-integration-tests
  • k8s-integration-tests

Concretely:

  • The existing precompile job's if: gate is extended to also fire when docker-integration-tests == 'true' or k8s-integration-tests == 'true' in the precondition output, so the artifact is available whenever either job needs it.
  • docker-integration-tests:
    • needs: precondition -> needs: [precondition, precompile]
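    • if: extended with a leading (!cancelled()) && so the job still runs when precompile fails or is skipped; without an explicit status check, the implicit success() condition would skip it.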
    • Adds "Download precompiled artifact" + "Extract precompiled artifact" steps between Java setup and Run tests, with graceful fallback (continue-on-error: true).
    • Run tests exports SKIP_SCALA_BUILD=true when extraction succeeded; dev/run-tests.py already honors this flag and skips build_apache_spark + build_spark_assembly_sbt.
  • k8s-integration-tests:
    • Same needs: and if: change.
    • Adds the same Download/Extract steps after Java setup.
    • The actual test runs via a direct build/sbt ... "kubernetes-integration-tests/test" call rather than dev/run-tests.py, so no SKIP_SCALA_BUILD is set. SBT sees the extracted target/ and skips compilation of the already-built modules (Spark Core, SQL, etc.); only the kubernetes-integration-tests test module itself compiles incrementally.
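As a rough illustration of the extended precompile gate described above, the condition can be modeled as a small predicate. This is a sketch only: it assumes the precondition output is a JSON object of string flags, and the key names for the prior consumers ("build", "pyspark", "sparkr") are assumptions rather than values copied from the workflow.

```python
import json

# Jobs that consume the precompiled artifact. The last two are added by this
# PR; the spellings of the first three keys are assumptions for this sketch.
PRECOMPILE_CONSUMERS = (
    "build",
    "pyspark",
    "sparkr",
    "docker-integration-tests",
    "k8s-integration-tests",
)


def precompile_should_run(required_json: str) -> bool:
    """Return True if any artifact consumer is flagged 'true' in the
    (assumed JSON) precondition output."""
    required = json.loads(required_json)
    return any(required.get(job) == "true" for job in PRECOMPILE_CONSUMERS)
```

With this model, a run that only needs k8s-integration-tests still triggers precompile, while a run that needs none of the consumers does not.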

Optional: graceful fallback if precompile fails

Same pattern as the prior sub-tasks:

  • precompile keeps continue-on-error: true.
  • Both consumers' "Download precompiled artifact" step is gated on needs.precompile.result == 'success' and has continue-on-error: true.
  • "Extract precompiled artifact" is gated on the download succeeding and has continue-on-error: true.
  • For docker, SKIP_SCALA_BUILD=true is exported only when steps.extract-precompiled.outcome == 'success'; otherwise dev/run-tests.py runs the original local SBT build.
  • For k8s, if extraction fails, SBT compiles from scratch as before.

In the worst case, both jobs degrade to the pre-PR behavior rather than failing the workflow.
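The fallback chain for the docker job can be summarized as a small decision function. This is a minimal sketch of the gating described in the bullets above, not the workflow itself: the function name and its boolean parameters are hypothetical, and only SKIP_SCALA_BUILD and the 'success' result value come from the PR description.

```python
def run_tests_env(precompile_result: str, download_ok: bool, extract_ok: bool) -> dict:
    """Environment the docker "Run tests" step would receive in this model.

    precompile_result models needs.precompile.result; download_ok and
    extract_ok are hypothetical flags for whether each step would succeed
    if it ran.
    """
    # The download step only runs when precompile succeeded.
    downloaded = precompile_result == "success" and download_ok
    # The extract step only runs when the download succeeded.
    extracted = downloaded and extract_ok
    # SKIP_SCALA_BUILD is exported only on a successful extraction;
    # otherwise the original local SBT build runs.
    return {"SKIP_SCALA_BUILD": "true"} if extracted else {}
```

Any failure along the chain yields an empty environment, which corresponds to the slow path through the original local SBT build.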

Profile coverage

The precompile job runs:

./build/sbt -Phadoop-3 -Pyarn -Pspark-ganglia-lgpl -Phadoop-cloud -Phive \
  -Pkubernetes -Pjvm-profiler -Pkinesis-asl -Phive-thriftserver \
  -Pdocker-integration-tests -Pvolcano \
  Test/package streaming-kinesis-asl-assembly/assembly connect/assembly assembly/package
  • docker-integration-tests: profile is in the precompile invocation; the module's target/ is pre-built, so dev/run-tests --modules docker-integration-tests only runs the test phase.
  • k8s-integration-tests: -Pkubernetes is in the precompile so the parent module is pre-built. The job itself adds -Pkubernetes-integration-tests to enable the integration test submodule, which SBT compiles incrementally on top of the reused target/. Net work in this job drops from "compile all of Spark + integration tests" to "compile only the integration-tests module".
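The coverage argument above amounts to a set difference between the profiles a job enables and the profiles in the precompile invocation. The sketch below parses the -P flags from the precompile command quoted above; the k8s command string in the usage note is illustrative only and is not the job's real invocation.

```python
import re

# Precompile invocation as quoted in this PR description.
PRECOMPILE_CMD = (
    "./build/sbt -Phadoop-3 -Pyarn -Pspark-ganglia-lgpl -Phadoop-cloud -Phive "
    "-Pkubernetes -Pjvm-profiler -Pkinesis-asl -Phive-thriftserver "
    "-Pdocker-integration-tests -Pvolcano "
    "Test/package streaming-kinesis-asl-assembly/assembly connect/assembly assembly/package"
)


def profiles(cmd: str) -> set:
    """Collect the names of all -P<profile> flags in a command line."""
    return set(re.findall(r"(?<!\S)-P([\w-]+)", cmd))


def uncovered_profiles(job_cmd: str) -> list:
    """Profiles a job enables that the precompile invocation does not."""
    return sorted(profiles(job_cmd) - profiles(PRECOMPILE_CMD))
```

For an illustrative job command such as `build/sbt -Pkubernetes -Pkubernetes-integration-tests "kubernetes-integration-tests/test"`, only kubernetes-integration-tests is reported as uncovered, which matches the module SBT still has to compile incrementally.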

Why are the changes needed?

Today every scheduled / dispatched run of build_and_test.yml that requires docker-integration-tests or k8s-integration-tests re-runs the same SBT compile that precompile already produced for pyspark / sparkr / build. Wiring these two consumers to the existing artifact removes that duplicate work for free (precompile is already running).

Does this PR introduce any user-facing change?

No. CI infrastructure change only.

How was this patch tested?

The change is exercised by the CI run of this PR itself. The Download/Extract steps log the artifact size; the Run tests step prints "Reusing precompiled artifact, skipping local SBT build." for the docker job when the fast path is taken. If the precompile job is forced to fail (or its artifact is missing), both consumers fall back to the original local SBT build.

Was this patch authored or co-authored using generative AI tooling?

Generated-by: Claude Code (Opus 4.7)

@zhengruifeng changed the title from "[INFRA] Share SBT precompile artifact with docker/k8s integration test CI jobs" to "[SPARK-57069][INFRA] Share SBT precompile artifact with docker/k8s integration test CI jobs" on May 26, 2026
@zhengruifeng marked this pull request as ready for review on May 26, 2026, 10:24
@zhengruifeng (Contributor, Author)

CI performance: before vs after

Comparing per-job wall time on real CI runs:

| Job | Before avg (n=2) | After (n=1) | Savings |
|---|---|---|---|
| Precompile Spark | 16m34s | 16m13s | -- (same) |
| Run Docker integration tests | 90m48s | 74m12s | ~16m36s (~18%) |
| Run Spark on Kubernetes Integration test | 66m56s | 65m48s | ~1m08s (~2%) |

Samples:

Reading the result

  • Docker is a clean win -- ~17m saved per run, ~18% of job wall time, same payoff shape as the pyspark sharing in SPARK-56768. Docker tests are compile-heavy relative to their other work.
  • K8s barely moves (~1m). The savings on the Spark-side SBT compile are real, but they're absorbed by the parts of the K8s job that don't change: Minikube startup, Spark Docker image build, the kubernetes-integration-tests module's own compile (which isn't in the precompile because it needs -Pkubernetes-integration-tests), and the actual K8s integration test execution. Wall time is dominated by these.
  • I'd still keep the K8s wiring -- the change is small, the fallback is silent if precompile fails, and even small per-run savings add up. A possible follow-up that could shave 5-10m off the K8s job is to fold -Pkubernetes-integration-tests (and -Psparkr) into the precompile invocation so SBT doesn't recompile those modules at test time. Happy to do that in a separate PR if reviewers want.
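For anyone re-deriving the savings column, the arithmetic on the wall times reported above can be checked with a few lines. The helper names are hypothetical; the durations are the ones from the comparison table.

```python
import re


def to_seconds(duration: str) -> int:
    """Convert a duration written like '90m48s' to seconds."""
    match = re.fullmatch(r"(\d+)m(\d+)s", duration)
    if match is None:
        raise ValueError(f"unexpected duration format: {duration!r}")
    return int(match.group(1)) * 60 + int(match.group(2))


def savings(before: str, after: str) -> tuple:
    """Return (seconds saved, percent of the 'before' wall time, rounded)."""
    b, a = to_seconds(before), to_seconds(after)
    saved = b - a
    return saved, round(100 * saved / b)


docker = savings("90m48s", "74m12s")  # (996, 18): 16m36s, about 18%
k8s = savings("66m56s", "65m48s")     # (68, 2): 1m08s, about 2%
```

Note that the "after" column is a single sample, so these percentages are indicative rather than a stable average.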
