Describe the proposed change
The Iceberg CI workflow (.github/workflows/iceberg_spark_test.yml) currently runs a fully-crossed matrix:
- 3 job types (iceberg-spark, iceberg-spark-extensions, iceberg-spark-runtime)
- 3 Iceberg versions (1.8.1, 1.9.1, 1.10.0)
- 2 Spark versions (3.4.3, 3.5.8)
- 2 JDK versions (11, 17)
- 1 Scala version (2.13)
That's 36 jobs per PR. Adding Spark 4.0.1 / 4.1.1 to the same pattern would push it higher.
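For reference, the fully-crossed strategy has roughly this shape (a sketch reconstructed from the dimensions listed above; the actual workflow's key names and value formats may differ):

```yaml
# Assumed shape of the current matrix, reconstructed from the dimensions above.
strategy:
  matrix:
    iceberg-version: ['1.8.1', '1.9.1', '1.10.0']
    spark-version: ['3.4.3', '3.5.8']
    java-version: [11, 17]
    scala-version: ['2.13']
# 3 jobs x (3 x 2 x 2 x 1) = 36 matrix jobs per PR
```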
The JDK dimension is the easiest to trim. In practice, each Spark version has one preferred JDK:
- Spark 3.4 → JDK 11
- Spark 3.5 → JDK 17 (and Spark 4.x will require 17+ anyway)
Pinning the JDK per Spark version (using a matrix `include:` rather than a full cross-product) halves the matrix: 3 jobs × 3 Iceberg × 2 Spark × 2 JDK = 36 → 3 × 3 × 2 = 18, removing 18 redundant combinations.
Rationale
- Iceberg's own CI does not exhaustively cross JDK with every Spark version; we are overcovering relative to upstream.
- JDK 11 vs 17 differences that affect Comet are caught elsewhere:
pr_build_linux.yml and spark_sql_test.yml both run JDK 11 + 17 against multiple Spark versions.
- The Iceberg suites test Iceberg/Comet integration, not JVM-level behavior; a JDK-specific difference is unlikely to surface only there and not in the broader matrix.
- Frees CI capacity to add Spark 4.0.1 and 4.1.1 to the matrix without a net increase in PR runtime.
Proposed change
Replace the fully-crossed matrix in each of the three Iceberg jobs with an `include:` list that pins the JDK per Spark version, e.g.:

```yaml
strategy:
  matrix:
    iceberg-version: [{short: '1.8', full: '1.8.1'}, {short: '1.9', full: '1.9.1'}, {short: '1.10', full: '1.10.0'}]
    spark-version: [{short: '3.4', full: '3.4.3'}, {short: '3.5', full: '3.5.8'}]
    include:
      - spark-version: {short: '3.4', full: '3.4.3'}
        java-version: 11
      - spark-version: {short: '3.5', full: '3.5.8'}
        java-version: 17
```

Note that `spark-version` must remain a matrix dimension: the `include:` entries only attach a `java-version` to each existing Spark combination, so the JDK is no longer crossed independently. (If `spark-version` appeared only inside `include:`, the entries would not fan out per Spark version.)
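Each job then consumes the pinned value exactly as it does today; e.g., assuming the jobs set up the JDK via actions/setup-java (the distribution below is an assumption, not confirmed from the workflow):

```yaml
# Sketch: the pinned JDK flows into the existing setup step unchanged.
steps:
  - uses: actions/setup-java@v4
    with:
      distribution: 'temurin'              # assumption: distribution used by the workflow
      java-version: ${{ matrix.java-version }}
```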
Net effect: 36 → 18 PR jobs.
Additional context
Part of a broader CI cleanup ahead of adding Spark 4.0.1 and 4.1.1 to the test matrix. Other potential follow-ups (separate issues): drop Iceberg 1.9 (boundary coverage only), reduce macOS matrix, tier nightly vs PR-blocking tests.