[GLUTEN-11559][Build] Improve incremental build time for test-compile phase #11560

Merged
baibaichen merged 7 commits into apache:main from baibaichen:fix/improve-incremental-build
Feb 10, 2026

@baibaichen
Contributor

@baibaichen baibaichen commented Feb 4, 2026

What changes are proposed in this pull request?

This PR dramatically improves incremental mvn test-compile time through a series of build system optimizations, and introduces dev/run-scala-test.sh — a CLI tool that replicates IntelliJ IDEA's ScalaTest execution, enabling AI agents (Claude, GitHub Copilot, etc.) to run individual ScalaTest methods for automated bug fixing.

fix #11559

Motivation

Gluten's incremental mvn test-compile takes ~2.5 minutes even for single-file changes (32-core machine), making the AI-driven edit → compile → test → fix loop impractically slow. With Maven Daemon (mvnd), incremental builds now complete in 20 seconds and zero-change builds in 3 seconds.

Additionally, standard Maven cannot run individual ScalaTest methods — the -Dsuites and -am flags conflict. run-scala-test.sh solves this by building the classpath via Maven and then launching ScalaTest directly, exactly as IntelliJ does.

Changes

Commit 1: Upgrade protobuf-maven-plugin and enable checkStaleness

  • Upgrade protobuf-maven-plugin from 0.5.1 to 0.6.1
  • Enable <checkStaleness>true</checkStaleness> in all protobuf executions (gluten-core, gluten-substrait, backends-velox)
  • Protobuf code generation is now skipped when .proto files are unchanged
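The staleness idea can be illustrated with a small timestamp comparison. This is a rough sketch of the concept only, not the plugin's actual implementation; the function name `protos_are_stale` and the marker-file layout are hypothetical.

```shell
# Hypothetical sketch: succeed (exit 0) when any .proto file under $1 is newer
# than the marker file $2, or when the marker does not exist yet.
protos_are_stale() {
  proto_dir=$1
  marker=$2
  # No marker means nothing has been generated yet, so treat it as stale.
  [ -f "$marker" ] || return 0
  # Print the first .proto file that is newer than the marker, if any.
  newer=$(find "$proto_dir" -name '*.proto' -newer "$marker" -print | head -n 1)
  [ -n "$newer" ]
}
```

A caller would regenerate sources and then touch the marker only when `protos_are_stale` succeeds.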

Commit 2: Enable Scala incremental compilation

  • Upgrade scala-maven-plugin from 4.8.0 to 4.9.2 (aligned with Apache Spark)
  • Change scala.recompile.mode from all to incremental — Zinc now recompiles only affected files
  • Skip maven-compiler-plugin compilation — Zinc already handles Java sources in incremental mode (same approach as Apache Spark)
  • Add -Ybackend-parallelism 8 for parallel code generation in both Scala 2.12/2.13

Commit 3: Consolidate build-info generation

  • Merge build-info and build-info-with-backends into a single antrun execution in gluten-core
  • Remove duplicate build-info-with-backends execution from gluten-substrait
  • gluten-build-info.sh now computes backend paths internally based on --backend parameter

Commit 4: Add dev tooling for AI agent integration

  • dev/run-scala-test.sh (674 lines): Run ScalaTest like IntelliJ IDEA from CLI
    • Auto-resolves classpath via Maven, replaces jar paths with target/classes dirs for instant code changes
    • Supports --mvnd for Maven Daemon, --profile for profiling, --export-only for classpath inspection
    • Built-in file-change detection cache: skips Maven entirely when no source files changed
    • Designed for AI agent consumption: agents can run individual test methods (-t "test name") to verify fixes
  • build/mvnd: Maven Daemon wrapper (auto-downloads mvnd 1.0.3) — persistent JVM preserves Zinc's JIT caches
  • build/mvn: Increase ReservedCodeCacheSize from 1g to 2g
  • dev/analyze-build-profile.py: Analyze maven-profiler JSON reports with comparison mode
  • .gitignore: Add build/mvnd, .run-scala-test-cache/, .profiler/, .mvn/
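The jar-to-target/classes swap mentioned for dev/run-scala-test.sh could look roughly like the sketch below. The PR text does not show how the script maps jars to module directories, so the helper name `swap_jar_for_classes` and the name-prefix matching rule are assumptions made for illustration.

```shell
# Hypothetical helper: in the colon-separated classpath $1, replace any jar
# whose file name starts with "<artifact>-" (artifact = $2) by the classes
# directory $3. Test-jars (*-tests.jar) are left untouched.
swap_jar_for_classes() {
  classpath=$1
  artifact=$2
  classes_dir=$3
  result=
  old_ifs=$IFS
  IFS=:
  for entry in $classpath; do
    case ${entry##*/} in
      *-tests.jar) ;;
      "$artifact"-*.jar) entry=$classes_dir ;;
    esac
    result=${result:+$result:}$entry
  done
  IFS=$old_ifs
  printf '%s\n' "$result"
}
```

Prefix matching is deliberately naive: an artifact whose name is a prefix of another artifact's name would also match, so a real script would need a stricter rule.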

Commit 5: Move test-jar from test-compile to package phase

  • Remove <phase>test-compile</phase> from prepare-test-jar execution in all 18 modules
  • The test-jar goal defaults to the package phase, so test-jars are no longer rebuilt during mvn test-compile
  • This eliminates a Zinc cascade recompilation issue: previously, test-jars were repackaged at every test-compile invocation (even with no changes), causing downstream modules to detect classpath changes and triggering full recompilation of their test sources
  • Trade-off: cross-module test-jar dependencies resolve from ~/.m2 during test-compile. Run mvn install -DskipTests after changing upstream test APIs
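Why a goal bound to the package phase no longer runs during mvn test-compile follows from the order of Maven's default lifecycle. The sketch below encodes that order as documented for Maven 3; the helper names `phase_index` and `runs_during` are hypothetical.

```shell
# Phases of Maven 3's default lifecycle, in execution order.
MAVEN_PHASES="validate initialize generate-sources process-sources
generate-resources process-resources compile process-classes
generate-test-sources process-test-sources generate-test-resources
process-test-resources test-compile process-test-classes test
prepare-package package pre-integration-test integration-test
post-integration-test verify install deploy"

# Print the 1-based position of phase $1; fail if the phase is unknown.
phase_index() {
  i=0
  for p in $MAVEN_PHASES; do
    i=$((i + 1))
    if [ "$p" = "$1" ]; then echo "$i"; return 0; fi
  done
  return 1
}

# Succeed when a goal bound to phase $1 executes as part of `mvn $2`,
# i.e. when $1 comes at or before $2 in the lifecycle.
runs_during() {
  a=$(phase_index "$1") && b=$(phase_index "$2") && [ "$a" -le "$b" ]
}
```

For example, `runs_during package test-compile` fails while `runs_during generate-resources test-compile` succeeds, which matches the behavior described in this commit.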

Benchmark Results

Machine 1

| Scenario | mvn | mvnd | Improvement |
| --- | --- | --- | --- |
| Clean build | 1m18s | 1m20s (cold) | |
| Incremental (1 file changed) | 1m04s | 20s | -69% |
| Zero-change + file cache | 0s | 1s | |
| Zero-change + force | 55s | 2s | -96% |

Machine 2

| Scenario | mvn | mvnd | mvnd Speedup |
| --- | --- | --- | --- |
| Clean build | 3m48s (228s) | 3m45s (225s) cold | 1.01x |
| Incremental (changed) | 2m53s (173s) | 0m43s (43s) | 4.0x |
| Zero-change | 2m24s (144s) | 0m08s (8s) | 18.0x |
| Zero-change + cache | 0m05s (5s) | 0m05s (5s) | 1.0x |
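The speedup column is simply the mvn time divided by the mvnd time. A minimal sketch for recomputing it from durations written like "2m53s" follows; the helper names `to_seconds` and `speedup` are made up for this example.

```shell
# Convert a duration such as "2m53s" or "43s" to whole seconds.
to_seconds() {
  printf '%s\n' "$1" | awk '{
    m = 0; s = $0
    if (index(s, "m") > 0) { split(s, parts, "m"); m = parts[1]; s = parts[2] }
    sub(/s$/, "", s)
    print m * 60 + s
  }'
}

# Print the ratio of duration $1 (mvn) to duration $2 (mvnd), two decimals.
speedup() {
  awk -v a="$(to_seconds "$1")" -v b="$(to_seconds "$2")" \
    'BEGIN { printf "%.2fx\n", a / b }'
}
```

Calling `speedup 2m24s 0m08s` reproduces the 18x figure for the zero-change row.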

run-scala-test.sh Usage (for AI Agents)

# Run entire suite
./dev/run-scala-test.sh \
  -Pjava-21,spark-4.0,scala-2.13,backends-velox,hadoop-3.3,spark-ut \
  -pl gluten-ut/spark40 \
  -s org.apache.spark.sql.GlutenDeprecatedDatasetAggregatorSuite

# Run single test method (key for AI agent bug-fix workflows)
./dev/run-scala-test.sh \
  -Pjava-21,spark-4.0,scala-2.13,backends-velox,hadoop-3.3,spark-ut \
  -pl gluten-ut/spark40 \
  -s org.apache.spark.sql.GlutenDeprecatedDatasetAggregatorSuite \
  -t "typed aggregation: class input with reordering" \
  --mvnd

The --mvnd flag uses Maven Daemon for persistent JVM, reducing repeat compilations to ~3s. The built-in cache detects source file changes and skips Maven entirely when nothing changed.
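A file-change detection cache of this kind can be approximated by fingerprinting the source tree and comparing against a stored value. The PR does not say how the script detects changes, so the content-checksum approach and the function names below are assumptions for illustration only.

```shell
# Hypothetical sketch: print a fingerprint of all Scala, Java and POM files
# under directory $1, based on their paths and content checksums.
source_fingerprint() {
  find "$1" -type f \( -name '*.scala' -o -name '*.java' -o -name 'pom.xml' \) \
    -exec cksum {} + | sort | cksum
}

# Succeed (exit 0) when the sources under $1 differ from the fingerprint
# stored in cache file $2, and update the cache. Fail when nothing changed.
sources_changed() {
  current=$(source_fingerprint "$1")
  if [ -f "$2" ] && [ "$(cat "$2")" = "$current" ]; then
    return 1
  fi
  printf '%s\n' "$current" > "$2"
  return 0
}
```

A wrapper could then invoke Maven only when `sources_changed` succeeds and go straight to launching the test otherwise.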

Fixes #11559

How was this patch tested?

  • All 5 commits individually verified with mvn test-compile -pl backends-velox -am (BUILD SUCCESS)
  • --export-only classpath comparison: IDENTICAL before and after Commit 5 changes
  • Benchmark run with dev/benchmark-build.sh across 11 scenarios (mvn/mvnd × clean/incremental/zero-change/cache/force)
  • Profiler analysis confirms zero-change scenario has no actual compilation — all overhead is Zinc analysis cache loading (~2.4s total with mvnd)
  • Passes GitHub Actions (GHA) CI

Was this patch authored or co-authored using generative AI tooling?

Generated-by: GitHub Copilot (Claude Opus 4.6)

@github-actions github-actions bot added the CORE and VELOX labels Feb 4, 2026
@github-actions

github-actions bot commented Feb 4, 2026

Run Gluten Clickhouse CI on x86

Contributor

Copilot AI left a comment

Pull request overview

This PR targets faster incremental mvn test-compile runs by deferring build-info generation to later lifecycle phases and enabling incremental protobuf codegen.

Changes:

  • Bump protobuf-maven-plugin version in parent pom.xml from 0.5.1 to 0.6.1.
  • Enable incremental protobuf generation via <checkStaleness>true</checkStaleness> in affected modules.
  • Move maven-antrun-plugin build-info executions from generate-resources to prepare-package in gluten-core and gluten-substrait.

Reviewed changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated 1 comment.

File Description
pom.xml Upgrades protobuf-maven-plugin version in pluginManagement.
gluten-substrait/pom.xml Moves backend build-info generation later; enables protobuf staleness checking.
gluten-core/pom.xml Enables protobuf staleness checking; moves core build-info generation later.
backends-velox/pom.xml Enables protobuf staleness checking for velox backend protos.

  <goal>run</goal>
  </goals>
- <phase>generate-resources</phase>
+ <phase>prepare-package</phase>

Copilot AI Feb 4, 2026

Moving the build-info antrun execution to prepare-package means gluten-build-info.properties is generated after the default process-resources phase. Since this file is written under ${project.build.directory}/generated-resources (see dev/gluten-build-info.sh) and resources are only copied into ${project.build.outputDirectory} during process-resources, the gluten-core artifact (and reactor classpath for downstream modules) will not contain the build-info resource in a clean build. This can break runtime code that loads gluten-build-info.properties (e.g., shims/common/.../GlutenBuildInfo.scala throws if missing) and any tests that touch that code.

Consider either keeping this execution before process-resources, or adding a resources copy step after this execution (e.g., a maven-resources-plugin execution in prepare-package) / adjusting the script to write directly into ${project.build.outputDirectory} so the file ends up on the classpath before packaging and downstream module tests.

Suggested change
- <phase>prepare-package</phase>
+ <phase>generate-resources</phase>

@github-actions github-actions bot added the BUILD label Feb 4, 2026
@github-actions

github-actions bot commented Feb 4, 2026

Run Gluten Clickhouse CI on x86

@baibaichen baibaichen force-pushed the fix/improve-incremental-build branch from 7c48ecb to 1c2d696 Compare February 4, 2026 12:52
@github-actions

github-actions bot commented Feb 4, 2026

Run Gluten Clickhouse CI on x86

@baibaichen baibaichen force-pushed the fix/improve-incremental-build branch from 1c2d696 to 0764fd9 Compare February 4, 2026 14:59
@github-actions

github-actions bot commented Feb 4, 2026

Run Gluten Clickhouse CI on x86

@baibaichen baibaichen force-pushed the fix/improve-incremental-build branch from 0764fd9 to f376cb7 Compare February 4, 2026 15:20
@github-actions

github-actions bot commented Feb 4, 2026

Run Gluten Clickhouse CI on x86

@baibaichen baibaichen force-pushed the fix/improve-incremental-build branch 2 times, most recently from f702a89 to 2fef0dc Compare February 4, 2026 15:31
@github-actions

github-actions bot commented Feb 4, 2026

Run Gluten Clickhouse CI on x86

1 similar comment
@github-actions

github-actions bot commented Feb 4, 2026

Run Gluten Clickhouse CI on x86

@baibaichen baibaichen force-pushed the fix/improve-incremental-build branch from 2fef0dc to b83c2fa Compare February 4, 2026 15:50
@github-actions

github-actions bot commented Feb 4, 2026

Run Gluten Clickhouse CI on x86

@baibaichen baibaichen force-pushed the fix/improve-incremental-build branch from b83c2fa to b8155e6 Compare February 4, 2026 15:53
@github-actions github-actions bot removed the VELOX label Feb 4, 2026
@github-actions

github-actions bot commented Feb 4, 2026

Run Gluten Clickhouse CI on x86

@baibaichen baibaichen force-pushed the fix/improve-incremental-build branch from b8155e6 to 39f8cc1 Compare February 4, 2026 16:01
@github-actions

github-actions bot commented Feb 4, 2026

Run Gluten Clickhouse CI on x86

@github-actions github-actions bot added the VELOX label Feb 4, 2026
@github-actions

github-actions bot commented Feb 5, 2026

Run Gluten Clickhouse CI on x86

@baibaichen baibaichen force-pushed the fix/improve-incremental-build branch from 5a4da52 to 201d145 Compare February 7, 2026 05:06
@github-actions github-actions bot added the TOOLS label Feb 7, 2026
@github-actions

github-actions bot commented Feb 7, 2026

Run Gluten Clickhouse CI on x86

1 similar comment
@github-actions

github-actions bot commented Feb 8, 2026

Run Gluten Clickhouse CI on x86

@github-actions

github-actions bot commented Feb 9, 2026

Run Gluten Clickhouse CI on x86

…leness

Enable <checkStaleness>true</checkStaleness> in all protobuf-maven-plugin
executions (gluten-core, gluten-substrait, backends-velox) so protobuf
compilation is skipped when .proto files haven't changed, improving
incremental build speed.

1. Upgrade scala-maven-plugin from 4.8.0 to 4.9.2 (aligned with Spark)
2. Change scala.recompile.mode from 'all' to 'incremental'
3. Skip javac compilation - Zinc already handles Java sources in
   incremental mode (same approach as Apache Spark)
4. Add -Ybackend-parallelism 8 for both Scala 2.12 and 2.13 profiles
5. Update gluten-it to use incremental mode and 4.9.2 (hardcoded since
   it's a standalone third-party module without parent POM properties)

Merge build-info and build-info-with-backends into a single execution
in gluten-core, eliminating the separate call from gluten-substrait:

- Remove build-info-with-backends execution from gluten-substrait/pom.xml
- Remove redundant backend profile definitions from gluten-substrait
- Add --backend parameter to gluten-core's build-info execution
- Modify gluten-build-info.sh to compute backend paths internally
  based on backend_type (no longer needs external path argument)
- Remove DO_REMOVAL flag; always regenerate the file from scratch

- dev/run-scala-test.sh: Run ScalaTest like IntelliJ IDEA from CLI
  with auto classpath resolution, profiler support, and mvnd integration
- build/mvnd: Maven Daemon wrapper (auto-downloads mvnd 1.0.3)
  for persistent JVM that keeps Zinc's JIT caches across builds
- build/mvn: Increase ReservedCodeCacheSize from 1g to 2g
- dev/analyze-build-profile.py: Analyze Maven profiler JSON reports
- .gitignore: Add build/mvnd, .run-scala-test-cache/, .profiler/, .mvn/

Remove the <phase>test-compile</phase> override from the prepare-test-jar
execution in all 18 modules. The test-jar goal defaults to the package
phase, so test-jars are no longer rebuilt during mvn test-compile.

This eliminates a Zinc cascade recompilation issue: previously, test-jars
were repackaged at every test-compile invocation (even with no changes),
causing downstream modules to detect classpath changes and triggering
full recompilation of their test sources.

Trade-off: cross-module test-jar dependencies in the reactor are now
resolved from the local repository (~/.m2) during test-compile. Run
'mvn install -DskipTests' after changing upstream test APIs.

Since Scala 2.13.15 (scala/scala#10708), the semantics of combined
-Wconf rules changed: in '-Wconf:x,y', y now takes priority over x
(last-match-wins), whereas before 2.13.15, x took priority
(first-match-wins). This means '-Wconf:cat=deprecation:wv,any:e' now
treats deprecation warnings as errors (any:e overrides
cat=deprecation:wv), breaking Scala 2.13 compilation when -Pdelta is
enabled.

Split into separate -Wconf flags where later flags have higher
priority:
  -Wconf:any:e                                     (baseline)
  -Wconf:msg=While parsing annotations in:silent   (override)
  -Wconf:cat=deprecation:wv                        (override)

This aligns with Apache Spark's approach in SPARK-49746 (983f6f43).
Gluten uses Scala 2.13.17 which is affected by this change.
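The precedence change described above can be modeled with a toy rule resolver. This is only a simplified illustration of first-match-wins versus last-match-wins as stated in this commit message, not the compiler's implementation; the filter syntax handled here ("any" and "cat=<name>") and the function names are assumptions.

```shell
# Hypothetical model: does filter $1 ("any" or "cat=<name>") match category $2?
wconf_matches() {
  [ "$1" = any ] || [ "$1" = "cat=$2" ]
}

# Print the action chosen for category $2 from comma-separated rules $1,
# using policy $3: "first" (before 2.13.15) or "last" (2.13.15 and later).
wconf_action() {
  chosen=
  old_ifs=$IFS
  IFS=,
  for rule in $1; do
    filter=${rule%:*}
    action=${rule##*:}
    if wconf_matches "$filter" "$2"; then
      # Under first-match-wins, keep the earliest matching rule.
      if [ "$3" = first ] && [ -n "$chosen" ]; then continue; fi
      chosen=$action
    fi
  done
  IFS=$old_ifs
  printf '%s\n' "$chosen"
}
```

With the rules `cat=deprecation:wv,any:e`, the model picks `wv` for deprecation under the first policy and `e` under the last policy, which is the breakage this commit works around.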

Reference:
- scala/scala#10708
- apache/spark#48192

AI tools can perform build profile analysis on-demand without
requiring a committed script. Moved to fix/improve-incremental-build-tmp
branch for reference.
@baibaichen baibaichen force-pushed the fix/improve-incremental-build branch from 653a80a to e8dc081 Compare February 10, 2026 07:07
@github-actions

Run Gluten Clickhouse CI on x86

1 similar comment
@github-actions

Run Gluten Clickhouse CI on x86

Contributor

@jinchengchenghh jinchengchenghh left a comment

Thanks!

@baibaichen baibaichen merged commit 48ca0ba into apache:main Feb 10, 2026
72 checks passed
@baibaichen baibaichen deleted the fix/improve-incremental-build branch February 10, 2026 11:11
@zhouyuan
Member

@baibaichen I ran into an issue that seems to have been introduced by the incremental mode:

[2026-02-25T04:38:52.691Z] #10 95.39 [ERROR] /incubator-gluten/gluten-core/src/main/scala/org/apache/gluten/config/ConfigRegistry.scala:48: Symbol 'type org.apache.spark.sql.internal.SqlApiConf' is missing from the classpath.
[2026-02-25T04:38:52.691Z] #10 95.39 This symbol is required by 'class org.apache.spark.sql.internal.SQLConf'.
[2026-02-25T04:38:52.691Z] #10 95.39 Make sure that type SqlApiConf is in your classpath and check for conflicting dependencies with `-Ylog-classpath`.
[2026-02-25T04:38:52.691Z] #10 95.39 A full rebuild may help if 'SQLConf.class' was compiled against an incompatible version of org.apache.spark.sql.internal.
[2026-02-25T04:38:52.691Z] #10 95.39 [ERROR] /incubator-gluten/gluten-core/src/main/scala/org/apache/gluten/execution/ColumnarToColumnarExec.scala:31: Symbol 'type org.apache.spark.sql.catalyst.trees.WithOrigin' is missing from the classpath.
[2026-02-25T04:38:52.691Z] #10 95.39 This symbol is required by 'class org.apache.spark.sql.catalyst.trees.TreeNode'.
[2026-02-25T04:38:52.691Z] #10 95.39 Make sure that type WithOrigin is in your classpath and check for conflicting dependencies with `-Ylog-classpath`.
[2026-02-25T04:38:52.691Z] #10 95.39 A full rebuild may help if 'TreeNode.class' was compiled against an incompatible version of org.apache.spark.sql.catalyst.trees.
[2026-02-25T04:38:52.691Z] #10 95.39 [ERROR] /incubator-gluten/gluten-core/src/main/scala/org/apache/gluten/extension/injector/package.scala:23: Symbol 'type org.apache.spark.sql.catalyst.parser.DataTypeParserInterface' is missing from the classpath.
[2026-02-25T04:38:52.691Z] #10 95.39 This symbol is required by 'trait org.apache.spark.sql.catalyst.parser.ParserInterface'.
[2026-02-25T04:38:52.691Z] #10 95.39 Make sure that type DataTypeParserInterface is in your classpath and check for conflicting dependencies with `-Ylog-classpath`.
[2026-02-25T04:38:52.691Z] #10 95.39 A full rebuild may help if 'ParserInterface.class' was compiled against an incompatible version of org.apache.spark.sql.catalyst.parser.
[2026-02-25T04:38:52.691Z] #10 95.39 [ERROR] /incubator-gluten/gluten-core/src/main/scala/org/apache/spark/sql/execution/adaptive/GlutenCost.scala:35: Symbol 'type org.apache.spark.sql.errors.DataTypeErrorsBase' is missing from the classpath.
[2026-02-25T04:38:52.691Z] #10 95.39 This symbol is required by 'trait org.apache.spark.sql.errors.QueryErrorsBase'.
[2026-02-25T04:38:52.691Z] #10 95.39 Make sure that type DataTypeErrorsBase is in your classpath and check for conflicting dependencies with `-Ylog-classpath`.
[2026-02-25T04:38:52.691Z] #10 95.39 A full rebuild may help if 'QueryErrorsBase.class' was compiled against an incompatible version of org.apache.spark.sql.errors.
[2026-02-25T04:38:52.691Z] #10 95.39 [ERROR] four errors found

  <os.full.name>unknown</os.full.name>
  <!-- To build built-in backend c++ codes -->
- <scala.recompile.mode>all</scala.recompile.mode>
+ <scala.recompile.mode>incremental</scala.recompile.mode>
Member

can we use a separate profile for this?

Contributor

Do we have a -Dxxx option to control this behavior? If so, could we change the default back to the original value, all?
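Since scala.recompile.mode appears as a POM property in the diff above, Maven's usual -D user-property override would be expected to apply to it. The wrapper below is a sketch under that assumption and has not been verified against this repository; the function name `mvn_with_recompile_mode` is hypothetical, and it only prints the command line rather than running it.

```shell
# Hypothetical wrapper: print the Maven command line that overrides
# scala.recompile.mode with $1 ("all" or "incremental") for the given goals.
mvn_with_recompile_mode() {
  mode=$1
  shift
  case $mode in
    all|incremental) ;;
    *) echo "unsupported mode: $mode" >&2; return 1 ;;
  esac
  echo "mvn -Dscala.recompile.mode=$mode $*"
}
```

For instance, `mvn_with_recompile_mode all test-compile -pl gluten-core -am` prints a command that would request a full recompile for one build without editing the POM.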

Labels

AI (AI Related issues and PR), BUILD, CLICKHOUSE, CORE (works for Gluten Core), DATA_LAKE, TOOLS, VELOX

Development

Successfully merging this pull request may close these issues.

[Build] Improve incremental build time

4 participants