Fix build cache determinism for reproducible builds#7173
Fix build cache determinism for reproducible builds#7173timtebeek merged 4 commits intoopenrewrite:mainfrom
Conversation
Replace non-deterministic inputs that cause cache misses when the same commit is built twice from different directories: - Version timestamps: use git commit timestamp instead of wall-clock time (rewrite-python, rewrite-csharp, rewrite-javascript) - Test system properties: use jvmArgumentProviders with @Classpath/@internal instead of absolute path systemProperty (rewrite-gradle, rewrite-gradle-tooling-model) - Exec tasks: use relative paths for executable and args (rewrite-csharp csharpTest, rewrite-python pytestTest) - Exclude __pycache__ directories from pytestTest inputs
…tifacts - Exclude **/build/** from csharpTest/csharpBuild fileTree inputs — dotnet test writes junit.xml inside the OpenRewrite/ tree, which persists across builds (Gradle clean only removes the Gradle build/ dir, not the C# one) - Replace inputs.files(npmInstall) with explicit fileTree excluding .vite-temp/, .vite/, .cache/ — vitest creates these inside node_modules/ during test execution, leaking into the next build's input fingerprint - Remove CSharpParseProjectTest — flaky (RPC timeout after 60s), causes cascading cache misses when it fails in the first build of exp3
The pluginLocalTestClasspath FileCollection includes test-manifest.txt as an artifact, which contains absolute paths. Filtering it from the @classpath property prevents cache misses across different build environments.
472ac6f to
8979f0c
Compare
|
Love the continued help from you and Gradle here @ribafish; much appreciated! Indeed our builds have become more complicated and longer recently as we expanded to cover additional languages. I'll tag colleagues here for review. |
|
I also want to appreciate the help here! Great stuff. Thank you, Gradle team. A comment, specifically to the OR team:
Not fully sure if we want to adopt this change. We tend to rely on unpinned versions of our own dependencies and third-parties. Using the last commit version might prevent us from using a newer dependency which became available after the last commit, right? |
|
That's fair, though I'm not sure if in these three contexts that were modified we'd be picking up a dependency upgrade for some active reason without having a commit originating from somewhere. In other words, I'm wondering if it's minimal risk in terms of the reward. |
|
I'm not too worried about how this affects dynamic dependency constraints in this specific repository. This repository is the source of libraries we take |
|
Thanks again! We'll get this merged and see if there's similar changes to be made with Scala and Go added just in the past few days. |
The goBuild Exec task used an absolute path in its -o argument, which is a cache input. This causes cache misses when the same commit is built from different directories. Use a relative path instead, matching the pattern applied in #7173 for pytestTest and csharpTest.
The goBuild Exec task used an absolute path in its -o argument, which is a cache input. This causes cache misses when the same commit is built from different directories. Use a relative path instead, matching the pattern applied in #7173 for pytestTest and csharpTest.
Summary
This PR fixes build cache non-determinism that causes cache misses when the same commit is built twice from different directories (e.g., Develocity build validation experiment 3). 11 tasks currently produce different cache keys between identical builds.
Changes
1. Version timestamps:
LocalDateTime.now()/Instant.now()→ git commit timestamp (3 files)Files:
rewrite-python/build.gradle.kts,rewrite-csharp/build.gradle.kts,rewrite-javascript/build.gradle.ktsProblem: Snapshot version strings are generated using
LocalDateTime.now()orInstant.now()at build script evaluation time. When the same commit is built twice (even seconds apart), the timestamps differ, producing differentMETA-INF/rewrite-*-version.txtcontent. This version file ends up in the compiled JAR classpath, causing cache key misses for every downstream task that depends on it.Affected tasks (6):
:rewrite-python:test,:rewrite-python:javadoc,:rewrite-python:py2CompatibilityTest:rewrite-csharp:test,:rewrite-csharp:javadoc:rewrite-javascript:javadocFix: Replace
LocalDateTime.now()/Instant.now()with agitCommitTimestamp()helper that reads the HEAD commit's author timestamp viagit log -1 --format=%ct. This is deterministic for the same commit while still providing unique, monotonically increasing versions across different commits. The formatted output matches the existing patterns (yyyyMMddHHmmss/yyyyMMdd-HHmmss).Implications:
git logsubprocess call during script evaluation (negligible overhead, ~5ms)2. Test system properties: absolute paths →
jvmArgumentProviders(2 files)Files:
rewrite-gradle/build.gradle.kts,rewrite-gradle-tooling-model/model/build.gradle.ktsProblem: Both files set a
systemPropertypointing to atest-manifest.txtfile using an absolute path. ThesystemProperty()API registers the value as a plain@Input, so the full absolute path becomes part of the cache key. Builds from different directories produce different cache keys.Additionally, the manifest file content itself contains absolute paths (the JAR paths of
pluginLocalTestClasspath), so even using@PathSensitive(RELATIVE)on the file wouldn't help — the content hash would still differ.Affected tasks (2):
:rewrite-gradle:test:rewrite-gradle-tooling-model:model:testFix: Replace
systemProperty(...)with ajvmArgumentProvidersentry using:@Classpathon the actualpluginLocalTestClasspathFileCollection(filtered to excludetest-manifest.txt) — this hashes JAR contents in a path-insensitive way, providing correct cache key semantics@Internalon the manifestFile— the manifest is only needed at runtime to pass the path to test code; its content (absolute paths) should not affect the cache key since the@Classpathproperty already tracks the actual dependency contentThe system property is still passed to the JVM at execution time via
-D..., so test behavior is unchanged.Implications:
@Classpathinput3. Exec task absolute paths → relative paths (2 files)
Files:
rewrite-csharp/build.gradle.kts,rewrite-python/build.gradle.ktsProblem:
csharpTest: The--loggerargument containsjunitXmlFile.absolutePath, baking an absolute path into the Exec task'sargs(which is a cache input)pytestTest: ThecommandLineusespythonExe.absolutePathas the executable, which is also a cache inputAffected tasks (2):
:rewrite-csharp:csharpTest:rewrite-python:pytestTestFix:
csharpTest: UsejunitXmlFile.relativeTo(csharpDir).path— the file is relative to the working directory anywaypytestTest: Use the relative venv path (.venv/bin/pythonor.venv/Scripts/python.exeon Windows) sinceworkingDiris already set topythonDirImplications:
workingDiris set correctly4. Exclude
__pycache__frompytestTestinputs (1 file)File:
rewrite-python/build.gradle.ktsProblem:
pytestTestdeclaresinputs.dir(pythonDir.resolve("src"))andinputs.dir(pythonDir.resolve("tests")). ThepythonInstalltask (a dependency) runspip install -e .[dev], which creates__pycache__directories insidesrc/andtests/. In a two-build experiment, build B picks up__pycache__directories created by build A as additional inputs (shown as "Directory is an input only in build B" in the build scan comparison).Affected tasks (1):
:rewrite-python:pytestTestFix: Change
inputs.dir(...)toinputs.files(fileTree(...) { exclude("**/__pycache__/**") })for bothsrcandtestsdirectories.Implications:
__pycache__directories are build artifacts that don't affect test correctness — excluding them is safe5. Filter test-manifest.txt from @classpath in rewrite-gradle (1 file)
File:
rewrite-gradle/build.gradle.ktsProblem: The
pluginLocalTestClasspathFileCollection used in the@Classpathproperty of:rewrite-gradle:test'sCommandLineArgumentProviderincludes thetest-manifest.txtartifact from:rewrite-gradle-tooling-model:model. This file contains absolute paths, so its content hash differs across build environments, causing a cache miss even though the actual plugin JARs are identical.Affected tasks (1):
:rewrite-gradle:testFix: Filter
test-manifest.txtfrom the@ClasspathFileCollection. The manifest is already tracked separately as@Internalsince it's only needed at runtime.Implications: