Skip to content
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension


Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
53 changes: 51 additions & 2 deletions .github/workflows/ci.yml
Original file line number Diff line number Diff line change
Expand Up @@ -20,5 +20,54 @@ jobs:

- uses: gradle/actions/setup-gradle@v6

- name: Build and test
run: ./gradlew check
# Compile once up-front so the two test stages below don't each
# pay for a cold classpath. Splitting test execution from
# compilation also makes the failure signal in the GitHub UI
# unambiguous — a red "Unit tests" or "E2E (Cucumber) tests"
# step means a test actually failed, not a compile error.
- name: Compile
run: ./gradlew classes testClasses functionalTestClasses

# Unit tests: ProjectBuilder-level coverage of the decision
# engine, extension wiring, --explain rendering, and locale
# edge cases. Cheap, fast, and the first gate adopters hit
# when the plugin regresses on a pure-logic change.
- name: Unit tests
run: ./gradlew :affected-tests-core:test :affected-tests-gradle:test

# E2E (Cucumber + Gradle TestKit): drives real Gradle builds
# through the full plugin surface that security-service and
# other adopters consume. Each scenario in
# src/functionalTest/resources/.../features runs a fresh
# Gradle daemon, so these tests catch build-time wiring
# issues (Shadow relocations, plugin-publish variants,
# Configuration Cache compatibility) that the unit suite
# cannot. Surfaced as a distinct step so a red status here
# points straight at the adopter-facing failure mode.
- name: E2E (Cucumber + Gradle TestKit) tests
run: ./gradlew :affected-tests-gradle:functionalTest

# Assemble the shadow JAR and exercise any remaining `check`
# hooks that neither test task above owns. Re-excluding test
# and functionalTest here keeps the step honest: if it fails,
# the cause is packaging, not test results.
- name: Full build (shadow JAR + remaining check tasks)
run: ./gradlew build -x test -x functionalTest

# Upload test reports on any failure above so a failing run
# is debuggable without rerunning with --info locally. The
# Cucumber HTML report in particular is the fastest way to
# read an e2e failure — the console log for a multi-scenario
# failure is dense.
- name: Upload test reports
if: failure()
uses: actions/upload-artifact@v4
with:
name: test-reports
path: |
affected-tests-gradle/build/reports/tests/
affected-tests-gradle/build/test-results/
affected-tests-core/build/reports/tests/
affected-tests-core/build/test-results/
if-no-files-found: ignore
retention-days: 14
1 change: 1 addition & 0 deletions .release-version
Original file line number Diff line number Diff line change
@@ -0,0 +1 @@
2.2.1
149 changes: 128 additions & 21 deletions CHANGELOG.md
Original file line number Diff line number Diff line change
Expand Up @@ -4,14 +4,137 @@ All notable changes to this plugin are documented here. The format follows
[Keep a Changelog](https://keepachangelog.com/en/1.1.0/), and this project
adheres to [Semantic Versioning](https://semver.org/).

## [Unreleased]
## [v2.2.1] — code-review follow-ups after the v2.2 ship

v2.2.1 is a **non-breaking patch** that closes the Medium + Low
findings a structured code review raised against the v2.2
changeset. None of them were adopter-reported regressions — v2.2 is
green end-to-end on the `security-service` pilot — but each one
represents a real failure mode for a second adopter who doesn't
ship in exactly the same posture. Point-releasing them now keeps
the v2.3 tech-plan window open for the larger feature work
(non-conventional test task names, explain-trace truncation
unification) without delaying these fixes behind it.

If you're on v2.2.0 today you can bump to v2.2.1 without touching
`build.gradle`. The only observable change on a green path is that
enabling `org.gradle.configuration-cache` now works. The only
observable changes on the unhappy path are sharper error messages
and a new hint branch for the `onDiscoveryIncomplete = 'skipped'`
opt-in.

### Fixed — `org.gradle.configuration-cache` no longer crashes the plugin (M1)

The v2.2 dispatch-dependency Callable closed over a `Project`
reference to resolve each subproject's `testClasses` task. `Project`
is explicitly documented as **not** configuration-cache-serialisable,
so a repo with `org.gradle.configuration-cache=true` in
`gradle.properties` would fail to serialise the task graph with:

```
Task `:affectedTest` of type `...AffectedTestTask`: cannot serialize
object of type Project
```

The fix resolves each subproject's `testClasses` task once, eagerly,
into a `TaskProvider<?>` (which IS CC-serialisable) and closes the
Callable over that plus the task's own `Property<Boolean>` for the
`--explain` gate. Cache-hit rate and task-graph shape are identical
to v2.2 — only the serialisation surface changed. A new functional
scenario runs `./gradlew affectedTest --configuration-cache
--explain` on a SELECTED-shaped diff and pins "Configuration cache
entry …" in the output so the next edit that accidentally re-captures
`Project` breaks a test instead of only an adopter's CI.

Adopters who don't enable CC see no change.

### Fixed — `Action.SKIPPED` hint no longer calls the skip "safe" (M3)

v2.2's `DISCOVERY_INCOMPLETE` hint was a two-way branch: SELECTED
warned about the partial-selection risk; everything else printed "the
resolved action above is the safe fallback". That "everything else"
bucket silently absorbed `Action.SKIPPED` — which an operator can
opt into explicitly via `onDiscoveryIncomplete = 'skipped'`. Calling
SKIPPED "safe" is precisely inverted advice: the run executed zero
tests on a partial-parse diff.

v2.2.1 gives SKIPPED its own hint branch:

```
Hint: one or more Java files in the diff failed to parse,
so discovery ran with missing inputs.
onDiscoveryIncomplete = 'skipped' meant no tests ran
for this diff — fix the parse error to restore
coverage, or set onDiscoveryIncomplete = 'full_suite'
if silently skipping a partial-parse diff is not the
intended policy.
```

The hint names the exact knob that was set (so a second reader of the
trace can locate the override) and names the escape (`'full_suite'`)
without ever calling the current state "safe". SELECTED and
FULL_SUITE wording are unchanged.

### Fixed — `-PaffectedTestsMode=` (empty) no longer crashes (L1)

CI templates that unconditionally emit `-PaffectedTestsMode=$MODE`
with an unset `$MODE` passed the literal empty string into the
extension's convention, and v2.2 rejected it with
`Unknown affectedTests.mode ''`. v2.2.1 filters empty/whitespace in
the convention chain so the empty case is identical to omitting the
flag — the AUTO default keeps working, and the "unknown mode" error
stays reserved for actual typos.

### Fixed — unknown-mode error names the AUTO fallback (L4)

v2.2's `parseMode` error listed the four legal values only, which
led adopters to ask "so which one is the default?". v2.2.1 adds the
AUTO-fallback hint plus the `CI=true` tripwire that AUTO keys off:

```
Unknown affectedTests.mode 'xyzzy'. Expected one of: auto, local,
ci, strict (omit the value or leave -PaffectedTestsMode unset to
keep the AUTO default, which picks CI when CI=true is exported).
```

### Changed — explain-trace module-block contract now fails loudly on misuse (L2)

`AffectedTestTask#appendModulesBlock` used to silently re-normalise
map keys that were missing a leading colon, papering over any caller
that forgot to route their module paths through
`AffectedTestTask#testTaskPath`. A future edit that fed raw
`"api:test"` strings into the map produced a split trace (`":api:test"`
next to `"core:test"`). The helper now asserts the contract up-front,
so the failure surface moves into the test suite.

The assertion is plain `assert`, active on every Gradle test JVM by
default and a no-op on production runs — the public operator-facing
trace output is unchanged.

_Changelog housekeeping: the v2.2.0 section below has been backfilled
with items that were in the v2.2.0 ship but were still sitting under
`[Unreleased]` in the CHANGELOG when the tag was cut._

## [v2.2.0] — adoption-feedback polish from the security-service pilot

v2.2.0 is a **non-breaking** release driven entirely by what a second
real-world adopter (Modulr's `security-service`, CAR-5190) ran into
while plugging v2.1 in. Every DSL knob, every default, and every
resolved behaviour from v2.1 keeps working bit-for-bit; v2.2 only
sharpens the edges an operator touches when something goes wrong
or when they want to A/B a mode from CI.

If you're on v2.1 today you can bump to v2.2 without touching
`build.gradle`. The only observable difference on a green path is a
faster `--explain` run (Bug A) and a per-module breakdown in that
trace (Polish E). The only observable difference on the unhappy path
is clearer hint wording (Bug B) and — in LOCAL mode specifically — a
loud `WARN` when discovery can't parse part of the diff (Risk C).

### Changed — internal refinements from the v2.2 code review

Follow-up polish on the v2.2 adoption-feedback release, all
non-breaking and none of them visible in build scripts. Captured here
rather than cut as a point release because the operator-facing
contract is identical to v2.2.0 on every green path.
Follow-up polish on the v2.2 adoption-feedback changeset, all
non-breaking and none of them visible in build scripts.

* **Risk C WARN no longer fires with "0 test classes".** The engine
rewrites `SELECTED` with nothing to dispatch into `skipped=true`;
Expand Down Expand Up @@ -45,22 +168,6 @@ contract is identical to v2.2.0 on every green path.
root — catches a regression where someone pins the Callable wiring
to the root project instead of iterating via `allprojects { ... }`.

## [v2.2.0] — adoption-feedback polish from the security-service pilot

v2.2.0 is a **non-breaking** release driven entirely by what a second
real-world adopter (Modulr's `security-service`, CAR-5190) ran into
while plugging v2.1 in. Every DSL knob, every default, and every
resolved behaviour from v2.1 keeps working bit-for-bit; v2.2 only
sharpens the edges an operator touches when something goes wrong
or when they want to A/B a mode from CI.

If you're on v2.1 today you can bump to v2.2 without touching
`build.gradle`. The only observable difference on a green path is a
faster `--explain` run (Bug A) and a per-module breakdown in that
trace (Polish E). The only observable difference on the unhappy path
is clearer hint wording (Bug B) and — in LOCAL mode specifically — a
loud `WARN` when discovery can't parse part of the diff (Risk C).

### Fixed — `--explain` no longer forces a full compile (Bug A)

Pre-v2.2 the `affectedTest` task eagerly depended on `testClasses` in
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -313,6 +313,19 @@ public void theTaskFailsAtConfigurationTime() {
"Failure must come from configuration / evaluation phase, got:\n" + world.project().lastOutput());
}

@Then("the task fails")
public void theTaskFails() {
// Generic failure assertion for scenarios where the failure
// phase is uninteresting — e.g. an unknown `-P` value that
// escapes the extension convention and is only rejected at
// task-action time via {@code parseMode}. Asserting on the
// lifecycle phase would tie the test to an implementation
// detail (exactly where the enum lookup fires), so this
// step only pins "the build did not turn green".
assertTrue(world.project().lastBuildFailed(),
"Expected a failing build, got green:\n" + world.project().lastOutput());
}

@Then("the situation is {word}")
public void theSituationIs(String situation) {
// The --explain trace prints the ACTUAL situation on the
Expand Down
Original file line number Diff line number Diff line change
@@ -0,0 +1,134 @@
Feature: v2.2.1 — code-review follow-ups from the v2.2 ship
Scenarios for the Medium + Low findings raised by the v2.2 code
review: Configuration Cache compatibility (M1), Action.SKIPPED
hint wording (M3), empty-string `-PaffectedTestsMode` coercion
(L1), and the AUTO-fallback error wording on an unknown `-P`
value (L4). Every scenario here pins an operator-facing contract
the v2.2 release shipped slightly short on — if any of these
scenarios regress, a real adopter notices.

Background:
Given a freshly initialised project with a committed baseline

# ------------------------------------------------------------------
# M1 — Configuration Cache compatibility
#
# The v2.2 Callable for the Bug A `testClasses` dependency captured
# `Project p` so a CC-enabled adopter would fail to serialise the
# task graph ("cannot serialize object of type Project"). v2.2.1
# fixed it by capturing `TaskProvider<?>` and `Property<Boolean>`
# eagerly — both CC-serialisable. Pin it with a live
# `--configuration-cache` run on the same SELECTED shape adopters
# use daily: if the Callable ever regresses to capturing Project
# again, this scenario turns red.
# ------------------------------------------------------------------
Scenario: --configuration-cache stores and reuses the task graph without Project capture
Given a production class "com.example.FooService" with its matching test "com.example.FooServiceTest"
And the baseline commit is captured
And the diff modifies "src/main/java/com/example/FooService.java"
And the Gradle command-line argument "--configuration-cache"
When the affected-tests task runs with "--explain"
Then the task succeeds
# Gradle logs the CC-store event on the first run — no store line
# means CC never engaged (e.g. because a problem was detected and
# CC quietly degraded). The exact phrase changed between 8.x and
# 9.x; match the stable prefix "Configuration cache entry" which
# both minor lines emit.
And the output contains "Configuration cache entry"
# A Project-capture regression surfaces as this specific message
# — pin the negative so a future edit that re-introduces it can't
# hide behind a green "entry stored" line.
And the output does not contain "cannot be serialized"
And the output does not contain "cannot serialize"

# ------------------------------------------------------------------
# M3 — Action.SKIPPED hint no longer calls the skip "safe"
#
# v2.2 routed Action.SKIPPED through the FULL_SUITE hint branch,
# which said "the resolved action above is the safe fallback".
# Calling SKIPPED "safe" is precisely inverted advice: SKIPPED ran
# zero tests on a partial-parse diff. v2.2.1 gives SKIPPED its own
# hint branch that names the opt-in knob and recommends
# `'full_suite'` if the skip wasn't intentional.
# ------------------------------------------------------------------
Scenario: DISCOVERY_INCOMPLETE on SKIPPED names the opt-in and offers an escape
# Requires the opt-in: LOCAL mode with
# `onDiscoveryIncomplete = 'skipped'` is the only path that lands
# Action.SKIPPED on a parse-failure diff. CI/STRICT escalate
# (their paired scenario is in feature 06).
Given the affected-tests DSL contains:
"""
mode = 'local'
onDiscoveryIncomplete = 'skipped'
"""
And a canned diff that produces the DISCOVERY_INCOMPLETE situation
When the affected-tests task runs with "--explain"
Then the task succeeds
And the situation is DISCOVERY_INCOMPLETE
And the action is SKIPPED
And the output contains "failed to parse"
# Hint must name both the user-set knob AND the escape hatch so
# an operator reading the trace can tell whether the skip was
# deliberate and, if not, exactly which value to flip.
And the output contains "onDiscoveryIncomplete = 'skipped'"
And the output contains "onDiscoveryIncomplete = 'full_suite'"
# The FULL_SUITE branch's "safe fallback" wording must NOT leak
# into the SKIPPED branch — that claim is factually wrong when
# zero tests ran.
And the output does not contain "safe fallback"
# Partial-selection wording belongs to the SELECTED branch only;
# SKIPPED ran nothing, so the selection is not partial — it's
# non-existent.
And the output does not contain "selection is necessarily partial"

# ------------------------------------------------------------------
# L1 — empty `-PaffectedTestsMode` coerces to absent
#
# A CI template that unconditionally emits
# `-PaffectedTestsMode=$MODE` with an unset $MODE would send the
# literal empty string through. v2.2 crashed parseMode with
# "Unknown affectedTests.mode ''". v2.2.1 filters empty/whitespace
# in the extension convention so the empty case is identical to
# omitting the flag. The scenario below pins "no crash" rather than
# a specific resolved mode, because the AUTO default depends on
# whether a CI=true env var is present on the test JVM — asserting
# on a specific mode would make the scenario environment-sensitive.
# ------------------------------------------------------------------
Scenario: An empty -PaffectedTestsMode behaves like the flag is unset
Given a production class "com.example.FooService" with its matching test "com.example.FooServiceTest"
And the baseline commit is captured
And the diff modifies "src/main/java/com/example/FooService.java"
And the Gradle command-line argument "-PaffectedTestsMode="
When the affected-tests task runs with "--explain"
Then the task succeeds
And the situation is DISCOVERY_SUCCESS
# The regression we're guarding: v2.2 would surface this exact
# error message for the empty-string case. v2.2.1 must not.
And the output does not contain "Unknown affectedTests.mode ''"
And the output does not contain "Unknown affectedTests.mode"

# ------------------------------------------------------------------
# L4 — parseMode error names the AUTO fallback
#
# v2.2's error listed the four legal values only, which led
# adopters to ask "so which one is the default?". v2.2.1 adds the
# AUTO-fallback hint plus the CI=true tripwire to the message so
# the fix is self-service from the operator's terminal.
# ------------------------------------------------------------------
Scenario: An unknown -PaffectedTestsMode names AUTO and CI=true in the error
Given a production class "com.example.FooService" with its matching test "com.example.FooServiceTest"
And the baseline commit is captured
And the diff modifies "src/main/java/com/example/FooService.java"
And the Gradle command-line argument "-PaffectedTestsMode=xyzzy"
When the affected-tests task runs with "--explain"
Then the task fails
# Echo-back of the bad value keeps typo diagnosis local — no need
# to re-read the CI log to find what the template substituted.
And the output contains "Unknown affectedTests.mode 'xyzzy'"
# Legal-values list must stay (the v2.2 wording kept this right).
And the output contains "auto, local, ci, strict"
# The v2.2.1 additions: name the fallback AND the environment
# signal so an adopter understands what "just leave -P off" will
# actually do on each machine.
And the output contains "AUTO"
And the output contains "CI=true"
Loading