vedanthvdev · vedanthvdev · Apr 24, 2026 · Apr 24, 2026
diff --git a/.github/workflows/ci.yml b/.github/workflows/ci.yml
@@ -20,5 +20,54 @@ jobs:
 
       - uses: gradle/actions/setup-gradle@v6
 
-      - name: Build and test
-        run: ./gradlew check
+      # Compile once up-front so the two test stages below don't each
+      # pay for a cold classpath. Splitting test execution from
+      # compilation also makes the failure signal in the GitHub UI
+      # unambiguous — a red "Unit tests" or "E2E (Cucumber) tests"
+      # step means a test actually failed, not a compile error.
+      - name: Compile
+        run: ./gradlew classes testClasses functionalTestClasses
+
+      # Unit tests: ProjectBuilder-level coverage of the decision
+      # engine, extension wiring, --explain rendering, and locale
+      # edge cases. Cheap, fast, and the first gate adopters hit
+      # when the plugin regresses on a pure-logic change.
+      - name: Unit tests
+        run: ./gradlew :affected-tests-core:test :affected-tests-gradle:test
+
+      # E2E (Cucumber + Gradle TestKit): drives real Gradle builds
+      # through the full plugin surface that security-service and
+      # other adopters consume. Each scenario in
+      # src/functionalTest/resources/.../features runs a fresh
+      # Gradle daemon, so these tests catch build-time wiring
+      # issues (Shadow relocations, plugin-publish variants,
+      # Configuration Cache compatibility) that the unit suite
+      # cannot. Surfaced as a distinct step so a red status here
+      # points straight at the adopter-facing failure mode.
+      - name: E2E (Cucumber + Gradle TestKit) tests
+        run: ./gradlew :affected-tests-gradle:functionalTest
+
+      # Assemble the shadow JAR and exercise any remaining `check`
+      # hooks that neither test task above owns. Re-excluding test
+      # and functionalTest here keeps the step honest: if it fails,
+      # the cause is packaging, not test results.
+      - name: Full build (shadow JAR + remaining check tasks)
+        run: ./gradlew build -x test -x functionalTest
+
+      # Upload test reports on any failure above so a failing run
+      # is debuggable without rerunning with --info locally. The
+      # Cucumber HTML report in particular is the fastest way to
+      # read an e2e failure — the console log for a multi-scenario
+      # failure is dense.
+      - name: Upload test reports
+        if: failure()
+        uses: actions/upload-artifact@v4
+        with:
+          name: test-reports
+          path: |
+            affected-tests-gradle/build/reports/tests/
+            affected-tests-gradle/build/test-results/
+            affected-tests-core/build/reports/tests/
+            affected-tests-core/build/test-results/
+          if-no-files-found: ignore
+          retention-days: 14
diff --git a/.release-version b/.release-version
@@ -0,0 +1 @@
+2.2.1
diff --git a/CHANGELOG.md b/CHANGELOG.md
@@ -4,14 +4,137 @@ All notable changes to this plugin are documented here. The format follows
 [Keep a Changelog](https://keepachangelog.com/en/1.1.0/), and this project
 adheres to [Semantic Versioning](https://semver.org/).
 
-## [Unreleased]
+## [v2.2.1] — code-review follow-ups after the v2.2 ship
+
+v2.2.1 is a **non-breaking patch** that closes the Medium + Low
+findings a structured code review raised against the v2.2
+changeset. None of them were adopter-reported regressions — v2.2 is
+green end-to-end on the `security-service` pilot — but each one
+represents a real failure mode for a second adopter who doesn't
+ship in exactly the same posture. Point-releasing them now keeps
+the v2.3 tech-plan window open for the larger feature work
+(non-conventional test task names, explain-trace truncation
+unification) without delaying these fixes behind it.
+
+If you're on v2.2.0 today you can bump to v2.2.1 without touching
+`build.gradle`. The only observable change on a green path is that
+enabling `org.gradle.configuration-cache` now works. The only
+observable changes on the unhappy path are sharper error messages
+and a new hint branch for the `onDiscoveryIncomplete = 'skipped'`
+opt-in.
+
+### Fixed — `org.gradle.configuration-cache` no longer crashes the plugin (M1)
+
+The v2.2 dispatch-dependency Callable closed over a `Project`
+reference to resolve each subproject's `testClasses` task. `Project`
+is explicitly documented as **not** configuration-cache-serialisable,
+so a repo with `org.gradle.configuration-cache=true` in
+`gradle.properties` would fail to serialise the task graph with:
+
+```
+Task `:affectedTest` of type `...AffectedTestTask`: cannot serialize
+object of type Project
+```
+
+The fix resolves each subproject's `testClasses` task once, eagerly,
+into a `TaskProvider<?>` (which IS CC-serialisable) and closes the
+Callable over that plus the task's own `Property<Boolean>` for the
+`--explain` gate. Cache-hit rate and task-graph shape are identical
+to v2.2 — only the serialisation surface changed. A new functional
+scenario runs `./gradlew affectedTest --configuration-cache
+--explain` on a SELECTED-shaped diff and pins "Configuration cache
+entry …" in the output so the next edit that accidentally re-captures
+`Project` breaks a test instead of only an adopter's CI.
+
+Adopters who don't enable CC see no change.
+
+### Fixed — `Action.SKIPPED` hint no longer calls the skip "safe" (M3)
+
+v2.2's `DISCOVERY_INCOMPLETE` hint was a two-way branch: SELECTED
+warned about the partial-selection risk; everything else printed "the
+resolved action above is the safe fallback". That "everything else"
+bucket silently absorbed `Action.SKIPPED` — which an operator can
+opt into explicitly via `onDiscoveryIncomplete = 'skipped'`. Calling
+SKIPPED "safe" is precisely inverted advice: the run executed zero
+tests on a partial-parse diff.
+
+v2.2.1 gives SKIPPED its own hint branch:
+
+```
+Hint:            one or more Java files in the diff failed to parse,
+                 so discovery ran with missing inputs.
+                 onDiscoveryIncomplete = 'skipped' meant no tests ran
+                 for this diff — fix the parse error to restore
+                 coverage, or set onDiscoveryIncomplete = 'full_suite'
+                 if silently skipping a partial-parse diff is not the
+                 intended policy.
+```
+
+The hint names the exact knob that was set (so a second reader of the
+trace can locate the override) and names the escape (`'full_suite'`)
+without ever calling the current state "safe". SELECTED and
+FULL_SUITE wording are unchanged.
+
+### Fixed — `-PaffectedTestsMode=` (empty) no longer crashes (L1)
+
+CI templates that unconditionally emit `-PaffectedTestsMode=$MODE`
+with an unset `$MODE` passed the literal empty string into the
+extension's convention, and v2.2 rejected it with
+`Unknown affectedTests.mode ''`. v2.2.1 filters empty/whitespace in
+the convention chain so the empty case is identical to omitting the
+flag — the AUTO default keeps working, and the "unknown mode" error
+stays reserved for actual typos.
+
+### Fixed — unknown-mode error names the AUTO fallback (L4)
+
+v2.2's `parseMode` error listed the four legal values only, which
+led adopters to ask "so which one is the default?". v2.2.1 adds the
+AUTO-fallback hint plus the `CI=true` tripwire that AUTO keys off:
+
+```
+Unknown affectedTests.mode 'xyzzy'. Expected one of: auto, local,
+ci, strict (omit the value or leave -PaffectedTestsMode unset to
+keep the AUTO default, which picks CI when CI=true is exported).
+```
+
+### Changed — explain-trace module-block contract now fails loudly on misuse (L2)
+
+`AffectedTestTask#appendModulesBlock` used to silently re-normalise
+map keys that were missing a leading colon, papering over any caller
+that forgot to route their module paths through
+`AffectedTestTask#testTaskPath`. A future edit that fed raw
+`"api:test"` strings into the map produced a split trace (`":api:test"`
+next to `"core:test"`). The helper now asserts the contract up-front,
+so the failure surface moves into the test suite.
+
+The assertion is plain `assert`, active on every Gradle test JVM by
+default and a no-op on production runs — the public operator-facing
+trace output is unchanged.
+
+_Changelog housekeeping: the v2.2.0 section below has been backfilled
+with items that were in the v2.2.0 ship but were still sitting under
+`[Unreleased]` in the CHANGELOG when the tag was cut._
+
+## [v2.2.0] — adoption-feedback polish from the security-service pilot
+
+v2.2.0 is a **non-breaking** release driven entirely by what a second
+real-world adopter (Modulr's `security-service`, CAR-5190) ran into
+while plugging v2.1 in. Every DSL knob, every default, and every
+resolved behaviour from v2.1 keeps working bit-for-bit; v2.2 only
+sharpens the edges an operator touches when something goes wrong
+or when they want to A/B a mode from CI.
+
+If you're on v2.1 today you can bump to v2.2 without touching
+`build.gradle`. The only observable difference on a green path is a
+faster `--explain` run (Bug A) and a per-module breakdown in that
+trace (Polish E). The only observable difference on the unhappy path
+is clearer hint wording (Bug B) and — in LOCAL mode specifically — a
+loud `WARN` when discovery can't parse part of the diff (Risk C).
 
 ### Changed — internal refinements from the v2.2 code review
 
-Follow-up polish on the v2.2 adoption-feedback release, all
-non-breaking and none of them visible in build scripts. Captured here
-rather than cut as a point release because the operator-facing
-contract is identical to v2.2.0 on every green path.
+Follow-up polish on the v2.2 adoption-feedback changeset, all
+non-breaking and none of them visible in build scripts.
 
 * **Risk C WARN no longer fires with "0 test classes".** The engine
   rewrites `SELECTED` with nothing to dispatch into `skipped=true`;
@@ -45,22 +168,6 @@ contract is identical to v2.2.0 on every green path.
   root — catches a regression where someone pins the Callable wiring
   to the root project instead of iterating via `allprojects { ... }`.
 
-## [v2.2.0] — adoption-feedback polish from the security-service pilot
-
-v2.2.0 is a **non-breaking** release driven entirely by what a second
-real-world adopter (Modulr's `security-service`, CAR-5190) ran into
-while plugging v2.1 in. Every DSL knob, every default, and every
-resolved behaviour from v2.1 keeps working bit-for-bit; v2.2 only
-sharpens the edges an operator touches when something goes wrong
-or when they want to A/B a mode from CI.
-
-If you're on v2.1 today you can bump to v2.2 without touching
-`build.gradle`. The only observable difference on a green path is a
-faster `--explain` run (Bug A) and a per-module breakdown in that
-trace (Polish E). The only observable difference on the unhappy path
-is clearer hint wording (Bug B) and — in LOCAL mode specifically — a
-loud `WARN` when discovery can't parse part of the diff (Risk C).
-
 ### Fixed — `--explain` no longer forces a full compile (Bug A)
 
 Pre-v2.2 the `affectedTest` task eagerly depended on `testClasses` in

diff --git a/...d-tests-gradle/src/functionalTest/java/io/affectedtests/gradle/e2e/steps/CommonSteps.java b/...d-tests-gradle/src/functionalTest/java/io/affectedtests/gradle/e2e/steps/CommonSteps.java
@@ -313,6 +313,19 @@ public void theTaskFailsAtConfigurationTime() {
                 "Failure must come from configuration / evaluation phase, got:\n" + world.project().lastOutput());
     }
 
+    @Then("the task fails")
+    public void theTaskFails() {
+        // Generic failure assertion for scenarios where the failure
+        // phase is uninteresting — e.g. an unknown `-P` value that
+        // escapes the extension convention and is only rejected at
+        // task-action time via {@code parseMode}. Asserting on the
+        // lifecycle phase would tie the test to an implementation
+        // detail (exactly where the enum lookup fires), so this
+        // step only pins "the build did not turn green".
+        assertTrue(world.project().lastBuildFailed(),
+                "Expected a failing build, got green:\n" + world.project().lastOutput());
+    }
+
     @Then("the situation is {word}")
     public void theSituationIs(String situation) {
         // The --explain trace prints the ACTUAL situation on the

diff --git a/...nalTest/resources/io/affectedtests/gradle/e2e/features/07-v2.2.1-review-followups.feature b/...nalTest/resources/io/affectedtests/gradle/e2e/features/07-v2.2.1-review-followups.feature
@@ -0,0 +1,134 @@
+Feature: v2.2.1 — code-review follow-ups from the v2.2 ship
+  Scenarios for the Medium + Low findings raised by the v2.2 code
+  review: Configuration Cache compatibility (M1), Action.SKIPPED
+  hint wording (M3), empty-string `-PaffectedTestsMode` coercion
+  (L1), and the AUTO-fallback error wording on an unknown `-P`
+  value (L4). Every scenario here pins an operator-facing contract
+  the v2.2 release shipped slightly short on — if any of these
+  scenarios regress, a real adopter notices.
+
+  Background:
+    Given a freshly initialised project with a committed baseline
+
+  # ------------------------------------------------------------------
+  # M1 — Configuration Cache compatibility
+  #
+  # The v2.2 Callable for the Bug A `testClasses` dependency captured
+  # `Project p` so a CC-enabled adopter would fail to serialise the
+  # task graph ("cannot serialize object of type Project"). v2.2.1
+  # fixed it by capturing `TaskProvider<?>` and `Property<Boolean>`
+  # eagerly — both CC-serialisable. Pin it with a live
+  # `--configuration-cache` run on the same SELECTED shape adopters
+  # use daily: if the Callable ever regresses to capturing Project
+  # again, this scenario turns red.
+  # ------------------------------------------------------------------
+  Scenario: --configuration-cache stores and reuses the task graph without Project capture
+    Given a production class "com.example.FooService" with its matching test "com.example.FooServiceTest"
+    And the baseline commit is captured
+    And the diff modifies "src/main/java/com/example/FooService.java"
+    And the Gradle command-line argument "--configuration-cache"
+    When the affected-tests task runs with "--explain"
+    Then the task succeeds
+    # Gradle logs the CC-store event on the first run — no store line
+    # means CC never engaged (e.g. because a problem was detected and
+    # CC quietly degraded). The exact phrase changed between 8.x and
+    # 9.x; match the stable prefix "Configuration cache entry" which
+    # both minor lines emit.
+    And the output contains "Configuration cache entry"
+    # A Project-capture regression surfaces as this specific message
+    # — pin the negative so a future edit that re-introduces it can't
+    # hide behind a green "entry stored" line.
+    And the output does not contain "cannot be serialized"
+    And the output does not contain "cannot serialize"
+
+  # ------------------------------------------------------------------
+  # M3 — Action.SKIPPED hint no longer calls the skip "safe"
+  #
+  # v2.2 routed Action.SKIPPED through the FULL_SUITE hint branch,
+  # which said "the resolved action above is the safe fallback".
+  # Calling SKIPPED "safe" is precisely inverted advice: SKIPPED ran
+  # zero tests on a partial-parse diff. v2.2.1 gives SKIPPED its own
+  # hint branch that names the opt-in knob and recommends
+  # `'full_suite'` if the skip wasn't intentional.
+  # ------------------------------------------------------------------
+  Scenario: DISCOVERY_INCOMPLETE on SKIPPED names the opt-in and offers an escape
+    # Requires the opt-in: LOCAL mode with
+    # `onDiscoveryIncomplete = 'skipped'` is the only path that lands
+    # Action.SKIPPED on a parse-failure diff. CI/STRICT escalate
+    # (their paired scenario is in feature 06).
+    Given the affected-tests DSL contains:
+      """
+      mode = 'local'
+      onDiscoveryIncomplete = 'skipped'
+      """
+    And a canned diff that produces the DISCOVERY_INCOMPLETE situation
+    When the affected-tests task runs with "--explain"
+    Then the task succeeds
+    And the situation is DISCOVERY_INCOMPLETE
+    And the action is SKIPPED
+    And the output contains "failed to parse"
+    # Hint must name both the user-set knob AND the escape hatch so
+    # an operator reading the trace can tell whether the skip was
+    # deliberate and, if not, exactly which value to flip.
+    And the output contains "onDiscoveryIncomplete = 'skipped'"
+    And the output contains "onDiscoveryIncomplete = 'full_suite'"
+    # The FULL_SUITE branch's "safe fallback" wording must NOT leak
+    # into the SKIPPED branch — that claim is factually wrong when
+    # zero tests ran.
+    And the output does not contain "safe fallback"
+    # Partial-selection wording belongs to the SELECTED branch only;
+    # SKIPPED ran nothing, so the selection is not partial — it's
+    # non-existent.
+    And the output does not contain "selection is necessarily partial"
+
+  # ------------------------------------------------------------------
+  # L1 — empty `-PaffectedTestsMode` coerces to absent
+  #
+  # A CI template that unconditionally emits
+  # `-PaffectedTestsMode=$MODE` with an unset $MODE would send the
+  # literal empty string through. v2.2 crashed parseMode with
+  # "Unknown affectedTests.mode ''". v2.2.1 filters empty/whitespace
+  # in the extension convention so the empty case is identical to
+  # omitting the flag. The scenario below pins "no crash" rather than
+  # a specific resolved mode, because the AUTO default depends on
+  # whether a CI=true env var is present on the test JVM — asserting
+  # on a specific mode would make the scenario environment-sensitive.
+  # ------------------------------------------------------------------
+  Scenario: An empty -PaffectedTestsMode behaves like the flag is unset
+    Given a production class "com.example.FooService" with its matching test "com.example.FooServiceTest"
+    And the baseline commit is captured
+    And the diff modifies "src/main/java/com/example/FooService.java"
+    And the Gradle command-line argument "-PaffectedTestsMode="
+    When the affected-tests task runs with "--explain"
+    Then the task succeeds
+    And the situation is DISCOVERY_SUCCESS
+    # The regression we're guarding: v2.2 would surface this exact
+    # error message for the empty-string case. v2.2.1 must not.
+    And the output does not contain "Unknown affectedTests.mode ''"
+    And the output does not contain "Unknown affectedTests.mode"
+
+  # ------------------------------------------------------------------
+  # L4 — parseMode error names the AUTO fallback
+  #
+  # v2.2's error listed the four legal values only, which led
+  # adopters to ask "so which one is the default?". v2.2.1 adds the
+  # AUTO-fallback hint plus the CI=true tripwire to the message so
+  # the fix is self-service from the operator's terminal.
+  # ------------------------------------------------------------------
+  Scenario: An unknown -PaffectedTestsMode names AUTO and CI=true in the error
+    Given a production class "com.example.FooService" with its matching test "com.example.FooServiceTest"
+    And the baseline commit is captured
+    And the diff modifies "src/main/java/com/example/FooService.java"
+    And the Gradle command-line argument "-PaffectedTestsMode=xyzzy"
+    When the affected-tests task runs with "--explain"
+    Then the task fails
+    # Echo-back of the bad value keeps typo diagnosis local — no need
+    # to re-read the CI log to find what the template substituted.
+    And the output contains "Unknown affectedTests.mode 'xyzzy'"
+    # Legal-values list must stay (the v2.2 wording kept this right).
+    And the output contains "auto, local, ci, strict"
+    # The v2.2.1 additions: name the fallback AND the environment
+    # signal so an adopter understands what "just leave -P off" will
+    # actually do on each machine.
+    And the output contains "AUTO"
+    And the output contains "CI=true"