BUILD-11294 Add cache-size-bytes output and pipeline runtime metrics JSON#62
Conversation
6aafd88 to
2000f2a
Compare
|
8797eff to
d15f7f3
Compare
Initial Claude Code guidance file produced via /init. Documents the build/test commands, the critical ncc-bundle commit rule, the composite-action + sub-action architecture, the credential-isolation rationale tied to each regression test job, the backend & cache-key resolution chain, and the pinned tooling versions. Points at README.md and action.yml as the source of truth for user-facing inputs to avoid duplication.
173cd91 to
1d3db98
Compare
Summary
This PR adds cache metrics collection to support BUILD-11068's pipeline-runtime-metrics workstream. It introduces: 4 new top-level outputs: Optional JSON record emission: When New sub-action: Zero behavioral changes for users who don't set What reviewers should knowStart with: Key architecture decisions:
Test coverage: 22 new Vitest cases in Things to verify:
|
3ae9aea to
a8cc7b6
Compare
a8cc7b6 to
712fb29
Compare
712fb29 to
a65e68b
Compare
a65e68b to
e2439c6
Compare
e2439c6 to
337e669
Compare
337e669 to
dfe936e
Compare
jayadeep-km-sonarsource
left a comment
There was a problem hiding this comment.
The approach looks good to me. I didn't fully understand the cache size calculation part. It would be nice if we can add some tests to ensure that the calculation is correct
dfe936e to
04c9412
Compare
…step}.json Add a new `cache-metrics` Node 20 sub-action (main + post) that captures cache size and save status for every invocation of `gh-action_cache`, feeding the pipeline-runtime-metrics workstream (BUILD-11068 / job-completed.sh, BUILD-11291). ## Action outputs (top-level) - `cache-hit` (unchanged): exact match on primary key. - `cache-matched-key` (new): mirrors `actions/cache`'s output — primary key on exact hit, restore key on partial, empty on miss. - `restore-key-hit` (new): the prefix-matched restore key, set ONLY on partial hits (empty on exact hit and full miss). - `backend` (new): which backend was used (`github` or `s3`). - `cache-size-bytes` (new): size of cached path(s) at restore-time (Linux only). ## Per-invocation JSON record (Linux only) `/tmp/ci-metrics/cache-${step}.json` is written by the cache-metrics sub-action where `${step}` is the slugified id of the calling step. Schema covers four scenarios (see Jira BUILD-11294 comment 914885 + README): | scenario | cache-hit | restore-key-hit | saved | restored size | end-of-job size | |----------------|-----------|------------------|-------|---------------|-----------------| | exact hit | true | null | false | restored | path at end | | partial hit | false | <restore-key> | true | older partial | new (saved) cache| | no match | false | null | true | 0 | new (saved) cache| | lookup-only | - | null | false | 0 | path at end | The cache action skips save on an exact hit (cache keys are content-immutable in actions/cache and runs-on/cache), so `saved=false` even though `size-bytes-at-end` may differ from `size-bytes-restored` when the user modified the cached path during the job. ## Architecture - New `cache-metrics/` Node 20 sub-action with main + post. - Top-level composite invokes it on Linux runners only, after the last restore branch, so both the GitHub and S3 backends are covered. - Sub-action is referenced via `./.actions/cache-metrics` symlinked at job-start to `\$GITHUB_ACTION_PATH/cache-metrics`. The symlink is `git add -f`'d so it survives `git clean -ffdx` (which previously broke similar workspace-copy patterns — see PR #39). Survives both clean and post-step re-resolution. - Step fails open: any measurement error surfaces as `core.warning`, never breaks the cache flow. ## Tests - 22 new Vitest cases in `__tests__/cache-metrics.test.ts` covering helpers, main, post, all four scenarios. - New CI assertion job `test-cache-metrics-output` validates the JSON file is written and the `cache-size-bytes` output is set end-to-end. - All existing regression tests still pass (credential isolation, multi-cache invocations, Windows, preset AWS profile, git-clean survival). ## Misc - `.pre-commit-config.yaml`: extended trailing-whitespace / EOF / check-added-large-files excludes to cover all `*/dist/` directories (ncc bundles aren't formatted by us). - README: new Outputs rows, Pipeline runtime metrics section with two example JSONs (Scenario 1 exact hit + Scenario 2 partial restore) and a field-semantics table.
04c9412 to
7c29e3c
Compare
|
|
There was a problem hiding this comment.
LGTM! ✅
The only change in this incremental diff is a comment simplification in the test workflow — removing the reference to "runner pod template / WarpBuild AMI" from the CI_METRICS_ENABLED env var comment. This is a cosmetic-only change with no functional impact. The simplified comment is accurate for the test context (the test explicitly opts in), while the authoritative infrastructure documentation lives in action.yml.
… step
Drop the dedicated `metrics-flag` shell step and check `env.CI_METRICS_ENABLED == 'true'` directly in the
`cache-metrics-prep` step's `if:` expression. This:
* Removes the bash 3.2 incompatibility that crashed macOS runners with `bad substitution` on `${RAW,,}`
(observed in SonarSource/sonar-dummy#592 Build macOS, run 26091741374), which also cascaded into a
template-validity error for downstream `build-maven` `if:` expressions.
* Removes the case-insensitive variant matching. Accepting only the literal `'true'` matches the convention
of the existing env vars in this action (`CACHE_BACKEND`, `CACHE_IMPORT_GITHUB`) and avoids the GHA-
expression equivalent of bash 4-only `${VAR,,}` (no `tolower` in GHA expressions).
* Eliminates the bash step entirely, so there's no portability surface to worry about across runner
platforms.
The `cache-metrics-prep` step's combined `if: runner.os == 'Linux' && env.CI_METRICS_ENABLED == 'true'`
keeps the Linux-only scope. The downstream `cache-metrics` step continues to chain on
`steps.cache-metrics-prep.outputs.prepared == 'true'`, so when the flag is unset or non-`true` the entire
metrics flow is skipped without warning, matching the pre-#62 behaviour for non-pilot consumers.
README updated to document the strict `true` requirement instead of "case-insensitive".
Surfaced during the BUILD-11297 pilot rollout on sonar-dummy.
… step
Drop the dedicated `metrics-flag` shell step and check `env.CI_METRICS_ENABLED == 'true'` directly in the
`cache-metrics-prep` step's `if:` expression. This:
* Removes the bash 3.2 incompatibility that crashed macOS runners with `bad substitution` on `${RAW,,}`
(observed in SonarSource/sonar-dummy#592 Build macOS, run 26091741374), which also cascaded into a
template-validity error for downstream `build-maven` `if:` expressions.
* Removes the case-insensitive variant matching. Accepting only the literal `'true'` matches the convention
of the existing env vars in this action (`CACHE_BACKEND`, `CACHE_IMPORT_GITHUB`) and avoids the GHA-
expression equivalent of bash 4-only `${VAR,,}` (no `tolower` in GHA expressions).
* Eliminates the bash step entirely, so there's no portability surface to worry about across runner
platforms.
The `cache-metrics-prep` step's combined `if: runner.os == 'Linux' && env.CI_METRICS_ENABLED == 'true'`
keeps the Linux-only scope. The downstream `cache-metrics` step continues to chain on
`steps.cache-metrics-prep.outputs.prepared == 'true'`, so when the flag is unset or non-`true` the entire
metrics flow is skipped without warning, matching the pre-#62 behaviour for non-pilot consumers.
README updated to document the strict `true` requirement instead of "case-insensitive".
Surfaced during the BUILD-11297 pilot rollout on sonar-dummy.
… step
Drop the dedicated `metrics-flag` shell step and check `env.CI_METRICS_ENABLED == 'true'` directly in the
`cache-metrics-prep` step's `if:` expression. This:
* Removes the bash 3.2 incompatibility that crashed macOS runners with `bad substitution` on `${RAW,,}`
(observed in SonarSource/sonar-dummy#592 Build macOS, run 26091741374), which also cascaded into a
template-validity error for downstream `build-maven` `if:` expressions.
* Removes the case-insensitive variant matching. Accepting only the literal `'true'` matches the convention
of the existing env vars in this action (`CACHE_BACKEND`, `CACHE_IMPORT_GITHUB`) and avoids the GHA-
expression equivalent of bash 4-only `${VAR,,}` (no `tolower` in GHA expressions).
* Eliminates the bash step entirely, so there's no portability surface to worry about across runner
platforms.
The `cache-metrics-prep` step's combined `if: runner.os == 'Linux' && env.CI_METRICS_ENABLED == 'true'`
keeps the Linux-only scope. The downstream `cache-metrics` step continues to chain on
`steps.cache-metrics-prep.outputs.prepared == 'true'`, so when the flag is unset or non-`true` the entire
metrics flow is skipped without warning, matching the pre-#62 behaviour for non-pilot consumers.
README updated to document the strict `true` requirement instead of "case-insensitive".
Surfaced during the BUILD-11297 pilot rollout on sonar-dummy.
… step
Drop the dedicated `metrics-flag` shell step and check `env.CI_METRICS_ENABLED == 'true'` directly in the
`cache-metrics-prep` step's `if:` expression. This:
* Removes the bash 3.2 incompatibility that crashed macOS runners with `bad substitution` on `${RAW,,}`
(observed in SonarSource/sonar-dummy#592 Build macOS, run 26091741374), which also cascaded into a
template-validity error for downstream `build-maven` `if:` expressions.
* Removes the case-insensitive variant matching. Accepting only the literal `'true'` matches the convention
of the existing env vars in this action (`CACHE_BACKEND`, `CACHE_IMPORT_GITHUB`) and avoids the GHA-
expression equivalent of bash 4-only `${VAR,,}` (no `tolower` in GHA expressions).
* Eliminates the bash step entirely, so there's no portability surface to worry about across runner
platforms.
The `cache-metrics-prep` step's combined `if: runner.os == 'Linux' && env.CI_METRICS_ENABLED == 'true'`
keeps the Linux-only scope. The downstream `cache-metrics` step continues to chain on
`steps.cache-metrics-prep.outputs.prepared == 'true'`, so when the flag is unset or non-`true` the entire
metrics flow is skipped without warning, matching the pre-#62 behaviour for non-pilot consumers.
README updated to document the strict `true` requirement instead of "case-insensitive".
Surfaced during the BUILD-11297 pilot rollout on sonar-dummy.



Summary
Adds the building block that the BUILD-11068 pipeline-runtime-metrics workstream needs from
gh-action_cache: an opt-in per-invocation JSON record at${CI_METRICS_DIR}/cache-${step}.json(defaults to/tmp/ci-metrics; emission gated byCI_METRICS_ENABLED=true) and four new top-level outputs surfacing the cache state to downstream workflow steps. The M1.3job-completed.shhook is the direct consumer of the JSON files.New top-level outputs
cache-hit(unchanged)cache-matched-key(new)actions/cache's output — primary key on exact hit, restore key on partial, empty on miss.restore-key-hit(new)cache-matched-key.backend(new)githubors3).cache-size-bytes(new)JSON record schema and the four scenarios
${CI_METRICS_DIR}/cache-${step}.json(defaults to/tmp/ci-metrics; only emitted whenCI_METRICS_ENABLED=true) is written by the newcache-metricssub-action where${step}is the slugifiedgithub.actionof the calling step. Multiple cache calls per job produce distinct files.cache-hitrestore-key-hitsavedsize-bytes-restoredsize-bytes-at-endtruenullfalsefalse<the matched restore key>truefalsenulltrue0lookup-only: truetrueorfalse)nullfalse0savedisfalsewhenever the underlying cache action skips the post-step save — i.e. on exact hits (cache keys are content-immutable inactions/cache/runs-on/cache) or whenlookup-onlyis set. Full field semantics + examples are inREADME.mdand in the Jira ticket comment 914885.Architecture
cache-metrics/Node 20 sub-action withmain+post(mirrors thecredential-guardpattern at the file-layout level; lives next tocredential-setup/andcredential-guard/).Enforce fail-on-cache-miss→ cache-metrics →Credential guard). LIFO post order on the S3 path: credential-guard (restore AWS creds) → cache-metrics (measure pre-save size + update JSON) → s3-cache (save to S3). cache-metrics never needs AWS creds../.actions/cache-metrics, symlinked at job-start to$GITHUB_ACTION_PATH/cache-metrics. The symlink isgit add -f'd so it survivesgit clean -ffdx(which previously broke the workspace-copy pattern — see #39). Works for bothuses: ./(dev tests in this repo) and external consumers usingSonarSource/gh-action_cache@v1. Survives the test-s3-cache-survives-git-clean regression.core.warning, never breaks the cache flow.Behavioral changes
CI_METRICS_ENABLED=true): writes${CI_METRICS_DIR}/cache-${step}.json(directory defaults to/tmp/ci-metrics; created defensively if absent). When the flag is unset/empty/any-other-value, no metrics step runs andcache-size-bytesis empty..actions/cache-metricssymlink in$GITHUB_WORKSPACEandgit add -f's it. Ephemeral; never persisted to a commit.Tests
__tests__/cache-metrics.test.tscovering helpers, main, post, and all four scenarios. Repo total: 45 cases pass.test-cache-metrics-outputvalidates thatcache-size-bytesis set and${CI_METRICS_DIR}/cache-*.jsonis written with the expected schema.test-github-cache,test-s3-cache*(multi-invocations, with-fallback, with-credential-interference, with-preset-aws-config, survives-git-clean, Windows),tests-credential-isolation(sonar-dummy downstream — exercises real consumer pattern),pre-commit, SonarCloud / SonarCloud Code Analysis.Test plan
npm test— 45 Vitest cases pass (22 new incache-metrics.test.ts)npm run build— all five ncc bundles producedpre-commit run --all-files— cleantest-cache-metrics-outputvalidates the JSON + thecache-size-bytesoutputrunner.os == 'Linux'skip (output is empty on Windows, no JSON written)Related
job-completed.shingests${CI_METRICS_DIR}/cache-*.jsonNote for the reviewer
The ticket was originally scoped as 2 SP (S) for "compute
du -sbonce on the cached path." Scope expanded during implementation to capture the post-step size +savedflag, plus the additional top-level outputs (cache-matched-key,restore-key-hit,backend) for downstream-step convenience. Ticket re-sized to 5 SP (M) in Jira.The PR was originally two commits (BUILD-11294 + the unrelated
Add CLAUDE.mdproduced by/init); the BUILD-11294 work was squashed into a single commit for clean review.