BUILD-11295 Gate cache-metrics on layered decision file (workflow env > pre-job file) #71
Summary
Why: Enables metrics gatekeeping via both workflow-level overrides (for explicit opt-in/out) and per-runner allow/deny lists (configured in github-runners-infra), while keeping the gate decision testable and the post-job hook synchronized.
… > pre-job file)
Replaces the simple `env.CI_METRICS_ENABLED == 'true'` gate on the
`cache-metrics-prep` step with a layered check that honours both the
workflow-level env override AND a presence-only decision file
`${CI_METRICS_DIR}/enabled` written by the runner pre-job hook
(github-runners-infra `hooks/job-started.sh`, sourcing the per-env, per-repo,
per-workflow allow/deny resolver in `hooks/decide.sh`).
New gate (one expression):
    runner.os == 'Linux' &&
    env.CI_METRICS_ENABLED != 'false' &&
    (env.CI_METRICS_ENABLED == 'true' ||
     hashFiles(format('{0}/enabled', env.CI_METRICS_DIR)) != '') &&
    steps.cache-backend.outputs.cache-backend == 's3'
Semantics:
- workflow `env.CI_METRICS_ENABLED == 'false'` → off (beats everything)
- workflow `env.CI_METRICS_ENABLED == 'true'` → on (beats the allow/deny lists)
- otherwise on iff the decision file exists at job-start
New early step `Toggle CI metrics decision file` propagates the workflow's
override to the file (touch on 'true', `rm -f` on 'false') so the post-job
`job-completed.sh` hook sees the same decision the cache action did. The
step also defaults `CI_METRICS_DIR=/tmp/ci-metrics` into $GITHUB_ENV so the
subsequent `hashFiles()` expression has a stable path even on runners that
don't pre-set the variable.
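As a rough illustration of the toggle step described above, a minimal bash sketch could look like the following. The function name `toggle_decision_file` and the `mkdir -p` call are assumptions made for this example; only the `touch` / `rm -f` behaviour and the `/tmp/ci-metrics` default come from this commit message, and the action's real step may be structured differently.

```shell
#!/usr/bin/env bash
# Hypothetical sketch of the "Toggle CI metrics decision file" step.
set -eu

toggle_decision_file() {
  # Fall back to the default directory when the runner did not pre-set it.
  local dir="${CI_METRICS_DIR:-/tmp/ci-metrics}"
  local override="${CI_METRICS_ENABLED:-}"

  # Persist the resolved directory for later steps. GITHUB_ENV is provided
  # by the Actions runner; the guard only keeps the sketch usable locally.
  if [ -n "${GITHUB_ENV:-}" ]; then
    echo "CI_METRICS_DIR=${dir}" >> "${GITHUB_ENV}"
  fi

  case "${override}" in
    true)
      # Assumption: the directory may not exist yet, so create it first.
      mkdir -p "${dir}"
      touch "${dir}/enabled"
      ;;
    false)
      rm -f "${dir}/enabled"
      ;;
    *)
      # No workflow override: leave the pre-job hook's file untouched.
      :
      ;;
  esac
}
```

With no override set, the function changes nothing on disk, so the file written (or not written) by the pre-job hook remains the only signal.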
Tests:
- new `test-cache-metrics-via-decision-file` — file-only path, no env override
- new `test-cache-metrics-workflow-false-beats-file` — env=false beats pre-touched file
- existing env=true tests unchanged (the new gate still honours `'true'`)
README documents the gate resolution order and links the workflow-env override
to the file mutation, including the 4-line fallback setup step for workflows
that don't use this action but still want to opt in / out.
Aligns with BUILD-11295 design comment 919695 (the github-runners-infra side
landed under the same branch name).
Resolve CI-metrics gate in bash (hashFiles doesn't accept absolute paths)
CI surfaced a real bug in the previous gate:
    hashFiles(format('{0}/enabled', env.CI_METRICS_DIR))
returns empty for absolute paths outside $GITHUB_WORKSPACE (well-known GHA
constraint — hashFiles only globs under the workspace root). The decision
file lives at `/tmp/ci-metrics/enabled`, so `test-cache-metrics-via-decision-file`
failed: env.CI_METRICS_ENABLED was unset, hashFiles returned empty → gate off,
metrics didn't run.
Fix: collapse the gate into a single bash step (`Resolve CI metrics gate`)
that:
- applies the layered resolution in shell (env > file presence)
- mutates the file when workflow env is set (so the post-job hook agrees)
- emits `decision=true|false` as a step output
The `cache-metrics-prep` gate becomes:
    steps.ci-metrics-gate.outputs.decision == 'true' &&
    steps.cache-backend.outputs.cache-backend == 's3'
No semantic change vs. the design: same priority order
(workflow-false > workflow-true > file-present > no-signal). The decision is
just computed in a place that can actually read `/tmp/ci-metrics/enabled`.
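The priority order above (workflow-false > workflow-true > file-present > no-signal) can be sketched in bash as follows. This is an illustrative reconstruction, not the action's actual script: the function name `resolve_ci_metrics_gate` and the `mkdir -p` call are assumptions, and the Linux-only guard is left out because it lives on the step itself.

```shell
#!/usr/bin/env bash
# Hypothetical sketch of the "Resolve CI metrics gate" logic.
set -eu

# Prints "true" or "false". When a workflow override is set, the decision
# file is mutated so the post-job hook sees the same decision.
resolve_ci_metrics_gate() {
  local dir="${CI_METRICS_DIR:-/tmp/ci-metrics}"
  local file="${dir}/enabled"
  local decision

  case "${CI_METRICS_ENABLED:-}" in
    false)
      # Workflow opt-out beats everything, including a pre-touched file.
      rm -f "${file}"
      decision=false
      ;;
    true)
      # Workflow opt-in beats the allow/deny lists.
      mkdir -p "${dir}"   # assumption: the directory may not exist yet
      touch "${file}"
      decision=true
      ;;
    *)
      # No override: on only if the pre-job hook wrote the file.
      if [ -e "${file}" ]; then decision=true; else decision=false; fi
      ;;
  esac

  printf '%s\n' "${decision}"
}

# In a workflow step the result would be appended to the step outputs:
#   echo "decision=$(resolve_ci_metrics_gate)" >> "${GITHUB_OUTPUT}"
```

Because the file test runs in the shell rather than through `hashFiles()`, it works for absolute paths such as `/tmp/ci-metrics/enabled`.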
Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
LGTM! ✅
Solid implementation. The gate logic, file mutation side-effect, and non-Linux safety are all correct. Nothing here requires changes before merge.
A few things worth understanding as you read:
- Non-Linux safety: The `runner.os == 'Linux'` guard that was previously on `Prepare local cache-metrics sub-action` is now implicit: since `ci-metrics-gate` only runs on Linux, its output is empty on other OSes, and `'' == 'true'` is false. The behaviour is identical; it's just expressed differently.
- File mutation is intentional: When `CI_METRICS_ENABLED` is set to `true` or `false`, the gate step touches/removes `${CI_METRICS_DIR}/enabled` so the post-job `job-completed.sh` hook sees the same decision. This couples the action to the hook via a shared file, which is documented in the README.
- Test path convention: Verify steps hardcode `/tmp/ci-metrics` rather than expanding `${CI_METRICS_DIR:-/tmp/ci-metrics}`, consistent with the pre-existing `test-cache-metrics-output` job (line 411). These tests assume `CI_METRICS_DIR` is unset or equals `/tmp/ci-metrics` on `github-ubuntu-latest-s`, which holds given all 26 checks are green.
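To make the test path assumption concrete, the small bash sketch below checks whether the hardcoded path and the expanded default agree in a given environment. The helper name `paths_agree` is hypothetical and is not part of the PR's test jobs.

```shell
#!/usr/bin/env bash
# Hypothetical helper illustrating the assumption behind the verify steps.
set -eu

# Returns 0 when the hardcoded path matches what the expansion would yield,
# i.e. when CI_METRICS_DIR is unset, empty, or exactly /tmp/ci-metrics.
paths_agree() {
  local hardcoded="/tmp/ci-metrics"
  local expanded="${CI_METRICS_DIR:-/tmp/ci-metrics}"
  [ "${hardcoded}" = "${expanded}" ]
}
```

If a runner exported a different `CI_METRICS_DIR`, `paths_agree` would return non-zero, which is the situation in which the hardcoded verify steps would look in the wrong place.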



Summary
Replaces the simple `env.CI_METRICS_ENABLED == 'true'` gate on the `cache-metrics-prep` step with a layered check that honours both the workflow-level env override AND a presence-only decision file `${CI_METRICS_DIR}/enabled` written by the runner pre-job hook (in `github-runners-infra`, see Validation chain below).

Resolution (a new early step `Resolve CI metrics gate` does this in bash, then exposes the result as `steps.ci-metrics-gate.outputs.decision`):
- `env.CI_METRICS_ENABLED == 'false'` → off (beats everything); also `rm -f` the decision file so the post-job hook agrees
- `env.CI_METRICS_ENABLED == 'true'` → on (beats the allow/deny lists); also `touch` the decision file
- otherwise → on iff `${CI_METRICS_DIR}/enabled` exists at job-start

New gate (one expression on `Prepare local cache-metrics sub-action`): `steps.ci-metrics-gate.outputs.decision == 'true' && steps.cache-backend.outputs.cache-backend == 's3'`

We resolve in bash rather than splitting workflow-env vs. file-presence between a `case` step and a GHA `if:` expression because GHA's `hashFiles()` only accepts paths under `$GITHUB_WORKSPACE`; it returns empty for `/tmp/ci-metrics/enabled`. Resolving in shell and emitting a step output keeps the gate one-shot and correct on every path.

Validation chain
- … `deploy-dev` label): https://github.com/SonarSource/github-runners-infra/pull/410

Test plan
- `actionlint` pass.
- `test-cache-metrics-via-decision-file` (new): file-only path, no env override. Pre-touches `${CI_METRICS_DIR}/enabled`, verifies cache-metrics emit fires.
- `test-cache-metrics-workflow-false-beats-file` (new): env=`false` beats pre-touched file; verifies the propagator `rm -f`'d the file and metrics did NOT emit.
- existing env=`true` tests (`test-cache-metrics-output`, `test-s3-cache-survives-git-clean-with-metrics`): still pass under the new gate.
- … `sonar-dev` runners: all checks green.

Links