Skip to content

BUILD-11295 Gate cache-metrics on layered decision file (workflow env > pre-job file)#71

Merged
julien-carsique-sonarsource merged 1 commit into
masterfrom
feat/jcarsique/BUILD-11295-metricsFeatureFlag
May 22, 2026
Merged

BUILD-11295 Gate cache-metrics on layered decision file (workflow env > pre-job file)#71
julien-carsique-sonarsource merged 1 commit into
masterfrom
feat/jcarsique/BUILD-11295-metricsFeatureFlag

Conversation

@julien-carsique-sonarsource
Copy link
Copy Markdown
Contributor

@julien-carsique-sonarsource julien-carsique-sonarsource commented May 21, 2026

Summary

Replaces the simple env.CI_METRICS_ENABLED == 'true' gate on the cache-metrics-prep step with a layered check that honours both the workflow-level env override AND a presence-only decision file ${CI_METRICS_DIR}/enabled written by the runner pre-job hook (in github-runners-infra, see Validation chain below).

Resolution (a new early step Resolve CI metrics gate does this in bash, then exposes the result as steps.ci-metrics-gate.outputs.decision):

  • workflow env.CI_METRICS_ENABLED == 'false' → off (beats everything) — also rm -f the decision file so the post-job hook agrees
  • workflow env.CI_METRICS_ENABLED == 'true' → on (beats the allow/deny lists) — also touch the decision file
  • otherwise on iff ${CI_METRICS_DIR}/enabled exists at job-start

New gate (one expression on Prepare local cache-metrics sub-action):

if: >-
  steps.ci-metrics-gate.outputs.decision == 'true' &&
  steps.cache-backend.outputs.cache-backend == 's3'

We resolve in bash rather than splitting workflow-env vs. file-presence between a case step and a GHA if: expression because GHA's hashFiles() only accepts paths under $GITHUB_WORKSPACE — it returns empty for /tmp/ci-metrics/enabled. Resolving in shell and emitting a step output keeps the gate one-shot and correct on every path.

Validation chain

Test plan

  • YAML lint + actionlint pass.
  • test-cache-metrics-via-decision-file (new) — file-only path, no env override. Pre-touches ${CI_METRICS_DIR}/enabled, verifies cache-metrics emit fires.
  • test-cache-metrics-workflow-false-beats-file (new) — env=false beats pre-touched file; verifies the propagator rm -f'd the file and metrics did NOT emit.
  • Existing env=true tests (test-cache-metrics-output, test-s3-cache-survives-git-clean-with-metrics) — still pass under the new gate.
  • All 26 CI checks green.
  • End-to-end validation through SonarSource/sonar-dummy#592 → BUILD-11295 test gh-action_cache@feat/.../BUILD-11295-metricsFeatureFlag ci-github-actions#267 → this PR, on sonar-dev runners — all checks green.

Links

@sonar-review-alpha
Copy link
Copy Markdown

sonar-review-alpha Bot commented May 21, 2026

Summary

This PR replaces a simple CI_METRICS_ENABLED == 'true' gate with a layered decision system that respects both workflow-level environment overrides and a runner-side signal file.

What changed:

  • New Resolve CI metrics gate step in action.yml (bash) that evaluates three paths in priority order:
    1. CI_METRICS_ENABLED='true' → metrics on (and touch decision file for post-job hook)
    2. CI_METRICS_ENABLED='false' → metrics off (and remove decision file)
    3. Otherwise → metrics on iff ${CI_METRICS_DIR}/enabled exists (written by runner pre-job hook)
  • The gate step outputs decision (true/false), replacing the inline env check in the Prepare local cache-metrics sub-action conditional
  • Two new test jobs covering the file-only path and the workflow-env-override-beats-file path
  • README documentation of the three-tier resolution strategy for operators and workflow authors

Why: Enables metrics gatekeeping via both workflow-level overrides (for explicit opt-in/out) and per-runner allow/deny lists (configured in github-runners-infra), while keeping the gate decision testable and the post-job hook synchronized.

What reviewers should know

Key files to review:

  1. action.yml (lines 247–283): The new Resolve CI metrics gate step is the heart of the change. It's straightforward bash with clear case logic—read this first to understand the three paths.

  2. README.md (lines 192–209): "Gate resolution" section documents the priority order and use cases. This is reference material for understanding why the step does what it does.

  3. test-action.yml (two new jobs):

    • test-cache-metrics-via-decision-file: Exercises the file-only path (step outputs should be set, JSON written).
    • test-cache-metrics-workflow-false-beats-file: Exercises workflow opt-out, verifies the file is cleaned up and metrics did not emit.

Non-obvious details:

  • The step mutates ${CI_METRICS_DIR}/enabled (touch on true, rm on false) so the post-job hook in github-runners-infra sees the same decision. This is important for audit trails.
  • The step only runs on Linux (if: runner.os == 'Linux'), matching the existing restriction on metrics collection.
  • The gate conditional changed from env.CI_METRICS_ENABLED == 'true' to steps.ci-metrics-gate.outputs.decision == 'true'—no additional S3/backend checks were added (those remain).

Testing notes: Existing tests for CI_METRICS_ENABLED=true should still pass (the author confirms this). The two new tests specifically validate the new decision paths.


  • Generate Walkthrough
  • Generate Diagram

🗣️ Give feedback

sonar-review-alpha[bot]

This comment was marked as outdated.

… > pre-job file)

Replaces the simple `env.CI_METRICS_ENABLED == 'true'` gate on the
`cache-metrics-prep` step with a layered check that honours both the
workflow-level env override AND a presence-only decision file
`${CI_METRICS_DIR}/enabled` written by the runner pre-job hook
(github-runners-infra `hooks/job-started.sh`, sourcing the per-env, per-repo,
per-workflow allow/deny resolver in `hooks/decide.sh`).

New gate (one expression):

  runner.os == 'Linux' &&
  env.CI_METRICS_ENABLED != 'false' &&
  (env.CI_METRICS_ENABLED == 'true' ||
   hashFiles(format('{0}/enabled', env.CI_METRICS_DIR)) != '') &&
  steps.cache-backend.outputs.cache-backend == 's3'

Semantics:
  - workflow `env.CI_METRICS_ENABLED == 'false'` → off (beats everything)
  - workflow `env.CI_METRICS_ENABLED == 'true'`  → on  (beats the allow/deny lists)
  - otherwise on iff the decision file exists at job-start

New early step `Toggle CI metrics decision file` propagates the workflow's
override to the file (touch on 'true', `rm -f` on 'false') so the post-job
`job-completed.sh` hook sees the same decision the cache action did. The
step also defaults `CI_METRICS_DIR=/tmp/ci-metrics` into $GITHUB_ENV so the
subsequent `hashFiles()` expression has a stable path even on runners that
don't pre-set the variable.

Tests:
  - new `test-cache-metrics-via-decision-file` — file-only path, no env override
  - new `test-cache-metrics-workflow-false-beats-file` — env=false beats pre-touched file
  - existing env=true tests unchanged (the new gate still honours `'true'`)

README documents the gate resolution order and links the workflow-env override
to the file mutation, including the 4-line fallback setup step for workflows
that don't use this action but still want to opt in / out.

Aligns with BUILD-11295 design comment 919695 (the github-runners-infra side
landed under the same branch name).

Resolve CI-metrics gate in bash (hashFiles doesn't accept absolute paths)

CI surfaced a real bug in the previous gate:

  hashFiles(format('{0}/enabled', env.CI_METRICS_DIR))

returns empty for absolute paths outside $GITHUB_WORKSPACE (well-known GHA
constraint — hashFiles only globs under the workspace root). The decision
file lives at `/tmp/ci-metrics/enabled`, so `test-cache-metrics-via-decision-file`
failed: env.CI_METRICS_ENABLED was unset, hashFiles returned empty → gate off,
metrics didn't run.

Fix: collapse the gate into a single bash step (`Resolve CI metrics gate`)
that:
  - applies the layered resolution in shell (env > file presence)
  - mutates the file when workflow env is set (so the post-job hook agrees)
  - emits `decision=true|false` as a step output

The `cache-metrics-prep` gate becomes:

  steps.ci-metrics-gate.outputs.decision == 'true' &&
  steps.cache-backend.outputs.cache-backend == 's3'

No semantic change vs. the design: same priority order
(workflow-false > workflow-true > file-present > no-signal). The decision is
just computed in a place that can actually read `/tmp/ci-metrics/enabled`.

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
@julien-carsique-sonarsource julien-carsique-sonarsource force-pushed the feat/jcarsique/BUILD-11295-metricsFeatureFlag branch from 3bd7362 to 7acc8d2 Compare May 22, 2026 09:46
@julien-carsique-sonarsource julien-carsique-sonarsource enabled auto-merge (rebase) May 22, 2026 09:46
@sonarqubecloud
Copy link
Copy Markdown

@sonarqube-cloud-us
Copy link
Copy Markdown

@julien-carsique-sonarsource julien-carsique-sonarsource merged commit c5032b2 into master May 22, 2026
28 checks passed
@julien-carsique-sonarsource julien-carsique-sonarsource deleted the feat/jcarsique/BUILD-11295-metricsFeatureFlag branch May 22, 2026 09:49
Copy link
Copy Markdown

@sonar-review-alpha sonar-review-alpha Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM! ✅

Solid implementation. The gate logic, file mutation side-effect, and non-Linux safety are all correct. Nothing here requires changes before merge.

A few things worth understanding as you read:

  • Non-Linux safety: The runner.os == 'Linux' guard that was previously on Prepare local cache-metrics sub-action is now implicit — since ci-metrics-gate only runs on Linux, its output is empty on other OSes, and '' == 'true' is false. The behaviour is identical; it's just expressed differently.
  • File mutation is intentional: When CI_METRICS_ENABLED is set to true or false, the gate step touches/removes ${CI_METRICS_DIR}/enabled so the post-job job-completed.sh hook sees the same decision. This couples the action to the hook via a shared file, which is documented in the README.
  • Test path convention: Verify steps hardcode /tmp/ci-metrics rather than expanding ${CI_METRICS_DIR:-/tmp/ci-metrics} — consistent with the pre-existing test-cache-metrics-output job (line 411). These tests assume CI_METRICS_DIR is unset or equals /tmp/ci-metrics on github-ubuntu-latest-s, which holds given all 26 checks are green.

🗣️ Give feedback

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants