fix(cooldown): close silent-bypass regressions from #25, #26, #27#28
Merged
fix(cooldown): close silent-bypass regressions from #25, #26, #27#28
Conversation
Captures the exact diff from j7an/nexus-mcp#160 that exposed issue #27 (astral-sh/setup-uv silently dropped). Test fails until the real extractor is implemented in the next commit. Refs #27
Root cause of #27: the v2.0.1 regex `^\+\s+uses:` in dependency-cooldown.yml does not match YAML list-marker prefixed lines of the form `+ - uses: foo@...`, which is exactly how astral-sh/setup-uv appears in nexus-mcp#160's diff. Empirical verification against the captured fixture: $ grep -cE '^\+.*uses:' tests/fixtures/extract-deps/nexus-mcp-160.diff 17 $ grep -cE '^\+\s+uses:' tests/fixtures/extract-deps/nexus-mcp-160.diff 15 The two dropped lines are both astral-sh/setup-uv (list-marker form). The other actions (harden-runner, trufflehog) appear without the `- ` prefix and were extracted correctly, so setup-uv was silently dropped while the workflow reported success. The new bash-native parser uses [[ =~ ]] regex with an optional `(-[[:space:]]+)?` group that matches both shapes, plus: 1. Whitespace normalization via ${line%$'\r'} and [[:space:]] classes 2. Ecosystem-prefixed dedup keys (actions:foo vs pypi:foo) so a cross-ecosystem name collision doesn't drop a row 3. Strict mode (set -euo pipefail) and explicit malformed-input detection (exit 2 when input has no diff markers) 4. Bash 3.2 compatible dedup (newline-delimited sentinel string instead of `declare -A`) so the script runs on macOS system bash as well as Linux CI Fixes #27
Proves the extractor handles requirements.txt-style bumps and produces clean zero-row output on an empty diff. Complements the nexus-mcp#160 regression test for actions coverage. Refs #27
…#160 Captures hand-crafted GitHub API response fixtures for the three actions that appeared in j7an/nexus-mcp#160, with published_at values tuned to produce 3d/4d/14d ages relative to NOW_EPOCH=1775995200 (2026-04-12). Test fails until the real checker is implemented in the next commit. Refs #25
… lookups Single-tier release-date lookup: - Actions: gh api repos/<owner>/<repo>/releases/tags/v<version> - PyPI: pypi.org/pypi/<pkg>/<version>/json (with yanked detection) Two-attempt retry on transient failures. Fixture mode via AGE_FIXTURE_DIR for hermetic tests. COOLDOWN_DAYS=0 disables the gate entirely (each row emitted as pass with no API call). Bash 3.2 compatible (no associative arrays, no [[ -v ]]). Refs #25
Covers the COOLDOWN_DAYS=0 escape hatch, PyPI happy path with naive UTC upload_time format, yanked-release forced-fail, and 404-simulates-error. Refs #25
Replaces the inline action+Python extraction block with a single shell-out to scripts/extract-deps.sh, then populates the legacy ACTIONS/PY_DEPS/ACTION_VERSIONS/PY_VERSIONS variables that the downstream advisory-scan loops still consume. Preserves PR-body version fallback for both ecosystems. Adds the extraction-count sanity-check guard (#27 acceptance criterion): when the diff line count and extractor row count disagree, the scan comment surfaces an EXTRACTION_WARNING that downstream comment-rendering code (added in Task 12) will display. Refs #27
Introduces two new workflow_call inputs: - cooldown_days (default 7) — enforces release age before auto-merge - fail_on_cooldown (default false) — opt-in hard-red gate for strict consumers The phase-2 check runs after extraction and before the advisory scan, populating COOLDOWN_FAILURES for the gate state machine added in Task 11. AGE_TSV is preserved for the comment-body Release Age section added in Task 12. When COOLDOWN_DAYS=0, the script invocation is skipped entirely (escape hatch preserves pre-v2.0.2 behavior for consumers who can't adopt the new gate immediately). Refs #25
Replaces the additive 'gh pr edit --add-label' block in the 'advisories found' branch with a reconcile_label() helper that runs on every scan regardless of verdict. The helper adds OR removes each label to match the current scan result, so a dirty-then-clean sequence (observed on j7an/nexus-mcp#160) leaves no stale labels. Handles two labels generically: - security-review-needed (red, B60205) — advisory gate - cooldown-pending (amber, FBCA04) — release-age gate (consumed by Task 11 state machine) Each label has its own color and description; the helper takes them as parameters so adding a third label later is one call, not a copy-paste. Fixes #26
Replaces the advisory-only auto-merge decision with a three-state machine:
TOTAL>0 → gate=success, AUTO_MERGE_OK=false ('review needed')
COOLDOWN_FAILURES>0 → gate=pending (or failure if FAIL_ON_COOLDOWN=true),
AUTO_MERGE_OK=false ('waiting for age')
both 0 → gate=success, AUTO_MERGE_OK=true ('ready for merge')
Auto-merge is now gated on AUTO_MERGE_OK=true, which is set only in the
fully-clean case. This closes the silent-bypass path observed on
j7an/nexus-mcp#160 — even if advisories are clean, sub-cooldown versions
block auto-merge until the next scan naturally ages past the threshold.
Advisory failures win the status-description tie-break (shorter, more
actionable than the cooldown message). The comment body and labels
(reconciled in Task 10) reflect both gates independently.
The HAS_ERROR early-exit branch in the comment builder is unchanged
— scan errors still short-circuit to gate=error before the state machine
runs.
Refs #25
Release Age table shows per-package published date, age in days, and pass/fail/error status. Footer computes the 'earliest unblock' timestamp as max(fail.published_at) + cooldown_days so consumers see a concrete wait time instead of anxious-looking 'pending' state. Extraction Warning section renders near the top of the comment when diff-line count and extractor output disagree, surfacing silent parsing drops (#27 acceptance criterion: extraction-count guard). Date parsing for the unblock-time computation uses a multi-flag fallback chain (GNU date -d, GNU date -d with appended Z, BSD date -jf with and without Z stripping) to handle both GitHub published_at and PyPI upload_time formats portably. Refs #25, #27
…ding label Adds two new inputs to the Inputs table, rewrites the Cool-down configuration section to describe both enforcement layers (Dependabot native + workflow-side), adds a Labels subsection explaining the authoritative reconciliation semantic, and adds a Recommended section on scheduled re-scan for long-pending PRs. Refs #25, #26
Cross-task interaction found in final review: Task 10's reconcile_label block sat below the v2.0.1 HAS_ERROR early-exit branch, so an API-error re-scan (GHSA rate-limit, OSV timeout) exited with state=error before reconciling labels. A previously-dirty PR would keep its stale security-review-needed label across an error-path re-scan, which is exactly the failure mode #26 was filed about — under a narrower trigger. Hoists the reconcile block to run immediately after TOTAL is computed but before the if/elif/else comment-builder branches (including the HAS_ERROR exit). Labels now reconcile on every scan, including error paths, honoring the README's 'reconciled on every scan' guarantee. The reconcile block's content is unchanged from Task 10 — only its position moved. Header comment updated to reflect the reason for the position. Refs #26
…removal on error
Three cross-task fixes from final parallel review:
1. Add actions/checkout step — reusable workflows don't auto-checkout
their own repo, so ${GITHUB_WORKSPACE}/scripts/... resolved to nothing
at runtime and the entire fix silently failed. Checkout pulls
j7an/shared-workflows at github.workflow_sha (same commit as the
workflow YAML), sparse-checkout scripts/ for minimal clone. Script
invocation paths updated to shared-workflows/scripts/*.sh.
2. Fix extraction sanity-check to compare unique action names rather
than raw line counts. Previous version compared grep -c (line count)
to post-dedup extraction count, which always fired on real diffs
bumping an action used across multiple workflow files — including
the nexus-mcp#160 fixture this PR exists to defend (17 raw lines,
3 unique actions). New version dedups via sort -u and uses the
extractor's exact regex shape (-[[:space:]]+)? to eliminate
whitespace drift.
3. Guard reconcile_label's removal branch on empty HAS_ERROR. Task 13a
hoisted reconcile above the HAS_ERROR exit to fix the 'stale label
never removed on error re-scan' bug, but this introduced the inverse:
a re-scan where all GHSA queries error leaves TOTAL=0, causing the
reconciler to remove security-review-needed from a genuinely-dirty
PR. The new guard makes reconcile additive-only when HAS_ERROR is
set: labels can still be added on partial-scan evidence, but never
removed unless the scan was clean enough to trust the zero-count.
Refs #25, #26, #27
… order Two small fixes from the post-13b parallel review: 1. Sanity-check dedup pipeline now filters ./ and docker:// prefixes. Task 13b's deduped EXPECTED_ACTIONS counted all unique uses: lines in the diff — including local and docker action references that scripts/extract-deps.sh intentionally skips (lines 51-52). A PR adding a local action would trip the 'Extraction mismatch' warning despite correct extractor behavior. The new grep -v clause mirrors the extractor's skip semantics so the sanity-check stays honest. 2. Release Age unblock-epoch date conversion now tries GNU 'date -d @' first, then BSD 'date -r' as fallback. The previous order worked by accident on Ubuntu runners because GNU 'date -r <integer>' treats the argument as a file path (not an epoch), silently fails with non-existent file, and falls through. Brittle: a file named after the epoch integer in cwd would return that file's mtime instead. GNU-first order matches the actual production platform. Refs #25, #27
The ci-scripts.yml paths filter previously only triggered on scripts/**, tests/**, or its own file. Tasks 13b and 13c made several changes to dependency-cooldown.yml (checkout step, sanity-check dedup, reconcile_label HAS_ERROR guard, local/docker filter, date fallback order) — none of which triggered bats CI because the workflow file wasn't in the paths filter. Adding it ensures that any future change to dependency-cooldown.yml runs the extract-deps and check-release-age regression tests as a partial safety net. bats can't reach workflow-level inline bash, but it can catch script-adjacent regressions that the workflow invokes. Refs #25, #27
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Closes three regressions in
dependency-cooldown.ymlobserved on j7an/nexus-mcp#160 on 2026-04-12:@dependabot rebasesecurity-review-neededlabel from a prior dirty scan #26 — stalesecurity-review-neededlabel not removed on dirty-then-clean re-scanastral-sh/setup-uvsilently omitted from Packages Scanned list in dependency-cooldown.yml #27 —astral-sh/setup-uvsilently dropped from extractor due to YAML list-marker shape@v2withauto_merge: truewill see Dependabot PRs sit inpendingfor up to 7 days instead of auto-merging immediately on day 0. Setcooldown_days: 0in your caller workflow to restore pre-v2.0.2 behavior.security-review-neededlabel is now reconciled on every scan. Automation that treated this label as persistent will now see it removed when a re-scan finds zero applicable advisories. Labels are preserved (not removed) when a scan encounters API errors, so transient failures don't silently clear warnings.cooldown-pendinglabel (amber) applied when target versions are within the configured cooldown window.dependency-cooldown / gatecontext can now end atpending(cooldown-only block, default) orfailure(cooldown-only block withfail_on_cooldown: true) in addition to the existingsuccessanderrorstates.New Inputs
cooldown_days70to disable workflow-side release-age enforcement.fail_on_cooldownfalsetrue, cooldown blocks set the gate tofailureinstead ofpending. Use when branch protection requires a hard-red blocker.Architecture
Extracted two standalone scripts from the inline workflow bash:
scripts/extract-deps.sh— parses PR diff, emits TSV (name\tversion\tecosystem). Rewritten with bash-native regex to fixastral-sh/setup-uvsilently omitted from Packages Scanned list in dependency-cooldown.yml #27 (handles YAML list-marker- uses:form). Bash 3.2 compatible.scripts/check-release-age.sh— takes dep TSV, queries GitHub Releases + PyPI APIs, emits verdict TSV with per-row pass/fail/error verdicts. Two-attempt retry, fixture mode for tests, yanked detection.The workflow itself (
dependency-cooldown.yml) was restructured with:actions/checkoutstep pullingj7an/shared-workflows@github.workflow_shawith sparse-checkoutreconcile_label()helper that authoritatively adds/removes both labels on every scan (HAS_ERROR-aware — labels are preserved on API-error re-scans)Test Coverage
New bats test suite with 8 tests (all passing):
tests/extract-deps.bats— 3 tests against hand-crafted and real-captured diffstests/check-release-age.bats— 5 tests with fixture-mode API responses, fixedNOW_EPOCHfor determinismtests/fixtures/extract-deps/nexus-mcp-160.diff— real captured diff from deps: bump the all-actions group across 1 directory with 3 updates nexus-mcp#160 (the regression fixture).github/workflows/ci-scripts.yml— new CI runner that triggers bats onscripts/**,tests/**,ci-scripts.yml, ordependency-cooldown.ymlchangesRoot Cause of #27
The v2.0.1 regex
^\+\s+uses:did not match YAML list-marker-prefixed lines of the form+ - uses: foo@sha # v1.0.0, which is exactly howastral-sh/setup-uvappears in nexus-mcp#160's diff (17 raw+...uses:lines, 2 of them- uses:shape for setup-uv, 15 plainuses:for the other two actions). The new regex^\+[[:space:]]+(-[[:space:]]+)?uses:accepts both shapes.Commits
17
fix:commits (all patch-bump under tag-release.yml's conventional-commit analyzer):1f3f981— extract-deps regression fixture for nexus-mcp#160998554a— implement extract-deps.sh with bash-native parser96dcc5e— add Python and empty-diff extract-deps tests9ca390a— add check-release-age regression fixture8beeb3d— implement check-release-age.sh with tier-1 GitHub/PyPI lookups747ce1f— add edge-case tests for check-release-age94f40fe— add bats test runner workflowaab2e50— wire extract-deps.sh into dependency-cooldown.ymlfc3f140— add cooldown_days input and wire check-release-age.shd09b8fa— reconcile labels on every scan to fix stale-label bug22120ca— combine advisory and cooldown gates in state machine2790d8d— add Release Age and Extraction Warning comment sections9c3fe65— document cooldown_days, fail_on_cooldown, and cooldown-pending label9afccfd— run reconcile_label before HAS_ERROR early-exit053996e— add checkout, fix sanity-check dedup, guard reconcile removal on error1ae4585— filter local/docker in sanity check, fix date fallback orderdf5d327— also run bats on dependency-cooldown.yml changesTest Plan
bats tests/— 8/8 passing locallypython3 -c 'import yaml; yaml.safe_load(...)'clean on bothdependency-cooldown.ymlandci-scripts.ymlfix:prefixdependency-cooldown.yml)cooldown-pendinglabel applied, gatepending, auto-merge not enabledDeferred to Follow-ups (not blocking)
age_daysclamp on clock skew.github/workflows/**changesReview
This PR was developed under an extensive 3-pass parallel review process across 5 domains (security, correctness, API contract, portability, test quality). Final score: Security 9/10, Correctness 9/10, API Contract 8/10, Portability 9/10, Test Quality 6/10 → naive average 8.2/10, no remaining blockers.
Fixes #25
Fixes #26
Fixes #27