fix: daily ET guardrail step never fails activation job; improve conclusion reporting by Copilot · Pull Request #36497 · github/gh-aw

Copilot · 2026-06-02T17:21:58Z

The check_daily_effective_workflow_guardrail step had no top-level error handling, so any transient GitHub API failure would throw an unhandled exception, fail the activation job, and leave daily_effective_workflow_exceeded=false — letting the agent run against a broken guardrail. Separately, agent_failure_comment.md was missing {daily_effective_workflow_exceeded_context}, so repeat-failure comments had no ET guidance, and the exceeded template lacked actionable help.

Changes

check_daily_effective_workflow_guardrail.cjs — wrap all GitHub API calls after the initial guard checks in a try/catch; on error, emit a warning and return with daily_effective_workflow_exceeded=false so the step always exits cleanly. Also documents the bypass-on-error contract in the JSDoc.
daily_effective_workflow_exceeded.md — replace the single-paragraph template with three <details> progressive-disclosure sections: How to raise the daily limit, What the guardrail is, and How to disable it, each with concrete frontmatter snippets and CLI commands.
agent_failure_comment.md — add the missing {daily_effective_workflow_exceeded_context} placeholder after {stale_lock_file_failed_context} so repeat-failure comments include ET guidance, not just newly created issue bodies.
Tests — add a main() error-recovery test asserting the step resolves without throwing when API calls fail; update template assertions to cover the three new <details> headings.

…prove conclusion reporting - Wrap all GitHub API calls in check_daily_effective_workflow_guardrail.cjs in a top-level try-catch so transient API errors never fail the activation job step. On error the step logs a warning and exits cleanly with the default daily_effective_workflow_exceeded=false output, letting the agent run rather than producing a confusing activation failure. - Enhance daily_effective_workflow_exceeded.md with progressive disclosure: three <details> sections covering how to raise the daily limit, what the guardrail is, and how to disable it entirely. - Add {daily_effective_workflow_exceeded_context} placeholder to agent_failure_comment.md so repeat-failure comments include the ET context (previously the context was only present in new-issue bodies). - Update check_daily_effective_workflow_guardrail.test.cjs with a test asserting main() resolves (does not throw) when GitHub API calls fail. - Update handle_agent_failure_daily_effective_workflow.test.cjs to assert the three progressive-disclosure sections are present in the rendered template. Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>

Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>

Copilot

Pull request overview

This PR hardens the daily per-workflow effective-token (ET) guardrail step so transient GitHub API failures no longer fail the activation job, and improves the user-facing messaging when the guardrail is exceeded (including ensuring repeat-failure comments include the same guidance).

Changes:

Wrap the guardrail’s GitHub API interactions in a top-level try/catch so the activation job step exits cleanly on API errors and preserves the safe default (daily_effective_workflow_exceeded=false).
Expand the “daily ET guardrail exceeded” markdown into actionable, progressive-disclosure guidance with concrete frontmatter and CLI commands.
Add coverage for the new markdown structure and for the guardrail’s error-recovery behavior; include the guardrail context placeholder in repeat-failure comments.

Show a summary per file

File	Description
actions/setup/js/check_daily_effective_workflow_guardrail.cjs	Adds top-level error recovery around GitHub API calls and documents the bypass-on-error contract.
actions/setup/md/daily_effective_workflow_exceeded.md	Replaces the single-paragraph message with `<details>` sections explaining raise/disable/what-it-is.
actions/setup/md/agent_failure_comment.md	Adds `{daily_effective_workflow_exceeded_context}` so repeat-failure comments include ET guidance.
actions/setup/js/handle_agent_failure_daily_effective_workflow.test.cjs	Updates assertions to cover the new progressive-disclosure headings/content.
actions/setup/js/check_daily_effective_workflow_guardrail.test.cjs	Adds a test ensuring `main()` resolves (doesn’t throw) when GitHub API calls fail.

Copilot's findings

Tip

Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Files reviewed: 5/5 changed files
Comments generated: 1

+    const workflowID = process.env.GH_AW_WORKFLOW_ID || "";
+    const workflowName = process.env.GH_AW_WORKFLOW_NAME || workflowID || "workflow";
+    const runUrl = process.env.GH_AW_RUN_URL || currentRun.data.html_url || "";
+    const actorLogin = process.env.GITHUB_TRIGGERING_ACTOR || currentRun.data.triggering_actor?.login || currentRun.data.actor?.login || process.env.GITHUB_ACTOR || "";


github-actions · 2026-06-02T17:39:52Z

✅ PR Code Quality Reviewer completed the code quality review.

github-actions · 2026-06-02T17:39:55Z

🧠 Matt Pocock Skills Reviewer has completed the skills-based review. ✅

github-actions · 2026-06-02T17:40:02Z

🧪 Test Quality Sentinel completed test quality analysis.

github-actions · 2026-06-02T17:40:09Z

✅ Design Decision Gate 🏗️ completed the design decision gate check.

No ADR enforcement needed: PR #36497 does not have the 'implementation' label (has_implementation_label=false) and has 0 new lines of code in business logic directories (requires_adr_by_default_volume=false, threshold=100). Neither enforcement condition is met.

github-actions · 2026-06-02T17:45:35Z

🧪 Test Quality Sentinel Report

✅ Test Quality Score: 91/100 — Excellent

Analyzed 10 test(s): 10 design (behavioral contracts), 0 implementation, 0 guideline violations.

📊 Metrics & Test Classification (10 tests analyzed)

Metric	Value
New/modified tests analyzed	10
✅ Design tests (behavioral contracts)	10 (100%)
⚠️ Implementation tests (low value)	0 (0%)
Tests with error/edge cases	7 (70%)
Duplicate test clusters	0
Test inflation detected	No (test +57 lines / prod +157 lines = 0.36 ratio)
🚨 Coding-guideline violations	0

Test Classification Details

Test	File	Classification	Issues Detected
`skips workflow_call, repository_dispatch, and workflow_dispatch with aw_context`	`check_daily_effective_workflow_guardrail.test.cjs`	✅ Design	Covers both `true` and `false` return paths
`matches both firewall-audit-logs and unified agent artifacts`	`check_daily_effective_workflow_guardrail.test.cjs`	✅ Design	Verifies both matching and non-matching artifact names
`sums effective tokens from explicit token-usage entries`	`check_daily_effective_workflow_guardrail.test.cjs`	✅ Design	Asserts on aggregated return value
`computes aggregate ET statistics for prior runs`	`check_daily_effective_workflow_guardrail.test.cjs`	✅ Design	Verifies full stats object shape
`caps inspection when GitHub API rate limit headroom is low`	`check_daily_effective_workflow_guardrail.test.cjs`	✅ Design	Tests boundary values (110 vs 120 headroom)
`formats structured daily ET log messages`	`check_daily_effective_workflow_guardrail.test.cjs`	✅ Design	Verifies both with-context and no-context output
`renders a daily ET details summary with stats and prior runs`	`check_daily_effective_workflow_guardrail.test.cjs`	✅ Design	Exercises `truncatedByRateLimit` edge case
`main() does not fail the step when GitHub API calls throw`	`check_daily_effective_workflow_guardrail.test.cjs`	✅ Design	Core error-resilience contract: step must not throw; output stays `"false"`; warning is emitted
`renders the daily workflow ET guardrail context when exceeded`	`handle_agent_failure_daily_effective_workflow.test.cjs`	✅ Design	Verifies rendered markdown content when guardrail triggers
`returns empty string when the guardrail did not trigger`	`handle_agent_failure_daily_effective_workflow.test.cjs`	✅ Design	Edge case: no-op when `exceeded = false`

Language Support

Tests analyzed:

🐹 Go (*_test.go): 0 tests
🟨 JavaScript (*.test.cjs): 10 tests (vitest)

Verdict

✅ Check passed. 0% of new tests are implementation tests (threshold: 30%). The new test for main() error resilience directly validates the behavioral contract described in the PR title ("guardrail step never fails activation job").

📖 Understanding Test Classifications

Design Tests (High Value) verify what the system does:

Assert on observable outputs, return values, or state changes
Cover error paths and boundary conditions
Would catch a behavioral regression if deleted
Remain valid even after internal refactoring

Implementation Tests (Low Value) verify how the system does it:

Assert on internal function calls (mocking internals)
Only test the happy path with typical inputs
Break during legitimate refactoring even when behavior is correct
Give false assurance: they pass even when the system is wrong

Goal: Shift toward tests that describe the system's behavioral contract — the promises it makes to its users and collaborators.

🧪 Test quality analysis by Test Quality Sentinel · sonnet46 1.2M · ◷

github-actions

✅ Test Quality Sentinel: 91/100. Test quality is excellent — 0% of new tests are implementation tests (threshold: 30%). All 10 new tests verify behavioral contracts, including the critical error-resilience contract for main() that directly corresponds to the fix in this PR.

github-actions

Skills-Based Review 🧠

Applied /diagnose and /tdd — approving with minor suggestions on test structure and observability.

📋 Key Themes & Highlights

Key Themes

Test assertions in try block: The new regression test puts its expect() calls inside the try, so if the first assertion throws, subsequent ones are skipped silently. Minor structural fix recommended.
Missing "exceeded=true" regression test: The big refactor (wrapping all of main() in a single outer try/catch) isn't validated from the happy path — a future logic bug could silently prevent daily_effective_workflow_exceeded from being set to "true" without a test catching it.
Permanent vs. transient bypass: The outer catch correctly swallows all errors by design, but persistent configuration errors (bad token, missing permission) will bypass the guardrail on every run with only a job-log warning as signal.

Positive Highlights

✅ Root cause correctly identified: absence of outer try/catch was the real issue, not any individual API call
✅ The safe-bypass contract (fail → false, not fail → job failure) is clearly documented in both the JSDoc and inline comments
✅ The agent_failure_comment.md one-liner fix is exactly right — no overengineering
✅ daily_effective_workflow_exceeded.md upgrade is excellent: progressive disclosure, concrete YAML snippets, caution callout for the disable path
✅ Template regression tests updated in lockstep with the template changes

🧠 Reviewed using Matt Pocock's skills by Matt Pocock Skills Reviewer · sonnet46 1.7M

github-actions · 2026-06-02T17:46:51Z

+
+    try {
+      // Should resolve without throwing even though the API calls throw
+      await expect(exports.main()).resolves.toBeUndefined();


[/tdd] Assertions inside the try block mean that if the first expect throws, later assertions are silently skipped, reducing test confidence.

💡 Suggested refactor

Move the await expect(...) call outside the try/finally (or keep the try/finally only for cleanup), so Jest sees all assertion failures independently:

let threw = false; try { await exports.main(); } catch (err) { threw = true; } finally { delete global.core; delete global.github; delete global.context; delete process.env.GH_AW_MAX_DAILY_EFFECTIVE_TOKENS; delete process.env.GH_AW_GITHUB_TOKEN; } expect(threw).toBe(false); expect(coreOutputs["daily_effective_workflow_exceeded"]).toBe("false"); expect(coreWarnings.some(w => /unexpected error.*skipped/i.test(w))).toBe(true);

This way all three assertions are always evaluated regardless of which one fails.

github-actions · 2026-06-02T17:46:51Z

+      delete process.env.GH_AW_GITHUB_TOKEN;
+    }
+  });
 });


[/tdd] The refactor wraps the entire success path in the outer try/catch too, but there's no regression test verifying that a normal execution where the threshold is exceeded still sets daily_effective_workflow_exceeded to "true". A logic bug inside the try block (e.g., an early return or swallowed error) could silently prevent the output from being set without any test catching it.

💡 What to add

Add a companion test that mocks a successful API response returning runs whose total ET exceeds the threshold, and asserts:

it("main() sets daily_effective_workflow_exceeded=true when threshold is exceeded", async () => { // ... mock APIs returning one run with runEffectiveTokens > threshold ... await exports.main(); expect(coreOutputs["daily_effective_workflow_exceeded"]).toBe("true"); });

This guards against the outer catch accidentally consuming an exception that would have set the output.

github-actions · 2026-06-02T17:46:51Z

-    core.info(`Daily workflow ET guardrail not exceeded (${totalEffectiveTokens}/${threshold}).`);
-    return;
+    core.warning(`Daily workflow ET guardrail exceeded for ${workflowName}: ${totalEffectiveTokens}/${threshold}.`);
+  } catch (error) {


[/diagnose] The outer catch treats all errors — including permanent ones like missing permissions or misconfigured tokens — the same as transient network blips. A persistent configuration error will silently bypass the guardrail on every run, with only a core.warning in the job log as the signal.

💡 Suggestion

The current design is explicitly documented as intentional (safe bypass > broken activation), so this is not a blocker. However, consider adding a core.error() call alongside the warning for non-retryable error classes, or emitting a core.summary line so the bypass is visible in the Actions run summary rather than only in the raw log:

} catch (error) { core.warning(`Daily workflow ET guardrail encountered an unexpected error and will be skipped: ${getErrorMessage(error)}`); // Make the bypass visible in the run summary so operators notice persistent failures await core.summary.addRaw(`> [!WARNING]\n> ET guardrail skipped due to error: ${getErrorMessage(error)}`).write().catch(() => {}); }

This preserves the safe-bypass contract while making a stuck guardrail discoverable without digging into logs.

github-actions

The design is sound — wrapping GitHub API calls in a try-catch so that transient failures never block the activation job is the right call, and the comments explaining the "safe bypass" intent are clear. The inline comments flag the most actionable issues; no blocking correctness bugs found.

### Review themes

Test mock completeness (medium) — mockCore is missing error/debug/notice stubs. A missing stub causes a TypeError that gets silently caught by the very try-catch under test, producing a warning that matches the regex and making the test pass vacuously. See inline comment on line 133.

Global state cleanup (medium) — The finally block uses delete rather than save-and-restore, which is fragile if beforeEach ever starts populating these globals. See inline comment on line 172.

Observability gap (low) — daily_effective_workflow_exceeded=false is ambiguous: it could mean "under budget" or "evaluation errored and was skipped". A dedicated daily_effective_workflow_guardrail_error output would make cost audits and dashboards significantly more reliable. See inline comment on catch block.

Coverage gap (low) — There is no test for the case where an exception is thrown after core.setOutput("daily_effective_workflow_exceeded", "true") has already been set (i.e., appendDailyEffectiveWorkflowSummary throws on the exceeded path). The behavior is actually correct — exceeded stays "true" and the warning fires — but this scenario is worth an explicit test since it represents the most operationally significant partial-state.

🔎 Code quality review by PR Code Quality Reviewer · sonnet46 327.1K

github-actions · 2026-06-02T17:50:25Z

+    // The step should catch the error and NOT rethrow it, keeping daily_effective_workflow_exceeded at "false".
+    const coreOutputs = {};
+    const coreWarnings = [];
+    const mockCore = {


Incomplete mockCore stub may cause vacuously-passing tests. mockCore is missing error, debug, and notice methods. If any production code path inside the try block calls one of them, the TypeError: core.xxx is not a function will be caught by the outer catch and produce a warning like "Daily workflow ET guardrail encountered an unexpected error and will be skipped: core.error is not a function" — which matches the regex /unexpected error.*skipped/i. The test then passes while hiding the actual regression.

💡 Suggested fix

Add no-op stubs for the missing methods:

const mockCore = { setOutput: (key, value) => { coreOutputs[key] = value; }, info: () => {}, warning: msg => coreWarnings.push(msg), error: () => {}, // prevent silent swallow of TypeError debug: () => {}, notice: () => {}, };

This ensures that any accidental call to these methods causes a type-safe no-op rather than a silently-caught TypeError that fools the regex assertion.

github-actions · 2026-06-02T17:50:25Z

+      // A warning must be emitted describing the error
+      expect(coreWarnings.some(w => /unexpected error.*skipped/i.test(w))).toBe(true);
+    } finally {
+      delete global.core;


Global state deleted rather than restored — fragile isolation. delete global.core permanently removes the property rather than restoring whatever was there before. If a future beforeEach or test-helper ever pre-populates these globals, this finally block will silently trash them for subsequent tests, causing confusing cross-test contamination.

💡 Suggested fix

Save and restore:

const savedCore = global.core; const savedGithub = global.github; const savedContext = global.context; global.core = mockCore; global.github = mockGithub; global.context = mockContext; try { // assertions... } finally { if (savedCore === undefined) delete global.core; else global.core = savedCore; if (savedGithub === undefined) delete global.github; else global.github = savedGithub; if (savedContext === undefined) delete global.context; else global.context = savedContext; delete process.env.GH_AW_MAX_DAILY_EFFECTIVE_TOKENS; delete process.env.GH_AW_GITHUB_TOKEN; }

Alternatively, use afterEach for cleanup so Vitest handles the lifecycle correctly.

github-actions · 2026-06-02T17:50:26Z

+    // Treat any unexpected error as a non-blocking skip so the step never fails the
+    // activation job.  The output stays at the default "false", allowing the agent to
+    // run.  The guardrail is effectively bypassed for this invocation.
+    core.warning(`Daily workflow ET guardrail encountered an unexpected error and will be skipped: ${getErrorMessage(error)}`);


Silent bypass is indistinguishable from a clean pass in workflow outputs. When the catch fires, daily_effective_workflow_exceeded stays at "false" — the same value it has when the guardrail genuinely evaluated to "not exceeded". Any downstream observability (dashboards, audits) that reads this output cannot tell whether the guardrail passed or failed silently. The only signal is a transient ::warning:: in the step log that is easily missed.

💡 Suggested improvement

Consider adding a dedicated output that makes the skip explicit:

} catch (error) { core.warning(`Daily workflow ET guardrail encountered an unexpected error and will be skipped: ${getErrorMessage(error)}`); core.setOutput("daily_effective_workflow_guardrail_error", "true"); // new }

Downstream steps and dashboards can then distinguish three states:

exceeded=true → guardrail fired, block the agent

exceeded=false, guardrail_error=false → evaluated cleanly, under budget

exceeded=false, guardrail_error=true → evaluation failed, agent allowed as safe fallback

This is especially important for post-incident cost audits where "did the guardrail skip silently?" is a key question.

pelikhan · 2026-06-02T17:55:12Z

@copilot review comments and review comments. apply copilot-review skill

Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>

pelikhan · 2026-06-02T18:13:05Z

@copilot lint js and fix https://github.com/github/gh-aw/actions/runs/26838723127/job/79139596655?pr=36497

Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>

Copilot · 2026-06-02T18:19:44Z

@copilot lint js and fix https://github.com/github/gh-aw/actions/runs/26838723127/job/79139596655?pr=36497

Fixed in f7e508b. I formatted actions/setup/js/check_daily_effective_workflow_guardrail.test.cjs with Prettier and re-ran npm run lint:cjs and the targeted guardrail test successfully.

Copilot AI and others added 2 commits June 2, 2026 17:20

refine: address review feedback on JSDoc and billing note clarity

7d03622

Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>

Copilot AI assigned Copilot and pelikhan Jun 2, 2026

Copilot AI changed the title ~~fix: ensure daily ET guardrail step never fails activation job and improve conclusion reporting~~ fix: daily ET guardrail step never fails activation job; improve conclusion reporting Jun 2, 2026

Copilot finished work on behalf of pelikhan June 2, 2026 17:22

Copilot AI requested a review from pelikhan June 2, 2026 17:22

Copilot created this pull request from a session on behalf of pelikhan June 2, 2026 17:22 View session

pelikhan marked this pull request as ready for review June 2, 2026 17:24

Copilot AI review requested due to automatic review settings June 2, 2026 17:24

Copilot started reviewing on behalf of pelikhan June 2, 2026 17:25 View session

Copilot AI reviewed Jun 2, 2026

View reviewed changes

github-actions Bot mentioned this pull request Jun 2, 2026

[aw] No-Op Runs #36156

Open

github-actions Bot approved these changes Jun 2, 2026

View reviewed changes

github-actions Bot reviewed Jun 2, 2026

View reviewed changes

Copilot started work on behalf of pelikhan June 2, 2026 17:55 View session

Apply remaining changes

5b87535

Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>

Copilot finished work on behalf of pelikhan June 2, 2026 18:04

Merge branch 'main' into copilot/review-js-check-daily-tokens

2884717

Copilot started work on behalf of pelikhan June 2, 2026 18:13 View session

github-actions Bot mentioned this pull request Jun 2, 2026

[Contribution Check Report] Contribution Check — 2026-06-02 #36383

Open

fix: format guardrail test for cjs lint

f7e508b

Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>

Copilot finished work on behalf of pelikhan June 2, 2026 18:20

pelikhan merged commit 220e424 into main Jun 2, 2026
12 checks passed

pelikhan deleted the copilot/review-js-check-daily-tokens branch June 2, 2026 18:22

Conversation

Copilot AI commented Jun 2, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Changes

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Copilot's findings

Uh oh!

github-actions Bot commented Jun 2, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

github-actions Bot commented Jun 2, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

github-actions Bot commented Jun 2, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

github-actions Bot commented Jun 2, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

github-actions Bot commented Jun 2, 2026

🧪 Test Quality Sentinel Report

Test Classification Details

Language Support

Verdict

Uh oh!

github-actions Bot left a comment

Choose a reason for hiding this comment

Uh oh!

github-actions Bot left a comment

Choose a reason for hiding this comment

Skills-Based Review 🧠

Key Themes

Positive Highlights

Uh oh!

github-actions Bot Jun 2, 2026

Choose a reason for hiding this comment

Uh oh!

github-actions Bot Jun 2, 2026

Choose a reason for hiding this comment

Uh oh!

github-actions Bot Jun 2, 2026

Choose a reason for hiding this comment

Uh oh!

github-actions Bot left a comment

Choose a reason for hiding this comment

Uh oh!

github-actions Bot Jun 2, 2026

Choose a reason for hiding this comment

Uh oh!

github-actions Bot Jun 2, 2026

Choose a reason for hiding this comment

Uh oh!

github-actions Bot Jun 2, 2026

Choose a reason for hiding this comment

Uh oh!

pelikhan commented Jun 2, 2026

Uh oh!

pelikhan commented Jun 2, 2026

Uh oh!

Copilot AI commented Jun 2, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

Copilot AI commented Jun 2, 2026 •

edited

Loading

github-actions Bot commented Jun 2, 2026 •

edited

Loading

github-actions Bot commented Jun 2, 2026 •

edited

Loading

github-actions Bot commented Jun 2, 2026 •

edited

Loading

github-actions Bot commented Jun 2, 2026 •

edited

Loading