Add structured diagnostics to the daily workflow ET guardrail#36164
Conversation
Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>
Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>
Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>
Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>
Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>
Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>
Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>
There was a problem hiding this comment.
Pull request overview
This PR improves debuggability of the daily effective-token (ET) workflow guardrail by adding consistently prefixed, structured diagnostics across workflow-run discovery, artifact selection/download, and per-run ET accumulation, plus a test to validate the log format.
Changes:
- Added
formatDailyGuardrailLogMessage/logDailyGuardrailhelpers and emitted structured log lines at key inspection steps. - Hardened artifact selection by skipping artifacts missing required metadata (e.g., missing
id) and logging the reason. - Updated JS tests to assert the structured log format in a stable way (prefix + JSON payload).
Show a summary per file
| File | Description |
|---|---|
| actions/setup/js/check_daily_effective_workflow_guardrail.cjs | Adds structured diagnostic logging around run pagination, artifact discovery/download, and ET accumulation. |
| actions/setup/js/check_daily_effective_workflow_guardrail.test.cjs | Adds coverage for the structured log message formatter. |
Copilot's findings
Tip
Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
- Files reviewed: 2/2 changed files
- Comments generated: 3
| const artifactSummaries = artifacts.map(item => ({ id: item?.id ?? null, name: item?.name || "" })); | ||
| logDailyGuardrail("Listed workflow artifacts", { | ||
| runId, | ||
| artifactCount: artifacts.length, | ||
| artifacts: artifactSummaries, | ||
| }); |
| logDailyGuardrail("Received workflow runs page", { | ||
| page, | ||
| runCount: runs.length, | ||
| runIds: runs.map(run => run?.id).filter(Boolean), | ||
| }); |
| logDailyGuardrail("Prepared candidate workflow runs for artifact inspection", { | ||
| candidateRunsCount: candidateRuns.length, | ||
| candidateRunIds: candidateRuns.map(run => run.id), | ||
| maxInspectableRuns, | ||
| truncatedByRateLimit, | ||
| }); |
|
🧠 Matt Pocock Skills Reviewer has completed the skills-based review. ✅ |
|
✅ Design Decision Gate 🏗️ completed the design decision gate check. No ADR enforcement needed: PR #36164 does not have the 'implementation' label (has_implementation_label=false) and has 0 new lines of code in business logic directories (default_business_additions=0, threshold=100). Neither enforcement condition is met. |
|
🧪 Test Quality Sentinel completed test quality analysis. |
|
✅ PR Code Quality Reviewer completed the code quality review. |
There was a problem hiding this comment.
🔎 Code quality review by PR Code Quality Reviewer · sonnet46 754.1K
| logDailyGuardrail("Updated current ET state", { | ||
| runId: run.id, | ||
| runEffectiveTokens, | ||
| currentEffectiveTokens: totalEffectiveTokens, |
There was a problem hiding this comment.
O(n2) log output inside the per-run loop: countedRunIds: countedRuns.map(item => item.id) remaps the full (and ever-growing) countedRuns array on every iteration, producing quadratic work and log volume as counted runs accumulate.
💡 Suggested fix
Log only the most recently added run ID, not the full list on each iteration:
logDailyGuardrail("Updated current ET state", {
runId: run.id,
runEffectiveTokens,
currentEffectiveTokens: totalEffectiveTokens,
threshold,
countedRunsCount: countedRuns.length, // just the count
});The full countedRunIds list is already emitted once at the end in the "Completed ET inspection window" log, so repeating it on every iteration adds no diagnostic value and compounds the log-volume concern already flagged in existing review threads.
🧪 Test Quality Sentinel Report✅ Test Quality Score: 100/100 — Excellent
📊 Metrics & Test Classification (7 tests analyzed)
Test Classification Details
Language SupportTests analyzed:
Verdict
📖 Understanding Test ClassificationsDesign Tests (High Value) verify what the system does:
Implementation Tests (Low Value) verify how the system does it:
Goal: Shift toward tests that describe the system's behavioral contract — the promises it makes to its users and collaborators.
|
There was a problem hiding this comment.
Skills-Based Review 🧠
Applied /tdd and /zoom-out — requesting changes on test-coverage gaps for the two bundled behaviour changes.
📋 Key Themes & Highlights
Key Themes
- Behaviour changes without tests: Two silent-return guards (
item?.name &&and!artifact.id) alter existing control flow but have no test coverage. These are the blocking items. - Log verbosity at scale:
runIdsper-page (up to 100 IDs × multiple pages) andcountedRunIdsper-iteration (growing array) could generate unexpectedly large log output in high-volume repos. - Error context lost in catch: The formatter's
catchblock discards the original serialisation error, making future debugging harder.
Positive Highlights
- ✅ Stable
[daily-workflow-et]prefix makes log linesgrep-friendly and easy to filter. - ✅ Defensive
try/catcharoundJSON.stringifyis exactly right — diagnostics must not break the guardrail path. - ✅
formatDailyGuardrailLogMessageis correctly exported and unit-tested; the test'sJSON.parseassertion is a clean way to verify structured output without brittle string matching. - ✅ Structured log at the final
"Completed ET inspection window"gives a complete picture in one place.
🧠 Reviewed using Matt Pocock's skills by Matt Pocock Skills Reviewer · sonnet46 1.8M
| } | ||
| if (!artifact.id) { | ||
| logDailyGuardrail("Skipping guardrail artifact without an id", { | ||
| runId, |
There was a problem hiding this comment.
[/tdd] The new artifact.id null-guard is a behaviour change with no test coverage — an artifact whose name matches but whose id is falsy will silently yield 0 ET instead of proceeding to downloadArtifact.
💡 Suggested test
Add a unit test that stubs listArtifacts to return [{ id: undefined, name: "guardrail-artifact" }] and asserts that getRunEffectiveTokens returns 0 and emits the "Skipping guardrail artifact without an id" log message.
This pins both the guard and the new structured log emitted on that path.
| }); | ||
|
|
||
| const artifact = artifacts.find(item => matchesGuardrailArtifactName(item.name)); | ||
| const artifact = artifacts.find(item => item?.name && matchesGuardrailArtifactName(item.name)); |
There was a problem hiding this comment.
[/tdd] Changing the predicate from matchesGuardrailArtifactName(item.name) to item?.name && matchesGuardrailArtifactName(item.name) is a behaviour change bundled into logging work — artifacts with a falsy name are now silently skipped. No test covers this path.
💡 Suggested test
Add a case where listArtifacts returns [{ id: 1, name: "" }, { id: 2, name: "guardrail-artifact" }] and assert only the second artifact is inspected. This prevents the guard from regressing silently.
| runId: run.id, | ||
| runEffectiveTokens, | ||
| currentEffectiveTokens: totalEffectiveTokens, | ||
| threshold, |
There was a problem hiding this comment.
[/zoom-out] countedRunIds is re-serialised on every iteration of the inspection loop. With up to maxInspectableRuns runs, the emitted log volume grows O(n2) — the final log line could list dozens of IDs already reported in earlier iterations.
💡 Suggested fix
Log only the newly-added run ID and the current count rather than the full accumulated array each time:
logDailyGuardrail("Updated current ET state", {
runId: run.id,
runEffectiveTokens,
currentEffectiveTokens: totalEffectiveTokens,
threshold,
countedRunCount: countedRuns.length, // count, not the full array
});The complete list is already captured in the final "Completed ET inspection window" log, so the per-iteration array is redundant.
| serializedDetails = JSON.stringify(details); | ||
| } catch { | ||
| serializedDetails = JSON.stringify({ error: "failed to serialize log details" }); | ||
| } |
There was a problem hiding this comment.
[/improve-codebase-architecture] The catch block in formatDailyGuardrailLogMessage swallows the original serialisation error completely, making it hard to know why serialisation failed during debugging.
💡 Suggested improvement
} catch (err) {
serializedDetails = JSON.stringify({
error: "failed to serialize log details",
reason: err instanceof Error ? err.message : String(err),
});
}Adding reason preserves the intent of defensive logging while still surfacing enough context to diagnose circular-reference or BigInt serialisation failures.
| logDailyGuardrail("Received workflow runs page", { | ||
| page, | ||
| runCount: runs.length, | ||
| runIds: runs.map(run => run?.id).filter(Boolean), |
There was a problem hiding this comment.
[/zoom-out] runIds: runs.map(run => run?.id).filter(Boolean) can produce an array of up to 100 elements per paginated page. With MAX_WORKFLOW_RUN_PAGES pages this could generate megabytes of log output in a busy repo, and the IDs are already captured in the subsequent "Prepared candidate workflow runs" log.
💡 Suggested fix
Replace the full ID array with a count, matching the style used in nearby logs:
logDailyGuardrail("Received workflow runs page", {
page,
runCount: runs.length,
// runIds omitted — captured in candidateRunIds after filtering
});If per-page IDs are genuinely useful, consider capping the array: runs.slice(0, 20).map(r => r?.id).
| currentEffectiveTokens: 789, | ||
| }); | ||
| expect(exports.formatDailyGuardrailLogMessage("Completed ET inspection window")).toBe("[daily-workflow-et] Completed ET inspection window"); | ||
| }); |
There was a problem hiding this comment.
[/tdd] logDailyGuardrail is not exported and therefore cannot be directly unit-tested. The two new guard branches (item?.name && and !artifact.id) are also not covered. The single new test only verifies the formatter, leaving the runtime paths dark.
💡 Suggestions
- Export
logDailyGuardrail(or extract itscore.infocall behind a seam) so tests can assert it was called with the right message. - Add tests for the guard branches in a
getRunEffectiveTokens-level test: one whereartifact.nameis""and one whereartifact.idisundefined, each asserting the function returns0. - Split the two
formatDailyGuardrailLogMessageassertions into separateit()blocks — one for the details path, one for the no-details path — so failure messages are unambiguous.
The daily effective-token guardrail did not emit enough runtime detail to explain what it queried or why a run was counted or skipped. This change adds structured logs around workflow run discovery, artifact selection/download, and ET accumulation so workflow output is debuggable without re-running blindly.
What changed
[daily-workflow-et]prefix.Guardrails in the logging
Tests
Example