Improve PR Sous Chef engine-failure context for AWF startup crashes#34524
Conversation
Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>
Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>
There was a problem hiding this comment.
Pull request overview
Improves PR Sous Chef’s engine-failure context extraction so AWF startup crashes surface actionable root-cause lines in generated issue bodies instead of low-signal fallback output.
Changes:
- Extend
buildEngineFailureContext()to capture AWF-style bracketed error logs ([ERROR] ...) and AWF docker-compose dependency startup failures (dependency failed to start: ...). - Add a regression test to ensure the extracted failure context includes the AWF root-cause messages and does not include fallback-only noise like
stdout/stderr: undefined.
Show a summary per file
| File | Description |
|---|---|
| actions/setup/js/handle_agent_failure.cjs | Adds new AWF-specific parsing patterns to engine failure context extraction. |
| actions/setup/js/handle_agent_failure.test.cjs | Adds a focused test case covering AWF startup failure log shapes and expected extracted context. |
Copilot's findings
Tip
Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
Comments suppressed due to low confidence (1)
actions/setup/js/handle_agent_failure.cjs:1594
- The function header comment says it matches known patterns like
ERROR:,Error:,Fatal:,panic:, etc., but it now also matches AWF-specific patterns ([ERROR] ...anddependency failed to start: ...). Please update the doc comment to include these patterns so future changes don’t accidentally regress them.
// AWF docker-compose dependency failures surface this root-cause line without
// an explicit log-level prefix.
const awfDependencyFailureMatch = line.match(/^dependency failed to start:\s*(.+)$/);
if (awfDependencyFailureMatch) {
errorMessages.add(`dependency failed to start: ${awfDependencyFailureMatch[1].trim()}`);
continue;
}
- Files reviewed: 2/2 changed files
- Comments generated: 2
| // AWF runtime logs: "[ERROR] <message>" | ||
| const bracketErrorMatch = line.match(/^\[ERROR\]\s*(.+)$/); | ||
| if (bracketErrorMatch) { | ||
| errorMessages.add(bracketErrorMatch[1].trim()); |
| it("extracts AWF startup errors from bracketed logs", () => { | ||
| const lines = [ | ||
| " Container awf-squid Error", | ||
| "dependency failed to start: container awf-squid is unhealthy", | ||
| "[ERROR] Failed to start containers: Error: Command failed with exit code 1: docker compose up -d --pull never", |
|
Please address the unresolved review feedback on
|
|
✅ PR Code Quality Reviewer completed the code quality review. |
|
✅ Design Decision Gate 🏗️ completed the design decision gate check. No ADR enforcement needed: PR #34524 does not have the 'implementation' label and has 0 new lines of code in default business logic directories (well under the 100-line threshold). |
|
🧪 Test Quality Sentinel completed test quality analysis. |
|
🧠 Matt Pocock Skills Reviewer has completed the skills-based review. ✅ |
There was a problem hiding this comment.
🔎 Code quality review by PR Code Quality Reviewer · sonnet46 701.5K
| } | ||
|
|
||
| // AWF docker-compose dependency failures surface this root-cause line without | ||
| // an explicit log-level prefix. |
There was a problem hiding this comment.
Redundant string reconstruction: the template literal re-assembles the full original line unnecessarily.
💡 Suggested simplification
Instead of:
errorMessages.add(`dependency failed to start: ${awfDependencyFailureMatch[1].trim()}`);Use the capture group directly or just trim the original line:
errorMessages.add(line.trim());
// or
errorMessages.add(`dependency failed to start: ${awfDependencyFailureMatch[1].trim()}`);Actually the current form is fine, but note the inconsistency: the [ERROR] handler strips its prefix (bracketErrorMatch[1].trim()), while this handler preserves it (dependency failed to start: ...). If both are intentional, a brief comment on why [ERROR] is stripped would help readers understand the difference.
🧪 Test Quality Sentinel Report✅ Test Quality Score: 100/100 — Excellent
📊 Metrics & Test Classification (1 test analyzed)
Test Classification Details
Language SupportTests analyzed:
Verdict
📖 Understanding Test ClassificationsDesign Tests (High Value) verify what the system does:
Implementation Tests (Low Value) verify how the system does it:
Goal: Shift toward tests that describe the system's behavioral contract — the promises it makes to its users and collaborators.
|
There was a problem hiding this comment.
Skills-Based Review 🧠
Applied /diagnose and /tdd — clean, well-scoped bug fix. Approving with minor suggestions.
📋 Key Themes & Highlights
Key Themes
- Pattern breadth: The
[ERROR]prefix is common across many tools — worth a short comment confirming it's AWF-specific in this context. - Test assertion durability:
not.toContain("Last agent output")works today but is fragile if the heading changes. - Minor reconstruction quirk: The dependency match rebuilds the prefix string rather than using
match[0]— safe but slightly roundabout.
Positive Highlights
- ✅ Purely additive — zero risk of breaking existing extraction paths
- ✅ Regression test covers both the positive (root-cause lines included) and negative (fallback noise excluded) cases
- ✅ Consistent with existing pattern-matching structure in
buildEngineFailureContext() - ✅ PR description is clear and includes concrete before/after examples
🧠 Reviewed using Matt Pocock's skills by Matt Pocock Skills Reviewer · sonnet46 1.2M
| continue; | ||
| } | ||
|
|
||
| // AWF runtime logs: "[ERROR] <message>" |
There was a problem hiding this comment.
[/diagnose] The [ERROR] pattern is broad — any log line from any tool/library that uses [ERROR] prefix (e.g., npm, Docker BuildKit, other CLI tools) will be captured as an engine failure context line.
💡 Consider making the pattern more specific
If AWF startup logs have a consistent prefix (e.g., [awf] or [compose]), tightening the pattern would prevent false positives from third-party tooling:
// More specific — only AWF logs
const bracketErrorMatch = line.match(/^\[ERROR\]\s*(.+)$/);
// Could be guarded by checking surrounding context or using a more specific prefixThat said, if the [ERROR] prefix is reliably AWF-specific in practice (e.g., only gh-aw's own logger emits it at startup), this is fine as-is. Worth a short comment confirming the assumption.
| // AWF docker-compose dependency failures surface this root-cause line without | ||
| // an explicit log-level prefix. | ||
| const awfDependencyFailureMatch = line.match(/^dependency failed to start:\s*(.+)$/); | ||
| if (awfDependencyFailureMatch) { |
There was a problem hiding this comment.
[/diagnose] Minor: the dependency failure message is reconstructed from the capture group rather than using line.trim() or match[0]. This normalises whitespace between the prefix and container name, which is intentional but subtle.
💡 Simpler alternative
Using match[0] (the full matched string) or line.trim() avoids the reconstruction and is more obviously correct:
// Option A — use the full match (already trimmed by regex anchor)
errorMessages.add(awfDependencyFailureMatch[0]);
// Option B — trim the original line
errorMessages.add(line.trim());The current approach is safe, but a comment noting the whitespace normalisation intent would help the next reader.
| expect(result).not.toContain("stderr: undefined"); | ||
| }); | ||
|
|
||
| it("detects Fatal: prefix pattern", () => { |
There was a problem hiding this comment.
[/tdd] not.toContain("Last agent output") ties the test to a specific heading string — if that section heading changes the assertion silently becomes weaker without failing.
💡 Consider a more structural assertion
If the intent is to assert the fallback path was not taken, a positive assertion on the structured output (e.g. that only the Engine Failure section is present and not the fallback block) is more durable:
// Assert the fallback section is absent by checking its structural marker
expect(result).not.toMatch(/Last agent output|Fallback/i);
// or: assert the result does NOT contain the fallback sentinel string you ownAlternatively, a short comment explaining what "Last agent output" refers to (the fallback section heading) would make the intent clear for future maintainers.
|
@copilot review all comments and review comments |
Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>
Addressed. I reviewed the feedback and pushed
Validated with |
|
``
|
|
``
|
|
|
PR Sous Chef failures were being reported as generic Copilot termination with low-signal output (
stdout/stderr: undefined), masking the actual AWF startup cause. This change makes failure context extraction recognize AWF startup error patterns so issues surface actionable root-cause lines.Failure-context parsing updates
buildEngineFailureContext()to treat AWF bracketed error logs as first-class engine errors:^\[ERROR\]\s*(.+)$^dependency failed to start:\s*(.+)$awf-squid) instead of fallback noise.Regression coverage for PR Sous Chef failure shape
handle_agent_failure.test.cjswith real failure-style lines:dependency failed to start: container awf-squid is unhealthy[ERROR] Failed to start containers: ... docker compose up -d --pull neverstdout/stderr: undefined).