Detect and surface excessive Copilot tool-denial guardrail failures #37363

Merged
pelikhan merged 3 commits into main from copilot/daily-spdd-spec-planner-fix on Jun 6, 2026

Conversation

Contributor

Copilot AI commented Jun 6, 2026

The Daily SPDD Spec Planner failure was caused by repeated workflow tool permission denials that tripped the Copilot SDK denial threshold, but this condition was only surfaced as a generic engine failure. This change adds a first-class signal for that guardrail stop and renders it explicitly in failure issues/comments.

  • Copilot SDK: emit explicit guardrail event
    • Added a dedicated JSONL event in events.jsonl when max denials are reached:
      • type: "guard.tool_denials_exceeded"
      • payload includes denialCount, threshold, and reason.
  • Conclusion job: detect and classify denial-threshold failures
    • handle_agent_failure.cjs now reads Copilot session events.jsonl and extracts guard.tool_denials_exceeded.
    • Adds tool_denials_exceeded to failure-category matching for precise issue reuse/dedup behavior.
    • Scoped to Copilot engine paths to avoid cross-engine false positives.
  • Failure rendering: dedicated user-facing context
    • Added a new templated section for this condition in both:
      • agent_failure_issue.md
      • agent_failure_comment.md
    • New template: tool_denials_exceeded_context.md with clear guardrail semantics and actionable remediation.
  • Tests
    • Added coverage for SDK event emission and for loading/rendering the new failure context in conclusion handling.

Example event written to events.jsonl when the threshold is reached:

{"type":"guard.tool_denials_exceeded","timestamp":"...","data":{"denialCount":5,"threshold":5,"reason":"permission denied: read"}}
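
The emission step described above can be sketched as follows. This is a minimal illustration, not the code in copilot_sdk_driver.cjs: the helper names `buildToolDenialsExceededEvent` and `appendEvent` are hypothetical, and only the event shape (type, timestamp, and a data payload with denialCount, threshold, and reason) is taken from the example event.

```javascript
const fs = require("fs");
const os = require("os");
const path = require("path");

// Hypothetical helper: builds the guardrail event in the shape shown in the PR description.
function buildToolDenialsExceededEvent(denialCount, threshold, reason) {
  return {
    type: "guard.tool_denials_exceeded",
    timestamp: new Date().toISOString(),
    data: { denialCount, threshold, reason },
  };
}

// Hypothetical helper: appends one event as a single JSONL line (one JSON object per line).
function appendEvent(eventsPath, event) {
  fs.appendFileSync(eventsPath, JSON.stringify(event) + "\n", "utf8");
}

// Usage: write the event to a throwaway events.jsonl in a temp directory.
const demoDir = fs.mkdtempSync(path.join(os.tmpdir(), "events-"));
appendEvent(
  path.join(demoDir, "events.jsonl"),
  buildToolDenialsExceededEvent(5, 5, "permission denied: read")
);
```

Appending one serialized object per line keeps the file readable by any line-oriented JSONL consumer, including the conclusion job.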

Copilot AI linked an issue Jun 6, 2026 that may be closed by this pull request
Copilot AI and others added 2 commits June 6, 2026 17:13
Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>
Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>
Copilot AI changed the title from "[WIP] Fix daily SPDD Spec Planner workflow failure" to "Detect and surface excessive Copilot tool-denial guardrail failures" Jun 6, 2026
Copilot AI requested a review from pelikhan June 6, 2026 17:15
@pelikhan pelikhan marked this pull request as ready for review June 6, 2026 17:35
Copilot AI review requested due to automatic review settings June 6, 2026 17:35
@pelikhan pelikhan merged commit 79b7da0 into main Jun 6, 2026
@pelikhan pelikhan deleted the copilot/daily-spdd-spec-planner-fix branch June 6, 2026 17:35
Contributor

Copilot AI left a comment

Pull request overview

This PR adds an explicit, first-class signal for Copilot SDK “max tool denial” guardrail stops and surfaces that condition clearly in generated failure issues/comments, so repeated tool permission denials aren’t misclassified as generic engine failures.

Changes:

  • Emit a dedicated guard.tool_denials_exceeded JSONL event from the Copilot SDK driver when the denial threshold is reached.
  • In the conclusion handling, detect that event from Copilot session events.jsonl, classify it as tool_denials_exceeded, and render a dedicated failure context section.
  • Add tests covering event emission, loading, and rendering of the new context.
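
The detection and classification step in the second bullet might look roughly like the sketch below. It assumes the loader flattens the event's data payload into top-level denialCount, threshold, and reason fields. The function names and the "generic" fallback category are hypothetical and are not taken from handle_agent_failure.cjs.

```javascript
const fs = require("fs");

const GUARD_EVENT_TYPE = "guard.tool_denials_exceeded";

// Parses JSONL text and returns the guardrail events with their payload flattened.
function parseToolDenialEvents(jsonlText) {
  const events = [];
  for (const line of jsonlText.split("\n")) {
    const trimmed = line.trim();
    if (!trimmed) continue;
    let parsed;
    try {
      parsed = JSON.parse(trimmed);
    } catch {
      continue; // Skip malformed lines instead of failing the whole conclusion job.
    }
    if (!parsed || parsed.type !== GUARD_EVENT_TYPE) continue;
    const data = parsed.data || {};
    events.push({
      denialCount: data.denialCount,
      threshold: data.threshold,
      reason: data.reason,
    });
  }
  return events;
}

// Reads events.jsonl if it exists; a missing file simply means no guardrail events.
function loadToolDenialEvents(eventsPath) {
  if (!fs.existsSync(eventsPath)) return [];
  return parseToolDenialEvents(fs.readFileSync(eventsPath, "utf8"));
}

// Hypothetical classifier: "generic" stands in for whatever other categories exist.
function classifyFailure(toolDenialEvents) {
  return toolDenialEvents.length > 0 ? "tool_denials_exceeded" : "generic";
}
```

Tolerating malformed or unrelated lines matters here because events.jsonl also carries every other session event, and a parse error should not mask the original failure.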
| File | Description |
| --- | --- |
| actions/setup/md/tool_denials_exceeded_context.md | New user-facing markdown fragment explaining the guardrail and remediation. |
| actions/setup/md/agent_failure_issue.md | Adds {tool_denials_exceeded_context} into the failure issue template. |
| actions/setup/md/agent_failure_comment.md | Adds {tool_denials_exceeded_context} into the failure comment template. |
| actions/setup/js/handle_agent_failure.test.cjs | Adds tests for loading the guardrail event and rendering its context. |
| actions/setup/js/handle_agent_failure.cjs | Implements event loading, failure categorization, and context rendering for tool-denials-exceeded. |
| actions/setup/js/copilot_sdk_driver.test.cjs | Extends test to assert the driver emits the guardrail JSONL event. |
| actions/setup/js/copilot_sdk_driver.cjs | Emits guard.tool_denials_exceeded when the SDK denial threshold is hit. |
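
To illustrate how a `{tool_denials_exceeded_context}` placeholder could be filled in the issue and comment templates, here is a small sketch. It assumes single-brace `{name}` placeholders, as the template names above suggest. `renderPlaceholders` is a hypothetical stand-in, not the repository's `renderTemplate`, whose exact behavior is not shown in this PR page.

```javascript
// Hypothetical stand-in renderer: replaces {name} placeholders with provided values
// and leaves unknown placeholders untouched.
function renderPlaceholders(template, values) {
  return template.replace(/\{(\w+)\}/g, (match, key) =>
    Object.prototype.hasOwnProperty.call(values, key) ? String(values[key]) : match
  );
}

// Illustrative template text only; the real agent_failure_issue.md content is not shown here.
const issueTemplate = "## Agent failure\n{tool_denials_exceeded_context}\n";

// When the guardrail fired, the context section is injected.
const withContext = renderPlaceholders(issueTemplate, {
  tool_denials_exceeded_context: "\n**Excessive Tool Denials**: 5/5 permission denials.",
});

// When it did not fire, an empty string makes the section disappear.
const withoutContext = renderPlaceholders(issueTemplate, {
  tool_denials_exceeded_context: "",
});
```

Passing an empty string for the placeholder lets one template serve both the guardrail and non-guardrail cases.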

Copilot's findings

  • Files reviewed: 7/7 changed files
  • Comments generated: 2

Comment on lines +1128 to +1143
  const latestEvent = events[events.length - 1];
  const denialCount = String(latestEvent.denialCount);
  const threshold = String(latestEvent.threshold);
  const reason = latestEvent.reason || "permission denied by workflow tool permissions";

  try {
    const templatePath = getPromptPath("tool_denials_exceeded_context.md");
    const template = fs.readFileSync(templatePath, "utf8");
    return (
      "\n" +
      renderTemplate(template, {
        denial_count: denialCount,
        threshold,
        reason: `\`${reason}\``,
        workflow_id: workflowId || "the workflow",
      })
Comment on lines +1147 to +1149
`\n**⚠️ Excessive Tool Denials**: The Copilot SDK stopped the session after ${denialCount}/${threshold} permission denials.\n\n` +
`**Last denied request:** \`${reason}\`\n\n` +
"This is a guardrail stop (`guard.tool_denials_exceeded`) and indicates the workflow's allowed tool set does not match the prompt's requested actions.\n"
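
The two excerpts above read as a template-first render with an inline fallback. A self-contained sketch of that pattern follows. It assumes the second excerpt is the fallback used when the template cannot be read, which the truncated diff does not confirm. `buildToolDenialsContext` and its `render` parameter are hypothetical names.

```javascript
const fs = require("fs");

// Hypothetical sketch: render the guardrail context from a template file,
// falling back to an inline message if the template cannot be loaded.
// `render` is any function (template, values) => string.
function buildToolDenialsContext(events, templatePath, render, workflowId) {
  if (!Array.isArray(events) || events.length === 0) return "";

  // Use the most recent guardrail event, as in the excerpt above.
  const latestEvent = events[events.length - 1];
  const denialCount = String(latestEvent.denialCount);
  const threshold = String(latestEvent.threshold);
  const reason = latestEvent.reason || "permission denied by workflow tool permissions";

  try {
    const template = fs.readFileSync(templatePath, "utf8");
    return (
      "\n" +
      render(template, {
        denial_count: denialCount,
        threshold,
        reason: `\`${reason}\``,
        workflow_id: workflowId || "the workflow",
      })
    );
  } catch (error) {
    // Assumed fallback: keep the failure report useful even without the template.
    return (
      `\n**⚠️ Excessive Tool Denials**: The Copilot SDK stopped the session after ${denialCount}/${threshold} permission denials.\n\n` +
      `**Last denied request:** \`${reason}\`\n`
    );
  }
}
```

Returning an empty string when no guardrail event exists keeps the surrounding issue and comment templates unchanged for every other failure type.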

Development

Successfully merging this pull request may close these issues.

[aw] Daily SPDD Spec Planner failed

3 participants