Refactor harness permission-denied handling into shared helper module by Copilot · Pull Request #36415 · github/gh-aw

Copilot · 2026-06-02T06:52:46Z

The three harness entrypoints (claude, copilot, codex) each carried near-identical permission-denied classification and payload code, creating parallel maintenance and drift risk. This change centralizes that pipeline so all engines use the same detection and escalation behavior.

Shared permission-denied module
- Added actions/setup/js/permission_denied_helpers.cjs with:
  - countPermissionDeniedIssues(output)
  - hasNumerousPermissionDeniedIssues(output)
  - extractDeniedCommands(output)
  - buildMissingToolPermissionIssuePayload(deniedCommands)

Harness deduplication
- Updated:
  - actions/setup/js/claude_harness.cjs
  - actions/setup/js/copilot_harness.cjs
  - actions/setup/js/codex_harness.cjs
- Removed in-file duplicated implementations and imported the shared helpers instead.
- Kept harness-specific retry/error policy logic in each harness unchanged.

Focused coverage for extracted logic
- Added actions/setup/js/permission_denied_helpers.test.cjs to validate shared behavior independently of any single harness.

const {
  countPermissionDeniedIssues,
  hasNumerousPermissionDeniedIssues,
  extractDeniedCommands,
  buildMissingToolPermissionIssuePayload,
} = require("./permission_denied_helpers.cjs");
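
A harness might chain the four helpers roughly as follows. This is only a sketch: the function name `detectPermissionEscalation`, the injected `helpers` object, and the control flow are assumptions made for illustration, not code from the PR. Only the four helper names come from the description above.

```javascript
"use strict";

/**
 * Hypothetical glue code (not from the PR): decide whether agent output
 * warrants a permission escalation and, if so, build the issue payload.
 * `helpers` is expected to expose the four functions named in the PR.
 */
function detectPermissionEscalation(output, helpers) {
  // Below the "numerous" threshold: nothing to escalate.
  if (!helpers.hasNumerousPermissionDeniedIssues(output)) {
    return null;
  }
  const deniedCommands = helpers.extractDeniedCommands(output);
  return {
    count: helpers.countPermissionDeniedIssues(output),
    payload: helpers.buildMissingToolPermissionIssuePayload(deniedCommands),
  };
}
```

Passing the helpers in as an argument keeps the sketch self-contained; in a real harness they would presumably come from the `require` call shown above.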

Co-authored-by: gh-aw-bot <259018956+gh-aw-bot@users.noreply.github.com>

Copilot

Pull request overview

This PR centralizes “permission denied” detection and escalation logic used by the claude, copilot, and codex harness entrypoints into a shared helper module, reducing duplication and drift risk across engines.

Changes:

- Added permission_denied_helpers.cjs providing shared counting, threshold detection, denied-command extraction, and issue-payload construction utilities.
- Updated claude_harness.cjs, copilot_harness.cjs, and codex_harness.cjs to import the shared helpers and removed the duplicated in-file implementations.
- Added permission_denied_helpers.test.cjs to cover the extracted helper behavior independently of any single harness.

Show a summary per file

File	Description
actions/setup/js/permission_denied_helpers.cjs	Introduces shared helper functions for permission-denied classification, extraction, and payload creation.
actions/setup/js/permission_denied_helpers.test.cjs	Adds focused unit tests validating the shared helper behaviors.
actions/setup/js/claude_harness.cjs	Replaces duplicated permission-denied logic with imports from the shared helpers.
actions/setup/js/copilot_harness.cjs	Replaces duplicated permission-denied logic with imports from the shared helpers.
actions/setup/js/codex_harness.cjs	Replaces duplicated permission-denied logic with imports from the shared helpers.

Copilot's findings

Files reviewed: 5/5 changed files
Comments generated: 0

github-actions · 2026-06-02T13:28:22Z

✅ PR Code Quality Reviewer completed the code quality review.

github-actions · 2026-06-02T13:28:27Z

🧪 Test Quality Sentinel completed test quality analysis.

github-actions · 2026-06-02T13:28:29Z

🧠 Matt Pocock Skills Reviewer has completed the skills-based review. ✅

github-actions · 2026-06-02T13:28:34Z

✅ Design Decision Gate 🏗️ completed the design decision gate check.

No ADR enforcement needed: PR #36415 does not have the 'implementation' label and has 0 new lines of code in business logic directories (threshold 100, no custom config).

github-actions · 2026-06-02T13:33:53Z

🧪 Test Quality Sentinel Report

✅ Test Quality Score: 85/100 — Excellent

Analyzed 4 test(s): 4 design, 0 implementation, 0 guideline violation(s).

📊 Metrics & Test Classification (4 tests analyzed)

Metric	Value
New/modified tests analyzed	4
✅ Design tests (behavioral contracts)	4 (100%)
⚠️ Implementation tests (low value)	0 (0%)
Tests with error/edge cases	2 (50%)
Duplicate test clusters	0
Test inflation detected	No (33 test lines / 77 production lines ≈ 0.43:1)
🚨 Coding-guideline violations	0

Test Classification Details

Test	File	Classification	Issues Detected
`counts repeated permission-denied signals`	`permission_denied_helpers.test.cjs`	✅ Design	No test for empty/null input (returns 0)
`detects numerous permission-denied issues at threshold`	`permission_denied_helpers.test.cjs`	✅ Design (boundary)	Only tests `true` path; missing below-threshold (`false`) case
`extracts denied commands from pipe-marked output`	`permission_denied_helpers.test.cjs`	✅ Design	No test for empty input or output with no pipe markers
`builds missing_tool payload with default denied commands`	`permission_denied_helpers.test.cjs`	✅ Design (edge case)	Only tests empty `deniedCommands`; missing test with actual commands populated

Language Support

Tests analyzed:

🐹 Go (*_test.go): 0 tests
🟨 JavaScript (*.test.cjs): 4 tests (vitest)

⚠️ Flagged Tests — Minor Suggestions (not blocking) (4 issue(s))

💡 `counts repeated permission-denied signals`

Classification: Design test
Minor gap: No test for empty string input (countPermissionDeniedIssues("") should return 0) or null/undefined input. The production code handles this with if (!output) return 0 but it's untested.
Suggested addition: expect(countPermissionDeniedIssues("")).toBe(0) and expect(countPermissionDeniedIssues(null)).toBe(0).

💡 `detects numerous permission-denied issues at threshold`

Classification: Design test (boundary)
Minor gap: Only tests the true path (exactly at threshold = 3). A test for below-threshold input would confirm the false return and fully specify the contract.
Suggested addition: expect(hasNumerousPermissionDeniedIssues("permission denied\npermission denied")).toBe(false).

💡 `extracts denied commands from pipe-marked output`

Classification: Design test
Minor gap: No test for empty input (extractDeniedCommands("") → []) or output that contains "permission denied" but no preceding pipe-marked command lines.
Suggested addition: Test the empty-input and no-extractable-commands edge cases.

💡 `builds missing_tool payload with default denied commands`

Classification: Design test
Minor gap: Only tests with no deniedCommands argument. The production code has a branch for when deniedCommands is non-empty — that branch is untested.
Suggested addition: buildMissingToolPermissionIssuePayload(["go version 2>&1"]) and assert denied_commands contains the command.
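
The counting and threshold edge cases flagged above can be tried against a minimal stand-in for the two counting helpers. The regex and threshold are copied from the diff quoted later in this review, and the `if (!output) return 0` guard is the one the report mentions. The rest of each function body is an assumption, not the PR's actual code.

```javascript
"use strict";

// Pattern and threshold as quoted from permission_denied_helpers.cjs in this review.
const PERMISSION_DENIED_PATTERN = /\b(?:permission denied|permissions denied|EACCES|EPERM)\b/gi;
const NUMEROUS_PERMISSION_DENIED_THRESHOLD = 3;

// Assumed body: count every match, treating empty/null/undefined output as zero.
function countPermissionDeniedIssues(output) {
  if (!output) return 0;
  const matches = output.match(PERMISSION_DENIED_PATTERN);
  return matches ? matches.length : 0;
}

// Assumed body: "numerous" means the count reaches the threshold.
function hasNumerousPermissionDeniedIssues(output) {
  return countPermissionDeniedIssues(output) >= NUMEROUS_PERMISSION_DENIED_THRESHOLD;
}

console.log(countPermissionDeniedIssues(""));   // 0
console.log(countPermissionDeniedIssues(null)); // 0
console.log(hasNumerousPermissionDeniedIssues("permission denied\npermission denied")); // false
console.log(hasNumerousPermissionDeniedIssues("EACCES\nEPERM\npermission denied"));     // true
```

The `>=` comparison is chosen to agree with the existing test, which expects `true` at exactly three matches.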

Verdict

✅ Check passed. 0% of new tests are implementation tests (threshold: 30%). All 4 new tests verify observable behavioral contracts. The suggestions above are minor gaps (missing edge cases for empty/null inputs and the non-empty deniedCommands path) but do not affect the pass verdict.

📖 Understanding Test Classifications

Design Tests (High Value) verify what the system does:

Assert on observable outputs, return values, or state changes
Cover error paths and boundary conditions
Would catch a behavioral regression if deleted
Remain valid even after internal refactoring

Implementation Tests (Low Value) verify how the system does it:

Assert on internal function calls (mocking internals)
Only test the happy path with typical inputs
Break during legitimate refactoring even when behavior is correct
Give false assurance: they pass even when the system is wrong

Goal: Shift toward tests that describe the system's behavioral contract — the promises it makes to its users and collaborators.

References: §26822143887

🧪 Test quality analysis by Test Quality Sentinel · sonnet46 947.7K

github-actions

✅ Test Quality Sentinel: 85/100. Test quality is excellent — 0% of new tests are implementation tests (threshold: 30%). All 4 tests verify observable behavioral contracts of the new permission_denied_helpers.cjs module.

github-actions

One blocker: test file uses ESM syntax in a `.cjs` module

The refactor itself is clean — the extraction is correct and the shared helper logic is identical to what was in all three harnesses. The only issue blocking merge is the test file.

permission_denied_helpers.test.cjs uses ESM import statements (import { describe, it, expect } from "vitest", import { createRequire } from "module"). Node.js treats .cjs files as CommonJS unconditionally, so this file will throw a SyntaxError before any test runs. The test coverage this PR adds effectively does not exist.

Fix: replace the ESM imports with require("vitest") and require("./permission_denied_helpers.cjs") directly.

🔎 Code quality review by PR Code Quality Reviewer · sonnet46 1.5M

github-actions · 2026-06-02T13:34:22Z

@@ -0,0 +1,33 @@
+import { describe, it, expect } from "vitest";


Tests will never run: ESM import syntax in a .cjs file causes a SyntaxError at load time — Node.js unconditionally treats .cjs as CommonJS regardless of Vitest transforms.

💡 Suggested fix

Replace the ESM imports with CJS equivalents:

// @ts-check
"use strict";

const { describe, it, expect } = require("vitest");
const {
  countPermissionDeniedIssues,
  hasNumerousPermissionDeniedIssues,
  extractDeniedCommands,
  buildMissingToolPermissionIssuePayload,
} = require("./permission_denied_helpers.cjs");

Drop createRequire(import.meta.url) (also ESM) and use require() directly — consistent with every other *.cjs test file in the project. The current file will produce zero test results in any runner that honours the .cjs extension contract.

github-actions

Skills-Based Review 🧠

Applied /tdd and /improve-codebase-architecture — approving with minor suggestions.

📋 Key Themes & Highlights

Key Themes

Test coverage gaps: Two small but meaningful test cases are missing — a below-threshold false-case for hasNumerousPermissionDeniedIssues, and a non-empty denied_commands case for buildMissingToolPermissionIssuePayload.
Stateful regex footgun: The module-level /gi regex is safe with the current String.match() usage, but becomes a subtle trap for any future caller using .test() in a loop.

Positive Highlights

✅ Excellent scoping — only the duplicated logic moved, harness-specific retry policy untouched
✅ Clean, focused module with JSDoc on every function
✅ Test file follows established ESM-over-CJS pattern correctly
✅ The refactor eliminates ~198 lines of triplicated code with zero behaviour change — a textbook /improve-codebase-architecture win

🧠 Reviewed using Matt Pocock's skills by Matt Pocock Skills Reviewer · sonnet46 1.6M

github-actions · 2026-06-02T13:34:27Z

+  it("detects numerous permission-denied issues at threshold", () => {
+    const output = "permission denied\npermission denied\npermission denied";
+    expect(hasNumerousPermissionDeniedIssues(output)).toBe(true);
+  });


[/tdd] Only the at-threshold (true) case is tested — missing a below-threshold assertion to guard against an accidental off-by-one if the threshold ever changes.

💡 Suggested addition

it("returns false when below the threshold", () => {
  const output = "permission denied\npermission denied"; // 2 < 3
  expect(hasNumerousPermissionDeniedIssues(output)).toBe(false);
});

Without a false-case test, a regression that lowers the threshold or removes the check would still pass the suite.

github-actions · 2026-06-02T13:34:27Z

+      alternatives: "Verify token scopes, repository permissions, and MCP/tool access configuration.",
+      denied_commands: [],
+    });
+  });


[/tdd] buildMissingToolPermissionIssuePayload is only tested with no arguments (empty denied_commands). The non-empty path — the one that fires in production — is untested.

💡 Suggested addition

it("builds missing_tool payload with denied commands list", () => {
  const payload = JSON.parse(buildMissingToolPermissionIssuePayload(["git push", "npm publish"]));
  expect(payload.denied_commands).toEqual(["git push", "npm publish"]);
});

This guards the deniedCommands && deniedCommands.length > 0 conditional, which is the production path when extractDeniedCommands finds matches.
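
The branch in question can be illustrated with a stand-in builder. Only the `alternatives` text, the `denied_commands` field, the JSON-string return value, and the `deniedCommands && deniedCommands.length > 0` check are taken from this review. The name `buildPayloadSketch` is hypothetical, and the real payload may carry further fields that are not visible here and are deliberately left out.

```javascript
"use strict";

// Hypothetical stand-in for buildMissingToolPermissionIssuePayload.
// It returns a JSON string, matching the JSON.parse call in the suggested test.
function buildPayloadSketch(deniedCommands) {
  const hasCommands = deniedCommands && deniedCommands.length > 0;
  return JSON.stringify({
    alternatives: "Verify token scopes, repository permissions, and MCP/tool access configuration.",
    // Non-empty input is copied through; missing or empty input falls back to [].
    denied_commands: hasCommands ? [...deniedCommands] : [],
  });
}

console.log(JSON.parse(buildPayloadSketch()).denied_commands);
console.log(JSON.parse(buildPayloadSketch(["git push", "npm publish"])).denied_commands);
```

The first call exercises the default path that the existing test covers; the second exercises the populated path the reviewer asks to have covered.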

github-actions · 2026-06-02T13:34:27Z

+"use strict";
+
+const PERMISSION_DENIED_PATTERN = /\b(?:permission denied|permissions denied|EACCES|EPERM)\b/gi;
+const NUMEROUS_PERMISSION_DENIED_THRESHOLD = 3;


[/improve-codebase-architecture] The module-level /gi regex is stateful — its lastIndex advances on every .test() or .exec() call. It's safe today because countPermissionDeniedIssues only calls String.match() (which resets lastIndex), but any future caller using .test() in a loop will get intermittent misses.

💡 Suggestion: use a factory function instead

// Instead of a module-level regex constant:
const PERMISSION_DENIED_PATTERN = () => /\b(?:permission denied|permissions denied|EACCES|EPERM)\b/gi;

// Usage:
const matches = output.match(PERMISSION_DENIED_PATTERN());

Or, since String.match() is the only consumer and it resets lastIndex, at minimum add a comment noting the statefulness risk to warn future callers.
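
The statefulness described here is standard behavior for regexes with the `g` flag and can be reproduced in isolation. The pattern below is shortened for illustration, and the names `SHARED` and `makePattern` are hypothetical.

```javascript
"use strict";

const SHARED = /\b(?:permission denied|EACCES|EPERM)\b/gi; // shortened pattern, illustration only
const line = "permission denied";

// With the g flag, .test() resumes from lastIndex, so repeated calls on
// the same input alternate between hit and miss.
const first = SHARED.test(line);  // true; lastIndex is now 17
const second = SHARED.test(line); // false; search started at index 17, lastIndex resets to 0
console.log(first, second);

// String.prototype.match with a global regex starts from index 0
// regardless of lastIndex and leaves lastIndex at 0 afterwards.
SHARED.lastIndex = 5;
const matchCount = line.match(SHARED).length;
console.log(matchCount, SHARED.lastIndex);

// Factory variant: every call produces a fresh regex with lastIndex 0.
const makePattern = () => /\b(?:permission denied|EACCES|EPERM)\b/gi;
const fresh = [makePattern().test(line), makePattern().test(line)];
console.log(fresh);
```

The factory trades a small allocation per call for freedom from shared state, which is why it is safe for callers that use `.test()` or `.exec()`.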

pelikhan · 2026-06-02T14:04:15Z

@copilot review all comments
Fix js https://github.com/github/gh-aw/actions/runs/26822143918/job/79082057694
Apply copilot-review skill

Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>

Copilot · 2026-06-02T14:18:44Z

@copilot review all comments
Fix js https://github.com/github/gh-aw/actions/runs/26822143918/job/79082057694
Apply copilot-review skill

Addressed in dea10f2.

I investigated the referenced js-typecheck failure and fixed actions/setup/js/copilot_harness.cjs by making the SDK/CLI execution branches explicit so SDK mode preserves the existing prompt invariant while satisfying typecheck. Re-ran npm run typecheck, targeted test permission_denied_helpers.test.cjs, npm run lint:cjs, and parallel validation; all passed.

Initial plan

fb2d992

Copilot AI assigned Copilot and gh-aw-bot Jun 2, 2026

Copilot started work on behalf of gh-aw-bot June 2, 2026 06:52 View session

Copilot AI linked an issue Jun 2, 2026 that may be closed by this pull request

[duplicate-code] Permission-denied handling duplicated across harnesses #36414

Closed

Copilot AI and others added 2 commits June 2, 2026 07:03

refactor: centralize permission-denied harness helpers

b149bb9

Co-authored-by: gh-aw-bot <259018956+gh-aw-bot@users.noreply.github.com>

docs: make shared permission helper comments engine-neutral

4383940

Co-authored-by: gh-aw-bot <259018956+gh-aw-bot@users.noreply.github.com>

Copilot AI changed the title ~~[WIP] Refactor permission-denied handling to eliminate duplication~~ Refactor harness permission-denied handling into shared helper module Jun 2, 2026

Copilot finished work on behalf of gh-aw-bot June 2, 2026 07:06

Copilot AI requested a review from gh-aw-bot June 2, 2026 07:06

github-actions Bot mentioned this pull request Jun 2, 2026

[aw] No-Op Runs #36156

Closed

pelikhan marked this pull request as ready for review June 2, 2026 13:14

Copilot AI review requested due to automatic review settings June 2, 2026 13:14

Copilot started reviewing on behalf of pelikhan June 2, 2026 13:14 View session

Copilot AI reviewed Jun 2, 2026

github-actions Bot approved these changes Jun 2, 2026

github-actions Bot requested changes Jun 2, 2026

github-actions Bot approved these changes Jun 2, 2026

Copilot started work on behalf of pelikhan June 2, 2026 14:07 View session

Copilot AI and others added 2 commits June 2, 2026 14:17

fix: resolve copilot harness SDK prompt typecheck error

7765734

Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>

refactor: keep SDK prompt invariant explicit in copilot harness

dea10f2

Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>

Copilot finished work on behalf of pelikhan June 2, 2026 14:19

Copilot AI requested a review from pelikhan June 2, 2026 14:19

Merge branch 'main' into copilot/duplicate-code-permission-denied

be2dce5

pelikhan merged commit da0ba04 into main Jun 2, 2026
12 checks passed

pelikhan deleted the copilot/duplicate-code-permission-denied branch June 2, 2026 14:33

4 participants
