refactor(workflow): decompose Claude allowed-tools assembly to reduce function complexity#35812
Conversation
Co-authored-by: gh-aw-bot <259018956+gh-aw-bot@users.noreply.github.com>
Co-authored-by: gh-aw-bot <259018956+gh-aw-bot@users.noreply.github.com>
Co-authored-by: gh-aw-bot <259018956+gh-aw-bot@users.noreply.github.com>
There was a problem hiding this comment.
Pull request overview
Refactors computeAllowedClaudeToolsString in pkg/workflow/claude_tools.go into a series of small, focused helper functions to reduce function complexity, with new helper-level unit tests added for the extracted wildcard-detection and sandbox-path-normalization logic. The public contract (--allowed-tools string, sorted and deduplicated) is preserved.
Changes:
- Decomposed the monolithic
computeAllowedClaudeToolsStringinto ~12 focused helpers (preparation, collection, top-level/MCP/cache-memory/sandbox/safe-outputs/mcp-scripts expansion, dedupe). - Extracted
hasBashWildcard,isClaudeToolName,normalizeSandboxWritablePattern,getOrCreateToolMap, andappendIfMissingutilities, removing the priorgoto-based control flow for Bash wildcard handling. - Added
claude_tools_helpers_test.gocoveringhasBashWildcardandnormalizeSandboxWritablePattern.
Show a summary per file
| File | Description |
|---|---|
| pkg/workflow/claude_tools.go | Breaks computeAllowedClaudeToolsString into composable helpers while preserving the sorted/deduped output contract. |
| pkg/workflow/claude_tools_helpers_test.go | New unit tests pinning down wildcard detection and sandbox-path normalization semantics. |
Copilot's findings
Tip
Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
- Files reviewed: 2/2 changed files
- Comments generated: 0
|
🧠 Matt Pocock Skills Reviewer has completed the skills-based review. ✅ |
|
✅ Design Decision Gate 🏗️ completed the design decision gate check. |
|
🧪 Test Quality Sentinel completed test quality analysis. |
|
✅ PR Code Quality Reviewer completed the code quality review. |
🧪 Test Quality Sentinel Report✅ Test Quality Score: 100/100 — Excellent
📊 Metrics & Test Classification (2 tests analyzed)
Test Classification Details
Language SupportTests analyzed:
Verdict
📖 Understanding Test ClassificationsDesign Tests (High Value) verify what the system does:
Implementation Tests (Low Value) verify how the system does it:
Goal: Shift toward tests that describe the system's behavioral contract — the promises it makes to its users and collaborators.
|
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
🏗️ Design Decision Gate — ADR RequiredThis PR makes significant changes to core business logic (307 new lines in 📄 Draft ADR committed:
📋 What to do next
Once an ADR is linked in the PR body, this gate will re-run and verify the implementation matches the decision. ❓ Why ADRs Matter
ADRs create a searchable, permanent record of why the codebase looks the way it does. Even behavior-preserving refactors of security-sensitive paths (here, the agent tool-permission allowlist) benefit from a recorded rationale: future contributors editing 📋 Michael Nygard ADR Format ReferenceAn ADR must contain these four sections to be considered complete:
All ADRs are stored in
|
There was a problem hiding this comment.
Skills-Based Review 🧠
Applied /zoom-out and /tdd — leaving as COMMENT; no blocking correctness issues, but a few things worth addressing.
📋 Key Themes & Highlights
Positive Highlights
- ✅ Clean pipeline decomposition in
computeAllowedClaudeToolsString— the new 7-step sequence reads like a specification - ✅
normalizeSandboxWritablePatternis a textbook extract: pure function, well-tested, removes duplication - ✅
appendIfMissingeliminates the scatteredif !slices.Contains(...) { append }pattern — good DRY move - ✅
switchinappendTopLevelClaudeToolsis cleaner than the original chained if/else
Issues Found
getOrCreateToolMapmisleading name — doesn't write back to the container; callers must do it manually (medium risk of future bugs)- Double scan in
appendGitHubMCPTools/appendGenericMCPTools— minor inefficiency, introduces inconsistency with single-pass original - Test coverage gap — only 2 of ~15 helpers have unit tests; the PR description says "focused unit coverage" but most helpers are untested
isClaudeToolNameis an undocumented heuristic — any uppercase-first map key becomes an allowed tool; deserves a clarifying comment
🧠 Reviewed using Matt Pocock's skills by Matt Pocock Skills Reviewer · sonnet46 1.9M
| for _, defaultTool := range defaultClaudeTools { | ||
| if _, exists := claudeExistingAllowed[defaultTool]; !exists { | ||
| claudeExistingAllowed[defaultTool] = nil // Add tool with null value | ||
| func getOrCreateToolMap(container map[string]any, key string) map[string]any { |
There was a problem hiding this comment.
[/zoom-out] getOrCreateToolMap doesn't actually store the new map back in container — the name implies it does both, but callers must remember to assign the result back themselves.
💡 Rename or redesign to avoid subtle mutation bugs
The current contract:
claudeSection := getOrCreateToolMap(tools, "claude")
// ...mutate claudeSection...
tools["claude"] = claudeSection // ← caller must remember thisThis is safe today because both call-sites in ensureDefaultClaudeAllowedTools manually write back, but the name suggests the map is already wired into the container. A future caller could skip the write-back.
Consider renaming to extractOrNewMap (read-only extraction), or change the signature to accept the parent container and key and mutate it:
// Option A – rename to reflect read-only semantics
func extractOrNewMap(container map[string]any, key string) map[string]any
// Option B – keep setOrCreate with the write-back inside
func setOrCreateMap(container map[string]any, key string) map[string]any {
m := extractOrNewMap(container, key)
container[key] = m // always write back
return m
}Note also that collectClaudeAllowedTools uses typeutil.LookupMap for the same pattern — unifying on one approach would reduce cognitive load.
| func appendGitHubMCPTools(allowedTools []string, toolName string, toolValue any, mcpConfig map[string]any) []string { | ||
| githubConfig := parseGitHubTool(toolValue) | ||
| if githubConfig != nil && len(githubConfig.Allowed) > 0 { | ||
| for _, tool := range githubConfig.Allowed { |
There was a problem hiding this comment.
[/zoom-out] appendGitHubMCPTools (and appendGenericMCPTools below) scan the allowed list twice: once for a wildcard and once for specific tools. A single pass with an early-return flag was used in the original code; the double-pass here is correct but slightly less efficient.
💡 Single-pass alternative
hasWildcard := false
for _, tool := range githubConfig.Allowed {
if string(tool) == "*" {
hasWildcard = true
break
}
}
if hasWildcard {
return append(allowedTools, "mcp__"+toolName)
}
for _, tool := range githubConfig.Allowed {
allowedTools = append(allowedTools, fmt.Sprintf("mcp__%s__%s", toolName, string(tool)))
}For large Allowed slices the double-scan wastes work. The single-pass approach with a boolean flag reads clearly and mirrors the original logic.
| assert.Equal(t, tt.wantOkay, gotOkay, "ok flag should match expected") | ||
| }) | ||
| } | ||
| } |
There was a problem hiding this comment.
[/tdd] The new test file covers only 2 of the ~15 extracted helpers. Untested helpers like appendClaudeBashTools, appendGitHubMCPTools, and dedupeAllowedTools contain nuanced logic and are the most likely sources of behavioral regressions.
💡 Suggested additional test cases
appendClaudeBashTools edge cases:
- nil value →
Bash - wildcard
*→Bash - trailing space-star:
"jq *"normalizes toBash(jq) - specific command:
"git"→Bash(git)
appendGitHubMCPTools:
- wildcard allowed list →
mcp__github - specific tools →
mcp__github__tool_name - nil allowed list → default GitHub tools
dedupeAllowedTools:
["Read", "Bash", "Read"]→["Read", "Bash"]
These would fulfil the PR description's goal of "focused unit coverage to lock semantics."
| return false | ||
| } | ||
|
|
||
| func isClaudeToolName(toolName string) bool { |
There was a problem hiding this comment.
[/zoom-out] isClaudeToolName checks only that the first character is uppercase — it's really startsWithUppercase, not a semantic validator. Any map key with an uppercase first letter (including potential future non-tool keys) would be treated as a Claude tool.
💡 Consider a more explicit check or a comment clarifying intent
The original code had the same logic inline, but wrapping it in a named function now makes it look authoritative. Add a comment clarifying the heuristic:
// isClaudeToolName returns true if toolName appears to be a Claude tool
// by convention: Claude tool names are PascalCase and start with an uppercase letter.
// This is a heuristic; map keys that accidentally start with uppercase will pass.
func isClaudeToolName(toolName string) bool {
return len(toolName) > 0 && unicode.IsUpper(rune(toolName[0]))
}Alternatively, maintain an explicit allowlist of known Claude tool names to make unexpected keys fail loudly rather than silently become allowed tools.
There was a problem hiding this comment.
Clean, behavior-preserving refactor — the pipeline decomposition in computeAllowedClaudeToolsString reads well and the overall structure is an improvement.
Non-blocking observations (3)
Issues raised
-
getOrCreateToolMapmisleading contract (line 200) — returns a detached empty map without writing it back; callers must manually write back or mutations are lost. Works today, silent trap for future callers. Rename or make it self-storing. -
Explanatory comment dropped from
appendSafeOutputsTools(line 420) — the original code had a comment + disabled scoped-write explaining why a bareWriteis granted, referencing issue #244. Stripped in the refactor, leaving a security-sensitive grant with no rationale. -
Test coverage skips the complex helpers (test file) — only
hasBashWildcardandnormalizeSandboxWritablePatternare tested.appendGitHubMCPTools,collectClaudeAllowedTools,ensureDefaultClaudeAllowedTools, anddedupeAllowedToolshave no new tests despite being the logic-heavy paths that most benefit from extraction.
None of these are blocking.
🔎 Code quality review by PR Code Quality Reviewer · sonnet46 2.5M
| return existingMap | ||
| } | ||
| } | ||
| return make(map[string]any) |
There was a problem hiding this comment.
getOrCreateToolMap has a misleading contract: it never writes the newly created map back to container, so callers must manually assign the returned map back — which works today but is a trap for future callers.
💡 Details
The function name implies ensure
| } | ||
| allowedTools = append(allowedTools, "mcp__"+string(constants.SafeOutputsMCPServerID)) | ||
| if !slices.Contains(allowedTools, "Write") { | ||
| allowedTools = append(allowedTools, "Write") |
There was a problem hiding this comment.
Explanatory comment removed from appendSafeOutputsTools: the unscoped Write grant looks like an oversight without the original context, but it was a deliberate known limitation.
💡 Details
The original code included:
// Ideally we would only give permission to the exact file, but that doesn't seem
// to be working with Claude. See https://github.com/github/gh-aw/issues/244#issuecomment-3240319103
(allowedtools/redacted) = append(allowedTools, "Write(${{ env.GH_AW_SAFE_OUTPUTS }})")The refactor stripped this, leaving a bare unrestricted Write (grants filesystem write to any path) with no rationale. A future reader or security reviewer will flag this immediately. Add the godoc comment back referencing #244 so the intent is clear.
| assert.Equal(t, tt.wantOkay, gotOkay, "ok flag should match expected") | ||
| }) | ||
| } | ||
| } |
There was a problem hiding this comment.
Test coverage is thin for the highest-complexity extracted helpers: only two trivial leaf functions are tested, while the functions carrying the actual allowed-tools assembly logic have no new tests.
💡 Details
Untested extracted helpers include:
collectClaudeAllowedTools— iterates over the allowed map and invokes the Bash-tool path; edge cases like empty map, missing claude key, or a map where only KillBash/BashOutput appear are not coveredappendGitHubMCPTools— wildcard vs specific-tool branching, fallback to default tools, remote vs local modeappendGenericMCPTools— wildcard early-return vs per-item expansionensureDefaultClaudeAllowedTools— default-tool injection and Bash-implies-KillBash/BashOutput side-effectdedupeAllowedTools— dedup correctness for duplicate Bash(...) entries
The PR description says 'Focused unit coverage for extracted logic' but only covers hasBashWildcard and normalizeSandboxWritablePattern, which are the simplest helpers. The complex branching (especially appendGitHubMCPTools) is the most likely source of regressions.
|
@copilot review all review comments from copilot and github-actions actor |
I checked this. |
Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>
Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>
Reviewed all Copilot and github-actions review feedback and applied the actionable code/test updates in
I also checked the previously linked failing CI job logs; that failure was on older SHA |
This issue targets large-function complexity across workflow/CLI codepaths. This PR addresses one of the highest-impact workflow paths by decomposing Claude allowed-tools generation into focused helpers without changing behavior.
Scope reduced in
claude_tools.gocomputeAllowedClaudeToolsStringinto composable helpers for:allowedextraction--allowed-toolsstring remains sorted/deduped).Behavior-preserving helper extraction
*,:*)Focused unit coverage for extracted logic
Example of the extracted pattern: