Add gh aw fix codemod to rewrite single-quoted tools.bash args safely#31874
Conversation
Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>
gh aw fix codemod to rewrite single-quoted tools.bash args safely
There was a problem hiding this comment.
Pull request overview
Adds a new gh aw fix codemod to migrate tools.bash allowlist entries away from single-quoted shell arguments (which currently trigger Copilot CLI truncation/prefix sanitization), plus registry integration and unit tests.
Changes:
- Registers a new codemod (
bash-single-quoted-args-rewrite) in the fix codemod registry and expected execution order. - Implements the codemod to rewrite single-quoted segments in
tools.bashentries into double-quoted equivalents with escaping. - Adds focused unit tests for the codemod and the rewrite helper.
Show a summary per file
| File | Description |
|---|---|
| pkg/cli/fix_codemods.go | Registers the new codemod in the global codemod list. |
| pkg/cli/fix_codemods_test.go | Updates registry/order assertions to include the new codemod ID. |
| pkg/cli/codemod_bash_single_quoted_args.go | Implements the codemod and the single-quote rewrite helper. |
| pkg/cli/codemod_bash_single_quoted_args_test.go | Adds unit tests validating codemod behavior and helper escaping. |
Copilot's findings
Tip
Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
Comments suppressed due to low confidence (1)
pkg/cli/codemod_bash_single_quoted_args.go:127
- rewriteSingleQuotedBashArgs treats every ' byte as the start of a single-quoted segment without tracking shell context (e.g., inside double quotes/backticks) or unquoted escaping ('). This can incorrectly rewrite commands like echo "a'b'" and can also mis-classify escaped apostrophes as “unmatched”, changing semantics or emitting misleading diagnostics. Consider a small shell-quoting state machine (unquoted/single/double/backtick + escape handling) and mark complex constructs (e.g. ''' concatenation used to embed a quote) as unsafe/no-op rather than partially rewriting.
for i := 0; i < len(cmd); {
if cmd[i] != '\'' {
b.WriteByte(cmd[i])
i++
continue
- Files reviewed: 4/4 changed files
- Comments generated: 3
|
|
||
| for _, cmd := range unsafeCommands { | ||
| fmt.Fprintln(os.Stderr, console.FormatWarningMessage( | ||
| fmt.Sprintf("tools.bash entry %q contains unmatched single quotes and could not be safely rewritten; left unchanged", cmd))) |
| var lines []string | ||
| lines = append(lines, "---") | ||
| if frontmatterStr != "" { | ||
| lines = append(lines, strings.Split(frontmatterStr, "\n")...) | ||
| } |
| func TestRewriteSingleQuotedBashArgs(t *testing.T) { | ||
| tests := []struct { | ||
| name string | ||
| input string | ||
| want string |
There was a problem hiding this comment.
Skills-Based Review 🧠
Applied /diagnose and /tdd — this is a bug-fix PR with new unit coverage, so both skills are directly applicable.
Key Themes
- Double-quote context gap (correctness):
rewriteSingleQuotedBashArgsprocesses bytes without tracking whether the current position is inside a"-quoted segment. A command likegrep 'pattern' "it's valid"causes a false-positive "unmatched single quotes" warning and is left unchanged, even though both single-quoted segments are valid. The fix is a smallinDoublebool in the loop. - Missing regression test: No table entry covers the mixed double/single-quote scenario, so the above bug has no automated guard.
- Frontmatter mutation side-effect:
Applywrites back into the caller-suppliedfrontmattermap. Other codemods in the registry only read the map; this is an undocumented inconsistency that could silently corrupt state if the pipeline ordering or sharing ever changes.
Positive Highlights
- ✅ Safety-first design: unmatched quotes are never rewritten, only warned about.
- ✅ Escaping logic correctly handles
\,",$, and ``` to preserve literal semantics. - ✅ Table-driven tests are well-named and cover the happy-path cases cleanly.
- ✅ Codemod registration and ordering tests updated in the same PR — no drift.
Verdict
Requesting changes on the double-quote-context correctness issue before merge; it affects real-world commands (anything with a contraction or possessive inside double quotes mixed with single-quoted args).
🧠 Reviewed using Matt Pocock's skills by Matt Pocock Skills Reviewer · ● 7.2M
| if frontmatterStr != "" { | ||
| lines = append(lines, strings.Split(frontmatterStr, "\n")...) | ||
| } | ||
| lines = append(lines, "---") |
There was a problem hiding this comment.
[/diagnose] The byte scanner doesn't track whether it's currently inside a double-quoted segment, so a command like grep 'pattern' "it's valid" triggers a false-positive "unmatched single quotes" warning and is left unchanged — even though the single quotes are perfectly valid.
Walk-through: after rewriting 'pattern' → "pattern", the loop hits the ' inside "it's" at the next iteration, fails to find a matching ', returns cmd, false, false, and the command ends up in unsafeCommands.
Consider tracking double-quote state:
inDouble := false
for i := 0; i < len(cmd); {
if cmd[i] == '"' {
b.WriteByte(cmd[i])
i++
continue
}
if cmd[i] != '\'' || inDouble {
b.WriteByte(cmd[i])
i++
continue
}
// ... existing single-quote rewrite logic
}A regression test covering grep 'pattern' "it's ok" would lock this in.
| wantApply: false, | ||
| }, | ||
| { | ||
| name: "unmatched single quote", |
There was a problem hiding this comment.
[/tdd] The TestRewriteSingleQuotedBashArgs table is missing a case that would catch the double-quote-context bug: a command mixing a rewritable single-quoted segment with a single quote embedded inside a double-quoted word (e.g. grep 'pattern' "it's valid").
Suggested addition:
{
name: "single-quote inside double-quoted word is skipped",
input: "grep 'pattern' \"it's valid\"",
want: "grep \"pattern\" \"it's valid\"",
wantSafe: true,
wantApply: true,
},Without it, the double-quote-tracking fix has no automated guard and the regression can silently reappear.
| } | ||
|
|
||
| if !changed { | ||
| return content, false, nil |
There was a problem hiding this comment.
[/diagnose] Apply mutates the caller-supplied frontmatter map in place (toolsMap["bash"] = updated, frontmatter["tools"] = toolsMap). Every other codemod in the registry only reads the map; if the pipeline ever passes the same map to multiple codemods this will cause silent state contamination.
A quick defensive copy at the top of the Apply body, or assigning to a local rather than back-writing to frontmatter, would make the side-effect explicit and safe.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
Commit pushed:
|
🏗️ Design Decision Gate — ADR RequiredThis PR makes significant changes to core business logic (332 new lines in AI has analyzed the PR diff and generated a draft ADR to help you get started: 📄 Draft ADR: What to do next
Once an ADR is linked in the PR body, this gate will re-run and verify the implementation matches the decision. Why ADRs Matter
ADRs create a searchable, permanent record of why the codebase looks the way it does. Future contributors (and your future self) will thank you. 📋 Michael Nygard ADR Format ReferenceAn ADR must contain these four sections to be considered complete:
All ADRs are stored in References:
|
🧪 Test Quality Sentinel ReportTest Quality Score: 85/100✅ Excellent test quality
Test Classification DetailsView all 6 test classifications
Flagged Tests — Requires Review
|
| Component | Weight | Score |
|---|---|---|
| Behavioral coverage (6/6 design tests) | 40 | 40 |
| Error/edge case coverage (3/6 tests) | 30 | 15 |
| Low duplication (0 clusters) | 20 | 20 |
| Proportional growth (ratio 1.08×) | 10 | 10 |
| Total | 100 | 85 |
Language Support
Tests analyzed:
- 🐹 Go (
*_test.go): 6 tests — unit (//go:build !integration) - 🟨 JavaScript (
*.test.cjs,*.test.js): 0 tests
Verdict
✅ Check passed. 0% of new tests are implementation tests (threshold: 30%). The single flagged item (missing assertion messages) is a style guideline to improve debuggability but does not block merging.
📖 Understanding Test Classifications
Design Tests (High Value) verify what the system does:
- Assert on observable outputs, return values, or state changes
- Cover error paths and boundary conditions
- Would catch a behavioral regression if deleted
- Remain valid even after internal refactoring
Implementation Tests (Low Value) verify how the system does it:
- Assert on internal function calls (mocking internals)
- Only test the happy path with typical inputs
- Break during legitimate refactoring even when behavior is correct
- Give false assurance: they pass even when the system is wrong
Goal: Shift toward tests that describe the system's behavioral contract — the promises it makes to its users and collaborators.
References: §25781277304
🧪 Test quality analysis by Test Quality Sentinel · ● 4.4M · ◷
|
@copilot review all comments |
|
@copilot review all comments |
Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>
Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>
Bug Fix
What was the bug?
tools.bashallowlist entries containing single-quoted args (for examplegrep -rn 'pattern' --include='*.lua') were later truncated to a prefix for Copilot shell prefix-matching, causing users to lose command precision. This change adds an automatic migration path ingh aw fix.How did you fix it?
bash-single-quoted-args-rewritetools.bashlist entries in frontmatter.\,",$, and`to preserve literal meaning.gh aw fix --writeapplies it.Testing
'pattern'→"pattern")--include='*.lua'→--include="*.lua")rewrites to: