Copy skills from aw.yml manifest first; copy skill folders recursively and safely#35946
Conversation
…ursively Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>
…lPath extraction Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>
|
✅ PR Code Quality Reviewer completed the code quality review. |
|
✅ Design Decision Gate 🏗️ completed the design decision gate check. |
|
🧪 Test Quality Sentinel completed test quality analysis. |
|
🧠 Matt Pocock Skills Reviewer has completed the skills-based review. ✅ |
There was a problem hiding this comment.
Pull request overview
Fixes two gaps in package skill resolution: ensures manifest-listed skills are resolved before auto-scanned ones, and copies skill folders recursively (including nested subdirectories) instead of only top-level files.
Changes:
- Add manifest-first ordering with auto-scan appended for any skills not already covered (deduplicated by skill name).
- Add recursive directory listing helpers (
ListDirAllFilesRecursivelyForHost+ git fallback) and use them when listing skill folder contents. - Preserve nested file structure on disk by reconstructing each file's relative path under its skill name when installing.
Show a summary per file
| File | Description |
|---|---|
| pkg/parser/remote_fetch.go | Adds recursive Contents-API and git ls-tree -r listing helpers. |
| pkg/cli/add_package_manifest.go | Reorders skill resolution (manifest-first + always-auto-scan + dedup) and switches to recursive listing per skill dir. |
| pkg/cli/add_command.go | Reconstructs the relative path under the skill dir using a /<skill>/ boundary so nested files are written under the matching subdirectory. |
| pkg/cli/add_package_manifest_test.go | Wires the new injectable recursive lister into existing tests and adds two cases (manifest-first ordering, nested files). |
Copilot's findings
Tip
Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
- Files reviewed: 4/4 changed files
- Comments generated: 0
…recursive copy Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
🏗️ Design Decision Gate — ADR RequiredThis PR makes significant changes to core business logic (322 new lines under 📄 Draft ADR committed:
📋 What to do next
Once an ADR is linked in the PR body, this gate will re-run and verify the implementation matches the decision. ❓ Why ADRs Matter
ADRs create a searchable, permanent record of why the codebase looks the way it does. Future contributors (and your future self) will thank you. 📋 Michael Nygard ADR Format ReferenceAn ADR must contain these four sections to be considered complete:
All ADRs are stored in
|
There was a problem hiding this comment.
Skills-Based Review 🧠
Applied /diagnose, /tdd, and /zoom-out. The PR fixes two real silent-data-loss bugs cleanly and ships good regression tests. Four observations worth addressing — none are blocking on their own, but together they'd harden a correctness-focused PR.
📋 Key Themes & Highlights
Key Themes
- Unbounded recursion in
listContentsRecursively— no depth cap, GitHub Contents API silently truncates at 1000 entries per directory - Misleading guard in the git fallback —
afterDirPath != ""doesn't actually filter non-matching lines - Silent flattening fallback in
add_command.go— whenstrings.Indexreturns -1,filepath.Basequietly loses the nested path, reintroducing the bug this PR set out to fix - Deduplication key in
appendIfNewuses base name rather than full path — two paths with the same base name silently collapse
Positive Highlights
- ✅ The
strings.Indexwith/skillName/boundary guards prevent false truncation when skill name appears in a parent segment — nice - ✅ New tests directly cover both fixed bugs (ordering and nested files) with clear arrange/act/assert structure
- ✅
manifestSkillDirSetcleanly separates the SKILL.md validation concern from dedup logic - ✅ Auth-error fallback pattern in the new recursive function is consistent with existing
listDirAllFilesForHost
🧠 Reviewed using Matt Pocock's skills by Matt Pocock Skills Reviewer · sonnet46 1.9M
| func listDirAllFilesRecursivelyViaGitForHost(owner, repo, ref, dirPath, host string) ([]string, error) { | ||
| remoteLog.Printf("Git fallback for listing all dir files recursively: %s/%s@%s (path: %s)", owner, repo, ref, dirPath) | ||
|
|
||
| tmpDir, err := getOrCreateListRepoClone(owner, repo, ref, host) |
There was a problem hiding this comment.
[/diagnose] listContentsRecursively has no depth limit — a repo with a deep or cyclic directory tree (e.g. symlinks on GitHub) could trigger unbounded recursion and exhaust the stack or make many unexpected API calls.
💡 Suggestion
Add a depth parameter and bail out after a reasonable limit (e.g. 10 levels):
const maxSkillDirDepth = 10
func listContentsRecursively(client *api.RESTClient, owner, repo, ref, dirPath string, depth int) ([]string, error) {
if depth > maxSkillDirDepth {
return nil, fmt.Errorf("skill directory %q exceeds max depth (%d)", dirPath, maxSkillDirDepth)
}
// ...
subFiles, err := listContentsRecursively(client, owner, repo, ref, item.Path, depth+1)
}Also worth documenting: the GitHub Contents API caps at 1000 entries per directory request. Files are silently truncated for oversized directories. A comment noting this known limit would help future maintainers.
| } | ||
| } | ||
|
|
||
| remoteLog.Printf("Found %d files recursively in dir via git for %s/%s@%s (path: %s)", len(files), owner, repo, ref, dirPath) |
There was a problem hiding this comment.
[/diagnose] The afterDirPath != "" guard doesn't actually filter non-matching lines — strings.TrimPrefix returns the original string unchanged when the prefix isn't present, so any non-empty line passes the check. The git ls-tree command already scopes output to files under cleanDirPath/, so in practice this is harmless, but the comment "Include all files at any depth under dirPath" and the conditional together are misleading.
💡 Suggestion
Either remove the redundant check and comment, or make it explicit:
// git ls-tree already scopes output to dirPath; include every non-empty line.
for _, line := range lines {
if line == "" {
continue
}
files = append(files, line)
}Alternatively, use strings.HasPrefix(line, dirPrefix) if you want a real guard against unexpected git output.
| idx := strings.Index(resolved.Spec.WorkflowPath, skillComponent) | ||
| var relPath string | ||
| if idx >= 0 { | ||
| relPath = filepath.FromSlash(resolved.Spec.WorkflowPath[idx+len(skillComponent):]) |
There was a problem hiding this comment.
[/diagnose] When idx < 0 (the skill name doesn't appear as a path component in WorkflowPath), the fallback silently uses filepath.Base, which flattens any nested file to the skill root. This reproduces the exact silent-drop bug that this PR fixes — just in a subtler form.
💡 Suggestion
Return an error instead of silently falling back, so unexpected path shapes surface immediately:
if idx < 0 {
return fmt.Errorf("skill %q: cannot determine relative path from WorkflowPath %q", resolved.SkillName, resolved.Spec.WorkflowPath)
}A test case where WorkflowPath doesn't contain the skill name would also help pin down the contract.
| return nil, nil, err | ||
| appendIfNew := func(dir string) { | ||
| name := filepath.Base(dir) | ||
| if _, exists := seenSkillNames[name]; !exists { |
There was a problem hiding this comment.
[/zoom-out] appendIfNew deduplicates by filepath.Base(dir). If two different full paths share the same base name (e.g. a manifest entry vendor/skills/review and an auto-scanned skills/review), the second is silently skipped. This is unlikely in current usage but the silent discard contradicts the correctness focus of this PR.
💡 Suggestion
Deduplicate on the full canonical path rather than the base name, or document the assumption that base names are unique across all skill dirs:
// appendIfNew deduplicates by full dir path.
appendIfNew := func(dir string) {
if _, exists := seenSkillNames[dir]; !exists {
seenSkillNames[dir] = struct{}{}
skillDirs = append(skillDirs, dir)
}
}The skill name for display/install can still be derived from filepath.Base(dir) downstream.
🧪 Test Quality Sentinel Report
📊 Metrics & Test Classification (20 tests analyzed)
Test Classification Details
Language SupportTests analyzed:
|
There was a problem hiding this comment.
REQUEST_CHANGES — one security bug and one behavioural regression must be fixed before merge.
### Blocking issues (3)
-
Path traversal (
add_command.go) —relPathsliced from a remoteWorkflowPathis passed directly tofilepath.Join(skillDir, relPath).filepath.Joincleans..but does not clamp withinskillDir; a crafted path likeskills/myskill/../../.github/workflows/evil.ymloverwrites arbitrary files in the repo. Add astrings.HasPrefixguard afterJoin. -
Unconditional
scanPackageSkillDirs(add_package_manifest.go) — was previously skipped when manifest skills were provided; now always runs. A transient API error (auth, rate limit) will fail the entireaddcommand for packages that previously worked fine with explicit skill dirs only. Treat auto-scan errors as non-fatal warnings. -
Unbounded recursion (
remote_fetch.go) —listContentsRecursivelyhas no depth limit; a deeply nested or adversarial package causes stack exhaustion. Add amaxDepthguard (10 levels is more than sufficient).
### Non-blocking (1)
strings.Indexfirst-match (add_command.go) — finds the first/skillName/in the path, which misidentifies the skill root if the same name appears earlier as a path component. Low probability but silent wrong-path bug; a path-split approach is safer.
🔎 Code quality review by PR Code Quality Reviewer · sonnet46 2M
| var relPath string | ||
| if idx >= 0 { | ||
| relPath = filepath.FromSlash(resolved.Spec.WorkflowPath[idx+len(skillComponent):]) | ||
| } else { |
There was a problem hiding this comment.
Path traversal: relPath from a remote URL can escape skillDir and write files anywhere on disk.
💡 Details and fix
filepath.Join resolves .. segments but does NOT clamp the result inside skillDir. A WorkflowPath like skills/myskill/../../.github/workflows/evil.yml produces a relPath of ../../.github/workflows/evil.yml, and filepath.Join(skillDir, relPath) happily produces a path two levels above skillDir.
Since WorkflowPath comes from a remote GitHub Contents API response (untrusted data), an adversarial or compromised package could overwrite arbitrary files in the local repo.
Suggested fix — validate the final path stays inside skillDir:
destFile := filepath.Join(skillDir, relPath)
if !strings.HasPrefix(destFile, skillDir+string(os.PathSeparator)) {
return fmt.Errorf("skill file path %q escapes skill directory %q", relPath, skillDir)
}| func resolvePackageSkillFiles(owner, repo, packagePath, ref, host string, explicitSkillDirs []string) ([]resolvedPackageSkillFile, []string, error) { | ||
| // seenSkillNames tracks skill names already added so that auto-scanned duplicates | ||
| // of manifest-specified skills are not added a second time. | ||
| seenSkillNames := make(map[string]struct{}) |
There was a problem hiding this comment.
Regression: scanPackageSkillDirs now runs unconditionally, breaking packages that use only manifest skills when the skills/ directory is absent or the API call fails.
💡 Details and fix
Previously, scanPackageSkillDirs was only called when explicitSkillDirs was empty. Now it always runs. If the call returns any non-404 error (auth failure, rate limit, network timeout), the entire resolvePackageSkillFiles call fails — even for packages whose manifests fully enumerate their skills and don't need auto-scanning at all.
This is a behavioural regression: adding a package that worked fine before (explicit skills, no skills/ dir) could now fail intermittently due to transient API errors on an API call that wasn't previously made.
Suggested fix — propagate scanPackageSkillDirs errors as warnings (not hard errors), since auto-scan is supplementary:
autoScanned, scanErr := scanPackageSkillDirs(owner, repo, packagePath, ref, host)
if scanErr != nil {
// Auto-scan is best-effort; log the error but don't block manifest skills.
warnings = append(warnings, fmt.Sprintf("auto-scan of skills/ failed: %v", scanErr))
}Or guard it: only auto-scan when explicitSkillDirs is empty.
| case "dir": | ||
| subFiles, err := listContentsRecursively(client, owner, repo, ref, item.Path) | ||
| if err != nil { | ||
| return nil, err |
There was a problem hiding this comment.
Unbounded recursion in listContentsRecursively can exhaust stack or make thousands of serial API calls against a deep/adversarial directory tree.
💡 Details and fix
Each dir entry in the Contents API response triggers a recursive call with no depth limit. A skill folder with 10 levels of nesting makes 10 serial recursive calls; one with 100 levels causes a stack overflow. Since this function processes content from an external package (untrusted), a malicious or misconfigured package could cause the agent process to hang or crash.
Suggested fix — add a depth guard:
const maxRecursionDepth = 10
func listContentsRecursively(client *api.RESTClient, owner, repo, ref, dirPath string) ([]string, error) {
return listContentsRecursivelyWithDepth(client, owner, repo, ref, dirPath, 0)
}
func listContentsRecursivelyWithDepth(client *api.RESTClient, owner, repo, ref, dirPath string, depth int) ([]string, error) {
if depth > maxRecursionDepth {
return nil, fmt.Errorf("skill directory %q exceeds maximum nesting depth %d", dirPath, maxRecursionDepth)
}
// ... existing code, passing depth+1 for recursive calls ...
}|
|
||
| // Determine the relative path of the file within the skill directory so that any | ||
| // nested subdirectories are preserved (e.g. "scripts/query.sh" stays under scripts/). | ||
| // We search for the skill name as a proper path component (bounded by "/") so that a |
There was a problem hiding this comment.
strings.Index finds the first occurrence of the skill name as a path component, which can match the wrong segment if the name appears earlier in the path.
💡 Details and fix
If WorkflowPath is something like vendor/foo/bar/foo/file.yaml and SkillName is foo, strings.Index returns the position of the first /foo/ (inside vendor/foo/bar), producing a relPath of bar/foo/file.yaml instead of the intended file.yaml. The bug is silent — no error is returned, just a wrong destination path.
A safer approach splits on path separators:
parts := strings.Split(filepath.ToSlash(resolved.Spec.WorkflowPath), "/")
var relParts []string
for i, part := range parts {
if part == resolved.SkillName && i+1 < len(parts) {
relParts = parts[i+1:]
break
}
}
if len(relParts) > 0 {
relPath = filepath.Join(relParts...)
} else {
relPath = filepath.Base(resolved.Spec.WorkflowPath)
}This matches on the exact path component rather than a substring.
|
|
|
@copilot review all comments and reviews from copilot and GitHub-actions actor. Resolve. |
Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>
Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>
Addressed the unresolved Copilot/github-actions feedback in commits d3ebb31 and 35f97af. Resolved items:
|
|
|
Skill resolution had two gaps: manifest-listed skills weren't guaranteed to appear before auto-scanned ones, and only top-level files in a skill folder were copied (subdirectories were silently dropped). This PR also hardens the new behavior based on review feedback.
Behaviour changes
skills:inaw.ymlare resolved first;skills/is then auto-scanned and any additional skill folders not already covered by the manifest are appended.scripts/,prompts/,scripts/helpers/).Implementation
pkg/parser/remote_fetch.go— addslistDirAllFilesRecursivelyForHost(Contents API walk) and a git-based fallbacklistDirAllFilesRecursivelyViaGitForHost(ls-tree -r), plus a recursion depth guard for recursive directory traversal.pkg/cli/add_package_manifest.go— new injectablelistPackageDirFilesRecursivelyForHostvar;resolvePackageSkillFilesrewritten with manifest-first ordering, full-path deduplication, recursive listing per skill dir, and non-fatal auto-scan errors when manifest skills are explicit.pkg/cli/add_command.go—addSkillFileWithTrackingnow derives relative paths using skill-root path components and validates the final destination remains within the skill directory (prevents traversal/flattening issues while preserving nested structure).pkg/cli/add_package_manifest_test.gofor manifest-first + auto-scan extras, nested skill files, and auto-scan error warning behavior.pkg/cli/add_command_test.gofor correct nested path reconstruction, traversal rejection, and invalid path-shape rejection.pkg/parser/remote_fetch_test.gofor recursion depth guard behavior.