diff --git a/.claude/skills/address-pr-comments/SKILL.md b/.claude/skills/address-pr-comments/SKILL.md deleted file mode 100644 index d31e1ced..00000000 --- a/.claude/skills/address-pr-comments/SKILL.md +++ /dev/null @@ -1,336 +0,0 @@ ---- -name: address-pr-comments -description: Read PR review comments, evaluate validity, implement fixes, push changes, and reply/resolve threads -argument-hint: "[pr-number|pr-url]" ---- - -Address code review comments on **$ARGUMENTS** (or the current branch's PR if no argument is given). - ---- - -## Workflow - -### 1. Identify the PR - -Determine the target PR: - -```bash -# If argument provided, use it; otherwise detect from current branch -gh pr view $ARGUMENTS --json number,url,headRefName,baseRefName,author -``` - -If no PR is found, stop and inform the user. - -Extract owner, repo, PR number, and **PR author login** for subsequent API calls: - -```bash -gh repo view --json owner,name --jq '"\(.owner.login)/\(.name)"' -``` - -### 2. Fetch review comments and summaries - -#### 2a. Determine the latest review round - -Find the timestamp of the most recent push to the PR branch — this marks the boundary of the current review round: - -```bash -# Get the most recent push event (last commit pushed) -gh api repos/{owner}/{repo}/pulls/{pr-number}/commits \ - --jq '.[-1].commit.committer.date' -``` - -Store this as `$LAST_PUSH_DATE`. Comments created **after** this timestamp are from the current (latest) review round. If no filtering by round is desired (e.g., first review), process all unresolved comments. - -#### 2b. Fetch inline review comments - -Retrieve all review comments (inline code comments) on the PR: - -```bash -gh api repos/{owner}/{repo}/pulls/{pr-number}/comments \ - --paginate \ - --jq '.[] | {id: .id, node_id: .node_id, user: .user.login, path: .path, line: .line, original_line: .original_line, side: .side, body: .body, in_reply_to_id: .in_reply_to_id, created_at: .created_at}' \ - 2>&1 | head -500 -``` - -#### 2c. Fetch review summaries - -Fetch top-level review comments (review bodies/summaries). These often contain high-level feedback and action items: - -```bash -gh api repos/{owner}/{repo}/pulls/{pr-number}/reviews \ - --jq '.[] | select(.body != "" and .body != null) | {id: .id, user: .user.login, state: .state, body: .body, submitted_at: .submitted_at}' \ - 2>&1 | head -200 -``` - -**Pay special attention to review summaries** — they often list multiple action items in a single review body. Parse each action item from the summary as a separate work item. - -#### 2d. Filter comments - -**Include** comments from: -- **Reviewers** (anyone who is not the PR author) — standard review feedback -- **The PR author themselves** — self-comments are treated as actionable TODOs/notes-to-self that should be addressed -- **@codex** and other AI reviewers — treat their comments with the same weight as human reviewer comments - -**Exclude**: -- Already-resolved threads -- Bot comments that are purely informational (CI status, auto-generated labels, etc.) — but NOT @codex or other AI reviewer comments, which are substantive - -Check which threads are already resolved: - -```bash -gh api graphql -f query=' - query($owner: String!, $repo: String!, $pr: Int!) { - repository(owner: $owner, name: $repo) { - pullRequest(number: $pr) { - reviewThreads(first: 100) { - nodes { - id - isResolved - comments(first: 10) { - nodes { - databaseId - body - author { login } - } - } - } - } - } - } - } -' -f owner="{owner}" -f repo="{repo}" -F pr={pr-number} -``` - -Only process **unresolved** threads with actionable comments. - -#### 2e. Prioritize latest comments - -When there are many unresolved comments, prioritize: -1. Comments from the **latest review round** (after `$LAST_PUSH_DATE`) -2. Comments from review summaries (they represent the reviewer's consolidated view) -3. Older unresolved comments that are still relevant - -### 2b. Read the PR specs - -Before evaluating any comment, read the PR description to check for a **SPECS** section: - -```bash -gh pr view $ARGUMENTS --json body --jq '.body' -``` - -If a SPECS section is present, **it defines the authoritative requirements for this PR**. Specs override: -- Your assumptions about backward compatibility or design intent -- Inline code comments -- Conventions from other parts of the codebase - -Store the specs for use in step 4 (validity evaluation). If a reviewer comment aligns with a spec, the comment is **valid by definition** — even if you think the current implementation is reasonable. - -### 3. Understand each comment - -For each unresolved review comment: - -1. **Read the file and surrounding context** at the line referenced by the comment -2. **Read the PR diff** to understand what changed: - ```bash - gh pr diff $ARGUMENTS -- - ``` -3. **Classify the comment** into one of these categories: - -| Category | Description | Action | -|----------|------------|--------| -| **Bug/correctness** | Reviewer identified a real bug or incorrect behavior | Fix the code | -| **Style/convention** | Naming, formatting, or project convention issue | Fix to match convention | -| **Suggestion/improvement** | A better approach or simplification | Evaluate and implement if it improves the code | -| **Question** | Reviewer asking for clarification | Reply with an explanation, no code change needed | -| **Nitpick** | Minor optional suggestion | Evaluate — fix if trivial, otherwise reply explaining the tradeoff | -| **Invalid/outdated** | Comment doesn't apply or is based on a misunderstanding | Reply politely explaining why | - -### 4. Evaluate validity — specs and bash behavior are the sources of truth - -There are two sources of truth, checked in this order: - -1. **PR specs** (from step 2b) — if present, specs are the highest authority for what this PR should do -2. **Bash behavior** — the shell must match bash unless it intentionally diverges (sandbox restrictions, blocked commands, readonly enforcement) - -**CRITICAL: Never invent justifications for dismissing a comment.** If a reviewer says "the spec requires X" and the spec does require X, the comment is valid — even if you think the current implementation is a reasonable alternative. Do not fabricate reasons like "backward compatibility" or "design intent" unless those reasons are explicitly stated in the specs or CLAUDE.md. - -For each comment, determine if it is **valid and actionable**: - -1. **Check against PR specs first** — if a SPECS section exists and the comment aligns with a spec, the comment is **valid by definition**. Do not dismiss it. -2. **Verify against bash** — for comments about shell behavior, check what bash actually does: - ```bash - docker run --rm debian:bookworm-slim bash -c '' - ``` -3. **Read the relevant code** in full — not just the diff, but the surrounding implementation -4. **Check project conventions** in `CLAUDE.md` and `AGENTS.md` -5. **Consider side effects** — will the change break other tests or behaviors? -6. **Check for duplicates** — is the same issue raised in multiple comments? Group them - -Decision matrix: - -| Reviewer says | Spec says | Bash does | Action | -|--------------|-----------|-----------|--------| -| "Spec requires X" | Spec does require X | N/A | **Fix the implementation** to match the spec | -| "Spec requires X" | No such spec exists | N/A | **Reply** noting the spec doesn't mention this | -| "This is wrong" | No spec relevant | Reviewer is right | **Fix the implementation** to match bash | -| "This is wrong" | No spec relevant | Current code matches bash | **Reply** explaining it matches bash, with proof | -| "This is wrong" | No spec relevant | N/A (sandbox/security) | **Reply** explaining the intentional divergence | -| "Do it differently" | No spec relevant | Suggestion matches bash better | **Fix the implementation** to match bash | -| "Do it differently" | No spec relevant | Current code already matches bash | **Reply** — bash compatibility takes priority | - -If a comment is **not valid**: -- Prepare a polite reply with proof (e.g., "This matches bash behavior — verified with `docker run --rm debian:bookworm-slim bash -c '...'`") -- If the divergence is intentional, explain why (sandbox restriction, security, etc.) -- **Never claim "backward compatibility" or "design intent" unless you can point to a specific line in the specs or CLAUDE.md that says so** - -If a comment is **valid** (i.e., it aligns with a spec, brings the shell closer to bash, or addresses a real bug): -- Proceed to step 5 - -### 5. Implement fixes - -For each valid comment, apply the fix. **Always prefer fixing the shell implementation over adjusting tests or expectations**, unless the shell intentionally diverges from bash. - -1. **Read the file** being modified -2. **Determine what bash does** if not already verified: - ```bash - docker run --rm debian:bookworm-slim bash -c '' - ``` -3. **Fix the implementation** to match bash behavior — do NOT adjust test expectations to match broken implementation -4. **Check for related issues** — if the comment reveals a pattern, fix all occurrences (not just the one the reviewer flagged) -5. **Run relevant tests** to verify: - ```bash - # Run tests for the affected package - go test -race -v ./interp/... ./tests/... -run "" -timeout 60s - - # If YAML scenarios were touched, run bash comparison - RSHELL_BASH_TEST=1 go test ./tests/ -run TestShellScenariosAgainstBash -timeout 120s - ``` -6. If tests fail, iterate on the **implementation fix** (not the test) until they pass -7. Only set `skip_assert_against_bash: true` when the behavior intentionally diverges from bash (sandbox restrictions, blocked commands, readonly enforcement) - -Group related comment fixes into a single logical commit when possible. - -### 6. Commit and push - -After all fixes are verified: - -```bash -# Stage the changed files explicitly -git add ... - -# Commit with a descriptive message -git commit -m "$(cat <<'EOF' -Address review comments: - -
-EOF -)" - -# Push to the PR branch -git push -``` - -If fixes span unrelated areas, prefer multiple focused commits over one large commit. - -### 7. Reply to and resolve comments - -**All replies MUST be prefixed with `[]`** (e.g. `[Claude Opus 4.6]`) so reviewers can tell the response came from an AI. - -Handle comments differently based on who authored them: - -#### Reviewer comments (not the PR author) - -For each reviewer comment that was addressed: - -1. **Reply** explaining what was fixed: - ```bash - gh api repos/{owner}/{repo}/pulls/{pr-number}/comments/{comment-id}/replies \ - -f body="[ - ] Done — " - ``` - -2. **Resolve** the thread: - ```bash - # Get the GraphQL thread ID - gh api graphql -f query=' - query($owner: String!, $repo: String!, $pr: Int!) { - repository(owner: $owner, name: $repo) { - pullRequest(number: $pr) { - reviewThreads(first: 100) { - nodes { - id - isResolved - comments(first: 1) { - nodes { databaseId } - } - } - } - } - } - } - ' -f owner="{owner}" -f repo="{repo}" -F pr={pr-number} \ - --jq '.data.repository.pullRequest.reviewThreads.nodes[] | select(.comments.nodes[0].databaseId == {comment-id}) | .id' - - # Resolve the thread - gh api graphql -f query=' - mutation($threadId: ID!) { - resolveReviewThread(input: {threadId: $threadId}) { - thread { isResolved } - } - } - ' -f threadId="" - ``` - -#### PR author self-comments - -For comments authored by the PR author (self-notes/TODOs): - -1. **Fix the issue** described in the comment (these are actionable items the author left for themselves) -2. **Resolve** the thread (the PR author can resolve their own threads) -3. **Do NOT reply** to self-comments — just fix and resolve. No need for the AI to narrate back to the same person who wrote the note. - -#### Review summary action items - -For action items extracted from review summaries (step 2c): - -1. **Fix each action item** as if it were an inline comment -2. **Reply to the review** with a summary of all action items addressed: - ```bash - gh api repos/{owner}/{repo}/pulls/{pr-number}/reviews/{review-id}/comments \ - -f body="[ - ] Addressed the following from this review: - - : - - : " - ``` - If the `comments` endpoint doesn't work for review-level replies, use an issue comment instead: - ```bash - gh api repos/{owner}/{repo}/issues/{pr-number}/comments \ - -f body="[ - ] Addressed review feedback from @{reviewer}: - - : - - : " - ``` - -#### Invalid or question comments - -For comments that were **not valid** or were **questions**, reply (prefixed with `[ - ]`) with an explanation but do NOT resolve — let the reviewer decide. - -**IMPORTANT: Never resolve a thread where the reviewer's comment aligns with a PR spec but the implementation doesn't match.** These are valid spec violations — fix the code instead. If you cannot fix it, leave the thread unresolved and explain the blocker. - -### 8. Summary - -Provide a final summary organized by source: - -**Reviewer inline comments addressed:** -- List each comment with: the comment (abbreviated), classification (bug, style, suggestion, etc.), what was changed - -**Review summary action items addressed:** -- List each action item from review summaries that was implemented - -**PR author self-comments addressed:** -- List each self-note/TODO that was fixed and resolved - -**Not fixed (with reason):** -- List any comments replied to but not fixed, with explanation - -**Could not be addressed:** -- List any comments that could not be addressed, with explanation - -Confirm the commit(s) pushed and threads resolved. diff --git a/.claude/skills/code-review/SKILL.md b/.claude/skills/code-review/SKILL.md index d24f6778..84d52ae5 100644 --- a/.claude/skills/code-review/SKILL.md +++ b/.claude/skills/code-review/SKILL.md @@ -26,36 +26,7 @@ git diff main...HEAD If no changes are found, inform the user and stop. -### 2. Verify specs implementation - -Read the PR description and look for a **SPECS** section: - -```bash -gh pr view $ARGUMENTS --json body --jq '.body' -``` - -If a SPECS section is present, it defines the requirements that this PR MUST implement. **Every single spec must be verified against the diff.** -The specs override other instructions (code, inline comments in code, etc). ALL specs MUST be implemented. - -For each spec: -1. **Find the code** that implements the spec -2. **Verify correctness** — does the implementation fully satisfy the spec? -3. **Check for missing specs** — is any spec not implemented at all? - -Flag any unimplemented or partially implemented spec as a **P1 finding** (missing functionality that was explicitly required). - -Include a spec coverage table in the review output: - -```markdown -| Spec | Implemented | Location | Notes | -|------|:-----------:|----------|-------| -| Must support `--flag` option | Yes | `interp/api.go:42` | Fully implemented | -| Must return exit code 2 on error | **No** | — | Not found in diff | -``` - -If no SPECS section is found in the PR description, skip this step. - -### 3. Read and understand all changed code +### 2. Read and understand all changed code For each changed file: diff --git a/.claude/skills/fix-ci-tests/SKILL.md b/.claude/skills/fix-ci-tests/SKILL.md index ab1354df..d493ab35 100644 --- a/.claude/skills/fix-ci-tests/SKILL.md +++ b/.claude/skills/fix-ci-tests/SKILL.md @@ -204,62 +204,11 @@ EOF git push ``` -### 9. Reply to and resolve CI review comments - -If there are review comments on the PR related to the CI failures (e.g. a reviewer or bot flagged the failure), reply to them and mark them as resolved: - -```bash -# Fetch review comments on the PR -gh api repos/{owner}/{repo}/pulls/{pr-number}/comments --jq '.[] | {id, body, path, line}' 2>&1 | head -100 -``` - -For each comment that relates to a CI failure you just fixed: - -1. **Reply** (prefixed with `[Claude Opus 4.6]`) explaining what was fixed and how: - ```bash - gh api repos/{owner}/{repo}/pulls/{pr-number}/comments/{comment-id}/replies \ - -f body="[Claude Opus 4.6] Fixed — " - ``` - -2. **Resolve** the conversation thread (requires GraphQL since the REST API does not support resolving): - ```bash - # First get the GraphQL thread ID for the comment - gh api graphql -f query=' - query($owner: String!, $repo: String!, $pr: Int!) { - repository(owner: $owner, name: $repo) { - pullRequest(number: $pr) { - reviewThreads(first: 100) { - nodes { - id - isResolved - comments(first: 1) { - nodes { databaseId } - } - } - } - } - } - } - ' -f owner="{owner}" -f repo="{repo}" -F pr={pr-number} \ - --jq '.data.repository.pullRequest.reviewThreads.nodes[] | select(.comments.nodes[0].databaseId == {comment-id}) | .id' - - # Then resolve it - gh api graphql -f query=' - mutation($threadId: ID!) { - resolveReviewThread(input: {threadId: $threadId}) { - thread { isResolved } - } - } - ' -f threadId="" - ``` - -If there are no review comments related to CI failures, skip this step. - -### 10. Summary +### 9. Summary Provide a final summary: - List each CI failure that was fixed - Briefly explain the root cause and fix for each - Note any failures that could not be reproduced or fixed (with explanation) -- Confirm the commit was pushed and which review comments were resolved +- Confirm the commit was pushed diff --git a/.claude/skills/review-fix-loop/SKILL.md b/.claude/skills/review-fix-loop/SKILL.md index d9f1bbe5..ebb85ded 100644 --- a/.claude/skills/review-fix-loop/SKILL.md +++ b/.claude/skills/review-fix-loop/SKILL.md @@ -1,6 +1,6 @@ --- name: review-fix-loop -description: "Self-review a PR, fix all issues, and re-review in a loop until clean. Coordinates code-review, address-pr-comments, and fix-ci-tests skills." +description: "Self-review a PR against RULES.md, fix all issues, and re-review in a loop until clean. Coordinates local codex review and fix-ci-tests skills." argument-hint: "[pr-number|pr-url]" --- @@ -18,9 +18,9 @@ Your very first action — before reading ANY files, before running ANY commands 1. "Step 1: Identify the PR" 2. "Step 2: Run the review-fix loop" ← **Update subject with iteration number each loop** (e.g. "Step 2: Run the review-fix loop (iteration 1)") -3. "Step 2A1: Self-review (code-review)" ← **parallel with 2A2** -4. "Step 2A2: Request external reviews (@codex)" ← **parallel with 2A1** -5. "Step 2B: Address PR comments (address-pr-comments)" +3. "Step 2A1: Self-review (RULES.md)" ← **parallel with 2A2** +4. "Step 2A2: Run local codex review" ← **parallel with 2A1** +5. "Step 2B: Address Self-review and Codex findings" 6. "Step 2C: Fix CI failures (fix-ci-tests)" 7. "Step 2D: Verify push and resolve conflicts" 8. "Step 2E: Check CI status" @@ -57,7 +57,7 @@ Step 1 → Step 2 (loop: [2A1 ∥ 2A2] → 2B → 2C → 2D → 2E → 2F) → S - Do NOT skip the review (Step 2A1) because you think the code is fine - Do NOT skip verification (Step 3) because tests passed during fixes -- Do NOT skip the external review trigger — @codex reviews catch issues the self-review misses +- Do NOT skip the local codex review — it catches issues the self-review misses - Do NOT mark a step completed until every sub-bullet in that step is satisfied If you catch yourself wanting to skip a step, STOP and do the step anyway. @@ -99,36 +99,77 @@ Set `iteration = 1`. Maximum iterations: **30**. Repeat sub-steps A through E wh ### Sub-step 2A1 — Self-review ← **parallel with 2A2** -Run the **code-review** skill on the PR: -``` -/code-review +Review the PR diff directly against the rules in `.claude/skills/implement-posix-command/RULES.md`: + +```bash +gh pr diff ``` -This analyzes the full diff against main, posts findings as a GitHub PR review with inline comments, and classifies findings by severity (P0–P3). -### Sub-step 2A2 — Request external reviews ← **parallel with 2A1** +Read `.claude/skills/implement-posix-command/RULES.md` and check every changed file against each rule category: +- Flag Parsing (pflag, help flag, error handling) +- File System Safety (use `callCtx.OpenFile`, no `os.*` filesystem calls) +- Memory Safety & Resource Limits (bounded buffers, no full-file loads) +- Input Validation & Error Handling (numeric overflow, exit codes, stderr for errors) +- Special File Handling (/dev/zero, FIFOs, /proc) +- Path & Traversal Safety +- Concurrency & Race Conditions +- Denial of Service Prevention (context cancellation, no infinite loops) +- Integer Safety +- Output Consistency +- Testing Requirements (all flags, edge cases, error paths, security properties tested) +- Cross-Platform Compatibility (filepath package, Windows reserved names, CRLF) + +Classify each finding by severity: +- **P0**: Security vulnerability or data corruption +- **P1**: Correctness bug or sandbox bypass +- **P2**: Missing test coverage or rule violation with workaround +- **P3**: Style / minor quality issue + +### Sub-step 2A2 — Run local codex review ← **parallel with 2A1** + +First, check whether the `codex` CLI is available: -Post a comment to trigger @codex reviews: ```bash -gh pr comment --body "@codex review this PR +which codex 2>/dev/null && echo AVAILABLE || echo UNAVAILABLE +``` -Important: Read the SPECS section of the PR description. If SPECS are present: **make sure the implementation matches ALL the specs**. -The specs override other instructions (code, inline comments in code, etc). ALL specs MUST be implemented. -" +**If `codex` is available**, run a local codex review: +```bash +gh pr diff | codex "Review this PR diff. Check for bugs, security issues, correctness, and code quality. Report findings by severity (P0–P3) with file and line references where applicable." ``` -The external reviews arrive asynchronously — their comments will be picked up by **address-pr-comments** in Sub-step 2B1. +Capture the output. Codex findings will be addressed in **Sub-step 2B** alongside self-review findings. + +**If `codex` is not available**, skip this sub-step and note "codex unavailable" in the iteration log. ### After 2A1 ∥ 2A2 complete Wait for **both** to complete before proceeding. -**Post the self-review outcome (from 2A1) as a GitHub PR comment** so it is always visible on the PR: +**Post the self-review outcome (from 2A1) as a GitHub PR comment** so it is always visible on the PR. Format it like this: ```bash -gh pr comment --body "" +gh pr comment --body "## Self-review (iteration N/) +Findings: 1×P1, 2×P2 ← or 'No findings.' if APPROVE with nothing to report + +P1 — path/to/file.go:42: + +P2 — path/to/other.go:17: +P2 — path/to/other.go:88: " ``` -**Record the self-review outcome:** -- If the review result is **APPROVE** (no findings) → skip to **Sub-step 2E (CI check)** -- If there are findings → continue to **Sub-step 2B** +**Post the codex review findings (from 2A2) as a separate GitHub PR comment**. Parse and reformat the raw codex output into the same structured format: +```bash +gh pr comment --body "## Codex review (iteration N/) +Findings: 1×P1, 2×P2 ← or 'No findings.' if codex reported nothing + +P1 — path/to/file.go:42: + +P2 — path/to/other.go:17: +P2 — path/to/other.go:88: " +``` + +**Record the self-review outcome and codex findings:** +- If both 2A1 and 2A2 produce no findings → skip to **Sub-step 2E (CI check)** +- If there are findings from either source → continue to **Sub-step 2B** --- @@ -141,16 +182,26 @@ git status git pull --rebase origin ``` -### Sub-step 2B — Address PR comments +### Sub-step 2B — Address Self-review and Codex findings -Run the **address-pr-comments** skill: -``` -/address-pr-comments -``` -This reads all unresolved review comments, evaluates validity, implements fixes, commits, pushes, and replies/resolves threads. +Address all findings reported by Sub-step 2A1 (self-review) and Sub-step 2A2 (local codex review): + +1. Collect all findings from both sources. +2. For each finding, evaluate its validity: + - **P0/P1**: Must be fixed immediately. + - **P2**: Fix unless there is a clear, documented reason not to. + - **P3**: Fix if straightforward; otherwise note it as a known low-priority item. +3. Implement fixes directly in the codebase. Do not skip findings without justification. +4. After all fixes are applied, stage and commit: + ```bash + git add -p # or specific files + git commit -m "[iter ] " + git push origin + ``` **Commit message prefix:** All commits created in this sub-step MUST be prefixed with the current loop iteration number, e.g. `[iter 3] Fix null check in parser`. + Wait for completion before proceeding to 2C. ### Sub-step 2C — Fix CI failures @@ -212,26 +263,7 @@ Check **all three** review sources for remaining issues: 1. **Self-review** — Was the latest `/code-review` result **APPROVE** (no findings)? -2. **External reviews** — Are there unresolved PR comment threads from @codex or @chatgpt-codex-connector[bot]? - ```bash - gh api graphql -f query=' - query($owner: String!, $repo: String!, $pr: Int!) { - repository(owner: $owner, name: $repo) { - pullRequest(number: $pr) { - reviewThreads(first: 100) { - nodes { - isResolved - comments(first: 1) { - nodes { author { login } body } - } - } - } - } - } - } - ' -f owner="{owner}" -f repo="{repo}" -F pr={pr-number} \ - --jq '.data.repository.pullRequest.reviewThreads.nodes[] | select(.isResolved == false)' - ``` +2. **Local codex review** — Did the `codex` CLI output from Sub-step 2A2 report any findings? 3. **CI** — Are all checks passing? ```bash @@ -240,12 +272,12 @@ Check **all three** review sources for remaining issues: **Decision matrix:** -| Self-review | External comments | CI | Action | -|------------|-------------------|-----|--------| -| APPROVE | None unresolved | Passing | **STOP — PR is clean** | +| Self-review | Codex findings | CI | Action | +|------------|----------------|-----|--------| +| APPROVE | None | Passing | **STOP — PR is clean** | | Any findings | Any | Any | **Continue** → go back to Sub-step 2A1 ∥ 2A2 | -| APPROVE | Unresolved threads | Any | **Continue** → go back to Sub-step 2A1 ∥ 2A2 (address-pr-comments will handle them) | -| APPROVE | None unresolved | Failing | **Continue** → go back to Sub-step 2A1 ∥ 2A2 (fix-ci-tests will handle it) | +| APPROVE | Findings present | Any | **Continue** → go back to Sub-step 2A1 ∥ 2A2 | +| APPROVE | None | Failing | **Continue** → go back to Sub-step 2A1 ∥ 2A2 (fix-ci-tests will handle it) | | — | — | — | If `iteration > 30` → **STOP — iteration limit reached** | Log the iteration result before continuing or stopping: @@ -279,67 +311,13 @@ Run a final verification regardless of how the loop exited: gh pr checks --json name,state ``` -3. **Confirm no unresolved threads:** - ```bash - gh api graphql -f query=' - query($owner: String!, $repo: String!, $pr: Int!) { - repository(owner: $owner, name: $repo) { - pullRequest(number: $pr) { - reviewThreads(first: 100) { - nodes { - isResolved - comments(first: 1) { - nodes { author { login } body } - } - } - } - } - } - } - ' -f owner="{owner}" -f repo="{repo}" -F pr={pr-number} \ - --jq '.data.repository.pullRequest.reviewThreads.nodes[] | select(.isResolved == false) | .comments.nodes[0].body' \ - 2>&1 | head -50 - ``` - -4. **Confirm Codex has replied to the LATEST review request (with polling):** - - The review request comment posted in Step 2A2 triggers Codex asynchronously. The bot may respond as either `codex` or `chatgpt-codex-connector[bot]` (the GitHub App). It can take **15+ minutes** to respond. You must verify that the bot has actually responded to **the most recent** request, not a previous iteration's request. Replies from earlier iterations do NOT count. - - **How to check:** - - Find the timestamp of the **last** Codex review request comment (the one posted in Step 2A2 of the final iteration). You can identify it by looking for comments authored by the current user containing "@codex" in the body: - ```bash - gh api repos/{owner}/{repo}/issues/{pr-number}/comments --paginate --jq ' - [.[] | select(.body | test("@codex")) | select(.user.login != "codex") | select(.user.login != "chatgpt-codex-connector[bot]")] | last | .created_at' - ``` - - Then check whether the codex bot has posted a review **after** that timestamp. Check both possible bot logins (`codex` and `chatgpt-codex-connector[bot]`): - ```bash - gh api repos/{owner}/{repo}/pulls/{pr-number}/reviews --paginate --jq ' - [.[] | select(.user.login == "codex" or .user.login == "chatgpt-codex-connector[bot]")] | last | {submitted_at, state, user: .user.login}' - ``` - - Also check issue comments (the bot may reply as a comment instead of a review): - ```bash - gh api repos/{owner}/{repo}/issues/{pr-number}/comments --paginate --jq ' - [.[] | select(.user.login == "codex" or .user.login == "chatgpt-codex-connector[bot]")] | last | {created_at, user: .user.login}' - ``` - - Compare timestamps. If the bot's latest review `submitted_at` (or comment `created_at`) is **after** the latest request's `created_at`, the bot has replied — **verification passes**. Use whichever response (review or comment) has the most recent timestamp. - - **Polling wait if the bot hasn't replied yet:** - - Do NOT immediately fail. Instead, poll and wait: - - **Poll interval:** 1 minute (use `sleep 60` between checks) - - **Maximum wait:** 10 minutes (up to 10 poll attempts) - - On each poll iteration, re-run the `gh api` commands above and compare timestamps - - Log each poll attempt: `"Waiting for Codex reply... (attempt N/10, elapsed Xm)"` - - **Only fail this verification** if the bot has still not replied after the full 10-minute wait. Then go back to **Step 2: Run the review-fix loop**. - - **If the bot has no reviews or comments at all** after the 10-minute wait, the verification also fails. +3. **Confirm the latest local codex review (Sub-step 2A2) reported no findings.** -Record the final state of each dimension (self-review, external reviews, CI, Codex response). +Record the final state of each dimension (self-review, local codex review, CI). -Track how many times Step 3 has **succeeded** (all four verifications passed) across the entire run. +Track how many times Step 3 has **succeeded** (all three verifications passed) across the entire run. -**If any verification fails** (CI failing, unresolved threads remain, unpushed commits that can't be pushed, or Codex hasn't responded to the latest review request), reset the success counter to 0, reset Step 2 and all its sub-steps to `pending`, and go back to **Step 2: Run the review-fix loop** for another iteration. +**If any verification fails** (CI failing, unpushed commits that can't be pushed, or the latest local codex review reported findings), reset the success counter to 0, reset Step 2 and all its sub-steps to `pending`, and go back to **Step 2: Run the review-fix loop** for another iteration. **If all verifications pass**, increment the success counter. If this is the **5th consecutive success** of Step 3 → proceed to **Step 4**. Otherwise → reset Step 2 and all its sub-steps to `pending`, and go back to **Step 2: Run the review-fix loop** for another iteration to re-confirm stability. @@ -371,7 +349,7 @@ Provide a summary in this exact format: ### Final state - **Self-review**: APPROVE / REQUEST_CHANGES / COMMENT -- **Unresolved external comments**: (list authors) +- **Local codex review**: Clean / Findings present (count) - **CI**: Passing / Failing (list failing checks) ### Remaining issues (if any) @@ -392,8 +370,8 @@ gh pr comment --body "" - **Never skip the review step** — always re-review after fixes to catch regressions or new issues introduced by the fixes themselves. - **Always submit reviews to GitHub** — each iteration's review must be posted as PR comments so there's a visible trail. -- **Run address-pr-comments before fix-ci-tests** — 2B then 2C, sequentially, so CI fixes run on code that already incorporates review feedback. +- **Address review findings before fix-ci-tests** — 2B then 2C, sequentially, so CI fixes run on code that already incorporates review feedback. - **Pull before fixing** — always `git pull --rebase` before launching fix agents to avoid working on stale code. -- **Stop early on APPROVE + CI green + no unresolved threads** — don't waste iterations if the PR is already clean. +- **Stop early on APPROVE + CI green + codex clean** — don't waste iterations if the PR is already clean. - **Respect the iteration limit** — hard stop at 30 to prevent infinite loops. If issues persist after 30 iterations, report what's left for the user to handle. - **Use gate checks** — always call TaskList and verify prerequisites before starting a step. This prevents out-of-order execution. diff --git a/README.md b/README.md index 57cf27fe..5a68ac5d 100644 --- a/README.md +++ b/README.md @@ -56,6 +56,8 @@ Every access path is default-deny: **AllowedPaths** restricts all file operations to specified directories using Go's `os.Root` API (`openat` syscalls), making it immune to symlink traversal, TOCTOU races, and `..` escape attacks. +**ProcPath** (Linux-only) overrides the proc filesystem root used by the `ps` builtin (default `/proc`). This is a privileged option set at runner construction time by trusted caller code — scripts cannot influence it. Access to the proc path is intentionally not subject to `AllowedPaths` restrictions, since proc is a read-only virtual filesystem that does not expose host data under the normal file hierarchy. + ## Shell Features See [SHELL_FEATURES.md](SHELL_FEATURES.md) for the complete list of supported and blocked features. diff --git a/SHELL_FEATURES.md b/SHELL_FEATURES.md index a95e5ce7..07492524 100644 --- a/SHELL_FEATURES.md +++ b/SHELL_FEATURES.md @@ -98,6 +98,7 @@ Blocked features are rejected before execution with exit code 2. - ✅ AllowedCommands — restricts which commands (builtins or external) may be executed; commands require the `rshell:` namespace prefix (e.g. `rshell:cat`); if not set, no commands are allowed - ✅ AllowAllCommands — permits any command (testing convenience) - ✅ AllowedPaths filesystem sandboxing — restricts all file access to specified directories +- ✅ ProcPath — overrides the proc filesystem path used by `ps` (default `/proc`; Linux-only; useful for testing/container environments) - ❌ External commands — blocked by default; requires an ExecHandler to be configured and the binary to be within AllowedPaths - ❌ Background execution: `cmd &` - ❌ Coprocesses: `coproc` diff --git a/allowedsymbols/symbols_internal.go b/allowedsymbols/symbols_internal.go index 2d8106b5..3f914f60 100644 --- a/allowedsymbols/symbols_internal.go +++ b/allowedsymbols/symbols_internal.go @@ -16,14 +16,19 @@ var internalPerPackageSymbols = map[string][]string{ "bufio.NewScanner", // 🟢 line-by-line reading of /proc files; no write capability. "bytes.NewReader", // 🟢 wraps a byte slice as an in-memory io.Reader; no I/O side effects. "context.Context", // 🟢 deadline/cancellation interface; no side effects. + "errors.Is", // 🟢 checks whether an error in a chain matches a target; pure function, no I/O. "errors.New", // 🟢 creates a sentinel error (unsupported-platform stub); pure function, no I/O. "fmt.Errorf", // 🟢 error formatting; pure function, no I/O. + "os.ErrNotExist", // 🟢 sentinel error value indicating a file or directory does not exist; read-only constant, no I/O. "fmt.Sprintf", // 🟢 string formatting; pure function, no I/O. "os.Getpid", // 🟠 returns the current process ID; read-only, no side effects. "os.Open", // 🟠 opens a file read-only; needed to stream /proc/stat line-by-line. "os.ReadDir", // 🟠 reads a directory listing; needed to enumerate /proc entries. "os.ReadFile", // 🟠 reads a whole file; needed to read /proc/[pid]/{stat,cmdline,status}. + "os.Stat", // 🟠 validates that the proc path exists before enumeration; read-only metadata, no write capability. + "path/filepath.Join", // 🟢 joins path elements to construct /proc//stat paths; pure function, no I/O. "strconv.Atoi", // 🟢 string-to-int conversion; pure function, no I/O. + "strconv.Itoa", // 🟢 int-to-string conversion for PID directory names; pure function, no I/O. "strconv.ParseInt", // 🟢 string to int64 with base/bit-size; pure function, no I/O. "strings.Fields", // 🟢 splits a string on whitespace; pure function, no I/O. "strings.HasPrefix", // 🟢 checks string prefix; pure function, no I/O. @@ -76,14 +81,19 @@ var internalAllowedSymbols = []string{ "context.Context", // 🟢 procinfo: deadline/cancellation interface; no side effects. "encoding/binary.BigEndian", // 🟢 winnet: reads big-endian IPv6 group values from DLL buffer; pure value, no I/O. "encoding/binary.LittleEndian", // 🟢 winnet: reads little-endian DWORD fields from DLL buffer; pure value, no I/O. + "errors.Is", // 🟢 procinfo: checks whether an error in a chain matches a target; pure function, no I/O. "errors.New", // 🟢 creates a sentinel error; pure function, no I/O. "fmt.Errorf", // 🟢 error formatting; pure function, no I/O. + "os.ErrNotExist", // 🟢 procinfo: sentinel error value indicating a file or directory does not exist; read-only constant, no I/O. "fmt.Sprintf", // 🟢 string formatting; pure function, no I/O. "os.Getpid", // 🟠 procinfo: returns the current process ID; read-only, no side effects. "os.Open", // 🟠 procinfo: opens a file read-only; needed to stream /proc/stat line-by-line. "os.ReadDir", // 🟠 procinfo: reads a directory listing; needed to enumerate /proc entries. "os.ReadFile", // 🟠 procinfo: reads a whole file; needed to read /proc/[pid]/{stat,cmdline,status}. + "os.Stat", // 🟠 procinfo: validates that the proc path exists before enumeration; read-only metadata, no write capability. + "path/filepath.Join", // 🟢 procinfo: joins path elements to construct /proc//stat paths; pure function, no I/O. "strconv.Atoi", // 🟢 string-to-int conversion; pure function, no I/O. + "strconv.Itoa", // 🟢 procinfo: int-to-string conversion for PID directory names; pure function, no I/O. "strconv.ParseInt", // 🟢 procinfo: string to int64 with base/bit-size; pure function, no I/O. "strings.Fields", // 🟢 procinfo: splits a string on whitespace; pure function, no I/O. "strings.HasPrefix", // 🟢 procinfo: checks string prefix; pure function, no I/O. diff --git a/builtins/builtins.go b/builtins/builtins.go index cdeed233..586d0ff5 100644 --- a/builtins/builtins.go +++ b/builtins/builtins.go @@ -148,6 +148,10 @@ type CallContext struct { // current shell policy. Used by the help builtin to list only executable // commands. CommandAllowed func(name string) bool + + // Proc provides access to the proc filesystem for the ps builtin. + // The path is fixed at construction time and cannot be overridden by callers. + Proc *ProcProvider } // Out writes a string to stdout. diff --git a/builtins/internal/procinfo/procinfo.go b/builtins/internal/procinfo/procinfo.go index db525346..ce375a1f 100644 --- a/builtins/internal/procinfo/procinfo.go +++ b/builtins/internal/procinfo/procinfo.go @@ -30,20 +30,37 @@ type ProcInfo struct { Cmd string // full cmdline, truncated to MaxCmdLen } +// DefaultProcPath is the default path to the proc filesystem. +const DefaultProcPath = "/proc" + +// resolveProcPath returns procPath if non-empty, otherwise DefaultProcPath. +func resolveProcPath(procPath string) string { + if procPath == "" { + return DefaultProcPath + } + return procPath +} + // ListAll returns all running processes. -func ListAll(ctx context.Context) ([]ProcInfo, error) { - return listAll(ctx) +// procPath is the path to the proc filesystem (e.g. "/proc"); pass +// DefaultProcPath or an empty string to use the default. +func ListAll(ctx context.Context, procPath string) ([]ProcInfo, error) { + return listAll(ctx, resolveProcPath(procPath)) } // GetSession returns processes in the current process session // (walks PPID chain from os.Getpid() upward to collect ancestors, plus // any processes that share the same session ID when available). -func GetSession(ctx context.Context) ([]ProcInfo, error) { - return getSession(ctx) +// procPath is the path to the proc filesystem; pass DefaultProcPath or an +// empty string to use the default. +func GetSession(ctx context.Context, procPath string) ([]ProcInfo, error) { + return getSession(ctx, resolveProcPath(procPath)) } // GetByPIDs returns process info for the given PIDs. // Missing PIDs are silently skipped. -func GetByPIDs(ctx context.Context, pids []int) ([]ProcInfo, error) { - return getByPIDs(ctx, pids) +// procPath is the path to the proc filesystem; pass DefaultProcPath or an +// empty string to use the default. +func GetByPIDs(ctx context.Context, procPath string, pids []int) ([]ProcInfo, error) { + return getByPIDs(ctx, resolveProcPath(procPath), pids) } diff --git a/builtins/internal/procinfo/procinfo_darwin.go b/builtins/internal/procinfo/procinfo_darwin.go index 8b64b296..1d802689 100644 --- a/builtins/internal/procinfo/procinfo_darwin.go +++ b/builtins/internal/procinfo/procinfo_darwin.go @@ -18,7 +18,7 @@ import ( "golang.org/x/sys/unix" ) -func listAll(ctx context.Context) ([]ProcInfo, error) { +func listAll(ctx context.Context, _ string) ([]ProcInfo, error) { kprocs, err := unix.SysctlKinfoProcSlice("kern.proc.all") if err != nil { return nil, fmt.Errorf("ps: SysctlKinfoProcSlice: %w", err) @@ -41,8 +41,8 @@ func listAll(ctx context.Context) ([]ProcInfo, error) { return procs, nil } -func getSession(ctx context.Context) ([]ProcInfo, error) { - all, err := listAll(ctx) +func getSession(ctx context.Context, procPath string) ([]ProcInfo, error) { + all, err := listAll(ctx, procPath) if err != nil { return nil, err } @@ -99,7 +99,7 @@ func getSession(ctx context.Context) ([]ProcInfo, error) { return result, nil } -func getByPIDs(ctx context.Context, pids []int) ([]ProcInfo, error) { +func getByPIDs(ctx context.Context, _ string, pids []int) ([]ProcInfo, error) { var result []ProcInfo for _, pid := range pids { if ctx.Err() != nil { diff --git a/builtins/internal/procinfo/procinfo_linux.go b/builtins/internal/procinfo/procinfo_linux.go index 94aaa2f6..76c61e80 100644 --- a/builtins/internal/procinfo/procinfo_linux.go +++ b/builtins/internal/procinfo/procinfo_linux.go @@ -11,8 +11,10 @@ import ( "bufio" "bytes" "context" + "errors" "fmt" "os" + "path/filepath" "strconv" "strings" "time" @@ -22,13 +24,13 @@ import ( // almost always 100, but we default to 100 and let procBootTime handle errors. const clkTck = 100 -func listAll(ctx context.Context) ([]ProcInfo, error) { - entries, err := os.ReadDir("/proc") +func listAll(ctx context.Context, procPath string) ([]ProcInfo, error) { + entries, err := os.ReadDir(procPath) if err != nil { - return nil, fmt.Errorf("ps: cannot read /proc: %w", err) + return nil, fmt.Errorf("ps: cannot read %s: %w", procPath, err) } - btime, _ := procBootTime() + btime, _ := procBootTime(procPath) var procs []ProcInfo for _, e := range entries { if ctx.Err() != nil { @@ -44,7 +46,7 @@ func listAll(ctx context.Context) ([]ProcInfo, error) { if err != nil { continue } - info, err := readProc(pid, btime) + info, err := readProc(procPath, pid, btime) if err != nil { continue } @@ -53,8 +55,8 @@ func listAll(ctx context.Context) ([]ProcInfo, error) { return procs, nil } -func getSession(ctx context.Context) ([]ProcInfo, error) { - all, err := listAll(ctx) +func getSession(ctx context.Context, procPath string) ([]ProcInfo, error) { + all, err := listAll(ctx, procPath) if err != nil { return nil, err } @@ -65,6 +67,9 @@ func getSession(ctx context.Context) ([]ProcInfo, error) { } // Walk PPID chain from current process upward; collect session ancestors. + // Note: if procPath points to a foreign PID namespace (e.g. a container), + // our host PID is unlikely to appear there, so the session result will be + // empty. This is expected — GetSession is designed for the current host. selfPID := os.Getpid() ancestors := make(map[int]bool) cur := selfPID @@ -80,7 +85,7 @@ func getSession(ctx context.Context) ([]ProcInfo, error) { // Also include all processes that share our SID (best-effort; fall back to // ancestor chain only). var selfSID int - if data, err := os.ReadFile(fmt.Sprintf("/proc/%d/stat", selfPID)); err == nil { + if data, err := os.ReadFile(filepath.Join(procPath, strconv.Itoa(selfPID), "stat")); err == nil { selfSID = parseSID(data) } @@ -94,7 +99,7 @@ func getSession(ctx context.Context) ([]ProcInfo, error) { continue } if selfSID != 0 { - if data, err := os.ReadFile(fmt.Sprintf("/proc/%d/stat", p.PID)); err == nil { + if data, err := os.ReadFile(filepath.Join(procPath, strconv.Itoa(p.PID), "stat")); err == nil { if parseSID(data) == selfSID { result = append(result, p) } @@ -104,25 +109,38 @@ func getSession(ctx context.Context) ([]ProcInfo, error) { return result, nil } -func getByPIDs(ctx context.Context, pids []int) ([]ProcInfo, error) { - btime, _ := procBootTime() +func getByPIDs(ctx context.Context, procPath string, pids []int) ([]ProcInfo, error) { + fi, err := os.Stat(procPath) + if err != nil { + return nil, fmt.Errorf("ps: cannot read %s: %w", procPath, err) + } + if !fi.IsDir() { + return nil, fmt.Errorf("ps: cannot read %s: not a directory", procPath) + } + btime, _ := procBootTime(procPath) var result []ProcInfo for _, pid := range pids { if ctx.Err() != nil { break } - info, err := readProc(pid, btime) + info, err := readProc(procPath, pid, btime) if err != nil { - continue + // ENOENT means the process no longer exists — skip silently. + // Any other error (EACCES, I/O, etc.) indicates a configuration + // or read failure and should be surfaced to the caller. + if errors.Is(err, os.ErrNotExist) { + continue + } + return nil, fmt.Errorf("ps: cannot read %s: %w", filepath.Join(procPath, strconv.Itoa(pid)), err) } result = append(result, info) } return result, nil } -// readProc reads process info for a single PID from /proc. -func readProc(pid int, btime int64) (ProcInfo, error) { - statData, err := os.ReadFile(fmt.Sprintf("/proc/%d/stat", pid)) +// readProc reads process info for a single PID from procPath. +func readProc(procPath string, pid int, btime int64) (ProcInfo, error) { + statData, err := os.ReadFile(filepath.Join(procPath, strconv.Itoa(pid), "stat")) if err != nil { return ProcInfo{}, err } @@ -177,11 +195,11 @@ func readProc(pid int, btime int64) (ProcInfo, error) { info.STime = "?" } - // UID from /proc/pid/status. - info.UID = readUID(pid) + // UID from procPath/pid/status. + info.UID = readUID(procPath, pid) - // Full cmdline from /proc/pid/cmdline (null-separated). - cmdline := readCmdline(pid) + // Full cmdline from procPath/pid/cmdline (null-separated). + cmdline := readCmdline(procPath, pid) if cmdline == "" { // Kernel thread: show [comm]. info.Cmd = "[" + comm + "]" @@ -222,9 +240,9 @@ func resolveTTY(_ int, ttyNr int64) string { } } -// readUID reads the real UID from /proc/pid/status. -func readUID(pid int) string { - data, err := os.ReadFile(fmt.Sprintf("/proc/%d/status", pid)) +// readUID reads the real UID from procPath/pid/status. +func readUID(procPath string, pid int) string { + data, err := os.ReadFile(filepath.Join(procPath, strconv.Itoa(pid), "status")) if err != nil { return "?" } @@ -241,9 +259,9 @@ func readUID(pid int) string { return "?" } -// readCmdline reads /proc/pid/cmdline and returns the command line string. -func readCmdline(pid int) string { - data, err := os.ReadFile(fmt.Sprintf("/proc/%d/cmdline", pid)) +// readCmdline reads procPath/pid/cmdline and returns the command line string. +func readCmdline(procPath string, pid int) string { + data, err := os.ReadFile(filepath.Join(procPath, strconv.Itoa(pid), "cmdline")) if err != nil || len(data) == 0 { return "" } @@ -260,9 +278,9 @@ func readCmdline(pid int) string { return cmd } -// procBootTime reads the boot time (seconds since epoch) from /proc/stat. -func procBootTime() (int64, error) { - f, err := os.Open("/proc/stat") +// procBootTime reads the boot time (seconds since epoch) from procPath/stat. +func procBootTime(procPath string) (int64, error) { + f, err := os.Open(filepath.Join(procPath, "stat")) if err != nil { return 0, err } diff --git a/builtins/internal/procinfo/procinfo_other.go b/builtins/internal/procinfo/procinfo_other.go index d732ebe1..87e50197 100644 --- a/builtins/internal/procinfo/procinfo_other.go +++ b/builtins/internal/procinfo/procinfo_other.go @@ -14,14 +14,14 @@ import ( var errUnsupported = errors.New("not supported on this platform") -func listAll(_ context.Context) ([]ProcInfo, error) { +func listAll(_ context.Context, _ string) ([]ProcInfo, error) { return nil, errUnsupported } -func getSession(_ context.Context) ([]ProcInfo, error) { +func getSession(_ context.Context, _ string) ([]ProcInfo, error) { return nil, errUnsupported } -func getByPIDs(_ context.Context, _ []int) ([]ProcInfo, error) { +func getByPIDs(_ context.Context, _ string, _ []int) ([]ProcInfo, error) { return nil, errUnsupported } diff --git a/builtins/internal/procinfo/procinfo_windows.go b/builtins/internal/procinfo/procinfo_windows.go index f6455831..54865ebb 100644 --- a/builtins/internal/procinfo/procinfo_windows.go +++ b/builtins/internal/procinfo/procinfo_windows.go @@ -15,7 +15,7 @@ import ( "golang.org/x/sys/windows" ) -func listAll(ctx context.Context) ([]ProcInfo, error) { +func listAll(ctx context.Context, _ string) ([]ProcInfo, error) { snapshot, err := windows.CreateToolhelp32Snapshot(windows.TH32CS_SNAPPROCESS, 0) if err != nil { return nil, fmt.Errorf("ps: CreateToolhelp32Snapshot: %w", err) @@ -49,8 +49,8 @@ func listAll(ctx context.Context) ([]ProcInfo, error) { return procs, nil } -func getSession(ctx context.Context) ([]ProcInfo, error) { - all, err := listAll(ctx) +func getSession(ctx context.Context, procPath string) ([]ProcInfo, error) { + all, err := listAll(ctx, procPath) if err != nil { return nil, err } @@ -87,8 +87,8 @@ func getSession(ctx context.Context) ([]ProcInfo, error) { return result, nil } -func getByPIDs(ctx context.Context, pids []int) ([]ProcInfo, error) { - all, err := listAll(ctx) +func getByPIDs(ctx context.Context, procPath string, pids []int) ([]ProcInfo, error) { + all, err := listAll(ctx, procPath) if err != nil { return nil, err } diff --git a/builtins/proc_provider.go b/builtins/proc_provider.go new file mode 100644 index 00000000..9b832797 --- /dev/null +++ b/builtins/proc_provider.go @@ -0,0 +1,42 @@ +// Unless explicitly stated otherwise all files in this repository are licensed +// under the Apache License Version 2.0. +// This product includes software developed at Datadog (https://www.datadoghq.com/). +// Copyright 2026-present Datadog, Inc. + +package builtins + +import ( + "context" + + "github.com/DataDog/rshell/builtins/internal/procinfo" +) + +// ProcProvider gives builtins controlled access to the proc filesystem. +// The path is fixed at construction time and cannot be overridden by callers. +type ProcProvider struct { + path string +} + +// NewProcProvider returns a ProcProvider for the given proc filesystem path. +// If path is empty, DefaultProcPath ("/proc") is used. +func NewProcProvider(path string) *ProcProvider { + if path == "" { + path = procinfo.DefaultProcPath + } + return &ProcProvider{path: path} +} + +// ListAll returns all running processes. +func (p *ProcProvider) ListAll(ctx context.Context) ([]procinfo.ProcInfo, error) { + return procinfo.ListAll(ctx, p.path) +} + +// GetSession returns processes in the current process session. +func (p *ProcProvider) GetSession(ctx context.Context) ([]procinfo.ProcInfo, error) { + return procinfo.GetSession(ctx, p.path) +} + +// GetByPIDs returns process info for the given PIDs. +func (p *ProcProvider) GetByPIDs(ctx context.Context, pids []int) ([]procinfo.ProcInfo, error) { + return procinfo.GetByPIDs(ctx, p.path, pids) +} diff --git a/builtins/ps/ps.go b/builtins/ps/ps.go index 9138b3da..169ebfe1 100644 --- a/builtins/ps/ps.go +++ b/builtins/ps/ps.go @@ -118,16 +118,16 @@ func registerFlags(fs *builtins.FlagSet) builtins.HandlerFunc { // -e / -A: all processes. Takes priority over -p because it is a // superset; GNU ps treats selection options as additive (union), so // -e -p still returns all processes and exits 0. - procs, err = procinfo.ListAll(ctx) + procs, err = callCtx.Proc.ListAll(ctx) case len(parsedPIDs) > 0: // -p only (no -e/-A): select specific PIDs. pidMode = true - procs, err = procinfo.GetByPIDs(ctx, parsedPIDs) + procs, err = callCtx.Proc.GetByPIDs(ctx, parsedPIDs) default: // Default: current session processes. - procs, err = procinfo.GetSession(ctx) + procs, err = callCtx.Proc.GetSession(ctx) } if err != nil { diff --git a/builtins/ps/ps_procpath_linux_test.go b/builtins/ps/ps_procpath_linux_test.go new file mode 100644 index 00000000..c0dc11eb --- /dev/null +++ b/builtins/ps/ps_procpath_linux_test.go @@ -0,0 +1,221 @@ +// Unless explicitly stated otherwise all files in this repository are licensed +// under the Apache License Version 2.0. +// This product includes software developed at Datadog (https://www.datadoghq.com/). +// Copyright 2026-present Datadog, Inc. + +//go:build linux + +package ps_test + +import ( + "bytes" + "context" + "errors" + "fmt" + "os" + "path/filepath" + "strconv" + "strings" + "testing" + + "github.com/stretchr/testify/require" + "mvdan.cc/sh/v3/syntax" + + "github.com/DataDog/rshell/interp" +) + +func runScriptWithProcPath(t *testing.T, script, procPath string) (stdout, stderr string, code int) { + t.Helper() + parser := syntax.NewParser() + prog, err := parser.Parse(strings.NewReader(script), "") + if err != nil { + t.Fatal(err) + } + var outBuf, errBuf bytes.Buffer + runner, err := interp.New( + interp.StdIO(nil, &outBuf, &errBuf), + interp.AllowAllCommands(), + interp.ProcPath(procPath), + ) + if err != nil { + t.Fatal(err) + } + defer runner.Close() + runErr := runner.Run(context.Background(), prog) + exitCode := 0 + if runErr != nil { + var es interp.ExitStatus + if errors.As(runErr, &es) { + exitCode = int(es) + } else { + t.Fatalf("unexpected runner error: %v", runErr) + } + } + return outBuf.String(), errBuf.String(), exitCode +} + +// TestProcPathNonexistentDirErrors ensures ps -e fails gracefully when the +// configured proc path does not exist. +func TestProcPathNonexistentDirErrors(t *testing.T) { + _, stderr, code := runScriptWithProcPath(t, "ps -e", "/nonexistent/proc/path") + if code != 1 { + t.Errorf("expected exit code 1 for nonexistent proc path, got %d", code) + } + if !strings.Contains(stderr, "ps:") { + t.Errorf("expected error message in stderr, got: %s", stderr) + } +} + +// TestProcPathNonexistentDirErrorsByPID ensures ps -p fails gracefully when +// the configured proc path does not exist. +func TestProcPathNonexistentDirErrorsByPID(t *testing.T) { + _, stderr, code := runScriptWithProcPath(t, "ps -p 1", "/nonexistent/proc/path") + if code != 1 { + t.Errorf("expected exit code 1 for nonexistent proc path, got %d", code) + } + if !strings.Contains(stderr, "ps:") { + t.Errorf("expected error message in stderr, got: %s", stderr) + } +} + +// TestProcPathNotADirErrors_ListAll ensures ps -e fails with a clear error +// when the configured proc path exists but is a regular file, not a directory. +func TestProcPathNotADirErrors_ListAll(t *testing.T) { + f, err := os.CreateTemp(t.TempDir(), "not-a-dir") + require.NoError(t, err) + require.NoError(t, f.Close()) + + _, stderr, code := runScriptWithProcPath(t, "ps -e", f.Name()) + if code != 1 { + t.Errorf("expected exit code 1 for file proc path, got %d", code) + } + if !strings.Contains(stderr, "ps:") { + t.Errorf("expected ps: error in stderr, got: %s", stderr) + } +} + +// TestProcPathNotADirErrors_ByPID ensures ps -p fails with a clear error +// when the configured proc path exists but is a regular file, not a directory. +func TestProcPathNotADirErrors_ByPID(t *testing.T) { + f, err := os.CreateTemp(t.TempDir(), "not-a-dir") + require.NoError(t, err) + require.NoError(t, f.Close()) + + _, stderr, code := runScriptWithProcPath(t, "ps -p 1", f.Name()) + if code != 1 { + t.Errorf("expected exit code 1 for file proc path, got %d", code) + } + if !strings.Contains(stderr, "ps:") { + t.Errorf("expected ps: error in stderr, got: %s", stderr) + } + if !strings.Contains(stderr, "not a directory") { + t.Errorf("expected 'not a directory' in stderr, got: %s", stderr) + } +} + +// TestProcPathEmptyUsesDefault ensures an empty ProcPath falls back to /proc +// and ps -e runs successfully. +func TestProcPathEmptyUsesDefault(t *testing.T) { + _, stderr, code := runScriptWithProcPath(t, "ps -e", "") + if code != 0 { + t.Fatalf("ps -e with empty ProcPath exited %d; stderr: %s", code, stderr) + } +} + +// writeFakeProc creates a minimal fake /proc filesystem under dir and returns +// the procPath. It writes a single fake process with the given pid and name. +func writeFakeProc(t *testing.T, pid int, name string) string { + t.Helper() + dir := t.TempDir() + + // Write /stat for boot time. + require.NoError(t, os.WriteFile(filepath.Join(dir, "stat"), []byte("cpu 0 0 0 0\nbtime 1000000000\n"), 0o644)) + + // Create the PID subdirectory using the provided pid. + pidDir := filepath.Join(dir, strconv.Itoa(pid)) + require.NoError(t, os.MkdirAll(pidDir, 0o755)) + + // Write //stat. + // Format: pid (comm) state ppid pgroup session tty_nr tpgid flags minflt + // cminflt majflt cmajflt utime stime cutime cstime priority nice + // numthreads itrealvalue starttime ... + // Fields after (comm): at least 20 required by readProc. + statContent := fmt.Sprintf("%d (%s) S 0 %d %d 0 -1 4194560 0 0 0 0 0 0 0 0 20 0 1 0 100\n", pid, name, pid, pid) + require.NoError(t, os.WriteFile(filepath.Join(pidDir, "stat"), []byte(statContent), 0o644)) + + // Write //status for UID lookup. + require.NoError(t, os.WriteFile(filepath.Join(pidDir, "status"), []byte("Name:\t"+name+"\nUid:\t1000 1000 1000 1000\n"), 0o644)) + + // Write //cmdline. + require.NoError(t, os.WriteFile(filepath.Join(pidDir, "cmdline"), []byte(name+"\x00"), 0o644)) + + return dir +} + +// TestProcPathFakeProc ensures ps -e reads from the custom proc path and +// returns entries from the fake proc tree rather than the real /proc. +func TestProcPathFakeProc(t *testing.T) { + procPath := writeFakeProc(t, 1, "fakeinit") + + stdout, stderr, code := runScriptWithProcPath(t, "ps -e", procPath) + if code != 0 { + t.Fatalf("ps -e with fake proc path exited %d; stderr: %s", code, stderr) + } + if !strings.Contains(stdout, "fakeinit") { + t.Errorf("expected 'fakeinit' in ps output; got:\n%s", stdout) + } + // The output should not contain real system processes. + if strings.Count(stdout, "\n") > 5 { + t.Errorf("expected only fake proc entries, but got many lines:\n%s", stdout) + } +} + +// TestProcPathFakeProcFullFormat ensures ps -ef also reads from the custom +// proc path and includes UID and PPID columns. +func TestProcPathFakeProcFullFormat(t *testing.T) { + procPath := writeFakeProc(t, 1, "fakeinit") + + stdout, stderr, code := runScriptWithProcPath(t, "ps -ef", procPath) + if code != 0 { + t.Fatalf("ps -ef with fake proc path exited %d; stderr: %s", code, stderr) + } + if !strings.Contains(stdout, "fakeinit") { + t.Errorf("expected 'fakeinit' in ps -ef output; got:\n%s", stdout) + } + if !strings.Contains(stdout, "PPID") { + t.Errorf("expected PPID column in ps -ef output; got:\n%s", stdout) + } +} + +// TestProcPathFakeProcByPID ensures ps -p also reads from the custom proc path. +func TestProcPathFakeProcByPID(t *testing.T) { + procPath := writeFakeProc(t, 1, "fakeinit") + + stdout, stderr, code := runScriptWithProcPath(t, "ps -p 1", procPath) + if code != 0 { + t.Fatalf("ps -p 1 with fake proc path exited %d; stderr: %s", code, stderr) + } + if !strings.Contains(stdout, "1") { + t.Errorf("expected PID 1 in output; got:\n%s", stdout) + } +} + +// TestProcPathFakeProcSession ensures bare ps (no flags) reads from the custom +// proc path via GetSession rather than the real /proc. +func TestProcPathFakeProcSession(t *testing.T) { + procPath := writeFakeProc(t, 1, "fakeinit") + // Bare ps uses GetSession. Our process's PID is not present in the fake + // proc tree (which only has PID 1), so the session walk returns no + // ancestors and the SID check finds no matches. The output must contain + // only the header line — if real /proc were read, many additional lines + // would appear. + stdout, stderr, code := runScriptWithProcPath(t, "ps", procPath) + if code != 0 { + t.Fatalf("ps with fake proc path exited %d; stderr: %s", code, stderr) + } + // Only the header line should be present (session is empty for the fake proc). + lineCount := strings.Count(stdout, "\n") + if lineCount > 1 { + t.Errorf("expected only header line (real /proc must not be read), got %d lines:\n%s", lineCount, stdout) + } +} diff --git a/cmd/rshell/main.go b/cmd/rshell/main.go index 7db4ef33..838bc36d 100644 --- a/cmd/rshell/main.go +++ b/cmd/rshell/main.go @@ -30,6 +30,7 @@ func run(args []string, stdin io.Reader, stdout, stderr io.Writer) int { allowedPaths string allowedCommands string allowAllCmds bool + procPath string ) cmd := &cobra.Command{ @@ -62,6 +63,7 @@ func run(args []string, stdin io.Reader, stdout, stderr io.Writer) int { allowedPaths: paths, allowedCommands: cmds, allowAllCommands: allowAllCmds, + procPath: procPath, } if commandSet { @@ -107,6 +109,7 @@ func run(args []string, stdin io.Reader, stdout, stderr io.Writer) int { cmd.Flags().StringVarP(&allowedPaths, "allowed-paths", "p", "", "comma-separated list of directories the shell is allowed to access") cmd.Flags().StringVar(&allowedCommands, "allowed-commands", "", "comma-separated list of namespaced commands (e.g. rshell:cat,rshell:find)") cmd.Flags().BoolVar(&allowAllCmds, "allow-all-commands", false, "allow execution of all commands (builtins and external)") + cmd.Flags().StringVar(&procPath, "proc-path", "", "path to the proc filesystem used by ps (default \"/proc\")") if err := cmd.Execute(); err != nil { var status interp.ExitStatus @@ -140,6 +143,7 @@ type executeOpts struct { allowedPaths []string allowedCommands []string allowAllCommands bool + procPath string } func execute(ctx context.Context, script, name string, opts executeOpts, stdin io.Reader, stdout, stderr io.Writer) error { @@ -163,6 +167,9 @@ func execute(ctx context.Context, script, name string, opts executeOpts, stdin i } else if len(opts.allowedCommands) > 0 { runOpts = append(runOpts, interp.AllowedCommands(opts.allowedCommands)) } + if opts.procPath != "" { + runOpts = append(runOpts, interp.ProcPath(opts.procPath)) + } runner, err := interp.New(runOpts...) if err != nil { diff --git a/cmd/rshell/main_procpath_linux_test.go b/cmd/rshell/main_procpath_linux_test.go new file mode 100644 index 00000000..67c6494e --- /dev/null +++ b/cmd/rshell/main_procpath_linux_test.go @@ -0,0 +1,58 @@ +// Unless explicitly stated otherwise all files in this repository are licensed +// under the Apache License Version 2.0. +// This product includes software developed at Datadog (https://www.datadoghq.com/). +// Copyright 2026-present Datadog, Inc. + +//go:build linux + +package main + +import ( + "fmt" + "os" + "path/filepath" + "strconv" + "testing" + + "github.com/stretchr/testify/assert" + "github.com/stretchr/testify/require" +) + +// TestProcPathFlagNonexistentDir ensures --proc-path with a nonexistent +// directory causes ps -e to fail with a non-zero exit code. +func TestProcPathFlagNonexistentDir(t *testing.T) { + code, _, stderr := runCLI(t, + "--allow-all-commands", + "--proc-path", "/nonexistent/proc/path", + "-c", "ps -e", + ) + assert.NotEqual(t, 0, code) + assert.Contains(t, stderr, "ps:") +} + +// writeFakeProcCLI creates a minimal fake proc filesystem and returns its path. +func writeFakeProcCLI(t *testing.T, pid int, name string) string { + t.Helper() + dir := t.TempDir() + require.NoError(t, os.WriteFile(filepath.Join(dir, "stat"), []byte("cpu 0 0 0 0\nbtime 1000000000\n"), 0o644)) + pidDir := filepath.Join(dir, strconv.Itoa(pid)) + require.NoError(t, os.MkdirAll(pidDir, 0o755)) + stat := fmt.Sprintf("%d (%s) S 0 %d %d 0 -1 4194560 0 0 0 0 0 0 0 0 20 0 1 0 100\n", pid, name, pid, pid) + require.NoError(t, os.WriteFile(filepath.Join(pidDir, "stat"), []byte(stat), 0o644)) + require.NoError(t, os.WriteFile(filepath.Join(pidDir, "status"), []byte("Name:\t"+name+"\nUid:\t1000 1000 1000 1000\n"), 0o644)) + require.NoError(t, os.WriteFile(filepath.Join(pidDir, "cmdline"), []byte(name+"\x00"), 0o644)) + return dir +} + +// TestProcPathFlagFakeProc ensures --proc-path with a valid fake proc tree +// causes ps -e to succeed and list processes from the fake tree. +func TestProcPathFlagFakeProc(t *testing.T) { + procPath := writeFakeProcCLI(t, 1, "fakeinit") + code, stdout, stderr := runCLI(t, + "--allow-all-commands", + "--proc-path", procPath, + "-c", "ps -e", + ) + assert.Equal(t, 0, code, "stderr: %s", stderr) + assert.Contains(t, stdout, "fakeinit") +} diff --git a/cmd/rshell/main_test.go b/cmd/rshell/main_test.go index d475413a..b4575503 100644 --- a/cmd/rshell/main_test.go +++ b/cmd/rshell/main_test.go @@ -242,3 +242,9 @@ func TestCommandLongFormRejected(t *testing.T) { assert.NotEqual(t, 0, code) assert.Contains(t, stderr, "unknown flag: --command") } + +func TestProcPathFlagInHelp(t *testing.T) { + code, stdout, _ := runCLI(t, "--help") + assert.Equal(t, 0, code) + assert.Contains(t, stdout, "--proc-path") +} diff --git a/interp/api.go b/interp/api.go index 2fd490fe..bba2242a 100644 --- a/interp/api.go +++ b/interp/api.go @@ -25,6 +25,7 @@ import ( "mvdan.cc/sh/v3/syntax" "github.com/DataDog/rshell/allowedpaths" + "github.com/DataDog/rshell/builtins" ) // runnerConfig holds the immutable configuration of a [Runner]. @@ -58,6 +59,14 @@ type runnerConfig struct { // command. Intended for testing convenience. allowAllCommands bool + // procPath is the path to the proc filesystem used by the ps builtin. + // Defaults to "/proc" when empty. + procPath string + + // proc is the ProcProvider constructed from procPath, created once in + // New() and shared across subshells via runnerConfig value copy. + proc *builtins.ProcProvider + // usedNew is set by New() and checked in Reset() to ensure a Runner // was properly constructed rather than zero-initialized. usedNew bool @@ -215,6 +224,7 @@ func New(opts ...RunnerOption) (*Runner, error) { if r.stdout == nil || r.stderr == nil { StdIO(r.stdin, r.stdout, r.stderr)(r) } + r.proc = builtins.NewProcProvider(r.procPath) return r, nil } @@ -479,6 +489,21 @@ func AllowAllCommands() RunnerOption { } } +// ProcPath sets the path to the proc filesystem used by the ps builtin. +// When not set (default), ps uses "/proc". This option has no effect on +// non-Linux platforms. +// +// Note: bare ps (session mode) uses the host process's PID to walk the PPID +// chain. If path points to a proc filesystem from a different PID namespace, +// the host PID will likely not be found there and session output will be empty. +// ps -e and ps -p work correctly against any proc tree. +func ProcPath(path string) RunnerOption { + return func(r *Runner) error { + r.procPath = path + return nil + } +} + // subshell creates a child Runner that inherits the parent's state. // If background is false, the child shares the parent's environment overlay // without copying, which is more efficient but must not be used concurrently. diff --git a/interp/runner_exec.go b/interp/runner_exec.go index 7ccdb0fb..c41d5a5b 100644 --- a/interp/runner_exec.go +++ b/interp/runner_exec.go @@ -310,6 +310,7 @@ func (r *Runner) call(ctx context.Context, pos syntax.Pos, args []string) { CommandAllowed: func(cmdName string) bool { return r.allowAllCommands || cmdName == "help" || r.allowedCommands[cmdName] }, + Proc: r.proc, } if r.stdin != nil { // do not assign a typed nil into the io.Reader interface call.Stdin = r.stdin