Skip to content

docs: add /review skill for automated PR review sweeps#474

Open
carlos-alm wants to merge 31 commits intomainfrom
docs/review-skill
Open

docs: add /review skill for automated PR review sweeps#474
carlos-alm wants to merge 31 commits intomainfrom
docs/review-skill

Conversation

@carlos-alm
Copy link
Contributor

Summary

  • Adds /review Claude Code skill that sweeps all open PRs
  • Resolves merge conflicts (via merge, never rebase), fixes CI failures, addresses all Claude + Greptile review comments, replies to each, and re-triggers reviewers until clean

Test plan

  • Invoke /review and verify it discovers open PRs
  • Verify conflict resolution uses merge (not rebase)
  • Verify Greptile is always re-triggered; Claude only when its feedback was addressed

findDbPath() walks up from cwd looking for .codegraph/graph.db. In a git
worktree (e.g. .claude/worktrees/agent-xxx/), this crosses the worktree
boundary and finds the main repo's database instead.

Add findRepoRoot() using `git rev-parse --show-toplevel` (which returns
the correct root for both repos and worktrees) and use it as a ceiling
in findDbPath(). The walk-up now stops at the git boundary, so each
worktree resolves to its own database.
…t test

- Ceiling test now uses `git init` to create a real git repo boundary,
  and verifies the outer DB is NOT found (without the ceiling fix it
  would be). Added separate test for finding DB within the boundary.
- Non-git test uses a fresh mkdtemp dir instead of os.tmpdir().
- Added debug log when ceiling stops the walk-up.
- Removed unused `origFindRepoRoot` variable.
- Mock execFileSync via vi.mock to verify caching calls exactly once
  and cache bypass calls twice (not just comparing return values)
- Mock execFileSync to throw in "not in git repo" test so the null
  path is always exercised regardless of host environment
- Rename "does not use cache" test to "bypasses cache" with spy assertion
…rt name

- Mock execFileSync to throw in "falls back gracefully when not in a git
  repo" test, preventing flakiness when os.tmpdir() is inside a git repo
- Rename realExecFileSync → execFileSyncForSetup with comment explaining
  it resolves to the spy due to vi.mock hoisting
Combine PR's connection.js imports (findRepoRoot, _resetRepoRootCache)
with main's barrel db/index.js pattern. Add missing re-exports to barrel.

Impact: 22 functions changed, 10 affected
On macOS, os.tmpdir() returns /var/folders/... but git rev-parse
returns /private/var/folders/... (resolved symlink). The ceiling
comparison failed because the paths didn't match. Use fs.realpathSync
on cwd to normalize symlinks before comparing against the ceiling.

Impact: 1 functions changed, 1 affected
Impact: 1 functions changed, 1 affected
…boundary

Impact: 9 functions changed, 19 affected
The 'returns default path when no DB found' test didn't control the git
ceiling — if tmpDir was inside a git repo, findRepoRoot() would return
a non-null ceiling and the default path would differ from emptyDir.
Mock execFileSync to throw so the cwd fallback is always exercised.
…m ceiling

On macOS, os.tmpdir() returns /var/... but git resolves symlinks to
/private/var/..., and on Windows, 8.3 short names (RUNNER~1) differ
from long names (runneradmin). findDbPath normalizes dir via
fs.realpathSync but findRepoRoot only used path.resolve, causing the
ceiling comparison to fail — the walk crossed the worktree boundary.

Fix findRepoRoot to use fs.realpathSync on git output, and resolve
test paths after directory creation so assertions match.

Impact: 1 functions changed, 77 affected
git rev-parse resolves 8.3 short names (RUNNER~1 → runneradmin) but
fs.realpathSync may not, causing the string comparison to fail and the
walk to escape the git ceiling boundary. Replace string equality with
isSameDirectory() that falls back to dev+ino comparison when paths
differ textually but refer to the same directory.

Also fix the test assertion to use findRepoRoot() for the expected
ceiling path instead of the test's worktreeRoot, since git may resolve
paths differently than realpathSync.

Impact: 2 functions changed, 137 affected
Impact: 2 functions changed, 2 affected
…hort names

Impact: 1 functions changed, 1 affected
…tave/codegraph into fix/db-path-worktree-boundary

Impact: 13 functions changed, 3 affected
On Windows CI, fs.realpathSync(process.cwd()) and git rev-parse
--show-toplevel can resolve 8.3 short names (RUNNER~1) differently
than long names (runneradmin). Apply realpathSync to both sides of
the assertion so the comparison is path-equivalent.
…leanup

Five skills implementing the full Titan Paradigm pipeline:
- /titan-recon: graph build, embeddings, health baseline, domain mapping
- /titan-gauntlet: 4-pillar audit (17 rules) with batch processing and resume
- /titan-sync: dependency cluster analysis and ordered execution planning
- /titan-gate: staged change validation with auto-rollback on failure
- /titan-reset: escape hatch to clean all artifacts and snapshots

Skills use artifact-passing between phases to manage context window limits.
Published as docs/examples for users to copy, with live copies in .claude/skills/.
Updated titan-paradigm.md with skill references and new backlog items.
- Add openRepo to db/index.js barrel export (from main)
- Keep realpathSync-based assertions for Windows 8.3 path normalization (from HEAD)

Impact: 33 functions changed, 28 affected
Skill checks all open PRs, resolves conflicts (via merge, never rebase),
fixes CI failures, addresses all reviewer comments (Claude + Greptile),
replies to each comment, and re-triggers reviewers in a loop until clean.
Skill checks all open PRs, resolves conflicts (via merge, never rebase),
fixes CI failures, addresses all reviewer comments (Claude + Greptile),
replies to each comment, and re-triggers reviewers in a loop until clean.
@claude
Copy link

claude bot commented Mar 16, 2026

Claude finished @carlos-alm's task —— View job


PR Review Complete

Reviewing PR #474: "docs: add /review skill for automated PR review sweeps"

Todo List

  • Read the new skill file to understand functionality
  • Apply 5-phase review methodology
  • Assess backlog compliance
  • Provide structured final assessment with rating

Root Cause Analysis

Problem Being Solved: This PR adds a Claude Code skill to automate the manual process of reviewing and fixing open PRs. The skill handles common PR maintenance tasks: resolving conflicts, fixing CI failures, addressing review feedback, and re-triggering reviewers.

Why This Approach: The skill is documented as a structured workflow that systematically processes all open PRs. It uses the GitHub CLI (gh) and git commands that Claude Code already has access to, avoiding new dependencies.

Risk Assessment: This is documentation-only (no code execution risk). However, the documented workflow involves destructive git operations and automated code changes based on review comments.

Backlog Compliance

  • Zero-dep: ✓ — Uses only gh CLI and git (already assumed available)
  • Foundation-aligned: ✗ — Violates Principle 8: "We are not an AI agent" and "We don't make decisions, run multi-step workflows, or modify code"
  • Problem-fit: 1 — This is purely about PR maintenance automation, not helping AI agents navigate codebases or understand dependencies
  • Breaking: No — Documentation addition only
  • Tier: 3 (foundation-misaligned)

Critical Concerns

Foundation Violation (Principle 8)

The skill directly contradicts FOUNDATION.md Principle 8: "We are not an application. We are not a coding tool. We are not an AI agent. We don't make decisions, run multi-step workflows, or modify code."

This skill is explicitly designed as a multi-step automated workflow that:

  • Makes decisions about how to resolve merge conflicts
  • Runs complex multi-step workflows across multiple PRs
  • Modifies code by "reading conflicting files, understanding both sides, and making the correct resolution"
  • Automatically addresses review feedback by making code changes

Scope Creep Risk

The skill transforms Claude Code from a "code intelligence engine" into a PR automation agent. Per FOUNDATION.md: "We expose tools to AI agents via MCP, but we are not an agent ourselves."

Operational Safety Concerns

The documented workflow includes potentially dangerous operations:

  • Automatic merge conflict resolution without human review
  • Automated code changes based on review comments
  • Force operations on PR branches
  • No rollback mechanisms for incorrect automated changes

Technical Assessment

Positive Aspects

  • Well-structured documentation with clear steps
  • Uses existing tools (gh CLI, git) without new dependencies
  • Includes safety rules like "never rebase" and "never force-push"
  • Comprehensive coverage of PR maintenance tasks

Documentation Quality

The skill is thoroughly documented with clear steps, examples, and safety rules. The writing is precise and actionable.

Final Recommendation

  • Rating: ⭐⭐☆☆☆ (2/5)
  • Action: REQUEST CHANGES
  • Reasoning: While technically well-executed, this feature fundamentally misaligns with codegraph's stated purpose and principles. The project explicitly positions itself as a code intelligence engine, not an automation agent. Adding agent-like capabilities represents significant scope creep that could confuse the project's identity and competitive positioning.

Suggested Path Forward

If PR automation is valuable, consider:

  1. Moving this to a separate repository/package that uses codegraph as a dependency
  2. Clearly documenting this as an experimental extension outside core codegraph scope
  3. Adding explicit warnings about the experimental/agent nature if kept

The skill itself is well-designed, but belongs in a codegraph-powered automation tool, not in the core intelligence engine.

Replace realpathSync comparison in db.test.js ceiling test with
existence-based assertions to avoid Windows 8.3 short name mismatch
(RUNNER~1 vs runneradmin). Fix artifact count (5 -> 6 files) in
skills README and rule count (31 -> 17 rules) in titan-paradigm.md.
@greptile-apps
Copy link
Contributor

greptile-apps bot commented Mar 16, 2026

Greptile Summary

This PR adds the /review skill for automated PR-sweep automation, and bundles five new Titan Paradigm skills (/titan-recon, /titan-gauntlet, /titan-sync, /titan-gate, /titan-reset) along with their full documentation and cross-references in docs/. It also includes a minor update to tests/unit/db.test.js to address Windows CI flakiness. The changes are largely additive and well-structured, but there are three issues worth addressing before merge.

Key findings:

  • Fork PR breakage — Step 2a of the /review skill uses git fetch origin + git checkout <head-branch>, which fails silently for PRs opened from forked repositories (the branch is not on origin). gh pr checkout <number> is the correct, fork-aware command.
  • Unbounded retry loop — Step 2h has no iteration cap; if Greptile keeps surfacing issues the agent will loop indefinitely rather than escalating to a human.
  • Non-fix in tests/unit/db.test.js — The updated comment claims to fix Windows 8.3 short-name CI flakiness, but the assertion change (template literal → string concatenation) is functionally identical and does not actually normalize paths. The described fix would require fs.realpathSync on both sides.
  • /titan-reset hardcodes only 10 batch slots — runs with more than 10 GAUNTLET batches will leave orphaned snapshots behind.

Confidence Score: 3/5

  • Safe to merge for the docs/skills additions, but the test change claims to fix a CI issue without actually fixing it, and the review skill breaks for fork PRs.
  • The Titan skills and their documentation are well-designed and pose no functional risk. The two actionable issues in the /review skill (fork PR handling and unbounded retry loop) would cause silent failures when the skill is invoked against repos with fork PRs, and the test change misleadingly documents a fix that isn't implemented — leaving the CI flakiness it describes still present.
  • .claude/skills/review/SKILL.md (fork PR handling + retry cap) and tests/unit/db.test.js (non-fix for Windows CI flakiness)

Important Files Changed

Filename Overview
.claude/skills/review/SKILL.md New PR-sweep skill with two issues: Step 2a uses git checkout which breaks for fork PRs, and Step 2h lacks an iteration cap that could create an unbounded retry loop.
tests/unit/db.test.js Comment updated to describe a Windows 8.3 short-name CI flakiness fix, but the actual assertion change (template literal → string concatenation) is functionally identical and does not fix the described problem.
.claude/skills/titan-reset/SKILL.md Hardcodes deletion of titan-batch-1 through titan-batch-10 only; runs with more than 10 batches will leave orphaned snapshots behind.
.claude/skills/titan-gate/SKILL.md New GATE phase skill; well-structured with clear PASS/WARN/FAIL verdicts, gentle rollback (git reset HEAD), and proper worktree pre-flight check.
.claude/skills/titan-gauntlet/SKILL.md New GAUNTLET phase skill; thorough 17-rule, 4-pillar audit with NDJSON incremental writes, context-budget management, and session resumability.
.claude/skills/titan-recon/SKILL.md New RECON phase skill; maps codebase dependency graph, builds priority queue, writes titan-state.json, and initialises the Titan pipeline cleanly.
.claude/skills/titan-sync/SKILL.md New GLOBAL SYNC phase skill; correctly reads GAUNTLET artifacts, finds dependency clusters, plans shared abstractions, and outputs an ordered execution plan.
docs/examples/claude-code-skills/README.md New documentation for the Titan skills suite; clear installation steps, pipeline diagram, artifact table, and context-window management guidance.
docs/use-cases/titan-paradigm.md Updated to cross-reference the new Claude Code skills at each Titan phase callout; additions are accurate and link to the correct SKILL.md paths.

Sequence Diagram

sequenceDiagram
    participant U as User
    participant R as /review skill
    participant W as Git Worktree
    participant GH as GitHub API
    participant G as Greptile
    participant C as Claude bot

    U->>R: /review
    R->>W: /worktree (Step 0)
    R->>GH: gh pr list --state open (Step 1)
    GH-->>R: [PR #1, PR #2, ...]

    loop For each open PR
        R->>W: gh pr checkout <number> (Step 2a)
        R->>GH: check mergeable (Step 2b)
        alt CONFLICTING
            R->>W: git merge origin/<base>
            R->>W: resolve + commit + push
        end
        R->>GH: gh pr checks (Step 2c)
        alt CI failing
            R->>W: fix + push
        end
        R->>GH: fetch all review comments (Step 2d)
        R->>W: apply fixes per concern (Step 2e–2f)
        R->>GH: reply to each comment (Step 2e)
        R->>GH: push grouped commits (Step 2f)
        R->>GH: @greptileai comment (Step 2g — always)
        opt Claude feedback addressed
            R->>GH: @claude comment (Step 2g)
        end
        loop Until clean (Step 2h — ⚠ no cap)
            GH->>G: trigger Greptile review
            G-->>GH: new comments
            R->>GH: fetch new comments
            R->>W: apply fixes + push
        end
    end
    R->>U: summary table (Step 3)
Loading

Comments Outside Diff (2)

  1. .claude/skills/titan-reset/SKILL.md, line 1176-1190 (link)

    Hardcoded batch limit leaves orphaned snapshots

    The reset skill hardcodes deletion of titan-batch-1 through titan-batch-10. A GAUNTLET run with more than 10 batches (common on large codebases) will leave the remaining snapshots behind, accumulating storage over repeated pipeline runs.

    A more robust approach would be to list and delete all snapshots matching the titan-batch-* pattern dynamically:

    codegraph snapshot list --json | jq -r '.[] | select(.name | startswith("titan-batch-")) | .name' | xargs -I{} codegraph snapshot delete {} 2>/dev/null
  2. tests/unit/db.test.js, line 3123-3130 (link)

    Comment describes a fix that isn't actually implemented

    The new comment says the change avoids the Windows 8.3 short-name problem (RUNNER~1 vs runneradmin), but the assertion is functionally identical to the one it replaced — both evaluate to the same runtime string:

    // old
    `${path.basename(outerDir)}${path.sep}.codegraph`
    
    // new
    path.basename(outerDir) + path.sep + '.codegraph'
    

    path.basename(outerDir) still returns whatever casing/short-name the OS resolved outerDir to, so the mismatch can still occur. To actually guard against this, both sides of the comparison would need to be normalised through fs.realpathSync:

    const resolvedResult   = fs.realpathSync(result);
    const resolvedOuterDir = fs.realpathSync(outerDir);
    expect(resolvedResult).not.toContain(
      path.basename(resolvedOuterDir) + path.sep + '.codegraph'
    );

Last reviewed commit: 99af3d3

Comment on lines +28 to +33

```bash
git fetch origin
git checkout <head-branch>
git pull origin <head-branch>
```
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Missing /worktree isolation step

CLAUDE.md (line 195) mandates: "Every session must start with /worktree to get an isolated copy of the repo before making any changes." This skill directly runs git checkout <head-branch> on the shared working tree, which can corrupt another session's in-progress work.

The skill should either instruct the agent to run /worktree at the start of the session, or use gh pr checkout <number> inside a dedicated worktree per PR. Without this, invoking /review while another Claude Code session is active violates the repo's parallel-session safety rules.

Rule Used: CLAUDE.md (source)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed — added a /worktree isolation step (Step 0) at the start of the skill workflow, consistent with the CLAUDE.md parallel-session rules.

Comment on lines +110 to +116
### 2f. Commit and push fixes

After addressing all comments for a PR:

1. Stage only the files you changed.
2. Commit: `fix: address review feedback on #<number>`
3. Push to the PR branch.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Contradicts "one concern per commit" rule

Step 2f instructs the agent to batch all review-feedback fixes into a single commit (fix: address review feedback on #<number>), but the Rules section at line 171 says "One concern per commit — don't lump conflict resolution with code fixes." If the review comments span multiple unrelated concerns (e.g., a logic fix in one file and a naming change in another), this step would violate that rule. Consider instructing the agent to group commits by concern rather than lumping all feedback into one commit.

Note: If this suggestion doesn't match your team's coding style, reply to this and let me know. I'll remember it for next time!

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed — updated step 2f to group commits by concern rather than batching all fixes into a single commit, consistent with the one-concern-per-commit rule in the Rules section.

Add /worktree isolation step at the start of the workflow, consistent
with the CLAUDE.md parallel-session rules. Update step 2f to group
commits by concern rather than batching all fixes into a single commit.
@carlos-alm
Copy link
Contributor Author

@greptileai

Comment on lines +39 to +45
```

### 2b. Resolve merge conflicts

Check if the PR has conflicts with its base branch:

```bash
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

git checkout breaks for fork PRs

Step 2a fetches from origin and then does git checkout <head-branch>. For PRs opened from a forked repository, the head branch lives on the fork's remote — not on origin — so the checkout will fail with "pathspec did not match any file(s) known to git".

gh pr checkout <number> handles this transparently: it fetches from the correct remote (fork or origin) and sets up a local tracking branch. Replace the three-line block with:

### 2a. Switch to the PR branch

```bash
gh pr checkout <number>
git pull

Comment on lines +149 to +158
3. If there are **new** comments from Greptile or Claude, go back to Step 2e and address them.
4. **Repeat this loop** until Greptile's latest review has no actionable comments.
5. Verify CI is still green after all changes.

---

## Step 3: Summary

After processing all PRs, output a summary table:

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No iteration cap on the retry loop

Step 2h says "Repeat this loop until Greptile's latest review has no actionable comments" with no upper bound. If Greptile consistently surfaces new issues (e.g., due to churn in the PR or a recurring false positive), the agent will loop indefinitely.

Consider adding an explicit escape condition, for example:

4. **Repeat this loop**, up to a maximum of **3 rounds**. If the PR is still not clean after 3 rounds, note it in the summary as "needs human review" and move to the next PR.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant