feat: /titan-run orchestrator with diff review, semantic assertions, arch snapshots #557
carlos-alm wants to merge 70 commits into main
…in backlog

These two items deliver the highest immediate impact on agent experience and graph accuracy without requiring Rust porting or TypeScript migration. They should be implemented before any Phase 4+ roadmap work.
- #83: hook-optimized `codegraph brief` enriches passively-injected context
- #71: basic type inference closes the biggest resolution gap for TS/Java
Impact: 14 functions changed, 0 affected
Add new Phase 4 covering the port of JS-only build phases to Rust:
- 4.1-4.3: AST nodes, CFG, dataflow visitor ports (~587ms savings)
- 4.4: Batch SQLite inserts (~143ms)
- 4.5: Role classification & structure (~42ms)
- 4.6: Complete complexity pre-computation
- 4.7: Fix incremental rebuild data loss on native engine
- 4.8: Incremental rebuild performance (target sub-100ms)

Bump old Phases 4-10 to 5-11 with all cross-references updated. Benchmark evidence shows ~50% of native build time is spent in JS visitors that run identically on both engines.
Take main's corrected #57 section anchors; keep HEAD's v2.7.0 version reference. Impact: 10 functions changed, 11 affected
…ative-acceleration Impact: 25 functions changed, 46 affected
- Add COMMITS=0 guard in publish.yml to return clean version when HEAD is exactly at a tag (mirrors bench-version.js early return)
- Change bench-version.js to use PATCH+1-dev.COMMITS format instead of PATCH+COMMITS-dev.SHA (mirrors publish.yml's new scheme)
- Fix fallback in bench-version.js to use dev.1 matching publish.yml's no-tags COMMITS=1 default

Impact: 1 functions changed, 0 affected
The release skill now scans commit history using conventional commit rules to determine major/minor/patch automatically. Explicit version argument still works as before.
…ns, and architectural snapshot

Add /titan-run skill that dispatches the full Titan pipeline (recon → gauntlet → sync → forge) to sub-agents with fresh context windows, enabling end-to-end autonomous execution.

Hardening layers added across the pipeline:
- Pre-Agent Gate (G1-G4): git health, worktree validity, state integrity, backups
- Post-phase validation (V1-V15): artifact structure, coverage, consistency checks
- Stall detection with per-phase thresholds and no-progress abort
- Mandatory human checkpoint before forge (unless --yes)

New validation tools integrated into forge and gate:
- Diff Review Agent (forge Step 9): verifies each diff matches the gauntlet recommendation and sync plan intent before gate runs
- Semantic Assertions (gate Step 5): export signature stability, import resolution integrity, dependency direction, re-export chain validation
- Architectural Snapshot Comparator (gate Step 5.5): community stability, cross-domain dependency direction, cohesion delta, drift detection vs pre-forge baseline
Greptile Summary
This PR adds the … The PR incorporates an unusually large number of fixes from prior review cycles, all of which appear correctly applied in both the …
Two remaining concerns:
Confidence Score: 4/5
Important Files Changed
Flowchart
%%{init: {'theme': 'neutral'}}%%
flowchart TD
A["titan-run start"] --> B["Step 0: Pre-flight + G1-G4 gate"]
B --> C{"--start-from?"}
C -->|yes| D["Step 0.5: Artifact pre-validation\nV1-V10 + NDJSON integrity"]
C -->|no| E["Step 1: RECON sub-agent"]
D --> E
E --> F["V1-V4 validation"]
F --> G["Step 2: GAUNTLET loop\nsub-agent x N iterations\nstall detection maxStalls=3"]
G --> H["V5-V7 + NDJSON integrity check"]
H --> I["Step 3: SYNC sub-agent"]
I --> J["V8-V10 validation"]
J --> K["Step 3.5a: Arch Snapshot Capture\ncommunities + structure + drift\narch-snapshot.json"]
K --> L["Step 3.5b: Human Checkpoint\nMANDATORY PAUSE unless --yes"]
L --> M["Step 4: FORGE loop\nstall detection maxStalls=2"]
M --> M1
subgraph forge_sub["Forge Sub-agent per phase"]
M1["Apply change"] --> M2["git add staged files"]
M2 --> M3{"Diff Review D1-D5\nscope, intent, commit msg,\ndeletion audit, leftovers"}
M3 -->|"DIFF FAIL"| M4["Rollback + failedTargets\ncontinue next target"]
M3 -->|"DIFF PASS or WARN"| M5["Run tests pre-gate\nfast-fail optimization"]
M5 -->|"fail"| M6["Step 13: rollback\ngit reset + git checkout"]
M5 -->|"pass"| M7["titan-gate skill"]
M7 --> M8{"Gate verdict"}
M8 -->|"test/lint/build FAIL\ngate auto-unstaged"| M6
M8 -->|"semantic/structural FAIL\nstaged preserved by gate"| M9["forge inline rollback\nskip step 13"]
M8 -->|"PASS"| M10["git commit\nrecord diffWarnings"]
end
M10 --> N["V11-V13 post-agent checks\ncommit audit + test suite"]
M4 --> N
M6 --> N
M9 --> N
N --> O{"All phases complete?"}
O -->|"yes"| P["V14-V15 final validation\nstate consistency + gate-log"]
O -->|"no"| M
P --> Q["Step 5: Final Report"]
subgraph gate_sub["Gate Sub-agent per commit"]
G1s["Step 1: codegraph check --staged\n+ diff-impact"] --> G2s["Step 2: Cycles"]
G2s --> G3s["Step 3: Complexity delta"]
G3s --> G4s["Step 4: Lint, Build, Test\nauto-rollback on FAIL"]
G4s --> G5s["Step 5: Semantic Assertions\n5a signature stability\n5b import resolution\n5c dependency direction\n5d barrel re-export chain"]
G5s --> G55s["Step 5.5: Arch Snapshot Compare\nA1 community drift\nA2 domain direction\nA3 cohesion delta\nA4 resolved drift signal"]
G55s --> G9s["Step 9: Verdict\nSteps 1-3,5,5.5,6-8 no rollback\nStep 4 failures auto-rollback"]
end
M7 --> gate_sub
| if currentAuditedCount == previousAuditedCount: | ||
| stallCount += 1 | ||
| Print: "WARNING: Gauntlet iteration <iteration> made no progress (stall <stallCount>/<maxStalls>)" | ||
| if stallCount >= maxStalls: | ||
| Stop: "Gauntlet stalled for <maxStalls> consecutive iterations at <currentAuditedCount>/<expectedTargetCount> targets. Likely stuck on a problematic target. Check gauntlet.ndjson for the last successful entry and investigate the next target in the batch." |
Undefined variable `previousAuditedCountBeforeAgent` in gauntlet efficiency check
The efficiency check references previousAuditedCountBeforeAgent, which is never defined in the pseudocode. By the time this line is reached, previousAuditedCount has already been updated to currentAuditedCount on the line just above, so using it there would always yield 0.
To correctly compute how many targets this iteration processed, capture the pre-agent count before the update. The current pseudocode reads:
previousAuditedCount = currentAuditedCount # update for next iteration's stall check
# Efficiency check: if progress is very slow (< 2 targets per iteration), warn
targetsThisIteration = currentAuditedCount - previousAuditedCountBeforeAgent # ← undefined
Should be:
# Save count before update for efficiency check
countBeforeUpdate = previousAuditedCount
previousAuditedCount = currentAuditedCount # update for next iteration's stall check
# Efficiency check
targetsThisIteration = currentAuditedCount - countBeforeUpdate
if targetsThisIteration == 1 and iteration > 3:
Print: "WARNING: Only 1 target per iteration..."
This same issue exists in the identical copy at docs/examples/claude-code-skills/titan-run/SKILL.md at the same line.
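To make the ordering issue concrete, the corrected bookkeeping could be modelled roughly as follows. This is a minimal sketch, not the skill's actual implementation (the skill is agent-facing pseudocode): `recordIteration` and the shape of the `state` object are hypothetical, and resetting `stallCount` on progress is an assumption based on the "consecutive iterations" wording of the stop message. Only the variable names `previousAuditedCount`, `currentAuditedCount`, `countBeforeUpdate`, `stallCount`, and `maxStalls` come from the pseudocode above.

```javascript
// Hypothetical helper: updates gauntlet loop state after one sub-agent iteration.
// Returns warnings, the per-iteration progress, and a flag telling the caller to abort.
function recordIteration(state, currentAuditedCount, iteration, maxStalls = 3) {
  const warnings = [];

  // Stall check: no new targets were audited since the previous iteration.
  if (currentAuditedCount === state.previousAuditedCount) {
    state.stallCount += 1;
    warnings.push(`no progress (stall ${state.stallCount}/${maxStalls})`);
  } else {
    state.stallCount = 0; // assumption: progress resets the consecutive-stall counter
  }

  // Save the pre-update count BEFORE overwriting it, so the efficiency
  // check measures this iteration instead of always yielding 0.
  const countBeforeUpdate = state.previousAuditedCount;
  state.previousAuditedCount = currentAuditedCount;

  const targetsThisIteration = currentAuditedCount - countBeforeUpdate;
  if (targetsThisIteration === 1 && iteration > 3) {
    warnings.push('only 1 target per iteration');
  }

  return { warnings, targetsThisIteration, abort: state.stallCount >= maxStalls };
}
```

Keeping the snapshot of the old count and the reassignment on adjacent lines makes it harder for a later edit to separate them again.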
Fixed — saved pre-update count to countBeforeUpdate before reassignment. Applied to both .claude/skills/ and docs/examples/ copies.
Impact: 1662 functions changed, 760 affected
Resolved merge conflicts with main and fixed the undefined previousAuditedCountBeforeAgent variable in the gauntlet loop efficiency check.
| If any SHA doesn't exist → **WARN**: "Commit <sha> recorded in state but not found in git history. State may be out of sync." | ||
|
|
||
| **V15. Gate log consistency:** | ||
| If `.codegraph/titan/gate-log.ndjson` exists: |
Incomplete merge conflict detection: misses `AU`, `UA`, `DU`, `UD` markers
The check only lists UU, AA, DD as conflict indicators in git status --porcelain output. Git also reports AU (unmerged, added by us), UA (unmerged, added by them), DU (unmerged, deleted by us), and UD (unmerged, deleted by them) in the porcelain format for merge conflicts. Missing these cases could allow the pipeline to proceed with unresolved conflicts into a forge sub-agent that commits.
| If `.codegraph/titan/gate-log.ndjson` exists: | |
| - **Merge conflicts** (lines starting with `UU`, `AA`, `DD`, `AU`, `UA`, `DU`, `UD`): Stop immediately: "Unresolved merge conflict detected. Resolve before continuing." |
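For reference, the full set of unmerged status codes can be checked mechanically. The sketch below is illustrative only: `findConflicts` is a hypothetical helper, not part of the skill, and it assumes the v1 porcelain layout of two status characters, a space, then the path.

```javascript
// The seven two-letter XY codes that `git status --porcelain` (v1) uses for unmerged paths.
const CONFLICT_CODES = new Set(['DD', 'AU', 'UD', 'UA', 'DU', 'AA', 'UU']);

// Hypothetical helper: returns the paths that are still in a conflicted state.
function findConflicts(porcelainOutput) {
  return porcelainOutput
    .split('\n')
    // Each entry is "XY <path>": two status characters, a space, then the path.
    .filter((line) => line.length > 3 && CONFLICT_CODES.has(line.slice(0, 2)))
    .map((line) => line.slice(3));
}
```

In a Node script the input could come from `child_process.execSync('git status --porcelain', { encoding: 'utf8' })`; a non-empty result would correspond to the "Stop immediately" branch.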
Fixed — added AU, UA, DU, UD to the merge conflict detection markers in both .claude/skills/ and docs/examples/ copies.
| Gate now includes semantic assertions (Step 5) and architectural snapshot comparison (Step 5.5). | ||
| The arch snapshot is at .codegraph/titan/arch-snapshot.json. | ||
| 6. Commit on success, rollback on failure | ||
|
|
--start-from forge silently disables architectural comparison without warning
When using --start-from forge, Step 3.5a (architectural snapshot capture) is skipped entirely since it lives inside Step 3.5, which comes after sync. The gate's Step 5.5 gracefully handles a missing arch-snapshot.json by skipping the comparison, but the user receives no warning that the architectural drift check will be inactive for the entire forge run.
Consider adding an explicit notice at the start of the forge loop when arch-snapshot.json is absent and the start phase was forge:
if startPhase == "forge" and not exists(".codegraph/titan/arch-snapshot.json"):
Print: "NOTE: No arch-snapshot.json found. Architectural comparison in /titan-gate (Step 5.5) will be skipped for this run.
To enable it, run '/titan-run --start-from sync' to re-capture the pre-forge snapshot."
Fixed — added a pre-loop check in Step 4a that warns when arch-snapshot.json is missing, advising the user to run --start-from sync to re-capture it.
- Fix undefined previousAuditedCountBeforeAgent variable in gauntlet efficiency check (save pre-update count before reassignment)
- Add AU, UA, DU, UD to merge conflict detection markers
- Add warning when --start-from forge runs without arch-snapshot.json
.claude/skills/titan-forge/SKILL.md
Outdated
| 13. **On failure (test, gate, or diff-review):** | ||
| ```bash | ||
| git checkout -- <changed files> |
Incomplete rollback — staged files not unstaged in Step 13
The new flow stages files in Step 8 (git add) before running tests (Step 10) and gate (Step 11). When a test or gate failure routes to Step 13, only the working-tree restoration is performed:
git checkout -- <changed files>
This restores the file contents on disk but does not remove the files from the staging area. The changes remain indexed in git. If the orchestrator loops back and the next target's git add picks up different files, the leftover staged changes from the failed target will be silently included in the next commit.
The rollback should unstage first, then restore the working tree:
| 13. **On failure (test, gate, or diff-review):** | |
| ```bash | |
| git checkout -- <changed files> | |
| 13. **On failure (test, gate, or diff-review):** | |
| ```bash | |
| git reset HEAD <changed files> | |
| git checkout -- <changed files> | |
| ``` |
Note: the DIFF FAIL path in Step 9 already says "Unstage and revert changes" correctly — this gap only affects Step 13. The same fix is needed in the docs mirror at docs/examples/claude-code-skills/titan-forge/SKILL.md.
Fixed — added unstaging step before working tree restore in Step 13 rollback, so staged files are properly cleared before checkout. Applied to both .claude/skills/ and docs/examples/ copies.
Both commands are called in titan-forge's new diff review step (Step 9): fn-impact for deletion audit (D4) and exports for re-export chain checks.
|
Addressed the remaining open item from the summary: README command table (P2): Restored The |
| | `codegraph exports` | GAUNTLET | Per-symbol export consumers | | ||
| | `codegraph fn-impact` | GAUNTLET, SYNC | Blast radius | | ||
| | `codegraph search` | GAUNTLET | Duplicate code detection (needs embeddings) | |
Command table stale after new diff-review and semantic-assertion additions
Three commands were removed from FORGE/GATE in the table, but the new steps introduced in this PR use them:
- `codegraph fn-impact`: listed as "GAUNTLET, SYNC" only, but it is still called in titan-forge Step 4 (pre-touch understanding) and in the new Step 9 D4 (deletion audit). It is also now used in titan-gate Step 5a. Both FORGE and GATE are missing.
- `codegraph context`: listed as "SYNC" only, but titan-forge Step 4 still calls `codegraph context <target> -T --json` (this code is unchanged and confirmed in the current file). FORGE should be restored.
- `codegraph exports`: listed as "GAUNTLET" only, but the new titan-gate Step 5a explicitly runs `codegraph exports <changed-file> -T --json`. GATE should be added.
The same table appears in .claude/skills/ (mirrored) so the same correction applies there.
Suggested correction:
| | `codegraph exports` | GAUNTLET | Per-symbol export consumers | | |
| | `codegraph fn-impact` | GAUNTLET, SYNC | Blast radius | | |
| | `codegraph search` | GAUNTLET | Duplicate code detection (needs embeddings) | | |
| | `codegraph exports` | GAUNTLET, GATE | Per-symbol export consumers | | |
| | `codegraph fn-impact` | GAUNTLET, SYNC, FORGE, GATE | Blast radius | |
And separately restore FORGE to the codegraph context row:
| `codegraph context` | SYNC, FORGE | Full function context |
Fixed — added GATE to exports and fn-impact rows, added FORGE to context row in the command table. Verified against actual usage in titan-gate Step 5a and titan-forge Steps 4 and 9.
Fixed — added GATE to codegraph communities, codegraph structure, and codegraph deps in the command table. These are all used by GATE Step 5.5 (architectural snapshot comparison) but were missing from the "Used by" column.
| 5. Run /titan-gate — read .claude/skills/titan-gate/SKILL.md and follow it exactly. | ||
| Gate now includes semantic assertions (Step 5) and architectural snapshot comparison (Step 5.5). | ||
| The arch snapshot is at .codegraph/titan/arch-snapshot.json. | ||
| 6. Commit on success, rollback on failure |
--yes description inaccurate — forge no longer accepts this flag
The argument hint says --yes is "passed through to forge", but this PR removes --yes from titan-forge's supported arguments list (the --yes → skip confirmation prompt line was deleted from titan-forge). The flag is now passed to forge's dispatch invocation (line ~835 of this file: yesFlag = "--yes" if autoConfirm else ""), but forge silently ignores it since it no longer handles that argument.
--yes still has meaningful effects within the orchestrator itself (auto-resume on existing state, skip the pre-pipeline and forge-checkpoint confirmations), so it should stay — but the description should be updated to reflect its actual scope.
| 6. Commit on success, rollback on failure | |
| - `--yes` → skip all confirmation prompts in the orchestrator (pre-pipeline, forge checkpoint, and resume prompts) |
Fixed — updated the --yes description to reflect its actual scope: orchestrator-level prompts only (pre-pipeline, forge checkpoint, and resume). Applied to both .claude/skills/ and docs/examples/ copies.
Fixed — updated the Rules section description of --yes to clarify both purposes: (1) passed through to forge so it skips its per-phase confirmation prompt, and (2) within the orchestrator itself, skips pre-pipeline, forge checkpoint, and resume prompts. Updated in both .claude/skills/titan-run/SKILL.md and the example copy.
.claude/skills/titan-gate/SKILL.md
Outdated
| Clean up the temp file (recover path from sidecar): | ||
| ```bash | ||
| BARREL_TMP=$(cat .codegraph/titan/.barrel-tmp 2>/dev/null) | ||
| if [ -n "$BARREL_TMP" ]; then rm -f "$BARREL_TMP"; fi | ||
| rm -f .codegraph/titan/.barrel-tmp | ||
| ``` |
Step 5d cleanup not guaranteed to run on early exit
Step 5.5 explicitly marks its cleanup block with "MUST run even on failure or early exit, before proceeding to Step 9." Step 5d's cleanup has no such instruction — just a trailing block at the end of the step.
When an AI agent produces a FAIL verdict at Step 5d (exports dropped with active callers), it may jump directly to Step 9 for verdict aggregation without executing the cleanup block. The .barrel-tmp sidecar and the mktemp file will persist. On the next gate invocation:
- A new `BARREL_TMP` path is written to `.barrel-tmp`, overwriting the old one
- The orphaned temp file from the previous run is never deleted
This is the exact class of stale-file problem that was explicitly fixed for .arch-tmpdir — but the same guarantee was not applied here. The same applies to docs/examples/claude-code-skills/titan-gate/SKILL.md at the corresponding lines.
Consider adding the same explicit instruction:
| Clean up the temp file (recover path from sidecar): | |
| ```bash | |
| BARREL_TMP=$(cat .codegraph/titan/.barrel-tmp 2>/dev/null) | |
| if [ -n "$BARREL_TMP" ]; then rm -f "$BARREL_TMP"; fi | |
| rm -f .codegraph/titan/.barrel-tmp | |
| ``` | |
| Clean up the temp file (recover path from sidecar). **This MUST run even if Step 5d produced a FAIL verdict — run it before proceeding to Step 9:** | |
| ```bash | |
| BARREL_TMP=$(cat .codegraph/titan/.barrel-tmp 2>/dev/null) | |
| if [ -n "$BARREL_TMP" ]; then rm -f "$BARREL_TMP"; fi | |
| rm -f .codegraph/titan/.barrel-tmp |
Fixed — added the same "MUST run even if Step 5d produced a FAIL verdict" guarantee to the barrel cleanup block, matching the pattern already used in Step 5.5's cleanup. Applied to both .claude/skills/ and docs/examples/ copies.
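The "MUST run even on FAIL" guarantee discussed in this thread is the same guarantee a `finally` block gives in ordinary code. The sketch below is only an analogy under stated assumptions: `withBarrelCleanup` and `runCheck` are hypothetical names, and the real skill is a Markdown instruction file followed by an agent, not a script. It assumes, as in the bash snippet, that the sidecar file holds the path of the temp file.

```javascript
const fs = require('fs');

// Hypothetical wrapper around Step 5d: `runCheck` stands in for the barrel
// re-export check and may return a verdict or throw.
function withBarrelCleanup(sidecarPath, runCheck) {
  try {
    return runCheck();
  } finally {
    // Runs on PASS, FAIL, and thrown errors alike.
    let tmpPath = '';
    try {
      tmpPath = fs.readFileSync(sidecarPath, 'utf8').trim();
    } catch {
      // Sidecar missing: no temp path was recorded, so there is nothing to recover.
    }
    if (tmpPath) fs.rmSync(tmpPath, { force: true });
    fs.rmSync(sidecarPath, { force: true });
  }
}
```

An agent has no `finally`, which is why the instruction has to spell the guarantee out in prose at the cleanup block itself.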
.claude/skills/titan-gate/SKILL.md
Outdated
| - Check against `codegraph check --boundaries -T --json` | ||
| - New dependency from a lower layer to a higher layer → **FAIL**: "New upward dependency: `<source>` → `<target>` violates layer boundary" | ||
| - New dependency on a module flagged in sync.json as "to be removed" or "to be split" → **WARN**: "New dependency on `<module>` which is scheduled for decomposition" |
Step 5c suggests running `codegraph check --boundaries` without `--staged`, producing false positives
The bullet point "Check against `codegraph check --boundaries -T --json`" specifies the command without `--staged`. An agent following this literally runs a full-codebase boundaries check, which returns ALL boundary violations, including pre-existing ones that were there before this commit. This would incorrectly flag long-standing boundary violations as new failures introduced by the current change.
Step 1 already ran codegraph check --staged --cycles --blast-radius 30 --boundaries -T --json, which includes the boundaries check scoped to staged changes. Step 5c should reference those results instead:
| - Check against `codegraph check --boundaries -T --json` | |
| - New dependency from a lower layer to a higher layer → **FAIL**: "New upward dependency: `<source>` → `<target>` violates layer boundary" | |
| - New dependency on a module flagged in sync.json as "to be removed" or "to be split" → **WARN**: "New dependency on `<module>` which is scheduled for decomposition" | |
| From the diff-impact results already collected in Step 1, extract any **new** edges (imports that didn't exist before). | |
| For each new dependency: | |
| - Check against `GLOBAL_ARCH.md` layer rules (if Titan artifacts exist) | |
| - Check the Step 1 `codegraph check --staged --boundaries` results for violations on this edge (already collected — do not re-run) | |
| - New dependency from a lower layer to a higher layer → **FAIL**: "New upward dependency: `<source>` → `<target>` violates layer boundary" | |
| - New dependency on a module flagged in sync.json as "to be removed" or "to be split" → **WARN**: "New dependency on `<module>` which is scheduled for decomposition" |
The same issue exists in docs/examples/claude-code-skills/titan-gate/SKILL.md at the corresponding line.
Fixed — Step 5c now references the Step 1 codegraph check --staged --boundaries results instead of re-running without --staged. Applied to both .claude/skills/ and docs/examples/ copies.
.claude/skills/titan-forge/SKILL.md
Outdated
| 11. **Run /titan-gate:** | ||
| Use the Skill tool to invoke `titan-gate`. | ||
| - If FAIL on **test/lint/build** (gate auto-rolls back staged changes) → go to rollback (step 13) to also revert working tree. | ||
| - If FAIL on **semantic/structural** (gate preserves staged changes per its no-rollback rule) → unstage with `git reset HEAD <files> && git checkout -- <files>`, add to `execution.failedTargets` with reason, log the gate report, and continue to the next target. Do NOT go to step 13 — gate left staged changes intact for potential in-place fixing, and step 13 would silently destroy them. |
Step 11 semantic FAIL path claims to preserve working tree but immediately destroys it
The instruction says "gate preserves staged changes per its no-rollback rule" and then immediately does git reset HEAD <files> && git checkout -- <files>, which both unstages AND reverts the working tree — destroying all changes. The note "Do NOT go to step 13 — gate left staged changes intact for potential in-place fixing, and step 13 would silently destroy them" is misleading because Step 11 has already destroyed them.
The phrase "gate preserved staged changes" is factually correct about what gate did, but forge immediately overrides that by performing its own full rollback. The "do not go to step 13" is only to avoid a double-rollback attempt, not to preserve any working tree state for fixing.
This creates an inconsistency: gate's Step 11 FAIL template tells the user "Staged changes are intact. Fix the issues above, or manually run git reset HEAD to unstage" — but if forge called gate, those changes have already been wiped by forge Step 11 before the user sees the message.
Consider clarifying the intent — if the purpose is to clean up for the next target (not preserve for fixing), the comment should say so:
| 11. **Run /titan-gate:** | |
| Use the Skill tool to invoke `titan-gate`. | |
| - If FAIL on **test/lint/build** (gate auto-rolls back staged changes) → go to rollback (step 13) to also revert working tree. | |
| - If FAIL on **semantic/structural** (gate preserves staged changes per its no-rollback rule) → unstage with `git reset HEAD <files> && git checkout -- <files>`, add to `execution.failedTargets` with reason, log the gate report, and continue to the next target. Do NOT go to step 13 — gate left staged changes intact for potential in-place fixing, and step 13 would silently destroy them. | |
| - If FAIL on **semantic/structural** (gate does not auto-rollback its staging area but forge must clean up for the next target) → unstage with `git reset HEAD <files> && git checkout -- <files>`, add to `execution.failedTargets` with reason, log the gate report, and continue to the next target. Do NOT go to step 13 — that step is for test/gate failures where gate already unstaged; going there again would attempt a duplicate rollback. |
The same issue exists in docs/examples/claude-code-skills/titan-forge/SKILL.md at the corresponding line.
Fixed — Step 11 semantic FAIL now correctly says "gate does not auto-rollback its staging area, but forge must clean up for the next target". The "Do NOT go to step 13" note now explains it avoids duplicate rollback, not that changes are preserved for fixing. Applied to both copies.
| @@ -18,7 +18,6 @@ Your goal: read `sync.json`, find the next incomplete execution phase, make the | |||
| - `--phase N` → jump to specific phase | |||
| - `--target <name>` → run single target only (for retrying failures) | |||
| - `--dry-run` → show what would be done without changing code | |||
--yes removed from argument-hint and list but still functional in Step 0.8
The PR removes --yes from the argument-hint frontmatter and from the arguments section, but Step 0.8 still reads $ARGUMENTS for it:
"Ask for confirmation before starting (unless `$ARGUMENTS` contains `--yes`)."
/titan-run also still passes --yes through to forge (Rules section: "Pass --yes through to forge if the user provided it, so forge skips its per-phase confirmation prompt"). So the flag continues to work when passed by the orchestrator, but is now undocumented when invoking forge directly.
A user calling /titan-forge --yes directly won't see it in the argument list and may think it's invalid. Consider either re-adding it as a documented passthrough, or clarifying in the docs that --yes is accepted but only meaningful when invoked by /titan-run.
The same applies to docs/examples/claude-code-skills/titan-forge/SKILL.md.
Fixed — re-added --yes to the arguments list with description: "skip confirmation prompt (typically passed by /titan-run orchestrator)". Applied to both copies.
| 10. **Run tests** (detect the project's test command from package.json scripts — `npm test`, `yarn test`, `pnpm test`, etc.): | ||
| ```bash | ||
| <detected-test-command> 2>&1 | ||
| ``` | ||
| If tests fail → go to rollback (step 13). |
Tests run twice per target — in Step 10 and again inside gate (Step 4)
Step 10 runs the test suite:
<detected-test-command> 2>&1
Then Step 11 invokes /titan-gate, which also runs tests in its Step 4. For projects with even moderately long test suites, this doubles execution time on every target: a 30-second test run becomes 60 seconds per target, multiplied across all forge targets.
The Step 10 pre-gate test run makes sense as a fast-fail optimization (failing before triggering all gate codegraph checks), but this tradeoff should be documented. Consider noting the rationale inline, or providing a way to skip the Step 10 test if the user prefers the gate to handle all checks (e.g., an environment variable or a --no-pre-gate-test flag).
The same applies to docs/examples/claude-code-skills/titan-forge/SKILL.md.
Documented — added a Note block after Step 10 explaining the pre-gate test is a fast-fail optimization with the tradeoff: catches failures faster at the cost of 2x test time on passing targets. Applied to both copies.
Step 5c referenced codegraph check --boundaries without --staged, causing false positives from pre-existing violations. Now references the already-collected Step 1 results instead.
…#557)
- Step 11 semantic FAIL: clarify forge does its own rollback (not gate)
- Re-add --yes to argument list as documented passthrough
- Add diffWarnings: [] to execution state init block
- Document pre-gate test as fast-fail optimization with tradeoff note
…ptave/codegraph into feat/release-skill-auto-semver
Addressed all latest Greptile feedback:
All changes applied to both .claude/skills/ and docs/examples/ copies.
.claude/skills/titan-gate/SKILL.md
Outdated
| codegraph deps <endpoint-symbol> --json | ||
| ``` | ||
| (Only call this to look up which domain/layer an individual edge endpoint belongs to — not to enumerate all dependencies.) |
codegraph deps wrong command for domain-layer resolution
A2 instructs the agent to call codegraph deps <endpoint-symbol> --json to resolve "which domain/layer an individual edge endpoint belongs to." But codegraph deps is documented as a file-level dependency map — it returns the dependencies of a file, not metadata about a symbol's location in the architecture.
Calling codegraph deps parseConfig --json (a symbol name) would fail or produce wrong results on every project. The correct approach to resolve a symbol's domain/layer is:
- Use `codegraph where <symbol>` to get the file path.
- Infer the layer from the path (e.g., `src/presentation/...` → presentation layer) by matching against the domain map in `GLOBAL_ARCH.md`.
Alternatively, the source file is already available in the diff-impact edge output and doesn't need a separate lookup.
The same issue exists in docs/examples/claude-code-skills/titan-gate/SKILL.md at the corresponding lines.
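The path-based approach suggested above could look roughly like this. Everything project-specific here is an assumption for illustration: the layer names, path prefixes, and ranks are invented (a real run would derive them from the domain map in `GLOBAL_ARCH.md`), and the `{ source, target }` edge shape is a guess at what the diff-impact output provides, not its documented format.

```javascript
// Hypothetical layer map, ordered from lowest rank to highest.
const LAYERS = [
  { prefix: 'src/domain/', name: 'domain', rank: 0 },
  { prefix: 'src/infrastructure/', name: 'infrastructure', rank: 1 },
  { prefix: 'src/presentation/', name: 'presentation', rank: 2 },
];

// Infers the layer from a file path by prefix match; null when the path is unmapped.
function layerOf(filePath) {
  return LAYERS.find((layer) => filePath.startsWith(layer.prefix)) || null;
}

// Flags new edges whose source sits in a lower layer than their target.
function findUpwardDependencies(newEdges) {
  const violations = [];
  for (const { source, target } of newEdges) {
    const from = layerOf(source);
    const to = layerOf(target);
    // Unmapped endpoints are skipped rather than guessed at.
    if (from && to && from.rank < to.rank) {
      violations.push(`New upward dependency: ${source} → ${target} violates layer boundary`);
    }
  }
  return violations;
}
```

Because the file paths are already present on each edge, no additional `codegraph` call is needed per endpoint.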
Fixed — replaced codegraph deps (file-level command) with path-based layer inference using the file paths already available in diff-impact edge output. Applied to both .claude/skills/ and docs/examples/ copies.
| **D1. Scope — only planned files touched:** | ||
| Compare staged file paths against `sync.json → executionOrder[currentPhase].targets` and their known file paths (from gauntlet entries). Flag any file NOT associated with the current target or phase. | ||
| - File in a completely different domain → **DIFF FAIL** | ||
| - File is a direct dependency of the target (consumer or import) → **OK** (expected ripple) |
D2 intent-match check fails silently for dead-code targets
D2 reads "the gauntlet entry's recommendation field" for the current target. However, dead-code targets (deletions of unreferenced symbols) come from titan-state.json → roles.deadSymbols identified during RECON — they have no gauntlet.ndjson entry. When forge processes one of these dead-code targets, D2 would find no matching entry and either skip the intent check or hallucinate the recommendation field, letting any diff pass without verification.
V9 in titan-run already acknowledges this: "OR in titan-state.json → roles.deadSymbols". D2 should have a corresponding guard:
**D2. Intent match:**
- If this is a dead-code target (in `titan-state.json → roles.deadSymbols`), expected recommendation is "remove dead code / delete symbol" — skip gauntlet entry lookup and verify the diff shows only deletions.
- Otherwise, read the gauntlet entry's `recommendation` field...
The same fix is needed in docs/examples/claude-code-skills/titan-forge/SKILL.md.
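One way to make "verify the diff shows only deletions" checkable is to count added lines in the staged unified diff. This is a sketch under assumptions: `countAddedLines` and `checkDeadCodeIntent` are hypothetical helpers, `deadSymbols` is assumed to be a plain array of symbol names (the actual shape of `roles.deadSymbols` is not shown in this thread), and the diff text is assumed to come from something like `git diff --cached`.

```javascript
// Counts added lines inside hunks of a unified diff. File headers such as
// "+++ b/file.js" appear before the first "@@" of each file, so they are not counted.
function countAddedLines(unifiedDiff) {
  let inHunk = false;
  let added = 0;
  for (const line of unifiedDiff.split('\n')) {
    if (line.startsWith('diff --git ')) {
      inHunk = false; // a new file section begins; header lines follow
    } else if (line.startsWith('@@')) {
      inHunk = true; // hunk header; body lines follow
    } else if (inHunk && line.startsWith('+')) {
      added += 1;
    }
  }
  return added;
}

// Hypothetical D2 guard. Returns 'PASS' or 'DIFF FAIL' for dead-code targets,
// and null for other targets so the caller falls through to the gauntlet lookup.
function checkDeadCodeIntent(target, deadSymbols, stagedDiff) {
  if (!deadSymbols.includes(target)) return null;
  return countAddedLines(stagedDiff) === 0 ? 'PASS' : 'DIFF FAIL';
}
```

A modification shows up as a removed line plus an added line, so any edit that is not a pure deletion yields a non-zero count.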
Fixed — D2 now checks if the target is in titan-state.json deadSymbols first. For dead-code targets, it skips gauntlet entry lookup and verifies the diff shows only deletions. Applied to both copies.
| TITAN_HEAD_SHA=$(git rev-parse HEAD) | ||
| node -e " | ||
| const fs = require('fs'); | ||
| const communities = JSON.parse(fs.readFileSync('.codegraph/titan/arch-snapshot-communities.json','utf8')); | ||
| const structure = JSON.parse(fs.readFileSync('.codegraph/titan/arch-snapshot-structure.json','utf8')); | ||
| const drift = JSON.parse(fs.readFileSync('.codegraph/titan/arch-snapshot-drift.json','utf8')); | ||
| const snapshot = { | ||
| timestamp: new Date().toISOString(), | ||
| capturedBefore: 'forge', | ||
| headSha: '$TITAN_HEAD_SHA', | ||
| communities, | ||
| structure, | ||
| drift | ||
| }; | ||
| fs.writeFileSync('.codegraph/titan/arch-snapshot.json', JSON.stringify(snapshot, null, 2)); | ||
| " | ||
| ``` |
node -e script has no error handling — silent failure produces no arch-snapshot.json
If any of the three preceding codegraph commands produces malformed output (non-zero exit, partial JSON, or an error message instead of JSON), the corresponding file will contain invalid JSON. The node -e script calls JSON.parse(fs.readFileSync(...)) without any try/catch, so it throws an unhandled exception and exits non-zero. The orchestrator has no documented check for this failure, so it silently proceeds to Step 3.5b and then dispatches forge — but arch-snapshot.json was never written.
Gate Step 5.5 gracefully skips when arch-snapshot.json is absent and prints no user-visible error, so the architectural comparison layer is silently disabled for the entire forge run with no indication of why.
Add error handling:
TITAN_HEAD_SHA=$(git rev-parse HEAD)
node -e "
try {
const fs = require('fs');
const communities = JSON.parse(fs.readFileSync('.codegraph/titan/arch-snapshot-communities.json','utf8'));
const structure = JSON.parse(fs.readFileSync('.codegraph/titan/arch-snapshot-structure.json','utf8'));
const drift = JSON.parse(fs.readFileSync('.codegraph/titan/arch-snapshot-drift.json','utf8'));
const snapshot = { timestamp: new Date().toISOString(), capturedBefore: 'forge',
headSha: '$TITAN_HEAD_SHA', communities, structure, drift };
fs.writeFileSync('.codegraph/titan/arch-snapshot.json', JSON.stringify(snapshot, null, 2));
console.log('arch-snapshot.json written');
} catch(e) { console.error('ARCH_SNAPSHOT_FAILED: ' + e.message); process.exit(1); }
"
If the script fails, print a warning and continue without the snapshot (rather than silently producing no file). The same applies to docs/examples/claude-code-skills/titan-run/SKILL.md.
Fixed — wrapped the node -e script in try/catch with error logging. On failure, prints a warning and continues without the snapshot instead of silently proceeding. Applied to both copies.
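As a sketch of the same idea outside a `node -e` one-liner, the assembly step could be written as a function that reports failure instead of throwing. The name `writeArchSnapshot` and the `{ ok, error }` return shape are assumptions for illustration. The file names and snapshot fields are taken from the script above.

```javascript
const fs = require('fs');
const path = require('path');

// Sketch: assemble arch-snapshot.json from the three intermediate files.
// Returns { ok: true } on success, or { ok: false, error } so the caller can
// print a warning and continue without the snapshot.
function writeArchSnapshot(dir, headSha) {
  try {
    const read = (name) =>
      JSON.parse(fs.readFileSync(path.join(dir, name), 'utf8'));
    const snapshot = {
      timestamp: new Date().toISOString(),
      capturedBefore: 'forge',
      headSha, // passed as an argument rather than interpolated by the shell
      communities: read('arch-snapshot-communities.json'),
      structure: read('arch-snapshot-structure.json'),
      drift: read('arch-snapshot-drift.json'),
    };
    fs.writeFileSync(
      path.join(dir, 'arch-snapshot.json'),
      JSON.stringify(snapshot, null, 2)
    );
    return { ok: true };
  } catch (e) {
    // A missing or malformed input lands here; nothing is written.
    return { ok: false, error: 'ARCH_SNAPSHOT_FAILED: ' + e.message };
  }
}
```

Because all three inputs are parsed before `writeFileSync` runs, a failure in any of them leaves no partial `arch-snapshot.json` behind.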
| If `.codegraph/titan/titan-state.json` should exist at this point (i.e., we're past recon): | ||
| ```bash | ||
| node -e "try { JSON.parse(require('fs').readFileSync('.codegraph/titan/titan-state.json','utf8')); console.log('OK'); } catch(e) { console.log('CORRUPT: '+e.message); process.exit(1); }" | ||
| ``` | ||
| - If **CORRUPT** → attempt recovery from backup (see State Backup below). If no backup → stop: "State file corrupted with no backup. Run `/titan-reset` and start over." | ||
| ### G4. State backup | ||
| Before every sub-agent dispatch, back up the current state file: | ||
| ```bash | ||
| cp .codegraph/titan/titan-state.json .codegraph/titan/titan-state.json.bak 2>/dev/null || true | ||
| ``` | ||
| If a sub-agent corrupts the state, G3 on the next iteration will detect it and restore from `.bak`. | ||
G3 corruption recovery procedure is not documented
G3 says "attempt recovery from backup (see State Backup below)" but G4 only documents how to create the backup (cp ... .bak), not how to restore from it. An AI agent reading G3 has no documented procedure for the recovery step and would likely hallucinate one (e.g., cp .bak titan-state.json without verifying the backup is valid).
Add an explicit recovery procedure to G3:
### G3. State file integrity
If `.codegraph/titan/titan-state.json` should exist at this point:
```bash
node -e "try { JSON.parse(require('fs').readFileSync('.codegraph/titan/titan-state.json','utf8')); console.log('OK'); } catch(e) { console.log('CORRUPT: '+e.message); process.exit(1); }"
```
- If CORRUPT → attempt recovery:

      # Verify backup is valid before restoring
      node -e "try { JSON.parse(require('fs').readFileSync('.codegraph/titan/titan-state.json.bak','utf8')); console.log('OK'); } catch(e) { console.log('CORRUPT'); process.exit(1); }"
      # If backup is valid:
      cp .codegraph/titan/titan-state.json.bak .codegraph/titan/titan-state.json

  If backup is also corrupt or missing → stop.
Same fix needed in `docs/examples/claude-code-skills/titan-run/SKILL.md`.
Fixed — G3 now has an explicit recovery procedure: verify backup exists, validate it's valid JSON, cp .bak over the corrupt file. If backup is also corrupt or missing, stop. Applied to both copies.
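The recovery order described here (validate the state, validate the backup, then restore) can be sketched as a small function. `isValidJson` and `checkState` are hypothetical names, and the three return values are assumptions chosen for illustration rather than anything the skill defines.

```javascript
const fs = require('fs');

// True when the file exists and parses as JSON.
function isValidJson(file) {
  try {
    JSON.parse(fs.readFileSync(file, 'utf8'));
    return true;
  } catch {
    return false;
  }
}

// Sketch of G3: returns 'OK', 'RESTORED', or 'STOP'.
function checkState(statePath) {
  if (isValidJson(statePath)) return 'OK';
  const backup = statePath + '.bak';
  // Never restore from a backup that is missing or itself corrupt.
  if (!isValidJson(backup)) return 'STOP';
  fs.copyFileSync(backup, statePath);
  return 'RESTORED';
}
```

Validating the backup before copying is the step the review asks for: a blind `cp` would replace one corrupt file with another and hide the problem until the next phase.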
| ```bash | ||
| node -e " | ||
| const fs = require('fs'); | ||
| const lines = fs.readFileSync('.codegraph/titan/gauntlet.ndjson','utf8').trim().split('\n'); | ||
| let valid = 0, corrupt = 0; | ||
| for (const line of lines) { | ||
| try { JSON.parse(line); valid++; } catch { corrupt++; } | ||
| } | ||
| console.log(JSON.stringify({ valid, corrupt, total: lines.length })); | ||
| " |
NDJSON integrity check fails if gauntlet.ndjson does not yet exist
The NDJSON integrity check at step 2c unconditionally calls fs.readFileSync('.codegraph/titan/gauntlet.ndjson', 'utf8'). If the file doesn't exist (e.g., the gauntlet sub-agent crashed before writing any entry), readFileSync throws ENOENT — the node -e script exits non-zero with a Node.js stack trace, which is a confusing diagnostic for what is simply "gauntlet wrote nothing."
The check should guard against a missing file:
const fs = require('fs');
if (!fs.existsSync('.codegraph/titan/gauntlet.ndjson')) {
console.log(JSON.stringify({ valid: 0, corrupt: 0, total: 0, missing: true }));
process.exit(0);
}
const lines = fs.readFileSync('.codegraph/titan/gauntlet.ndjson','utf8').trim().split('\n');
Then the caller should handle missing: true as equivalent to valid == 0 (stop the pipeline). The same fix is needed in docs/examples/claude-code-skills/titan-run/SKILL.md.
Fixed — added fs.existsSync guard before readFileSync. When gauntlet.ndjson is missing, outputs {valid:0, corrupt:0, total:0, missing:true} and exits 0. Caller handles missing:true as equivalent to valid==0. Applied to both copies.
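Put together, the guarded check and the caller-side rule might look like the sketch below. `checkNdjson` and `shouldStop` are hypothetical names. Skipping blank lines is an added assumption that is not in the original script; it makes an empty file report `total: 0` instead of one corrupt line.

```javascript
const fs = require('fs');

// Sketch of the step 2c integrity check with the missing-file guard.
function checkNdjson(file) {
  if (!fs.existsSync(file)) {
    return { valid: 0, corrupt: 0, total: 0, missing: true };
  }
  // Assumption beyond the original script: ignore blank lines.
  const lines = fs
    .readFileSync(file, 'utf8')
    .split('\n')
    .filter((line) => line.trim() !== '');
  let valid = 0;
  let corrupt = 0;
  for (const line of lines) {
    try { JSON.parse(line); valid++; } catch { corrupt++; }
  }
  return { valid, corrupt, total: lines.length };
}

// Caller-side rule from the review: missing is equivalent to valid == 0.
function shouldStop(result) {
  return result.missing === true || result.valid === 0;
}
```

Keeping `missing` as a separate field lets the orchestrator print "gauntlet wrote nothing" rather than a generic corruption message.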
…ce in A2
codegraph deps is a file-level dependency map, not a symbol lookup. The diff-impact edge output already contains source/target file paths, so the domain/layer can be inferred directly from the path using the GLOBAL_ARCH.md domain map without an extra command.
Dead-code targets have no gauntlet.ndjson entry, causing D2 to fail silently. Now checks titan-state.json deadSymbols first and validates the diff shows only deletions without needing a gauntlet lookup.
The node -e script that assembles arch-snapshot.json had no error handling — if any input file was missing or malformed, it failed silently and no snapshot was written. Now catches errors, prints a warning, and continues without the snapshot.
G3 referenced backup recovery without documenting the steps. Now includes: check backup exists, validate it is valid JSON before restoring, cp .bak over the corrupt file, and stop if backup is also corrupt or missing.
The node -e script crashed with ENOENT if gauntlet.ndjson did not exist yet. Now checks fs.existsSync first and outputs a result with missing:true so the caller can handle it gracefully.
| | `codegraph path` | SYNC | Dependency paths between targets | | ||
| | `codegraph cycles` | SYNC, GATE | Circular dependency detection | | ||
| | `codegraph deps` | SYNC | File-level dependency map | | ||
| | `codegraph deps` | SYNC, GATE | File-level dependency map | |
codegraph deps incorrectly lists GATE as a consumer
Gate A2 was updated (as part of a previous fix cycle in this PR) to stop calling codegraph deps entirely. The current titan-gate/SKILL.md Step 5.5 A2 now explicitly says:
"do not re-run codegraph deps on the full file… No additional codegraph command is needed — the diff-impact edge output contains the file paths directly."
The GATE entry was added to this row based on an earlier iteration of the design that called codegraph deps for domain/layer resolution. That approach was subsequently replaced with path-based inference from the diff-impact results already in memory. The table was never updated to reflect the final implementation.
An agent or developer reading this table to understand which commands gate calls would incorrectly expect a codegraph deps invocation in the gate flow.
| | `codegraph deps` | SYNC, GATE | File-level dependency map | | |
| | `codegraph deps` | SYNC | File-level dependency map | |
The same row exists in .claude/skills/ — but since the docs mirror is identical to .claude/skills/ at HEAD, only this file needs the fix.
Fixed — removed GATE from the codegraph deps row. Gate A2 now uses path-based layer inference from diff-impact edge output instead of calling codegraph deps.
Gate A2 no longer calls codegraph deps — it uses path-based layer inference from diff-impact edge output instead. The command table was never updated to reflect this change.
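A rough sketch of path-based inference follows. The format of the GLOBAL_ARCH.md domain map is not shown in this thread, so `DOMAIN_MAP`, its prefixes, and the domain names are all hypothetical, as is the assumed `{ source, target }` shape of a diff-impact edge.

```javascript
// Hypothetical domain map; the real mapping lives in GLOBAL_ARCH.md and its
// format is not shown here.
const DOMAIN_MAP = [
  { prefix: 'src/db/', domain: 'persistence' },
  { prefix: 'src/cli/', domain: 'presentation' },
  { prefix: 'src/', domain: 'core' },
];

// Longest matching prefix wins; unknown paths map to null.
function inferDomain(filePath, map = DOMAIN_MAP) {
  const hit = map
    .filter((entry) => filePath.startsWith(entry.prefix))
    .sort((a, b) => b.prefix.length - a.prefix.length)[0];
  return hit ? hit.domain : null;
}

// Annotate edges (assumed shape: { source, target } file paths) with domains,
// using only data already in memory, with no extra codegraph command.
function annotateEdges(edges, map = DOMAIN_MAP) {
  return edges.map((edge) => ({
    ...edge,
    sourceDomain: inferDomain(edge.source, map),
    targetDomain: inferDomain(edge.target, map),
  }));
}
```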
| --- | ||
| ## Step 3.5 — Pre-forge: Architectural Snapshot + Human Checkpoint | ||
| ### 3.5a. Capture architectural snapshot | ||
| Before any code changes, snapshot the codebase's architectural properties. This becomes the baseline for the architectural comparator in `/titan-gate` (Step 5.5). | ||
| ```bash | ||
| codegraph communities -T --json > .codegraph/titan/arch-snapshot-communities.json | ||
| codegraph structure --depth 2 --json > .codegraph/titan/arch-snapshot-structure.json | ||
| codegraph communities --drift -T --json > .codegraph/titan/arch-snapshot-drift.json |
Checkpoint template unconditionally says "Architectural snapshot captured" even if capture failed
The node -e in Step 3.5a now has proper error handling — on failure it prints a warning and continues without writing arch-snapshot.json. However, the human checkpoint template in Step 3.5b always includes the literal line:
Architectural snapshot captured (for post-change comparison).
If the snapshot failed to write (e.g., malformed JSON output from a codegraph command), the orchestrator following this template will misleadingly tell the user the snapshot was captured when it wasn't. The user proceeds expecting architectural comparisons to run, but they will silently be skipped for the entire forge run.
Consider making this line conditional on whether arch-snapshot.json was actually written:
Architectural snapshot: <captured (for post-change comparison) | NOT captured — architectural comparison will be skipped (see warning above)>
The same issue exists in docs/examples/claude-code-skills/titan-run/SKILL.md at the corresponding lines.
Fixed — checkpoint template now shows conditional status: 'captured' when snapshot succeeded, 'FAILED — gate A1/A3/A4 will be skipped' when the node -e script failed. Applied to both .claude/skills/ and docs/examples/ copies.
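One way to drive the conditional line is to key it off whether the snapshot file exists, as in this sketch. `snapshotStatusLine` is a hypothetical name and the wording follows the template suggested in the review. It assumes no stale `arch-snapshot.json` is left over from an earlier run.

```javascript
const fs = require('fs');

// Sketch: pick the checkpoint status line from the presence of the snapshot.
// Assumes the file is only ever written on a successful capture.
function snapshotStatusLine(snapshotPath) {
  return fs.existsSync(snapshotPath)
    ? 'Architectural snapshot: captured (for post-change comparison)'
    : 'Architectural snapshot: NOT captured, architectural comparison will be skipped (see warning above)';
}
```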
Addressed remaining Greptile feedback:
All changes applied to both |
| If `--yes` is NOT set: **stop and wait for user confirmation.** Do NOT proceed. | ||
| If `--yes` IS set: print the summary but continue automatically. |
Forge sub-agent stalls when --yes not set after user confirms at checkpoint
When autoConfirm=false, the orchestrator correctly pauses at Step 3.5b for human confirmation. However, once the user confirms and the forge loop begins, each forge sub-agent is dispatched without --yes (line 462: yesFlag = "--yes" if autoConfirm else ""). Forge Step 0.8 then asks for per-phase confirmation — but sub-agents invoked via the Agent tool cannot always reliably surface an interactive prompt to the user depending on context. In a non-interactive agent execution, the forge sub-agent would receive no response and halt, triggering stall detection after 2 iterations (maxStalls=2), aborting the pipeline.
The user explicitly approved the forge run at Step 3.5b, so the per-phase confirmation inside forge is redundant and potentially blocking in this context. After the checkpoint confirmation, the orchestrator should treat the session as implicitly confirmed for the forge loop:
# After user confirms at Step 3.5b:
forgeAutoConfirm = true # user already approved at checkpoint
...
yesFlag = "--yes" if (autoConfirm or forgeAutoConfirm) else ""
The same applies to docs/examples/claude-code-skills/titan-run/SKILL.md.
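The flag rule in the pseudocode above reduces to a single boolean expression. A minimal runnable sketch, where `forgeYesFlag` and `checkpointConfirmed` are hypothetical names standing in for the orchestrator's own state:

```javascript
// Sketch: forge sub-agents get --yes when the run was started with --yes
// (autoConfirm) or when the user already approved at the Step 3.5b checkpoint.
function forgeYesFlag({ autoConfirm, checkpointConfirmed }) {
  return autoConfirm || checkpointConfirmed ? '--yes' : '';
}
```

Since the orchestrator only reaches the forge loop after one of the two conditions holds, the empty-string branch should be unreachable in practice. Keeping it makes the function safe to call before the checkpoint.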
Summary
- /titan-run orchestrator skill that dispatches the full Titan pipeline (recon → gauntlet → sync → forge) to sub-agents with fresh context windows, enabling hands-free end-to-end execution
- /titan-forge (Step 9): verifies each diff matches the gauntlet recommendation and sync plan intent before gate runs (scope check, intent match, commit message accuracy, deletion audit, leftover check)
- /titan-gate (Step 5): export signature stability, import resolution integrity, dependency direction assertions, re-export chain validation
- /titan-gate (Step 5.5): captures pre-forge architectural baseline, compares community stability, cross-domain dependency direction, cohesion delta, and drift after each commit
- /titan-run: Pre-Agent Gate (G1-G4), post-phase validation (V1-V15), stall detection, state file backup/recovery, NDJSON integrity checks, mandatory human checkpoint before forge

Test plan
- /titan-run on a test codebase in a worktree: verify full pipeline completes
- --start-from forge skips analysis phases but validates their artifacts exist