fix(titan): verify embeddings per-worktree instead of trusting stale state flag #1805

Open
carlos-alm wants to merge 2 commits into main from fix/titan-embeddings-per-worktree

Conversation

@carlos-alm

Contributor

Summary

  • .codegraph/ is gitignored, so graph.db (and its embeddings) is local, per-worktree filesystem state — it never travels via git merge or a worktree switch.
  • The v3.15.0 Titan run (see titan-report-v3.15.0) hit embeddingsAvailable: true in titan-state.json carried over from a different worktree's DB, causing GAUNTLET's Rule 11 (DRY) semantic search to silently return empty results for the entire run instead of erroring.
  • titan-recon now smoke-tests codegraph search against the current worktree before setting embeddingsAvailable, and titan-gauntlet re-verifies (and regenerates if stale) before relying on it for Rule 11 checks.
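
The smoke test described above could be sketched roughly as follows. This is a minimal illustration, not the skill's actual wording: it assumes a failing search exits non-zero and that `ENGINE_UNAVAILABLE` appears as a literal string in the command's output, neither of which this PR confirms. The helper names and the timeout are hypothetical.

```python
import subprocess

# Assumption: the CLI surfaces this marker as plain text in stdout or stderr.
ENGINE_UNAVAILABLE = "ENGINE_UNAVAILABLE"


def smoke_test_passed(returncode: int, stdout: str, stderr: str) -> bool:
    """Return True unless the search errored or reported ENGINE_UNAVAILABLE.

    An empty result list still counts as a pass: the criterion is the
    absence of an error, not the presence of hits.
    """
    if returncode != 0:
        return False
    return ENGINE_UNAVAILABLE not in stdout and ENGINE_UNAVAILABLE not in stderr


def run_smoke_test(worktree: str) -> bool:
    """Run the smoke query inside `worktree` and classify the outcome."""
    try:
        proc = subprocess.run(
            ["codegraph", "search", "test query", "--json"],
            cwd=worktree,
            capture_output=True,
            text=True,
            timeout=120,  # hypothetical limit, not taken from the PR
        )
    except (OSError, subprocess.TimeoutExpired):
        # Missing binary, missing directory, or a hung search all count as failure.
        return False
    return smoke_test_passed(proc.returncode, proc.stdout, proc.stderr)
```

Keeping the classification in a pure function means the pass/fail rule can be checked without the CLI installed.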

Test plan

  • Reviewed both skill files render correctly as markdown
  • Next Titan run should confirm embeddingsAvailable reflects the operating worktree's actual DB state

…state flag

.codegraph/ is gitignored, so graph.db and its embeddings are local,
per-worktree filesystem state that never travels via git merge or a
worktree switch. A v3.15.0 Titan run hit embeddingsAvailable: true in
titan-state.json carried over from a different worktree's DB, causing
GAUNTLET's Rule 11 (DRY) semantic search to silently return empty
results for the entire run instead of erroring.

titan-recon now smoke-tests codegraph search against the current
worktree before setting embeddingsAvailable, and titan-gauntlet
re-verifies it (and regenerates if stale) before relying on it for
Rule 11 checks.
@greptile-apps

greptile-apps Bot commented Jul 5, 2026

Contributor

Greptile Summary

This PR fixes a silent-failure bug where embeddingsAvailable: true carried over from a different worktree's titan-state.json caused GAUNTLET's Rule 11 semantic search to silently return empty results, producing a false clean pass. Both skill files are updated to verify embeddings against the actual operating worktree rather than trusting the state flag blindly.

  • titan-recon/SKILL.md: Step 2 now smoke-tests codegraph search \"test query\" --json before setting embeddingsAvailable, and persists embeddingsWorktreePath (output of git rev-parse --show-toplevel) alongside the flag in titan-state.json, giving downstream phases a deterministic identity anchor.
  • titan-gauntlet/SKILL.md: Rule 11's note replaces the previous subjective "empty where a hit would be expected" heuristic with a two-step guard — a path comparison between the current worktree and embeddingsWorktreePath, followed by a one-time smoke-test using the same concrete error/ENGINE_UNAVAILABLE criterion as RECON — before triggering regeneration or falling back to grep-only DRY checks.
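
To make the RECON side concrete, here is a hedged sketch of recording the flag together with the worktree path. The two field names come from this PR; the function names are hypothetical, and dropping the stored path when the smoke test fails is an assumption of this sketch rather than something the skill file is known to require.

```python
import subprocess


def current_worktree_root(cwd: str = ".") -> str:
    """Return the worktree root as printed by `git rev-parse --show-toplevel`."""
    out = subprocess.run(
        ["git", "rev-parse", "--show-toplevel"],
        cwd=cwd,
        capture_output=True,
        text=True,
        check=True,
    )
    return out.stdout.strip()


def record_embeddings_state(state: dict, smoke_ok: bool, worktree_root: str) -> dict:
    """Return a copy of `state` with the embeddings fields updated."""
    updated = dict(state)
    updated["embeddingsAvailable"] = smoke_ok
    if smoke_ok:
        # Identity anchor for downstream phases.
        updated["embeddingsWorktreePath"] = worktree_root
    else:
        # Assumption: a failed check should not leave a stale path behind.
        updated.pop("embeddingsWorktreePath", None)
    return updated
```

A caller would load `titan-state.json`, pass the parsed dict through `record_embeddings_state`, and write the result back.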

Confidence Score: 5/5

Both changed files are Markdown skill instructions for AI agents; no executable code is modified, and the changes only add stricter verification guards.

The path-comparison + smoke-test logic is a clear improvement over the previous trust-the-flag approach and addresses the exact incident described in the PR. The one remaining note about RECON's success-condition phrasing being harder to parse than GAUNTLET's is a wording nit whose logical outcome is identical in practice.

No files require special attention. Both changes are Markdown instruction files for AI agents.

Important Files Changed

| Filename | Overview |
| --- | --- |
| .claude/skills/titan-gauntlet/SKILL.md | Replaces the ambiguous 'empty where a hit would be expected' heuristic with a deterministic two-step guard: path comparison between git rev-parse --show-toplevel and embeddingsWorktreePath, followed by a one-time smoke-test using the same concrete error/ENGINE_UNAVAILABLE criterion as RECON. Regeneration + fallback to grep-only DRY checks on failure is also specified. |
| .claude/skills/titan-recon/SKILL.md | Adds a smoke-test before setting embeddingsAvailable and persists embeddingsWorktreePath (via git rev-parse --show-toplevel) in titan-state.json. Schema example updated. The success condition phrasing is slightly inconsistent with GAUNTLET's cleaner negation form, which risks an agent misreading 'returns results' as requiring non-empty output. |

Sequence Diagram

%%{init: {'theme': 'neutral'}}%%
sequenceDiagram
    participant RECON as titan-recon
    participant STATE as titan-state.json
    participant GAUNTLET as titan-gauntlet
    participant CG as codegraph

    RECON->>CG: codegraph embed -m minilm
    RECON->>CG: codegraph search "test query" --json
    CG-->>RECON: result (or error/ENGINE_UNAVAILABLE)
    alt No error/ENGINE_UNAVAILABLE
        RECON->>STATE: embeddingsAvailable: true + embeddingsWorktreePath
    else Error or ENGINE_UNAVAILABLE
        RECON->>STATE: embeddingsAvailable: false
    end

    Note over GAUNTLET: Rule 11 — DRY check
    GAUNTLET->>STATE: read embeddingsAvailable + embeddingsWorktreePath
    GAUNTLET->>GAUNTLET: git rev-parse --show-toplevel
    alt Paths match AND embeddingsAvailable: true
        GAUNTLET->>CG: smoke-test: codegraph search "test query" --json
        alt No error/ENGINE_UNAVAILABLE
            GAUNTLET->>CG: codegraph search function-purpose --json
        else Error/ENGINE_UNAVAILABLE
            GAUNTLET->>CG: codegraph embed -m minilm (regenerate)
            GAUNTLET->>STATE: update embeddingsAvailable + embeddingsWorktreePath
        end
    else Path mismatch or missing embeddingsWorktreePath
        GAUNTLET->>CG: codegraph embed -m minilm (regenerate)
        GAUNTLET->>STATE: update embeddingsAvailable + embeddingsWorktreePath
    end
    alt Regeneration fails
        GAUNTLET->>GAUNTLET: grep-only DRY checks + log to issues.ndjson
    end
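
The GAUNTLET half of the diagram can be read as a small decision function. The sketch below is one interpretation under stated assumptions: `smoke_test` and `regenerate` are hypothetical callables standing in for the `codegraph search` and `codegraph embed -m minilm` steps, and the branch for "paths match but the flag is false" follows the Rule 11 note (skip semantic search) because the diagram does not cover that case.

```python
from typing import Callable, Tuple


def resolve_rule11_mode(
    state: dict,
    current_root: str,
    smoke_test: Callable[[], bool],
    regenerate: Callable[[], bool],
) -> Tuple[str, dict]:
    """Return ("semantic" or "grep-only", updated copy of the state)."""
    updated = dict(state)
    # A missing embeddingsWorktreePath compares unequal and is treated as a mismatch.
    same_worktree = state.get("embeddingsWorktreePath") == current_root

    if same_worktree and state.get("embeddingsAvailable") is not True:
        # Assumption: the flag was honestly recorded as unavailable for this worktree.
        return "grep-only", updated
    if same_worktree and smoke_test():
        return "semantic", updated

    # Path mismatch, missing field, or failed smoke test: regenerate embeddings.
    ok = regenerate()
    updated["embeddingsAvailable"] = ok
    if ok:
        updated["embeddingsWorktreePath"] = current_root
    # On "grep-only" the caller is expected to log the fallback to issues.ndjson.
    return ("semantic" if ok else "grep-only"), updated
```

Injecting the two callables keeps the branching testable without a real graph.db.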

Reviews (3): Last reviewed commit: "fix: record embeddingsWorktreePath for d..."

Find semantically similar functions. If `codegraph search` fails (no embeddings), use grep for function signature patterns. **Warn:** similar patterns. **Fail:** near-verbatim copy.

> Note: requires embeddings from `/titan-recon`. If `titan-state.json → embeddingsAvailable` is false, skip semantic search and note it.
> **Don't trust `embeddingsAvailable` blindly — verify it against the current worktree.** `.codegraph/` is gitignored, so `graph.db` and its embeddings are local, per-worktree filesystem state; they are never carried over by a branch merge or a "switch to that worktree's state" step in Step 0. A `titan-state.json` merged in from a different worktree/session can say `embeddingsAvailable: true` while the graph.db actually open right now has none — `codegraph search` will then run without erroring and silently return empty results, so Rule 11 looks clean when it never actually checked anything.
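
For the grep-only path, the Warn/Fail split in Rule 11 could be approximated once two candidate function bodies have been located. The sketch below uses character-level similarity from Python's `difflib` as a stand-in for semantic search; both thresholds and the function names are hypothetical and are not part of the skill file.

```python
import difflib
import re


def normalize(body: str) -> str:
    """Collapse runs of whitespace so formatting differences do not affect the score."""
    return re.sub(r"\s+", " ", body).strip()


def dry_verdict(body_a: str, body_b: str, warn_at: float = 0.75, fail_at: float = 0.95) -> str:
    """Classify two function bodies as "pass", "warn", or "fail" by text similarity."""
    ratio = difflib.SequenceMatcher(None, normalize(body_a), normalize(body_b)).ratio()
    if ratio >= fail_at:
        return "fail"  # near-verbatim copy
    if ratio >= warn_at:
        return "warn"  # similar pattern
    return "pass"
```

Textual similarity misses renamed or restructured duplicates, which is why the rule prefers semantic search when embeddings are really available.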

Contributor

P1 Ambiguous empty-result trigger may miss the exact failure scenario this PR targets

The GAUNTLET smoke-test condition "if it errors or returns empty where a hit would be expected" leaves the executing agent to judge subjectively whether a hit from codegraph search "test query" --json is "expected" — and for a generic query like "test query", the agent will almost always decide no hit is expected. This means GAUNTLET will not trigger regeneration on a silent-empty response, which is precisely what the incident report says happened: codegraph search ran without erroring and returned empty results without any explicit ENGINE_UNAVAILABLE signal.

RECON's condition is more concrete ("not an error/ENGINE_UNAVAILABLE"), but the two conditions aren't symmetrically guarding against the same failure mode. The most deterministic fix would be to have RECON record the worktree path alongside embeddingsAvailable (e.g., a new embeddingsWorktreePath field in titan-state.json) and have GAUNTLET compare its current working directory against that value — a mismatch would be an unambiguous trigger for regeneration regardless of what the smoke test returns.
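
The suggested identity check amounts to comparing two paths. A minimal sketch follows, assuming it is acceptable to normalize both sides first (resolving symlinks and trailing slashes); the suggestion itself only says "compare", so the normalization and the function name are assumptions.

```python
import os
from typing import Optional


def flag_is_trustworthy(recorded_path: Optional[str], current_root: str) -> bool:
    """Return True only if the recorded worktree path matches the current one."""
    if not recorded_path:
        # A missing or empty field means the flag cannot be tied to any worktree.
        return False

    def norm(path: str) -> str:
        # realpath resolves symlinks and trailing slashes; normcase handles
        # case-insensitive path conventions on Windows.
        return os.path.normcase(os.path.realpath(path))

    return norm(recorded_path) == norm(current_root)
```

A mismatch here is an unambiguous trigger for regeneration, regardless of what a smoke query returns.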

Contributor Author

Fixed — RECON now records embeddingsWorktreePath (git rev-parse --show-toplevel) in titan-state.json alongside embeddingsAvailable. GAUNTLET's Rule 11 note now does a deterministic path comparison against that field first (mismatch or missing field = don't trust the flag), and the smoke-test fallback now uses the same explicit error/ENGINE_UNAVAILABLE criterion RECON uses instead of the subjective "empty where a hit would be expected" judgment call.

```bash
codegraph search "test query" --json
```

Contributor

P1 Missing worktree identity record makes downstream verification probabilistic

The skill correctly smoke-tests the engine before setting embeddingsAvailable: true, but doesn't persist which worktree those embeddings belong to. Without an embeddingsWorktreePath field (e.g., the output of git rev-parse --show-toplevel) stored alongside the flag in titan-state.json, GAUNTLET has no deterministic way to know whether the flag it reads applies to its own working directory. It falls back to the ambiguous "where a hit would be expected" heuristic, which cannot reliably distinguish "engine healthy but zero semantic matches" from "engine silently returning nothing due to missing embeddings" — the exact failure mode this PR is addressing.

Contributor Author

Fixed — RECON now records embeddingsWorktreePath (output of git rev-parse --show-toplevel) alongside embeddingsAvailable in titan-state.json, including in the schema example. GAUNTLET compares its own worktree path against this field for a deterministic identity check before falling back to the smoke test.

…cks (#1805)

RECON now stores the worktree path (git rev-parse --show-toplevel) alongside
embeddingsAvailable in titan-state.json. GAUNTLET compares this against its
own worktree path before trusting the flag, replacing the ambiguous "empty
result where a hit would be expected" heuristic with a deterministic check,
and tightens the smoke-test fallback to the same explicit ENGINE_UNAVAILABLE
criterion RECON already uses.
@carlos-alm

Contributor Author

Addressed both Greptile findings: RECON now records embeddingsWorktreePath (git rev-parse --show-toplevel) alongside embeddingsAvailable in titan-state.json (including the schema example). GAUNTLET's Rule 11 note now does a deterministic worktree-path comparison against that field before falling back to the smoke test, and the smoke-test criterion is tightened to match RECON's explicit error/ENGINE_UNAVAILABLE check instead of the previous subjective "empty where a hit would be expected" wording.

@carlos-alm

Contributor Author

@greptileai
