fix: reindex stability — sentinel resume, donor safety, worktree subdirs, progress bar #67
Merged
Conversation
When reindexing takes longer than 15s, semantic_search returns stale results with a warning instead of blocking the agent indefinitely. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Buffered done channel (cap 1) to prevent goroutine leak on timeout
- Goroutine calls touchChecked on success for correct TTL behavior
- Nil progress func in goroutine (request ctx may be gone)
- Log errors from background EnsureFresh at Warn level
- sync.WaitGroup for graceful shutdown in Close()

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
7-task plan with TDD approach: struct changes, WaitGroup, timeout goroutine, formatSearchResults, and tests including a test hook (ensureFreshFunc) to exercise the 15s timeout path. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
… reindex EnsureFresh now runs in a goroutine. If it completes within 15s, results are returned normally. If it exceeds the timeout, stale results are returned immediately with a StaleWarning while reindexing continues in the background (up to 10min). The goroutine acquires an exclusive flock to avoid concurrent writes. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Go 1.25+ provides wg.Go() which simplifies goroutine tracking. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…exed Add ensureFreshFunc test hook to indexerCache (follows existing findDonorFunc/seedFunc pattern) and three new tests:
- TestEnsureIndexed_TimeoutReturnsStaleWarning: injects a slow EnsureFresh that exceeds the 15s timeout, verifies StaleWarning is returned and Reindexed=false.
- TestEnsureIndexed_FastEnsureFreshNoWarning: injects an instant EnsureFresh, verifies no warning and correct stats propagation.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…or vec_chunks Handles slow embedding batches and retries on SQLite contention without timing out. INSERT OR REPLACE prevents duplicate key errors when re-embedding chunks that already exist in the vector table. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Three root causes fixed:
1. SessionStart double-spawn and no freshness gate (hook.go):
- Remove unconditional spawnBackgroundIndexer from runHookSessionStart;
generateSessionContextInternal now owns all spawn decisions
- After opening the DB for stats, check last_indexed_at: skip spawn
when indexed within backgroundIndexStaleness (5 min), spawn when
stale or never completed. Prevents every new terminal from triggering
a full merkle walk.
2. Goroutine zero-result treated as "fresh" (stdio.go):
- Add skipped bool to freshResult. When TryAcquire returns nil (TOCTOU
race — another process grabbed the lock) or errors, send
freshResult{skipped: true}. Main select now returns StaleWarning for
skipped results, consistent with the IsHeld fast-path. Previously the
zero result looked like "index is fresh", silently skipping
touchChecked and causing the next search to immediately re-spawn.
3. Redundant merkle walk after lumen index finishes (stdio.go):
- In the goroutine, after acquiring the flock, check idx.LastIndexedAt().
If within freshnessTTL, call touchChecked() and return without calling
EnsureFresh. Uses the DB timestamp as a shared cross-process freshness
signal so the MCP server doesn't duplicate the walk just completed by
the background indexer.
Also fix pre-existing errcheck lint in tui/progress.go.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…nge breakdown
- index.go: add newDebugLogger() for background path; log start, skip, cancel, error, and completion with full Stats fields; pass logger to Indexer via SetLogger() so indexWithTree can log the indexing plan
- index/index.go: add FilesAdded/FilesModified/FilesRemoved/Reason/OldRootHash/NewRootHash to Stats; populate them in Index, EnsureFresh, and indexWithTree; add SetLogger/logger field to Indexer
- hook_spawn_unix.go: discard stderr of background indexer (slog writes to debug.log; piping stderr would mix pterm progress into the log)
- search.go: pass nil logger to setupIndexer (interactive command)
- CLAUDE.md: document interactive vs background output strategy

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
The force_reindex parameter on semantic_search is removed. Reindexing is exclusively triggered by the SessionStart hook and by the background goroutine inside ensureIndexed. Progress notifications are restored and now flow through the background goroutine path so the Claude Code status indicator animates during indexing. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Enrich the "indexing plan" slog entry with:
- old_root_hash: stored merkle root before this run
- new_root_hash: computed merkle root from current filesystem
- main_worktree: main git repo root (only when projectDir is a worktree)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
When SQLite reports "database disk image is malformed" or "disk I/O error", the index is permanently broken until manually purged. Every subsequent semantic_search call would fail with the same error because touchChecked is never set and each retry hits the same corrupted file. This change adds automatic recovery at two layers:
- store.New: if open/schema-setup fails with a corruption error, delete the DB file and its WAL/SHM sidecars and retry once from a clean state. In-memory databases are never deleted.
- Indexer.EnsureFresh / Index: if indexWithTree returns a corruption error mid-operation, log ERROR "corrupted database detected, rebuilding", call rebuildStore() (close → delete files → reopen), then retry with an empty stored hash so the fresh DB receives a full index pass.

Adds IsCorruptionErr(err) to the store package as the single source of truth for what constitutes a SQLite corruption error.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
pterm's cursor positioning assumes bar title fits on one line. Long paths cause wrapping which shifts the cursor, leaving duplicated output on each redraw. Truncate to (terminal_width - 45) chars, reserving space for the bar chrome, appending an ellipsis when truncated. Also benefits terminal resize: pterm.GetTerminalWidth() is called live on every Update(), so the budget adjusts automatically. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…tion Files registered mid-run have hash="" (sentinel). Previously, if the root hash hadn't changed between sessions the indexer returned early, leaving those files permanently unembedded.
- Add HasSentinelFiles() to store: EXISTS query on files WHERE hash=''
- In Index() and EnsureFresh(), check sentinels before the early-return: if any exist, fall through to incremental indexing regardless of hash
- Replace four separate SetMeta calls at end-of-run with a saveMeta() closure; call it on mid-batch embedding failures too so progress is persisted even when Ollama times out partway through a large repo

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
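The gate itself reduces to one predicate. A sketch with an invented name (`needsIndexing`); the store-side check it relies on is the EXISTS query mentioned above:

```go
package main

import "fmt"

// needsIndexing sketches the early-return gate: an unchanged root hash
// alone no longer short-circuits — sentinel files (rows with hash = ''
// left by an interrupted run) force the incremental pass to resume.
// The store side is a cheap existence check, roughly:
//   SELECT EXISTS(SELECT 1 FROM files WHERE hash = '')
func needsIndexing(storedHash, currentHash string, hasSentinels bool) bool {
	if hasSentinels {
		return true // unembedded files remain from an interrupted run
	}
	return storedHash != currentHash
}

func main() {
	fmt.Println(needsIndexing("abc", "abc", true))  // resume interrupted run
	fmt.Println(needsIndexing("abc", "abc", false)) // genuinely fresh: skip
	fmt.Println(needsIndexing("abc", "def", false)) // tree changed: index
}
```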
A donor whose first indexing pass was interrupted has no root_hash in project_meta. Seeding from such a donor propagates partially-indexed state to the new worktree, causing it to believe it is current when it is not. Guard: open the donor read-only, query root_hash before the WAL checkpoint, and bail out (return false, nil) if the value is missing or empty. The new TestSeedFromDonor_IncompleteDonor test covers this path. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
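The guard's contract is small enough to state as a function. `donorUsable` is an illustrative name — the real check lives inside SeedFromDonor and opens the donor DB read-only first:

```go
package main

import "fmt"

// donorUsable sketches the seeding guard: a donor whose project_meta
// has no root_hash was interrupted mid-index, so seeding must bail out
// with (false, nil) and let the worktree index from scratch.
func donorUsable(rootHash string, queryErr error) (bool, error) {
	if queryErr != nil {
		return false, queryErr
	}
	if rootHash == "" {
		return false, nil // incomplete donor: refuse to copy its state
	}
	return true, nil
}

func main() {
	ok, err := donorUsable("", nil)
	fmt.Println(ok, err) // incomplete donor is skipped without error
	ok, err = donorUsable("deadbeef", nil)
	fmt.Println(ok, err)
}
```

Returning (false, nil) rather than an error matters: a missing donor is a normal condition, not a failure, so the caller falls through to a fresh index.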
FindDonorIndexBase compared worktree root paths directly. When the effective project root is a subdirectory (e.g. monorepo/backoffice), the DB path is derived from that subdirectory, not from the worktree root — so no sibling worktrees were ever found. Fix: identify which worktree contains the project, compute the relative suffix (e.g. "backoffice"), then look for a DB at <sibling_worktree>/<relSuffix> in each sibling. This correctly resolves donor indexes regardless of how deep the effective root sits inside the worktree. Symlinks are resolved at every comparison point. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
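The path arithmetic at the heart of the fix can be shown in isolation. `donorCandidate` is an invented name for illustration; the real FindDonorIndexBase also resolves symlinks at every comparison point and checks that a DB actually exists at the candidate path:

```go
package main

import (
	"fmt"
	"path/filepath"
)

// donorCandidate sketches the subdirectory-aware lookup: compute the
// effective project root's suffix inside its own worktree, then graft
// that suffix onto a sibling worktree root.
func donorCandidate(projectRoot, worktreeRoot, siblingRoot string) (string, error) {
	rel, err := filepath.Rel(worktreeRoot, projectRoot)
	if err != nil {
		return "", err
	}
	return filepath.Join(siblingRoot, rel), nil
}

func main() {
	p, _ := donorCandidate("/repo/wt1/backoffice", "/repo/wt1", "/repo/wt2")
	fmt.Println(p) // /repo/wt2/backoffice
}
```

When the project root equals the worktree root, `rel` is "." and `filepath.Join` collapses it, so the old whole-worktree behavior is a special case of the new one.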
auto-merge was automatically disabled
March 27, 2026 15:43
Rebase failed
Summary
- Files registered mid-run carry the sentinel hash="". Previously the next session would see a matching root hash and return early, leaving those files permanently unembedded. Now HasSentinelFiles() is checked before the early-return; any sentinel triggers incremental indexing to complete the run.
- A saveMeta() closure saves root_hash + timestamps on both success and mid-batch embedding failures, so the next session can match the hash and skip already-complete files.
- SeedFromDonor now verifies the donor has a non-empty root_hash before copying. Seeding from an incomplete donor propagated corrupted state; this now returns (false, nil) and the worktree starts fresh.
- FindDonorIndexBase previously compared worktree root paths, missing cases where the effective project root is a subdirectory (e.g. monorepo/backoffice). The fix computes the relative suffix inside the worktree and looks for a DB at <sibling_worktree>/<relSuffix>.
- Progress bar titles are truncated to (terminal_width - 45) chars before UpdateTitle().

Test plan
- go test ./... — all packages pass (verified locally)
- TestSeedFromDonor_IncompleteDonor — new test covering donor safety gate

🤖 Generated with Claude Code