v0.7.0: Warm sync with file watcher, placement learning, fuzzy links #8

Merged
devwhodevs merged 22 commits into main from feature/v0.7-warm-sync
Mar 25, 2026

@devwhodevs
Owner

Summary

  • File watcher inside engraph serve — real-time detection of vault changes via notify-debouncer-full, automatic re-indexing with 2s debounce
  • Placement correction learning — detects when a user moves a note out of its suggested_folder into a different folder, and updates centroids incrementally
  • Fuzzy link matching — sliding-window Levenshtein matching (0.92 threshold) + first-name matching for People notes (suggestion-only)
  • created_by filtering — tracks note origin; filterable via the list MCP tool
  • Centroid math fix — replaced EMA (0.9/0.1) with true online mean
  • Indexer refactoring — extracted index_file, remove_file, rename_file for per-file operations

Architecture

┌──────────────────┐     tokio::mpsc (64)     ┌──────────────────┐
│  std::thread     │  ──── Vec<WatchEvent> ──► │  tokio::spawn    │
│  (producer)      │      blocking_send()      │  (consumer)      │
│                  │                           │                  │
│  notify debouncer│                           │  Pass 1: mutate  │
│  2s timeout      │                           │  Pass 2: edges   │
│  .md filter      │                           │  Centroids       │
└──────────────────┘                           └──────────────────┘
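
The producer/consumer split in the diagram can be sketched roughly as follows. This is a minimal illustration, not the PR's code: it uses the standard library's bounded `sync_channel` as a stand-in for `tokio::sync::mpsc` so that it has no dependencies, and the `WatchEvent` variants, `run_pipeline`, `is_markdown`, and `consume` are hypothetical names.

```rust
use std::sync::mpsc::{sync_channel, Receiver};
use std::thread;

/// Hypothetical event type; the real `WatchEvent` may carry other variants.
#[derive(Debug, Clone, PartialEq)]
pub enum WatchEvent {
    Changed(String),
    Removed(String),
}

/// Runs a producer thread that sends filtered batches over a bounded channel
/// and drains them on the calling thread (the "consumer" side).
pub fn run_pipeline(batches: Vec<Vec<WatchEvent>>) -> Vec<WatchEvent> {
    // Capacity 64 mirrors the diagram; `send` blocks while the buffer is full,
    // which is the role `blocking_send()` plays for a tokio channel.
    let (tx, rx) = sync_channel::<Vec<WatchEvent>>(64);

    let producer = thread::spawn(move || {
        for batch in batches {
            // Keep only Markdown paths, like the `.md` filter in the diagram.
            let filtered: Vec<WatchEvent> = batch.into_iter().filter(is_markdown).collect();
            if !filtered.is_empty() && tx.send(filtered).is_err() {
                break; // the consumer hung up
            }
        }
        // `tx` is dropped here, which closes the channel.
    });

    let processed = consume(rx);
    producer.join().expect("producer thread panicked");
    processed
}

fn is_markdown(event: &WatchEvent) -> bool {
    let path = match event {
        WatchEvent::Changed(p) | WatchEvent::Removed(p) => p,
    };
    path.ends_with(".md")
}

/// Stand-in consumer: the real one would run the mutation and edge passes.
fn consume(rx: Receiver<Vec<WatchEvent>>) -> Vec<WatchEvent> {
    let mut seen = Vec::new();
    for batch in rx {
        seen.extend(batch);
    }
    seen
}
```

A bounded channel gives backpressure: if the consumer falls behind, the watcher thread blocks instead of queueing batches without limit.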

Test plan

  • 225 unit tests passing (was 190)
  • Clippy clean, fmt clean
  • Manual smoke test: engraph serve with live vault editing
  • Verify placement correction on file move with suggested_folder frontmatter
  • Verify fuzzy link matching in create MCP tool response
  • Build from source on macOS arm64

🤖 Generated with Claude Code

devwhodevs and others added 22 commits March 25, 2026 14:05
Replaces 0.9/0.1 exponential moving average with true online mean
math. Adds adjust_folder_centroid() for incremental add/remove.
Fixes file_count to represent actual file count, not chunk count.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
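
For illustration, an online mean with incremental add and remove might look like the sketch below. The `Centroid` type and its methods are hypothetical and are not the signature of `adjust_folder_centroid()`; whether the real centroid averages per-file or per-chunk vectors is not stated in the commit. Unlike a 0.9/0.1 exponential moving average, which always gives the newest vector a fixed 10% weight, the online mean equals the arithmetic mean of all vectors currently counted.

```rust
/// Running arithmetic mean of embedding vectors for one folder (hypothetical type).
#[derive(Debug, Clone, PartialEq)]
pub struct Centroid {
    pub mean: Vec<f32>,
    pub file_count: usize,
}

impl Centroid {
    pub fn new(dim: usize) -> Self {
        Centroid { mean: vec![0.0; dim], file_count: 0 }
    }

    /// Adds one vector: mean += (v - mean) / n, where n is the new count.
    pub fn add(&mut self, v: &[f32]) {
        assert_eq!(v.len(), self.mean.len(), "dimension mismatch");
        self.file_count += 1;
        let n = self.file_count as f32;
        for (m, x) in self.mean.iter_mut().zip(v) {
            *m += (x - *m) / n;
        }
    }

    /// Removes one previously added vector: mean = (mean * n - v) / (n - 1).
    /// Removing the last vector resets the centroid to zero.
    pub fn remove(&mut self, v: &[f32]) {
        assert_eq!(v.len(), self.mean.len(), "dimension mismatch");
        match self.file_count {
            0 => {} // nothing to remove
            1 => {
                self.mean.iter_mut().for_each(|m| *m = 0.0);
                self.file_count = 0;
            }
            n => {
                let n_f = n as f32;
                for (m, x) in self.mean.iter_mut().zip(v) {
                    *m = (*m * n_f - x) / (n_f - 1.0);
                }
                self.file_count = n - 1;
            }
        }
    }
}
```

Because removal is the exact inverse of addition (up to floating-point rounding), a file move can be handled as one `remove` on the old folder and one `add` on the new one.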
Adds created_by TEXT column to files table for tracking note origin.
Adds update_file_path() that preserves file_id for edge integrity.
Updates insert_file() signature with optional created_by parameter.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Adds store method to retrieve chunk vectors by file_id from BLOB column.
Creates placement_corrections table for tracking user folder corrections.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Allows filtering notes by their creator agent via the list MCP tool.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Self-contained per-file indexing: chunk, embed, store in a single
transaction. run_index now delegates to index_file per file.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
remove_file handles vec/FTS cleanup before cascade delete.
rename_file uses UPDATE to preserve file_id and edge integrity.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Splits run_index into run_index (creates own resources) and
run_index_shared (accepts shared references). Needed for watcher's
FullRescan path.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
OS thread watches vault recursively with 2s debounce. Filters to .md
files, applies exclude patterns, converts to WatchEvent types.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Two-pass batch processing: mutations then edge rebuild. Content hash
move detection for cross-platform reliability.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
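
One way content-hash move detection could work is to pair a removed path with an added path that has the same hash, which avoids relying on platform-specific rename events. The sketch below is an assumption about the approach, not the PR's implementation: `Mutation`, `content_hash`, and `pair_moves` are hypothetical names, and `DefaultHasher` is a stand-in for whatever hash the index actually stores.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::HashMap;
use std::hash::{Hash, Hasher};

/// Hypothetical result of reconciling one batch of removals and additions.
#[derive(Debug, Clone, PartialEq)]
pub enum Mutation {
    Moved { from: String, to: String },
    Added(String),
    Removed(String),
}

/// Stand-in content hash; the real index may use a different function.
pub fn content_hash(bytes: &[u8]) -> u64 {
    let mut hasher = DefaultHasher::new();
    bytes.hash(&mut hasher);
    hasher.finish()
}

/// Pairs removed and added paths that share a content hash into moves.
/// `removed` carries the hash stored in the index; `added` the hash of the new file.
pub fn pair_moves(removed: Vec<(String, u64)>, added: Vec<(String, u64)>) -> Vec<Mutation> {
    let mut by_hash: HashMap<u64, Vec<String>> = HashMap::new();
    for (path, hash) in removed {
        by_hash.entry(hash).or_default().push(path);
    }

    let mut out = Vec::new();
    for (path, hash) in added {
        match by_hash.get_mut(&hash).and_then(|paths| paths.pop()) {
            Some(from) => out.push(Mutation::Moved { from, to: path }),
            None => out.push(Mutation::Added(path)),
        }
    }

    // Anything left unmatched really was removed; sort for a stable order.
    let mut leftovers: Vec<String> = by_hash.into_values().flatten().collect();
    leftovers.sort();
    out.extend(leftovers.into_iter().map(Mutation::Removed));
    out
}
```

Treating a matched pair as a move rather than a delete plus an insert is what lets a rename keep the same file_id and its edges.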
Watcher starts on serve, reconciles index on startup, then watches
for real-time changes. Shuts down cleanly with the server.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Detects when a user moves a note from suggested_folder to a different
folder. Updates centroids incrementally, logs correction, strips
frontmatter.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
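
The detection step can be pictured as a comparison between the note's suggested_folder value and the folder it ends up in after a move. The following is only a sketch under that assumption: `Correction` and `detect_correction` are hypothetical names, paths are taken as vault-relative, and the centroid update, logging, and frontmatter stripping are left out.

```rust
use std::path::Path;

/// Hypothetical record of a user overriding a suggested placement.
#[derive(Debug, Clone, PartialEq)]
pub struct Correction {
    pub suggested: String,
    pub actual: String,
}

/// Returns a correction when the note's new parent folder differs from the
/// `suggested_folder` value read from its frontmatter (if any).
pub fn detect_correction(suggested_folder: Option<&str>, new_path: &str) -> Option<Correction> {
    // Notes without a suggestion cannot be corrected.
    let suggested = suggested_folder?.trim_matches('/');
    // The parent of a vault-relative path is the folder the note now lives in.
    let actual = Path::new(new_path).parent()?.to_str()?.trim_matches('/');
    if suggested.is_empty() || suggested == actual {
        return None;
    }
    Some(Correction {
        suggested: suggested.to_string(),
        actual: actual.to_string(),
    })
}
```

A `Some(Correction)` result is the signal on which the suggested folder's centroid would lose the note's vectors and the actual folder's centroid would gain them.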
Introduces confidence-scored fuzzy matching with type priority
for overlap resolution.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Byte-offset-aware fuzzy matching via normalized Levenshtein with
0.92 threshold. Respects existing exact matches, sorts by type
priority for overlap resolution.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
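
A rough sketch of sliding-window matching with normalized Levenshtein similarity follows. It is an illustration under assumptions, not the PR's matcher: the function names are hypothetical, windows are whole whitespace-separated words, comparison is case-insensitive, and the handling of existing exact matches and type priority is omitted. Similarity is taken to be 1 − distance / max(len_a, len_b).

```rust
/// Two-row Levenshtein distance over Unicode scalar values.
pub fn levenshtein(a: &str, b: &str) -> usize {
    let a: Vec<char> = a.chars().collect();
    let b: Vec<char> = b.chars().collect();
    let mut prev: Vec<usize> = (0..=b.len()).collect();
    let mut cur = vec![0; b.len() + 1];
    for i in 1..=a.len() {
        cur[0] = i;
        for j in 1..=b.len() {
            let cost = if a[i - 1] == b[j - 1] { 0 } else { 1 };
            cur[j] = (prev[j] + 1).min(cur[j - 1] + 1).min(prev[j - 1] + cost);
        }
        std::mem::swap(&mut prev, &mut cur);
    }
    prev[b.len()]
}

/// Normalized similarity in [0, 1]; 1.0 means identical.
pub fn similarity(a: &str, b: &str) -> f64 {
    let max_len = a.chars().count().max(b.chars().count());
    if max_len == 0 {
        return 1.0;
    }
    1.0 - levenshtein(a, b) as f64 / max_len as f64
}

/// Byte spans of whitespace-separated words in `text`.
fn word_spans(text: &str) -> Vec<(usize, usize)> {
    let mut spans = Vec::new();
    let mut start = None;
    for (i, c) in text.char_indices() {
        if c.is_whitespace() {
            if let Some(s) = start.take() {
                spans.push((s, i));
            }
        } else if start.is_none() {
            start = Some(i);
        }
    }
    if let Some(s) = start {
        spans.push((s, text.len()));
    }
    spans
}

/// Slides a window with as many words as `name` over `text` and returns the
/// best-scoring window at or above `threshold` as (byte_start, byte_end, score).
pub fn best_fuzzy_match(text: &str, name: &str, threshold: f64) -> Option<(usize, usize, f64)> {
    let width = name.split_whitespace().count();
    if width == 0 {
        return None;
    }
    let target = name.to_lowercase();
    let mut best: Option<(usize, usize, f64)> = None;
    for window in word_spans(text).windows(width) {
        let (start, end) = (window[0].0, window[width - 1].1);
        let score = similarity(&text[start..end].to_lowercase(), &target);
        if score >= threshold && best.map_or(true, |(_, _, s)| score > s) {
            best = Some((start, end, score));
        }
    }
    best
}
```

Returning byte offsets into the original text lets a caller splice a link over the exact span that matched, even when the text contains multi-byte characters.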
Unique first-name matches emitted at 650bp confidence, never
auto-applied. Fuzzy matches above 920bp auto-applied. FirstName
matches returned as suggestions in MCP tool response.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
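
The confidence rules in this commit could be expressed as a small decision function. This is a sketch with hypothetical names (`MatchKind`, `Action`, `decide`, `to_confidence`). Since 920bp lines up with the 0.92 threshold, the sketch assumes a 0 to 1000 scale; it also assumes the threshold is inclusive and that weaker fuzzy candidates are dropped, neither of which the commit states.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum MatchKind {
    Fuzzy,
    FirstName,
}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Action {
    AutoApply,
    Suggest,
    Ignore,
}

/// Auto-apply cutoff for fuzzy matches, on the assumed 0..=1000 scale.
pub const FUZZY_AUTO_APPLY: u16 = 920;
/// Fixed confidence the commit assigns to unique first-name matches.
pub const FIRST_NAME_CONFIDENCE: u16 = 650;

/// Converts a similarity in [0, 1] to the assumed 0..=1000 integer scale.
pub fn to_confidence(score: f64) -> u16 {
    (score.clamp(0.0, 1.0) * 1000.0).round() as u16
}

/// Decides what to do with a candidate link.
pub fn decide(kind: MatchKind, confidence: u16) -> Action {
    match kind {
        // First-name matches are only ever surfaced as suggestions.
        MatchKind::FirstName => Action::Suggest,
        // Assumption: the cutoff is inclusive and weaker candidates are dropped.
        MatchKind::Fuzzy if confidence >= FUZZY_AUTO_APPLY => Action::AutoApply,
        MatchKind::Fuzzy => Action::Ignore,
    }
}
```

Keeping confidence as an integer avoids float comparisons at the cutoff and makes scores easy to sort alongside type priority.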
Fixes content hash divergence between diff_vault (raw bytes) and
run_index_inner (post-read_to_string bytes) that could cause
BOM-prefixed files to appear perpetually changed.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
index_file skips its own transaction when already inside one.
run_index_inner wraps the file loop in a single transaction for
bulk indexing performance. Watcher still uses per-file transactions.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Centroids now updated when files are added or removed via external
edits, preventing drift. FullRescan lock scope documented as known
limitation.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Moves handler now releases the store lock before writing frontmatter
to disk, reducing MCP latency. remove_file wrapped in transaction
for crash safety.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@devwhodevs devwhodevs merged commit 14bb4f7 into main Mar 25, 2026
3 checks passed
@devwhodevs devwhodevs deleted the feature/v0.7-warm-sync branch March 25, 2026 14:09