release: v0.2.5818 — MCP stability + issue-44 + security + correctness#497
…roken stdout

Three root causes fixed:
- No SIGPIPE handler: writing to a broken pipe killed the process outright
- writeResult/writeError/writeRequest silently swallowed write failures (catch return), leaving the main loop unaware the client disconnected
- Main loop had no exit path for write failures or watchdog shutdown signal

Added cio.ignoreSigpipe() via sigaction at startup. Added stdout_broken atomic flag set on any stdout write failure. Main loop now checks both stdout_broken and a shutdown flag (passed from the watchdog thread) each iteration.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
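The flag-and-check pattern described in this commit can be sketched as follows. This is a minimal illustration, assuming Zig 0.12 or newer (for `std.atomic.Value` and lowercase ordering names); `reportWrite` and `shouldExit` are hypothetical helper names, not functions from the PR, and the SIGPIPE setup is left out because the `sigaction` wrapper's signature differs between Zig releases.

```zig
const std = @import("std");

// Set once any stdout write fails; read by the main loop each iteration.
var stdout_broken = std.atomic.Value(bool).init(false);

/// Hypothetical wrapper: records a failed write instead of swallowing it.
fn reportWrite(result: anyerror!void) void {
    result catch {
        stdout_broken.store(true, .release);
    };
}

/// Hypothetical loop condition: exit on a broken stdout or a watchdog shutdown.
fn shouldExit(shutdown: *const std.atomic.Value(bool)) bool {
    return stdout_broken.load(.acquire) or shutdown.load(.acquire);
}

test "main loop sees a failed write" {
    var shutdown = std.atomic.Value(bool).init(false);
    reportWrite({}); // a successful write leaves the flag clear
    try std.testing.expect(!shouldExit(&shutdown));
    reportWrite(error.BrokenPipe); // a failed write sets the flag
    try std.testing.expect(shouldExit(&shutdown));
}
```

The point of the sketch is that the write helpers no longer hide the failure: they record it in shared state, and the loop owns the decision to stop.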
…ing tree changes

loadSnapshotFast correctly detected stale files (disk mtime > snapshot mtime) but re-indexed them with indexFileOutlineOnly, which skips the word index and trigram index. Since searchContent relies on these indices, updated content was invisible to all search tiers. Changed to indexFile (full_index=true) so stale files get complete indexing.

Also fixed isSensitivePath to block .env-* and .env_* variants (not just .env.*).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
isSensitivePath only matched .env and .env.X (dot-delimited) but missed .env-local, .env_production and similar hyphen/underscore variants. Extended the check to also treat '-' and '_' as delimiters after the .env prefix.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
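The delimiter rule can be illustrated with a small standalone function. This is a sketch under assumptions, not the PR's code: `isSensitivePathSketch` is a hypothetical name, and it inspects only the final path component using '/' as the separator, which may differ from how the real isSensitivePath walks the path.

```zig
const std = @import("std");

/// Hypothetical sketch: true when the last path component is `.env`, or
/// `.env` followed by one of the delimiters '.', '-' or '_'.
fn isSensitivePathSketch(path: []const u8) bool {
    // Keep only the last path component (assumes '/' separators).
    const base = if (std.mem.lastIndexOfScalar(u8, path, '/')) |i|
        path[i + 1 ..]
    else
        path;

    const prefix = ".env";
    if (!std.mem.startsWith(u8, base, prefix)) return false;
    if (base.len == prefix.len) return true; // exactly ".env"

    // The character right after ".env" must be a recognised delimiter.
    return switch (base[prefix.len]) {
        '.', '-', '_' => true,
        else => false,
    };
}

test "hyphen and underscore variants are blocked" {
    try std.testing.expect(isSensitivePathSketch(".env"));
    try std.testing.expect(isSensitivePathSketch("config/.env.local"));
    try std.testing.expect(isSensitivePathSketch(".env-local"));
    try std.testing.expect(isSensitivePathSketch(".env_production"));
    try std.testing.expect(!isSensitivePathSketch(".environment"));
}
```

Requiring a delimiter after the prefix keeps unrelated names such as `.environment` out of the filter.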
Two correctness fixes:

1. commitParsedFileOwnedOutline: prior_content was hardcoded to null, so the errdefer on trigram indexing failure always removed the word index entry instead of restoring the prior content. Now fetches prior content before overwriting.
2. avgDocLength returns 1.0 when total_tokens=0 (prevents Inf in BM25 normalization). Sort comparator in rerankAndFinalize treats NaN scores as 0 to prevent unstable ordering.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
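The two numeric guards in the second fix can be sketched in isolation. The names `avgDocLengthSketch`, `sanitize` and `scoreDesc` are hypothetical, the `doc_count == 0` guard is an extra assumption not stated in the commit, and sorting bare `f32` scores stands in for whatever result type rerankAndFinalize actually sorts.

```zig
const std = @import("std");

/// Hypothetical sketch: average document length with a safe fallback.
/// Dividing a document length by an average of 0 would yield Inf in BM25.
fn avgDocLengthSketch(total_tokens: u64, doc_count: u64) f64 {
    // The doc_count guard is an added assumption to avoid dividing by zero.
    if (total_tokens == 0 or doc_count == 0) return 1.0;
    return @as(f64, @floatFromInt(total_tokens)) / @as(f64, @floatFromInt(doc_count));
}

/// Treat NaN as 0 so every pair of scores has a well-defined order.
fn sanitize(score: f32) f32 {
    return if (std.math.isNan(score)) 0 else score;
}

/// Descending comparator that never compares a raw NaN.
fn scoreDesc(_: void, a: f32, b: f32) bool {
    return sanitize(a) > sanitize(b);
}

test "NaN sorts as zero" {
    var scores = [_]f32{ 1.5, std.math.nan(f32), 3.0, -1.0 };
    std.mem.sort(f32, &scores, {}, scoreDesc);
    try std.testing.expect(scores[0] == 3.0);
    try std.testing.expect(scores[1] == 1.5);
    try std.testing.expect(std.math.isNan(scores[2]));
    try std.testing.expect(scores[3] == -1.0);
}
```

Without the sanitising step, every comparison involving NaN is false, so the comparator stops being a consistent ordering and the sorted result depends on input order.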
The 11,621-line tests.zig (467 tests) compiled as a single binary that pegged CPU. Split into 8 independent binaries by domain:

- test-core: 21 tests (store, agent, config, edit)
- test-explore: 95 tests (explorer, word index, dep-graph, git, threads)
- test-index: 141 tests (trigram, bloom, regex, disk, sparse ngram, perf)
- test-parser: 46 tests (PHP, Go, Ruby, Swift, C, HCL, R, Dart, Python, TS)
- test-search: 47 tests (BM25, rerank, callers)
- test-snapshot: 16 tests (snapshot read/write/corruption)
- test-mcp: 57 tests (MCP protocol, bundle, nuke, update, telemetry)
- test-query: 44 tests (query pipeline, fuzzy, glob)

Each can be run individually (zig build test-core) or all at once (zig build test).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…s fixes

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Instruments searchContent with nanoTimestamp() around each of the 7 search tiers plus rerank. Stores a SearchBreakdown struct on the Explorer after every search, emitted as a search_breakdown telemetry event for codedb_search/codedb_find/codedb_word calls.

Fields: tier0_ns through tier5_ns, rerank_ns, tier_reached, candidate_count, result_count.

~160ns overhead per search (8 clock_gettime calls on Apple Silicon).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Benchmark Regression Report
Thresholds: 10.00% and 50,000 ns absolute delta
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 577785a5b8
if (std.mem.eql(u8, name, "codedb_search") or std.mem.eql(u8, name, "codedb_find") or std.mem.eql(u8, name, "codedb_word")) {
    telem.recordSearchBreakdown(explorer.last_search_breakdown);
Record breakdown only for real tiered content searches
recordSearchBreakdown is emitted for codedb_find and codedb_word as well as failed codedb_search calls, but those paths do not update explorer.last_search_breakdown, so this logs stale data from a previous request. In practice, a successful codedb_search followed by codedb_find will attribute the old tier timings to the filename search, which corrupts per-tool telemetry and any downstream analysis based on these events.
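One way to avoid reporting stale data, sketched here as an assumption rather than the fix the author chose, is to make the stored breakdown optional and consume it when it is read. `ExplorerSketch`, `takeBreakdown` and the reduced `SearchBreakdown` fields are hypothetical stand-ins for the real types.

```zig
const std = @import("std");

// Reduced stand-in for the PR's SearchBreakdown struct.
const SearchBreakdown = struct {
    tier_reached: u8 = 0,
    result_count: u32 = 0,
};

const ExplorerSketch = struct {
    // null means "no tiered content search has run since the last report".
    last_search_breakdown: ?SearchBreakdown = null,

    /// Hand out the breakdown once, leaving null behind so a later
    /// filename or word lookup cannot re-report the same timings.
    fn takeBreakdown(self: *ExplorerSketch) ?SearchBreakdown {
        const taken = self.last_search_breakdown;
        self.last_search_breakdown = null;
        return taken;
    }
};

test "a breakdown is reported at most once" {
    var explorer = ExplorerSketch{};
    explorer.last_search_breakdown = .{ .tier_reached = 2, .result_count = 5 };

    const first = explorer.takeBreakdown();
    try std.testing.expect(first != null);
    try std.testing.expect(first.?.result_count == 5);

    // A following call that did not run a tiered search finds nothing to emit.
    try std.testing.expect(explorer.takeBreakdown() == null);
}
```

A caller would then emit the telemetry event only when `takeBreakdown()` returns a value, regardless of which tool name was invoked.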
breakdown.tier_reached = if (breakdown.tier5_ns > 0 and result_list.items.len > 0) 7
    else if (breakdown.tier4_ns > 0 and result_list.items.len > 0) 6
Derive tier_reached from actual executed search tier
tier_reached is computed from tier5_ns > 0, but tier5_ns is always positive because timing is recorded even when Tier 5 is skipped. Any non-early-return search with results therefore reports tier 7, even if matches were found in earlier tiers, making the new per-tier telemetry misleading for performance/debugging decisions.
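The suggestion amounts to recording which tiers actually ran instead of inferring it from timings. A minimal sketch, assuming a hypothetical per-tier `ran` flag and a 1-based tier numbering that may not match the PR's own numbering:

```zig
const std = @import("std");

/// Hypothetical sketch: given one flag per tier (true if that tier executed),
/// return the 1-based number of the deepest tier that ran, or 0 if none did.
fn tierReached(ran: []const bool) u8 {
    var reached: u8 = 0;
    for (ran, 0..) |did_run, i| {
        if (did_run) reached = @intCast(i + 1);
    }
    return reached;
}

test "skipped tiers do not count" {
    // Tiers 1 and 2 ran; the remaining tiers were skipped.
    const ran = [_]bool{ true, true, false, false, false, false };
    try std.testing.expect(tierReached(&ran) == 2);
}
```

Because the flags are set only where a tier's body executes, a timing field that happens to be non-zero for a skipped tier can no longer inflate the reported value.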
Summary
- .env-local and .env_production no longer bypass the sensitive path filter
- tests.zig split into 8 independent test binaries

Merge strategy
Merge commit — do not squash.
🤖 Generated with Claude Code