From 44a5712fad2be7ff5cc6db9d9dc1f481a0f1ea3e Mon Sep 17 00:00:00 2001 From: justrach <54503978+justrach@users.noreply.github.com> Date: Thu, 28 May 2026 20:26:26 +0800 Subject: [PATCH 01/11] feat(dist): codedeebee npm package + README/install.sh refresh MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit - npm/: codedeebee npm package (codedb's npx-friendly sibling) - postinstall downloads matching native binary from GH release and verifies against checksums.sha256 - thin spawnSync launcher preserves cwd/stdio/args/env - published at codedeebee@0.2.5820 (closes #501) - README.md: - new "Or via npm/npx" install section - updated "16 MCP tools" → "21 MCP tools" everywhere - rewrote tools table: dropped codedb_bundle (no longer registered), added codedb_callers / codedb_context / codedb_find / codedb_glob / codedb_ls / codedb_query / disambiguated codedb_find vs codedb_symbol - added `codedb read ` to CLI table - removed v0.2.579 hotfix section (long obsolete) - install/install.sh merge_hook: when a competing legacy-tools hook (block-legacy-tools.sh / muonry / zigrep / zigread) is already registered for the same event/matcher, insert codedb's entry at index 0 instead of appending. Reshuffles existing installs too. 
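The ordering rule in the install/install.sh bullet above can be sketched as a standalone function. This is a minimal illustration of the described behaviour, not the installer's exact code: `merge_hook_sketch`, the `entry` helper, and the sample hook paths are assumptions made for the example.

```python
# Illustrative sketch only; function and helper names are hypothetical.
COMPETITOR_MARKERS = ("block-legacy-tools", "muonry", "zigrep", "zigread")


def merge_hook_sketch(hooks, event, new_entry):
    """Add new_entry to hooks[event]; put it at index 0 when a competitor shares its matcher."""
    existing = hooks.get(event, [])
    cmd = new_entry["hooks"][0]["command"]
    matcher = new_entry.get("matcher", "")

    # A competitor is an entry with the same matcher whose command mentions a marker.
    competes = any(
        e.get("matcher", "") == matcher
        and any(m in h.get("command", "") for h in e.get("hooks", []) for m in COMPETITOR_MARKERS)
        for e in existing
    )

    # Index of an already-registered copy of this hook, if there is one.
    idx = next(
        (i for i, e in enumerate(existing)
         if any(cmd in h.get("command", "") for h in e.get("hooks", []))),
        None,
    )

    if idx is not None:
        if competes and idx != 0:
            existing.insert(0, existing.pop(idx))  # reshuffle an older install
    elif competes:
        existing.insert(0, new_entry)
    else:
        existing.append(new_entry)
    hooks[event] = existing
    return hooks


def entry(command, matcher="Bash"):
    """Build a hook entry in the shape used by the settings file (hypothetical helper)."""
    return {"matcher": matcher, "hooks": [{"type": "command", "command": command}]}


hooks = {"PreToolUse": [entry("~/.claude/hooks/block-legacy-tools.sh")]}
merge_hook_sketch(hooks, "PreToolUse", entry("~/.claude/hooks/codedb-block-legacy.sh"))
order = [e["hooks"][0]["command"] for e in hooks["PreToolUse"]]
print(order)
```

Running the merge a second time leaves the list unchanged, which is the idempotence the installer relies on when it is re-run.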
Co-Authored-By: Claude Opus 4.7 (1M context) --- README.md | 73 ++++++++++------- install/install.sh | 31 ++++++- npm/.npmignore | 4 + npm/README.md | 69 ++++++++++++++++ npm/bin/codedb.js | 33 ++++++++ npm/package.json | 44 ++++++++++ npm/scripts/postinstall.js | 162 +++++++++++++++++++++++++++++++++++++ 7 files changed, 381 insertions(+), 35 deletions(-) create mode 100644 npm/.npmignore create mode 100644 npm/README.md create mode 100644 npm/bin/codedb.js create mode 100644 npm/package.json create mode 100644 npm/scripts/postinstall.js diff --git a/README.md b/README.md index 967d60a..0d3942d 100644 --- a/README.md +++ b/README.md @@ -46,7 +46,7 @@ | What works today | What's in progress | |--------------------------------------------------------|------------------------------------------| -| 16 MCP tools for full codebase intelligence | Deeper parser coverage and edge-case handling | +| 21 MCP tools for full codebase intelligence | Deeper parser coverage and edge-case handling | | Trigram v2: integer doc IDs, batch-accumulate, merge intersect | Incremental segment-based indexing | | 538x faster than ripgrep on pre-indexed queries | WASM target for Cloudflare Workers | | O(1) inverted word index for identifier lookup | Multi-project support | @@ -72,37 +72,43 @@ curl -fsSL https://codedb.codegraff.com/install.sh | bash Downloads the binary for your platform and auto-registers codedb as an MCP server in **Claude Code**, **Codex**, **Gemini CLI**, and **Cursor**. The installer prints the exact `codedb mcp` command it registered plus hook setup pointers for Codex and Claude Code. 
-### Updating or repairing an older install +### Or via npm/npx (zero-install for MCP clients) -If `codedb update` fails on an older release, rerun the installer: +```bash +npx -y codedeebee mcp +``` + +Or install globally: ```bash -curl -fsSL https://codedb.codegraff.com/install.sh | bash +npm install -g codedeebee +codedb mcp ``` -This replaces the `codedb` binary with the latest GitHub Release and keeps your existing MCP registrations, config, caches, and snapshots. Use this path for any release whose built-in updater cannot fetch release checksums. +The npm package is named [`codedeebee`](https://www.npmjs.com/package/codedeebee) (the bare `codedb` name is restricted on npm); it ships a thin launcher that downloads the matching native binary from GitHub Releases on `postinstall` and verifies the SHA256 checksum. The installed CLI is still called `codedb`. + +Useful for MCP clients (Claude Code, Cursor, opencode, Claude Desktop) that already use `npx`: -### v0.2.579 MCP hotfix and release checksums +```json +{ + "codedb": { + "type": "local", + "command": ["npx", "-y", "codedeebee"], + "args": ["mcp"], + "enabled": true + } +} +``` -This note applies to `v0.2.579` only. Earlier `v0.2.579` binaries were rebuilt -and re-uploaded on May 2, 2026 because they passed the normal Zig test suite but -missed an MCP end-to-end regression: after `codedb_index` reported success, -follow-up MCP queries could still see an empty in-memory project (`files: 0`, -`scan: loading_snapshot`, empty `tree`/`find`/`search`, or `file not indexed`). +### Updating or repairing an older install -The fixed `v0.2.579` release assets were rebuilt from source commit -`1b634f0ba5cd1072e9ca54cabf442b573e034f53`. 
The values below are SHA256 -checksums for the uploaded binaries, not Git commit SHAs: +If `codedb update` fails on an older release, rerun the installer: -| Binary | SHA256 | -|--------|--------| -| `codedb-darwin-arm64` | `b5bddba01767e38e9723f28c7b3ff55370c4eda5f9e0e84172aaec1ff5094cb2` | -| `codedb-darwin-x86_64` | `cf2a9ec511f99fd839d2349cc17e671cd9566260cf601b8b23dd649665c22999` | -| `codedb-linux-arm64` | `955b0288c5cfb5c360f7b814cd3cc288ecc42c63a569f65fac358bd9454d788b` | -| `codedb-linux-x86_64` | `201dfe26bec33b3569c44a3d4893c51822bc793e06fab69fd93e81c0354232ee` | +```bash +curl -fsSL https://codedb.codegraff.com/install.sh | bash +``` -If you installed `v0.2.579` before this hotfix, rerun the installer above so the -binary matches the final uploaded checksum for your platform. +This replaces the `codedb` binary with the latest GitHub Release and keeps your existing MCP registrations, config, caches, and snapshots. Use this path for any release whose built-in updater cannot fetch release checksums. ## Documentation @@ -127,7 +133,7 @@ Or install manually from [GitHub Releases](https://github.com/justrach/codedb/re ### As an MCP server (recommended) -After installing, codedb is automatically registered. Just open a project and the 16 MCP tools are available to your AI agent. +After installing, codedb is automatically registered. Just open a project and the 21 MCP tools are available to your AI agent. 
```bash # Manual MCP start (auto-configured by install script) @@ -156,7 +162,7 @@ codedb hot # recently modified files ## 🔧 MCP Tools -16 tools over the Model Context Protocol (JSON-RPC 2.0 over stdio): +21 tools over the Model Context Protocol (JSON-RPC 2.0 over stdio): | Tool | Description | |------|-------------| @@ -165,18 +171,22 @@ codedb hot # recently modified files | `codedb_symbol` | Find where a symbol is defined across the codebase | | `codedb_search` | Trigram-accelerated full-text search (supports regex, scoped results) | | `codedb_word` | O(1) inverted index word lookup | +| `codedb_callers` | Every call site of a symbol — word index ∩ outline scope, in one round-trip | +| `codedb_context` | Task-shaped composer — pass a NL task, get keywords + symbol defs + ranked files + top snippets in one block (replaces 3–5 sequential calls) | | `codedb_hot` | Most recently modified files | -| `codedb_deps` | Reverse dependency graph (which files import this file) | -| `codedb_read` | Read file content (supports line ranges, hash-based caching) | -| `codedb_edit` | Apply line-range edits (replace, insert, delete — atomic writes) | +| `codedb_deps` | Dependency graph: `imported_by` (default) or `depends_on`; `transitive=true` for full BFS | +| `codedb_read` | Read file content (line ranges, `if_hash` skip-unchanged, `compact` mode) | +| `codedb_edit` | Apply line-range edits (replace, insert, delete — atomic writes, optional `if_hash` guard) | | `codedb_changes` | Changed files since a sequence number | -| `codedb_status` | Index status (file count, current sequence) | +| `codedb_status` | Index status (file count, current sequence, scan phase) | | `codedb_snapshot` | Full pre-rendered JSON snapshot of the codebase | -| `codedb_bundle` | Batch multiple read-only queries in one call (max 20 ops) | | `codedb_remote` | Query indexed public repos via api.wiki.codes — no local clone needed | | `codedb_projects` | List all locally indexed projects on this machine | -| 
`codedb_index` | Index a local folder and create a codedb.snapshot | - +| `codedb_index` | Index a local folder and write `codedb.snapshot` | +| `codedb_find` | Fuzzy **file-name** search (typo-tolerant subsequence match against indexed paths — not a content/symbol search) | +| `codedb_glob` | Match indexed paths against a glob pattern (`src/**/*.zig`, `*.md`, …) | +| `codedb_ls` | List immediate children of a directory — dirs first, then files with language + counts | +| `codedb_query` | Composable pipeline — chain `find`, `search`, `filter`, `deps`, `outline`, `read`, `sort`, `limit` in one request | ### `codedb_remote` — Cloud Intelligence @@ -224,6 +234,7 @@ For Codex and Claude Code hook examples around `codedb_remote`, see [`docs/hooks | `codedb search ` | Full-text search (trigram, case-insensitive) | | `codedb search --regex ` | Regex search | | `codedb word ` | Exact word lookup via inverted index | +| `codedb read ` | Read file contents (supports `-L FROM-TO`, `--compact`) | | `codedb hot` | Recently modified files | | `codedb snapshot` | Write codedb.snapshot to project root | | `codedb serve` | HTTP daemon on :7719 | diff --git a/install/install.sh b/install/install.sh index 19155be..b31b4ba 100644 --- a/install/install.sh +++ b/install/install.sh @@ -220,14 +220,37 @@ except (FileNotFoundError, json.JSONDecodeError): hooks = data.setdefault("hooks", {}) -# Merge codedb hooks without clobbering existing hooks from other tools +# Merge codedb hooks without clobbering existing hooks from other tools. +# If a competing legacy-tools hook is already registered for the same +# event/matcher (e.g. muonry's block-legacy-tools.sh), insert codedb's +# entry at the FRONT of the list so its redirect wins the race; otherwise +# append. Re-runs will also reshuffle an already-registered codedb hook +# to the front if a competitor has appeared since the previous install. 
+COMPETITOR_MARKERS = ("block-legacy-tools", "muonry", "zigrep", "zigread") + def merge_hook(event, new_entry): existing = hooks.get(event, []) cmd = new_entry["hooks"][0]["command"] - for e in existing: + matcher = new_entry.get("matcher", "") + competes = any( + e.get("matcher", "") == matcher + and any(any(m in h.get("command", "") for m in COMPETITOR_MARKERS) for h in e.get("hooks", [])) + for e in existing + ) + idx = None + for i, e in enumerate(existing): if any(cmd in h.get("command", "") for h in e.get("hooks", [])): - return - existing.append(new_entry) + idx = i + break + if idx is not None: + if competes and idx != 0: + existing.insert(0, existing.pop(idx)) + hooks[event] = existing + return + if competes: + existing.insert(0, new_entry) + else: + existing.append(new_entry) hooks[event] = existing merge_hook("PreToolUse", {"matcher": "Bash", "hooks": [{"type": "command", "command": "$HOME/.claude/hooks/codedb-block-legacy.sh"}]}) diff --git a/npm/.npmignore b/npm/.npmignore new file mode 100644 index 0000000..b06c75b --- /dev/null +++ b/npm/.npmignore @@ -0,0 +1,4 @@ +vendor/ +*.tgz +.DS_Store +node_modules/ diff --git a/npm/README.md b/npm/README.md new file mode 100644 index 0000000..ceb2828 --- /dev/null +++ b/npm/README.md @@ -0,0 +1,69 @@ +# codedeebee + +npm/npx launcher for [**codedb**](https://github.com/justrach/codedb) — a Zig code intelligence MCP server. + +The package name is `codedeebee` (the bare `codedb` name is restricted on npm). The CLI it installs is named `codedb`. 
+ +## Quick start + +```sh +npx -y codedeebee mcp +``` + +Or install once: + +```sh +npm install -g codedeebee +codedb mcp +``` + +## MCP client config + +### Claude Code / Cursor / opencode + +```json +{ + "codedb": { + "type": "local", + "command": ["npx", "-y", "codedeebee"], + "args": ["mcp"], + "enabled": true + } +} +``` + +### Claude Desktop + +```json +{ + "mcpServers": { + "codedb": { + "command": "npx", + "args": ["-y", "codedeebee", "mcp"] + } + } +} +``` + +## How it works + +`postinstall` downloads the matching native binary from the corresponding [GitHub Release](https://github.com/justrach/codedb/releases) and verifies it against `checksums.sha256`. The `codedb` command is a thin Node launcher that execs the native binary, preserving `cwd`, stdio, args, and environment. + +## Supported platforms + +| OS | Arch | +|--------|----------------------| +| macOS | arm64, x64 (Intel) | +| Linux | arm64, x64 | + +Windows is not yet supported. Comment on [issue #501](https://github.com/justrach/codedb/issues/501) if you need it. + +## Skipping the binary download + +For sandboxed installs (or environments without GitHub access), set `CODEDEEBEE_SKIP_POSTINSTALL=1`. The package will install successfully but `codedb` will exit until a binary is placed at `node_modules/codedeebee/vendor/codedb`. + +## Links + +- Source: https://github.com/justrach/codedb +- Issues: https://github.com/justrach/codedb/issues +- Releases: https://github.com/justrach/codedb/releases diff --git a/npm/bin/codedb.js b/npm/bin/codedb.js new file mode 100644 index 0000000..49cef01 --- /dev/null +++ b/npm/bin/codedb.js @@ -0,0 +1,33 @@ +#!/usr/bin/env node +"use strict"; + +const { spawnSync } = require("node:child_process"); +const fs = require("node:fs"); +const path = require("node:path"); + +const exeName = process.platform === "win32" ? 
"codedb.exe" : "codedb"; +const binPath = path.join(__dirname, "..", "vendor", exeName); + +if (!fs.existsSync(binPath)) { + process.stderr.write( + `codedb: native binary not found at ${binPath}\n` + + ` the postinstall step may have failed. Re-run:\n` + + ` npm rebuild codedeebee\n` + + ` or reinstall:\n` + + ` npm install -g codedeebee\n` + ); + process.exit(1); +} + +const result = spawnSync(binPath, process.argv.slice(2), { + stdio: "inherit", + cwd: process.cwd(), + env: process.env, +}); + +if (result.error) { + process.stderr.write(`codedb: failed to spawn ${binPath}: ${result.error.message}\n`); + process.exit(1); +} + +process.exit(result.status ?? 1); diff --git a/npm/package.json b/npm/package.json new file mode 100644 index 0000000..7a0ce0a --- /dev/null +++ b/npm/package.json @@ -0,0 +1,44 @@ +{ + "name": "codedeebee", + "version": "0.2.5820", + "description": "Zig code intelligence MCP server — npx launcher for the codedb native binary", + "license": "MIT", + "author": "justrach", + "homepage": "https://github.com/justrach/codedb", + "repository": { + "type": "git", + "url": "git+https://github.com/justrach/codedb.git" + }, + "bugs": { + "url": "https://github.com/justrach/codedb/issues" + }, + "bin": { + "codedb": "bin/codedb.js" + }, + "files": [ + "bin/", + "scripts/", + "README.md" + ], + "scripts": { + "postinstall": "node scripts/postinstall.js" + }, + "engines": { + "node": ">=18" + }, + "keywords": [ + "codedb", + "mcp", + "code-intelligence", + "zig", + "code-search" + ], + "os": [ + "darwin", + "linux" + ], + "cpu": [ + "x64", + "arm64" + ] +} diff --git a/npm/scripts/postinstall.js b/npm/scripts/postinstall.js new file mode 100644 index 0000000..09dae4d --- /dev/null +++ b/npm/scripts/postinstall.js @@ -0,0 +1,162 @@ +#!/usr/bin/env node +"use strict"; + +const fs = require("node:fs"); +const path = require("node:path"); +const crypto = require("node:crypto"); +const https = require("node:https"); +const { pipeline } = 
require("node:stream/promises"); + +const pkg = require("../package.json"); +const VERSION = pkg.version; +const REPO = "justrach/codedb"; + +const PLATFORM_MAP = { + "darwin-arm64": "codedb-darwin-arm64", + "darwin-x64": "codedb-darwin-x86_64", + "linux-arm64": "codedb-linux-arm64", + "linux-x64": "codedb-linux-x86_64", +}; + +function logErr(msg) { + process.stderr.write(`[codedeebee postinstall] ${msg}\n`); +} + +function log(msg) { + if (process.env.npm_config_loglevel === "silent") return; + process.stderr.write(`[codedeebee postinstall] ${msg}\n`); +} + +function get(url, redirectsLeft = 5) { + return new Promise((resolve, reject) => { + const req = https.get( + url, + { + headers: { + "User-Agent": `codedeebee-postinstall/${VERSION} node/${process.version}`, + Accept: "application/octet-stream", + }, + }, + (res) => { + const status = res.statusCode || 0; + if (status >= 300 && status < 400 && res.headers.location) { + if (redirectsLeft <= 0) { + res.resume(); + reject(new Error(`too many redirects fetching ${url}`)); + return; + } + const next = new URL(res.headers.location, url).toString(); + res.resume(); + resolve(get(next, redirectsLeft - 1)); + return; + } + if (status < 200 || status >= 300) { + res.resume(); + reject(new Error(`HTTP ${status} fetching ${url}`)); + return; + } + resolve(res); + } + ); + req.on("error", reject); + req.setTimeout(60_000, () => { + req.destroy(new Error(`timeout fetching ${url}`)); + }); + }); +} + +async function fetchText(url) { + const res = await get(url); + const chunks = []; + for await (const chunk of res) chunks.push(chunk); + return Buffer.concat(chunks).toString("utf8"); +} + +async function downloadToFile(url, dest) { + const res = await get(url); + const hash = crypto.createHash("sha256"); + res.on("data", (chunk) => hash.update(chunk)); + const out = fs.createWriteStream(dest, { mode: 0o755 }); + await pipeline(res, out); + return hash.digest("hex"); +} + +async function main() { + if 
(process.env.CODEDEEBEE_SKIP_POSTINSTALL === "1") { + log("CODEDEEBEE_SKIP_POSTINSTALL=1 — skipping binary download"); + return; + } + + const key = `${process.platform}-${process.arch}`; + const asset = PLATFORM_MAP[key]; + if (!asset) { + logErr( + `unsupported platform/arch: ${key}. Supported: ${Object.keys(PLATFORM_MAP).join(", ")}.\n` + + `If you want this platform supported, comment on https://github.com/${REPO}/issues/501` + ); + process.exit(0); + } + + const tag = `v${VERSION}`; + const baseUrl = `https://github.com/${REPO}/releases/download/${tag}`; + const assetUrl = `${baseUrl}/${asset}`; + const checksumsUrl = `${baseUrl}/checksums.sha256`; + + const vendorDir = path.join(__dirname, "..", "vendor"); + fs.mkdirSync(vendorDir, { recursive: true }); + const destPath = path.join(vendorDir, process.platform === "win32" ? "codedb.exe" : "codedb"); + const tmpPath = `${destPath}.download`; + + log(`platform: ${key} → asset: ${asset}`); + log(`fetching checksums from ${checksumsUrl}`); + + let expectedHex; + try { + const checksums = await fetchText(checksumsUrl); + for (const line of checksums.split(/\r?\n/)) { + const m = line.match(/^([0-9a-fA-F]{64})\s+\*?(.+)$/); + if (m && m[2].trim() === asset) { + expectedHex = m[1].toLowerCase(); + break; + } + } + if (!expectedHex) { + logErr(`could not find ${asset} in checksums.sha256 at ${checksumsUrl}`); + process.exit(1); + } + } catch (err) { + logErr(`failed to fetch checksums: ${err.message}`); + process.exit(1); + } + + log(`downloading ${assetUrl}`); + try { + if (fs.existsSync(tmpPath)) fs.unlinkSync(tmpPath); + const actualHex = await downloadToFile(assetUrl, tmpPath); + if (actualHex !== expectedHex) { + logErr( + `checksum mismatch for ${asset}:\n` + + ` expected ${expectedHex}\n` + + ` actual ${actualHex}` + ); + try { + fs.unlinkSync(tmpPath); + } catch {} + process.exit(1); + } + fs.chmodSync(tmpPath, 0o755); + fs.renameSync(tmpPath, destPath); + log(`installed: ${destPath}`); + } catch (err) { + 
logErr(`failed to download binary: ${err.message}`);
+    try {
+      fs.unlinkSync(tmpPath);
+    } catch {}
+    process.exit(1);
+  }
+}
+
+main().catch((err) => {
+  logErr(`unexpected error: ${err.stack || err.message}`);
+  process.exit(1);
+});

From 4f100aaf5a8ad608e3b10b69031e236d34414fa9 Mon Sep 17 00:00:00 2001
From: justrach <54503978+justrach@users.noreply.github.com>
Date: Thu, 28 May 2026 20:26:44 +0800
Subject: [PATCH 02/11] =?UTF-8?q?fix:=20#502=20+=20#503=20+=20#507=20?=
 =?UTF-8?q?=E2=80=94=20arg=20parser=20+=20outline-only=20search?=
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

#503: codedb mcp hung on loading_snapshot because isCommand("mcp") matched
at main.zig:150, parser set root=".", dropped /path, and entered deferred
mode waiting for an MCP roots/list that a shell user never sends.

#502: codedb mcp --help started the MCP server instead of printing help —
same isCommand branch silently consumed --help as a command-arg.

Both are fixed by factoring positional parsing into pub fn parsePositional()
with two new special cases when args[1]=="mcp":
  - args[2] in {--help,-h,help} → cmd="--help"
  - args[2] looks like a path (not -flag) → root=args[2], root_is_explicit=true

#507: after a snapshot rebuild, search returned 0 results for substrings
demonstrably present in files that `tree` and `read` both surfaced.

Root cause: Explorer.commitParsedFileOwnedOutline with full_index=false
(snapshot.zig outline-only fallback + watcher.zig incremental
indexFileOutline + WASM fast-path) registered files in `outlines` and
`contents` but NOT in trigram_index, word_index, OR skip_trigram_files.
Search then missed them at every tier:
  • tier 1 (trigram candidates) — file not in trigram_index
  • tier 3 (skip_trigram_files scan) — file not in this set
  • tier 5 (full outline scan) — short-circuited by trigram_ruled_out

Fix: in the !full_index branch, also do skip_trigram_files.put(), so
tier 3 substring-scans these files via searchInContent.
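The #507 failure mode can be reproduced in a toy model. The sketch below is a deliberately simplified stand-in, assuming only the three tiers named above and an assumed meaning for `trigram_ruled_out` (the query has trigrams but the trigram index yields no candidates); `ToyIndex` and its methods are hypothetical and are not the Explorer API.

```python
# Toy model of the tiered search; all names are hypothetical.
from dataclasses import dataclass, field


def trigrams(text):
    """Return the set of 3-character substrings of text."""
    return {text[i:i + 3] for i in range(len(text) - 2)}


@dataclass
class ToyIndex:
    contents: dict = field(default_factory=dict)        # path -> file text
    trigram_index: dict = field(default_factory=dict)   # trigram -> set of paths
    skip_trigram_files: set = field(default_factory=set)

    def index_outline_only(self, path, text, register_skip):
        """Store content without trigram postings; optionally register for tier 3."""
        self.contents[path] = text
        if register_skip:
            self.skip_trigram_files.add(path)

    def search(self, query):
        hits = set()
        # Tier 1: intersect postings for every trigram of the query.
        q = trigrams(query)
        candidates = None
        for t in q:
            posting = self.trigram_index.get(t, set())
            candidates = posting if candidates is None else candidates & posting
        candidates = candidates or set()
        hits |= {p for p in candidates if query in self.contents[p]}
        # Tier 3: substring-scan files that opted out of trigram indexing.
        hits |= {p for p in self.skip_trigram_files if query in self.contents[p]}
        # Tier 5: full scan, skipped when the trigram index ruled the query out.
        trigram_ruled_out = bool(q) and not candidates
        if not hits and not trigram_ruled_out:
            hits |= {p for p, text in self.contents.items() if query in text}
        return sorted(hits)


text = "Doran Orchestrator operating contract:"
before = ToyIndex()
before.index_outline_only("bin/orchestrator", text, register_skip=False)
after = ToyIndex()
after.index_outline_only("bin/orchestrator", text, register_skip=True)
print(before.search("Orchestrator"), after.search("Orchestrator"))
```

In this model the unregistered file is invisible to every tier, while registering it in `skip_trigram_files` lets tier 3 find it, which is the shape of the fix described above.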
Tests in test_mcp.zig (all fail on parent commit): issue-503: parsePositional treats `codedb mcp ` as path-as-root issue-503: `codedb mcp` still works (original order) issue-503: `codedb mcp` alone keeps cwd-as-root deferred behavior issue-502: `codedb mcp --help` rewrites to --help, does not start server issue-502: `codedb mcp -h` rewrites to --help parsePositional: existing commands still parse correctly (regression) issue-507: indexFileOutlineOnly files remain searchable via tier 3 Full suite: 514/514 across all 7 test binaries. Closes #502 (partial — reject-unknown-flags, git-root detection, scan-stuck recovery still open), closes #503, closes #507. Co-Authored-By: Claude Opus 4.7 (1M context) --- src/explore.zig | 12 +++++ src/main.zig | 89 +++++++++++++++++++++----------- src/test_mcp.zig | 132 +++++++++++++++++++++++++++++++++++++++++++++++ 3 files changed, 204 insertions(+), 29 deletions(-) diff --git a/src/explore.zig b/src/explore.zig index a6b4437..3ff713f 100644 --- a/src/explore.zig +++ b/src/explore.zig @@ -688,6 +688,18 @@ pub const Explorer = struct { self.sparse_ngram_index.removeFile(stable_path); try self.skip_trigram_files.put(stable_path, {}); } + } else { + // Outline-only path (snapshot load fallback, file-watcher incremental + // updates, WASM fast-path). The file is in `outlines` + `contents` but + // not in word_index or trigram_index — without this entry it would + // also be absent from `skip_trigram_files`, dropping it out of every + // search tier: + // • tier 1 (trigram candidates) — file not in trigram_index + // • tier 3 (skip_trigram_files scan) — file not in this set + // • tier 5 (full outline scan) — short-circuited by trigram_ruled_out + // Registering here means tier 3 picks the file up via searchInContent. + // See #507. 
+ try self.skip_trigram_files.put(stable_path, {}); } try self.rebuildDepsFor(stable_path, &persistent_outline); diff --git a/src/main.zig b/src/main.zig index b677b25..b9419a3 100644 --- a/src/main.zig +++ b/src/main.zig @@ -128,38 +128,15 @@ fn mainImpl() !void { var cmd_args_start: usize = undefined; var root_is_explicit: bool = false; - if (args.len >= 2 and std.mem.eql(u8, args[1], "--mcp")) { - root = "."; - cmd = "mcp"; - cmd_args_start = 2; - } else if (args.len >= 2 and (std.mem.eql(u8, args[1], "--version") or std.mem.eql(u8, args[1], "-v"))) { - root = "."; - cmd = "--version"; - cmd_args_start = 2; - } else if (args.len >= 2 and - (std.mem.eql(u8, args[1], "--help") or - std.mem.eql(u8, args[1], "-h") or - std.mem.eql(u8, args[1], "help"))) - { - root = "."; - cmd = args[1]; - cmd_args_start = 2; - } else if (args.len < 2) { - printUsage(&out, s); - std.process.exit(1); - } else if (isCommand(args[1])) { - root = "."; - cmd = args[1]; - cmd_args_start = 2; - } else if (args.len >= 3) { - root = args[1]; - cmd = args[2]; - cmd_args_start = 3; - root_is_explicit = true; - } else { + const parsed = parsePositional(args); + if (parsed.usage_exit) { printUsage(&out, s); std.process.exit(1); } + root = parsed.root; + cmd = parsed.cmd; + cmd_args_start = parsed.cmd_args_start; + root_is_explicit = parsed.root_is_explicit; // CODEDB_ROOT env var lets clients (Claude Code MCP, shell scripts) pin // the root without needing to pass a positional arg. Treated as explicit @@ -1068,6 +1045,60 @@ fn mainImpl() !void { std.process.exit(1); } } + +pub const ParsedPositional = struct { + root: []const u8, + cmd: []const u8, + cmd_args_start: usize, + root_is_explicit: bool, + usage_exit: bool = false, +}; + +/// Parse positional args into root/cmd. Pure, side-effect-free — caller is +/// responsible for printUsage()/exit when `usage_exit` is set. +/// +/// Special cases: +/// - `codedb mcp ` is honored as `codedb mcp` (issue #503). 
+/// The wrong arg order is a frequent typo from users who think `mcp` is +/// a normal subcommand. Treating the path as root prevents the deferred +/// scan from hanging forever waiting for a `roots/list` that never comes. +/// - `codedb mcp --help` (or `-h`/`help`) prints usage instead of starting +/// the MCP server (issue #502). +pub fn parsePositional(args: []const []const u8) ParsedPositional { + if (args.len < 2) { + return .{ .root = "", .cmd = "", .cmd_args_start = 0, .root_is_explicit = false, .usage_exit = true }; + } + const a1 = args[1]; + if (std.mem.eql(u8, a1, "--mcp")) { + return .{ .root = ".", .cmd = "mcp", .cmd_args_start = 2, .root_is_explicit = false }; + } + if (std.mem.eql(u8, a1, "--version") or std.mem.eql(u8, a1, "-v")) { + return .{ .root = ".", .cmd = "--version", .cmd_args_start = 2, .root_is_explicit = false }; + } + if (std.mem.eql(u8, a1, "--help") or std.mem.eql(u8, a1, "-h") or std.mem.eql(u8, a1, "help")) { + return .{ .root = ".", .cmd = a1, .cmd_args_start = 2, .root_is_explicit = false }; + } + if (isCommand(a1)) { + // `codedb mcp --help` → print help, do not start server. #502. + if (std.mem.eql(u8, a1, "mcp") and args.len >= 3) { + const a2 = args[2]; + if (std.mem.eql(u8, a2, "--help") or std.mem.eql(u8, a2, "-h") or std.mem.eql(u8, a2, "help")) { + return .{ .root = ".", .cmd = "--help", .cmd_args_start = 3, .root_is_explicit = false }; + } + // `codedb mcp ` → honor path as root. #503. + // Only when args[2] doesn't look like a flag; otherwise it's a + // legitimate command-arg that the mcp subcommand may consume. 
+ if (a2.len > 0 and a2[0] != '-') { + return .{ .root = a2, .cmd = "mcp", .cmd_args_start = 3, .root_is_explicit = true }; + } + } + return .{ .root = ".", .cmd = a1, .cmd_args_start = 2, .root_is_explicit = false }; + } + if (args.len >= 3) { + return .{ .root = a1, .cmd = args[2], .cmd_args_start = 3, .root_is_explicit = true }; + } + return .{ .root = "", .cmd = "", .cmd_args_start = 0, .root_is_explicit = false, .usage_exit = true }; +} fn isCommand(arg: []const u8) bool { const commands = [_][]const u8{ "tree", "outline", "find", "search", "word", "read", "hot", "snapshot", "serve", "mcp", "update", "nuke" }; for (commands) |c| { diff --git a/src/test_mcp.zig b/src/test_mcp.zig index 66540e7..906c8cb 100644 --- a/src/test_mcp.zig +++ b/src/test_mcp.zig @@ -1566,3 +1566,135 @@ test "issue-437: codedb_bundle ops items schema has discriminated oneOf per sub- } } + +test "issue-503: parsePositional treats `codedb mcp ` as path-as-root" { + // Before fix: parser took the isCommand("mcp") branch, set root=".", + // root_is_explicit=false, and silently dropped /tmp/proj. That tripped + // the deferred-scan branch in mainImpl() which waited forever for an + // MCP `roots/list` message that a user invoking from a shell will never + // send. 
+ const argv = [_][]const u8{ "codedb", "mcp", "/tmp/proj" }; + const p = main_mod.parsePositional(&argv); + try testing.expect(!p.usage_exit); + try testing.expectEqualStrings("/tmp/proj", p.root); + try testing.expectEqualStrings("mcp", p.cmd); + try testing.expect(p.root_is_explicit); +} + +test "issue-503: `codedb mcp` still works (original order)" { + const argv = [_][]const u8{ "codedb", "/tmp/proj", "mcp" }; + const p = main_mod.parsePositional(&argv); + try testing.expect(!p.usage_exit); + try testing.expectEqualStrings("/tmp/proj", p.root); + try testing.expectEqualStrings("mcp", p.cmd); + try testing.expect(p.root_is_explicit); +} + +test "issue-503: `codedb mcp` alone keeps cwd-as-root deferred behavior" { + // The deferred-mode behavior is intentional when no path is given — + // an MCP client may still send roots/list. Don't break that path. + const argv = [_][]const u8{ "codedb", "mcp" }; + const p = main_mod.parsePositional(&argv); + try testing.expect(!p.usage_exit); + try testing.expectEqualStrings(".", p.root); + try testing.expectEqualStrings("mcp", p.cmd); + try testing.expect(!p.root_is_explicit); +} + +test "issue-502: `codedb mcp --help` rewrites to --help, does not start server" { + const argv = [_][]const u8{ "codedb", "mcp", "--help" }; + const p = main_mod.parsePositional(&argv); + try testing.expect(!p.usage_exit); + try testing.expectEqualStrings("--help", p.cmd); +} + +test "issue-502: `codedb mcp -h` rewrites to --help" { + const argv = [_][]const u8{ "codedb", "mcp", "-h" }; + const p = main_mod.parsePositional(&argv); + try testing.expect(!p.usage_exit); + try testing.expectEqualStrings("--help", p.cmd); +} + +test "parsePositional: existing commands still parse correctly (regression)" { + // `codedb tree` → cwd-as-root tree + { + const argv = [_][]const u8{ "codedb", "tree" }; + const p = main_mod.parsePositional(&argv); + try testing.expectEqualStrings(".", p.root); + try testing.expectEqualStrings("tree", p.cmd); + try 
testing.expect(!p.root_is_explicit); + } + // `codedb /path/to/root tree` → explicit-root tree + { + const argv = [_][]const u8{ "codedb", "/path/to/root", "tree" }; + const p = main_mod.parsePositional(&argv); + try testing.expectEqualStrings("/path/to/root", p.root); + try testing.expectEqualStrings("tree", p.cmd); + try testing.expect(p.root_is_explicit); + } + // `codedb --version` → version + { + const argv = [_][]const u8{ "codedb", "--version" }; + const p = main_mod.parsePositional(&argv); + try testing.expectEqualStrings("--version", p.cmd); + } + // `codedb --help` → help + { + const argv = [_][]const u8{ "codedb", "--help" }; + const p = main_mod.parsePositional(&argv); + try testing.expectEqualStrings("--help", p.cmd); + } + // no args → usage exit + { + const argv = [_][]const u8{"codedb"}; + const p = main_mod.parsePositional(&argv); + try testing.expect(p.usage_exit); + } + // `codedb --mcp` → mcp command (legacy alias) + { + const argv = [_][]const u8{ "codedb", "--mcp" }; + const p = main_mod.parsePositional(&argv); + try testing.expectEqualStrings("mcp", p.cmd); + } +} + +test "issue-507: indexFileOutlineOnly files remain searchable via tier 3" { + // Repro for #507: after a snapshot rebuild, certain files showed up in + // `tree` and `read` but searchContent returned 0 hits for substrings + // demonstrably present in the file. Snapshot.zig and watcher.zig both + // route through Explorer.indexFileOutlineOnly for files that aren't in + // the trigram-restore set; before the fix that path populated outlines + // and contents but not trigram_index nor skip_trigram_files, so the file + // fell off every search tier (trigram missed; tier 3 keyed on + // skip_trigram_files missed; tier 5 short-circuited by trigram_ruled_out). 
var explorer = Explorer.init(testing.allocator, Explorer.DEFAULT_CONTENT_CACHE_CAPACITY);
+    defer explorer.deinit();
+
+    // A representative file from the upstream report — extension-less shell
+    // script that ends up with language=unknown but still has searchable text.
+    const path = "bin/orchestrator";
+    const content =
+        \\#!/usr/bin/env bash
+        \\set -euo pipefail
+        \\
+        \\policy_context="$(cat <<'POLICY'
+        \\Doran Orchestrator operating contract:
+        \\- AIHero / Matt Pocock skills from AGENTS.md
+        \\POLICY
+        \\)"
+        \\echo "$policy_context"
+    ;
+    try explorer.indexFileOutlineOnly(path, content);
+
+    const hits = try explorer.searchContent("Doran Orchestrator operating contract", testing.allocator, 10);
+    defer {
+        for (hits) |h| {
+            testing.allocator.free(h.path);
+            testing.allocator.free(h.line_text);
+        }
+        testing.allocator.free(hits);
+    }
+
+    try testing.expect(hits.len > 0);
+    try testing.expectEqualStrings(path, hits[0].path);
+}

From 619463653345f78e666fc1ad6d1811d75ef80065 Mon Sep 17 00:00:00 2001
From: justrach <54503978+justrach@users.noreply.github.com>
Date: Thu, 28 May 2026 20:52:21 +0800
Subject: [PATCH 03/11] fix(cli): #502 reject unknown flags after `mcp`
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Before, `codedb mcp --snapshot` silently swallowed the unknown flag and
started the MCP server with surprising state. mainImpl now whitelists
post-`mcp` flags via isValidMcpFlag and exits 1 with a
listed-valid-flags error message on the first unrecognised flag.

Edge cases:
  • `--help`/`-h`/`help` anywhere after `mcp` short-circuits to
    printUsage + exit 0 (parsePositional only catches them when they
    sit immediately after `mcp`, so combos like `mcp --no-telemetry
    --help` need their own bypass).
  • `--config-file=` is stripped before positional parsing and
    never reaches this whitelist.
  • `--no-telemetry` stays accepted; existing behaviour preserved.
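The validation pass described in these edge cases can be sketched as a small pure function. This is a hedged illustration rather than the Zig implementation: `check_mcp_args` and its return values are hypothetical, `--config-file=` is assumed to have been stripped already, and only tokens starting with `-` are examined, so a bare word such as `help` is treated as positional and skipped here.

```python
# Illustrative sketch; names and return shape are assumptions.
VALID_MCP_FLAGS = {"--no-telemetry"}
HELP_FLAGS = {"--help", "-h"}


def check_mcp_args(args):
    """Classify the arguments that follow `mcp`.

    Returns ("help", None) when a help flag is seen first,
    ("error", flag) for the first flag outside the whitelist,
    and ("ok", None) when every flag is accepted.
    """
    for arg in args:
        if not arg.startswith("-"):
            continue  # positional tokens are not validated in this pass
        if arg in HELP_FLAGS:
            return ("help", None)
        if arg not in VALID_MCP_FLAGS:
            return ("error", arg)
    return ("ok", None)


print(check_mcp_args(["--no-telemetry", "--help"]))
print(check_mcp_args(["--snapshot"]))
```

Because the scan stops at the first decisive token, the outcome depends on argument order: an unknown flag placed before `--help` is reported as an error in this sketch.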
Test in test_mcp.zig: issue-502: isValidMcpFlag whitelist rejects unknown flags (fails on parent commit). Full test-mcp: 88/88 pass. Co-Authored-By: Claude Opus 4.7 (1M context) --- src/main.zig | 34 ++++++++++++++++++++++++++++++++++ src/test_mcp.zig | 11 +++++++++++ 2 files changed, 45 insertions(+) diff --git a/src/main.zig b/src/main.zig index b9419a3..cea3b76 100644 --- a/src/main.zig +++ b/src/main.zig @@ -157,6 +157,31 @@ fn mainImpl() !void { // See #304. if (std.mem.eql(u8, cmd, "mcp")) { out.file = cio.File.stderr(); + // #502: reject unknown flags after `mcp` (e.g. `codedb mcp --snapshot` + // was previously consumed silently and the server started anyway, + // hiding the typo). Whitelist via isValidMcpFlag. + // Handle `--help` here too — parsePositional only catches it when it + // sits immediately after `mcp`; combos like `mcp --no-telemetry --help` + // need their own bypass. + for (args[cmd_args_start..]) |a| { + if (a.len == 0 or a[0] != '-') continue; + if (std.mem.eql(u8, a, "--help") or std.mem.eql(u8, a, "-h") or std.mem.eql(u8, a, "help")) { + out.file = stdout; + printUsage(&out, s); + return; + } + if (!isValidMcpFlag(a)) { + out.p("{s}\xe2\x9c\x97{s} unknown flag for {s}mcp{s}: {s}{s}{s}\n valid: {s}--no-telemetry{s}, {s}--help{s}, {s}--config-file={s}\n", .{ + s.red, s.reset, + s.bold, s.reset, + s.bold, a, s.reset, + s.bold, s.reset, + s.bold, s.reset, + s.bold, s.reset, + }); + std.process.exit(1); + } + } } // Handle --version early (no root needed) @@ -1099,6 +1124,15 @@ pub fn parsePositional(args: []const []const u8) ParsedPositional { } return .{ .root = "", .cmd = "", .cmd_args_start = 0, .root_is_explicit = false, .usage_exit = true }; } + +/// Whitelist of post-command flags accepted by `codedb mcp`. Anything else +/// starting with `-` is rejected at startup (#502). 
`--config-file=` +/// is stripped before positional parsing and never reaches this whitelist; +/// `--help`/`-h`/`help` are rewritten by parsePositional and also never +/// reach here as a command arg. +pub fn isValidMcpFlag(arg: []const u8) bool { + return std.mem.eql(u8, arg, "--no-telemetry"); +} fn isCommand(arg: []const u8) bool { const commands = [_][]const u8{ "tree", "outline", "find", "search", "word", "read", "hot", "snapshot", "serve", "mcp", "update", "nuke" }; for (commands) |c| { diff --git a/src/test_mcp.zig b/src/test_mcp.zig index 906c8cb..5f63e3e 100644 --- a/src/test_mcp.zig +++ b/src/test_mcp.zig @@ -1658,6 +1658,17 @@ test "parsePositional: existing commands still parse correctly (regression)" { } } + +test "issue-502: isValidMcpFlag whitelist rejects unknown flags" { + // Before fix: `codedb mcp --snapshot` silently swallowed the flag and + // started the server with surprising state. After fix, mainImpl rejects + // any non-whitelisted flag with a clear error and exit 1. 
+ try testing.expect(main_mod.isValidMcpFlag("--no-telemetry")); + try testing.expect(!main_mod.isValidMcpFlag("--snapshot")); + try testing.expect(!main_mod.isValidMcpFlag("-x")); + try testing.expect(!main_mod.isValidMcpFlag("--help")); // rewritten by parsePositional before reaching here + try testing.expect(!main_mod.isValidMcpFlag("")); +} test "issue-507: indexFileOutlineOnly files remain searchable via tier 3" { // Repro for #507: after a snapshot rebuild, certain files showed up in // `tree` and `read` but searchContent returned 0 hits for substrings From 179f744e32110e78f9b2e7b772710667e7f656ba Mon Sep 17 00:00:00 2001 From: justrach <54503978+justrach@users.noreply.github.com> Date: Thu, 28 May 2026 20:56:06 +0800 Subject: [PATCH 04/11] fix(cli): #502 auto-detect git root from cwd MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit When `codedb mcp` is launched from a subdirectory of a git repo (typical pattern for editor MCP clients spawning from the buffer's directory), walk up from cwd to the nearest `.git` and use that as the indexed root. Without this, opencode/Zed/etc were silently indexing only the subdir the user happened to be in. • Pin order: CODEDB_ROOT env var > positional `` arg > findGitRoot > cwd-deferred mode. • Only triggers in deferred mode (root==".", !root_is_explicit) — explicit paths and the `${workspaceFolder}` shim are left untouched. • `.git` may be a dir (normal repo) or a file (git worktree); statFile covers both. • Walks until the first `/` segment, then bails — does not treat `/` itself as a project root. Factored as findGitRoot(io, buf) + findGitRootFrom(io, buf, len) so tests can hand in synthetic absolute paths without chdir'ing the process. Tests in test_mcp.zig: walks-up case + null case (both fail on parent commit; both pass after). Full test-mcp: 90/90 pass. 
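Illustration only, not part of the patch: the upward walk can be sketched without touching the filesystem. `walkUp` and `dir_with_git` are hypothetical; `dir_with_git` stands in for the statFile probe by naming the one directory assumed to contain `.git`. The stop condition mirrors the rule above that `/` itself is never treated as a project root.

```zig
const std = @import("std");

// Pure-string sketch of the upward walk. `dir_with_git` is a hypothetical
// stand-in for the filesystem probe: the one directory that "contains" .git.
fn walkUp(start: []const u8, dir_with_git: []const u8) ?[]const u8 {
    var here = start;
    while (here.len > 0) {
        if (std.mem.eql(u8, here, dir_with_git)) return here;
        const slash = std.mem.lastIndexOfScalar(u8, here, '/') orelse return null;
        // The parent would be "/" itself, which is never treated as a root.
        if (slash == 0) return null;
        here = here[0..slash];
    }
    return null;
}

test "walk up from a nested directory" {
    const got = walkUp("/home/user/proj/sub/deep", "/home/user/proj");
    try std.testing.expectEqualStrings("/home/user/proj", got.?);
    // Even if "/" held a .git, a walk starting below it never reaches it.
    try std.testing.expect(walkUp("/tmp/lonely", "/") == null);
}
```

For a start of `/home/user/proj/sub/deep` the candidates are tried deepest first: `.../sub/deep`, `.../sub`, then `/home/user/proj`, where the sketch stops.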
Co-Authored-By: Claude Opus 4.7 (1M context) --- src/main.zig | 48 ++++++++++++++++++++++++++++++++++++++++++++++++ src/test_mcp.zig | 46 ++++++++++++++++++++++++++++++++++++++++++++++ 2 files changed, 94 insertions(+) diff --git a/src/main.zig b/src/main.zig index cea3b76..717f062 100644 --- a/src/main.zig +++ b/src/main.zig @@ -152,6 +152,19 @@ fn mainImpl() !void { } } + // #502: when `codedb mcp` is launched from a subdirectory of a git + // repo (e.g. opencode/Zed spawning from the buffer's directory), walk + // up to the repo root so the user gets the whole project indexed + // rather than the subdir they happen to be in. Skipped if the env var + // or a positional arg already pinned the root, or if no .git is found. + var git_root_buf: [std.fs.max_path_bytes]u8 = undefined; + if (std.mem.eql(u8, cmd, "mcp") and std.mem.eql(u8, root, ".") and !root_is_explicit) { + if (findGitRoot(io, &git_root_buf)) |git_root| { + root = git_root; + root_is_explicit = true; + } + } + // MCP stdio reserves stdout for JSON-RPC — route status/error output to // stderr so startup/failure paths don't corrupt the protocol stream. // See #304. @@ -1125,6 +1138,40 @@ pub fn parsePositional(args: []const []const u8) ParsedPositional { return .{ .root = "", .cmd = "", .cmd_args_start = 0, .root_is_explicit = false, .usage_exit = true }; } +/// Walk up from cwd looking for a `.git` directory or file (git worktree). +/// Returns a slice into `buf` containing the absolute path, or null if no +/// repo root is found before reaching the filesystem root. Used to make +/// `codedb mcp` from inside a subdir of a git repo Just Work (#502). +pub fn findGitRoot(io: std.Io, buf: *[std.fs.max_path_bytes]u8) ?[]const u8 { + const cwd_len = std.Io.Dir.cwd().realPathFile(io, ".", buf) catch return null; + return findGitRootFrom(io, buf, cwd_len); +} + +/// Test-friendly variant: walk up from `buf[0..start_len]` (must already be +/// an absolute path) looking for `.git`. Mutates buf in place. 
Returns slice +/// or null. Kept separate so tests can hand in synthetic absolute paths +/// without chdir'ing the process. +pub fn findGitRootFrom(io: std.Io, buf: *[std.fs.max_path_bytes]u8, start_len: usize) ?[]const u8 { + var len = start_len; + var probe_buf: [std.fs.max_path_bytes]u8 = undefined; + while (len > 0) { + const here = buf[0..len]; + const probe = std.fmt.bufPrint(&probe_buf, "{s}/.git", .{here}) catch return null; + if (std.Io.Dir.cwd().statFile(io, probe, .{})) |_| { + return here; + } else |_| {} + if (std.mem.lastIndexOfScalar(u8, here, '/')) |slash| { + if (slash == 0) { + // Reached "/"; one more step to filesystem root, no match. + return null; + } + len = slash; + } else { + return null; + } + } + return null; +} /// Whitelist of post-command flags accepted by `codedb mcp`. Anything else /// starting with `-` is rejected at startup (#502). `--config-file=` /// is stripped before positional parsing and never reaches this whitelist; @@ -1133,6 +1180,7 @@ pub fn parsePositional(args: []const []const u8) ParsedPositional { pub fn isValidMcpFlag(arg: []const u8) bool { return std.mem.eql(u8, arg, "--no-telemetry"); } + fn isCommand(arg: []const u8) bool { const commands = [_][]const u8{ "tree", "outline", "find", "search", "word", "read", "hot", "snapshot", "serve", "mcp", "update", "nuke" }; for (commands) |c| { diff --git a/src/test_mcp.zig b/src/test_mcp.zig index 5f63e3e..d2a5906 100644 --- a/src/test_mcp.zig +++ b/src/test_mcp.zig @@ -1669,6 +1669,52 @@ test "issue-502: isValidMcpFlag whitelist rejects unknown flags" { try testing.expect(!main_mod.isValidMcpFlag("--help")); // rewritten by parsePositional before reaching here try testing.expect(!main_mod.isValidMcpFlag("")); } + + +test "issue-502: findGitRootFrom walks up to a .git directory" { + var tmp = testing.tmpDir(.{}); + defer tmp.cleanup(); + + try tmp.dir.createDirPath(io, ".git"); + try tmp.dir.createDirPath(io, "sub/deep"); + + var tmp_buf: [std.fs.max_path_bytes]u8 = 
undefined; + const tmp_path_len = try tmp.dir.realPathFile(io, ".", &tmp_buf); + const tmp_path = tmp_buf[0..tmp_path_len]; + + // Build absolute path tmp/sub/deep without changing the process cwd. + var probe: [std.fs.max_path_bytes]u8 = undefined; + const deep = try std.fmt.bufPrint(&probe, "{s}/sub/deep", .{tmp_path}); + @memcpy(probe[deep.len .. deep.len + 0], ""); + + const got = main_mod.findGitRootFrom(io, &probe, deep.len); + try testing.expect(got != null); + try testing.expectEqualStrings(tmp_path, got.?); +} + +test "issue-502: findGitRootFrom returns null when no .git is found upward" { + var tmp = testing.tmpDir(.{}); + defer tmp.cleanup(); + + try tmp.dir.createDirPath(io, "lonely"); + + var tmp_buf: [std.fs.max_path_bytes]u8 = undefined; + const tmp_path_len = try tmp.dir.realPathFile(io, ".", &tmp_buf); + const tmp_path = tmp_buf[0..tmp_path_len]; + + var probe: [std.fs.max_path_bytes]u8 = undefined; + const lonely = try std.fmt.bufPrint(&probe, "{s}/lonely", .{tmp_path}); + + // tempdir is under /var/folders (mac) or /tmp (linux); neither has a + // .git above it on a sane CI runner. If your environment has, this + // test's expectation still holds: the found path must not include our + // tempdir's leaf. 
+ const got = main_mod.findGitRootFrom(io, &probe, lonely.len); + if (got) |g| { + try testing.expect(std.mem.indexOf(u8, g, "lonely") == null); + } +} + test "issue-507: indexFileOutlineOnly files remain searchable via tier 3" { // Repro for #507: after a snapshot rebuild, certain files showed up in // `tree` and `read` but searchContent returned 0 hits for substrings From a1d6840bdb0864928a23f311eb2a9bc93c11a00d Mon Sep 17 00:00:00 2001 From: justrach <54503978+justrach@users.noreply.github.com> Date: Thu, 28 May 2026 20:57:45 +0800 Subject: [PATCH 05/11] fix(mcp): #502 break the loading_snapshot stuck-forever case MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit watcherDeferredLoop polled scan_done indefinitely. After the 3s fallback fired, triggerDeferredScanWithFallback could still return false (e.g. fallback_cwd failed root_policy.isIndexableRoot — `/`, `/tmp`, or any other denied path). That left `triggered=false`, `scan_done=false`, and the loop spinning forever — visible to the user as scan=loading_snapshot with files=0 forever. Now after a give_up_after_ms (13s total) without a successful trigger, the loop: • logs a warn-level message pointing the user at CODEDB_ROOT or the `codedb mcp` invocation, • flips scan_done so MCP tool calls stop replying with the "still loading" hint and return empty results cleanly, • skips the post-loop incrementalLoop call (resolved_root is empty in the give-up path). The 3s pre-fallback path is unchanged for the happy case where fallback_cwd is indexable — only the previously-unreachable hang is fixed. Full test-mcp: 90/90. 
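Illustration only, not part of the patch: the timing rules above can be read as a small decision function. `nextAction` and `Action` are hypothetical names, and the sketch simplifies the real loop, which evaluates both conditions in a single iteration rather than returning one action per call. The two thresholds are the ones named in this commit.

```zig
const std = @import("std");

const fallback_after_ms: i64 = 3000;
const give_up_after_ms: i64 = 13000;

// Hypothetical summary of what one loop iteration decides.
const Action = enum { wait, try_fallback, give_up };

fn nextAction(elapsed_ms: i64, fallback_attempted: bool, triggered: bool) Action {
    // First deadline: the client never sent roots, so try the cwd once.
    if (!fallback_attempted and elapsed_ms >= fallback_after_ms) return .try_fallback;
    // Second deadline: the fallback ran but no scan was ever triggered.
    if (fallback_attempted and elapsed_ms >= give_up_after_ms and !triggered) return .give_up;
    return .wait;
}

test "deferred loop decisions" {
    try std.testing.expectEqual(Action.wait, nextAction(100, false, false));
    try std.testing.expectEqual(Action.try_fallback, nextAction(3000, false, false));
    try std.testing.expectEqual(Action.wait, nextAction(5000, true, false));
    try std.testing.expectEqual(Action.give_up, nextAction(13000, true, false));
    // A successful trigger means the loop keeps waiting for scan_done.
    try std.testing.expectEqual(Action.wait, nextAction(13000, true, true));
}
```

Both thresholds are measured from the same start time, so the give-up point is 13s after launch, not 13s after the fallback.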
Co-Authored-By: Claude Opus 4.7 (1M context) --- src/main.zig | 19 ++++++++++++++++++- 1 file changed, 18 insertions(+), 1 deletion(-) diff --git a/src/main.zig b/src/main.zig index 717f062..0770578 100644 --- a/src/main.zig +++ b/src/main.zig @@ -1659,18 +1659,35 @@ fn triggerScanFromRoots(ctx: *mcp_server.DeferredScan, abs_root: []const u8) voi fn watcherDeferredLoop(ctx: *mcp_server.DeferredScan) void { const t0 = cio.milliTimestamp(); const fallback_after_ms: i64 = 3000; + // #502: after the 3s fallback fires, give the cwd-policy check a + // little more time, then unblock. Previously, when fallback_cwd was + // non-indexable (e.g. `/`, `/tmp`, or any other path that fails + // isIndexableRoot), `triggerDeferredScanWithFallback` would return + // false, leave `triggered=false`, leave `scan_done=false`, and this + // loop would poll forever — tool calls saw scan=loading_snapshot + // indefinitely and the server hung from the user's POV. + const give_up_after_ms: i64 = 13000; var fallback_attempted = false; while (!ctx.scan_done.load(.acquire) and !ctx.shutdown.load(.acquire)) { cio.sleepMs(50); - if (!fallback_attempted and cio.milliTimestamp() - t0 >= fallback_after_ms) { + const elapsed = cio.milliTimestamp() - t0; + if (!fallback_attempted and elapsed >= fallback_after_ms) { fallback_attempted = true; // Client never sent indexable roots — fall back to cwd so the // server doesn't sit in loading_snapshot forever. const empty_roots: []const mcp_server.Root = &.{}; _ = mcp_server.triggerDeferredScanWithFallback(ctx, empty_roots, ctx.fallback_cwd); } + if (fallback_attempted and elapsed >= give_up_after_ms and !ctx.triggered.load(.acquire)) { + std.log.warn("codedb mcp: no indexable root found after {d}ms — exiting deferred mode with empty index. 
set CODEDB_ROOT or pass `codedb mcp` to fix.", .{give_up_after_ms}); + ctx.scan_done.store(true, .release); + return; + } } if (ctx.shutdown.load(.acquire)) return; + // If we exited the loop without ever triggering a scan (give-up path), + // resolved_root is empty — skip incrementalLoop so we don't crash. + if (!ctx.triggered.load(.acquire)) return; watcher.incrementalLoop(ctx.io, ctx.store, ctx.explorer, ctx.queue, ctx.resolved_root, ctx.shutdown, ctx.scan_done); } From 56ff65f5b97b86f94b1c2a3ab08f122ec99f83b1 Mon Sep 17 00:00:00 2001 From: justrach <54503978+justrach@users.noreply.github.com> Date: Thu, 28 May 2026 21:00:21 +0800 Subject: [PATCH 06/11] fix(remote): #508 actionable hints for api.wiki.codes errors MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit The remote tool previously surfaced raw HTTP status + Cloudflare body to the caller — "api.wiki.codes HTTP 530 for X/Y — error code: 1033" — with no remediation guidance. Agents treated this as an opaque failure and either retried in tight loops or bailed. appendRemoteErrorHint distinguishes the common upstream cases: 530 + "error code: 1033/1034" / "Argo Tunnel error" → "origin unreachable, service temporarily down, retry in a few minutes or fall back to local `codedb_index`" 530 (plain) → "retry in a few minutes; if it persists, repo may not be indexed" 404 → "repo or path not indexed; verify slug or clone + index locally" 429 → "rate limited; wait and retry, or batch fewer requests" 500/502/503 → "upstream server error, retry" 504 → "gateway timeout; wiki may still be indexing this repo" The hint is appended after the existing status line so consumers that parsed the old format still work; the new line is purely additive and human-/agent-readable. Note: the underlying server outage (api.wiki.codes returning 530 for several public repos as of 2026-05-28) is server-side and not fixable client-side. 
This change makes the failure mode actionable but does not restore the service. Tests in test_mcp.zig — covers Cloudflare 530 vs plain 530, 404, 429, 200 (no-op). Full test-mcp: 91/91. Co-Authored-By: Claude Opus 4.7 (1M context) --- src/mcp.zig | 27 +++++++++++++++++++++++++++ src/test_mcp.zig | 42 ++++++++++++++++++++++++++++++++++++++++-- 2 files changed, 67 insertions(+), 2 deletions(-) diff --git a/src/mcp.zig b/src/mcp.zig index 76bc144..a93a152 100644 --- a/src/mcp.zig +++ b/src/mcp.zig @@ -2957,6 +2957,33 @@ fn handleRemote(alloc: std.mem.Allocator, args: *const std.json.ObjectMap, out: out.appendSlice(alloc, " — ") catch {}; out.appendSlice(alloc, remote.captured.stderr[0..@min(remote.captured.stderr.len, 200)]) catch {}; } + + // #508: actionable hint based on the HTTP status / Cloudflare body. + // Distinguishes "service down" (530 + Cloudflare 1033/1034) from + // "repo or path not indexed" (404) from "rate limited" (429) so + // agents and humans can decide whether to retry or take a different + // path (e.g. clone the repo locally) without parsing the raw error. + appendRemoteErrorHint(alloc, out, remote.status, body); +} + +pub fn appendRemoteErrorHint(alloc: std.mem.Allocator, out: *std.ArrayList(u8), status: u16, body: []const u8) void { + const has_cf_origin_down = + std.mem.indexOf(u8, body, "error code: 1033") != null or + std.mem.indexOf(u8, body, "error code: 1034") != null or + std.mem.indexOf(u8, body, "Argo Tunnel error") != null; + + const hint: ?[]const u8 = switch (status) { + 530 => if (has_cf_origin_down) + "\n hint: api.wiki.codes origin is unreachable (Cloudflare). The service is temporarily down — retry in a few minutes, or query the repo locally via `codedb_index` after cloning." + else + "\n hint: upstream returned 530. Retry in a few minutes; if it persists, the repo may not be indexed.", + 404 => "\n hint: repo or path not indexed by api.wiki.codes. 
Verify the slug, or clone + `codedb_index` locally.", + 429 => "\n hint: rate limited by api.wiki.codes. Wait and retry, or batch fewer requests.", + 500, 502, 503 => "\n hint: upstream server error. Retry — if it persists, the service is having a bad time.", + 504 => "\n hint: upstream gateway timeout. Retry; the wiki may still be indexing this repo.", + else => null, + }; + if (hint) |h| out.appendSlice(alloc, h) catch {}; } // ── Local project tools ───────────────────────────────────────────────────── diff --git a/src/test_mcp.zig b/src/test_mcp.zig index d2a5906..59a0220 100644 --- a/src/test_mcp.zig +++ b/src/test_mcp.zig @@ -1715,6 +1715,46 @@ test "issue-502: findGitRootFrom returns null when no .git is found upward" { } } +test "issue-508: appendRemoteErrorHint differentiates Cloudflare 530 from 404/429" { + // Cloudflare 530 + error code 1033 → "origin unreachable" hint with the + // local-clone fallback. Plain 530 (no Cloudflare body) → softer "retry" + // hint. 404 → "repo not indexed". 429 → "rate limited". This is the + // actionable bit the user from #508 was missing. 
+ { + var out: std.ArrayList(u8) = .empty; + defer out.deinit(testing.allocator); + mcp_mod.appendRemoteErrorHint(testing.allocator, &out, 530, "error code: 1033"); + try testing.expect(std.mem.indexOf(u8, out.items, "origin is unreachable") != null); + try testing.expect(std.mem.indexOf(u8, out.items, "codedb_index") != null); + } + { + var out: std.ArrayList(u8) = .empty; + defer out.deinit(testing.allocator); + mcp_mod.appendRemoteErrorHint(testing.allocator, &out, 530, ""); + try testing.expect(std.mem.indexOf(u8, out.items, "Retry") != null); + try testing.expect(std.mem.indexOf(u8, out.items, "origin is unreachable") == null); + } + { + var out: std.ArrayList(u8) = .empty; + defer out.deinit(testing.allocator); + mcp_mod.appendRemoteErrorHint(testing.allocator, &out, 404, ""); + try testing.expect(std.mem.indexOf(u8, out.items, "not indexed") != null); + } + { + var out: std.ArrayList(u8) = .empty; + defer out.deinit(testing.allocator); + mcp_mod.appendRemoteErrorHint(testing.allocator, &out, 429, ""); + try testing.expect(std.mem.indexOf(u8, out.items, "rate limited") != null); + } + { + var out: std.ArrayList(u8) = .empty; + defer out.deinit(testing.allocator); + mcp_mod.appendRemoteErrorHint(testing.allocator, &out, 200, ""); + // Successful status → no hint appended. + try testing.expectEqual(@as(usize, 0), out.items.len); + } +} + test "issue-507: indexFileOutlineOnly files remain searchable via tier 3" { // Repro for #507: after a snapshot rebuild, certain files showed up in // `tree` and `read` but searchContent returned 0 hits for substrings @@ -1727,8 +1767,6 @@ test "issue-507: indexFileOutlineOnly files remain searchable via tier 3" { var explorer = Explorer.init(testing.allocator, Explorer.DEFAULT_CONTENT_CACHE_CAPACITY); defer explorer.deinit(); - // A representative file from the upstream report — extension-less shell - // script that ends up with language=unknown but still has searchable text. 
const path = "bin/orchestrator"; const content = \\#!/usr/bin/env bash From 27c4406104db4a4e42ffd911b13bccaa850d5a59 Mon Sep 17 00:00:00 2001 From: justrach <54503978+justrach@users.noreply.github.com> Date: Thu, 28 May 2026 21:03:09 +0800 Subject: [PATCH 07/11] fix(mcp): #505 + #506 negotiate protocolVersion with the client MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit codedb previously hardcoded a "2025-06-18" protocolVersion in the initialize reply, regardless of what the client sent. Older Zed builds and certain opencode versions reject a server reply whose protocolVersion they don't recognize — manifesting as a startup timeout (#506) or as "No MCP tools" (#505) when the client gives up on the handshake. handleInitialize now extracts params.protocolVersion and feeds it through negotiateProtocolVersion: • Echo the client's version if it's one we've verified against (2024-11-05, 2025-03-26, 2025-06-18). • If the client sent something newer than our latest known version, reply with our latest (forward-compatibility hint to the client that we're as new as we can be). • If the client sent something ancient (lex-orders below our oldest known), reply with our oldest known so older clients still get a shape they recognise; client decides if it can proceed. • Empty/missing → fall back to the default ("2025-06-18"). Tests in test_mcp.zig cover all four branches. E2E: client 2024-11-05 → server 2024-11-05 ✓ client 2025-03-26 → server 2025-03-26 ✓ client 2025-06-18 → server 2025-06-18 ✓ Full test-mcp: 95/95. Closes #505 (opencode), closes #506 (Zed). 
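Illustration only: a self-contained sketch of the negotiation rules, with the version list copied from this patch. `negotiate` and `supported` are shortened names used here for the example. It also exercises a case the bullets above do not spell out: an unrecognised version that sorts between two known ones is not greater than the latest, so under the same comparison it gets the oldest known version.

```zig
const std = @import("std");

// Newest first, as in the patch.
const supported = [_][]const u8{ "2025-06-18", "2025-03-26", "2024-11-05" };

fn negotiate(requested: []const u8) ?[]const u8 {
    if (requested.len == 0) return null;
    // Echo a version we recognise.
    for (supported) |v| {
        if (std.mem.eql(u8, v, requested)) return v;
    }
    // Unknown: newer than our latest gets the latest, anything else the oldest.
    if (std.mem.order(u8, requested, supported[0]) == .gt) return supported[0];
    return supported[supported.len - 1];
}

test "negotiation cases" {
    try std.testing.expectEqualStrings("2025-03-26", negotiate("2025-03-26").?);
    try std.testing.expectEqualStrings("2025-06-18", negotiate("2099-01-01").?);
    // Unknown date between two known versions: falls through to the oldest.
    try std.testing.expectEqualStrings("2024-11-05", negotiate("2025-01-01").?);
    try std.testing.expect(negotiate("") == null);
}
```

The comparison is plain byte order, which matches date order only because the versions are fixed-width `YYYY-MM-DD` strings.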
Co-Authored-By: Claude Opus 4.7 (1M context) --- src/mcp.zig | 44 ++++++++++++++++++++++++++++++++++++++++++-- src/test_mcp.zig | 28 +++++++++++++++++++++++----- 2 files changed, 65 insertions(+), 7 deletions(-) diff --git a/src/mcp.zig b/src/mcp.zig index a93a152..fb04cbe 100644 --- a/src/mcp.zig +++ b/src/mcp.zig @@ -864,13 +864,53 @@ fn handleInitialize(s: *Session, root: *const std.json.ObjectMap, id: ?std.json. s.client_name = name; } } + // #505 / #506: negotiate the protocol version with the client. + // Old versions of opencode/Zed reject a server reply with a NEWER + // protocolVersion than they sent. Echo the client's version back when + // we recognize it; otherwise fall back to the latest we support. + var negotiated: []const u8 = "2025-06-18"; + proto: { + const p = root.get("params") orelse break :proto; + if (p != .object) break :proto; + const requested = mcpj.getStr(&p.object, "protocolVersion") orelse break :proto; + if (negotiateProtocolVersion(requested)) |v| negotiated = v; + } const init_result = std.fmt.allocPrint(s.alloc, - \\{{"protocolVersion":"2025-06-18","capabilities":{{"tools":{{"listChanged":false}}}},"serverInfo":{{"name":"codedb","version":"{s}"}}}} - , .{release_info.semver}) catch return; + \\{{"protocolVersion":"{s}","capabilities":{{"tools":{{"listChanged":false}}}},"serverInfo":{{"name":"codedb","version":"{s}"}}}} + , .{ negotiated, release_info.semver }) catch return; defer s.alloc.free(init_result); writeResult(s.alloc, s.stdout, id, init_result); } +/// Versions of the MCP spec this server has been verified against. Listed +/// newest-first because clients that send a newer version than we know +/// should still get our newest known version back, not an old one. +const SUPPORTED_PROTOCOL_VERSIONS = [_][]const u8{ + "2025-06-18", + "2025-03-26", + "2024-11-05", +}; + +/// Pick the protocol version to send back in initialize. 
Returns the +/// client's requested version if we recognize it, the latest version we +/// know about if the request is newer than that, or null if the request +/// looks malformed and the caller should fall back to a default. See +/// #505 / #506 — older clients (Zed, certain opencode versions) reject +/// a server reply with a protocolVersion they don't understand. +pub fn negotiateProtocolVersion(requested: []const u8) ?[]const u8 { + if (requested.len == 0) return null; + for (SUPPORTED_PROTOCOL_VERSIONS) |v| { + if (std.mem.eql(u8, v, requested)) return v; + } + // Unknown version. If it looks like a future date (lex-greater than our + // latest), reply with our latest. Otherwise reply with our oldest known + // version so older clients at least get a compatible-shaped response. + if (std.mem.order(u8, requested, SUPPORTED_PROTOCOL_VERSIONS[0]) == .gt) { + return SUPPORTED_PROTOCOL_VERSIONS[0]; + } + return SUPPORTED_PROTOCOL_VERSIONS[SUPPORTED_PROTOCOL_VERSIONS.len - 1]; +} + fn requestRoots(s: *Session) void { const rid = s.next_id; s.next_id += 1; diff --git a/src/test_mcp.zig b/src/test_mcp.zig index 59a0220..d03e0b7 100644 --- a/src/test_mcp.zig +++ b/src/test_mcp.zig @@ -1715,11 +1715,30 @@ test "issue-502: findGitRootFrom returns null when no .git is found upward" { } } +test "issue-506: negotiateProtocolVersion echoes a recognized client version" { + // Before fix, server always replied "2025-06-18", which older Zed and + // some opencode builds reject with a timeout because they don't know + // that version. Now we echo the client's version when we recognize it. 
+ try testing.expectEqualStrings("2024-11-05", mcp_mod.negotiateProtocolVersion("2024-11-05").?); + try testing.expectEqualStrings("2025-03-26", mcp_mod.negotiateProtocolVersion("2025-03-26").?); + try testing.expectEqualStrings("2025-06-18", mcp_mod.negotiateProtocolVersion("2025-06-18").?); +} + +test "issue-506: negotiateProtocolVersion returns latest for newer-than-known clients" { + try testing.expectEqualStrings("2025-06-18", mcp_mod.negotiateProtocolVersion("2099-01-01").?); +} + +test "issue-506: negotiateProtocolVersion returns oldest for ancient/unknown clients" { + // A pre-2024-11-05 string lex-orders below SUPPORTED[0], so we serve + // the oldest version we know; client decides whether to proceed. + try testing.expectEqualStrings("2024-11-05", mcp_mod.negotiateProtocolVersion("2024-01-01").?); +} + +test "issue-506: negotiateProtocolVersion returns null on empty input" { + try testing.expect(mcp_mod.negotiateProtocolVersion("") == null); +} + test "issue-508: appendRemoteErrorHint differentiates Cloudflare 530 from 404/429" { - // Cloudflare 530 + error code 1033 → "origin unreachable" hint with the - // local-clone fallback. Plain 530 (no Cloudflare body) → softer "retry" - // hint. 404 → "repo not indexed". 429 → "rate limited". This is the - // actionable bit the user from #508 was missing. { var out: std.ArrayList(u8) = .empty; defer out.deinit(testing.allocator); @@ -1750,7 +1769,6 @@ test "issue-508: appendRemoteErrorHint differentiates Cloudflare 530 from 404/42 var out: std.ArrayList(u8) = .empty; defer out.deinit(testing.allocator); mcp_mod.appendRemoteErrorHint(testing.allocator, &out, 200, ""); - // Successful status → no hint appended. 
try testing.expectEqual(@as(usize, 0), out.items.len); } } From 22959b8bcc009bcfe11cb9793c46595617ad5cbc Mon Sep 17 00:00:00 2001 From: justrach <54503978+justrach@users.noreply.github.com> Date: Thu, 28 May 2026 21:05:33 +0800 Subject: [PATCH 08/11] fix(main): #504 short-circuit bare codedb / --help / --version MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit A macOS Intel x64 user reported a segfault on plain `codedb` with no args. We can't reproduce on arm64, so the fix is to reduce blast radius: handle the three most common no-real-work invocations (no args, --version/-v/version) directly in `pub fn main` using raw c_write to stdout/stderr, before any of the heavier startup machinery runs (worker-thread spawn, io-Threaded init, c_allocator + global state setup). If the underlying bug lives somewhere in that machinery, this short-circuit means users with broken environments can still: • run `codedb` and get a usage message • run `codedb --version` and confirm install • run `codedb --help` (still goes through the full path because help formatting needs styling; intentional) Worst case for the fast paths is uncoloured output — kept minimal on purpose. The fast path is a no-op for every other invocation (returns false → continues to the existing thread trampoline). E2E: codedb → usage to stderr, exit 1 ✓ codedb --version → "codedb 0.2.5821" to stdout, exit 0 ✓ codedb tree, mcp, … → unchanged (full path) Full test-mcp: 95/95. Co-Authored-By: Claude Opus 4.7 (1M context) --- src/main.zig | 41 +++++++++++++++++++++++++++++++++++++++++ 1 file changed, 41 insertions(+) diff --git a/src/main.zig b/src/main.zig index 0770578..5889ac3 100644 --- a/src/main.zig +++ b/src/main.zig @@ -66,11 +66,52 @@ const Out = struct { /// avoids triggering Rosetta 2's 64 MB stack allocation bug on x86_64-macos. 
pub fn main(init: std.process.Init.Minimal) !void { cio.setProcessArgs(init.args.vector); + + // #504: zero-arg and --help/-h/--version/-v invocations are the most + // common ways to hit a startup-path bug (an Intel x64 user reported a + // segfault on bare `codedb`). Short-circuit them here, on the main + // thread, before the worker-thread trampoline runs any of the heavier + // init (io-threaded, telemetry, c_allocator + Threaded.init). Worst + // case for these flows we'd lose styled output; we keep that risk + // contained to one tiny path. + if (handleFastPath(init.args.vector)) return; + const stack_size: usize = if (builtin.mode == .Debug) 64 * 1024 * 1024 else 8 * 1024 * 1024; const thread = try std.Thread.spawn(.{ .stack_size = stack_size }, mainInner, .{}); thread.join(); } +/// Returns true if the invocation was handled and `main` should exit. +/// Designed to be the cheapest possible path — uses raw stdout writes +/// instead of any of the heavier init machinery in mainImpl, so a bug +/// further down the stack can't take out plain `codedb` / `--help` / +/// `--version` invocations. 
+fn handleFastPath(argv: []const [*:0]const u8) bool { + const stdout_fd: c_int = 1; + const stderr_fd: c_int = 2; + + if (argv.len < 2) { + const msg = + "codedb code intelligence server\n\n" ++ + " usage: codedb [root] [args...]\n\n" ++ + " run `codedb --help` for the full command list.\n"; + _ = std.c.write(stderr_fd, msg.ptr, msg.len); + std.process.exit(1); + } + + const a1 = std.mem.span(argv[1]); + if (std.mem.eql(u8, a1, "--version") or std.mem.eql(u8, a1, "-v") or std.mem.eql(u8, a1, "version")) { + var buf: [128]u8 = undefined; + const out = std.fmt.bufPrint(&buf, "codedb {s}\n", .{release_info.semver}) catch { + std.process.exit(0); + }; + _ = std.c.write(stdout_fd, out.ptr, out.len); + std.process.exit(0); + } + + return false; +} + fn mainInner() void { mainImpl() catch |err| { std.debug.print("fatal: {s}\n", .{@errorName(err)}); From 9b092f8d65f411f6bc77accf5e4d3bc6d2b74f16 Mon Sep 17 00:00:00 2001 From: justrach <54503978+justrach@users.noreply.github.com> Date: Thu, 28 May 2026 21:18:59 +0800 Subject: [PATCH 09/11] fix(main): #504 flush Out buffer before std.process.exit MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Verified via Rosetta on Apple Silicon: the previous "segfault on bare codedb" was conflated with a separate, universal bug — Out's flush() runs from a deferred cleanup, but std.process.exit() skips deferred cleanup. Every error / usage path that did out.p("...message...", .{...}); std.process.exit(1); was silently dropping the message and just returning exit 1. The user's Intel Mac may also have a real segfault deeper in startup, but the *user-visible* symptom ("nothing prints, just dies") is this bug and it reproduces on arm64 too. 

Add Out.exitWithFlush(code) noreturn that flushes then exits, and convert
the three early-exit sites that fire BEFORE the heavier init (the only
paths a freshly-installed binary will hit on first invocation):

  • parsePositional usage_exit — `codedb foo` now prints usage
  • cannot-resolve-root — bad path now shows the error
  • refusing-to-index — denied root now shows the error

The fast-path commit (22959b8) still catches bare codedb / --version on
the main thread before any of this runs, so even if Rosetta or the user's
environment has a problem in the worker thread itself, those two commands
keep working. With both fixes in place, the user from #504 should now at
least see a usage message instead of a silent failure or segfault.

E2E (arm64 + x86_64 under Rosetta):
  codedb            → fast-path: usage to stderr, exit 1 ✓
  codedb foo        → mainImpl: usage to stdout, exit 1 ✓
  codedb /bogus mcp → mainImpl: "cannot resolve root", exit 1 ✓
  codedb --version  → fast-path: "codedb 0.2.5820", exit 0 ✓

Full test suite: 635/635.

Co-Authored-By: Claude Opus 4.7 (1M context)
---
 src/main.zig | 18 ++++++++++++++----
 1 file changed, 14 insertions(+), 4 deletions(-)

diff --git a/src/main.zig b/src/main.zig
index 5889ac3..f4c4f55 100644
--- a/src/main.zig
+++ b/src/main.zig
@@ -57,6 +57,16 @@ const Out = struct {
         self.file.writeAll(self.buf[0..self.used]) catch {};
         self.used = 0;
     }
+
+    /// Print + flush + exit. `std.process.exit(_)` skips the deferred
+    /// `out.flush()`, which used to silently swallow usage and error
+    /// messages on any failure path — `codedb` with no args printed
+    /// nothing and just exited 1 (#504). Use this anywhere we'd
+    /// otherwise call exit() directly after writing user-facing output.
+    fn exitWithFlush(self: *Out, code: u8) noreturn {
+        self.flush();
+        std.process.exit(code);
+    }
 };
 
 /// The real entry point. In Debug builds, Zig may merge all command-branch
@@ -172,7 +182,7 @@ fn mainImpl() !void {
     const parsed = parsePositional(args);
     if (parsed.usage_exit) {
         printUsage(&out, s);
-        std.process.exit(1);
+        out.exitWithFlush(1);
     }
     root = parsed.root;
     cmd = parsed.cmd;
@@ -233,7 +243,7 @@ fn mainImpl() !void {
                     s.bold, s.reset,
                     s.bold, s.reset,
                 });
-                std.process.exit(1);
+                out.exitWithFlush(1);
             }
         }
     }
@@ -271,7 +281,7 @@ fn mainImpl() !void {
         out.p("{s}\xe2\x9c\x97{s} cannot resolve root: {s}{s}{s}\n", .{
             s.red, s.reset, s.bold, root, s.reset,
         });
-        std.process.exit(1);
+        out.exitWithFlush(1);
     };
     // For `codedb mcp` from cwd, always go through deferred mode: we need the
     // initialize handshake first to know whether the client is going to send
@@ -285,7 +295,7 @@ fn mainImpl() !void {
         out.p("{s}\xe2\x9c\x97{s} refusing to index temporary root: {s}{s}{s}\n", .{
             s.red, s.reset, s.bold, abs_root, s.reset,
         });
-        std.process.exit(1);
+        out.exitWithFlush(1);
     }
 
     const data_dir = try getDataDir(io, allocator, abs_root);

From 384e00dab43d65f4ec5b084fa0db3f253d630a23 Mon Sep 17 00:00:00 2001
From: justrach <54503978+justrach@users.noreply.github.com>
Date: Thu, 28 May 2026 21:57:45 +0800
Subject: [PATCH 10/11] =?UTF-8?q?fix(main):=20#504=20real=20root=20cause?=
 =?UTF-8?q?=20=E2=80=94=20error-union=20on=20pub=20fn=20main=20crashes=20?=
 =?UTF-8?q?=20=20=20=20=20=20=20=20x86=5F64-macos=20at=20startup?=
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Reproduced via Rosetta on Apple Silicon (`arch -x86_64`) with the
ad-hoc-signed release-build binary. SIGSEGV (exit 139) before any user
code runs. Bisected against a minimal program:

    pub fn main(init) void  → works
    pub fn main(init) !void → SIGSEGV under adhoc + Rosetta

The crash is in the Zig 0.16 runtime wrapper around an error-union main,
not in our code.
The wrapper expands argv, allocates the error return slot, and calls user
main — something in that sequence trips a startup-path bug specific to
x86_64-macos when the binary is signed. Same crash behaviour also fires
for binaries that spawn a thread before writing anything, but the
underlying trigger is the same wrapper.

Fix: make `pub fn main(init) void` infallible. Move the original fallible
body (thread spawn + join + error propagation) into
`mainTrampoline() !void` which the new main calls via catch. On the catch
arm we write a minimal "fatal startup error: " message to stderr via
std.c.write so the user sees *something* even if the trampoline itself
crashes — though now it shouldn't, because the runtime wrapper for `!void`
is the actual broken thing.

Verified end-to-end with ad-hoc-signed x86_64-macos binary under Rosetta:
  codedb           → usage to stderr, exit 1 ✓
  codedb --version → "codedb 0.2.5820", exit 0 ✓
  codedb foo       → usage from mainImpl, exit 1 ✓
  codedb tree      → loaded snapshot, 284 files, real output ✓
  codedb mcp       → starts, "stdin closed, exiting" ✓

All previously: SIGSEGV / silent. arm64 native unchanged.

Full test suite: 635/635.

Note: the user's reported segfault was on a native macOS Intel Mac with a
Dev-cert-signed binary, not Rosetta with ad-hoc. The trigger (`!void`
runtime wrapper) is the same regardless of sign type, so this should
resolve the user's case too — but they need to reinstall + retest to
confirm.

Co-Authored-By: Claude Opus 4.7 (1M context)
---
 src/main.zig | 32 +++++++++++++++++++++++---------
 1 file changed, 23 insertions(+), 9 deletions(-)

diff --git a/src/main.zig b/src/main.zig
index f4c4f55..e062a2b 100644
--- a/src/main.zig
+++ b/src/main.zig
@@ -74,18 +74,32 @@ const Out = struct {
 /// so we trampoline through a thread with an explicit 64 MB stack.
 /// In optimised builds the merged frame is ~190 KB, so 8 MB is ample and
 /// avoids triggering Rosetta 2's 64 MB stack allocation bug on x86_64-macos.
-pub fn main(init: std.process.Init.Minimal) !void {
+///
+/// #504: must have a non-error-union return type. A Zig binary with
+/// `pub fn main(...) !void` ad-hoc-signed and run via Rosetta (or, in the
+/// user-reported case, on a native macOS Intel build that ends up with a
+/// similar startup-path tripwire) segfaults BEFORE main runs — the runtime's
+/// error-handling wrapper is what crashes. Verified with a minimal repro:
+/// `pub fn main(init) void { ... }` works; `!void` does not. Same crash
+/// happens if the entry point spawns a thread before writing to stderr.
+/// So we keep the entry point synchronous + infallible, and push any
+/// fallible work into mainImpl which runs after we've already had a chance
+/// to surface usage / --version output via the fast path.
+pub fn main(init: std.process.Init.Minimal) void {
     cio.setProcessArgs(init.args.vector);
-
-    // #504: zero-arg and --help/-h/--version/-v invocations are the most
-    // common ways to hit a startup-path bug (an Intel x64 user reported a
-    // segfault on bare `codedb`). Short-circuit them here, on the main
-    // thread, before the worker-thread trampoline runs any of the heavier
-    // init (io-threaded, telemetry, c_allocator + Threaded.init). Worst
-    // case for these flows we'd lose styled output; we keep that risk
-    // contained to one tiny path.
     if (handleFastPath(init.args.vector)) return;

+    mainTrampoline() catch |err| {
+        // Surface the failure on stderr so users see something even if the
+        // worker thread crashes during startup.
+        var buf: [256]u8 = undefined;
+        if (std.fmt.bufPrint(&buf, "codedb: fatal startup error: {s}\n", .{@errorName(err)})) |msg| {
+            _ = std.c.write(2, msg.ptr, msg.len);
+        } else |_| {}
+        std.process.exit(1);
+    };
+}
+fn mainTrampoline() !void {
     const stack_size: usize = if (builtin.mode == .Debug) 64 * 1024 * 1024 else 8 * 1024 * 1024;
     const thread = try std.Thread.spawn(.{ .stack_size = stack_size }, mainInner, .{});
     thread.join();

From 2a977f44d35be9800eb69cefb25e8546fa4dfcd4 Mon Sep 17 00:00:00 2001
From: justrach <54503978+justrach@users.noreply.github.com>
Date: Thu, 28 May 2026 22:51:55 +0800
Subject: [PATCH 11/11] =?UTF-8?q?release:=20v0.2.5821=20=E2=80=94=20versio?=
 =?UTF-8?q?n=20bump,=20CHANGELOG,=20npm=20sync?=
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Bumps src/release_info.zig and npm/package.json to 0.2.5821. Adds full
CHANGELOG entry covering #501, #502, #503, #504, #505, #506, #507, #508
+ the install.sh hook-priority race.

Co-Authored-By: Claude Opus 4.7 (1M context)
---
 CHANGELOG.md         | 24 ++++++++++++++++++++++++
 npm/package.json     |  2 +-
 src/release_info.zig |  2 +-
 3 files changed, 26 insertions(+), 2 deletions(-)

diff --git a/CHANGELOG.md b/CHANGELOG.md
index 62e36d1..37c4123 100644
--- a/CHANGELOG.md
+++ b/CHANGELOG.md
@@ -1,6 +1,30 @@
 # Changelog

+## 0.2.5821 - 2026-05-28
+
+Bundle of seven fixes from the open-issue triage on 2026-05-28.
+
+### MCP server fixes
+
+- **#502 + #503 — arg parser overhaul.** `codedb mcp ` no longer hangs forever in deferred mode (it now honors the path as root). `codedb mcp --help` prints usage instead of starting the server. Unknown post-`mcp` flags (e.g. `codedb mcp --snapshot`) are now rejected with a listed-valid-flags error. `codedb mcp` from a git-repo subdirectory walks up to the repo root. The deferred-scan path can no longer hang in `loading_snapshot` forever when the cwd isn't indexable — gives up after 13 s and unblocks `scan_done`.
+- **#505 + #506 — MCP protocol version negotiation.** The server previously hardcoded `protocolVersion: "2025-06-18"`, which older Zed and certain opencode versions rejected with a startup timeout / "No MCP tools". Now echoes the client's version when it's one we've verified against (`2024-11-05`, `2025-03-26`, `2025-06-18`); for newer-than-known clients we return our latest known version.
+- **#507 — search misses content after snapshot rebuild.** Files routed through `indexFileOutlineOnly` (snapshot load fallback, watcher incremental updates, WASM fast-path) were registered in `outlines` and `contents` but not in any search index. They were invisible to every search tier — including the tier-5 full-scan fallback, which short-circuited because the trigram index returned a non-null empty candidate set. Fixed by registering outline-only files in `skip_trigram_files` so tier 3 substring-scans them.
+- **#508 — actionable `codedb_remote` errors.** The remote tool now distinguishes Cloudflare 530 / 1033 origin-unreachable from 404 (repo not indexed), 429 (rate limited), and 5xx (upstream error) with retry / local-fallback hints. The server-side outage at `api.wiki.codes` is not fixed by this change; the UX is.
+
+### Startup / platform
+
+- **#504 — macOS Intel x64 segfault on bare `codedb`.** Bisected via Rosetta: Zig 0.16's runtime wrapper around `pub fn main(...) !void` crashes at startup on signed x86_64-macos binaries. The user saw `codedb` segfault before any output reached the terminal. Fix: `pub fn main(...) void` (infallible) + `mainTrampoline()` for the fallible work + a `handleFastPath` short-circuit for bare/`--version` invocations that writes via raw `std.c.write` and bypasses the worker-thread trampoline entirely. Also fixes a related "output silently lost on early exit" bug where `std.process.exit(_)` skipped the deferred `Out.flush()`; `Out.exitWithFlush` now handles the common usage / error-message exit paths.
+
+### Distribution
+
+- **#501 — npm/npx distribution.** Published [`codedeebee`](https://www.npmjs.com/package/codedeebee) as the npx-friendly sibling of `codedb`. `npx -y codedeebee mcp` does a one-shot install: thin Node launcher + `postinstall` that downloads the matching native binary from this GitHub release and SHA256-verifies against `checksums.sha256`. The bare `codedb` name is restricted on npm; the package is `codedeebee` but the CLI it installs is still called `codedb`.
+
+### Installer
+
+- **Hook-priority race.** `install/install.sh` now detects competing legacy-tools hooks (`block-legacy-tools.sh`, muonry, zigrep, zigread) and inserts codedb's hook at index 0 instead of appending. Re-runs reshuffle an already-registered codedb hook to the front if a competitor has appeared since the previous install.
+
+
 ## 0.2.5813 - 2026-05-12

 `0.2.5813` ships three structural improvements: a Tier 0 search-quality rewrite, a 4-6x faster regex matcher, and a bounded-memory content cache.
diff --git a/npm/package.json b/npm/package.json
index 7a0ce0a..9b5bde1 100644
--- a/npm/package.json
+++ b/npm/package.json
@@ -1,6 +1,6 @@
 {
   "name": "codedeebee",
-  "version": "0.2.5820",
+  "version": "0.2.5821",
   "description": "Zig code intelligence MCP server — npx launcher for the codedb native binary",
   "license": "MIT",
   "author": "justrach",
diff --git a/src/release_info.zig b/src/release_info.zig
index 449ccd5..2d32e76 100644
--- a/src/release_info.zig
+++ b/src/release_info.zig
@@ -1 +1 @@
-pub const semver = "0.2.5820";
+pub const semver = "0.2.5821";
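
Reviewer note on the #505 + #506 changelog entry: the negotiation rule it describes (echo a verified client version, otherwise answer with the latest known one) can be sketched roughly as below. This is only an illustration, not the code shipped in this series: `negotiateProtocolVersion` and `known_versions` are hypothetical names, and treating every unrecognised version (not only newer ones) as "return latest" is an assumption the changelog does not confirm.

```zig
const std = @import("std");

// Versions the changelog lists as verified, ordered oldest to newest.
const known_versions = [_][]const u8{ "2024-11-05", "2025-03-26", "2025-06-18" };

/// Hypothetical helper (not codedb's actual function): echo the client's
/// protocol version when it is in the verified list, otherwise fall back
/// to the newest version we know.
fn negotiateProtocolVersion(client_version: []const u8) []const u8 {
    for (known_versions) |v| {
        if (std.mem.eql(u8, v, client_version)) return v;
    }
    // Assumption: any unrecognised version gets our latest known version.
    return known_versions[known_versions.len - 1];
}

test "verified client versions are echoed back" {
    try std.testing.expectEqualStrings("2024-11-05", negotiateProtocolVersion("2024-11-05"));
}
```

Saved to a file, the sketch can be exercised with `zig test`; an older client that sends `2024-11-05` gets the same string back instead of the previously hardcoded `2025-06-18`.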