Release v0.11.0 · GlitterKill/sdl-mcp

[0.11.0] - 2026-05-07

Performance

Pass-2 read amplification eliminated (A1). Each language resolver
previously did getFileByRepoPath per imported module + getSymbolsByFile
per resolved target file — ~30k point reads per refresh on a 1000-file TS
repo. New pass-level Pass2ImportCache (in src/indexer/pass2/types.ts)
populated once at dispatcher start with two batched reads
(getFilesByRepo + new getExportedSymbolsLiteByFileIds); resolvers do
O(1) map lookups instead. Threaded through resolveImportTargets (now
takes optional cache arg) and the cpp/c/shell helper indexes that wrap
it. Eliminates ~30k point-read round trips per pass-2.
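
The shape of such a cache can be sketched roughly as below. This is a minimal illustration, not the actual Pass2ImportCache: the interfaces, `buildImportCache`, and `resolveImportedSymbol` are hypothetical names, and the two inputs stand in for the results of the batched getFilesByRepo and getExportedSymbolsLiteByFileIds reads.

```typescript
// Hypothetical shapes for illustration; the real Pass2ImportCache may differ.
interface FileRow {
  fileId: string;
  relPath: string;
}

interface ExportedSymbolLite {
  symbolId: string;
  name: string;
}

interface ImportCacheSketch {
  fileByRelPath: Map<string, FileRow>;
  // fileId -> (exported name -> symbolId)
  exportsByFileId: Map<string, Map<string, string>>;
}

// Built once per pass from the two batched read results.
function buildImportCache(
  files: readonly FileRow[],
  exportedByFileId: ReadonlyMap<string, readonly ExportedSymbolLite[]>,
): ImportCacheSketch {
  const fileByRelPath = new Map<string, FileRow>();
  files.forEach((file) => fileByRelPath.set(file.relPath, file));

  const exportsByFileId = new Map<string, Map<string, string>>();
  exportedByFileId.forEach((symbols, fileId) => {
    const byName = new Map<string, string>();
    symbols.forEach((s) => byName.set(s.name, s.symbolId));
    exportsByFileId.set(fileId, byName);
  });
  return { fileByRelPath, exportsByFileId };
}

// Resolver-side lookup: map reads only, no database round trip.
function resolveImportedSymbol(
  cache: ImportCacheSketch,
  relPath: string,
  name: string,
): string | undefined {
  const file = cache.fileByRelPath.get(relPath);
  if (file === undefined) return undefined;
  return cache.exportsByFileId.get(file.fileId)?.get(name);
}
```
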
Pass-2 re-parse eliminated for TS files (C1). Pass-1 already
parses every file with tree-sitter; pass-2 was redundantly re-parsing on
the JS main thread. New Pass1ExtractionCache (Map<relPath, Pass1ExtractionEntry>) populated during pass-1 from both engines
(process-file.ts for the TS engine, rust-process-file.ts for the Rust
engine), gated on isTsCallResolutionFile + skipCallResolution. The TS
resolver in edge-builder/pass2.ts checks the cache first; on hit,
skips a full tree-sitter parse + three extract* calls. Other-language
resolvers retain inline re-parse because their resolvers consume the live
tree handle for scope walkers / call-scope indexes. Saves ~30-50% of pass-2
wall time on TS-heavy repos.
Pass-2 write coalescing per concurrency batch. New submitEdgeWrite
callback in Pass2ResolverContext lets the dispatcher decide between
immediate flush (sequential path, one withWriteConn per file) and
batched flush (parallel path, one withWriteConn per concurrency batch
with combined delete + insert). All 11 language resolvers refactored:
removed clearOutgoingCallEdges SQL helper, replaced with in-memory
clearLocalCallDedupKeys; replaced 2× withWriteConn per file with 1×
submitEdgeWrite callback. Cuts writeLimiter handshakes from
O(filesPerBatch) to 1.
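
The difference between the two flush strategies can be sketched as follows. This is a simplified, synchronous illustration: `createFakeDb`, `immediateSubmitter`, and `batchedSubmitter` are hypothetical, and the fake `withWriteConn` only counts write sessions, whereas the real helper is presumably asynchronous and also handles the delete step.

```typescript
interface EdgeWrite {
  fileId: string;
  edges: string[];
}

type SubmitEdgeWrite = (write: EdgeWrite) => void;

// Stand-in for the database layer: counts how many write sessions were opened.
function createFakeDb() {
  const state = { sessions: 0, persisted: [] as EdgeWrite[] };
  const withWriteConn = (fn: (sink: EdgeWrite[]) => void): void => {
    state.sessions += 1;
    fn(state.persisted);
  };
  return { state, withWriteConn };
}

type FakeDb = ReturnType<typeof createFakeDb>;

// Sequential path: every submit opens its own write session.
function immediateSubmitter(db: FakeDb): SubmitEdgeWrite {
  return (write) => db.withWriteConn((sink) => { sink.push(write); });
}

// Parallel path: submits are buffered, then flushed in one session per batch.
function batchedSubmitter(db: FakeDb): { submit: SubmitEdgeWrite; flush: () => void } {
  const pending: EdgeWrite[] = [];
  return {
    submit: (write) => { pending.push(write); },
    flush: () => {
      if (pending.length === 0) return;
      db.withWriteConn((sink) => { sink.push(...pending.splice(0)); });
    },
  };
}
```

Because resolvers only see the callback, the same resolver code works on either path; the dispatcher alone decides when a write session is opened.
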
Pass-2 concurrency raised on high-end tiers (F1). cpu-presets.ts:
extreme tier pass2Concurrency 6 → 12, high tier 3 → 8. Unblocked by
C1 — earlier raises were capped by the JS main-thread re-parse
bottleneck.
Embedding write coalescing on rebuild path. When dropVectorIndex
succeeds (no live HNSW to maintain), refreshSymbolEmbeddings accumulates
ONNX batch results into a 256-item buffer and flushes via a single
withWriteConn instead of one per ONNX batch. Cuts writeLimiter
handshakes ~8× on bulk rebuild. Buffer is force-flushed in finally
before the HNSW rebuild so unflushed vectors can't strand outside the
index.
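
A rough sketch of the buffering pattern, assuming a 256-item buffer fed by 32-item batches (which is where a roughly 8× reduction would come from). `createFlushBuffer` and `refreshSketch` are hypothetical names; `write` stands in for a single withWriteConn call and `rebuildIndex` for the HNSW rebuild.

```typescript
// Accumulates items and hands them to `write` in chunks of `capacity`.
function createFlushBuffer<T>(capacity: number, write: (batch: T[]) => void) {
  let items: T[] = [];
  const flush = (): void => {
    if (items.length === 0) return;
    const batch = items;
    items = [];
    write(batch);
  };
  const add = (batch: readonly T[]): void => {
    for (const item of batch) {
      items.push(item);
      if (items.length >= capacity) flush();
    }
  };
  return { add, flush };
}

// Hypothetical driver: each inner array stands in for one ONNX batch result.
function refreshSketch(
  batches: readonly number[][],
  write: (batch: number[]) => void,
  rebuildIndex: () => void,
): void {
  const buffer = createFlushBuffer<number>(256, write);
  try {
    for (const batch of batches) buffer.add(batch);
  } finally {
    // Force-flush the tail so no vectors are left outside the index.
    buffer.flush();
  }
  rebuildIndex();
}
```
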
BatchPersistAccumulator flush threshold raised 200 → 512. Better
fills the underlying UNWIND CHUNK=256 window in batched writes; halves
txn boundaries on pass-1 drain. Memory cost per accumulator ~+200KB.

Added

Codex SDL-first tool enforcement. sdl-mcp init --client codex --enforce-agent-tools now emits a broad PreToolUse hook that is gated
by the SDL-MCP PID file. When the server is running, the hook denies
repo-targeting native source reads/searches/edits, non-SDL MCP file/search
tools, and repo-local build/test/lint shell commands that should run through
SDL-MCP. When the PID file is absent, native tools remain available.
semantic.modelVariant config. Selects the ONNX file variant
(default/int8, fp16, fp32, plus nomic-only uint8/q4/
q4f16/bnb4) per embedding model. Each model declares supported
variants in src/indexer/model-registry.ts; unsupported requests fall
back to that model's defaultVariant with a warning. Lets users trade
speed for accuracy without a code change.
semantic.executionProviders config. Configurable ONNX Runtime
execution provider list. Platform allow-list filtered against the default
onnxruntime-node package: Windows x64 ["cpu", "dml", "webgpu"],
macOS ["cpu", "coreml"], Linux x64 ["cpu", "cuda", "tensorrt"],
Linux arm64 ["cpu"]. Unsupported entries dropped with a warning;
"cpu" auto-appended as final fallback. Enables AMD GPU acceleration
on Windows via DirectML, Apple Silicon ANE/GPU via CoreML, NVIDIA on
Linux via CUDA (system CUDA 12 + cuDNN required).
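
The filtering rules above might look roughly like this. It is a sketch only: the platform keys and `filterExecutionProviders` are hypothetical, and it assumes "cpu" is appended only when it is not already in the filtered list.

```typescript
// Allow-list as described above; the key names are illustrative.
const PROVIDER_ALLOW_LIST: Record<string, readonly string[]> = {
  "windows-x64": ["cpu", "dml", "webgpu"],
  "macos": ["cpu", "coreml"],
  "linux-x64": ["cpu", "cuda", "tensorrt"],
  "linux-arm64": ["cpu"],
};

function filterExecutionProviders(
  platformKey: string,
  requested: readonly string[],
): { providers: string[]; dropped: string[] } {
  // Unknown platforms fall back to CPU only (an assumption of this sketch).
  const allowed = PROVIDER_ALLOW_LIST[platformKey] ?? ["cpu"];
  const providers: string[] = [];
  const dropped: string[] = [];
  for (const p of requested) {
    if (allowed.indexOf(p) === -1) {
      dropped.push(p); // the caller would log a warning for these
    } else if (providers.indexOf(p) === -1) {
      providers.push(p);
    }
  }
  // "cpu" is always present as the final fallback.
  if (providers.indexOf("cpu") === -1) providers.push("cpu");
  return { providers, dropped };
}
```

For example, requesting `["cuda", "dml"]` on Windows x64 would keep `dml`, drop `cuda`, and append `cpu`.
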
semantic.embeddingsSequential config. Run multiple embedding
models in series instead of via Promise.all. Default false. Set
true on systems where ORT serializes parallel sessions at the
thread-pool layer (alternation pattern in CLI progress). Each model
then holds the full thread budget end-to-end.
semantic.embeddingBatchSize config. ONNX inference batch width
for symbol embedding refresh. Default 32, max 128. Larger batches
amortise tokenizer + session bind/unbind costs.
MAX_EMBEDDING_CONCURRENCY raised 4 → 8. Schema cap and clamp logic
bumped; users can now request embeddingConcurrency up to 8.
scip.generator.cleanupAfterIngest config. Default true. Deletes
<repoRoot>/index.scip after the post-refresh ingest consumes it so the
generated file doesn't clutter the working tree. Skipped automatically
when args contains --output/-o (custom paths are user-managed).
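
The skip rule can be sketched as a small predicate. `shouldCleanupScipIndex` is a hypothetical name, and treating `--output=<path>` as a custom output is an assumption beyond the two flag forms named above.

```typescript
// Returns true when the generated index.scip should be deleted after ingest.
function shouldCleanupScipIndex(
  cleanupAfterIngest: boolean,
  generatorArgs: readonly string[],
): boolean {
  const hasCustomOutput = generatorArgs.some(
    (arg) => arg === "--output" || arg === "-o" || arg.indexOf("--output=") === 0,
  );
  // Custom output paths are user-managed, so they are never deleted.
  return cleanupAfterIngest && !hasCustomOutput;
}
```
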
Per-model embedding progress in CLI. IndexProgress.model?: string
tags embedding events with their source model. CLI renderer keeps a
per-model Map and renders both jina + nomic on a single status line
(e.g. Embeddings: jina [###---] 25% (2100/8522) | nomic [####--] 30% (2500/8522))
instead of letting interleaved events overwrite each other's counts.
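
A minimal sketch of per-model rendering, assuming a Map keyed by model name. `renderBar` and `renderEmbeddingLine` are hypothetical, and the bar width and rounding rules are illustrative rather than the CLI's actual ones.

```typescript
interface ModelProgress {
  current: number;
  total: number;
}

function renderBar(p: ModelProgress, width: number): string {
  const ratio = p.total > 0 ? Math.min(1, p.current / p.total) : 0;
  const filled = Math.floor(ratio * width);
  return "[" + "#".repeat(filled) + "-".repeat(width - filled) + "]";
}

// One status line covering every model seen so far, in first-seen order.
function renderEmbeddingLine(byModel: Map<string, ModelProgress>, width = 6): string {
  const parts: string[] = [];
  byModel.forEach((p, model) => {
    const pct = p.total > 0 ? Math.round((p.current / p.total) * 100) : 0;
    parts.push(`${model} ${renderBar(p, width)} ${pct}% (${p.current}/${p.total})`);
  });
  return "Embeddings: " + parts.join(" | ");
}
```

Each incoming event updates only its own model's entry, so interleaved events from two models no longer clobber one another.
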
Pass-1 drain progress feedback. BatchPersistAccumulator accepts an
optional setProgressCallback invoked after each batch flush.
indexer-pass1.ts wires this to emit finalizing/pass1Drain substage
events with stageCurrent/stageTotal, replacing the static
"Flushing pass 1 writes" label with a live progress bar.
Pass-2 import + extraction cache types. Pass2ImportCache,
Pass1ExtractionCache, Pass1ExtractionEntry, and SubmitEdgeWrite
exported from src/indexer/pass2/types.ts.
getExportedSymbolsLiteByFileIds query. New batched read in
src/db/ladybug-symbols.ts returning Map<fileId, ExportedSymbolLite[]>
(just symbolId, name). Drop-in replacement for per-file
getSymbolsByFile().filter(s => s.exported) in import resolution.

Changed

MCP TypeScript SDK updated to 1.29.0. Keeps the monolithic
@modelcontextprotocol/sdk dependency on the latest v1 release line.
FileSummary embedding refresh stats are stricter. Incremental
refreshes only evaluate the requested file IDs; cache hits count as
skipped, while empty file-level payloads now count as missing.
durationMs in IndexResult now reflects full wall-clock time.
Previously captured immediately after the versionSnapshot phase, which
silently excluded the entire post-index session (finalizeIndexing,
embeddings, deferred indexes, audit flush). On full-mode runs with
embeddings this could under-report by 200-400+ seconds. Now matches
timings.totalMs.
MAX_EMBEDDING_CONCURRENCY is 8 (was 4). Schema clamps + tests
updated. embeddingConcurrency accepts 1-8.
MAX_EMBEDDING_BATCH_SIZE and DEFAULT_EMBEDDING_BATCH_SIZE. New
constants (128 and 32 respectively). REFRESH_BATCH_SIZE kept as an
exported alias for DEFAULT_EMBEDDING_BATCH_SIZE so tests / scripts
retain a stable reference.
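
How these limits might be applied is sketched below. The constant values come from these notes; `clampInt`, `resolveBatchSize`, and `resolveConcurrency` are hypothetical helpers, and the lower bound of 1 for the batch size is an assumption.

```typescript
const DEFAULT_EMBEDDING_BATCH_SIZE = 32;
const MAX_EMBEDDING_BATCH_SIZE = 128;
const MAX_EMBEDDING_CONCURRENCY = 8;

// Clamp to [min, max], truncating fractional input.
function clampInt(value: number, min: number, max: number): number {
  return Math.min(max, Math.max(min, Math.floor(value)));
}

function resolveBatchSize(requested?: number): number {
  if (requested === undefined || !isFinite(requested)) return DEFAULT_EMBEDDING_BATCH_SIZE;
  return clampInt(requested, 1, MAX_EMBEDDING_BATCH_SIZE);
}

function resolveConcurrency(requested: number): number {
  return clampInt(requested, 1, MAX_EMBEDDING_CONCURRENCY);
}
```
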
Model registry restructured around variants. ModelInfo now exposes
defaultVariant + variants: Record<string, ModelVariantInfo> instead
of a flat modelFile + downloadUrls.model. Tokenizer/config URLs
remain shared across variants. New resolveVariant(name, requested?)
helper centralises fallback logic. resolveModelPath,
isModelAvailable, and ensureModelAvailable now accept an optional
variant argument.
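
The variant fallback could be sketched like this. Only the `defaultVariant` and `variants` field names come from these notes; `resolveVariantSketch`, the `modelFile` field on a variant, the example entry, and its file names are hypothetical, and the real resolveVariant takes a model name rather than a ModelInfo.

```typescript
interface ModelVariantInfo {
  modelFile: string;
}

interface ModelInfo {
  defaultVariant: string;
  variants: Record<string, ModelVariantInfo>;
}

// Hypothetical registry entry; the file names are placeholders.
const exampleModel: ModelInfo = {
  defaultVariant: "int8",
  variants: {
    int8: { modelFile: "model_int8.onnx" },
    fp16: { modelFile: "model_fp16.onnx" },
  },
};

function resolveVariantSketch(
  model: ModelInfo,
  requested?: string,
): { variant: string; warning?: string } {
  if (requested === undefined) return { variant: model.defaultVariant };
  if (Object.prototype.hasOwnProperty.call(model.variants, requested)) {
    return { variant: requested };
  }
  // Unsupported request: fall back to the model's default and report it.
  return {
    variant: model.defaultVariant,
    warning: `variant "${requested}" is not supported; using "${model.defaultVariant}"`,
  };
}
```
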
Pass-2 dispatcher writes per batch, not per file. Parallel-path
runPass2Resolvers now collects edges from every file in a
concurrency-batch into a BatchWriteAccumulator and issues one combined
withWriteConn(delete-then-insert) after Promise.all settles.

Fixed

SDL-MCP tool friction fixes. sdl.manual now omits disabled memory
actions when memory is off; Zod v4 discriminated-union schemas such as
search.edit now produce useful schema summaries; search.edit preview
suppresses normal include/exclude filter misses and caps skipped-file /
retrieval diagnostics; slice.build accepts wireFormat: "json" as a
standard JSON alias; sdl.context auto mode no longer ships duplicate
_packedPayload; exact identifier symbol.search calls stay on the
lexical fast path unless semantic/PPR context is explicitly requested;
sdl.workflow.truncated only marks actually skipped/truncated responses;
and sdl.file JSON/YAML path reads parse the full supported file while
applying maxBytes to the returned extraction.
Second-pass tool friction fixes. Literal search.edit include filters
now resolve directly without semantic narrowing or unrelated repo walks;
sdl.context treats exact code identifiers as deterministic seed symbols
before broad retrieval, accepts budget.maxEstimatedTokens as a
maxTokens alias, and rejects unsupported budget.maxCards clearly;
$0.results.0.symbolId workflow refs are now supported; search.edit
action summaries merge mode variants as preview|apply; context packed
stats now distinguish candidate savings from returned payload format; and
MCP responses no longer return a duplicate footer text block.
Published package now includes templates/SDL.md. Fixed
sdl-mcp init --client <client> failing from npm/global installs with
ENOENT: templates\SDL.md; release preflight now requires all client init
templates in the packed tarball.
dropVectorIndex regex now matches LadybugDB binder errors. The
/does not exist/i check missed LadybugDB's actual phrasing
("Binder exception: Table X doesn't have an index with name Y."),
causing fresh-DB embedding refreshes to take the slow per-row HNSW
maintenance path AND skip the post-write index rebuild — leaving the DB
without a vector index after every fresh-DB run. Regex broadened to
/does not exist|doesn't have an index with name|no such (vector |fts )?index/i.
The indexes.length > 0 guard on showIndexes verification was also
dropped; the binder error itself is authoritative for "this index
doesn't exist on this table".
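
The effect of the broadened pattern can be checked directly against the quoted error message. Both regular expressions and the message are taken from the entry above; `isMissingIndexError` is a hypothetical wrapper.

```typescript
const OLD_MISSING_INDEX = /does not exist/i;
const NEW_MISSING_INDEX =
  /does not exist|doesn't have an index with name|no such (vector |fts )?index/i;

// True when the error message indicates the index is simply absent.
function isMissingIndexError(message: string, pattern: RegExp = NEW_MISSING_INDEX): boolean {
  return pattern.test(message);
}

const binderMessage = "Binder exception: Table X doesn't have an index with name Y.";
const oldMatches = isMissingIndexError(binderMessage, OLD_MISSING_INDEX); // false
const newMatches = isMissingIndexError(binderMessage); // true
```
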

Changed

HealthMetrics.engineDispatch semantics: events → files (BREAKING).
Previously incremented per index.refresh event (one tick per run);
now reflects per-file dispatch sourced from IndexStats.pass1Engine.{rustFiles,tsFiles}.
Legacy += 1 per-event behavior preserved as a back-compat fallback when
pass1Engine telemetry is absent. Dashboard ratios + REST snapshot consumers
should expect order-of-magnitude larger counts. Surfaces in
/api/observability/snapshot and SSE stream.
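
The new counting rule and its fallback might be sketched as follows. The `rustFiles` and `tsFiles` field names come from the entry above; `recordDispatch`, the `EngineDispatch` shape, and the way a legacy event is attributed to one engine are assumptions for illustration.

```typescript
interface Pass1EngineStats {
  rustFiles: number;
  tsFiles: number;
}

interface EngineDispatch {
  rust: number;
  ts: number;
}

function recordDispatch(
  totals: EngineDispatch,
  eventEngine: "rust" | "ts",
  pass1Engine?: Pass1EngineStats,
): EngineDispatch {
  if (pass1Engine !== undefined) {
    // New semantics: one tick per file dispatched to each engine.
    return {
      rust: totals.rust + pass1Engine.rustFiles,
      ts: totals.ts + pass1Engine.tsFiles,
    };
  }
  // Back-compat fallback: one tick per refresh event.
  return eventEngine === "rust"
    ? { rust: totals.rust + 1, ts: totals.ts }
    : { rust: totals.rust, ts: totals.ts + 1 };
}
```

A single refresh over 900 Rust-engine files and 100 TS-engine files would add 1000 to the counters under the new rule, versus 1 under the old one, which is why consumers should expect much larger values.
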
Packed wire format default flipped to "auto" for slice.build,
sdl.symbol.search, and sdl.context. Server now runs the packed gate by
default; clients without a packed decoder must opt back to legacy with
wireFormat: "compact" (slice) or wireFormat: "json" (symbol/context),
or set wire.packed.defaultFormat: "compact" in config. Decoder available
via decodePacked from @sdl-mcp/wire/packed.
Packed gate thresholds lowered: byte threshold 0.15 → 0.10, token
threshold 0.30 → 0.20. More responses now clear the gate; admins can
override via wire.packed.threshold / wire.packed.tokenThreshold or env
SDL_PACKED_THRESHOLD / SDL_PACKED_TOKEN_THRESHOLD.

Added

Packed wire-format coverage extended to sdl.symbol.search and
sdl.context — both tools now run the packed gate end-to-end with tap
events for both packed and fallback decisions. Observability dashboard
PACKED / BYTES SAVED counters increment for ss1 (symbol-search) and
ctx1 (context) encoders, not just sl1. Per-encoder breakdown table
added to the token-efficiency panel.
Per-encoder dashboard breakdown in PackedWireMetrics.byEncoder:
totalDecisions, packedCount, fallbackCount, packedAdoptionPct,
jsonBaselineBytesTotal, packedBytesTotal, bytesSaved, bytesSavedRatio.
Surfaces in /ui/observability token-efficiency panel.
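
One plausible way the derived fields relate to the raw counters is sketched below. The field names come from the list above, but the formulas are assumptions (adoption as a percentage of all decisions, savings measured against the JSON baseline), and `summarizeEncoder` is a hypothetical helper.

```typescript
interface EncoderCounters {
  packedCount: number;
  fallbackCount: number;
  jsonBaselineBytesTotal: number;
  packedBytesTotal: number;
}

interface EncoderBreakdown extends EncoderCounters {
  totalDecisions: number;
  packedAdoptionPct: number;
  bytesSaved: number;
  bytesSavedRatio: number;
}

function summarizeEncoder(c: EncoderCounters): EncoderBreakdown {
  const totalDecisions = c.packedCount + c.fallbackCount;
  const bytesSaved = c.jsonBaselineBytesTotal - c.packedBytesTotal;
  return {
    packedCount: c.packedCount,
    fallbackCount: c.fallbackCount,
    jsonBaselineBytesTotal: c.jsonBaselineBytesTotal,
    packedBytesTotal: c.packedBytesTotal,
    totalDecisions,
    // Guard against division by zero before any decisions are recorded.
    packedAdoptionPct: totalDecisions > 0 ? (c.packedCount / totalDecisions) * 100 : 0,
    bytesSaved,
    bytesSavedRatio: c.jsonBaselineBytesTotal > 0 ? bytesSaved / c.jsonBaselineBytesTotal : 0,
  };
}
```
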
Slice fallback tap publishing — slice.build previously only published
packed-decision tap events; fallbacks were silent. Both branches now record
to tokenAccumulator and publish packedWire tap events.

Removed

gen1 generic packed encoder — never wired into any production tool
path. Removed encodeGeneric / decodeGeneric exports, registry entry,
and tests/unit/packed-generic.test.ts. EncoderId type narrowed to
"sl1" | "ss1" | "ctx1".

Added

Observability dashboard (V1, read-only) — built-in HTTP UI at
/ui/observability plus
/api/observability/{snapshot,timeseries,beam-explain,stream} REST + SSE APIs.
Surfaces cache hit rates, hybrid retrieval breakdowns (FTS / vector / PPR /
RRF), beam-search decision traces, indexing pipeline metrics, write-pool /
drain saturation, packed-wire savings, SCIP ingest health, deterministic
bottleneck classification, and OS-level resource samples. Configurable via
the new observability.* config block (enabled, sampleIntervalMs,
retentionShortMinutes, retentionLongHours, pprMetricsEnabled,
packedStatsEnabled, scipIngestMetrics, beamExplainCapacity,
beamExplainEntriesPerSlice, sseHeartbeatMs). Default sampling interval
2 s; 15-minute short window + 24-hour long window. Bearer-auth gated
identically to other /api/* routes. New deep dive at
docs/feature-deep-dives/observability-dashboard.md.

Commits since v0.10.10

chore(release): v0.11.0 (a4f84c9)
ci fix (e318fe3)
fix: smooth SDL-MCP tool friction (14f49c3)
Fix missing SDL init template in package (c28a521)
fix: harden symbol search and file summary embeddings (e90cb7a)
Repair dependency placeholder quality (591c05b)
Bound FileSummary embedding resource use (d005690)
ci fixes (804d0bc)
ci fixes (ce9568d)
Improve index data quality metadata (1e4ccd1)
Add Codex hook enforcement assets (03e17ff)
fix: land remaining review-sweep changes (ee4df7d)
feat(indexer): SCIP-first pass2, import cache, and pass1-drain ordering fix (a056b08)
test: align two stale tests with current code-mode and indexer behavior (dc21091)
test(stress): cap LadybugDB buffer pool to 1 GB (66abda3)
fix(db/ladybug): work around three LadybugDB binder/runtime bugs (83d55f5)
gitignore (c9f1be7)
perf(indexer): parallelise dir walk, scip ingest, drop parser pool clamp (28a5d8c)
perf(db/ladybug): UNWIND-batched MERGE for 8 write paths (948bef4)
feat(config/cpu-presets): tier-aware pass2 + scip-ingest concurrency (2549de4)
fix(gateway/compact-schema): emit input-side JSON Schema for transform pipes (975ce08)
perf(indexer/embeddings): batched UNWIND writes + HNSW rebuild path (2bb4bc5)
feat(mcp/code-mode): tool ergonomics + delta blast-radius hard cap (c10fd3e)
refactor(policy/code): split gate.ts into pure decision + enforcement seam (177cd94)
stress test fix (685f4aa)
ci fix (a1dc7f6)
perf(indexer)+feat(observability): wire phase scoping, engine dispatch, and dashboard gaps (7351ee8)
feat(audit/observability): post-index write-session, audit buffer drain, dashboard metrics (69f79ab)
ci fix (e8027f1)
feat(observability): wire packed gate to symbol.search + sdl.context, fix dashboard counters (b35710a)
perf(indexer): batch DB queries in import re-resolution phase (fd41695)
fix(observability): eliminate false memory_pressure on idle process (9a3c273)
fix(cli): delegate index without auth token when auth disabled (11e5cf1)
fix(schema): harden packed-stats migrations across DB shapes (20c7661)
refactor(observability): batch etag cache emits + add health unit tests (e412953)
feat(observability): wire follow-up signals for CACHE + HEALTH panels (36f7b04)
feat(observability): wire CACHE, BEAM, HEALTH panels (9546c34)
fix(observability): unclip stack-bar legends; wire token efficiency (7332b8e)
fix(schema): heal UsageSnapshot drift breaking usage.stats on fresh DBs (d8b0114)
chore(deps): bump @ladybugdb/core 0.15.2 -> 0.16.0 (ffc84b4)
feat(observability): add read-only V1 dashboard (4c975b8)
fix(indexer): serialize cluster compute with SCIP auto-ingest (191a4ed)
ci fix (dbc4d83)
