IVFPQ Compressed ANN: End-to-end compressed ANN pipeline with build, query, persistence, and quality coverage (pkg/search/ivfpq_*). This allows indexing substantially more vectors with a lower memory footprint, while a rerank step preserves practical recall.
IVFPQ Candidate Generation and Routing Enhancements: Hybrid cluster routing and compressed candidate generation improvements reduce bad-cluster routing and improve search quality consistency under mixed lexical/semantic queries.
Async vs Explicit Transaction Clobber Regression Suite: Added deterministic tests for write-behind queue vs explicit transaction ordering (pkg/storage/async_engine_clobber_test.go) to prevent stale async writes from overwriting newer committed data.
Updated single-concurrency HTTP write benchmark baseline in docs/performance/single-request-benchmark.md.
Updated MaxConcurrentStreams comparison baseline in docs/performance/maxconcurrentstreams-comparison.md with newly re-run 100/250 concurrency results.
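The stale-write scenario targeted by the clobber regression suite above can be illustrated with a small version-checked store. This is only a sketch under stated assumptions: the `store`, `queuedWrite`, `commit`, `compareAndSet`, and `flush` names are hypothetical and are not taken from pkg/storage. It shows one common way to do rebase/retry: a queued write remembers the record it was based on, and if an explicit commit landed in the meantime, the write is recomputed on top of the newer record instead of overwriting it.

```go
package main

import (
	"fmt"
	"sync"
)

// record pairs a value with a version that increases on every write.
type record struct {
	value   string
	version uint64
}

// store is a hypothetical in-memory stand-in for the storage engine.
type store struct {
	mu   sync.Mutex
	data map[string]record
}

func newStore() *store { return &store{data: make(map[string]record)} }

func (s *store) read(key string) record {
	s.mu.Lock()
	defer s.mu.Unlock()
	return s.data[key]
}

// commit models an explicit transaction commit: it always wins and bumps the version.
func (s *store) commit(key, value string) {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.data[key] = record{value: value, version: s.data[key].version + 1}
}

// compareAndSet writes only if the stored version still equals base.
func (s *store) compareAndSet(key, value string, base uint64) bool {
	s.mu.Lock()
	defer s.mu.Unlock()
	if s.data[key].version != base {
		return false
	}
	s.data[key] = record{value: value, version: base + 1}
	return true
}

// queuedWrite is a write-behind entry that remembers the record it was based on.
type queuedWrite struct {
	key    string
	base   record
	mutate func(current string) string
}

// flush applies a queued write; on a version conflict it rebases onto the
// latest committed record and retries, so newer data is never clobbered.
func (s *store) flush(w queuedWrite, maxRetries int) bool {
	base := w.base
	for attempt := 0; attempt <= maxRetries; attempt++ {
		if s.compareAndSet(w.key, w.mutate(base.value), base.version) {
			return true
		}
		base = s.read(w.key) // conflict: rebase onto the newer record
	}
	return false
}

func main() {
	s := newStore()
	s.commit("node:1", "v1")
	w := queuedWrite{
		key:    "node:1",
		base:   s.read("node:1"),
		mutate: func(cur string) string { return cur + "+async" },
	}
	s.commit("node:1", "v2") // explicit commit lands before the queue flushes
	ok := s.flush(w, 3)
	fmt.Println(ok, s.read("node:1").value)
}
```

Without the version check, the queued write would have stored a value derived from "v1" and silently discarded the explicit commit of "v2".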
Fixed
AsyncEngine stale-write clobber window: Prevented queued async updates from overwriting newer explicit transaction commits via rebase/retry conflict handling.
Auth cookies behind reverse proxies: The Secure attribute is now set correctly for HTTPS sessions terminated at a proxy (X-Forwarded-Proto: https), not only for direct TLS connections.
WAL manifest path traversal risk: Sanitized WAL segment paths before file open to block ../ and nested path escapes.
WAL auto-compaction recovery race: Serialized auto-snapshot/truncate against in-flight mutating operations to prevent rare cases in which recovery after compaction came back with missing nodes.
macOS config corruption under repeated settings edits: Eliminated corruption-prone regex write path by switching to structured YAML parsing/writing.
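The WAL path traversal fix listed above can be pictured with a minimal sketch. The helper name `resolveSegmentPath` and the directory layout are assumptions for illustration, not the project's actual code. It relies on `filepath.IsLocal` (available since Go 1.20), which lexically rejects empty names, absolute paths, and any name that would climb out of the base directory.

```go
package main

import (
	"fmt"
	"path/filepath"
)

// resolveSegmentPath joins a manifest-supplied segment name onto walDir and
// rejects names that would resolve outside it. Hypothetical helper.
func resolveSegmentPath(walDir, name string) (string, error) {
	// IsLocal is a purely lexical check: it rejects "", absolute paths,
	// and anything that escapes via ".." (including nested escapes such
	// as "a/../../b"). It does not resolve symlinks.
	if !filepath.IsLocal(name) {
		return "", fmt.Errorf("segment name %q escapes the WAL directory", name)
	}
	return filepath.Join(walDir, name), nil
}

func main() {
	for _, name := range []string{
		"seg-0001.wal",
		"../etc/passwd",
		"segments/../../etc/passwd",
		"/etc/passwd",
	} {
		p, err := resolveSegmentPath("/var/lib/app/wal", name)
		fmt.Printf("%-28q path=%q err=%v\n", name, p, err)
	}
}
```

Because the check is lexical, it does not protect against a symlink inside the WAL directory that points elsewhere; that would need separate handling.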
Performance
Compressed ANN scalability: IVFPQ compressed search path now supports materially larger embedding corpora with reduced RAM pressure.
IVFPQ scoring throughput: Contiguous code storage + tuned candidate scoring improves cache locality and reduces per-query overhead in compressed paths.
CGO-heavy embedding/generation latency: Go 1.26 runtime and modern llama.cpp backend changes improve baseline efficiency for local model inference stacks.
HTTP write benchmark baselines refreshed: The single-concurrency and high-concurrency (100/250) figures were re-measured and published so that performance guidance matches the current runtime and build stack.
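To make the contiguous-code scoring item above more concrete, here is a generic product-quantization sketch. It is not the code in pkg/search: the function names, the squared-L2 metric, and the one-byte-per-subquantizer layout are assumptions. Each vector's codes sit back to back in one flat byte slice, and a per-query lookup table turns scoring into a sequential scan with one table read per code byte.

```go
package main

import "fmt"

// buildLUT precomputes, for each subquantizer m, the squared L2 distance
// between the query's m-th subvector and every centroid in codebooks[m].
// codebooks[m][k] is centroid k of subquantizer m.
func buildLUT(query []float32, codebooks [][][]float32) [][]float32 {
	m := len(codebooks)
	dsub := len(query) / m
	lut := make([][]float32, m)
	for i := 0; i < m; i++ {
		sub := query[i*dsub : (i+1)*dsub]
		lut[i] = make([]float32, len(codebooks[i]))
		for k, centroid := range codebooks[i] {
			var d float32
			for j := range sub {
				diff := sub[j] - centroid[j]
				d += diff * diff
			}
			lut[i][k] = d
		}
	}
	return lut
}

// scoreCodes scans a flat code buffer (len(lut) bytes per vector) and
// returns the approximate squared distance for each encoded vector.
func scoreCodes(lut [][]float32, codes []byte) []float32 {
	m := len(lut)
	n := len(codes) / m
	out := make([]float32, n)
	for v := 0; v < n; v++ {
		row := codes[v*m : (v+1)*m] // contiguous slice, no per-vector allocation
		var d float32
		for i, code := range row {
			d += lut[i][code]
		}
		out[v] = d
	}
	return out
}

func main() {
	// Two subquantizers, one dimension each, two centroids each (toy sizes).
	codebooks := [][][]float32{
		{{0}, {1}},
		{{0}, {2}},
	}
	query := []float32{1, 2}
	codes := []byte{
		0, 0, // vector 0 decodes to (0, 0)
		1, 1, // vector 1 decodes to (1, 2)
		1, 0, // vector 2 decodes to (1, 0)
	}
	fmt.Println(scoreCodes(buildLUT(query, codebooks), codes))
}
```

In a pipeline with rerank, scores like these would only shortlist candidates; the final ordering would come from the rerank step.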
Security
Cookie transport security: Browser auth cookies now reliably use Secure in proxied HTTPS deployments, reducing accidental insecure-cookie exposure.
Filesystem safety in WAL replay: Segment-path sanitization prevents manifest-sourced path traversal to arbitrary files.
Runtime hardening via Go 1.26: Heap base randomization raises exploit difficulty for memory-corruption scenarios involving CGO boundaries.
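The cookie transport item above comes down to deciding whether the original client connection used HTTPS. A minimal sketch with the standard net/http package follows. The helper names and the cookie name are hypothetical, and treating a comma-separated X-Forwarded-Proto value as "first entry is the client-facing hop" is an assumption about the proxy chain.

```go
package main

import (
	"crypto/tls"
	"fmt"
	"net/http"
	"net/http/httptest"
	"strings"
)

// isHTTPS reports whether the client-facing connection was HTTPS, either
// through direct TLS or as declared by a proxy via X-Forwarded-Proto.
func isHTTPS(r *http.Request) bool {
	if r.TLS != nil {
		return true
	}
	proto := r.Header.Get("X-Forwarded-Proto")
	if i := strings.IndexByte(proto, ','); i >= 0 {
		proto = proto[:i] // assumption: first entry describes the client hop
	}
	return strings.EqualFold(strings.TrimSpace(proto), "https")
}

// setSessionCookie issues the auth cookie, marking it Secure for HTTPS sessions.
func setSessionCookie(w http.ResponseWriter, r *http.Request, token string) {
	http.SetCookie(w, &http.Cookie{
		Name:     "session", // hypothetical cookie name
		Value:    token,
		Path:     "/",
		HttpOnly: true,
		SameSite: http.SameSiteLaxMode,
		Secure:   isHTTPS(r),
	})
}

// cookieSecureFor is a demo helper: it builds a request and reports whether
// the resulting cookie carries the Secure attribute.
func cookieSecureFor(forwardedProto string, directTLS bool) bool {
	r := httptest.NewRequest(http.MethodGet, "http://example.test/login", nil)
	if forwardedProto != "" {
		r.Header.Set("X-Forwarded-Proto", forwardedProto)
	}
	if directTLS {
		r.TLS = &tls.ConnectionState{}
	}
	rec := httptest.NewRecorder()
	setSessionCookie(rec, r, "opaque-token")
	cookies := rec.Result().Cookies()
	return len(cookies) == 1 && cookies[0].Secure
}

func main() {
	fmt.Println("plain HTTP:      ", cookieSecureFor("", false))
	fmt.Println("proxied HTTPS:   ", cookieSecureFor("https", false))
	fmt.Println("direct TLS:      ", cookieSecureFor("", true))
	fmt.Println("proxy says http: ", cookieSecureFor("http", false))
}
```

X-Forwarded-Proto is client-controllable unless a trusted proxy sets or strips it, so a check like this is only meaningful when the server is reachable solely through that proxy.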
Stability / Ops
Go 1.26 operational tooling improvements: New goroutine leak profiling support can aid diagnosis of background loop leaks in long-running server processes.
llama.cpp API drift guardrails: Updated integration assumptions for modern llama.cpp APIs and removed a brittle logging call so that upstream signature changes are less likely to break the integration.
Recovery consistency: Snapshot/compaction sequencing now better preserves WAL and engine state coherence under concurrent mutation load.
What This Means For Users
Faster and more predictable local AI features: Local embeddings/reranking and GGUF-backed workflows run with better baseline performance and fewer runtime edge-case failures.
Bigger datasets on the same hardware: Compressed ANN improvements make it practical to index and query larger embedding corpora without proportional memory growth.
Safer production defaults: Stronger cookie security and WAL path validation reduce common deployment and hardening risks.
Less config fragility on macOS: Settings changes now persist reliably without corrupting the YAML config file.
Better correctness under concurrency: Explicit transactions and async writes now coexist more safely in high-write systems.
More accurate performance guidance: Published benchmark docs now reflect current Go 1.26-era behavior for both low-concurrency and high-concurrency HTTP write paths.