01 Release Notes.md

Product Positioning: A pure .NET embedded vector database — zero native dependencies, runs in-process, no standalone database server deployment required
Framework Version: .NET 10
Namespace: Vorcyc.Quiver
Design Philosophy: Similar to EF Core's DbContext pattern, achieving automatic discovery, index construction, and persistence of the vector database through declarative attribute annotations
Core Features: Code-First declarative entity definition · Multiple ANN indexes (Flat / HNSW / IVF / KDTree) · 9 built-in distance metrics + custom similarity support · Binary primary storage + JSON/XML export/import · Schema Migration (property rename / value transform) · Reader-writer lock concurrency safety · SIMD-accelerated similarity computation · Payload memory modes for vectors and large fields
Keywords: Embedded Vector Database · Pure .NET · ANN · Approximate Nearest Neighbor Search · Similarity Retrieval · HNSW · IVF · KDTree · Code-First · EF Core Style · Embedding · Semantic Search · Face Recognition · Image-to-Image Search · RAG · SIMD · Schema Migration · ISimilarity · Custom Metric
Name Origin: Quiver — a container for arrows (Arrow), and the mathematical essence of a vector is an arrow
| Improvement | Description |
|---|---|
| Bulk HNSW build on import / batch CRUD | AddRange, UpsertRange, and ImportAsync now write entities and vectors first, then call BuildBulk once per vector field, the same deferred path as binary LoadEntities. Large JSON/XML imports with HNSW no longer issue one serial index.Add per row. |
| JSON vector Base64 encoding | float[] / Half[] export as Base64 strings (aligned with XML), shrinking export files and speeding deserialization. Legacy JSON numeric-array vectors remain readable on import. |
| Parallel JSON import parsing | JsonExportProvider.LoadAsync streams via PipeReader, copies each entity payload, and deserializes on a worker pool with bounded in-flight memory. |
| UpsertRange batch API | `QuiverSet<T>.UpsertRange` upserts a batch under one write lock; ImportAsync uses it when the target set is non-empty. |
| XML import property read fix | XmlExportProvider no longer skips sibling properties after ReadElementContentAsStringAsync (e.g. Name and other scalar fields). |
| Truncated JSON error message | Incomplete export files (e.g. interrupted ExportAsync) raise a clear InvalidDataException with file size instead of a raw depth error. |

| Note | Description |
|---|---|
| Default JSON export | WriteIndented now defaults to false for more compact exports. |

| Change | Description |
|---|---|
| Removed vector LazyLoad mode | VectorMemoryMode.LazyLoad and GlobalVectorMemoryMode.LazyLoad have been removed. For heap-backed stores, lazy vector loading could never reduce process memory because the vector store must retain every vector for index traversal and similarity search, making LazyLoad behaviorally identical to InMemory. Use InMemory for lowest latency or MemoryMapped (the genuine low-heap path) for large datasets. Large-field LazyLoad / PagedCache are unaffected: large fields are not needed for search and remain truly lazy. |

| Improvement | Description |
|---|---|
| Half[] vectors support MemoryMapped | fp16 (Half[]) vector fields can now be declared with VectorMemoryMode.MemoryMapped (and the global modes that resolve to it). Declare the property as `public partial Half[]? Name { get; set; }` in a partial type; the source generator emits a lazy accessor backed by LazyVectorAccessor.MaterializeHalf, and persisted vectors are encoded as Float16. |
| Lazy property getter re-allocated on every access | The source-generator-emitted getter used `??` instead of `??=`: `__backing ?? Materialize(this, fieldName)`. This meant the materialized byte[] for large fields was never written back to the backing field, causing a new allocation on every property read in lazy mode. Fix: the generator now emits `??=`, caching the result in the backing field after the first materialization. |
File Format Compatibility: Snapshot (.vdb) files remain fully backward-compatible with v1.x, v2.x, v3.0.x, v3.1.x, v3.2.x, and v3.3.x. However, .wal sidecar files are no longer read or written; see the upgrade notes below before migrating.
Before installing 4.0.1, run your 3.2.x application once with the existing data so that any pending changes in .wal sidecar files are flushed into the main .vdb snapshot via the previous WAL compaction path:
```csharp
// Run once on 3.2.x before upgrading:
await using var db = new MyDb();
await db.LoadAsync();   // replays any pending .wal entries into memory
await db.SaveAsync();   // writes a full snapshot and clears the WAL
```

After this step, the .wal file is empty/obsolete and it is safe to upgrade to 4.0.1. Upgrading without doing this will cause any unflushed WAL entries to be silently discarded on load.
| Change | Before (≤ 3.2.1) | After (4.0.1) |
|---|---|---|
| WAL (Write-Ahead Log) removed | QuiverDbOptions.EnableWal / WalCompactionThreshold / WalFlushToDisk enabled incremental persistence via a .wal sidecar file. SaveChangesAsync() appended deltas; LoadAsync() replayed them. | The entire WAL subsystem is removed. QuiverDbOptions no longer exposes WAL options. WriteAheadLog, WalEntry, and the _changeLog queue inside `QuiverSet<T>` no longer exist. SaveChangesAsync() is removed; call SaveAsync() for a full atomic snapshot save instead. LoadAsync() loads the snapshot only. |
| Snapshot alias APIs removed | QuiverDbContext.RewriteAsync() and CompactAsync() were aliases for full-snapshot persistence/compaction. | Both aliases are removed. Call SaveAsync(path?) directly for a full atomic snapshot and periodic multi-segment compaction. |
| Rationale | WAL doubled memory peak under heavy writes (_changeLog held a strong reference to every queued entity in addition to the live cache, on top of the full vector copies inside index stores). | Snapshot-only persistence plus AppendAsync() and payload-level memory modes for vectors and large fields give a much flatter memory profile during bulk ingestion. |
```csharp
// Before 4.0.1
var options = new QuiverDbOptions
{
    DatabasePath = "mydata.vdb",
    EnableWal = true,                 // ← remove
    WalCompactionThreshold = 10_000,  // ← remove
    WalFlushToDisk = true             // ← remove
};

// Before 4.0.1
await db.SaveChangesAsync();          // ← replace with SaveAsync()
```

```csharp
// After 4.0.1
var options = new QuiverDbOptions
{
    DatabasePath = "mydata.vdb"
};
await db.SaveAsync();
```

If you previously relied on SaveChangesAsync() being incremental, use AppendAsync() for batch ingest and call SaveAsync() periodically to defragment multi-segment files.
Schema migration during offline format upgrades:
`ConfigureMigration<T>()` is applied by QuiverDbContext.LoadAsync() when reading a supported runtime format. If you upgrade a v1/v2/v3 file with QuiverMigrator.MigrateAsync, pass the same schema rules through its migrationRules parameter; otherwise renamed fields may be skipped while the old file is decoded.
```csharp
using Vorcyc.Quiver.Migration;

var rule = MigrationBuilder<Document>.Build(m => m
    .RenameProperty("OldTitle", "Title"));

await QuiverMigrator.MigrateAsync(
    sourceFile: "old.vdb",
    destinationFile: "data.vdb",
    typeMap: new Dictionary<string, Type>
    {
        [typeof(Document).FullName!] = typeof(Document)
    },
    migrationRules: new Dictionary<string, SchemaMigrationRule>
    {
        [typeof(Document).FullName!] = rule
    });
```

4.0.1 uses the v4 on-disk binary format. v1/v2/v3 files remain readable through the migration path; new writes always produce v4.
```
[Magic "QDB\x04"][HeaderLen u32][Header bytes]
[Segment 1] [Segment 2] ... [Segment N]
[FooterTopMagic "QDBF"][SegmentCount u32]
  for each: [TypeName][Offset u64][Length u64][EntityCount u32][CRC32 u32]
[FooterOffset u64][TrailerMagic "QDBE"]
```
This unlocks three file-level capabilities without reintroducing WAL:
| API | Behavior | Cost |
|---|---|---|
| QuiverDbContext.AppendAsync() | Appends current in-memory entities as a new segment to an existing v4 file; rewrites only the footer. | O(Δ) bytes. Truly incremental; replaces the use case WAL covered, without the memory doubling. |
| QuiverDbContext.SaveAsync() | Writes a full snapshot and defragments a multi-segment file into one segment. | O(N). Run periodically. |
| QuiverDbFile.MergeAsync(sources, dest, options, typeMap?) | Merges multiple v4 files. MergeConflictPolicy.Append is a pure byte-copy of segments. LastWriterWins / FirstWriterWins deduplicate by [QuiverKey]. | Append: O(I/O), no decode. LWW/FWW: decode-and-rewrite. |
| QuiverDbFile.InspectAsync(path, verifyCrc) | Returns QuiverFileInfo (version, segments, per-segment CRC validation, per-type entity counts). | O(file size) when verifying CRC. |
```csharp
// Incremental bulk ingest: replaces the pre-4.0 SaveChangesAsync workflow.
// Use synchronous `using`; `await using` runs DisposeAsync(), which performs a final full SaveAsync().
using var db = new MyDb("data.vdb");
await db.LoadAsync();
db.Faces.AddRange(batch);
await db.AppendAsync();   // O(batch) write, no full rewrite

// Periodic defrag
await db.SaveAsync();

// Merge several archive files into one, dedup by [QuiverKey], last writer wins
var typeMap = new Dictionary<string, Type>
{
    [typeof(FaceFeature).FullName!] = typeof(FaceFeature)
};
await QuiverDbFile.MergeAsync(
    sourceFiles: ["a.vdb", "b.vdb", "c.vdb"],
    destinationFile: "merged.vdb",
    options: new MergeOptions { ConflictPolicy = MergeConflictPolicy.LastWriterWins },
    typeMap: typeMap);

// Diagnostics
var info = await QuiverDbFile.InspectAsync("merged.vdb");
Console.WriteLine($"v{info.FormatVersion}, {info.Segments.Count} segments, crcValid={info.CrcValid}");
```

Note: All sections below describing WAL, SaveChangesAsync, EnableWal, WriteAheadLog, WalEntry, or .wal sidecar files refer to the pre-4.0 architecture and are kept only for historical reference. They no longer reflect runtime behavior.
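The v4 layout shown earlier ends with `[FooterOffset u64][TrailerMagic "QDBE"]`, which suggests a reader can find the footer by looking at the last 12 bytes of the file. The following is a minimal sketch of that idea over an in-memory fake file, not the library's reader: the helper names `BuildFakeFile` and `ReadFooterTop` are hypothetical, and little-endian integers and ASCII magic bytes are assumptions that the release notes do not state. Per-segment footer entries are left out because the TypeName encoding is not specified here.

```csharp
using System;
using System.Buffers.Binary;
using System.IO;
using System.Text;

byte[] bytes = BuildFakeFile(segmentCount: 3);
var (offset, segCount) = ReadFooterTop(bytes);
Console.WriteLine($"footer at {offset}, {segCount} segments");

// Builds [magic][stand-in payload]["QDBF"][SegmentCount][FooterOffset]["QDBE"].
// Byte order and magic encoding are assumptions of this sketch.
static byte[] BuildFakeFile(uint segmentCount)
{
    using var ms = new MemoryStream();
    using var w = new BinaryWriter(ms, Encoding.ASCII, leaveOpen: true);
    w.Write(Encoding.ASCII.GetBytes("QDB\u0004")); // container magic
    w.Write(new byte[16]);                          // stand-in for header + segments
    w.Flush();
    long footerOffset = ms.Position;
    w.Write(Encoding.ASCII.GetBytes("QDBF"));       // footer top magic
    w.Write(segmentCount);                          // per-segment entries omitted
    w.Write((ulong)footerOffset);
    w.Write(Encoding.ASCII.GetBytes("QDBE"));       // trailer magic
    w.Flush();
    return ms.ToArray();
}

// Reads the trailer from the end of the file, then jumps to the footer it points at.
static (ulong FooterOffset, uint SegmentCount) ReadFooterTop(ReadOnlySpan<byte> file)
{
    ReadOnlySpan<byte> trailer = file[^12..];       // [FooterOffset u64]["QDBE"]
    if (!trailer[8..].SequenceEqual("QDBE"u8))
        throw new InvalidDataException("Missing trailer magic.");
    ulong footerOffset = BinaryPrimitives.ReadUInt64LittleEndian(trailer);
    ReadOnlySpan<byte> footer = file[(int)footerOffset..];
    if (!footer[..4].SequenceEqual("QDBF"u8))
        throw new InvalidDataException("Missing footer magic.");
    uint count = BinaryPrimitives.ReadUInt32LittleEndian(footer[4..]);
    return (footerOffset, count);
}
```

Because only the trailer and footer are touched, an append can add segments after the old data and write a new footer without rewriting earlier segments, which matches the O(Δ) cost described for AppendAsync().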
File Format Compatibility: v3.2.1 is fully backward-compatible with all previous data files (v1.x, v2.x, v3.0.0, v3.1.0, v3.2.0).
| Fix | Description |
|---|---|
| EntityPageCache thread-safety | Fixed a data race in LazyPaging mode where concurrent readers (e.g., Parallel.ForEach calling Find / Search simultaneously) could corrupt the internal LRU state (_loadedPages, _lru, _lruNodes). All paths that mutate LRU state (GetOrLoadPage, FlushDirty, CompactMemory, Clear) are now protected by an internal Lock (_pageLock). FullMemory mode is unaffected (zero overhead). |
Builds on the v4 (QDB\x04) segment + footer format. Existing v4 files are read transparently; new writes extend the footer with schema v2 (per-segment Kind / FieldName / Dim / FirstId).
This update completes the v4 storage redesign by physically separating vectors and large fields from entity metadata, and replacing in-place delete with a tombstone + merge model. The goal is a flat managed-heap profile even with millions of high-dimensional vectors.
The on-disk VectorBlob segment is mapped into the process via MemoryMappedFile; vectors are served from the OS page cache without copying into the managed heap.
```csharp
var options = new QuiverDbOptions
{
    DatabasePath = "data.vdb",
    Vectors =
    {
        MemoryMode = GlobalVectorMemoryMode.MemoryMapped,  // InMemory (default) / LazyLoad / MemoryMapped / Auto / PerField
        MemoryMapThresholdBytes = 256L * 1024 * 1024,      // Auto mode only: switch to mmap above this size
    },
};
```

| Mode | Backend | When to pick |
|---|---|---|
| InMemory (default) | HeapVectorStore (`Dictionary<int, float[]>`) | Small / write-heavy datasets, no DatabasePath required |
| MemoryMapped | MmapVectorStore (read-only view over the v4 VectorBlob segment) | Large read-mostly datasets (face recognition, RAG indexes), bounded managed heap |
| Auto | InMemory below Vectors.MemoryMapThresholdBytes, MemoryMapped above | Mixed workloads |
SaveAsync / AppendAsync automatically dispose mmap views before the file is replaced and re-bind to the new VectorBlob regions afterwards — the lifecycle is fully hidden from the caller.
Vectors are materialized from mmap only when an entity property is actually read. Declare the property as partial; the Vorcyc.Quiver.SourceGenerators analyzer emits a getter that calls LazyVectorAccessor.Materialize(this, "PropertyName").
```csharp
public partial class AudioEntity
{
    [QuiverKey]
    public string Id { get; set; } = "";

    [QuiverVector(1024, DistanceMetric.Cosine, Nullable = true, MemoryMode = VectorMemoryMode.MemoryMapped)]
    public partial float[]? Embedding { get; set; }   // backing field + getter are generated
}
```

Search hot paths still read vectors directly from the mmap region (zero allocation). User code that touches entity.Embedding triggers a one-shot copy out of the mapped view.
Lazy vector source generation requires the vector property and every containing type in its nesting chain to be partial, and the property type must be float[] or float[]?. Invalid declarations produce analyzer diagnostics: QVR001 (property is not partial), QVR002 (containing type chain is not fully partial), or QVR003 (invalid property type).
Add the analyzer to consuming projects:
```xml
<ProjectReference Include="..\Vorcyc.Quiver.SourceGenerators\Vorcyc.Quiver.SourceGenerators.csproj" OutputItemType="Analyzer" ReferenceOutputAssembly="false" />
```
Inline byte[] fields (thumbnails, raw audio, packed features…) used to fatten the EntityMeta segment and inflate working-set memory on load. Annotate them with [QuiverLargeField] and they are written to a separate SegmentKind.Blob segment.
```csharp
public partial class Photo
{
    [QuiverKey]
    public string Id { get; set; } = "";

    [QuiverVector(512, DistanceMetric.Cosine, Nullable = true, MemoryMode = VectorMemoryMode.MemoryMapped)]
    public partial float[]? Embedding { get; set; }

    [QuiverLargeField(Nullable = true, MemoryMode = LargeFieldMemoryMode.PagedCache)]
    public partial byte[]? Thumbnail { get; set; }   // ← lives in its own Blob segment
}
```

[QuiverLargeField] may only be applied to byte[] and is mutually exclusive with [QuiverVector].
Deletes followed by AppendAsync() previously had no on-disk representation. The v4 format adds SegmentKind.Tombstone segments listing dead internal-row ids; loaders filter them out before handing entities to the set.
```csharp
await using var db = new MyDb("data.vdb");
await db.LoadAsync();
db.Faces.RemoveByKey("F0001");
db.Faces.RemoveByKey("F0002");
// Writes ONLY a Tombstone segment; does NOT re-append the live in-memory entities as new segments.
await db.FlushTombstonesAsync();
```

| API | Writes | Use case |
|---|---|---|
| AppendAsync() | New EntityMeta / VectorBlob / Blob segments for all current in-memory entities, plus a Tombstone segment for any pending deletes. | Bulk ingest of new entities + opportunistic delete flushing. |
| FlushTombstonesAsync() | Only a Tombstone segment. | Load → mutate-in-place → flush deletes without re-writing live rows. |
| SaveAsync() | Single defragmented snapshot. All prior tombstones are physically dropped. | Periodic compaction. |
QuiverDbOptions gains three knobs that drive an inline best-effort merge after every AppendAsync / FlushTombstonesAsync:
| Option | Default | Purpose |
|---|---|---|
| EnableBackgroundMerge | false | Master switch. |
| AutoMergeMaxSegments | 32 | Trigger a SaveAsync() once the footer contains at least this many segments. |
| AutoMergeTombstoneRatio | 0.25 | Trigger once tombstones / live ≥ ratio. |
Failures inside auto-merge are swallowed — they never propagate out of the user's AppendAsync call.
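As an illustration of how the two thresholds above combine, here is a minimal sketch of a trigger predicate. `ShouldMerge` is a hypothetical helper, not a library member; the defaults mirror the table, and treating an empty set (`live == 0`) as "no ratio trigger" is an assumption.

```csharp
using System;

Console.WriteLine(ShouldMerge(segmentCount: 40, tombstones: 0, live: 1_000));   // segment threshold reached
Console.WriteLine(ShouldMerge(segmentCount: 5, tombstones: 300, live: 1_000));  // ratio 0.30 >= 0.25
Console.WriteLine(ShouldMerge(segmentCount: 5, tombstones: 100, live: 1_000));  // neither threshold reached

// Returns true when either the segment-count or the tombstone-ratio threshold is met.
static bool ShouldMerge(int segmentCount, long tombstones, long live,
                        int maxSegments = 32, double tombstoneRatio = 0.25)
{
    if (segmentCount >= maxSegments) return true;
    // Assumption: with no live rows the ratio is undefined, so it does not trigger.
    return live > 0 && (double)tombstones / live >= tombstoneRatio;
}
```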
SegmentInfo exposes the new Kind (Mixed / EntityMeta / VectorBlob / Blob / Tombstone), FieldName, and Dim columns. Entity counts are no longer double-counted when a type spans multiple VectorBlob / Blob / Tombstone segments.
```
[FooterTopMagic "QDB2"][SegmentCount u32]
  for each entry:
    [TypeName s][Offset u64][Length u64][EntityCount u32][CRC32 u32]
    [Kind u8][FieldName s][Dim i32][FirstId i32]
[FooterOffset u64][TrailerMagic "QDBE"]
```
"QDBF" (v1) is still read; new files always emit "QDB2".
Builds on the segmented file format and mmap vector storage. Existing raw-float32 VectorBlob segments remain readable; new writes carry a per-segment VectorBlobEncoding byte plus an optional SQ8 scale table. Index topology and public APIs are unchanged.
This update focuses on keeping disk size, managed-heap size, and runtime working set bounded without knowing the source embedding model: field-level quantization and Matryoshka truncation compress vectors on the I/O path, and a heap-byte budget drives automatic Heap → Mmap promotion at runtime.
QuiverVectorAttribute gains a Quantization property. Two encodings are supported today:
| Value | On-disk size | Notes |
|---|---|---|
| VectorQuantization.None (default) | dim × 4B | Raw float32, identical to previous v4 raw-vector segments |
| VectorQuantization.Sq8 | dim × 1B + 4B scale | Per-row SQ8 scalar quantization (int8 + single scale). ≈ 1/4 of raw size on disk. Search-side decode goes through Sq8Codec.DecodeRow into a thread-local buffer, zero allocation. |
```csharp
public partial class FaceFeature
{
    [QuiverKey]
    public string Id { get; set; } = "";

    // SQ8 + Matryoshka: a 1024-dim embedding is indexed/searched on its first 512 dims;
    // on-disk size ≈ 1024×1B + 4B.
    [QuiverVector(1024, DistanceMetric.Cosine,
        Nullable = true,
        MemoryMode = VectorMemoryMode.MemoryMapped,
        Quantization = VectorQuantization.Sq8,
        EffectiveDimensions = 512)]
    public partial float[]? Embedding { get; set; }
}
```

Encoding is persisted per segment in the v4 VectorBlob header (VectorBlobEncoding enum + version byte). MmapVectorStore and BinaryStorageProvider decode each segment using its own metadata, so the upstream embedding model is allowed to be unknown.
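To make "int8 + single scale" concrete, the sketch below quantizes one row and decodes it into a caller-supplied buffer. It is not Sq8Codec: `EncodeRow` and `DecodeRow` here are hypothetical local functions, and symmetric max-abs scaling (scale = max|x| / 127) is an assumption, since the release notes do not say how the scale is chosen.

```csharp
using System;

float[] vector = { 0.5f, -1.0f, 0.25f, 0.0f };
var (quantized, rowScale) = EncodeRow(vector);

Span<float> decoded = stackalloc float[vector.Length];   // reusable buffer, no heap allocation
DecodeRow(quantized, rowScale, decoded);
Console.WriteLine($"{vector.Length * 4} bytes raw, {quantized.Length + 4} bytes as SQ8");

// Encodes one row as int8 codes plus a single float scale (assumed max-abs scaling).
static (sbyte[] Codes, float Scale) EncodeRow(ReadOnlySpan<float> row)
{
    float maxAbs = 0f;
    foreach (float v in row) maxAbs = MathF.Max(maxAbs, MathF.Abs(v));
    float scale = maxAbs == 0f ? 1f : maxAbs / 127f;      // avoid dividing by zero for all-zero rows
    var codes = new sbyte[row.Length];
    for (int i = 0; i < row.Length; i++)
        codes[i] = (sbyte)Math.Clamp(MathF.Round(row[i] / scale), -127f, 127f);
    return (codes, scale);
}

// Decodes into a buffer owned by the caller so repeated calls do not allocate.
static void DecodeRow(ReadOnlySpan<sbyte> codes, float scale, Span<float> destination)
{
    for (int i = 0; i < codes.Length; i++) destination[i] = codes[i] * scale;
}
```

Each decoded component differs from the original by at most half a quantization step (scale / 2) under this scheme, which is the accuracy traded for the roughly 4× smaller row.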
QuiverVectorAttribute.EffectiveDimensions lets you index/search on only the first N dims of a vector without touching the source embedding:
- Write path: when EffectiveDimensions < Dimensions, PrepareVectors copies the first N dims into a fresh array (without mutating the entity's own array) and optionally L2-normalizes before storing/indexing.
- Query path: Search / SearchKnn apply the same truncation and normalization to the query vector so the query geometry matches the store geometry.
- Index topology: all index implementations (Flat / HNSW / IVF / KDTree) operate on EffectiveDimensions; distance-computation cost drops linearly.
Designed for Matryoshka-style embeddings (OpenAI text-embedding-3-large, Nomic, …) and for two-stage "low-dim recall + full-dim rerank" pipelines.
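The truncate-then-normalize step described for the write and query paths can be sketched as follows. `TruncateAndNormalize` is a hypothetical helper written for illustration, not the library's PrepareVectors; it always normalizes, whereas the notes describe normalization as optional.

```csharp
using System;

float[] embedding = { 3f, 4f, 12f, 84f };
float[] stored = TruncateAndNormalize(embedding, effectiveDim: 2);
float[] query = TruncateAndNormalize(new float[] { 6f, 8f, 1f, 1f }, effectiveDim: 2);

// Both vectors now live in the same 2-dim, unit-length geometry.
float dot = stored[0] * query[0] + stored[1] * query[1];
Console.WriteLine($"stored = [{string.Join(", ", stored)}], cosine = {dot}");

// Copies the first effectiveDim components into a fresh array, then L2-normalizes the copy.
static float[] TruncateAndNormalize(ReadOnlySpan<float> source, int effectiveDim)
{
    float[] result = source[..effectiveDim].ToArray();   // the caller's array is not mutated
    float sumSquares = 0f;
    foreach (float v in result) sumSquares += v * v;
    float norm = MathF.Sqrt(sumSquares);
    if (norm > 0f)
        for (int i = 0; i < result.Length; i++) result[i] /= norm;
    return result;
}
```

Applying the identical function to stored vectors and queries is what keeps distances comparable; truncating only one side would compare vectors of different geometry.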
QuiverDbOptions gains two runtime memory controls:
| Option | Default | Purpose |
|---|---|---|
| Vectors.MaxInMemoryBytes | 0 (disabled) | Per-QuiverSet upper bound on in-memory vector payload bytes. |
| Vectors.AutoPromoteToMemoryMapped | false | When the budget is exceeded, automatically promote the set's in-memory vector stores to mmap. |
Flow:
- QuiverSet's Add / AddRange / Upsert write paths call NotifyHeapBytes() at the tail of the write lock, reporting the sum of IVectorStore.HeapByteSize to QuiverDbContext.
- QuiverDbContext (implementing the internal IPromotionCoordinator) checks Vectors.AutoPromoteToMemoryMapped && bytes ≥ Vectors.MaxInMemoryBytes && DatabasePath != null and uses a CAS gate to single-flight one promotion task per entity type.
- The background task runs SaveAsync() (to guarantee on-disk = in-memory), then promotes each InMemory vector field by QuiverSet.PromoteFieldsToMmap(...), binding fresh mmap views over the new VectorBlob segments.
- The swap goes through a new VectorStoreSlot indirection: indices keep their stable slot reference, no index rebuild is required, and the search hot path is never interrupted.
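The "CAS gate" in the flow above can be sketched with `Interlocked.CompareExchange`: many writers may notice the budget is exceeded at once, but only the one that flips the flag from 0 to 1 starts a promotion. The names `gate`, `TryBegin`, and `End` are hypothetical; the real coordinator keeps one such gate per entity type and runs the work on a background task, which this sketch omits.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

int gate = 0;      // 0 = idle, 1 = a promotion is in flight
int started = 0;   // how many callers actually began a promotion

// Simulate 64 concurrent writers that all observe "budget exceeded".
Parallel.For(0, 64, _ =>
{
    if (TryBegin()) Interlocked.Increment(ref started);   // the gate stays closed during this loop
});
Console.WriteLine($"promotions started: {started}");

End();             // promotion finished (or failed): reopen the gate
Console.WriteLine($"can start again: {TryBegin()}");

// Returns true only for the single caller that changes the gate from 0 to 1.
bool TryBegin() => Interlocked.CompareExchange(ref gate, 1, 0) == 0;

// Clears the in-flight flag; a real implementation would call this in a finally block.
void End() => Volatile.Write(ref gate, 0);
```

Clearing the flag in a `finally` is what lets a failed promotion be retried on a later write instead of leaving the gate stuck closed.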
```csharp
var options = new QuiverDbOptions
{
    DatabasePath = "audio.vdb",
    Vectors =
    {
        MemoryMode = GlobalVectorMemoryMode.InMemory,
        MaxInMemoryBytes = 512L * 1024 * 1024,
        AutoPromoteToMemoryMapped = true,
    },
};
```

Promotion failures (e.g. disk unwritable) are logged via Trace.TraceWarning and the in-flight flag is cleared. They never propagate out of the user's write call.
| Member | Namespace | Notes |
|---|---|---|
| VectorQuantization enum | Vorcyc.Quiver.Quantization | Field-level quantization strategy. |
| VectorBlobEncoding enum | Vorcyc.Quiver.Storage | VectorBlob segment encoding version. |
| Sq8Codec | Vorcyc.Quiver.Storage | SQ8 row encode/decode with thread-local buffers. |
| IVectorStore.HeapByteSize | Vorcyc.Quiver.Indexing | Managed-heap bytes currently held by the store. |
| IVectorStore.EffectiveDim | Vorcyc.Quiver.Indexing | Dimension actually used by index/search. |
| QuiverDbOptions.Vectors.MaxInMemoryBytes | same | In-memory vector byte budget. |
| QuiverDbOptions.Vectors.AutoPromoteToMemoryMapped | same | Master switch for auto-promotion. |
| QuiverVectorAttribute.Quantization | Vorcyc.Quiver | Field quantization strategy. |
| QuiverVectorAttribute.EffectiveDimensions | same | Matryoshka truncation target. |
- File format: v4 VectorBlob segments now carry an encoding byte + optional SQ8 scale region, still embedded in the QDB\x04 container and footer schema v2. Existing raw-float32 segments are transparently read.
- Public API: QuiverDbOptions, QuiverVectorAttribute, IVectorStore, `QuiverSet<T>` only gain additive members; existing call sites compile unchanged.
- Indexes: VectorStoreSlot is an internal wrapper inside `QuiverSet<T>`; index implementations are unaware of the swap.
This update removes the need to rebuild large HNSW graphs on every load. SaveAsync() writes an optional SegmentKind.IndexSnapshot segment for indexes that support snapshots; LoadAsync() restores the topology first and only replays ids not covered by the snapshot.
The HNSW snapshot stores the entry point, max level, node levels, per-layer neighbor lists, and the covered NextId, avoiding the O(N log N) graph rebuild normally caused by replaying Add(id) for every vector. For large vector sets, load cost shifts from “rebuild the graph” to “read and deserialize topology”.
Snapshots carry fingerprints for similarity type, HNSW parameters, and effective dimension. If the runtime model, dimension, effective dimension after quantization/truncation, or index parameters do not match, the loader rejects the snapshot and automatically falls back to the previous rebuild path. Old files without IndexSnapshot segments remain fully readable.
When Vectors.MemoryMode = MemoryMapped / Auto, the load pipeline now binds VectorBlob regions to MmapVectorStore before replaying ids not covered by a snapshot. This prevents HNSW from dereferencing mmap vectors too early and throwing KeyNotFoundException: Vector id ... not found in mmap store.
Mmap region matching also accepts both [QuiverEntity("stable-name")] and the legacy Type.FullName alias, so adding a stable entity name no longer causes old v4 vector regions to be skipped silently.
- File format: adds an optional SegmentKind.IndexSnapshot segment. Existing v4 files load unchanged; index types without snapshot support keep rebuilding normally.
- Lazy loading: non-InMemory vector materialization and [QuiverLargeField] large-object loading are unaffected. The snapshot stores index topology only, not entity or vector copies.
- Mmap: snapshot restore and mmap binding remain separate; search hot paths still read vectors directly from mmap.
File Format Compatibility: v3.2.0 is fully backward-compatible with v1.x, v2.x, v3.0.0, and v3.1.0 data files.
| Feature | Description |
|---|---|
| CompactMemory() / CompactMemoryAsync() | Flushes all dirty pages to disk and evicts every loaded page from memory on demand, minimizing the working-set footprint. Exposed on `QuiverSet<T>` (per-collection) and as CompactAllMemoryAsync() on QuiverDbContext (all collections at once). No-op in FullMemory mode. Vector index structures are unaffected. |
File Format Compatibility: v3.1.0 is fully backward-compatible with v1.x, v2.x, and v3.0.0 data files.
| Change | Before (v3.0.0) | After (v3.1.0) |
|---|---|---|
| VectorStorageMode removed | QuiverDbOptions.VectorStorage = VectorStorageMode.MemoryMapped: optional memory-mapped vector arena via MmapVectorStore | Removed entirely. Vectors are always stored on the GC heap (HeapVectorStore). The LazyPaging entity cache already bounds total memory; a separate mmap layer is no longer needed. |
| QuiverSet constructor simplified | Accepted DistanceMetric defaultMetric as a parameter | The defaultMetric parameter is removed. Each vector field independently declares its metric via [QuiverVector(dim, metric)]. |
If you previously set VectorStorage = VectorStorageMode.MemoryMapped in your QuiverDbOptions, simply remove that line — no other changes are required. Data files remain fully compatible.
```csharp
// v3.0.0 (remove the VectorStorage line)
var options = new QuiverDbOptions
{
    DatabasePath = "mydata.vdb",
    // VectorStorage = VectorStorageMode.MemoryMapped,  ← remove this
    EntityCache = EntityCacheMode.LazyPaging,
    MaxCachedPages = 32,
    PageSize = 512
};
```

Same lazy-loading page cache features as 3.0.0, now with a simpler and more consistent architecture.
| Feature | Description |
|---|---|
| Lazy-loading page cache | EntityCache = EntityCacheMode.LazyPaging: entity objects are no longer fully resident in memory. They are split into fixed-size pages (PageSize entities/page), loaded on demand, and evicted via LRU when MaxCachedPages is exceeded. Idle cold pages are serialized to binary .qvpg page files and read back only when accessed. |
| Controllable memory ceiling | Actual entity memory usage is bounded by MaxCachedPages × PageSize × entity size regardless of total dataset size. |
| Vector indexes remain resident | HNSW / IVF / KDTree index structures always stay in memory, so search performance is unaffected by lazy-loading. |
| IsLazyLoading property | `QuiverSet<T>.IsLazyLoading` exposes the current caching mode for diagnostics. |
| Transparent API | `EntityPageCache<T>` presents the same interface as the previous `Dictionary<int, TEntity>`; zero changes required in calling code. |
| Property | Type | Default | Description |
|---|---|---|---|
| EntityCache | EntityCacheMode | FullMemory | Entity caching mode: FullMemory (all entities in memory) / LazyPaging (LRU page cache). Requires DatabasePath for LazyPaging. |
| MaxCachedPages | int | 16 | Max pages kept in memory per QuiverSet. |
| PageSize | int | 512 | Max entities per page. |
```csharp
var options = new QuiverDbOptions
{
    DatabasePath = "mydata.vdb",
    EntityCache = EntityCacheMode.LazyPaging,   // ← enable lazy paging
    MaxCachedPages = 32,                        // at most 32 pages in memory
    PageSize = 512                              // 512 entities per page
    // memory ceiling ≈ 32 × 512 × entity size
};
```

Page files are stored under {DatabasePath}.pages/{EntityTypeName}/page_XXXXXXXX.qvpg (custom binary format, no external dependencies).

Page file binary layout (v1):

```
[4B uint32]  Magic = 0x51565047 ("QVPG" identifier)
[1B byte]    Version = 0x01
[4B int32]   PropCount          ← number of property descriptors
PropDescriptor × PropCount:
  [string]   PropName           ← BinaryWriter length-prefixed UTF-8
[4B int32]   EntityCount        ← entities in this page
Entity × EntityCount:
  [4B int32] InternalId
  per-field (descriptor order):
    [1B bool isNotNull] + value (same type encoding as BinaryStorageProvider)
```
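The LRU eviction behind LazyPaging can be illustrated with a linked list plus a dictionary, the classic O(1) LRU structure. This is a sketch under assumptions, not EntityPageCache: the names `Touch` and `PageOf` are hypothetical, mapping an internal id to a page by integer division is assumed, and flushing the evicted page to its .qvpg file is left as a comment.

```csharp
using System;
using System.Collections.Generic;

const int MaxCachedPages = 2;
var lru = new LinkedList<int>();                          // front = most recently used page id
var nodes = new Dictionary<int, LinkedListNode<int>>();   // page id -> its node in the list
var sync = new object();                                  // guards both structures

Touch(PageOf(10, 512));                     // loads page 0
Touch(PageOf(600, 512));                    // loads page 1
Touch(PageOf(20, 512));                     // page 0 again, now most recent
int? evicted = Touch(PageOf(1500, 512));    // loads page 2, pushing out the coldest page
Console.WriteLine($"evicted page: {evicted}");

// Assumed id-to-page mapping: consecutive internal ids share a page.
int PageOf(int internalId, int pageSize) => internalId / pageSize;

// Marks a page as used; returns the evicted page id, or null when nothing was evicted.
int? Touch(int pageId)
{
    lock (sync)
    {
        if (nodes.TryGetValue(pageId, out var node))
        {
            lru.Remove(node);               // already cached: move to the front
            lru.AddFirst(node);
            return null;
        }
        nodes[pageId] = lru.AddFirst(pageId);
        if (lru.Count <= MaxCachedPages) return null;
        var last = lru.Last ?? throw new InvalidOperationException("LRU list is empty.");
        lru.RemoveLast();                   // least recently used page
        nodes.Remove(last.Value);
        return last.Value;                  // a real cache would flush this page to disk here
    }
}
```

With at most MaxCachedPages entries in the list, resident entity memory stays near MaxCachedPages × PageSize × entity size, which is the ceiling stated in the feature table.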
File Format Compatibility: v3.0.0 is fully backward-compatible with v1.x and v2.x data files.
| Feature | Description |
|---|---|
| Lazy-loading page cache | EntityCache = EntityCacheMode.LazyPaging: entity objects loaded on demand in fixed-size pages, evicted by LRU when MaxCachedPages is exceeded. |
| Controllable memory ceiling | Actual entity memory usage is bounded by MaxCachedPages × PageSize × entity size regardless of total dataset size. |
| Vector indexes remain resident | HNSW / IVF / KDTree index structures always stay in memory; search performance is unaffected by lazy-loading. |
| IsLazyLoading property | `QuiverSet<T>.IsLazyLoading` exposes the current caching mode for diagnostics. |
| Memory-mapped vector storage | VectorStorage = VectorStorageMode.MemoryMapped (introduced in 3.0.0, removed in 3.1.0; see above). |
File Format Compatibility: v2.0.0 is fully backward-compatible with v1.x data files. All three storage formats (JSON / XML / Binary) and WAL files can be loaded without any migration.
| Change | Before (v1.x) | After (v2.0.0) |
|---|---|---|
| Similarity computation | SimilarityFunc delegate | `ISimilarity<T>` static abstract interface: JIT generates specialized machine code per type, zero virtual dispatch |
| Vector data ownership | Each index stores vectors internally | IVectorStore abstraction: indexes only manage topology (graph/tree/inverted list), vectors unified by store |
| Feature | Description |
|---|---|
| 6 new distance metrics | Manhattan (L1), Chebyshev (L∞), Pearson correlation, Hamming, Jaccard, Canberra, plus the original 3 (Cosine / Euclidean / DotProduct), totaling 9 built-in metrics |
| Custom similarity | [QuiverVector(128, CustomSimilarity = typeof(MySimilarity))]: plug in any `ISimilarity<float>` struct |
| IVectorStore abstraction | HeapVectorStore (GC heap): pluggable vector storage backend |
| Improvement | Details |
|---|---|
| SIMD for all metrics | All 9 similarity implementations use internal VectorMath / `Vector<float>` paths, auto-adapting to SSE4 / AVX2 / AVX-512 register width without extra NuGet dependencies |
| Zero-overhead dispatch | `ISimilarity<T>` with static abstract + readonly struct enables JIT to inline TSim.Compute() at call sites, with no delegate indirection |
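The static-abstract pattern behind "zero-overhead dispatch" can be sketched as below. `ISimilaritySketch<T>`, `DotProductSketch`, and `Scorer` are hypothetical stand-ins written for this example; the actual shape of the library's `ISimilarity<T>` (method name, parameter types) is not given in these notes and may differ. The example needs C# 11 or later.

```csharp
using System;

float score = Scorer.Score<DotProductSketch>(new[] { 1f, 2f }, new[] { 3f, 4f });
Console.WriteLine(score);

// Hypothetical stand-in interface, NOT the library's ISimilarity<T> definition.
public interface ISimilaritySketch<T>
{
    static abstract T Compute(ReadOnlySpan<T> a, ReadOnlySpan<T> b);
}

// A readonly struct implementation: the metric is chosen by type, not by instance.
public readonly struct DotProductSketch : ISimilaritySketch<float>
{
    public static float Compute(ReadOnlySpan<float> a, ReadOnlySpan<float> b)
    {
        float sum = 0f;
        for (int i = 0; i < a.Length; i++) sum += a[i] * b[i];
        return sum;
    }
}

public static class Scorer
{
    // TSim is a struct type parameter, so the runtime specializes this method per metric
    // and the call below is a direct static call rather than a delegate invocation.
    public static float Score<TSim>(ReadOnlySpan<float> a, ReadOnlySpan<float> b)
        where TSim : struct, ISimilaritySketch<float>
        => TSim.Compute(a, b);
}
```

Because generic instantiations over value types get their own compiled code, the metric call is resolved at JIT time, which is what makes inlining possible where a delegate would not be.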
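| # | Chapter |
|---|---|
| 01 | Release Notes |
| 02 | Product Overview |
| 03 | Architecture Overview |
| 04 | Quick Start |
| 05 | Core Concepts |
| 06 | Distance Metrics |
| 07 | Index Types |
| 08 | CRUD Operations |
| 09 | Vector Search |
| 10 | Persistent Storage |
| 11 | Migration System |
| 11a | Schema Migration |
| 12 | Multi-Vector Field Support |
| 13 | Thread Safety and Concurrency |
| 14 | Lifecycle Management |
| 15 | Configuration Options |
| 16 | Internal Implementation Details |
| 17 | Complete Examples |
| 18 | API Quick Reference |
| 19 | Usage Recommendations |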