19 Usage Recommendations.md

19. Usage Recommendations

19.1 Recommended Combination

For large, read-mostly databases where Top-K search is the main workload, prefer combining:

new QuiverDbOptions
{
	DatabasePath = "data.vdb",
	Vectors =
	{
		MemoryMode = GlobalVectorMemoryMode.Auto,
		MemoryMapThresholdBytes = 256L * 1024 * 1024,
	},
	LargeFields =
	{
		MemoryMode = GlobalLargeFieldMemoryMode.PagedCache,
		MaxCachedPayloads = 128
	}
};

On the entity side:

public partial class Document
{
	[QuiverKey]
	public string Id { get; set; } = "";

	public string Title { get; set; } = "";

	[QuiverVector(768, DistanceMetric.Cosine, Nullable = true, MemoryMode = VectorMemoryMode.MemoryMapped)]
	[QuiverIndex(VectorIndexType.HNSW)]
	public partial float[]? Embedding { get; set; }

	[QuiverLargeField(Nullable = true, MemoryMode = LargeFieldMemoryMode.PagedCache)]
	public partial byte[]? RawContent { get; set; }
}

This separates four kinds of data: entity objects are held by InMemoryEntityStore, vectors are owned by the InMemory / MemoryMapped store, large byte[] payloads live in dedicated Blob segments, and HNSW graph topology is restored from IndexSnapshot.

19.2 When Lazy Loading Is Useful

Mechanism	Best fit	Benefit
`VectorMemoryMode.MemoryMapped` / `Auto`	Large read-mostly vector sets, especially high-dimensional embeddings	Reads vectors through the OS page cache and reduces managed-heap pressure
`[QuiverVector(MemoryMode = ...)]`	High-dimensional vectors; user code rarely reads `entity.Embedding` directly	Allows non-InMemory vector payload access through generated properties
`LargeFieldMemoryMode.LazyLoad` / `PagedCache`	Thumbnails, original files, audio chunks, or other large `byte[]` fields	Keeps large objects out of `EntityMeta` load paths and materializes on demand
`[QuiverLargeField]`	Per-field large `byte[]` memory behavior	Overrides global large-field memory mode per field
`HNSW + IndexSnapshot`	Large HNSW indexes where load-time graph rebuild is expensive	Restores graph topology and only replays uncovered ids

Lazy loading is most useful for hundreds of thousands to millions of entities, high-dimensional embeddings, search-heavy workloads, and cases where only topK matches are materialized.

19.3 When It Is Not Worth It

Scenario	Recommendation
Small datasets, for example a few thousand rows	Use `VectorMemoryMode.InMemory` and `LargeFieldMemoryMode.InMemory` for simplicity
Every operation reads every large field	Lazy/paged large fields may add I/O and cache-management overhead
Search results always require reading every matched vector payload	non-InMemory vector properties provide less benefit
Write-heavy workloads with frequent `Add` / `Upsert`	Prefer `VectorMemoryMode.InMemory`, then periodically call `SaveAsync`
Low-dimensional vectors or small total vector bytes	`MemoryMapped` / `Auto` has limited benefit
Very small `byte[]` fields	`[QuiverLargeField]` is optional

19.4 Scenario Examples

Small database: keep it simple

For a few thousand rows, low-dimensional vectors, or frequent full-entity iteration, the simple in-memory setup is usually enough:

var options = new QuiverDbOptions
{
	DatabasePath = "small.vdb",
	Vectors = { MemoryMode = GlobalVectorMemoryMode.InMemory },
	LargeFields = { MemoryMode = GlobalLargeFieldMemoryMode.InMemory }
};

public class Note
{
	[QuiverKey]
	public string Id { get; set; } = "";

	[QuiverVector(384, DistanceMetric.Cosine)]
	public float[] Embedding { get; set; } = [];
}

Large search database: mmap vectors + lazy large fields + HNSW snapshot

Good for face databases, RAG document stores, image vector stores, and other search-heavy workloads where only topK matches are materialized:

var options = new QuiverDbOptions
{
	DatabasePath = "vectors.vdb",
	Vectors = { MemoryMode = GlobalVectorMemoryMode.MemoryMapped },
	LargeFields = { MemoryMode = GlobalLargeFieldMemoryMode.PagedCache }
};

public partial class FaceFeature
{
	[QuiverKey]
	public string Id { get; set; } = "";

	[QuiverVector(512, DistanceMetric.Cosine, Nullable = true, MemoryMode = VectorMemoryMode.MemoryMapped)]
	[QuiverIndex(VectorIndexType.HNSW, M = 32, EfConstruction = 300, EfSearch = 100)]
	public partial float[]? Embedding { get; set; }
}

The first SaveAsync() writes the HNSW IndexSnapshot; later LoadAsync() restores the graph topology while search still reads vectors from the mmap vector store.

Large fields only: enable only large-field lazy loading

If scalar entities are modest but byte[] payloads are large, lazy-load only the large fields:

var options = new QuiverDbOptions
{
	DatabasePath = "catalog.vdb",
	Vectors = { MemoryMode = GlobalVectorMemoryMode.InMemory },
	LargeFields = { MemoryMode = GlobalLargeFieldMemoryMode.LazyLoad }
};

Large vectors only: use `MemoryMapped` / `Auto`

If entities are lightweight but vector bytes are large, move vectors out of the managed heap:

var options = new QuiverDbOptions
{
	DatabasePath = "embeddings.vdb",
	Vectors =
	{
		MemoryMode = GlobalVectorMemoryMode.Auto,
		MemoryMapThresholdBytes = 128L * 1024 * 1024,
	},
	LargeFields = { MemoryMode = GlobalLargeFieldMemoryMode.InMemory }
};

Large binary fields: use `[QuiverLargeField]`

For thumbnails, original files, audio chunks, or other large byte[] fields, split the payload into dedicated Blob segments:

public partial class Photo
{
	[QuiverKey]
	public string Id { get; set; } = "";

	[QuiverVector(768, DistanceMetric.Cosine, Nullable = true, MemoryMode = VectorMemoryMode.MemoryMapped)]
	public partial float[]? Embedding { get; set; }

	[QuiverLargeField(Nullable = true, MemoryMode = LargeFieldMemoryMode.PagedCache)]
	public partial byte[]? Thumbnail { get; set; }
}

Write-heavy ingestion: use `InMemory` first, then compact periodically

For frequent writes, keep the write path simple and explicitly persist/compact between batches:

using var db = new MyDb(new QuiverDbOptions
{
	DatabasePath = "ingest.vdb",
	Vectors = { MemoryMode = GlobalVectorMemoryMode.InMemory },
	LargeFields = { MemoryMode = GlobalLargeFieldMemoryMode.InMemory }
});

foreach (var batch in batches)
{
	db.Documents.AddRange(batch);
	await db.AppendAsync();
	db.Documents.Clear();
}

await db.SaveAsync();

For AppendAsync() + Clear() batch ingestion, prefer synchronous using; DisposeAsync() only performs a full SaveAsync() when SaveOnDispose = true.

19.5 How the Pieces Relate

Entities are always stored by InMemoryEntityStore; large-scale memory control is focused on vector and large-field payloads.
[QuiverVector(MemoryMode = ...)] controls vector payload memory behavior for that field.
VectorMemoryMode.MemoryMapped / Auto controls where vector data lives and is the key setting for reducing managed-heap pressure.
[QuiverLargeField] moves large byte[] fields into separate blob segments and can enable lazy/paged access.
IndexSnapshot stores HNSW topology only, not entities or vector copies, so it does not break lazy-loading semantics.

In short: large dataset + large vectors/large fields + many searches + few materialized fields is where MemoryMapped/Auto vectors + LazyLoad/PagedCache large fields + HNSW Snapshot is most useful.

English

#	Chapter
01	Release Notes
02	Product Overview
03	Architecture Overview
04	Quick Start
05	Core Concepts
06	Distance Metrics
07	Index Types
08	CRUD Operations
09	Vector Search
10	Persistent Storage
11	Migration System
11a	Schema Migration
12	Multi-Vector Field Support
13	Thread Safety and Concurrency
14	Lifecycle Management
15	Configuration Options
16	Internal Implementation Details
17	Complete Examples
18	API Reference Cheat Sheet
19	Usage Recommendations

简体中文

#	章节
01	版本说明
02	产品概述
03	架构概述
04	快速开始
05	核心概念
06	距离度量
07	索引类型
08	CRUD 操作
09	向量搜索
10	持久化存储
11	迁移系统
11a	模式迁移
12	多向量字段支持
13	线程安全与并发
14	生命周期管理
15	配置选项
16	内部实现细节
17	完整示例
18	API 参考速查表
19	使用建议

Uh oh!

19 Usage Recommendations.md