Current behavior
SemanticCacheIndexSchema.from_params (redisvl/extensions/cache/llm/schema.py:115) hardcodes the vector field's algorithm to "flat":
{
    "name": CACHE_VECTOR_FIELD_NAME,
    "type": "vector",
    "attrs": {
        "dims": vector_dims,
        "datatype": dtype,
        "distance_metric": "cosine",
        "algorithm": "flat",
    },
},
There is no constructor argument on SemanticCache to override this. FLAT performs exact (brute-force) KNN, so query cost grows linearly with the number of entries: it is fine up to the low millions, but latency degrades beyond that. For large multi-tenant deployments or per-tenant player-facing caches (10M–100M entries), HNSW is required. Today the only workaround is to bypass SemanticCache entirely and build the index manually.
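A back-of-envelope estimate may help show why exact search stops scaling. A FLAT query compares the query vector against every stored vector, so the vector data read per query grows linearly with entry count. The sketch below assumes 1536-dimensional float32 embeddings (an illustrative choice, not something SemanticCache fixes) and ignores index overhead; `flat_scan_bytes` is a hypothetical helper, not part of redisvl.

```python
# Rough estimate of the vector data read by one exact (FLAT) KNN query.
# Assumptions (illustrative only): 1536-dim float32 embeddings, no index overhead.

def flat_scan_bytes(num_entries: int, dims: int = 1536, bytes_per_value: int = 4) -> int:
    """Bytes of vector data a brute-force query reads: one full pass over all vectors."""
    return num_entries * dims * bytes_per_value


if __name__ == "__main__":
    for n in (1_000_000, 10_000_000, 100_000_000):
        gigabytes = flat_scan_bytes(n) / 1e9
        print(f"{n:>11,} entries: {gigabytes:.1f} GB of vectors scanned per query")
```

Under these assumptions, a 10M-entry cache reads roughly 61 GB of vector data per lookup, which is the linear growth that an approximate index such as HNSW avoids.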
Proposed API
SemanticCache(
    name="cache",
    redis_url=...,
    vector_index_config={
        "algorithm": "hnsw",  # or "flat" (default)
        "m": 16,
        "ef_construction": 200,
        "ef_runtime": 10,
    },
)
A simpler form taking just algorithm: Literal["flat", "hnsw"] = "flat" plus a default HNSW parameter preset would also be acceptable for a first cut.
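One way the schema builder could honor such an option is sketched below. This is not redisvl code: `build_vector_field`, `HNSW_PRESET`, and the placeholder value for `CACHE_VECTOR_FIELD_NAME` are hypothetical, and the preset simply reuses the HNSW values from the proposed call above.

```python
from typing import Any, Dict, Optional

# Placeholder for the library constant referenced in schema.py; the real value may differ.
CACHE_VECTOR_FIELD_NAME = "vector"

# Hypothetical default preset, reusing the values from the proposal.
HNSW_PRESET: Dict[str, Any] = {"m": 16, "ef_construction": 200, "ef_runtime": 10}


def build_vector_field(
    vector_dims: int,
    dtype: str = "float32",
    vector_index_config: Optional[Dict[str, Any]] = None,
) -> Dict[str, Any]:
    """Return a vector field definition, defaulting to today's FLAT behavior."""
    config = dict(vector_index_config or {})  # copy so the caller's dict is not mutated
    algorithm = str(config.pop("algorithm", "flat")).lower()
    if algorithm not in ("flat", "hnsw"):
        raise ValueError(f"unsupported algorithm: {algorithm!r}")

    attrs: Dict[str, Any] = {
        "dims": vector_dims,
        "datatype": dtype,
        "distance_metric": "cosine",
        "algorithm": algorithm,
    }
    if algorithm == "hnsw":
        # Start from the preset, then let caller-supplied values override it.
        attrs.update({**HNSW_PRESET, **config})
    elif config:
        # For simplicity, this sketch rejects any extra options alongside "flat".
        raise ValueError(f"options not accepted with flat in this sketch: {sorted(config)}")

    return {"name": CACHE_VECTOR_FIELD_NAME, "type": "vector", "attrs": attrs}
```

Omitting `vector_index_config` reproduces the current FLAT definition, so existing callers would see no change. Passing only `{"algorithm": "hnsw"}` would pick up the preset, which corresponds to the simpler first-cut form.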
Compatibility notes
The algorithm is fixed at index creation; existing FLAT caches cannot be hot-swapped to HNSW. Customers wanting to switch must build a new cache (new name), re-warm, and cut over — same migration shape as a schema change. This is acceptable as long as the option exists at construction time.
Notes
Surfaced while writing a scoped semantic caching architecture spec for customers running large-scale, multi-tenant deployments.