Summary
Scope-targeted cache invalidation (drop everything tagged `policy_version=v3`, drop everything for `tenant=acme`, drop a stale corpus version) is a primary use case for `SemanticCache`. There is no first-class API for it. Today users must:

- Run a `FilterQuery` against `cache.index` to enumerate matching entries
- Paginate manually (top-K is capped per call)
- Collect entry IDs / keys
- Call `cache.drop(ids=...)` in chunks, respecting cluster hash-slot boundaries

This is non-trivial code that every customer rewrites, and getting it wrong silently leaves stale entries behind or stalls the server.
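The manual loop described above can be sketched generically. This is illustrative only: `fetch_page` and `delete_keys` are hypothetical stand-ins for the real RedisVL calls (a `FilterQuery` against `cache.index`, then `cache.drop`), factored out so the pagination-and-delete pattern is visible on its own.

```python
from typing import Callable, List

def drop_matching(
    fetch_page: Callable[[int, int], List[str]],  # (offset, limit) -> matching keys
    delete_keys: Callable[[List[str]], None],
    page_size: int = 1_000,
) -> int:
    """Paginate through matching entries and delete them in chunks.

    Returns the total number of entries deleted.
    """
    deleted = 0
    while True:
        # Always re-fetch from offset 0: deleting the previous page shifts
        # the remaining results down, so advancing the offset would skip keys.
        keys = fetch_page(0, page_size)
        if not keys:
            break
        delete_keys(keys)
        deleted += len(keys)
    return deleted
```

The offset-0 re-fetch is the detail customers most often get wrong: paginating with an advancing offset while deleting silently skips half the matches.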
Proposed API
```python
# Lower-level: SearchIndex
def drop_documents_matching(
    self,
    filter_expression: FilterExpression,
    batch_size: int = 5_000,
) -> int:
    """Delete every document in this index matching `filter_expression`.

    Paginates internally. On Redis Cluster, batches are split per hash slot.
    Returns the total number of documents deleted.
    """

# Higher-level: SemanticCache
def drop_by_filter(
    self, filter_expression: FilterExpression, batch_size: int = 5_000
) -> int:
    return self._index.drop_documents_matching(filter_expression, batch_size)
```
Async equivalents on `AsyncSearchIndex` / `SemanticCache.adrop_by_filter`.
Why this belongs in RedisVL
Redis itself has no `FT.DROP_DOCS_WHERE` primitive — paginated query + key delete is the only path. That is exactly the kind of friction RedisVL exists to hide. Today the abstraction stops one step short of where customers actually need it.
Open question
Whether `batch_size` should be exposed or hidden. A reasonable default (5–10K) covers most cases; an advanced override is useful for latency-sensitive deployments sharing a database with online traffic.