SemanticCache.drop_by_filter (and SearchIndex.drop_documents_matching): filter-driven deletion as a first-class API #599

## Summary

Scope-targeted cache invalidation (drop everything tagged `policy_version=v3`, drop everything for `tenant=acme`, drop a stale corpus version) is a primary use case for `SemanticCache`. There is no first-class API for it. Today users must:

1. Run a `FilterQuery` against `cache.index` to enumerate matching entries
2. Paginate manually (top-K is capped per call)
3. Collect entry IDs / keys
4. Call `cache.drop(ids=...)` in chunks, respecting cluster hash-slot boundaries

This is non-trivial code that every customer rewrites, and getting it wrong silently leaves stale entries behind or stalls the server.
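For illustration, the four manual steps above can be sketched without a running server. `InMemoryIndex`, its `search`/`delete` methods, and `drop_matching` are hypothetical stand-ins invented for this example (they are not RedisVL classes), the filter is modeled as a plain Python predicate, and hash-slot grouping is left out because the stand-in is a single node. The sketch enumerates all matches before deleting anything, which is one way to avoid the skipped-entry bug that offset pagination can hit when results shift under deletion.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

Doc = Dict[str, str]


@dataclass
class InMemoryIndex:
    """Hypothetical stand-in for a search index; not a RedisVL class."""

    docs: Dict[str, Doc] = field(default_factory=dict)
    max_page: int = 3  # stands in for the per-call top-K cap

    def search(self, predicate: Callable[[Doc], bool], offset: int, limit: int) -> List[str]:
        """Return up to `limit` matching keys, starting at `offset`."""
        limit = min(limit, self.max_page)
        matches = sorted(k for k, d in self.docs.items() if predicate(d))
        return matches[offset:offset + limit]

    def delete(self, keys: List[str]) -> int:
        """Delete the given keys and return how many actually existed."""
        return sum(1 for k in keys if self.docs.pop(k, None) is not None)


def drop_matching(
    index: InMemoryIndex,
    predicate: Callable[[Doc], bool],
    page_size: int = 3,
    chunk_size: int = 2,
) -> int:
    # Steps 1-3: enumerate every match first, so that deletions cannot
    # shift later pages and cause entries to be skipped.
    keys: List[str] = []
    offset = 0
    while True:
        page = index.search(predicate, offset, page_size)
        if not page:
            break
        keys.extend(page)
        offset += len(page)

    # Step 4: delete in bounded chunks.
    deleted = 0
    for start in range(0, len(keys), chunk_size):
        deleted += index.delete(keys[start:start + chunk_size])
    return deleted


index = InMemoryIndex(
    {f"cache:{i}": {"tenant": "acme" if i % 2 == 0 else "globex"} for i in range(10)}
)
deleted = drop_matching(index, lambda doc: doc["tenant"] == "acme")
print(deleted, sorted(index.docs))
```

Even in this toy form the loop has several places to go wrong (the page cap, the offset bookkeeping, the chunk boundaries), which is the friction the proposed API would absorb.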

## Proposed API

```python
from redisvl.query.filter import FilterExpression

# Lower-level: proposed method on SearchIndex
def drop_documents_matching(
    self,
    filter_expression: FilterExpression,
    batch_size: int = 5_000,
) -> int:
    """Delete every document in this index matching `filter_expression`.

    Paginates internally. On Redis Cluster, batches are split per hash slot.
    Returns the total number of documents deleted.
    """

# Higher-level: proposed method on SemanticCache, delegating to the index
def drop_by_filter(self, filter_expression: FilterExpression, batch_size: int = 5_000) -> int:
    return self._index.drop_documents_matching(filter_expression, batch_size=batch_size)
```

The async equivalents would be a matching method on `AsyncSearchIndex` and `SemanticCache.adrop_by_filter`.
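The docstring's "batches are split per hash slot" step matters because a multi-key delete on Redis Cluster is rejected unless all keys map to the same slot. A minimal sketch of that grouping follows, assuming the standard Redis Cluster key-to-slot rule (CRC16/XMODEM of the key, or of its hash tag, modulo 16384). The helper names `key_slot` and `group_by_slot` are hypothetical and not part of RedisVL or the proposal.

```python
import binascii
from collections import defaultdict
from typing import Dict, Iterable, List

NUM_SLOTS = 16384


def key_slot(key: str) -> int:
    """Compute the cluster hash slot for a key, honoring hash tags."""
    data = key.encode("utf-8")
    start = data.find(b"{")
    if start != -1:
        end = data.find(b"}", start + 1)
        # Only a non-empty tag between the first '{' and the next '}' counts.
        if end != -1 and end > start + 1:
            data = data[start + 1:end]
    # binascii.crc_hqx with an initial value of 0 is CRC16/XMODEM.
    return binascii.crc_hqx(data, 0) % NUM_SLOTS


def group_by_slot(keys: Iterable[str]) -> Dict[int, List[str]]:
    """Partition keys so that each group can go into a single delete call."""
    groups: Dict[int, List[str]] = defaultdict(list)
    for key in keys:
        groups[key_slot(key)].append(key)
    return dict(groups)


batch = ["{acme}:entry:1", "{acme}:entry:2", "cache:abc", "cache:def"]
for slot, slot_keys in sorted(group_by_slot(batch).items()):
    print(slot, slot_keys)
```

Keys sharing a hash tag such as `{acme}` always land in one group, so a key layout that tags entries by tenant would keep per-tenant invalidation to a single slot.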

## Why this belongs in RedisVL

Redis itself has no `FT.DROP_DOCS_WHERE` primitive — paginated query + key delete is the only path. That is exactly the kind of friction RedisVL exists to hide. Today the abstraction stops one step short of where customers actually need it.

## Open question

Whether `batch_size` should be exposed or hidden. A reasonable default (5–10K) covers most cases; an advanced override is useful for latency-sensitive deployments sharing a database with online traffic.
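To make the trade-off concrete, here is a small arithmetic sketch of how `batch_size` changes the shape of the work for a fixed number of matches. `plan_batches` is a hypothetical helper for illustration only, and the figure of 23,000 matching entries is an arbitrary example, not a measurement.

```python
from typing import List


def plan_batches(total_matches: int, batch_size: int) -> List[int]:
    """Return the size of each query-and-delete batch for a given total."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    full, rest = divmod(total_matches, batch_size)
    return [batch_size] * full + ([rest] if rest else [])


for size in (1_000, 5_000, 10_000):
    plan = plan_batches(23_000, size)
    print(f"batch_size={size}: {len(plan)} batches, largest={max(plan)}")
```

For 23,000 matches this gives 23 batches at 1,000, 5 batches at 5,000, and 3 batches at 10,000. Smaller batches mean more round trips but shorter individual commands, which is the knob a deployment sharing a database with online traffic would want to turn.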
