Add filter support for VSIM vector search #1570

Merged
hailangx merged 51 commits into main from haixu/vector-filter-postprocessing
Mar 16, 2026
Conversation

@hailangx
Member

@hailangx hailangx commented Feb 20, 2026

Adds post-filter support for VSIM vector search results by introducing a JSON-attribute filter expression engine and integrating it into VectorManager after similarity search.

Changes:

  • Introduces a filter module that evaluates filter expressions over JSON attributes.
  • Integrates post-filtering into VectorManager for both value-based and element-based similarity search paths.
  • Adds unit tests for the filter engine and RESP integration tests for VSIM ... FILTER ....
  • Adds a microbenchmark for the filter module to measure memory allocation.

The supported syntax is documented under "Vector Filter Expressions (VSIM ... FILTER)" in website/docs/dev/vector-sets.md.

Key design constraint: Zero heap allocation in the per-candidate evaluation loop. All buffers are borrowed from the session-local ScratchBufferBuilder (~9 KB), a pinned byte[] that persists for the session's lifetime. After the first VSIM FILTER query, the buffer is already large enough — subsequent calls have zero allocation cost and zero GC pressure.
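The grow-once, reuse-afterwards pattern described above can be sketched roughly as follows. This is a Python illustration only: `ScratchBuffer`, `rent`, and `run_query` are hypothetical names, not Garnet's `ScratchBufferBuilder` API, and Python cannot demonstrate true zero-allocation behaviour (slicing a `memoryview` still creates small objects). It only shows that the backing store is sized once and then borrowed by later queries.

```python
class ScratchBuffer:
    """Hypothetical stand-in for a session-local scratch buffer."""

    def __init__(self) -> None:
        self._buf = bytearray()
        self.grow_count = 0  # number of times the backing store was replaced

    def rent(self, size: int) -> memoryview:
        """Return a view of `size` bytes, growing the backing store only if needed."""
        if size > len(self._buf):
            self._buf = bytearray(size)
            self.grow_count += 1
        return memoryview(self._buf)[:size]


def run_query(scratch: ScratchBuffer, candidates: int, bytes_per_candidate: int) -> None:
    # One rent per query; the per-candidate loop only takes slices of the rented view.
    view = scratch.rent(candidates * bytes_per_candidate)
    for i in range(candidates):
        slot = view[i * bytes_per_candidate:(i + 1) * bytes_per_candidate]
        slot[0] = 1  # placeholder for the real per-candidate work


scratch = ScratchBuffer()
run_query(scratch, candidates=100, bytes_per_candidate=90)  # first query grows the buffer
run_query(scratch, candidates=100, bytes_per_candidate=90)  # second query reuses it
```

After both calls `scratch.grow_count` is 1, which is the property the design constraint relies on: only the first filtered query pays for sizing the buffer.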

Design doc: website/docs/dev/post_filter_design.md

Follow ups:

  • Use a SimdJSON-style approach to extract fields.
  • Improve recall for the low-selectivity case: integrate the beta filter search algorithm and skip copying non-matching results during graph traversal, with benchmarks to measure the outcome.

   Implement JSON-path-based filter expressions that are evaluated against
   vector element attributes after similarity search. The filter engine
   includes a tokenizer, expression parser, and evaluator supporting
   comparison operators, logical operators (and/or/not), arithmetic,
   string equality, containment (in), and parenthesized grouping.

   Integrate post-filtering into VectorManager for both VSIM code paths,
   rejecting requests that specify a filter without WITHATTRIBS.
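As a rough illustration of the tokenizer, parser, and evaluator stages named above, here is a minimal Python sketch covering only comparisons, `and`/`or`/`not`, and parentheses. It assumes a Redis-style `.field` selector syntax and uses hypothetical names (`tokenize`, `Parser`, `evaluate`, `matches`); it is not the Garnet implementation and makes no attempt to avoid allocation.

```python
import json
import operator
import re

CMP = {"==": operator.eq, "!=": operator.ne, ">": operator.gt,
       ">=": operator.ge, "<": operator.lt, "<=": operator.le}

TOKEN_RE = re.compile(r"""
    \s*(?:
        (?P<num>\d+(?:\.\d+)?)
      | "(?P<str>[^"]*)"
      | \.(?P<field>[A-Za-z_]\w*)
      | (?P<op>==|!=|>=|<=|>|<|\(|\))
      | (?P<kw>and|or|not)\b
    )""", re.VERBOSE)


def tokenize(expr):
    """Split the expression into (kind, value) pairs."""
    tokens, pos, expr = [], 0, expr.rstrip()
    while pos < len(expr):
        m = TOKEN_RE.match(expr, pos)
        if m is None:
            raise ValueError(f"unexpected input at offset {pos}")
        kind = m.lastgroup
        text = m.group(kind)
        tokens.append((kind, float(text) if kind == "num" else text))
        pos = m.end()
    return tokens


class Parser:
    """Recursive descent; precedence from loosest to tightest: or, and, not, comparison."""

    def __init__(self, tokens):
        self.tokens, self.i = tokens, 0

    def peek(self):
        return self.tokens[self.i] if self.i < len(self.tokens) else (None, None)

    def take(self):
        tok = self.peek()
        self.i += 1
        return tok

    def parse_logical(self, keyword, parse_operand):
        node = parse_operand()
        while self.peek() == ("kw", keyword):
            self.take()
            node = (keyword, node, parse_operand())
        return node

    def parse_or(self):
        return self.parse_logical("or", self.parse_and)

    def parse_and(self):
        return self.parse_logical("and", self.parse_not)

    def parse_not(self):
        if self.peek() == ("kw", "not"):
            self.take()
            return ("not", self.parse_not())
        return self.parse_cmp()

    def parse_cmp(self):
        left = self.parse_primary()
        kind, text = self.peek()
        if kind == "op" and text in CMP:
            self.take()
            return (text, left, self.parse_primary())
        return left

    def parse_primary(self):
        kind, text = self.take()
        if (kind, text) == ("op", "("):
            node = self.parse_or()
            if self.take() != ("op", ")"):
                raise ValueError("expected ')'")
            return node
        if kind in ("num", "str"):
            return ("lit", text)
        if kind == "field":
            return ("field", text)
        raise ValueError(f"unexpected token {text!r}")


def evaluate(node, attrs):
    """Walk the tuple-based AST against a dict of attributes."""
    tag = node[0]
    if tag == "lit":
        return node[1]
    if tag == "field":
        return attrs.get(node[1])  # missing fields evaluate to None
    if tag == "not":
        return not evaluate(node[1], attrs)
    if tag == "and":
        return bool(evaluate(node[1], attrs)) and bool(evaluate(node[2], attrs))
    if tag == "or":
        return bool(evaluate(node[1], attrs)) or bool(evaluate(node[2], attrs))
    try:
        return CMP[tag](evaluate(node[1], attrs), evaluate(node[2], attrs))
    except TypeError:
        return False  # incomparable operands, e.g. a missing field against a number


def matches(expr, attr_json):
    parser = Parser(tokenize(expr))
    tree = parser.parse_or()
    if parser.i != len(parser.tokens):
        raise ValueError("trailing tokens")
    return bool(evaluate(tree, json.loads(attr_json)))
```

For example, `matches('.year > 1950 and .genre == "action"', '{"year": 1984, "genre": "action"}')` evaluates the comparison pair against the element's JSON attributes. Arithmetic and `in` are left out to keep the sketch short.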
@hailangx hailangx marked this pull request as ready for review February 20, 2026 00:54
Copilot AI review requested due to automatic review settings February 20, 2026 00:54
Contributor

Copilot AI left a comment

Pull request overview

Adds post-filter support for VSIM vector search results by introducing a JSON-attribute filter expression engine and integrating it into VectorManager after similarity search.

Changes:

  • Introduces a tokenizer/parser/evaluator for filter expressions over JSON attributes (and/or/not, comparisons, arithmetic, in, grouping).
  • Integrates post-filtering into VectorManager for both value-based and element-based similarity search paths.
  • Adds unit tests for the filter engine and RESP integration tests for VSIM ... FILTER ....

Reviewed changes

Copilot reviewed 7 out of 7 changed files in this pull request and generated 6 comments.

| File | Description |
| --- | --- |
| libs/server/Resp/Vector/VectorManager.cs | Applies post-filtering to VSIM results and evaluates expressions against per-element attributes. |
| libs/server/Resp/Vector/Filter/VectorFilterTokenizer.cs | Tokenizes filter expressions into numbers/strings/identifiers/operators/keywords. |
| libs/server/Resp/Vector/Filter/VectorFilterParser.cs | Parses tokens into an expression AST with operator precedence. |
| libs/server/Resp/Vector/Filter/VectorFilterExpression.cs | Defines AST node types for literals, member access, unary, and binary ops. |
| libs/server/Resp/Vector/Filter/VectorFilterEvaluator.cs | Evaluates the AST against JsonElement attribute data. |
| test/Garnet.test/VectorFilterTests.cs | Unit tests for tokenizer/parser/evaluator behavior. |
| test/Garnet.test/RespVectorSetTests.cs | Adds RESP-level tests verifying VSIM filtering behavior. |

Contributor

Copilot AI commented Feb 20, 2026

@hailangx I've opened a new pull request, #1571, to work on those changes. Once the pull request is ready, I'll request review from you.

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Contributor

Copilot AI commented Feb 20, 2026

@hailangx I've opened a new pull request, #1572, to work on those changes. Once the pull request is ready, I'll request review from you.

Copilot AI and others added 2 commits February 19, 2026 17:09
* Initial plan

* Avoid per-result allocation in EvaluateFilter by using Utf8JsonReader with ParseValue

Co-authored-by: hailangx <3389245+hailangx@users.noreply.github.com>

---------

Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Co-authored-by: hailangx <3389245+hailangx@users.noreply.github.com>
…ly (#1572)

* Initial plan

* Fetch attributes internally for filtering when not returning them

Co-authored-by: hailangx <3389245+hailangx@users.noreply.github.com>

---------

Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Co-authored-by: hailangx <3389245+hailangx@users.noreply.github.com>
@harsha-simhadri

Can you link the specification of the expression syntax you are implementing here?

@hailangx
Member Author

@microsoft-github-policy-service agree company="Microsoft"

@hailangx
Member Author

> Can you link the specification of the expression syntax you are implementing here?

Added to the vector set document.

Contributor

@kevin-montrose kevin-montrose left a comment

I think some fundamental reworking is needed here: exceptions and allocations need to go. I've left a bunch of guideline comments to get us pointed in the right direction. It's not quite an exhaustive review; there are minor optimizations and style things we can revisit later when we're closer to mergeable.

@kevin-montrose kevin-montrose dismissed their stale review March 6, 2026 17:47

Resetting to unblock after discussion.

@kevin-montrose
Contributor

For future reference, a slightly cleaned up version of the plan we discussed offline:


FILTER is structured kind of like a pipeline:

  1. Take in FILTER, extract relevant fields
  2. After "normal" part of search, we have all the attribute data in a contiguous blob (that's what FetchVectorElementAttributes produces)
    • Matching Redis, we over-select when we're going to post-filter (using the FILTER-EF if specified, and COUNT * 100 if not)
  3. We "extract" the whole blob to produce a set of (count fields): [(field offset) (field length)] ... data; we only include fields we found to care about in #1, and malformed attributes produce some special count (presumably -1)
    • We can right-size the buffer here because we know each attribute has AT MOST one instance each of the relevant fields
    • In the case the same field is specified multiple times (and thus we overflow) we can ignore the attribute
    • This part is amenable to SimdJSON-esque parsing, where we track delimiters in bitmaps and process many characters at once rather than going char-by-char like Redis does
  4. We apply the filter built in #1 against the result of #3, producing a bitmap of matching elements
  5. We use that bitmap in RespServerSessionVectors to write the final results out
    • Importantly, we don't compact or otherwise move element or attribute data once it lands in the result buffers

While generally we do not want early-exits in this code (batching and SIMD want very little control flow to be beneficial), it makes sense to check after every COUNT elements have been compared against the filter to see if we're done.
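The pipeline above can be sketched in Python under some loud simplifications: `json.loads` stands in for the offset/length field table and any SimdJSON-style extraction, the bitmap is a plain integer, and every name here (`candidate_budget`, `extract_fields`, `build_bitmap`, `write_results`, `MALFORMED`) is hypothetical rather than taken from Garnet.

```python
import json

MALFORMED = -1  # assumed sentinel for attributes that fail to parse


def candidate_budget(count, filter_ef=None):
    """Over-select: FILTER-EF when given, otherwise COUNT * 100."""
    return filter_ef if filter_ef is not None else count * 100


def extract_fields(blob, spans, wanted):
    """Step 3: per element, keep only the fields the filter refers to."""
    table = []
    for offset, length in spans:
        try:
            obj = json.loads(blob[offset:offset + length])
        except ValueError:
            obj = None
        if isinstance(obj, dict):
            table.append({name: obj[name] for name in wanted if name in obj})
        else:
            table.append(MALFORMED)
    return table


def build_bitmap(table, predicate, count):
    """Step 4: set bit i when element i passes; check for completion once per batch."""
    bitmap, matched = 0, 0
    for start in range(0, len(table), count):
        for i in range(start, min(start + count, len(table))):
            fields = table[i]
            if isinstance(fields, dict) and predicate(fields):
                bitmap |= 1 << i
                matched += 1
        if matched >= count:  # early exit only at batch boundaries
            break
    return bitmap


def write_results(ids, bitmap, count):
    """Step 5: emit matching ids in order; the source data is never compacted."""
    out = []
    for i, element_id in enumerate(ids):
        if (bitmap >> i) & 1:
            out.append(element_id)
            if len(out) == count:
                break
    return out


attrs = [b'{"year": 1984}', b'{"year": 1940}', b'not json', b'{"year": 2001}']
blob = b"".join(attrs)  # contiguous attribute blob, as in step 2
spans, offset = [], 0
for a in attrs:
    spans.append((offset, len(a)))
    offset += len(a)

table = extract_fields(blob, spans, wanted=("year",))
bitmap = build_bitmap(table, lambda f: f.get("year", 0) > 1950, count=2)
top = write_results(["a", "b", "c", "d"], bitmap, count=2)
```

The inner loop has no early exit; the only completion check sits at the batch boundary, which mirrors the point about keeping control flow out of the hot path.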


Minor thoughts after a quick glance before vacation:

  • Probably not worth putting FILTER code in a separate folder / namespace; put it all in the same level as VectorManager, and hide any FILTER-specific types/code in a new VectorManager partial
  • Avoid static constructors, they have gnarly runtime implications
  • Probably want a union-ish thing for tokens rather than a fat structure, and don't want any cases where we allocate strings or arrays - structure so that stuff can be punned spans
  • Similar for extracted fields, we can work in terms of offsets into the input FILTER rather than allocating strings
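One way to read the last two points is that a token can be just a kind plus an offset and length into the original FILTER bytes, so no token text is ever copied. The Python sketch below illustrates that shape only; `Kind` and `tokenize_spans` are hypothetical names, the `.field` selector syntax is an assumption, and Python tuples are of course not the allocation-free union type the comment has in mind.

```python
from enum import IntEnum


class Kind(IntEnum):
    FIELD = 0
    NUMBER = 1
    STRING = 2
    WORD = 3      # and / or / not / in
    OPERATOR = 4


def tokenize_spans(expr: bytes):
    """Return (kind, offset, length) triples that point back into `expr`."""
    tokens, i, n = [], 0, len(expr)
    while i < n:
        if expr[i] in b" \t":
            i += 1
            continue
        start = i
        if expr[i] == ord("."):                      # field selector such as .year
            i += 1
            while i < n and (expr[i:i + 1].isalnum() or expr[i] == ord("_")):
                i += 1
            kind = Kind.FIELD
        elif expr[i:i + 1].isdigit():
            while i < n and (expr[i:i + 1].isdigit() or expr[i] == ord(".")):
                i += 1
            kind = Kind.NUMBER
        elif expr[i] == ord('"'):
            i = expr.index(b'"', i + 1) + 1          # ValueError if unterminated
            kind = Kind.STRING
        elif expr[i:i + 1].isalpha():
            while i < n and expr[i:i + 1].isalpha():
                i += 1
            kind = Kind.WORD
        else:
            i += 2 if expr[i:i + 2] in (b"==", b"!=", b">=", b"<=") else 1
            kind = Kind.OPERATOR
        tokens.append((kind, start, i - start))
    return tokens


expr = b'.year >= 1950 and .genre == "action"'
tokens = tokenize_spans(expr)
kind, offset, length = tokens[-1]
last_text = expr[offset:offset + length]  # sliced on demand, only when needed
```

Field names extracted for matching can be handled the same way: compare the attribute's bytes against `expr[offset:offset + length]` instead of materialising a string per field.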

@harsha-simhadri

Haiyang, let us plan to paginate diskann when the first round of post processing does not generate enough results, to mitigate low recall for selective predicates.
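A minimal sketch of what such pagination might look like, assuming a hypothetical `fetch_page(offset, limit)` callback that returns the next-nearest candidates in distance order. DiskANN's real paging interface is not shown in this thread, so none of these names should be read as its API.

```python
def paginated_filtered_search(fetch_page, predicate, count, page_size, max_pages=8):
    """Keep pulling candidate pages until `count` results pass the filter."""
    results = []
    for page in range(max_pages):
        candidates = fetch_page(page * page_size, page_size)
        if not candidates:
            break  # index exhausted
        for candidate in candidates:
            if predicate(candidate):
                results.append(candidate)
                if len(results) == count:
                    return results
    return results


# Toy corpus: ids already sorted by distance; the filter keeps one id in ten.
corpus = list(range(1, 101))
pages_fetched = []


def fetch_page(offset, limit):
    pages_fetched.append(offset)
    return corpus[offset:offset + limit]


hits = paginated_filtered_search(fetch_page, lambda x: x % 10 == 0, count=3, page_size=10)
```

With a selective predicate the first page yields a single match, so the loop fetches further pages instead of returning a short result, which is the recall problem the comment is aiming at. The `max_pages` cap is an assumed safeguard against unbounded work.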

@hailangx hailangx changed the title Add post-filter support for VSIM vector search results Add filter support for VSIM vector search Mar 14, 2026
@hailangx hailangx merged commit 7281998 into main Mar 16, 2026
68 of 71 checks passed
@hailangx hailangx deleted the haixu/vector-filter-postprocessing branch March 16, 2026 21:20
@hailangx hailangx restored the haixu/vector-filter-postprocessing branch March 16, 2026 21:23

7 participants