feat: migrate multi-vector query and reranker logic to C++ #405

Merged
chinaux merged 20 commits into alibaba:main from chinaux:feat/refact_py_to_cpp on May 29, 2026

chinaux (Collaborator) commented May 14, 2026

Refactor:

  • Add Reranker base class with RrfReRanker and WeightedReRanker implementations
  • Add Collection::MultiQuery interface for multi-vector queries with reranking
  • Add MultiVectorQuery struct in doc.h with forward declaration for Reranker
  • Add C API bindings for reranker and MultiQuery (zvec_reranker_*, zvec_multi_vector_query_*, zvec_collection_multi_query)
  • Add Python binding for reranker classes with py::function bridge for callback
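
For readers unfamiliar with reciprocal rank fusion (RRF), the idea behind a reranker such as RrfReRanker can be sketched as follows. This is a minimal illustration, not the code from this PR: the function name `rrf_fuse`, its signature, and the default constant `k = 60` (a common choice for RRF) are assumptions.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

// Hypothetical helper for illustration; not zvec's RrfReRanker API.
// Each inner vector holds document ids ordered best-first for one vector field.
std::vector<std::pair<std::string, double>> rrf_fuse(
    const std::vector<std::vector<std::string>>& ranked_lists,
    std::size_t top_n, double k = 60.0) {
  assert(k > 0.0);
  std::unordered_map<std::string, double> scores;
  for (const auto& list : ranked_lists) {
    for (std::size_t i = 0; i < list.size(); ++i) {
      // Ranks are 1-based, so the best hit of a list contributes 1 / (k + 1).
      scores[list[i]] += 1.0 / (k + static_cast<double>(i + 1));
    }
  }
  std::vector<std::pair<std::string, double>> fused(scores.begin(), scores.end());
  // Sort by fused score, highest first; break ties by id so output is deterministic.
  std::sort(fused.begin(), fused.end(), [](const auto& a, const auto& b) {
    if (a.second != b.second) return a.second > b.second;
    return a.first < b.first;
  });
  if (fused.size() > top_n) fused.resize(top_n);
  return fused;
}
```

Because RRF only looks at ranks, it needs no score normalization across fields that use different distance metrics.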

zhourrr force-pushed the feat/refact_py_to_cpp branch from 9f92f1b to 3d46458 on May 25, 2026 02:47
zhourrr requested a review from egolearner on May 25, 2026 05:44
chinaux added 10 commits May 28, 2026 09:54
- Add Reranker base class with RrfReRanker and WeightedReRanker implementations
- Add Collection::MultiQuery interface for multi-vector queries with reranking
- Add MultiVectorQuery struct in doc.h with forward declaration for Reranker
- Add C API bindings for reranker and MultiQuery (zvec_reranker_*, zvec_multi_vector_query_*, zvec_collection_multi_query)
- Add Python binding for reranker classes with py::function bridge for callback
- Validate duplicate field names in multi-vector queries (C++ and Python consistent)
- Remove TODO comment about concurrent execution (SQLEngine is not thread-safe)
- Update collection.h MultiQuery doc comment from concurrently to sequentially
- Add C++ collection tests (6 MultiQuery test cases)
- Add C API tests (reranker functions + multi_vector_query end-to-end)
- Implement Python test cases (11 previously skipped tests now active)
- Simplify Python query_executor validation for unified duplicate field check
- Register _SubVectorQuery in pybind11 with from_vector_query() factory
- Convert _VectorQuery to _SubVectorQuery in MultiVectorQueryExecutor
- Relax RRF/Weighted score assertion tolerance from 1e-10 to 1e-6
- Fix WeightedReRanker test metric to IP (matching HnswIndexParam default)
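
The weighted fusion and the relaxed test tolerance mentioned in the commits above can be illustrated with a small sketch. Everything here is an assumption for illustration: the `FieldHits` type, `weighted_fuse`, and `approx_equal` are hypothetical names, and the per-field scores are assumed to be normalized already (how zvec normalizes scores per metric is not shown on this page).

```cpp
#include <cassert>
#include <cmath>
#include <map>
#include <string>
#include <vector>

// Hypothetical types for illustration; not zvec's WeightedReRanker API.
struct FieldHits {
  double weight;                         // caller-chosen weight for this vector field
  std::map<std::string, double> scores;  // doc id -> score, assumed already normalized
};

// Weighted sum of per-field scores. A document that is missing from a field
// simply receives no contribution from that field.
std::map<std::string, double> weighted_fuse(const std::vector<FieldHits>& fields) {
  std::map<std::string, double> fused;
  for (const auto& f : fields) {
    assert(f.weight >= 0.0);
    for (const auto& kv : f.scores) {
      fused[kv.first] += f.weight * kv.second;
    }
  }
  return fused;
}

// Absolute-tolerance comparison of the kind a test might use for fused scores.
bool approx_equal(double a, double b, double tol = 1e-6) {
  return std::fabs(a - b) <= tol;
}
```

A looser tolerance such as 1e-6 is a reasonable choice when the scores being compared may have passed through single-precision arithmetic somewhere along the way.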
chinaux force-pushed the feat/refact_py_to_cpp branch from 1871946 to b4c0cb3 on May 28, 2026 01:55
chinaux force-pushed the feat/refact_py_to_cpp branch from c838e3a to b187572 on May 28, 2026 08:25
…directly, and use insert return value to avoid duplicate set lookup
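
The "insert return value" technique named in this commit title can be sketched as a duplicate-field check. This is a minimal sketch under assumptions: `all_fields_unique` and its out-parameter convention are hypothetical and are not taken from the PR's code.

```cpp
#include <cassert>
#include <string>
#include <unordered_set>
#include <vector>

// Hypothetical helper for illustration; names and return convention are assumed.
// Returns true when every field name is unique. On failure, *duplicate is set
// to the first repeated name. The duplicate pointer must not be null.
bool all_fields_unique(const std::vector<std::string>& field_names,
                       std::string* duplicate) {
  assert(duplicate != nullptr);
  std::unordered_set<std::string> seen;
  seen.reserve(field_names.size());
  for (const auto& name : field_names) {
    // insert() returns {iterator, bool}; the bool is false when the key was
    // already present, so a separate find() or count() call is unnecessary.
    if (!seen.insert(name).second) {
      *duplicate = name;
      return false;
    }
  }
  return true;
}
```

Compared with calling `count()` and then `insert()`, this hashes each name once instead of twice.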
egolearner (Collaborator) left a comment

LGTM

chinaux merged commit f539580 into alibaba:main on May 29, 2026
24 of 25 checks passed
egolearner added a commit to egolearner/zvec that referenced this pull request on May 29, 2026
Adapt FTS to the query-model refactor from alibaba#428/alibaba#405 on main:
- VectorQuery is replaced by SearchQuery + QueryTarget
  (variant<VectorClause, FtsClause>)
- Fold FTS validation into SearchQuery::validate_and_sanitize
- c_api / python binding read & write FTS via FtsClause and target_.clause_
- Add QueryTarget::get_fts_clause() helper, replacing raw std::get_if calls
- Migrate FTS tests to the new model; drop the vector/fts mutual-exclusion
  case that no longer applies under the variant
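
The variant-based query model described in this commit message can be illustrated roughly as follows (requires C++17). The struct fields, the private `clause_` member, and the helpers `get_vector_clause` and `query_text_or_empty` are hypothetical stand-ins; only the names VectorClause, FtsClause, QueryTarget, and get_fts_clause come from the message itself.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <variant>
#include <vector>

// Hypothetical stand-ins; the real clause fields are not shown on this page.
struct VectorClause {
  std::string field_name;
  std::vector<float> vector;
};
struct FtsClause {
  std::string field_name;
  std::string text;
};

class QueryTarget {
 public:
  explicit QueryTarget(std::variant<VectorClause, FtsClause> clause)
      : clause_(std::move(clause)) {}

  // Each accessor returns nullptr when the target holds the other alternative,
  // so callers never touch std::get_if directly.
  const FtsClause* get_fts_clause() const {
    return std::get_if<FtsClause>(&clause_);
  }
  const VectorClause* get_vector_clause() const {
    return std::get_if<VectorClause>(&clause_);
  }

 private:
  std::variant<VectorClause, FtsClause> clause_;
};

// Example caller: only FTS targets carry query text.
std::string query_text_or_empty(const QueryTarget& target) {
  const FtsClause* fts = target.get_fts_clause();
  return fts != nullptr ? fts->text : std::string();
}
```

Since a variant holds exactly one alternative at a time, a target cannot be both a vector query and an FTS query, which is consistent with the message's note that the mutual-exclusion test case no longer applies.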
4 participants