refactor(algorithm): extract common search helpers to InnerIndexInterface base class #2139

Merged
LHT129 merged 1 commit into antgroup:main from LHT129:2026-06-01-抽象索引公共骨架消除inner_index子类重复 on Jun 5, 2026

LHT129 (Collaborator) commented Jun 4, 2026

What problem does this PR solve?

Resolves #2138

Extracts repetitive search validation, filter composition, result packing, and serialization footer logic from individual index implementations into protected helper methods in InnerIndexInterface.

What is changed and how does it work?

New protected helper methods in InnerIndexInterface:

| Method | Purpose |
| --- | --- |
| validate_search_query() | Checks query dimension and num_elements |
| validate_knn_args() | Validates KNN search parameters (query + k) |
| validate_range_args() | Validates range search parameters (query + radius + limited_size) |
| create_search_filter() | Composes DeletedIdsFilter with user filter (InnerIdWrapperFilter or ExtraInfoWrapperFilter) |
| pack_knn_result() | Converts DistHeap to Dataset with label resolution |
| pack_knn_result_with_extra_info() | Same as above, with extra info support |
| make_empty_result() | Creates an empty Dataset result |
| write_index_footer() | Writes Footer with basic_info metadata |
| read_index_footer() | Reads Footer and returns basic_info metadata |
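
For readers unfamiliar with the pattern, here is a minimal sketch of how protected validation helpers in a base class can replace per-index checks. All names ending in `Sketch` are hypothetical stand-ins, not the real vsag types, and the specific rules (exactly one query vector, `k > 0`, non-negative radius, non-zero `limited_size`) as well as the use of `std::invalid_argument` are assumptions made for illustration only.

```cpp
#include <cassert>
#include <cstdint>
#include <stdexcept>

// Hypothetical stand-in for the query dataset; not the real vsag type.
struct QuerySketch {
    int64_t dim;
    int64_t num_elements;
};

// Hypothetical base class that mirrors the helper names from the table.
class IndexBaseSketch {
public:
    explicit IndexBaseSketch(int64_t dim) : dim_(dim) {}
    virtual ~IndexBaseSketch() = default;

protected:
    // Shared check on the query itself (rules are assumed, see lead-in).
    void validate_search_query(const QuerySketch& query) const {
        if (query.dim != dim_) {
            throw std::invalid_argument("query dimension does not match index dimension");
        }
        if (query.num_elements != 1) {
            throw std::invalid_argument("query must contain exactly one vector");
        }
    }

    // KNN arguments: query checks plus k.
    void validate_knn_args(const QuerySketch& query, int64_t k) const {
        validate_search_query(query);
        if (k <= 0) {
            throw std::invalid_argument("k must be greater than 0");
        }
    }

    // Range arguments: query checks plus radius and limited_size.
    void validate_range_args(const QuerySketch& query, float radius, int64_t limited_size) const {
        validate_search_query(query);
        if (radius < 0.0F) {
            throw std::invalid_argument("radius must not be negative");
        }
        if (limited_size == 0) {
            throw std::invalid_argument("limited_size must not be 0");
        }
    }

private:
    int64_t dim_;
};

// A subclass only calls the helper; it no longer repeats the checks.
class BruteForceSketch : public IndexBaseSketch {
public:
    explicit BruteForceSketch(int64_t dim) : IndexBaseSketch(dim) {}

    bool KnnSearch(const QuerySketch& query, int64_t k) const {
        validate_knn_args(query, k);
        return true;  // the actual search is omitted in this sketch
    }

    bool RangeSearch(const QuerySketch& query, float radius, int64_t limited_size) const {
        validate_range_args(query, radius, limited_size);
        return true;  // the actual search is omitted in this sketch
    }
};
```

Because every subclass goes through the same helper, error messages and error conditions stay identical across indexes, which is the property the "no behavioral change" claim below depends on.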

Applied to 6 index implementations:

  • BruteForce: SearchWithRequest, RangeSearch, Serialize, Deserialize
  • HGraph: KnnSearch, RangeSearch, SearchWithRequest
  • IVF: KnnSearch, RangeSearch, SearchWithRequest, reorder, create_search_param
  • Pyramid: KnnSearch, RangeSearch, Serialize, Deserialize
  • SINDI: KnnSearch, RangeSearch, Serialize, Deserialize
  • WARP: SearchWithRequest, RangeSearch, Serialize, Deserialize

Impact:

  • +218 / -254 lines (net reduction of 36 lines); more importantly, it removes logic that was duplicated across 6 index implementations
  • No behavioral change: same error messages, same error codes, same result format
  • All existing tests pass without modification

Related issues

Closes #2138

Copilot AI review requested due to automatic review settings June 4, 2026 04:38
@LHT129 LHT129 added the kind/improvement Code improvements (variable/function renaming, refactoring, etc. ) label Jun 4, 2026
@LHT129 LHT129 self-assigned this Jun 4, 2026

mergify Bot commented Jun 4, 2026

Merge Protections

Your pull request matches the following merge protections and will not be merged until they are valid.

🟢 Require kind label

Wonderful, this rule succeeded.
  • label~=^kind/

🟢 Require version label

Wonderful, this rule succeeded.
  • label~=^version/

@gemini-code-assist gemini-code-assist Bot left a comment

Code Review

This pull request refactors search, validation, and serialization logic across several algorithms (including BruteForce, HGraph, IVF, Pyramid, SINDI, and WARP) by consolidating common helper methods into the base class InnerIndexInterface. This significantly reduces code duplication for tasks like filter creation, result packing, argument validation, and index footer handling. The review feedback highlights several critical issues where the return value of read_index_footer is not checked in pyramid.cpp, sindi.cpp, and warp.cpp, which could lead to crashes if a footer is missing. Additionally, a defensive null check is recommended for this->extra_infos_ in inner_index_interface.cpp to prevent potential null pointer dereferences.
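
The defensive null check suggested for `extra_infos_` can be illustrated with a small packing routine. This is only a sketch under stated assumptions: `MaxHeapSketch`, `ResultSketch`, and `ExtraInfoStoreSketch` are hypothetical stand-ins for DistHeap, Dataset, and the extra-info storage, inner ids are returned directly instead of being resolved to labels, and the heap is assumed to keep the largest distance on top.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <memory>
#include <queue>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-ins; the real DistHeap, Dataset and extra-info storage differ.
using HeapItemSketch = std::pair<float, int64_t>;           // (distance, inner id)
using MaxHeapSketch = std::priority_queue<HeapItemSketch>;  // largest distance on top

struct ResultSketch {
    std::vector<int64_t> ids;
    std::vector<float> distances;
    std::vector<std::string> extra_infos;  // stays empty when no store is attached
};

struct ExtraInfoStoreSketch {
    std::vector<std::string> values;  // indexed by inner id
};

// Drains the heap into a result sorted by ascending distance.
// The heap is taken by value so the caller's copy is left untouched.
ResultSketch pack_knn_result_with_extra_info_sketch(
    MaxHeapSketch heap, const std::shared_ptr<ExtraInfoStoreSketch>& extra_infos) {
    ResultSketch result;
    const std::size_t count = heap.size();
    result.ids.resize(count);
    result.distances.resize(count);

    // Defensive check: only touch the store when it actually exists.
    const bool has_extra = (extra_infos != nullptr);
    if (has_extra) {
        result.extra_infos.resize(count);
    }

    // The farthest item is popped first, so fill the output from the back.
    for (std::size_t i = count; i-- > 0;) {
        const HeapItemSketch top = heap.top();
        heap.pop();
        result.distances[i] = top.first;
        result.ids[i] = top.second;
        if (has_extra) {
            // at() throws std::out_of_range on a bad id instead of reading out of bounds.
            result.extra_infos[i] = extra_infos->values.at(static_cast<std::size_t>(top.second));
        }
    }
    return result;
}
```

With the guard in place, an index built without extra info produces a result whose `extra_infos` field is simply empty rather than dereferencing a null pointer.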

Comment thread src/algorithm/pyramid/pyramid.cpp Outdated
Comment thread src/algorithm/sindi/sindi.cpp Outdated
Comment thread src/algorithm/warp/warp.cpp Outdated
Comment thread src/algorithm/inner_index_interface.cpp Outdated

Copilot AI left a comment

Pull request overview

This PR refactors common search and serialization boilerplate from multiple InnerIndexInterface subclasses into shared protected helper methods on InnerIndexInterface, reducing duplication across algorithms while aiming to keep behavior consistent.

Changes:

  • Added reusable helpers to InnerIndexInterface for query validation, filter composition, heap→dataset packing, empty results, and footer read/write.
  • Updated multiple index implementations (BruteForce, HGraph, IVF, Pyramid, SINDI, WARP) to call the new helpers instead of duplicating logic.
  • Standardized result packing and footer serialization/deserialization call sites.
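
As an illustration of the filter composition mentioned in the list above, the following sketch combines a deleted-ids filter with an optional user filter so that a candidate passes only if every part accepts it. Every type here is a hypothetical stand-in (the real DeletedIdsFilter, InnerIdWrapperFilter, and ExtraInfoWrapperFilter are not reproduced), and returning `nullptr` when there is nothing to filter is an assumption of this sketch, not a documented behavior of `create_search_filter()`.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>
#include <memory>
#include <unordered_set>
#include <utility>
#include <vector>

// Hypothetical filter interface; the real vsag Filter class differs.
class FilterSketch {
public:
    virtual ~FilterSketch() = default;
    virtual bool CheckValid(int64_t inner_id) const = 0;
};

// Rejects ids that have been marked as deleted.
class DeletedIdsFilterSketch : public FilterSketch {
public:
    explicit DeletedIdsFilterSketch(std::unordered_set<int64_t> deleted)
        : deleted_(std::move(deleted)) {}
    bool CheckValid(int64_t inner_id) const override { return deleted_.count(inner_id) == 0; }

private:
    std::unordered_set<int64_t> deleted_;
};

// Wraps an arbitrary user predicate.
class UserFilterSketch : public FilterSketch {
public:
    explicit UserFilterSketch(std::function<bool(int64_t)> pred) : pred_(std::move(pred)) {}
    bool CheckValid(int64_t inner_id) const override { return pred_(inner_id); }

private:
    std::function<bool(int64_t)> pred_;
};

// Accepts an id only if all contained filters accept it.
class CombinedFilterSketch : public FilterSketch {
public:
    void Append(std::shared_ptr<FilterSketch> part) { parts_.push_back(std::move(part)); }
    bool CheckValid(int64_t inner_id) const override {
        for (const auto& part : parts_) {
            if (!part->CheckValid(inner_id)) {
                return false;
            }
        }
        return true;
    }

private:
    std::vector<std::shared_ptr<FilterSketch>> parts_;
};

// Builds the effective search filter; nullptr means "no filtering needed" (assumed convention).
std::shared_ptr<FilterSketch> create_search_filter_sketch(
    const std::unordered_set<int64_t>& deleted, const std::shared_ptr<FilterSketch>& user_filter) {
    if (deleted.empty() && user_filter == nullptr) {
        return nullptr;
    }
    auto combined = std::make_shared<CombinedFilterSketch>();
    if (!deleted.empty()) {
        combined->Append(std::make_shared<DeletedIdsFilterSketch>(deleted));
    }
    if (user_filter != nullptr) {
        combined->Append(user_filter);
    }
    return combined;
}
```

Keeping this logic in one place means each index only decides which user filter wrapper to pass in, while the handling of deleted ids stays uniform.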

Reviewed changes

Copilot reviewed 8 out of 8 changed files in this pull request and generated 2 comments.

| File | Description |
| --- | --- |
| src/algorithm/inner_index_interface.h | Declares new protected helper methods for search/filter/result/serialization. |
| src/algorithm/inner_index_interface.cpp | Implements shared helper logic (filters, packing, footer IO, validation). |
| src/algorithm/bruteforce/bruteforce.cpp | Uses shared filter/result packing helpers and footer helpers; centralizes empty-result handling. |
| src/algorithm/hgraph/hgraph_search.cpp | Replaces local validation/filter/result packing with shared helpers (incl. extra-info packing). |
| src/algorithm/ivf/ivf.cpp | Uses shared filter creation and heap→dataset packing in multiple paths. |
| src/algorithm/pyramid/pyramid.cpp | Uses shared filter creation and footer helpers for serialization/deserialization. |
| src/algorithm/sindi/sindi.cpp | Uses shared filter creation, empty-result helper, and footer helpers. |
| src/algorithm/warp/warp.cpp | Uses shared filter creation, heap→dataset packing, and footer helpers. |
Comments suppressed due to low confidence (1)

src/algorithm/sindi/sindi.cpp:508

  • read_index_footer() returns a bool but its return value is ignored. When a footer is absent, jsonify_basic_info[INDEX_PARAM].GetString() will throw a nlohmann::json exception (not a VsagException). Since this path is gated by deserialize_without_footer_, it should explicitly require a footer and fail with a controlled VsagException (or skip the compatibility check when has_footer == false).
    if (not deserialize_without_footer_) {
        JsonType jsonify_basic_info;
        this->read_index_footer(reader, jsonify_basic_info);
        // Check if the index parameter is compatible
        {
            auto param = jsonify_basic_info[INDEX_PARAM].GetString();
            SINDIParameterPtr index_param = std::make_shared<SINDIParameter>();
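
One way to address the comment above is to check the boolean result before touching the metadata. The sketch below shows that shape only; `ReaderSketch`, `BasicInfoSketch`, and both functions are hypothetical stand-ins, a plain `std::map` replaces the JSON object, the `"index_param"` key is an assumed placeholder for `INDEX_PARAM`, and `std::runtime_error` stands in for the VsagException mentioned in the review.

```cpp
#include <cassert>
#include <map>
#include <stdexcept>
#include <string>

// Hypothetical stand-in for the footer metadata (the real code uses a JSON type).
using BasicInfoSketch = std::map<std::string, std::string>;

// Hypothetical reader: records whether the stream carried a footer at all.
struct ReaderSketch {
    bool has_footer = false;
    BasicInfoSketch footer;
};

// Mirrors the bool-returning shape described in the review comment.
bool read_index_footer_sketch(const ReaderSketch& reader, BasicInfoSketch& basic_info) {
    if (!reader.has_footer) {
        return false;
    }
    basic_info = reader.footer;
    return true;
}

// Fails in a controlled way when the footer or the parameter entry is absent.
std::string load_index_param_sketch(const ReaderSketch& reader) {
    BasicInfoSketch basic_info;
    if (!read_index_footer_sketch(reader, basic_info)) {
        throw std::runtime_error("index footer is missing");
    }
    const auto it = basic_info.find("index_param");
    if (it == basic_info.end()) {
        throw std::runtime_error("index footer has no index_param entry");
    }
    return it->second;
}
```

The alternative named in the comment, skipping the compatibility check when no footer is present, would replace the first `throw` with an early return; which of the two fits depends on how strict deserialization is meant to be.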

Comment thread src/algorithm/warp/warp.cpp
Comment thread src/algorithm/pyramid/pyramid.cpp
@LHT129 LHT129 force-pushed the 2026-06-01-抽象索引公共骨架消除inner_index子类重复 branch from 393da39 to 2837e09 Compare June 4, 2026 05:48
@LHT129 LHT129 force-pushed the 2026-06-01-抽象索引公共骨架消除inner_index子类重复 branch from 2837e09 to e2ea28b Compare June 4, 2026 06:46
Copilot AI review requested due to automatic review settings June 4, 2026 06:46

Copilot AI left a comment

Pull request overview

Copilot reviewed 8 out of 8 changed files in this pull request and generated 2 comments.

Comment thread src/algorithm/inner_index_interface.cpp
Comment thread src/algorithm/hgraph/hgraph_search.cpp Outdated
…face base class

Extract repetitive search validation, filter composition, result packing,
and serialization footer logic from individual index implementations into
protected helper methods in InnerIndexInterface:

- validate_knn_args / validate_range_args / validate_search_query:
  common query dimension, k, radius, and num_elements checks
- create_search_filter: composing DeletedIdsFilter with user filter
  (supports both InnerIdWrapperFilter and ExtraInfoWrapperFilter)
- pack_knn_result / pack_knn_result_with_extra_info: DistHeap to Dataset
- make_empty_result: empty dataset creation
- write_index_footer / read_index_footer: Footer serialization helpers

Applied to BruteForce, HGraph, IVF, Pyramid, SINDI, and WARP, eliminating
~254 lines of duplicated boilerplate across 6 index implementations.

Signed-off-by: LHT129 <tianlan.lht@antgroup.com>
@LHT129 LHT129 force-pushed the 2026-06-01-抽象索引公共骨架消除inner_index子类重复 branch from e2ea28b to ef51f83 Compare June 4, 2026 07:16

@wxyucs wxyucs left a comment

lgtm

@LHT129 LHT129 merged commit 34eb9f8 into antgroup:main Jun 5, 2026
19 checks passed

Labels

kind/improvement (Code improvements: variable/function renaming, refactoring, etc.), size/L, version/1.0

Development

Successfully merging this pull request may close these issues.

Extract common search skeleton from InnerIndex subclasses to eliminate duplication

3 participants