Skip to content

Conversation

@meiravgri
Copy link
Collaborator

@meiravgri meiravgri commented Aug 8, 2025

Overview

This PR introduces a comprehensive test suite for FP16 data types in the SVS algorithm.
The new test file test_svs_fp16.cpp is adapted from the existing test_svs.cpp to specifically validate SVS functionality with float16 vectors across single-index, multi-index, and tiered index configuration.

Key Changes

New Test File: tests/unit/test_svs_fp16.cpp

Test Class Structure

The file introduces three specialized test classes:

  1. FP16SVSTest: Core SVS functionality tests for single-index scenarios
  2. FP16SVSMultiTest: Multi-index SVS tests (inherits from FP16SVSTest)
  3. FP16SVSTieredIndexTest: Tiered index SVS tests (inherits from FP16SVSTest)
  • Type Definitions: runs float16 with quantization modes:

    • VecSimSvsQuant_NONE
    • VecSimSvsQuant_8
    • VecSimSvsQuant_8x8_LeanVec
  • Custom Vector Generation: Added GenerateVector() and GenerateAndAddVector() methods specifically for float16 with proper FP32→FP16 conversion

  • Data Conversion: All vector operations use vecsim_types::FP32_to_FP16() and vecsim_types::FP16_to_FP32() for proper type conversion

Test Coverage

Total Test Count: 47 tests (28 core + 10 multi-index + 9 tiered index)

Core SVS Tests (FP16SVSTest - 31 tests)

All essential SVS functionality tests adapted for FP16, including both adapted tests from the original suite and new FP16-specific tests:

Adapted from original test suite (26 tests):

  • Vector operations: svs_vector_add_test, svs_vector_update_test
  • Bulk operations: svs_bulk_vectors_add_delete_test
  • Index management: svs_indexing_same_vector, svs_reindexing_same_vector, svs_reindexing_same_vector_different_id
  • Batch iteration: svs_batch_iterator, svs_batch_iterator_non_unique_scores, svs_batch_iterator_reset, svs_batch_iterator_corner_cases, batchIteratorSwapIndices
  • Index lifecycle: resizeIndex, svs_empty_index, test_delete_vector
  • Search algorithms: svs_vector_search_test_ip, svs_vector_search_test_l2, svs_vector_search_test_cosine
  • Metrics and queries: rangeQuery, rangeQueryCosine
  • Debugging and info: test_svs_info, test_basic_svs_info_iterator, test_dynamic_svs_info_iterator
  • Size estimation: testSizeEstimation
  • Stress testing: sanity_reinsert_1280
  • Distance calculation validation: svs_get_distance
  • Quantization: scalar_quantization_query
  • quant_modes: Comprehensive quantization mode validation

New FP16-specific tests (2 tests):

  • test_override_all: Vector override functionality testing. Same flow logic as logging_runtime_params from test_svs.cpp, minus the logger validations.

Multi-Index Tests (FP16SVSMultiTest - 10 tests)

Specialized tests for multi-index scenarios where multiple vectors can share the same label:

  • vector_add_multiple_test: Adding multiple vectors with same label
  • vector_search_test: Search functionality in multi-index context
  • search_more_than_there_is: Edge case handling for search limits
  • find_better_score: Score optimization in multi-vector scenarios
  • find_better_score_after_pop: Score optimization after vector removal
  • reindexing_same_vector_different_id: Reindexing with different IDs
  • test_svs_info: Multi-index debug information validation
  • test_basic_svs_info_iterator: Multi-index debug iterator testing
  • rangeQuery: Range queries in multi-index context
  • svs_batch_iterator_basic: Batch iteration for multi-index

Tiered Index Tests (FP16SVSTieredIndexTest - 9 tests)

Tests for tiered index architecture with background indexing:

  • CreateIndexInstanceSingle/CreateIndexInstanceMulti: Index creation validation
  • RangeTestSingle/RangeTestMulti: Range query testing for both configurations
  • KNNSearch: K-nearest neighbor search in tiered context
  • deleteVector/deleteVectorMulti: Vector deletion in tiered indexes
  • BatchIteratorSingle/BatchIteratorMulti: Batch iteration for tiered indexes

Tests Excluded (18 tests)

Tests from the original test_svs.cpp that were intentionally not included, either because they are not relevant to for couldn't be reliably implemented with FP16 binary representation:

  • Timeout tests: testTimeoutReturn_topK, testTimeoutReturn_range, testTimeoutReturn_batch_iterator
  • Parameter validation: test_svs_parameter_combinations_and_defaults, test_svs_parameter_consistency_across_metrics
  • Runtime parameter resolution: resolve_ws_search_runtime_params, resolve_bc_search_runtime_params, resolve_use_search_history_runtime_params, resolve_epsilon_runtime_params
  • FP16 incompatible: svs_test_inf_score (infinity values not achievable with FP16)
  • Debug/utility tests: debugInfoIteratorFieldOrder, svs_vector_search_by_id_test
  • Redundant/irrelevant: testInitialSizeEstimation (covered by testSizeEstimation), joinSearchParams (type not relevant), FitMemoryTest (no-op), logging_runtime_params, preferAdHocOptimization (essentially no-op test), svs_search_empty_index (duplicate of svs_empty_index),

Additional Infrastructure Changes

SVS Index Implementation Enhancements

Enhanced Vector Data Access (src/VecSim/algorithms/svs/svs.h)

  • Implemented getStoredVectorDataByLabel(): Added comprehensive implementation for retrieving stored vector data by label (only supported for non-compressed indices, both single and multi-index configurations)

Query Result Debugging (src/VecSim/query_result_definitions.h)

  • Debug Print Operators: Added operator<< for VecSimQueryResult and VecSimQueryReply to enable better test debugging
  • Test-Only Features: Wrapped debug functionality in #ifdef BUILD_TESTS to keep production code clean
  • Enhanced Error Messages: Improved test failure diagnostics with detailed query result information (validateTopKSearchTest() function now prints all results of the assertion fails: ASSERT_TRUE(allUniqueResults(res)) << *res;)

Test Infrastructure Improvements

Enhanced Test Utilities (tests/unit/unit_test_utils.cpp & .h)

  • SVS Parameter Validation: Added validateSVSIndexAttributesInfo() function to validate SVS index parameters against svsInfoStruct values

Extended Test Data Generation (tests/utils/tests_utils.h)

  • FP16 Vector Generation: Added populate_float16_vec() function for generating random FP16 test vectors

@meiravgri meiravgri changed the title add fp16 tests [MOD-10185] add SVS fp16 tests Aug 13, 2025
@codecov
Copy link

codecov bot commented Aug 20, 2025

Codecov Report

❌ Patch coverage is 96.15385% with 1 line in your changes missing coverage. Please review.
✅ Project coverage is 96.86%. Comparing base (bab676d) to head (7e570b1).
⚠️ Report is 3 commits behind head on main.

Files with missing lines Patch % Lines
src/VecSim/algorithms/svs/svs.h 94.73% 1 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main     #740      +/-   ##
==========================================
+ Coverage   96.68%   96.86%   +0.17%     
==========================================
  Files         126      126              
  Lines        7714     7739      +25     
==========================================
+ Hits         7458     7496      +38     
+ Misses        256      243      -13     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@alonre24 alonre24 requested a review from Copilot September 3, 2025 20:32
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR introduces comprehensive FP16 testing for the SVS algorithm, adding a new test suite that validates SVS functionality with float16 vectors across single-index, multi-index, and tiered index configurations. The tests are adapted from the existing SVS test suite with proper FP32→FP16 conversion handling.

Key Changes

  • Adds 47 specialized FP16 tests across three test classes covering core SVS functionality, multi-index scenarios, and tiered indexing
  • Implements proper FP16 vector data retrieval in SVS index for testing purposes
  • Enhances test infrastructure with FP16 vector generation utilities and improved debugging capabilities

Reviewed Changes

Copilot reviewed 7 out of 8 changed files in this pull request and generated 2 comments.

Show a summary per file
File Description
tests/unit/test_svs_fp16.cpp New comprehensive FP16 test suite for SVS algorithm (mentioned in CMakeLists.txt)
tests/utils/tests_utils.h Adds FP16 vector generation utility function
tests/unit/unit_test_utils.h Declares SVS parameter validation function
tests/unit/unit_test_utils.cpp Implements SVS parameter validation and enhances test debugging
tests/unit/test_index_test_utils.cpp Adds tests for new query result print operators
tests/unit/CMakeLists.txt Includes new FP16 test file in build
src/VecSim/query_result_definitions.h Adds debug print operators for query results
src/VecSim/algorithms/svs/svs.h Implements vector data retrieval by label for testing

Tip: Customize your code reviews with copilot-instructions.md. Create the file or learn how to get started.

Copy link
Collaborator

@alonre24 alonre24 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall LGTM, It seems like we have a good coverage here without going deeply into the test code itself. Few general questions:

  1. Should we cover the tests listed under "New FP16-specific tests" for Float32 as well?
  2. Any particular reason why "KNNSearch" is the only one that we have only for single and for multi as well?
  3. Regarding the excluded tests - what does it mean in general that tests "couldn't be reliably implemented with FP16 precision"? relaxing the accuracy expectation did not work in that case? And in particular, why not covering timeouts and why do "infinity values not achievable with FP16"? As for the rest of the excluded tests, I understand the reason for the exclusion.

@meiravgri
Copy link
Collaborator Author

@alonre24

  1. Updated the description.
  • quant_modes is actually covered also in test_svs.cpp.
  • test_override_all flow is included and derived from SVSTest::logging_runtime_params, minus the logger testing, which seemed to me irrelevant here.
  1. Did you mean "and NOT for multi as well"? In that case, it's derived from the test_svs_tiered.cpp, which didn't implement it for multi.
  • Regarding precision, leftovers from the previous description version.
  • Since timeout handling is not datatype related, we can avoid the duplication.
  • infinity is only defined for floating-point representation.

@meiravgri meiravgri requested a review from GuyAv46 September 18, 2025 15:53
alonre24
alonre24 previously approved these changes Oct 14, 2025
@meiravgri meiravgri added this pull request to the merge queue Oct 15, 2025
@github-merge-queue github-merge-queue bot removed this pull request from the merge queue due to failed status checks Oct 15, 2025
@meiravgri meiravgri added this pull request to the merge queue Oct 15, 2025
Merged via the queue into main with commit 00bf35d Oct 15, 2025
17 checks passed
@meiravgri meiravgri deleted the meiravg_svs_fp16_tests branch October 15, 2025 09:30
@github-actions
Copy link

Successfully created backport PR for 8.2:

github-actions bot pushed a commit that referenced this pull request Oct 15, 2025
* add fp16 tests

* more tests

* add printing utility to VecSimQueryReply and VecSimQueryResult

more tests

* more tests

add validateSVSIndexAttributesInfo to compare info to params

add populate_float16_vec

* more tests

* imp getStoredVectorDataByLabel for multi and no-compressed

more tests

* fix quant_modes

* multi index tests

* tiered tests

* fix test to fail

* temproraly push test_override_all to float32 tests

* update to crash

* revert svs tests chnages

fix cosine batch iterators tests and all fp16 tests

* test

* add range test

* skip test_get_distance for scalaar quant

* align svs_get_distance name test with test_svs.cpp

* clean ups

* add coverage to print VecSimQueryResult and VecSimQueryReply

* guard validateSVSIndexAttributesInfo with #if HAVE_SVS for the include

* fix

* align names with main branhc

(cherry picked from commit 00bf35d)
github-merge-queue bot pushed a commit that referenced this pull request Oct 15, 2025
* [MOD-10185] add SVS fp16 tests (#740)

* add fp16 tests

* more tests

* add printing utility to VecSimQueryReply and VecSimQueryResult

more tests

* more tests

add validateSVSIndexAttributesInfo to compare info to params

add populate_float16_vec

* more tests

* imp getStoredVectorDataByLabel for multi and no-compressed

more tests

* fix quant_modes

* multi index tests

* tiered tests

* fix test to fail

* temproraly push test_override_all to float32 tests

* update to crash

* revert svs tests chnages

fix cosine batch iterators tests and all fp16 tests

* test

* add range test

* skip test_get_distance for scalaar quant

* align svs_get_distance name test with test_svs.cpp

* clean ups

* add coverage to print VecSimQueryResult and VecSimQueryReply

* guard validateSVSIndexAttributesInfo with #if HAVE_SVS for the include

* fix

* align names with main branhc

(cherry picked from commit 00bf35d)

* remove multi block

* fix

* remove multi tiered tests

* revert to align with master

---------

Co-authored-by: meiravgri <109056284+meiravgri@users.noreply.github.com>
Co-authored-by: meiravgri <meirav.grimberg@redis.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants