[SVS] Add SVS LeanVec compression support #684

rfsaliev · 2025-05-26T13:36:13Z

Describe the changes in the pull request

This PR introduces SVS LeanVec index compression integration to VecSim SVSIndex.

Which issues this PR fixes

No issues

Main objects this PR modified

Added SVS LeanVec compression support:
- New template parameter IsLeanVec for SVSIndex class in svs.h
- New SVSStorageTraits specialization for IsLeanVec=true in svs_extensions.h
- SVSStorageTraits interface extended by compute_distance_by_id() used in SVSIndex::getDistanceFrom_Unsafe()
- Added new LeanVec options to VecSimSvsQuantBits enum in vec_sim_common.h
- Reflect new LeanVec options in svs_factory.h
- Added a new test to test_svs.cpp which constructs SVSIndex with different kinds of compression/quantization.
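
As a rough illustration of the pattern the list above describes, here is a minimal sketch of a boolean template parameter selecting a traits specialization, with the index delegating its distance call to the traits. Only the names IsLeanVec and compute_distance_by_id come from this PR; PlainStorage, StorageTraitsSketch, IndexSketch and all function bodies are hypothetical stand-ins, since the real traits wrap SVS data structures that are not shown in this conversation.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for an SVS dataset: raw floats, row-major.
struct PlainStorage {
    std::size_t dim;
    std::vector<float> data;

    const float *row(std::size_t id) const {
        assert((id + 1) * dim <= data.size()); // id must be in range
        return data.data() + id * dim;
    }
};

// Primary template, chosen when IsLeanVec == false.
template <bool IsLeanVec>
struct StorageTraitsSketch {
    static constexpr bool is_leanvec = false;

    // Squared L2 distance between `query` and the stored vector `id`.
    static float compute_distance_by_id(const PlainStorage &s, std::size_t id,
                                        const float *query) {
        const float *v = s.row(id);
        float acc = 0.0f;
        for (std::size_t i = 0; i < s.dim; ++i) {
            const float d = v[i] - query[i];
            acc += d * d;
        }
        return acc;
    }
};

// Full specialization, chosen when IsLeanVec == true.
template <>
struct StorageTraitsSketch<true> {
    static constexpr bool is_leanvec = true;

    static float compute_distance_by_id(const PlainStorage &s, std::size_t id,
                                        const float *query) {
        // Placeholder: a real LeanVec backend would work on compressed data.
        // Here the plain computation is reused purely for illustration.
        return StorageTraitsSketch<false>::compute_distance_by_id(s, id, query);
    }
};

// Index-like class that forwards the distance query to the selected traits.
template <bool IsLeanVec>
struct IndexSketch {
    PlainStorage storage;

    float getDistanceFrom(std::size_t id, const float *query) const {
        return StorageTraitsSketch<IsLeanVec>::compute_distance_by_id(storage, id, query);
    }
};
```

The point of the delegation is that the index class itself needs no `if` on the compression kind: the compiler picks the traits body from the template argument.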

Mark if applicable

This PR introduces API changes
This PR introduces serialization changes

codecov · 2025-05-26T14:26:20Z

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 96.52%. Comparing base (552b295) to head (493d732).
Report is 6 commits behind head on main.

Additional details and impacted files

@@            Coverage Diff             @@
##             main     #684      +/-   ##
==========================================
+ Coverage   96.24%   96.52%   +0.28%     
==========================================
  Files         112      111       -1     
  Lines        6278     6325      +47     
==========================================
+ Hits         6042     6105      +63     
+ Misses        236      220      -16

Copilot

Pull Request Overview

This PR integrates SVS LeanVec index compression into the VecSim SVSIndex by extending the template parameters and storage traits, updating factory functions, and adding corresponding tests.

- Introduces IsLeanVec template parameter and specializations in SVSIndex, SVSStorageTraits, and factory helpers.
- Adds new LeanVec quantization modes (4x8_LeanVec, 8x8_LeanVec) to the VecSimSvsQuantBits enum and updates CMake to fetch a matching SVS release.
- Expands unit tests to cover size estimation and search behavior for the new LeanVec modes.

Reviewed Changes

Copilot reviewed 7 out of 7 changed files in this pull request and generated 1 comment.

File	Description
tests/unit/test_svs.cpp	Added tests for LeanVec quantization modes; adjusted size tests
src/VecSim/vec_sim_common.h	Updated `VecSimSvsQuantBits` enum to include LeanVec options
src/VecSim/index_factories/svs_factory.cpp	Extended template signatures and switch cases for LeanVec
src/VecSim/algorithms/svs/svs_utils.h	Added generic `compute_distance_by_id` to primary traits
src/VecSim/algorithms/svs/svs_extensions.h	Included LeanVec header and added specializations for LeanVec
src/VecSim/algorithms/svs/svs.h	Added `IsLeanVec` to `SVSIndex` and delegated distance logic
cmake/svs.cmake	Bumped `SVS_URL` to a newer nightly release

Comments suppressed due to low confidence (3)

src/VecSim/vec_sim_common.h:195

EstimateInitialSize only distinguishes quantBits == 0 vs. nonzero quant and always uses the non-LeanVec template. For LeanVec modes, the base object size differs and should be accounted for by branching on IsLeanVec or inspecting params->quantBits for LeanVec flags.

est += (params->quantBits == 0)
               ? sizeof(SVSIndex<svs::distance::DistanceL2, float, 0, 0, false>)
               : sizeof(SVSIndex<svs::distance::DistanceL2, float, 8, 0, false>);

tests/unit/test_svs.cpp:2190

The test uses the original params.quantBits for size estimation, but the code may fall back from unsupported modes via isSVSQuantBitsSupported. You should retrieve the fallback mode (std::get<0>(isSVSQuantBitsSupported(params.quantBits))) and use that value when calling EstimateElementSize, to keep the test aligned with index creation.

double estimation_accuracy = (quant_bits != VecSimSvsQuant_NONE) ? 0.11 : 0.01;

tests/unit/test_svs.cpp:2172

Using std::vector<float[dim]> is not valid C++; array types cannot be direct vector element types. Consider using std::vector<std::array<float, dim>> or std::vector<std::vector<float>> to store fixed-size vectors.

std::vector<float[dim]> v(n);
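
A minimal sketch of the std::array alternative suggested in the comment above, assuming the dimension is a compile-time constant. The names kDim, Vec, make_vectors and blob are hypothetical and do not come from test_svs.cpp.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <vector>

constexpr std::size_t kDim = 4; // hypothetical dimension, known at compile time

// A fixed-size vector type that is a valid std::vector element.
using Vec = std::array<float, kDim>;

// Builds n vectors where every component of vector i equals i.
inline std::vector<Vec> make_vectors(std::size_t n) {
    std::vector<Vec> v(n); // elements are value-initialized, so all start at 0
    for (std::size_t i = 0; i < n; ++i) {
        v[i].fill(static_cast<float>(i));
    }
    return v;
}

// Pointer to the kDim floats of vector i, for APIs that take a raw blob.
inline const float *blob(const std::vector<Vec> &v, std::size_t i) {
    assert(i < v.size());
    return v[i].data();
}
```

Unlike a raw array type, std::array is copyable and assignable, which is what std::vector requires of its elements.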

Copilot · 2025-06-04T13:17:33Z

src/VecSim/vec_sim_common.h

+    VecSimSvsQuant_4x4 = 4 | (4 << 8), // 4-bit quantization with 4-bit residuals
+    VecSimSvsQuant_4x8 = 4 | (8 << 8), // 4-bit quantization with 8-bit residuals
+    VecSimSvsQuant_4x8_LeanVec = 4 | (8 << 8) | (1 << 16), // LeanVec 4x4 quantization
+    VecSimSvsQuant_8x8_LeanVec = 8 | (8 << 8) | (1 << 16), // LeanVec 8x8 quantization


The shift amount for packing residual bits was changed from << 10 to << 8, altering the numeric values of existing enum members. This could break backward compatibility or bit-field parsing. Consider restoring the original shift or documenting the new bit layout explicitly.

Suggested change

-    VecSimSvsQuant_4x4 = 4 | (4 << 8), // 4-bit quantization with 4-bit residuals
-    VecSimSvsQuant_4x8 = 4 | (8 << 8), // 4-bit quantization with 8-bit residuals
-    VecSimSvsQuant_4x8_LeanVec = 4 | (8 << 8) | (1 << 16), // LeanVec 4x4 quantization
-    VecSimSvsQuant_8x8_LeanVec = 8 | (8 << 8) | (1 << 16), // LeanVec 8x8 quantization
+    VecSimSvsQuant_4x4 = 4 | (4 << 10), // 4-bit quantization with 4-bit residuals
+    VecSimSvsQuant_4x8 = 4 | (8 << 10), // 4-bit quantization with 8-bit residuals
+    VecSimSvsQuant_4x8_LeanVec = 4 | (8 << 10) | (1 << 16), // LeanVec 4x4 quantization
+    VecSimSvsQuant_8x8_LeanVec = 8 | (8 << 10) | (1 << 16), // LeanVec 8x8 quantization
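
To make the two layouts in this review comment concrete, here is a small sketch that packs and unpacks the fields. The layout (primary bits in bits 0-7, residual bits starting at the shift, LeanVec flag in bit 16) is read off the enum expressions quoted above; QuantFields, pack_quant and unpack_quant are hypothetical helpers, and nothing in this thread indicates VecSim decodes the value this way. With a shift of 8, the 8x8 LeanVec value comes out as 67592, the number that appears in the test name quoted in this thread.

```cpp
#include <cassert>
#include <cstdint>

struct QuantFields {
    std::uint32_t primary_bits;
    std::uint32_t residual_bits;
    bool leanvec;
};

// Packs the fields; residual_shift is 8 in the PR and 10 in the suggestion.
constexpr std::uint32_t pack_quant(std::uint32_t primary, std::uint32_t residual,
                                   bool leanvec, unsigned residual_shift) {
    return primary | (residual << residual_shift) |
           (leanvec ? (std::uint32_t{1} << 16) : 0u);
}

// Unpacks a value that was packed with residual_shift == 8.
inline QuantFields unpack_quant(std::uint32_t v) {
    assert((v >> 17) == 0u); // bits above the LeanVec flag are assumed unused
    return QuantFields{v & 0xFFu, (v >> 8) & 0xFFu, ((v >> 16) & 1u) != 0u};
}

// 8 + 2048 + 65536 with the PR's shift; 8 + 8192 + 65536 with the suggested one.
static_assert(pack_quant(8, 8, true, 8) == 67592u, "shift-8 layout");
static_assert(pack_quant(8, 8, true, 10) == 73736u, "shift-10 layout");
```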

rfsaliev · 2025-06-05

I tried to encode the quantization mode as a kind of bitmask, thinking such an encoding would be useful for further processing. I also tried to keep enough bits in the bitmask to support up to 32-bit quantization.
However, the current code does not use such masks, and the pre-compiled SVS library supports only a limited set of quantization modes.
Meanwhile, the LeanVec case is not readable in the test output. E.g.:

1452/1478 Test #1454: SVSTest.svs_empty_index<SVSIndexType<(VecSimType)0,float,(VecSimSvsQuantBits)67592>> ...................................... Passed 0.01 sec

IMHO, at this moment, we can set any values for these constants and we can make them more readable. E.g.:

typedef enum {
    VecSimSvsQuant_NONE = 0,            // No quantization.
    VecSimSvsQuant_4 = 4,               // 4-bit quantization
    VecSimSvsQuant_8 = 8,               // 8-bit quantization
    VecSimSvsQuant_4x4 = 404,           // 4-bit quantization with 4-bit residuals
    VecSimSvsQuant_4x8 = 408,           // 4-bit quantization with 8-bit residuals
    VecSimSvsQuant_4x8_LeanVec = 10408, // LeanVec 4x8 quantization
    VecSimSvsQuant_8x8_LeanVec = 10808, // LeanVec 8x8 quantization
    // Further scalar fallback quantization:
    VecSimSvsQuant_Scalar = 20008       // 8-bit scalar quantization
} VecSimSvsQuantBits;

@DvirDukhan, @alonre24, what do you think?
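
One way to read the decimal values proposed above is as digit groups: a leading family digit, then two digits of primary bits, then two digits of residual bits. The sketch below splits a value along those lines. This reading is an assumption made for illustration; the comment only says the values are meant to be more readable, and split_decimal, DecimalFields and QuantSketch are hypothetical names. The single-level values 4 and 8 do not follow the digit-pair pattern, so the helper rejects them.

```cpp
#include <cassert>

// Two-level values copied from the enum proposed in the comment above.
enum QuantSketch {
    Quant_4x4 = 404,
    Quant_4x8 = 408,
    Quant_4x8_LeanVec = 10408,
    Quant_8x8_LeanVec = 10808
};

struct DecimalFields {
    int family;        // 0 = plain two-level, 1 = LeanVec (assumed reading)
    int primary_bits;  // middle two decimal digits
    int residual_bits; // last two decimal digits
};

// Splits a value of the form F PP RR into its decimal digit groups.
inline DecimalFields split_decimal(int v) {
    assert(v >= 100); // single-level values such as 4 or 8 have no digit pairs
    return DecimalFields{v / 10000, (v / 100) % 100, v % 100};
}
```

Because the fields sit on decimal boundaries, a value such as 10408 printed in a test name can be read directly as "LeanVec, 4 bits, 8 bits" without any bit arithmetic.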

dor-forer · 2025-06-09T14:00:03Z

@rfsaliev Can we merge this?

rfsaliev added 5 commits May 26, 2025 15:01

Add LeanVec support

43123b8

Add basic test for all quantization modes.

1158988

Fix handling of cmake USE_SVS=OFF and HAVE_SVS=false in Tiered SVS index

c07ff36

Update SVS shared library with LeanVec memcheck fixes.

a752d3d

Code review s1e1

1c91b6f

rfsaliev requested review from alonre24 and meiravgri May 26, 2025 13:36

[SVS] Improve code coverage in tests

070d9e2

DvirDukhan requested a review from Copilot June 4, 2025 13:14

Copilot AI reviewed Jun 4, 2025

alonre24 requested review from dor-forer and removed request for meiravgri June 5, 2025 09:30

Small fixes

94e29cb

dor-forer reviewed Jun 5, 2025

Code review e2s1

493d732

dor-forer approved these changes Jun 8, 2025

rfsaliev added this pull request to the merge queue Jun 9, 2025

Merged via the queue into main with commit b0d4bf6 Jun 9, 2025
17 checks passed

rfsaliev deleted the rfsaliev/svs-leanvec branch June 9, 2025 16:55

dor-forer mentioned this pull request Jun 10, 2025

[SVS] SVS API and functionality update #676

Merged

2 tasks
