Support FLOAT64 as vector data type MOD-3982 #3129

alonre24 · 2022-10-02T10:29:18Z

This PR integrates support for FLOAT64 vectors into RediSearch and adds flow tests. Enabling it required only two small modifications, thanks to preparation work done in advance:

In FT.CREATE: allow specifying FLOAT64 as the value of the TYPE argument of a vector field in a schema. For example: FT.CREATE idx SCHEMA v VECTOR HNSW 6 TYPE FLOAT64 DIM 4096 DISTANCE_METRIC L2
In the JSON integration: choose the callback that converts each vector element in a JSON array into the right data type, based on the vector index metadata (that is, allow converting elements to float64 in addition to float32).
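For illustration, here is a minimal Python sketch of what the TYPE choice means on the client side. It assumes vectors are sent as raw little-endian binary blobs when they are not given as JSON arrays. The helpers `pack_vector` and `create_args` are hypothetical names used only here; they are not part of RediSearch or of this PR.

```python
import struct

# struct format characters for the two vector types discussed in this PR.
TYPE_FORMATS = {"FLOAT32": "f", "FLOAT64": "d"}


def pack_vector(values, data_type):
    """Serialize numbers into a little-endian blob (byte order is an assumption)."""
    fmt = TYPE_FORMATS[data_type]
    return struct.pack(f"<{len(values)}{fmt}", *values)


def create_args(index, field, data_type, dim):
    """Build the FT.CREATE argument list shown in the PR description (hypothetical helper)."""
    return ["FT.CREATE", index, "SCHEMA", field, "VECTOR", "HNSW", "6",
            "TYPE", data_type, "DIM", str(dim), "DISTANCE_METRIC", "L2"]


vec = [0.1, 0.2, 0.3, 0.4]
blob32 = pack_vector(vec, "FLOAT32")
blob64 = pack_vector(vec, "FLOAT64")

# A FLOAT64 blob is twice the size of a FLOAT32 blob of the same dimension.
print(len(blob32), len(blob64))
print(" ".join(create_args("idx", "v", "FLOAT64", 4096)))
```

The blob length must match DIM times the element size of the declared TYPE, so a client that switches an index to FLOAT64 also has to switch how it serializes vectors.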

The rest of this PR handles testing, including a refactoring of test_vecsim.py that makes it more generic, so that tests run over multiple data types rather than a single hard-coded one.

CLAassistant · 2022-10-02T10:29:28Z

All committers have signed the CLA.

lgtm-com · 2022-10-02T10:45:29Z

This pull request introduces 1 alert and fixes 24 when merging 9f74b3b into 2144521 - view on LGTM.com

new alerts:

1 for Multiplication result converted to larger type

fixed alerts:

24 for Function declared in block

lgtm-com · 2022-10-02T11:07:26Z

This pull request introduces 1 alert and fixes 24 when merging 4393d59 into 2144521 - view on LGTM.com

new alerts:

1 for Multiplication result converted to larger type

fixed alerts:

24 for Function declared in block

lgtm-com · 2022-10-02T12:39:36Z

This pull request introduces 1 alert and fixes 24 when merging 9d5b398 into 2144521 - view on LGTM.com

new alerts:

1 for Multiplication result converted to larger type

fixed alerts:

24 for Function declared in block

lgtm-com · 2022-10-02T14:29:21Z

This pull request introduces 1 alert and fixes 24 when merging 863507d into 7991f4a - view on LGTM.com

new alerts:

1 for Multiplication result converted to larger type

fixed alerts:

24 for Function declared in block

codecov · 2022-10-02T16:41:52Z

Codecov Report

Base: 82.71% // Head: 82.74% // Increases project coverage by +0.03% 🎉

Coverage data is based on head (20a0b6e) compared to base (208daf1).
Patch coverage: 76.47% of modified lines in pull request are covered.

❗ Current head 20a0b6e differs from pull request most recent head 4ea2114. Consider uploading reports for the commit 4ea2114 to get more accurate results

Additional details and impacted files

@@            Coverage Diff             @@
##           master    #3129      +/-   ##
==========================================
+ Coverage   82.71%   82.74%   +0.03%     
==========================================
  Files         180      180              
  Lines       29654    29660       +6     
==========================================
+ Hits        24528    24542      +14     
+ Misses       5126     5118       -8

| Impacted Files | Coverage Δ | |
|---|---|---|
| src/aggregate/aggregate_exec.c | `96.72% <60.00%> (-0.56%)` | ⬇️ |
| src/debug_commads.c | `88.13% <60.00%> (+0.52%)` | ⬆️ |
| src/index.c | `83.66% <100.00%> (+0.86%)` | ⬆️ |
| src/json.c | `88.88% <100.00%> (+1.34%)` | ⬆️ |
| src/spec.c | `87.85% <100.00%> (+0.01%)` | ⬆️ |
| src/fork_gc.c | `56.14% <0.00%> (-0.81%)` | ⬇️ |
| src/vector_index.c | `86.46% <0.00%> (+0.43%)` | ⬆️ |

DvirDukhan · 2022-10-02T11:01:27Z

docs/docs/reference/Vectors.md

@@ -127,7 +127,7 @@ FT.CREATE my_index2
 SCHEMA vector_field VECTOR 
 HNSW 
 14 
-TYPE FLOAT32 
+TYPE FLOAT64 


Why not add another example instead of changing it?

The FLAT index example already uses FLOAT32, and I didn't want to overload the docs with examples.

DvirDukhan · 2022-10-03T10:45:40Z

tests/pytests/test_vecsim.py

+        # For FLOAT64, this block size exceeds 10% of system memory, but not for FLOAT32
+        block_size = system_memory // (dim*float64_byte_size) // 9
+        if data_type == 'FLOAT32':
+            env.expect('FT.CREATE', currIdx, 'SCHEMA', 'v', 'VECTOR', 'FLAT', '10', 'TYPE', data_type,
+                       'DIM', dim, 'DISTANCE_METRIC', 'L2', 'INITIAL_CAP', 0, 'BLOCK_SIZE', block_size).ok()
+        else:
+            env.expect('FT.CREATE', currIdx, 'SCHEMA', 'v', 'VECTOR', 'FLAT', '10', 'TYPE', data_type,
+                   'DIM', dim, 'DISTANCE_METRIC', 'L2', 'INITIAL_CAP', 0, 'BLOCK_SIZE', block_size).error().contains(
+            f'Vector index block size {block_size} exceeded server limit')
+


It is a bit confusing that one type passes and the other does not.

Right, but that's the exact purpose of the test... to test the difference in memory estimation between the two types
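The arithmetic behind the test can be sketched as follows. This is only an illustration: it assumes the server limit is exactly 10% of system memory and that a block's footprint is `block_size * dim * element_size`. The real check in the server may account for additional overhead. The `system_memory` and `dim` values and the function names are made up for the example.

```python
FLOAT32_BYTES = 4
FLOAT64_BYTES = 8


def block_memory(block_size, dim, elem_bytes):
    """Estimated bytes needed for one block of vectors (simplified assumption)."""
    return block_size * dim * elem_bytes


def exceeds_limit(block_size, dim, elem_bytes, system_memory):
    """True if one block needs more than 10% of system memory (assumed limit)."""
    return block_memory(block_size, dim, elem_bytes) * 10 > system_memory


system_memory = 16 * 1024**3  # hypothetical 16 GiB machine
dim = 16                      # hypothetical dimension

# Same formula as in the test: roughly 1/9 of memory when elements are 8 bytes wide.
block_size = system_memory // (dim * FLOAT64_BYTES) // 9

# FLOAT64: about 1/9 of memory, which is above the 1/10 limit.
print(exceeds_limit(block_size, dim, FLOAT64_BYTES, system_memory))
# FLOAT32: about 1/18 of memory, which is below the 1/10 limit.
print(exceeds_limit(block_size, dim, FLOAT32_BYTES, system_memory))
```

Under these assumptions the same BLOCK_SIZE is rejected for FLOAT64 and accepted for FLOAT32, which is the difference in memory estimation the test is meant to exercise.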

lgtm-com · 2022-10-03T12:00:34Z

This pull request introduces 1 alert and fixes 24 when merging 20a0b6e into 7991f4a - view on LGTM.com

new alerts:

1 for Multiplication result converted to larger type

fixed alerts:

24 for Function declared in block

alonre24 added 2 commits October 1, 2022 14:39

Enable float64 + refactor tests and add tests to float64

6b5486a

Update the rest of the tests + enable float64 with json and update json tests

9f74b3b

alonre24 requested a review from DvirDukhan October 2, 2022 10:29

sonatype-lift bot reviewed Oct 2, 2022

tests/pytests/test_vecsim.py

sonatype-lift bot reviewed Oct 2, 2022

tests/pytests/test_vecsim.py

alonre24 added 3 commits October 2, 2022 13:53

Update docs + vecsim ref

f8cb6fb

Merge branch 'master' into alon_enable_vecsim_fp64

2562b9d

Merge branch 'master' into alon_enable_vecsim_fp64

4393d59

alonre24 added 2 commits October 2, 2022 15:24

add scipy to requirements, revert NEVER_DECODE removal

29654fd

Merge branch 'alon_enable_vecsim_fp64' of https://github.com/RediSearch/RediSearch into alon_enable_vecsim_fp64

9d5b398

alonre24 added 2 commits October 2, 2022 17:14

small fix for test in cluster

9c87c8c

Merge branch 'master' into alon_enable_vecsim_fp64

863507d

DvirDukhan reviewed Oct 3, 2022

Addressing PR comments

20a0b6e

alonre24 added 2 commits October 18, 2022 15:40

merge with updated master + resolve conflicts in test_vecsim

42a5f42

Merge with updated master (vecsim version bumped to 0.5.0)

4ea2114

alonre24 marked this pull request as ready for review October 19, 2022 13:13

alonre24 requested a review from DvirDukhan October 19, 2022 13:13

DvirDukhan approved these changes Oct 19, 2022

DvirDukhan changed the title ~~Support FLOAT64 as vector data type~~ Support FLOAT64 as vector data type MOD-3982 Oct 19, 2022

oshadmi merged commit 8f34860 into master Oct 19, 2022

oshadmi deleted the alon_enable_vecsim_fp64 branch October 19, 2022 16:26

snewcomer mentioned this pull request Apr 26, 2024

[Feature Request] int8 or float16 support for vector fields #4609

Open
