-
Notifications
You must be signed in to change notification settings - Fork 180
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: forbid creating index if num_sub_vectors doesn't divide dim #2234
Conversation
Signed-off-by: BubbleCal <bubble-cal@outlook.com>
Codecov ReportAll modified and coverable lines are covered by tests ✅
Additional details and impacted files@@ Coverage Diff @@
## main #2234 +/- ##
==========================================
- Coverage 81.11% 81.08% -0.04%
==========================================
Files 185 186 +1
Lines 53411 53575 +164
Branches 53411 53575 +164
==========================================
+ Hits 43324 43439 +115
- Misses 7605 7669 +64
+ Partials 2482 2467 -15
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Sentry. |
if data.num_columns() % m != 0 { | ||
return Err(Error::invalid_input( | ||
format!( | ||
"num_sub_vectors must divide vector dimension {}, but got {}", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If possible, I might refer the user in a section to the docs that explains this in more detail. I'd imagine that some users just run the index command expecting it to work, and if it doesn't they probably need more than this message to debug it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Unfortunately, we don't yet have a docs page that describes this well. We should add that, but this seems like a definite improvement over the current panics.
rust/lance/src/index/vector/ivf.rs
Outdated
assert!(dataset | ||
.create_index(&["vector"], IndexType::Vector, None, ¶ms, false) | ||
.await | ||
.is_err()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It would be nice to also assert the error message:
assert!(dataset | |
.create_index(&["vector"], IndexType::Vector, None, ¶ms, false) | |
.await | |
.is_err()); | |
let res = dataset | |
.create_index(&["vector"], IndexType::Vector, None, ¶ms, false) | |
.await; | |
assert!( | |
matches!(res, Err(Error::InvalidInput { source, .. } | |
if source.to_string().contains("num_sub_vectors must divide vector dimension"))), | |
"{:?}", | |
&res | |
); |
see details: lancedb/lancedb#1222