Brought back index regeneration #2648
Conversation
Codecov Report: All modified lines are covered by tests ✅
Additional details and impacted files

```
@@            Coverage Diff             @@
##             main    #2648      +/-   ##
==========================================
- Coverage   84.30%   81.79%    -2.52%
==========================================
  Files         229      229
  Lines       25503    25499       -4
==========================================
- Hits        21501    20857     -644
- Misses       4002     4642     +640
==========================================
```

Flags with carried forward coverage won't be shown.

☔ View full report in Codecov by Sentry.
LGTM
```python
self.delete_vdb_index(vdb_index["id"])
# Recreate it back.
self.create_vdb_index(
    vdb_index["id"],
    vdb_index["distance"],
    additional_params=vdb_index.get("additional_params", None),
)
```
What happens when either of these commands fails? Do we end up with a corrupt index, or do we just delete the index? If possible, I think we should roll back to the previous index state.
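The failure mode raised here could be guarded explicitly. A minimal sketch, assuming the `delete_vdb_index`/`create_vdb_index` calls from the diff; the exception type and wrapper function are hypothetical, not part of the PR:

```python
# Hypothetical guard around the delete/recreate pair from the diff:
# if recreation fails after deletion, surface a clear error instead of
# silently leaving the index missing.

class IndexRegenerationError(RuntimeError):
    """Raised when an index was deleted but could not be recreated."""

def regenerate_index(store, vdb_index):
    store.delete_vdb_index(vdb_index["id"])
    try:
        store.create_vdb_index(
            vdb_index["id"],
            vdb_index["distance"],
            additional_params=vdb_index.get("additional_params", None),
        )
    except Exception as exc:
        # The old index is already gone at this point; report it clearly
        # so the caller doesn't discover a missing index much later.
        raise IndexRegenerationError(
            f"index {vdb_index['id']!r} was deleted but not recreated"
        ) from exc
```

This doesn't restore the old index (the thread below explains why a rollback may not be desirable); it only makes the partial-failure state explicit.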
@adolkhan I don't think we should roll back, because the previous index won't have all the samples; even though we technically could, it wouldn't be a correct index. Btw, this logic is being removed with incremental updates, so this code will likely exist only for a short time.
In the future, shouldn't we undo the whole operation? Say we tried to delete some samples and the index part errored out: shouldn't we add back all the deleted elements and use the old index, while throwing an exception saying the operation was unsuccessful?
Otherwise the dataset will be left in a weird state: some samples got removed while the index was wiped out completely, or, if the error happened during deletion, we end up with either a corrupted index or the old index. Isn't that the case? @sounakr correct me if I am wrong. But since you said this code is short-lived, I think it is fine to merge these changes.
That's a good point, though I don't like the idea of writing rollbacks near the user-facing API. We should implement all of this at the low level: the API for adding samples, the index updates, and all the rollback logic should live behind a single user-facing API call.

That's how `ds.extend({tensor_1: ..., tensor_2: ...})` works. In that API, if some samples fail to append, no samples are appended at all, presumably using something similar to rollbacks; most importantly, the person who calls the `ds.extend` API doesn't implement or worry about any of that.
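The all-or-nothing behavior described for `ds.extend` can be sketched generically. This is a hypothetical illustration of the pattern (snapshot, append, roll back on failure), not Deep Lake's actual implementation:

```python
def atomic_extend(dataset, samples, validate):
    """Append every sample or none: roll back partial appends on failure.

    `dataset` is any list-like store; `validate` raises on a bad sample.
    Both are stand-ins for the real low-level machinery.
    """
    snapshot = len(dataset)
    try:
        for sample in samples:
            validate(sample)       # any failure aborts the whole batch
            dataset.append(sample)
    except Exception:
        del dataset[snapshot:]     # restore the pre-call state
        raise
```

The caller sees either a fully extended dataset or an exception with the dataset unchanged, which is the property being argued for here: rollback logic lives in one low-level place, not in every call site.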
@istranic / @adolkhan, when there is an error or corruption we can do the following:
- Retry index creation from the very beginning, i.e. not incrementally but from scratch.
- If the retry also fails, mark the index as non_usable. This non_usable flag will temporarily prohibit us from using the index during search.
- The index then has to be manually dropped and recreated later to clear the flag and make it usable again.

Let me know your thoughts. I will implement this logic in the incremental-update work.
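The retry-then-flag proposal above can be sketched as follows. The `usable` field, function names, and retry count are assumptions for illustration only:

```python
def rebuild_or_disable(index_meta, create_from_scratch, retries=1):
    """Attempt full (non-incremental) index creation, retrying once by
    default; if every attempt fails, flag the index as unusable so
    search skips it until it is manually dropped and recreated."""
    for _ in range(retries + 1):
        try:
            create_from_scratch()  # full rebuild, not incremental
        except Exception:
            continue               # try again from scratch
        index_meta["usable"] = True
        return True
    index_meta["usable"] = False   # search must fall back until the
    return False                   # index is dropped and recreated
```

Search code would then consult `index_meta["usable"]` and fall back to a brute-force scan when it is `False`.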
The dataset and the index should be kept separate. The dataset is a superset, and the index can be recreated at any time as long as the dataset is present. So in case of corruption, we have Indra's generic cosine-similarity and L2 implementations, which can take over even if the HNSW index is absent.
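The fallback described here is just an exact scan over the raw embeddings. A minimal NumPy sketch of brute-force cosine search (illustrative only, not the Indra implementation):

```python
import numpy as np

def cosine_top_k(embeddings, query, k=4):
    """Return indices of the k rows most cosine-similar to `query`.

    Exact O(n) scan: correct without any index, just slower than HNSW.
    """
    emb = embeddings / np.linalg.norm(embeddings, axis=1, keepdims=True)
    q = query / np.linalg.norm(query)
    scores = emb @ q                 # cosine similarity per row
    return np.argsort(-scores)[:k]   # highest similarity first
```

Because it touches every row, this stays correct whenever the HNSW index is missing or flagged unusable, at linear cost per query.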
@sounakr My concern is not about the index corruption itself. My concern is about our code structure, and the fact that it's architected in a way that's very bad for a developer or a user.
Kudos, SonarCloud Quality Gate passed!
🚀 🚀 Pull Request
Description
- Added index regeneration back; it was removed previously by mistake.
- Minor refactors to how parameters are passed.
- Increased M and efConstruction to provide more accuracy for the index.