Incremental indexing #41

Merged
merged 31 commits into from
Jan 11, 2024

irevoire
Member

@irevoire irevoire commented Dec 6, 2023

Fix #21

  • add_item keeps track of the items that have been modified in a roaring bitmap in the metadata
  • Store n_item as a roaring bitmap in the metadata
  • About preprocess:
    • We can probably compute the max-norm on the fly while adding the items
    • If the max-norm doesn't change (after a document addition or deletion), we only need to run the second part of the pre-processing on the new nodes, not on all the nodes
  • To insert the new documents, at every split node we split the IDs we're supposed to insert between the left and right children
  • If we deleted too many items in a descendants node, we should look at the split node above it and try to merge both branches into a single descendants node
  • Once we're able to insert the items in a descendant/item leaf, we need to check its size and create a new split node if needed
  • Do we need to re-compute the normal of all the impacted nodes?
  • If the number of items increases, we may need to create new trees
  • Fix append_item by moving the item node mode to the end of the enum
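
The insertion steps in the list above (splitting the IDs at each split node, then checking the size of the leaf they land in) can be sketched roughly as follows. This is a minimal illustration and not the code from this PR: `SplitNode`, `Side`, `split_ids`, and `needs_split` are hypothetical names, the tie-breaking rule is an assumption, and `std::collections::BTreeSet` stands in for the roaring bitmap so that the snippet needs no external crate.

```rust
use std::collections::{BTreeMap, BTreeSet};

/// Hypothetical stand-in for a split node: a hyperplane through the origin,
/// described only by its normal vector.
pub struct SplitNode {
    pub normal: Vec<f32>,
}

/// Which child of a split node an item belongs to.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Side {
    Left,
    Right,
}

/// Plain dot product over the common length of both slices.
fn dot(a: &[f32], b: &[f32]) -> f32 {
    a.iter().zip(b).map(|(x, y)| x * y).sum()
}

impl SplitNode {
    /// Items with a strictly positive margin go right, everything else left.
    /// (Assumed tie-breaking rule, chosen here only for determinism.)
    pub fn side(&self, vector: &[f32]) -> Side {
        if dot(&self.normal, vector) > 0.0 {
            Side::Right
        } else {
            Side::Left
        }
    }
}

/// Splits the IDs to insert between the left and right children of `node`.
/// A `BTreeSet<u32>` is used where the PR describes a roaring bitmap.
pub fn split_ids(
    node: &SplitNode,
    to_insert: &BTreeSet<u32>,
    vectors: &BTreeMap<u32, Vec<f32>>,
) -> (BTreeSet<u32>, BTreeSet<u32>) {
    let mut left = BTreeSet::new();
    let mut right = BTreeSet::new();
    for &id in to_insert {
        // IDs without a stored vector are skipped in this sketch.
        let Some(vector) = vectors.get(&id) else { continue };
        match node.side(vector) {
            Side::Left => left.insert(id),
            Side::Right => right.insert(id),
        };
    }
    (left, right)
}

/// Returns true when a leaf holding `len` items has outgrown `max_leaf_size`
/// and should be replaced by a new split node.
pub fn needs_split(len: usize, max_leaf_size: usize) -> bool {
    len > max_leaf_size
}
```

Applied recursively, `split_ids` would route each batch of new IDs down a tree until it reaches a leaf, at which point `needs_split` decides whether that leaf must become a new split node. How the normal of a new split node is chosen is left out here.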

@irevoire irevoire marked this pull request as ready for review January 10, 2024 17:17
@irevoire irevoire added the indexing (Everything related to indexing), enhancement (New feature or request), performance, and relevancy labels Jan 10, 2024
@irevoire irevoire merged commit 2337e1e into main Jan 11, 2024
7 checks passed
@irevoire irevoire deleted the incremental-indexing branch January 11, 2024 16:25
Development

Successfully merging this pull request may close these issues.

Implement incremental updates
2 participants