Remove the docid_word_positions database #3819

Merged
merged 1 commit into from Jun 7, 2023

loiclec
Contributor

@loiclec loiclec commented Jun 7, 2023

Remove the docid_word_positions database, which was only used during deletion operations. In the process, this also fixes #3816.

That issue would occur when a word was deleted from all exact attributes
but not from all regular attributes.
@loiclec loiclec added the performance (related to performance in terms of search/indexation speed or RAM/CPU/disk consumption) and indexing labels Jun 7, 2023
@loiclec loiclec requested a review from Kerollmops June 7, 2023 09:31
@Kerollmops Kerollmops added this to the v1.3.0 milestone Jun 7, 2023
Member

@Kerollmops Kerollmops left a comment

Thank you for that, @loiclec!
Do you know by how much this reduces the size of an index?
bors merge

@loiclec
Contributor Author

loiclec commented Jun 7, 2023

I just ran some quick tests. On a clean indexing of movies.json and smol_wiki_articles_3_4.csv, we eliminate about 15% of the index size.

@meili-bors
Contributor

meili-bors bot commented Jun 7, 2023

Build succeeded:

@meili-bors meili-bors bot merged commit 9dcf1da into main Jun 7, 2023
10 checks passed
@meili-bors meili-bors bot deleted the remove-docid-word-positions branch June 7, 2023 10:43
@meili-bot meili-bot added the v1.3.0 label (PRs/issues solved in v1.3.0, released on 2023-07-31) Aug 2, 2023