Fix sampling of lucene index size #6963
Merged
When we iterate over all the terms in a Lucene index, Lucene also
returns terms that have been deleted and are no longer in use, and it
keeps returning them until the next index compaction. This is not a
big problem in itself for computing index selectivity, but we were
also using these skewed numbers for the index size. This commit fixes
the index size so that it is computed correctly, by asking Lucene for
the number of documents present in the index.
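
The difference between the two counts can be illustrated with a small toy model. This is only a sketch and not the code changed in this pull request: `ToyIndex` and all of its methods are hypothetical names invented for illustration and are not Lucene's API. It assumes an index that marks documents as deleted and only drops their postings on compaction. (In Lucene itself, `IndexReader.numDocs()` reports the live document count, but the exact call used by this commit is not shown here.)

```python
from collections import defaultdict


class ToyIndex:
    """Hypothetical toy inverted index, NOT Lucene's API."""

    def __init__(self):
        self.postings = defaultdict(set)  # term -> ids of docs holding it
        self.docs = {}                    # doc id -> term
        self.deleted = set()              # ids marked deleted, not yet purged
        self.next_id = 0

    def add(self, term):
        doc_id = self.next_id
        self.next_id += 1
        self.docs[doc_id] = term
        self.postings[term].add(doc_id)
        return doc_id

    def delete(self, doc_id):
        # Only mark the document; its postings stay until compact().
        if doc_id in self.docs:
            self.deleted.add(doc_id)

    def count_by_term_walk(self):
        # Walks every term and counts every posting, deleted or not.
        return sum(len(ids) for ids in self.postings.values())

    def num_docs(self):
        # Counts live documents only.
        return len(self.docs) - len(self.deleted)

    def compact(self):
        # Physically remove deleted documents and their postings.
        for doc_id in self.deleted:
            term = self.docs.pop(doc_id)
            self.postings[term].discard(doc_id)
            if not self.postings[term]:
                del self.postings[term]
        self.deleted.clear()


index = ToyIndex()
first = index.add("alice")
index.add("bob")
index.add("bob")
index.delete(first)

walk_before = index.count_by_term_walk()  # still sees the deleted entry
live_before = index.num_docs()            # ignores the deleted entry

index.compact()
walk_after = index.count_by_term_walk()
live_after = index.num_docs()
```

In this model the term walk overcounts until `compact()` runs, while the live document count is correct both before and after compaction, which is why it is the safer source for the index size.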