Should we use a SparseFixedBitSet when deletes are sparse? #13084

Open
jpountz opened this issue Feb 6, 2024 · 0 comments
jpountz commented Feb 6, 2024

Description

@uschindler asked this question in https://lists.apache.org/thread/6o3hn3x8syfm8lj93kk5rrxb0kx701gp.

In that discussion, we were looking at introducing the ability to iterate over deleted docs, so that facets could be computed (cheaply!) across the entire doc ID space and the counts then fixed by iterating over deleted docs and decrementing the counts of the buckets they belong to. Using a SparseFixedBitSet when deletes are sparse would give us a good iterator in all cases, rather than always paying O(maxDoc), which is what FixedBitSet requires to iterate over all clear bits.
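
For illustration, here is a minimal sketch of the count-then-fix idea described above. It deliberately uses `java.util.BitSet` as a stand-in rather than Lucene's own bit sets, and it assumes deleted docs are stored as set bits so they can be visited with `nextSetBit`. The class name, method names, and the `bucketOfDoc` array are hypothetical and not taken from Lucene.

```java
import java.util.Arrays;
import java.util.BitSet;

// Hypothetical sketch: not Lucene code, names are made up for illustration.
class DeletedDocsFacetFix {

    /** Counts every doc in [0, maxDoc) per bucket, ignoring deletes entirely. */
    static int[] countAllDocs(int[] bucketOfDoc, int numBuckets) {
        int[] counts = new int[numBuckets];
        for (int bucket : bucketOfDoc) {
            counts[bucket]++;
        }
        return counts;
    }

    /**
     * Fixes the counts by visiting only the deleted docs.
     * Assumption: 'deleted' has one set bit per deleted doc ID.
     */
    static void subtractDeleted(int[] counts, int[] bucketOfDoc, BitSet deleted) {
        for (int doc = deleted.nextSetBit(0); doc >= 0; doc = deleted.nextSetBit(doc + 1)) {
            counts[bucketOfDoc[doc]]--;
        }
    }

    public static void main(String[] args) {
        // bucketOfDoc[doc] is the facet bucket of each doc ID (maxDoc = 6 here).
        int[] bucketOfDoc = {0, 1, 1, 2, 0, 1};
        BitSet deleted = new BitSet(bucketOfDoc.length);
        deleted.set(2); // doc 2 is in bucket 1
        deleted.set(4); // doc 4 is in bucket 0

        int[] counts = countAllDocs(bucketOfDoc, 3);
        subtractDeleted(counts, bucketOfDoc, deleted);
        System.out.println(Arrays.toString(counts)); // prints [1, 2, 1]
    }
}
```

The cost of the fix-up step depends on how cheaply the deleted docs can be enumerated, which is why the choice of bit set matters when deletes are sparse.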

If sequential access to deletes weren't a requirement, a set-based approach would work too.
