
Add prefetching for terms dict in doc values #14773


Closed
wants to merge 1 commit into from


easyice
Contributor

easyice commented Jun 12, 2025

This follows a similar approach to the one used for doc values and prefetches only the first page of data. Perhaps these were missed at the time?


This PR does not have an entry in lucene/CHANGES.txt. Consider adding one. If the PR doesn't need a changelog entry, then add the skip-changelog label to it and you will stop receiving this reminder on future updates to the PR.

@jpountz
Contributor

jpountz commented Jun 18, 2025

Doing this is a bit less obvious in my mind, since terms dictionaries are allowed to have a random access pattern, whereas doc-value iterators are required to be consumed in order.

Somewhat separately, I've been wondering if we need two APIs: one for saying "I'm going to use this field", which would pass the full range of data, and another for saying "I'm going to use this specific range of bytes", which is what IndexInput#prefetch does today.
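
To make the distinction concrete, here is a minimal sketch of the two kinds of hint. It does not use Lucene: `PrefetchSketch`, `RecordingInput`, and `prefetchAll` are hypothetical names invented for illustration, and the class only records requests instead of doing I/O. The range-based method is modeled on the "specific range of bytes" hint described above; the whole-input method is one possible shape for the "I'm going to use this field" hint, not an existing API.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch; none of these types exist in Lucene. */
final class PrefetchSketch {

  /** Stand-in for an input: records prefetch hints as {offset, length} pairs, does no I/O. */
  static final class RecordingInput {
    final long length;
    final List<long[]> requests = new ArrayList<>();

    RecordingInput(long length) {
      this.length = length;
    }

    /** Range hint: "I'm going to use this specific range of bytes". */
    void prefetch(long offset, long len) {
      // Overflow-safe bounds check: the range must lie within [0, length].
      if (offset < 0 || len < 0 || offset > length || len > length - offset) {
        throw new IllegalArgumentException("range out of bounds: " + offset + "+" + len);
      }
      requests.add(new long[] {offset, len});
    }

    /** Whole-input hint (assumed API): "I'm going to use this field". */
    void prefetchAll() {
      prefetch(0, length);
    }
  }

  public static void main(String[] args) {
    RecordingInput in = new RecordingInput(10_000);
    in.prefetch(0, 1);   // touch only the start of the data
    in.prefetchAll();    // hint that the full range will be used
    for (long[] r : in.requests) {
      System.out.println("offset=" + r[0] + " length=" + r[1]);
    }
  }
}
```

With a sequential consumer, a hint on the first bytes may be enough because later pages tend to follow from read-ahead; with a random access pattern, only the whole-range form says anything useful about what will be read next.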

@easyice
Contributor Author

easyice commented Jun 18, 2025

Thanks for the explanation. You're right, I'll close this PR. I do feel an API to prefetch the full range of data could be useful, e.g. for files like .tip, which are relatively small but have a significant impact on read performance.

