Make FSTPostingsFormat load FSTs off-heap #12552
Conversation
this.dict = new FST<>(in, in, new FSTTermOutputs(fieldInfo));
OffHeapFSTStore offHeapFSTStore = new OffHeapFSTStore();
this.dict = new FST<>(in, in, new FSTTermOutputs(fieldInfo), offHeapFSTStore);
in.skipBytes(offHeapFSTStore.size());
just curious - why do we need to skip to the end?
The OffHeapFSTStore doesn't advance the input (the on-heap one does), so we need to skip past the FST bytes manually, since the input contains multiple FSTs.
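To illustrate the point above, here is a minimal sketch that does not use Lucene at all. A `java.nio.ByteBuffer` stands in for the `IndexInput`, and the class and method names (`MultiBlobInputSketch`, `readOnHeap`, `readOffHeap`, `twoBlobs`) are hypothetical, as is the length-prefixed layout. It only shows the general pattern: a reader that copies bytes consumes them, while a reader that keeps a view leaves the input where the blob starts, so the caller has to skip ahead before reading the next blob.

```java
import java.nio.ByteBuffer;

public class MultiBlobInputSketch {
    /** On-heap style: copies the blob into a new array, which moves the input past it. */
    public static byte[] readOnHeap(ByteBuffer in) {
        int numBytes = in.getInt();
        byte[] copy = new byte[numBytes];
        in.get(copy); // advances in.position() by numBytes
        return copy;
    }

    /** Off-heap style: returns a view over the blob and does NOT consume it. */
    public static ByteBuffer readOffHeap(ByteBuffer in) {
        int numBytes = in.getInt();
        ByteBuffer view = in.slice(); // shares content, has its own position
        view.limit(numBytes);
        return view; // in.position() still points at the start of the blob
    }

    /** Caller-side skip, analogous in spirit to in.skipBytes(store.size()). */
    public static void skipBytes(ByteBuffer in, int numBytes) {
        in.position(in.position() + numBytes);
    }

    /** Builds an input holding two length-prefixed blobs: {1, 2, 3} and {9, 8}. */
    public static ByteBuffer twoBlobs() {
        ByteBuffer buf = ByteBuffer.allocate(4 + 3 + 4 + 2);
        buf.putInt(3).put(new byte[] {1, 2, 3});
        buf.putInt(2).put(new byte[] {9, 8});
        buf.flip();
        return buf;
    }

    public static void main(String[] args) {
        ByteBuffer in = twoBlobs();
        ByteBuffer first = readOffHeap(in);
        skipBytes(in, first.remaining()); // without this, the next read starts inside blob 1
        ByteBuffer second = readOffHeap(in);
        System.out.println(second.get(0)); // prints 9
    }
}
```

If the `skipBytes` call is dropped, the second read interprets the first blob's payload as a length header, which is the kind of corruption the manual skip in the diff avoids.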
I was worried at first that we are not cloning this IndexInput anywhere and that this would cause concurrency bugs when two queries pull a TermsEnum here, but we are OK because OffHeapFSTStore does this cloning when it pulls a random access slice from the IndexInput.
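The concern in this comment can be sketched without Lucene. In the example below a read-only `ByteBuffer` stands in for the shared `IndexInput`, and `duplicate()` stands in for pulling a per-reader slice; the class `SharedInputSliceSketch` and its `newReader` method are hypothetical names for illustration, not Lucene's API. The idea is that each reader gets its own position over the same bytes, so one reader's progress never disturbs another's.

```java
import java.nio.ByteBuffer;

public class SharedInputSliceSketch {
    // Shared backing data; its own position is never moved after construction.
    private final ByteBuffer shared;

    public SharedInputSliceSketch(byte[] data) {
        this.shared = ByteBuffer.wrap(data).asReadOnlyBuffer();
    }

    /** Each caller gets an independent view: same bytes, private position and limit. */
    public ByteBuffer newReader() {
        return shared.duplicate();
    }

    public static void main(String[] args) {
        SharedInputSliceSketch input = new SharedInputSliceSketch(new byte[] {10, 20, 30});
        ByteBuffer queryA = input.newReader();
        ByteBuffer queryB = input.newReader();

        queryA.get(); // queryA moves to position 1
        queryA.get(); // queryA moves to position 2

        // queryB is unaffected by queryA's reads and still starts at the first byte.
        System.out.println(queryB.position()); // prints 0
        System.out.println(queryB.get());      // prints 10
    }
}
```

Had both queries read through the single shared buffer instead, their reads would interleave on one position, which is the concurrency bug the comment was worried about.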
@Tony-X have you tried passing all Lucene unit tests using this Codec? I think you can add
@mikemccand hey Mike, I did not make a new Codec for this. IIRC, Oh wait, I see that the test target support Since the successful output doesn't provide randomization info, I also ran with a non-existent postings format
Thanks for the CHANGES entry - I'll push shortly
Thanks @msokolov !
I think this was mistakenly not backported to 9.x? (I only caught this because I was seeing merge conflicts while trying to backport #12803.) I'll backport shortly -- I think this is low risk for the pending 9.9.0 release: this has baked in main for a couple of months, and this is a very rarely used experimental postings format.
* Make FSTPostingsFormat load FSTs off-heap
Description
FSTs have supported off-heap loading for a while. When we tried to use FSTPostingsFormat for some fields, we noticed that heap usage increased. On further investigation we found that FSTPostingsFormat does not load its FSTs off-heap. This PR addresses that.