Check the real memory circuit breaker when building global ordinals #102462
Pinging @elastic/es-analytics-geo (Team:Analytics)
Hi @iverase, I've created a changelog YAML for you.
server/src/main/java/org/elasticsearch/index/fielddata/ordinals/GlobalOrdinalsBuilder.java
Thanks for looking into this; this is the kind of approach I had in mind. From a quick look, checking the circuit breaker is not cheap, but doing it infrequently, every 8192 terms, should help amortize the cost. I wonder if we should do it even less frequently, e.g. every 64k terms, since adding only 8k terms should not allocate much heap.
I updated it so we do the check every 65536 calls.
LGTM
…lastic#102462) This commit improves the way we build global ordinals by checking the parent circuit breaker every 65536 calls to TermsEnum#next.
Currently, global ordinals are built without any protection against out-of-memory errors, even though they can use quite a bit of heap. This PR improves the way we build global ordinals by checking the parent circuit breaker every 65536 calls to TermsEnum#next.
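To illustrate the amortization idea discussed above, here is a minimal sketch of running an expensive check once every 65536 calls to `next`. It is not the code from this PR: `PeriodicCheckIterator` and the `expensiveCheck` callback are hypothetical names, a plain `java.util.Iterator` stands in for Lucene's `TermsEnum`, and a `Runnable` stands in for the parent circuit breaker check.

```java
import java.util.Iterator;

/**
 * Hypothetical sketch, not the Elasticsearch implementation: wraps an
 * iterator and runs a costly check only once every CHECK_INTERVAL calls
 * to next(), so the cost of the check is amortized over many elements.
 */
final class PeriodicCheckIterator<T> implements Iterator<T> {
    // A power of two, so "every N calls" reduces to a cheap bit mask.
    static final int CHECK_INTERVAL = 65536;

    private final Iterator<T> delegate;
    // Stand-in for the circuit breaker check; expected to throw if memory is exhausted.
    private final Runnable expensiveCheck;
    private long calls = 0;

    PeriodicCheckIterator(Iterator<T> delegate, Runnable expensiveCheck) {
        this.delegate = delegate;
        this.expensiveCheck = expensiveCheck;
    }

    @Override
    public boolean hasNext() {
        return delegate.hasNext();
    }

    @Override
    public T next() {
        // The masked counter is zero exactly on calls 65536, 131072, 196608, ...
        if ((++calls & (CHECK_INTERVAL - 1)) == 0) {
            expensiveCheck.run();
        }
        return delegate.next();
    }
}
```

With this shape, iterating 200,000 elements triggers the check three times rather than 200,000 times. A larger interval lowers the overhead further but lets more allocation happen between checks, which is the trade-off raised in the review.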