Update eager-global-ordinals.asciidoc #40620

Closed
wants to merge 1 commit into from


Leaf-Lin commented Mar 28, 2019

Adding clarification on global ordinals.
Global ordinals are loaded as needed by default, as opposed to up front. Setting `eager_global_ordinals: true` loads them up front and keeps them up to date, instead of lazily loading them when needed.
`eager_global_ordinals` is `false` by default; this change just tries to clarify that.

  • Have you signed the contributor license agreement?
  • Have you followed the contributor guidelines?
  • If submitting code, have you built your formula locally prior to submission with gradle check?
  • If submitting code, is your pull request against master? Unless there is a good reason otherwise, we prefer pull requests against master and will backport as needed.
  • If submitting code, have you checked that your submission is for an OS that we support?
  • If you are submitting this code for a class then read our policy for that.
Leaf-Lin commented Mar 28, 2019

I think the doc can be further improved, but I'm not sure how. Here is an external blog post that gives a good explanation:

Enable eager global ordinals to improve the performance of high-cardinality terms aggregations
Terms aggregations on high-cardinality fields (fields with thousands or millions of possible unique values) may become slow and unpredictable, in part because by default global ordinals are lazily built on the first aggregation that occurs after the previous refresh, as described here.

Building global ordinals eagerly should improve such aggregations and make response times more consistent, as the related data structure will be created when segments are refreshed, as opposed to the first query after each refresh.

Note that if global ordinals are eagerly built, this will impact write performance because new global ordinals will be created on every refresh. To minimize the additional workload caused by frequently building global ordinals due to frequent refreshes, increase the refresh interval.
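To make the two knobs mentioned above concrete, here is a minimal sketch of the request bodies involved. It only builds and prints the JSON; it does not talk to a cluster. The field name `tags` and the `30s` interval are made-up values for illustration, and the exact mapping endpoint (for example, whether a mapping type is required) depends on the Elasticsearch version, so treat the layout as an assumption to check against the docs for your version.

```python
import json


def eager_ordinals_mapping(field: str) -> dict:
    """Mapping body that enables eager global ordinals on a keyword field.

    `field` is a hypothetical field name chosen for illustration.
    """
    return {
        "properties": {
            field: {
                "type": "keyword",
                # false by default; true builds global ordinals at refresh time
                "eager_global_ordinals": True,
            }
        }
    }


def refresh_interval_settings(interval: str) -> dict:
    """Index settings body that raises the refresh interval.

    Fewer refreshes means fewer eager rebuilds of global ordinals.
    """
    return {"index": {"refresh_interval": interval}}


# Hypothetical values: field "tags", refresh every 30 seconds.
mapping_body = eager_ordinals_mapping("tags")
settings_body = refresh_interval_settings("30s")

print(json.dumps(mapping_body, indent=2))
print(json.dumps(settings_body, indent=2))
```

The first body would be sent to the index's mapping endpoint and the second to its settings endpoint.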


mayya-sharipova left a comment

@Leaf-Lin Thanks for your contributions.

I actually think that the original wording we had in the documentation is clearer and more specific than the one you suggested. You suggested "as-needed", but it is not clear when they are needed. If you want, we can combine your suggestion with our original statement into something like this:

"global ordinals are loaded as needed for a search or aggregation request..."

Another note - please keep the text wrapped around 80 columns.

Also, we can NOT just copy text from external blogs. If you have some original content to enhance the documentation, please go ahead. Your contribution is very much welcome.


Leaf-Lin commented Apr 1, 2019

Hi,

I'm not suggesting that we copy/paste content from other blogs, but I found that the other blog explains more clearly what the setting does.
Also, here's another comment from our internal team:
https://discuss.elastic.co/t/global-ordinals-performance-and-size-on-heap/158320/2

The doc is maybe a bit optimistic and could mention workarounds if loading is slow or if the searcher is refreshed often (each refresh of the main searcher invalidates the loaded global ordinals).

I think the doc can be improved to incorporate more clarity, but I'm not in the position to do so.
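The trade-off described in this thread can be illustrated with a toy model. This is not Elasticsearch code: `ToyShard` and `simulate` are hypothetical names, and the model only counts when a rebuild of global ordinals would happen, assuming every refresh invalidates them as the quoted comment says.

```python
class ToyShard:
    """Toy model of one shard; counts global-ordinal builds, nothing more."""

    def __init__(self, eager: bool) -> None:
        self.eager = eager
        self.ordinals_loaded = False
        self.builds_on_refresh = 0  # cost paid at refresh (indexing side)
        self.builds_on_search = 0   # cost paid by the first search after a refresh

    def refresh(self) -> None:
        # Assumption from the thread: each refresh invalidates loaded ordinals.
        self.ordinals_loaded = False
        if self.eager:
            self.ordinals_loaded = True
            self.builds_on_refresh += 1

    def aggregate(self) -> None:
        # Lazy loading: build only if a search needs them and they are missing.
        if not self.ordinals_loaded:
            self.ordinals_loaded = True
            self.builds_on_search += 1


def simulate(eager: bool, events: list) -> tuple:
    """Replay 'refresh' / 'search' events and return (refresh builds, search builds)."""
    shard = ToyShard(eager)
    for event in events:
        if event == "refresh":
            shard.refresh()
        else:
            shard.aggregate()
    return shard.builds_on_refresh, shard.builds_on_search


events = ["refresh", "search", "search", "refresh", "refresh", "search"]
lazy_counts = simulate(False, events)
eager_counts = simulate(True, events)
print("lazy :", lazy_counts)   # (0, 2): two searches pay the build cost
print("eager:", eager_counts)  # (3, 0): every refresh pays, searches never do
```

In this toy run the lazy setting skips the build for the refresh that no search follows, while the eager setting moves all of the cost to refresh time, which is why a longer refresh interval reduces the overhead.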


mayya-sharipova commented Apr 1, 2019

@Leaf-Lin You are right, the doc can be improved, and we have an issue for it.
Would you like to continue working on this small PR and address the requested changes?


Leaf-Lin commented Apr 3, 2019

Ah, I didn't realise this had already been raised. Maybe we can close this one and just focus on that issue.
Some of the info posted here might be useful though.

@Leaf-Lin Leaf-Lin closed this Apr 3, 2019
@Leaf-Lin Leaf-Lin deleted the Leaf-Lin-patch-1 branch Dec 11, 2019