TopChildren Queries Waste Memory #8160

Closed
kire321 opened this Issue Oct 20, 2014 · 5 comments

kire321 commented Oct 20, 2014

TopChildrenQuery allocates a map to hold the results of the query. Currently, the initial capacity of this map is based on the number of documents in the index (indexReader.maxDoc()), so it can be many megabytes in size. Instead, it should be based on the number of parent documents that were asked for.

Pull request coming soon.
Details: Look for readerParentDocs = new IntObjectOpenHashMap<>(indexReader.maxDoc()); in src/main/java/org/elasticsearch/index/search/child/TopChildrenQuery.java
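
To illustrate the proposed change, here is a minimal sketch that sizes the result map from the number of parents requested rather than from maxDoc. It is not the actual patch: the class name, the `initialCapacity` helper, and the use of `java.util.HashMap` in place of `IntObjectOpenHashMap` are assumptions made to keep the example self-contained.

```java
import java.util.HashMap;
import java.util.Map;

public class ParentDocsSizing {

    // Hypothetical helper (not part of Elasticsearch): pick the initial
    // capacity from the number of parents requested, never exceeding the
    // number of documents the reader can hold.
    static int initialCapacity(int requestedParents, int maxDoc) {
        if (requestedParents < 0 || maxDoc < 0) {
            throw new IllegalArgumentException("sizes must be non-negative");
        }
        return Math.min(requestedParents, maxDoc);
    }

    public static void main(String[] args) {
        int maxDoc = 10_000_000;   // illustrative reader size
        int requestedParents = 10; // illustrative number of hits asked for

        // Sizing by maxDoc would reserve room for every document up front.
        // Sizing by the request keeps the map small; it can still grow if
        // more entries are added later.
        Map<Integer, Object> readerParentDocs =
                new HashMap<>(initialCapacity(requestedParents, maxDoc));

        readerParentDocs.put(42, "parent doc placeholder");
        System.out.println("capacity hint: " + initialCapacity(requestedParents, maxDoc));
        System.out.println("entries: " + readerParentDocs.size());
    }
}
```

The cap at maxDoc only guards against a request larger than the reader itself; the important part is that the capacity hint follows the request size.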


clintongormley commented Oct 20, 2014

@kire321 Out of interest, why do you use top_children instead of has_child?


kire321 commented Oct 20, 2014

On our most heavily used cluster, has_child has a latency of 30 seconds or so.


clintongormley commented Oct 20, 2014

@kire321 what version of Elasticsearch are you using, and are you using eager global ordinals on the _parent field? (see http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/fielddata-formats.html#_fielddata_loading)

Also, how many parents and how many children? And in a typical query, how many children match, and how many parents does that equate to?


kire321 commented Oct 20, 2014

Version 1.3.1.
We've never tinkered with fielddata settings. If I understand the documentation correctly, the default is lazy loading, so no, we are not using eager global ordinals.
38544074 parents, 46344271 children. I'll get back to you about the number of matches in a typical query. Stuff is compiling... :)


kire321 commented Oct 20, 2014

A typical query matches 2125 parents and 2114 children.
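
As a rough back-of-the-envelope check on the numbers in this thread, the sketch below compares how many table slots a hash map would reserve when sized for every document versus for the parents in a typical query. It rests on assumptions: the table is modelled as a power-of-two open-addressing table with a load factor of 0.75, and the index-wide document count stands in for maxDoc, which in practice is per reader and so would be smaller on each shard. The class and method names are hypothetical, and the result is a slot count, not a measured memory figure.

```java
public class SlotEstimate {

    // Assumed sizing model: smallest power of two that holds
    // expectedElements at the given load factor.
    static long tableSlots(long expectedElements, double loadFactor) {
        long needed = (long) Math.ceil(expectedElements / loadFactor);
        long slots = 1;
        while (slots < needed) {
            slots <<= 1;
        }
        return slots;
    }

    public static void main(String[] args) {
        long parents = 38_544_074L;   // from the thread
        long children = 46_344_271L;  // from the thread
        long allDocs = parents + children; // stand-in for maxDoc (assumption)
        long typicalParents = 2_125L; // from the thread

        long bigTable = tableSlots(allDocs, 0.75);
        long smallTable = tableSlots(typicalParents, 0.75);

        System.out.println("slots when sized by all docs: " + bigTable);
        System.out.println("slots when sized by matching parents: " + smallTable);
        System.out.println("ratio: " + (bigTable / smallTable));
    }
}
```

Under these assumptions the table sized by document count is several orders of magnitude larger than one sized by the parents actually returned.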

martijnvg added a commit that referenced this issue Oct 20, 2014

@martijnvg martijnvg closed this in 6dac6ec Oct 20, 2014

mute pushed a commit to mute/elasticsearch that referenced this issue Jul 29, 2015
