New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
has_child and inner_hits for grandchild hit doesn't work #11118
Comments
I've created a simpler test case for this, and it seems a little clearer that this doesn't currently work. Add mappings:
Populate:
Query:
Result:
Not only is the grandchild hits empty, but they're also not nested within the child hits, so aren't going to give me what I want anyway. I'm not sure what the intended/expected behaviour would be here, but guess I need to try something else for now. |
Above with inner hits at top-level: Query:
Result:
Here the grandchild inner hit is correctly found, and nested |
[I'm using 1.5.2] |
I managed to track down why this wasn't working. When parsing 'has child' queries, inner hits were always being added to the I've put together a very quick and dirty fix for this (https://github.com/lukens/elasticsearch/commit/fb22d622e7b24074b5f4fe3e26cffd4c38cff75b) which has got it working for my case, but I don't think is suitable for inclusion in a release. Issues I see with my with my fix:
I considered submitting as a pull request, but felt it was far too rough and ready. |
@lukens If you want nested inner hits then you need to use the top level inner hits, the inner_hits on a query doesn't support nesting. The fact that your grandparents inner hits is empty is clearly a bug, thanks for bringing this up! |
Hi, it's grandchild, rather than grandparent, that isn't working. Though you also seem to be suggesting it shouldn't be. Either way, the change I committed shows that it can work, my change just isn't a very nice way to make it work. |
Ah, or are you saying it should work, but just shouldn't be nested? I'm not really sure what the point of it would be if it wasn't nested, though that may just be because it doesn't fit my use case, and I can't think of a use case where it would be useful. I think it would be good if nesting did work, as that would still allow either use case, really. The problem with top level inner_hits is that I have to apply the query once in the has_child query, and then again in the inner_hits query, which makes everything slower than it would otherwise need to be. |
@lukens yes, I meant grandchild. The reason it is a bug is, because the inner_hits in your response shouldn't be empty. The top level inner hits and inner hits defined on a query internally to ES is the same thing and either way of defining inner hits will yield the same performance in terms of query time. The nested inner hits support in the query dsl was left out to reduce complexity and most of the times there is just a single level relationship. Obviously that means for your use case that you need to use top level inner hits. Maybe the inner hits support in the query dsl should support multi level relationships too, but I think the parsing logic shouldn't be get super complex because this. I need to think more about this. Like you said if it the grandchild isn't nested its hits in the response, then it isn't very helpful. The only overhead of top level inner hits is that queries are defined twice, so the request body gets larger. If you're concerned with that, you can consider using search templates, so that you don't you reduce the amount of data send to your cluster. |
Isn't there also an overhead in running the query twice with the top level inner_hits, or does Elasticsearch do something clever so that it only gets run once? (and, if so, does it need to be specified again in the inner hits?) I've not yet come across search templates, are these compatible with highly dynamic searches? I'd like try and spend a bit more time on getting nesting working in a less hacky manner, but am under enormous pressure just to get a project completed at the moment. For the case when grandchildren aren't nested in the query, I guess the simple solution is to also not nest them in the response. I don't think this would overcomplicate the parsing too much. I have a fairly nice way to handle all this in my mind, just not the time to implement it at the moment. |
OK, switching to top level hits doesn't seem to have affected performance, so I can work with that for now. It had seemed much slower before, but once I'd actually got inner_hits on the query working, that ended up just as slow, until I tweaked some other things. The "fix" I've committed above is probably only really of any use as a reference if you want to go with nesting, as it doesn't solve the problem when not nested. I expect there's something in the inner workings of inner_hits that currently relies on them being nested. |
inner_hits runs as part of the fetch phase and always executes an additional search to fetch the inner hits. The search being executed is cheap. It only runs an a single shard and just runs a query that fetches top child docs that matches with
Yes, the dynamic part of the search request can be templated.
Yes, the inner hits features relies on the fact that grandchild and child are nested. When using the top-level inner hits notation this works out, but when using inner_hits as part of the query dsl this doesn't work out, because grandchild inner hit definition isn't nested under the child inner hit definition. In order to fix this properly the query dsl parsing logic should just support nested inner hits. I think the format doesn't need to change in order to support this. Just because the fact the two |
Allso seeing this issue, in my case multi-level nested documents (as in #13064). Would be great to get a solution to this. |
Will this issue be fixed at 2.x? Really looking forward to see the nested inner hits in query dsl. |
When will this be fixed, its a blocker for us !! |
+1 |
1 similar comment
+1 |
@martijnvg just pinging you about this as a reminder |
…g nested `has_child`, `has_parent` or `nested` queries in the query dsl. Closes elastic#11118
+1 |
I added information on how to define _source fields for inner_hits. In addition, the current information on the retrieval of inner_hits for multiple nested documents is incorrect (see also elastic#11118).
+1 |
…y dsl. Fix a limitation that prevent from hierarchical inner hits be defined in query dsl. Removed the nested_path, parent_child_type and query options from inner hits dsl. These options are only set by ES upon parsing the has_child, has_parent and nested queries are using their respective query builders. These options are still used internally, when these options are set a new private copy is created based on the provided InnerHitBuilder and configuring either nested_path or parent_child_type and the inner query of the query builder being used. Closes elastic#11118
Hi Everyone, |
It looks like this issue is still unsolved, at least in elasticsearch 7.1 |
I'm trying to get
inner_hits
to work on an a two level deephas_child
query, but the grandchild hits just seem to return an empty array. If I specify the inner hits at the top level I get the results I want, but its rather too slow. Using inner hits on the has child query seems much faster, but doesn't return the grandchild hits.Below is an example query:
The
document-ref
inner hits just always returns an empty array. Should this work (and, if so, any ideas why it isn't?), or is it beyond the means of what inner hits can currently do?The text was updated successfully, but these errors were encountered: