Support for nested query syntax within query string query DSL #11322

Open
tuespetre opened this issue May 24, 2015 · 34 comments

Comments

@tuespetre tuespetre commented May 24, 2015

I understand that issue #9611 was closed regarding this:

Nested fields need to be queried with nested queries/filters, because multiple documents can match and you need to be able to specify how these multiple scores should be reduced to a single score.
-- @clintongormley

Proposal

I propose that, when a field name within a query string query is parsed and does not match a field mapping, an attempt be made to match the field name to a nested object mapper. If the attempt succeeds, the query text for that field name is then parsed as a query string query using the same settings as the root-level query string query. The resulting query is in turn used to create a ToParentBlockJoinQuery (a nested query) with the same default scoring mode that applies when a nested query is submitted manually ("avg").

The syntax

The acceptable syntax for a nested query within a query string query is similar to this:

nestedPath:"<query string query>"

This means that any constructs you would use in a query string query are valid:

children:"children.first:peggy"
children:"children.first:\"peggy\""
children:"children.first:(peggy ruby)"
children:"children.first:peggy AND children.last:sue"
children:"children.first:pegyg~ +children.last:su?"

Note that the nested query MUST be surrounded with quotes. I wanted it to be parentheses instead, but unfortunately the Lucene QueryParser class does not recognize the field names the way I wanted it to: children:(children.first:peggy) would come out as a TermQuery on children.first, and the children field name would be discarded.
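
To make the intended translation concrete, here is a minimal Python sketch of what the quoted syntax would expand to. It is an illustration written outside the parser, not the code from my local branch: the names NESTED_CLAUSE and to_nested_query are hypothetical, and the nested_paths set stands in for the lookup against nested object mappers described in the proposal.

```python
import re

# Matches path:"..." where the quoted body may contain backslash-escaped characters.
NESTED_CLAUSE = re.compile(r'(\w+):"((?:[^"\\]|\\.)*)"')


def to_nested_query(clause, nested_paths, score_mode="avg"):
    """Translate one path:"<query string>" clause into a nested query dict.

    nested_paths is an assumed stand-in for the mapping lookup: only field
    names known to be nested objects are treated as nested clauses.
    """
    match = NESTED_CLAUSE.fullmatch(clause.strip())
    if match is None:
        raise ValueError(f"not a nested clause: {clause!r}")
    path, inner = match.groups()
    if path not in nested_paths:
        raise ValueError(f"{path!r} is not a nested path")
    # Undo the escaping that was needed to embed the inner query in quotes.
    inner = re.sub(r'\\(.)', r'\1', inner)
    return {
        "nested": {
            "path": path,
            "score_mode": score_mode,
            "query": {"query_string": {"query": inner}},
        }
    }
```

For example, to_nested_query('children:"children.first:peggy AND children.last:sue"', {"children"}) yields a nested query on path children whose inner query_string is children.first:peggy AND children.last:sue.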

Other considerations

  • Support for specifying scoring modes within the query string query settings based on nested object paths is a possibility.
  • Support for inner hits may also be a possibility, in a similar fashion to scoring modes.

Supporting nested queries in query strings at all would be an enhancement; these options could build on it. An example of how they might look:

{
    "query_string" : {
        "query" : "children:\"children.first:peggy\"",
        "nested": [
            {
                "path": "children",
                "score_mode": "max",
                "inner_hits": {
                    <inner_hits_options>
                }
            }
        ]
    }
}
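
As a rough sketch of how such per-path settings could be applied, the Python function below looks up the options for a path and folds them into an ordinary nested query, which already accepts score_mode and inner_hits. The function name build_nested and the shape of nested_settings are assumptions that mirror the hypothetical "nested" list above; nothing here is an existing query_string option.

```python
def build_nested(path, inner_query, nested_settings):
    """Build a nested query dict, applying per-path options when present.

    nested_settings mirrors the hypothetical "nested" list from the example
    request: a list of dicts, each carrying a "path" key plus optional
    "score_mode" and "inner_hits" entries.
    """
    # Pick the first settings entry for this path, or fall back to no options.
    options = next((s for s in nested_settings if s.get("path") == path), {})
    nested = {
        "path": path,
        "query": {"query_string": {"query": inner_query}},
        # "avg" is the default scoring mode mentioned in the proposal.
        "score_mode": options.get("score_mode", "avg"),
    }
    if "inner_hits" in options:
        nested["inner_hits"] = options["inner_hits"]
    return {"nested": nested}
```

Paths without an entry in the settings list simply get the default scoring mode and no inner hits.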

Pull request

For the basic functionality, I have already made the necessary modifications (three changed files, one changed test file to add a test with several assertions) on the 'master' branch of my local clone of the repository. I would like to submit a pull request; please advise as to how you would like that to be done (if I need to rebase onto another branch, etc.)

Author

@tuespetre tuespetre commented Jul 24, 2015

I've since worked around this in other ways (a simple regex on my end parses out nested field expressions and submits them properly to ES). It was fun to mess around with this, but I fully support axing it now. It would just be more complexity to maintain; perhaps the query string documentation could hint at a better solution for developers who look for this functionality.

Contributor

@clintongormley clintongormley commented Jul 27, 2015

thanks @tuespetre

@radenui radenui commented Jan 22, 2016

Hi @tuespetre,

I'm very interested in the workaround you used. Did you manage to make it work with Kibana?
Thanks!

Author

@tuespetre tuespetre commented Jan 22, 2016

@radenui

I wrote the following drop-in helper class (written in C#, but should be easily portable to other languages): https://gist.github.com/tuespetre/f6951bb665c79abbb7c8

You basically use the class to create new URIs by performing some function against the existing query string (remove this filter, replace that filter, add this filter, etc.). When you specifically need to allow users to perform a 'proper' nested query, you can use the helper to extract the filters on the nested properties and build up a separate query string, which you then submit as a nested query string query in your request to Elasticsearch.

I'm using it to offer both 'customer service representative friendly' interfaces (where the query string built up by the 'friendly' controls is stored in a hidden input) and 'technical user friendly' interfaces (where the query string is spit out into a visible text box that you can also type in, à la GitHub Issues).
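
For readers who don't want to port the C# class, the extraction step can be sketched in a few lines of Python. This is not the code from the gist: the names TERM, split_nested_terms and build_request are made up for illustration, and it assumes the query string is a plain list of space-separated field:value terms. Boolean operators, grouping with parentheses and anything fancier would be left dangling and need real parsing.

```python
import re
from collections import defaultdict

# A term is field:value, where value is a quoted phrase or a run of non-space characters.
TERM = re.compile(r'([\w.]+):("(?:[^"\\]|\\.)*"|\S+)')


def split_nested_terms(query, nested_paths):
    """Separate field:value terms on nested paths from the rest of the query."""
    nested = defaultdict(list)

    def pull(match):
        field = match.group(1)
        for path in nested_paths:
            if field.startswith(path + "."):
                nested[path].append(match.group(0))
                return ""  # drop the term from the root query string
        return match.group(0)

    # Collapse the whitespace left behind by removed terms.
    root = " ".join(TERM.sub(pull, query).split())
    return root, dict(nested)


def build_request(query, nested_paths):
    """Combine the root query string and one nested query per path."""
    root, nested = split_nested_terms(query, nested_paths)
    must = []
    if root:
        must.append({"query_string": {"query": root}})
    for path, terms in nested.items():
        # Joining with AND is a design choice: all terms must match the same nested object.
        inner = " AND ".join(terms)
        must.append({"nested": {"path": path,
                                "query": {"query_string": {"query": inner}}}})
    return {"query": {"bool": {"must": must}}}
```

Calling build_request('status:open children.first:peggy children.last:sue', ['children']) keeps status:open in the root query_string and moves the two children terms into a single nested query.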

@rmm5t rmm5t commented May 10, 2016

I actually quite like this proposal. Is it something that would be considered by the elasticsearch team or is this something that's not likely to ever be a feature? I'd love if the query string syntax allowed for nested query combinations.

I wanted it to be parentheses instead

Agreed. I think this syntax would be much better served by parentheses instead of quotations.

@jsangari-ssat jsangari-ssat commented May 24, 2016

Hi,

Is the syntax recommended here for query_string supported in ES? I am using version 2.2 and am having a hard time getting it to work.

@alexgarel alexgarel commented Sep 19, 2016

Hello, I also think this should be supported. query_string remains a nice helper, and being able to use nested objects with it would be great.

Author

@tuespetre tuespetre commented Sep 19, 2016

@alexgarel and everyone:

I think it would be more beneficial to keep something this niche and complex out of the core elasticsearch, and offer your own query DSL 'layer' that can be translated into a 'proper' ES query on the backend. By brushing up on regular expressions (or even parsing!) a little bit you can put together some pretty cool UX affordances specific to your application.

@rmm5t rmm5t commented Sep 19, 2016

...keep something this niche and complex out of the core elasticsearch...

...By brushing up on regular expressions (or even parsing!) a little bit...

So, is it niche and complex or is it as simple as adding a few items to the elasticsearch grammar?

Personally, I agree that you can add a custom syntax on top (with regexes or otherwise), but I also would like this discussion to remain open, because I think having a conversation about making the query string syntax more robust isn't necessarily a bad thing. Having every elasticsearch application implement yet another hack on top of the query string syntax to accomplish this isn't necessarily a great use of global man-hours.

I'm mostly interested to better understand if the elasticsearch team is interested in a Pull Request for this feature. So far, we don't have an answer to that question.

@alexgarel alexgarel commented Sep 19, 2016

@tuespetre
OK, I understand. That's the approach we have chosen, though it is not fully implemented yet. If someone needs it, we have a (GPL) Lucene query parser in Python.

Author

@tuespetre tuespetre commented Sep 19, 2016

@rmm5t I had submitted a PR (#11339), but as @clintongormley points out it's just a fragile thing to have in the core application, and as I found out when working on the PR initially, it can't really be done with a pleasant syntax: it comes out feeling very verbose and awkward, especially being unable to hijack the parentheses for it. With a small handful of regular expressions I was able to implement a much nicer syntax specific to the particular needs of our application without feeling like I had to 'settle' for something subpar.

@rmm5t rmm5t commented Sep 19, 2016

I had submitted a PR (#11339) but as @clintongormley points out it's just a fragile thing to have in the core application, and as I found out when working the PR initially, it can't really be done with a pleasant syntax

@tuespetre Interesting point. That PR was tagged for discussion (which, respectfully, never really happened amongst the elasticsearch team, aside from @clintongormley's willingness to comment and chime in). Then, it was closed, solely because you closed this particular issue after building a workaround, not because a discussion really happened.

I agree with your first assessment that the double-quoted syntax isn't ideal. I understand there are problems with the clearer parentheses syntax, but I suspect those can probably be overcome.

If the core query string syntax and implementation are "fragile," maybe that's something that should be addressed and potentially refactored as well. To be clear, I'm not trying to make light of this; I'm sure a refactor would be a tricky endeavor.

Proposal

Overall, I'd really just like to see an ability to narrow a query string search to one particular embedded object. I'd like to see a syntax that looked like this:

children:(gender:male AND age:>=18 AND age:<=25)

Otherwise, there's no way to use the query string syntax and (in this particular US-centric example) find parents who have children who should be signed up for the US Selective Service System.
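
To show what I'd expect that syntax to expand to, here is a small Python sketch that rewrites one such clause into a standard nested query as a pre-processing step, outside the Lucene parser. The names FIELD and rewrite_group are hypothetical, and the field detection is deliberately naive: it assumes a single top-level path:( ... ) clause and would be confused by colons inside values or quoted phrases.

```python
import re

# A bare field name followed by a colon, not already part of a dotted path.
FIELD = re.compile(r'(?<![\w.])([A-Za-z_]\w*):')


def rewrite_group(clause):
    """Rewrite path:(inner query) into a nested query dict.

    Bare field names inside the parentheses are qualified with the nested
    path, since the inner query has to reference full field paths.
    """
    path, sep, rest = clause.strip().partition(":(")
    if not sep or not rest.endswith(")"):
        raise ValueError(f"expected path:(...), got {clause!r}")
    inner = rest[:-1]
    # gender: becomes children.gender:, age: becomes children.age:, and so on.
    inner = FIELD.sub(lambda m: f"{path}.{m.group(1)}:", inner)
    return {
        "nested": {
            "path": path,
            "query": {"query_string": {"query": inner}},
        }
    }
```

With the example above, the inner query string becomes children.gender:male AND children.age:>=18 AND children.age:<=25, wrapped in a nested query on path children, so all three conditions have to hold for the same child.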

@traut traut commented Jul 20, 2017

can we resurrect this issue please?

Contributor

@clintongormley clintongormley commented Jul 20, 2017

Yeah, I think we need to think more about whether to expose this. Opening for more discussion

Contributor

@czjxy881 czjxy881 commented Oct 9, 2017

+1

Author

@tuespetre tuespetre commented Oct 9, 2017

@rmm5t good points, your 'wish syntax' looks nice!

@buchanae buchanae commented Dec 14, 2017

If I could comment on my experience as a user:

It took me an hour or so to figure out that this didn't exist. I'd like to build a dashboard with a search bar, where the syntax is defined by Elasticsearch/Lucene's query string syntax. Having this would make that project substantially easier.

As an engineer: this seems like a great candidate for something that could grow, mature, and harden outside of the core. If a service/library can be built using Lucene's parser and submit JSON-style nested Elasticsearch queries on the backend, we could figure out the details with a non-core prototype.

children:(gender:male AND age:>=18 AND age:<=25)

I like that.

My initial idea, inspired by jq: children[].gender:male. About 2 seconds of thought went into that, so potentially full of holes :)
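
A quick Python sketch of that idea, mostly to surface one of the holes. JQ_TERM and jq_term_to_nested are made-up names, and the pattern only handles a single path[].field:value term with no spaces in the value.

```python
import re

# path[].field:value, e.g. children[].gender:male
JQ_TERM = re.compile(r'([\w.]+)\[\]\.([\w.]+):(\S+)')


def jq_term_to_nested(term):
    """Translate one path[].field:value term into a nested query dict."""
    match = JQ_TERM.fullmatch(term)
    if match is None:
        raise ValueError(f"not a path[].field:value term: {term!r}")
    path, field, value = match.groups()
    return {
        "nested": {
            "path": path,
            "query": {"query_string": {"query": f"{path}.{field}:{value}"}},
        }
    }
```

The hole: each term becomes its own nested query, so children[].gender:male AND children[].age:>=18 could match a parent with one male child and a different adult child. The parenthesized syntax avoids that by grouping the conditions under one path.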

@alexgarel alexgarel commented Dec 14, 2017

@buchanae Sorry to repeat myself, but you can look at our (GPL) Lucene query parser in Python. It's still far from perfect but may help.

@albogdano albogdano commented Jan 12, 2018

I'd also like to +1 this and share my experience. I'm a long-time Elasticsearch user and I've recently hit the "field mapping explosion" limitation. Our system allows users to define their own objects with any number of custom fields, which leads to a mapping explosion. Currently, from what I read in the forum, the only way to solve this is to use nested key/value objects inside an array field:

nested: [{k: FIELD1, v: TERM1}, ...]

This led me to this issue. I'm trying to seamlessly combine normal queries and queries to nested objects in a single query string query. I think this feature would make it easier for people to solve the problem of "too many custom fields".

EDIT: I've implemented this as a Lucene query string syntax extension, by detecting and rewriting queries which contain special nested fields. Link to code
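
For anyone unfamiliar with the key/value layout, this is roughly the query each custom-field term has to become. It is a simplified Python sketch, not the linked implementation: the function name kv_nested_query is hypothetical, the path defaults to the "nested" field from the example above, and it assumes k is mapped as an exact-value (not analyzed) field so that a term query works on it.

```python
def kv_nested_query(field, term, path="nested"):
    """Build a nested query matching one custom field stored as a k/v pair.

    Both conditions sit inside the same nested query so that the key and
    the value have to match on the same nested object.
    """
    return {
        "nested": {
            "path": path,
            "query": {
                "bool": {
                    "must": [
                        # Assumes k is an exact-value field holding the custom field name.
                        {"term": {f"{path}.k": field}},
                        {"match": {f"{path}.v": term}},
                    ]
                }
            },
        }
    }
```

A query string term such as FIELD1:TERM1 would then be detected and replaced by kv_nested_query("FIELD1", "TERM1") before the request is sent.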

albogdano added a commit to Erudika/para-search-elasticsearch that referenced this issue Jan 18, 2018
Member

@cbuescher cbuescher commented Mar 13, 2018

/cc @elastic/es-search-aggs

@cont-korzh cont-korzh commented Apr 10, 2018

+1

@jonasbergqvist jonasbergqvist commented Jun 28, 2018

+1

@jimczi jimczi removed the discuss label Sep 7, 2018
@jimczi jimczi self-assigned this Sep 7, 2018

@imranansarij2ee imranansarij2ee commented Jan 23, 2019

+1

@waswrongassembled waswrongassembled commented Feb 15, 2019

+1

@ChristopherSnay ChristopherSnay commented Apr 25, 2019

+1

@theoJA theoJA commented May 13, 2019

+1

@prashantalhat prashantalhat commented Mar 13, 2020

+1

@jimczi jimczi removed their assignment Mar 13, 2020

@kietdinh kietdinh commented Mar 16, 2020

+1

@ffery ffery commented Apr 15, 2020

+1

@chethan-uc chethan-uc commented Oct 7, 2020

+1

@4ndygu 4ndygu commented Jul 21, 2021

+1

@adjivas adjivas commented Oct 8, 2021

+1

@SimarFromCowbell SimarFromCowbell commented Oct 29, 2021

+1 would like this added

@tumbledwyer tumbledwyer commented Nov 9, 2021

+1
I have this issue: I'm trying to do a nested query from Logstash using the elasticsearch filter, which only supports query strings, not the regular DSL.
I can accomplish this in KQL like this:
myNestedObject:{ nestedProperty: "The value I'm looking for" }
