Query DSL: Indices query type #1416

Closed
poblish opened this Issue October 21, 2011 · 12 comments

Andrew Regan

The indices query type executes a query only on shards that belong to the listed indices; on shards of any other index it behaves like "match_all". For example:

"indices" : {
    "indices" : ["index1", "index_prefix_*"],
    "query" : {
        "term" : ...
    }
}
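
As a rough illustration of the behavior described above (not Elasticsearch's actual implementation), the per-index choice can be modeled in a few lines of Python. The helper name `effective_query`, the `property` field, and the use of `fnmatchcase` for the `*` wildcard are assumptions made for this sketch only.

```python
from fnmatch import fnmatchcase

# Fallback used for indices that are not listed, per the description above.
MATCH_ALL = {"match_all": {}}


def effective_query(indices_query, index_name):
    """Return the query that would apply on a shard of `index_name`.

    Hypothetical model of the described behavior: the wrapped query applies
    when the index name matches one of the listed names or patterns,
    otherwise match_all applies.
    """
    patterns = indices_query["indices"]
    if any(fnmatchcase(index_name, pattern) for pattern in patterns):
        return indices_query["query"]
    return MATCH_ALL


example = {
    "indices": ["index1", "index_prefix_*"],
    "query": {"term": {"property": "X"}},  # hypothetical field and value
}

for name in ("index1", "index_prefix_2011", "some_other_index"):
    print(name, "->", effective_query(example, name))
```

Note that any index outside the list still contributes every one of its documents, because the fallback is match_all.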

--- original request

Currently, searches only allow one set of indices to be specified across the entire query. It would be helpful to allow BoolQueryBuilder-style subqueries to specify their own indices as well (something Compass used to support), and for ES to deal with the aggregation. In other words, add setIndices(String... indices) to QueryBuilder, rather than just having it on SearchRequestBuilder. This would allow UNION-style queries to be built up, of the form:

"return all hits where property=X in (indexA,indexB) OR property=Y in (indexA,indexC)"

Shay Banon
Owner

Setting the indices to execute on for each query is problematic, but we can introduce a new query type, called indices, that will wrap another query and be executed when the index matches the indices provided. I will update the issue to reflect that.

Shay Banon kimchy closed this in b2b608f October 22, 2011
Andrew Regan

Hi Shay,

Thanks for your work on this. I assumed it was working, but now I'm not so sure. My generated query is looking like:

"query": {
  "bool": {
    "should": {
      "custom_boost_factor": {
        "query": {
          "match_all": {}
        },
        "indices": [
          "simpleflag",
          "favouriteflag"
        ]
      }
    }
  }
}

... it doesn't seem that the inner query is actually being wrapped by an outer indices query. I'm using the right QueryBuilders.indicesQuery(...) method, though. Is this the expected output?

Also, I imagined that running the above across _all indices would return records from the specified indices only, but as I understand it I'm going to get all documents from those indices because of my match_all, plus all documents from every other index too, because match_all is used for everything not in "indices: []".

I think that having something like "match_none" would make more sense for indices not specified in the subquery's list. That would allow me to run global-style queries across _all indices, but give the subqueries a real chance to filter down by index, without pulling in unwanted documents from the indices they didn't specify.

WDYT?

Shay Banon
Owner

Heya, it's a bug in the indices query builder, which uses custom_boost_factor by mistake. You can build it yourself for now, I will push a fix. Here is the issue: #1485.
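
One way to "build it yourself" is to assemble the request body directly instead of going through the query builder. The sketch below does this in Python; the helper name `indices_query` is hypothetical, and the index names are taken from the generated query quoted earlier in the thread.

```python
import json


def indices_query(indices, query):
    """Wrap `query` in an "indices" query restricted to the listed indices."""
    return {"indices": {"indices": list(indices), "query": query}}


# Same shape as the generated query above, but with the correct outer
# "indices" key instead of "custom_boost_factor".
body = {
    "query": {
        "bool": {
            "should": indices_query(
                ["simpleflag", "favouriteflag"],
                {"match_all": {}},
            )
        }
    }
}

print(json.dumps(body, indent=2))
```

The printed JSON can be sent as the search request body, for example with curl -d.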

Andrew Regan

Thanks, I'm trying a newly-built snapshot, but I'm not seeing any difference: the query certainly changes, but no difference in the results for any query I try.

curl -XGET '10.10.10.101:9200/_all/default/_search' -d '{
    "query": {
        "indices": {
            "query": {
                "match_all": {}
            },
            "indices": [
                "simpleflag", "favouriteflag", "articleversion", "articlerating","followrelationship", "groupevent", "assertionflag"
            ]
        }
    }
}'

So, with the above, I'm expecting the indices wrapper query to give me back a small number of results (every doc in those few indices), but I'm actually getting every single doc in all indices returned.

(I know I could rewrite the above to do without indices{}, but this is just the simplest possible case.)

Shay Banon
Owner

The query you posted in this form does not make sense. The indices query will use the query you provided internally when it's executed on one of the listed indices, and match_all when it's executed on any other index, which, if I remember, is what you were after.

Andrew Regan

This takes us back to the first comment I left here yesterday. It looks like what I really need is not in fact match_all for indices not in the indices[] section, but "match none". In the example above I only care about the results within the indices{} query - I don't want anything else whatsoever. What Compass did was perfect for me :-)

Shay Banon
Owner

When do you really need match_none? I mean, if it's match_none, in your example, why not just query those indices (instead of using _all) and that's it?

Andrew Regan

The above was the simplest possible case. In reality my app will be receiving custom, configurable queries like the following, and turning them (programmatically) into suitably nested queries to pass to ES:

  • get me all documents where {either John or Jane Smith} is mentioned in an article, performed an action, was tagged, etc.
  • get me a list/count of all tags, flags, etc.

There are some cases where setting the overall indices helps, but in general I have to search against _all, relying on the individual bool() subqueries to control which (of the 50 or so) indices are used. Compass achieved this beautifully. One way of looking at this is that I want to use my subqueries to build up from nothing. What indices{} currently does is filter down from everything.

Shay Banon
Owner

I still don't understand why you would need one that does not match, maybe in should clauses in a boolean query... Opened #1492.
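
For readers following along, here is a sketch of the should-clause case using the UNION-style example from the original request. It assumes the `no_match_query` option that later Elasticsearch documentation describes for the indices query (with the value "none"); this thread does not spell out what #1492 adds, so treat that parameter as an assumption. The helper name `indices_clause` and the `property` field with values X and Y are hypothetical.

```python
import json


def indices_clause(indices, query, no_match="none"):
    """Build an indices query that contributes nothing on unlisted indices.

    Assumption: "no_match_query" is accepted by the indices query, as in
    later Elasticsearch documentation; it is not confirmed by this thread.
    """
    return {
        "indices": {
            "indices": list(indices),
            "query": query,
            "no_match_query": no_match,
        }
    }


# "property=X in (indexA,indexB) OR property=Y in (indexA,indexC)"
union = {
    "query": {
        "bool": {
            "should": [
                indices_clause(["indexA", "indexB"], {"term": {"property": "X"}}),
                indices_clause(["indexA", "indexC"], {"term": {"property": "Y"}}),
            ]
        }
    }
}

print(json.dumps(union, indent=2))
```

With a "none" fallback, each should clause adds hits only from its own indices, so a search against _all builds up from nothing rather than filtering down from everything.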

Andrew Regan

Great, this is now working perfectly - many thanks!

Folke Lemaitre
folke commented March 16, 2012

Any chance "indices" could be added as a filter as well?

Medcl
medcl commented March 17, 2012

++1
