Sorting with nested_filter does not work with inner nested document #9305

Closed
pickypg opened this issue Jan 15, 2015 · 3 comments · Fixed by #9692

Comments

@pickypg
Member

commented Jan 15, 2015

Using an inner nested filter within a nested_filter for sorting does not work (tested in 1.4.2): the sort silently fails to apply the inner nested filter.

The corresponding test code is commented out at https://github.com/elasticsearch/elasticsearch/tree/v1.4.2/src/test/java/org/elasticsearch/nested/SimpleNestedTests.java#L1171

Example follows:

PUT /test
{"mappings":{"type":{"properties":{"officelocation":{"type":"string"},"users":{"type":"nested","properties":{"first":{"type":"string"},"last":{"type":"string"},"workstation":{"type":"nested","properties":{"stationid":{"type":"string"},"phoneid":{"type":"string"}}}}}}}}}

PUT /test/type/1
{"officelocation":"glendale","users":[{"first":"fname1","last":"lname1","workstation":[{"stationid":"s1","phoneid":"p1"},{"stationid":"s2","phoneid":"p2"}]},{"first":"fname2","last":"lname2","workstation":[{"stationid":"s3","phoneid":"p3"},{"stationid":"s4","phoneid":"p4"}]},{"first":"fname3","last":"lname3","workstation":[{"stationid":"s5","phoneid":"p5"},{"stationid":"s6","phoneid":"p6"}]}]}

PUT /test/type/2
{"officelocation":"glendale","users":[{"first":"fname4","last":"lname4","workstation":[{"stationid":"s1","phoneid":"p1"},{"stationid":"s2","phoneid":"p2"}]},{"first":"fname5","last":"lname5","workstation":[{"stationid":"s3","phoneid":"p3"},{"stationid":"s4","phoneid":"p4"}]},{"first":"fname1","last":"lname1","workstation":[{"stationid":"s5 ss","phoneid":"p5"},{"stationid":"s6","phoneid":"p6"}]}]}

GET /test/_search
{
  "fields": "_id",
  "sort": [
    {
      "users.first": {
        "order": "asc"
      }
    },
    {
      "users.first": {
        "order": "asc",
        "nested_path": "users",
        "nested_filter": {
          "nested": {
            "path": "users.workstation",
            "filter": {
              "term": {
                "users.workstation.stationid": "s5"
              }
            }
          }
        }
      }
    }
  ]
}

This returns

{
   "took": 9,
   "timed_out": false,
   "_shards": {
      "total": 5,
      "successful": 5,
      "failed": 0
   },
   "hits": {
      "total": 2,
      "max_score": null,
      "hits": [
         {
            "_index": "test",
            "_type": "myType",
            "_id": "1",
            "_score": null,
            "sort": [
               "fname1",
               null // <- should be "fname3"
            ]
         },
         {
            "_index": "test",
            "_type": "myType",
            "_id": "2",
            "_score": null,
            "sort": [
               "fname1",
               null // <- should be "fname1"
            ]
         }
      ]
   }
}

Here is the nested filter as a query:

GET /test/_search
{"query":{"filtered":{"filter":{"nested":{"path":"users","filter":{"nested":{"path":"users.workstation","filter":{"term":{"users.workstation.stationid":"s5"}}}}}}}}}
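
To make the expected sort values in the annotated response above concrete, here is a minimal Python sketch that recomputes them from the sample documents. It does not call Elasticsearch. It assumes the string fields are analyzed into lowercase, whitespace-separated tokens (so "s5 ss" contains the term "s5") and that an ascending nested sort picks the smallest matching value; `tokens` and `expected_sort_value` are hypothetical helper names.

```python
# Sample documents from this issue, reduced to the fields used by the sort.
DOCS = {
    "1": [
        {"first": "fname1", "workstation": [{"stationid": "s1"}, {"stationid": "s2"}]},
        {"first": "fname2", "workstation": [{"stationid": "s3"}, {"stationid": "s4"}]},
        {"first": "fname3", "workstation": [{"stationid": "s5"}, {"stationid": "s6"}]},
    ],
    "2": [
        {"first": "fname4", "workstation": [{"stationid": "s1"}, {"stationid": "s2"}]},
        {"first": "fname5", "workstation": [{"stationid": "s3"}, {"stationid": "s4"}]},
        {"first": "fname1", "workstation": [{"stationid": "s5 ss"}, {"stationid": "s6"}]},
    ],
}


def tokens(value):
    # Assumption: approximate the analyzer with lowercasing + whitespace split.
    return value.lower().split()


def expected_sort_value(users, stationid):
    # Keep only users that own at least one workstation containing the term,
    # then take the smallest first name (assumed behaviour for "order": "asc").
    matching = [
        user["first"]
        for user in users
        if any(stationid in tokens(ws["stationid"]) for ws in user["workstation"])
    ]
    return min(matching) if matching else None


EXPECTED = {doc_id: expected_sort_value(users, "s5") for doc_id, users in DOCS.items()}
print(EXPECTED)
```

Under those assumptions the sketch yields "fname3" for document 1 and "fname1" for document 2, which is what the second sort value should be in place of null.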
@navneet83

Member

commented Jan 16, 2015

A possible workaround for this problem would be to index the inner "workstation" object both as nested fields and as flattened object fields. This can be achieved by setting "include_in_parent" to true.
Here is the updated schema (note the "include_in_parent":"true" under workstation):
PUT /test

{"mappings":{"type":{"properties":{"officelocation":{"type":"string"},"users":{"type":"nested","properties":{"first":{"type":"string"},"last":{"type":"string"},"workstation":{"type":"nested", "include_in_parent":"true" ,"properties":{"stationid":{"type":"string"},"phoneid":{"type":"string"}}}}}}}}}

Here is the updated query:
POST test/_search

{
  "fields":"_id",
  "sort":[
    {
      "users.first":{
        "order":"asc"
      }
    },
    {
      "users.first":{
        "order":"asc",
        "nested_path":"users",
        "nested_filter":{
          "term":{
            "users.workstation.stationid":"s5"
          }
        }
      }
    }
  ]
}

This returns:

{
  "took" : 2,
  "timed_out" : false,
  "_shards" : {
    "total" : 5,
    "successful" : 5,
    "failed" : 0
  },
  "hits" : {
    "total" : 2,
    "max_score" : null,
    "hits" : [ {
      "_index" : "test123",
      "_type" : "type",
      "_id" : "2",
      "_score" : null,
      "sort" : [ "fname1", "fname1" ]
    }, {
      "_index" : "test123",
      "_type" : "type",
      "_id" : "1",
      "_score" : null,
      "sort" : [ "fname1", "fname3" ]
    } ]
  }
}
@martijnvg

Member

commented Jan 19, 2015

The reason the sorting goes wrong here is that the nested query depends on the nested context it is placed in. If a nested query has no other nested query above it, it assumes that it should link back to the main/root document. If a nested query is placed under another nested query, it assumes it should link back to the nested level that belongs to the path the parent nested query has been set to. Nested sorting doesn't set this nested context, and therefore the nested query doesn't link back to the users level, but to the root level instead.

We need to make sure that nested sorting sets the nested level properly, so that other nested queries know about it. The tricky bit here is that, because of how the sorting elements get parsed (in a streaming manner), the nested filter may be parsed before the path field has been parsed. To do this properly, the filter should be parsed after the path field has been parsed.
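
The behaviour described above can be sketched in a few lines of Python. This is only an illustration of the idea, not the actual Elasticsearch parser: `NestedScope`, `parse_filter`, `parse_sort` and the `set_scope` flag are hypothetical names. The sketch records, for each nested filter, which level it links back to, and it reads the whole sort element before parsing the filter so that the path is already known.

```python
class NestedScope:
    """Hypothetical stand-in for the nested context tracked while parsing."""

    def __init__(self):
        self._paths = []

    def push(self, path):
        self._paths.append(path)

    def pop(self):
        return self._paths.pop()

    def current(self):
        return self._paths[-1] if self._paths else None


def parse_filter(node, scope):
    """Return (path, parent level) pairs for every nested filter in node."""
    if "nested" not in node:
        return []  # leaf filters such as "term" do not open a nested level
    body = node["nested"]
    parent = scope.current() or "<root>"
    scope.push(body["path"])
    try:
        inner = parse_filter(body["filter"], scope)
    finally:
        scope.pop()
    return [(body["path"], parent)] + inner


def parse_sort(sort_spec, set_scope):
    # The sort element is a complete dict here, so "nested_path" is available
    # before the filter is parsed regardless of key order in the request.
    scope = NestedScope()
    if set_scope and "nested_path" in sort_spec:
        scope.push(sort_spec["nested_path"])
    return parse_filter(sort_spec.get("nested_filter", {}), scope)


SORT = {
    "order": "asc",
    "nested_path": "users",
    "nested_filter": {
        "nested": {
            "path": "users.workstation",
            "filter": {"term": {"users.workstation.stationid": "s5"}},
        }
    },
}

WITHOUT_SCOPE = parse_sort(SORT, set_scope=False)  # behaviour reported in this issue
WITH_SCOPE = parse_sort(SORT, set_scope=True)      # behaviour after setting the scope
print(WITHOUT_SCOPE, WITH_SCOPE)
```

Without the scope, the inner filter on users.workstation links back to the root document; with the scope set from nested_path, it links back to the users level.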

@angelatsai

commented Feb 10, 2015

I encountered the same problem, too.

The include_in_parent workaround solves this problem only when you don't need the relations defined in the workstation nested object.

For example, if you need to filter by workstation {"stationid":"s5","phoneid":"p6"}, the user {"first":"fname3","last":"lname3","workstation":[{"stationid":"s5","phoneid":"p5"},{"stationid":"s6","phoneid":"p6"}]} in the first document would match, even though there's no {"stationid":"s5","phoneid":"p6"} among his workstations.
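
The false match can be reproduced with a small Python sketch. It models the flattened copy created by include_in_parent as one set of values per field, which is a simplifying assumption; `matches_nested` and `matches_flattened` are hypothetical helper names.

```python
# The user from the first document that triggers the false match.
USER = {
    "first": "fname3",
    "last": "lname3",
    "workstation": [
        {"stationid": "s5", "phoneid": "p5"},
        {"stationid": "s6", "phoneid": "p6"},
    ],
}


def matches_nested(user, stationid, phoneid):
    # Nested semantics: both conditions must hold on the same workstation.
    return any(
        ws["stationid"] == stationid and ws["phoneid"] == phoneid
        for ws in user["workstation"]
    )


def matches_flattened(user, stationid, phoneid):
    # Flattened semantics (assumed model): values from all workstations are
    # pooled per field, so the pairing between stationid and phoneid is lost.
    stationids = {ws["stationid"] for ws in user["workstation"]}
    phoneids = {ws["phoneid"] for ws in user["workstation"]}
    return stationid in stationids and phoneid in phoneids


NESTED_MATCH = matches_nested(USER, "s5", "p6")
FLATTENED_MATCH = matches_flattened(USER, "s5", "p6")
print(NESTED_MATCH, FLATTENED_MATCH)
```

The nested check rejects the user, while the flattened check accepts it because "s5" and "p6" each appear somewhere in the pooled values.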

martijnvg added a commit to martijnvg/elasticsearch that referenced this issue Feb 17, 2015
…t nested level during search request parsing.

The nested scope is set by any nested feature, so that sub nested queries and filters know about their context and these sub nested queries and filters can construct the right parent filter.
Removed the LateBindingParentFilter workaround in the nested query parser in favour of the nested scope maintained in the query parse context.
Due to this change nested queries and filters can now also be included in nested sorting and inner hits, because those features also now use the nested scope.

This change doesn't fix the usage of nested filters in nested and reverse_nested aggregations. The `nested` filter shouldn't be used inside these aggregations and instead the `nested` and `reverse_nested` aggs should be used to query on the right level. In a different change `nested` inside a `nested` and `reverse_nested` aggregation should result in a parse error.

Closes elastic#9305