Incorrect reverse nested agg counts when using multi level nested filters #9263

Closed
spotta opened this issue Jan 12, 2015 · 2 comments

Comments

@spotta
commented Jan 12, 2015

There are a couple of issues here. The first is a regression of the previously fixed issue #6994: it works fine in 1.3.4 and 1.4.0 but fails from 1.4.1 onwards.

The second is that reverse nested agg counts are incorrect when the aggs use more than one level of nested filter. This happens even in 1.3.4.

Details to reproduce both problems are below. I'm not sure whether these should be separate issues; I'm filing them as one because they seem to be related.

DELETE /_all


POST /test
{
  "mappings": {
    "foo": {
      "dynamic": "strict",
      "properties": {
        "id": {
          "type": "long"
        },
        "baz": {
          "type": "nested",
          "properties": {
            "baz_cde": {
              "type": "string"
            }
          }
        },
        "bar": {
          "type": "nested",
          "properties": {
            "bar_typ": {
              "type": "string"
            },
            "color": {
              "type": "nested",
              "properties": {
                "name": {
                  "type": "string"
                }
              }
            }
          }
        }
      }
    }
  }
}


PUT /test/foo/1
{
    "id": 1,
    "bar": [
        {
            "bar_typ": "bar1",
            "color": [
                {
                    "name": "red"
                },
                {
                    "name": "green"
                },
                {
                    "name": "yellow"
                }
            ]
        },
        {
            "bar_typ": "bar1",
            "color": [
                {
                    "name": "red"
                },
                {
                    "name": "blue"
                },
                {
                    "name": "white"
                }
            ]
        },
        {
            "bar_typ": "bar1",
            "color": [
                {
                    "name": "black"
                },
                {
                    "name": "blue"
                }
            ]
        },
        {
            "bar_typ": "bar2",
            "color": [
                {
                    "name": "orange"
                }
            ]
        },
        {
            "bar_typ": "bar2",
            "color": [
                {
                    "name": "pink"
                }
            ]
        }
    ],
    "baz": [
        {
            "baz_cde": "abc"
        },
        {
            "baz_cde": "klm"
        },
        {
            "baz_cde": "xyz"
        }
    ]
}

Query to find bar counts, grouped by baz_cde, with a filter on bar:

POST /test/_search
{
  "size": 0,
  "aggs": {
    "nested_0": {
      "nested": {
        "path": "baz"
      },
      "aggs": {
        "group_by_baz": {
          "terms": {
            "field": "baz.baz_cde"
          },
          "aggs": {
            "to_root": {
              "reverse_nested": {},
              "aggs": {
                "nested_1": {
                  "nested": {
                    "path": "bar"
                  },
                  "aggs": {
                    "filter_by_bar": {
                      "filter": {
                        "term": {
                          "bar.bar_typ": "bar1"
                        }
                      },
                      "aggs": {
                        "bar_count": {
                          "value_count": {
                            "field": "bar_typ"
                          }
                        }
                      }
                    }
                  }
                }
              }
            }
          }
        }
      }
    }
  }
}

This query should return a bar_count of 3 for every baz_cde value. It works as expected in 1.3.4 and 1.4.0 but fails from 1.4.1 onwards. I also tested 1.4.2.

Query to find bar counts, grouped by baz_cde, with a filter on bar as well as a filter on bar.color:

POST /test/_search
{
  "size": 0,
  "aggs": {
    "nested_0": {
      "nested": {
        "path": "baz"
      },
      "aggs": {
        "group_by_baz": {
          "terms": {
            "field": "baz.baz_cde"
          },
          "aggs": {
            "to_root": {
              "reverse_nested": {},
              "aggs": {
                "nested_1": {
                  "nested": {
                    "path": "bar"
                  },
                  "aggs": {
                    "filter_by_bar": {
                      "filter": {
                        "term": {
                          "bar.bar_typ": "bar1"
                        }
                      },
                      "aggs": {
                        "nested_2": {
                          "nested": {
                            "path": "bar.color"
                          },
                          "aggs": {
                            "filter_bar_color": {
                              "filter": {
                                "term": {
                                  "bar.color.name": "red"
                                }
                              },
                              "aggs": {
                                "reverse_to_bar": {
                                  "reverse_nested": {
                                    "path": "bar"
                                  },
                                  "aggs": {
                                    "bar_count": {
                                      "value_count": {
                                        "field": "bar_typ"
                                      }
                                    }
                                  }
                                }
                              }
                            }
                          }
                        }
                      }
                    }
                  }
                }
              }
            }
          }
        }
      }
    }
  }
}

This query should return a bar_count of 2 for every baz_cde value, but only one of them, "xyz", gets the correct count of 2; the others return 1. This happens in older versions too.
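
For reference, the expected counts can be checked by hand outside Elasticsearch. The sketch below is plain Python over the sample document, reduced to the fields the two queries touch. The helper name `expected_bar_counts` is made up for illustration, and the code only mirrors what the aggregations are meant to compute; it does not call Elasticsearch.

```python
# Sample document from this report, reduced to the fields the two queries use.
DOC = {
    "baz": ["abc", "klm", "xyz"],
    "bar": [
        {"bar_typ": "bar1", "color": ["red", "green", "yellow"]},
        {"bar_typ": "bar1", "color": ["red", "blue", "white"]},
        {"bar_typ": "bar1", "color": ["black", "blue"]},
        {"bar_typ": "bar2", "color": ["orange"]},
        {"bar_typ": "bar2", "color": ["pink"]},
    ],
}


def expected_bar_counts(doc, bar_typ, color=None):
    """Count matching bar objects once per baz_cde value (hypothetical helper)."""
    matching = [
        bar for bar in doc["bar"]
        if bar["bar_typ"] == bar_typ and (color is None or color in bar["color"])
    ]
    # Every baz_cde bucket maps back to the same root document, so each
    # bucket should see the same set of bar objects.
    return {baz_cde: len(matching) for baz_cde in doc["baz"]}


first_query = expected_bar_counts(DOC, "bar1")                # bar filter only
second_query = expected_bar_counts(DOC, "bar1", color="red")  # bar + bar.color filter
print(first_query)
print(second_query)
```

Three bar objects have bar_typ "bar1", and two of those contain the color "red", which is where the expected values 3 and 2 come from.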

@martijnvg martijnvg self-assigned this Jan 12, 2015

@martijnvg

Member

commented Jan 15, 2015

There is a very good chance that this bug has the same underlying cause as #9280.

@martijnvg

Member

commented Jan 18, 2015

This issue is not related to #9280 but to #9317 instead, at least for the first part of this bug report (the part that fails from 1.4.1). The underlying problem is that the nested aggregator relies on a parent docid / bucket ord combination being processed only once, but that isn't the case. Depending on where the nested aggregator is placed, the same parent docid is processed multiple times, and this produces incorrect results. For example, a terms agg emits multiple buckets for the same docid, and a nested agg is a child agg of this terms agg.

A similar problem also exists in the reverse_nested agg, which causes the second search request in this issue to yield incorrect results.
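
To make that failure mode concrete, here is a toy model in Python. It is not the Elasticsearch code, and it does not reproduce the exact numbers in this report. The class names and the forward-only cursor are assumptions chosen only to show how an aggregator that expects each parent doc id once can undercount when a terms agg hands it the same parent doc for several buckets.

```python
class ForwardOnlyCounter:
    """Toy aggregator: assumes each parent doc id is collected only once."""

    def __init__(self, children_by_parent):
        self.children_by_parent = children_by_parent
        self.consumed = {}  # parent doc id -> number of children already read
        self.counts = {}    # bucket -> child doc count

    def collect(self, parent_doc, bucket):
        children = self.children_by_parent[parent_doc]
        start = self.consumed.get(parent_doc, 0)
        # Children read on an earlier call are never revisited.
        self.counts[bucket] = self.counts.get(bucket, 0) + len(children) - start
        self.consumed[parent_doc] = len(children)


class RewindingCounter(ForwardOnlyCounter):
    """Same toy, but it re-reads the children on every call."""

    def collect(self, parent_doc, bucket):
        self.consumed.pop(parent_doc, None)  # forget earlier progress
        super().collect(parent_doc, bucket)


def run(counter_cls):
    # One parent doc (id 1) with three nested children, collected once per
    # terms bucket, as happens when a terms agg sits above a nested agg.
    counter = counter_cls({1: ["bar", "bar", "bar"]})
    for bucket in ("abc", "klm", "xyz"):
        counter.collect(1, bucket)
    return counter.counts


buggy = run(ForwardOnlyCounter)
fixed = run(RewindingCounter)
print(buggy)
print(fixed)
```

In this toy, only the first bucket sees the children in the forward-only version, while the rewinding version reports all three children for every bucket.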

martijnvg added a commit to martijnvg/elasticsearch that referenced this issue Jan 26, 2015
martijnvg added a commit that referenced this issue Jan 26, 2015
Aggs: fix handling of the same child doc id being processed multiple times in the `reverse_nested` aggregation.

Closes #9263
Closes #9345

@martijnvg martijnvg closed this in a645994 Jan 26, 2015

mute pushed a commit to mute/elasticsearch that referenced this issue Jul 29, 2015