Getting incorrect value count using reverse nested aggregation when using more than 1 nested level #7505

Closed
spotta opened this Issue Aug 28, 2014 · 4 comments

spotta commented Aug 28, 2014

Using the following hierarchical data structure:

  • author
    • book
      • review

I am trying to count the number of books per genre, filtered by book.publisher and book.review.rating, but the value_count aggregation returns an incorrect result.

This works correctly when I use only 2 levels (book and review), but it fails once I add the author level on top.

The mapping, data and query I used are below:



curl -XDELETE localhost:9200/authors

curl -XPUT  localhost:9200/authors

curl -XPUT localhost:9200/authors/author/_mapping -d '{
    "author": {
      "properties": {
        "author_id": {
          "type": "long"
        },
        "name": {
          "type": "string"
        },
        "book": {
          "type": "nested",
          "properties": {
            "book_id": {
              "type": "long"
            },
            "name": {
              "type": "string"
            },
            "genre": {
              "type": "string"
            },
            "publisher": {
              "type": "string"
            },
            "review": {
              "type": "nested",
              "properties": {
                "rating": {
                  "type": "string"
                },
                "posted_by": {
                  "type": "string"
                }
              }
            }
          }
        }
      }
    }
}'

curl -XPUT localhost:9200/authors/author/0 -d '{
  "author_id": "1",
  "name": "a1",
  "book": [
    {
      "book_id": "11",
      "name": "a1-b1",
      "genre": "g1",
      "publisher": "p1",
      "review": [
        {
          "rating": "1s",
          "posted_by": "a"
        },
        {
          "rating": "2s",
          "posted_by": "b"
        },
        {
          "rating": "1s",
          "posted_by": "a"
        }
      ]
    },
    {
      "book_id": "12",
      "name": "a1-b2",
      "genre": "g1",
      "publisher": "p1",
      "review": [
        {
          "rating": "1s",
          "posted_by": "a"
        },
        {
          "rating": "2s",
          "posted_by": "b"
        },
        {
          "rating": "1s",
          "posted_by": "a"
        }
      ]
    }
  ]
}'


The book count (book_count) returned by the following query should be 2, but it is 1.
The doc count at the filter_by_rating level is correct, but the value count under reverse_to_book is not.


curl -XPOST localhost:9200/authors/_search -d '{
  "size": 0,
  "aggs": {
    "nested_book": {
      "nested": {
        "path": "book"
      },
      "aggregations": {
        "group_by_genre": {
          "terms": {
            "field": "genre"
          },
          "aggregations": {
            "filter_by_publisher": {
              "filter": {
                "bool": {
                  "must": {
                    "term": {
                      "book.publisher": "p1"
                    }
                  }
                }
              },
              "aggregations": {
                "nested_review": {
                  "nested": {
                    "path": "book.review"
                  },
                  "aggregations": {
                    "filter_by_rating": {
                      "filter": {
                        "bool": {
                          "must": {
                            "term": {
                              "book.review.rating": "1s"
                            }
                          }
                        }
                      },
                      "aggregations": {
                        "reverse_to_book": {
                          "reverse_nested": {
                            "path": "book"
                          },
                          "aggregations": {
                            "book_count": {
                              "value_count": {
                                "field": "book_id"
                              }
                            }
                          }
                        }
                      }
                    }
                  }
                }
              }
            }
          }
        }
      }
    }
  }
}'
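
For reference, the expected result can be checked by hand without Elasticsearch. The following is only a minimal sketch in plain Python: it condenses the sample document above (names and posted_by are dropped) and counts, per genre, the books that match the publisher and have at least one review with the requested rating. The names `AUTHOR` and `books_by_genre` are made up for this illustration and are not part of any Elasticsearch API.

```python
from collections import Counter

# Condensed copy of the sample author document from the report.
AUTHOR = {
    "author_id": "1",
    "book": [
        {"book_id": "11", "genre": "g1", "publisher": "p1",
         "review": [{"rating": "1s"}, {"rating": "2s"}, {"rating": "1s"}]},
        {"book_id": "12", "genre": "g1", "publisher": "p1",
         "review": [{"rating": "1s"}, {"rating": "2s"}, {"rating": "1s"}]},
    ],
}


def books_by_genre(author, publisher, rating):
    """Count books per genre that match the publisher and have
    at least one review with the given rating (hypothetical helper)."""
    counts = Counter()
    for book in author["book"]:
        if book["publisher"] != publisher:
            continue
        # A book is counted once, no matter how many reviews match.
        if any(review["rating"] == rating for review in book["review"]):
            counts[book["genre"]] += 1
    return dict(counts)


print(books_by_genre(AUTHOR, "p1", "1s"))
```

Both books have publisher p1 and at least one "1s" review, so genre g1 should report 2 books, which is the value the query is expected to return for book_count.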


@martijnvg martijnvg self-assigned this Aug 28, 2014

martijnvg commented Aug 28, 2014

@spotta Can you share the ES version that you're using?

spotta commented Aug 28, 2014

I tried 1.3.1 and 1.3.2; both give the same results.

martijnvg added a commit that referenced this issue Aug 29, 2014

Aggregations: The nested aggregator should iterate over the child doc ids in ascending order.

The reverse_nested aggregator requires that the emitted doc ids are always in ascending order. This is already enforced at the scorer level,
but it also needs to be enforced at the nested aggregator level, otherwise incorrect counts result.

Closes #7505
Closes #7514
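
To illustrate why the emission order matters, here is a toy model in Python. It is not Elasticsearch or Lucene code, and everything in it is an assumption made for illustration: the class `ForwardOnlyParents`, the function `count_parents`, and the doc id layout (matching reviews at ids 0, 2, 4, 6 and their parent books at ids 3 and 7). The only idea it borrows from the commit message is that the parent lookup moves forward and therefore relies on child doc ids arriving in ascending order.

```python
from bisect import bisect_left


class ForwardOnlyParents:
    """Toy forward-only cursor over sorted parent doc ids (hypothetical)."""

    def __init__(self, parent_ids):
        self._ids = sorted(parent_ids)
        self._pos = 0

    def advance(self, target):
        # Find the first parent id >= target, searching only from the
        # current position onwards: the cursor can never move backwards.
        self._pos = bisect_left(self._ids, target, self._pos)
        if self._pos < len(self._ids):
            return self._ids[self._pos]
        return None


def count_parents(child_ids, parent_ids):
    """Count distinct parents resolved for the children, in the given order."""
    cursor = ForwardOnlyParents(parent_ids)
    seen = set()
    for child in child_ids:
        parent = cursor.advance(child)
        if parent is not None:
            seen.add(parent)
    return len(seen)


# Assumed layout: reviews 0 and 2 belong to the book at id 3,
# reviews 4 and 6 belong to the book at id 7.
print(count_parents([0, 2, 4, 6], [3, 7]))  # ascending child ids
print(count_parents([4, 6, 0, 2], [3, 7]))  # out-of-order child ids
```

In this model the ascending order resolves both books, while the out-of-order sequence resolves only the book at id 7, because the cursor has already moved past id 3 when children 0 and 2 arrive. The real aggregator may differ in detail; the sketch only shows how a forward-only lookup undercounts when its input is not sorted.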

@martijnvg martijnvg closed this in 2ba4e35 Aug 29, 2014

martijnvg commented Aug 29, 2014

Thanks for reporting this bug @spotta. The next release will include a fix for this bug.

spotta commented Aug 30, 2014

awesome! thanks for the quick turnaround.

@clintongormley clintongormley added the >bug label Sep 8, 2014

martijnvg added a commit that referenced this issue Sep 8, 2014

mute pushed a commit to mute/elasticsearch that referenced this issue Jul 29, 2015

Closes elastic#7505
Closes elastic#7514