
terms facet gives wrong count with n_shards > 1 #1305

Closed
jmchambers opened this issue Sep 6, 2011 · 77 comments

@jmchambers

I'm working with nested documents and have noticed that my faceted search interface is giving the wrong counts when I have more than one shard. To be more specific, I'm working with RDF triples (entity > attribute > value) and I'm nesting the attributes (called predicates in my example):

{
  "_id" : "512a2c022f0b4e3daa341e6c8bcf6c2f",
  "url": "http://dbpedia.org/resource/Alan_Shepard",
  "predicates": [
    {
      "type": "type",
      "string_value": ["thing", "person", "astronaut"]
    }, {
      "type": "label",
      "string_value": ["Alan Shepard"]
    }, {
      "type": "time in space",
      "float_value": [216.950]
    },
    ... lots more
  ]
}

I've created a shell script (https://gist.github.com/1196986) that recreates the problem with a fresh index. The created data set has these totals:

  • thing (30)
  • creative work (20)
  • video game (10)
  • tv show (10)
  • people (10)

With only one shard the following query gives the correct counts no matter what the size parameter is set to:

{
  "size": 0,
  "query": {
    "match_all": {}
  },
  "facets": {
    "type_counts": {
      "terms": {
        "field": "string_value",
        "size": 5
      },
      "nested": "predicates",
      "facet_filter": {
        "term": {
          "type": "type"
        }
      }
    }
  }
}

However, with more than one shard the size parameter affects the accuracy of the counts. If it is equal to or greater than the number of distinct terms matched (5 in this case) then the counts are correct, but the terms at the bottom of the list start to display low counts as you reduce the size parameter:

With "size" : 4

  • thing (30)
  • creative work (20)
  • video game (10)
  • tv show (9)

With "size" : 3

  • thing (30)
  • creative work (15)
  • video game (9)

With "size" : 2

  • thing (30)
  • creative work (15)

So it looks like the sub-totals from some of the shards aren't being included for some reason. BTW I'm on Ubuntu and the problem seems to affect all versions of ES I've tried (0.17.0, 0.17.1 and 0.17.6). Any ideas...?
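
For reference, here is the sort of over-sized request I mean (the same facet as above, with the size bumped well past the number of distinct type values; the 50 is an arbitrary upper bound for this data set):

{
  "size": 0,
  "query": {
    "match_all": {}
  },
  "facets": {
    "type_counts": {
      "terms": {
        "field": "string_value",
        "size": 50
      },
      "nested": "predicates",
      "facet_filter": {
        "term": {
          "type": "type"
        }
      }
    }
  }
}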

P.S. absolutely loving ES - it's made my life a lot easier :)

@losomo

losomo commented Oct 17, 2011

+1 for this bug. I have reproduced the problem using documents with just one field.
Complete test (as run on version 0.17.8):
as Perl script: https://gist.github.com/1292897
generated shell test: https://gist.github.com/1292912
The first error can be seen on line 951.
Expected:

"terms" : [
 {
 "count" : 10,
 "term" : "user 10"
 }
 ],

Got:

"terms" : [
 {
 "count" : 7,
 "term" : "user 9"
 }
 ],

@losomo

losomo commented Oct 17, 2011

After more experimenting I'd say that it's caused by naïve top-N facet merging. Something like Phase three here should be added:

http://wiki.apache.org/solr/DistributedSearchDesign#Phase_3:_REFINE_FACETS_.28only_for_faceted_search.29

Or something smarter: http://netcins.ceid.upatras.gr/papers/Klee_VLDB.pdf

@kimchy
Member

kimchy commented Oct 18, 2011

Right, the way top N facets work now is by getting the top N from each shard, and merging the results. This can give inaccurate results. The phase 3 thingy is not really a solution, will read the paper though :)
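
For example (made-up numbers, not from any of the gists above), with "size": 2 and two shards:

shard 1 returns: [ { "term": "thing", "count": 15 }, { "term": "creative work", "count": 12 } ]
                 ("video game" has 8 hits on this shard but is not in its top 2, so it is never reported)
shard 2 returns: [ { "term": "thing", "count": 15 }, { "term": "video game", "count": 9 } ]
                 ("creative work" has 8 hits on this shard but is not in its top 2)

merged response: [ { "term": "thing", "count": 30 }, { "term": "creative work", "count": 12 } ]
                 (the true count for "creative work" is 20)

The per-shard contributions that never made it into a shard's top N are simply lost in the merge.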

@losomo

losomo commented Oct 18, 2011

Right. The "phase 3 thingy" only solves the second problem:

  1. The recall is not 100% (some values may be simply missing)
  2. The numbers are often lower than they should be.

But while the first problem can easily go unnoticed, the second one leads to quite a lousy user experience in very common faceting scenarios: a web page displays the top 10 commenters with the number of their comments in parentheses, you click the last one, listed with 30 comments, and voilà, it shows her 48 comments. That is hard not to notice.

In the meantime, a slow and not-always-reliable workaround:
Set size to the value you actually want plus 150 (or another magic constant) when requesting the facets, then keep only the top size terms of the response on the client side.
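
Roughly like this (field name made up; you only want the top 10, so keep just the first 10 terms of the response on the client):

{
  "query": {
    "match_all": {}
  },
  "facets": {
    "top_commenters": {
      "terms": {
        "field": "user",
        "size": 160
      }
    }
  }
}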

@karussell
Contributor

The phase 3 thingy is not really a solution

@kimchy: why do you think so? Isn't it a similar approach to query-then-fetch, only for facets and their counts instead of docs and their scores? (It could even be folded into the second query to avoid extra traffic, no?)

Another workaround would be 'routing': make sure every facet value goes to the same shard. But then the data needs a lot of facet values, none of them with too high a count, to avoid unbalanced shards ...
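
As a sketch (the type and field names are made up), something along these lines in the mapping would route every document by its user value, so a given user's documents all live on one shard and its per-shard count is already the global count:

{
  "comment": {
    "_routing": {
      "required": true,
      "path": "user"
    }
  }
}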

@agnellvj

Is this on the roadmap? We discovered this same problem with regular fields and large datasets with multiple nodes and shards. This is problematic for us since the faceted counts can fluctuate wildly based on what filters we apply. Is there a workaround?

@piskvorky

I still see this bug in 0.19.4, and the wrong counts are a show-stopper for us too.

Adding "150 or another magical constant" only helps for tiny datasets; is there a more robust workaround until this is fixed properly?

@jmchambers
Author

+1 for a better workaround or fix

The only completely robust workaround I know of is to limit yourself to one shard. Obviously that pretty much takes the 'elastic' out of 'elasticsearch', but if accurate counts are critical to your app and your index isn't too big you might get away with it...

@tarunjangra

Any update on this? I have an application where counts are critically important. You suggest limiting to one shard for the entities that undergo such queries.
Does it make sense to route entities by entity type to corresponding shards?

@karmi
Contributor

karmi commented Jul 17, 2012

The only completely robust workaround I know of is to limit yourself to one shard. Obviously that pretty much takes the 'elastic' out of 'elasticsearch' (...)

@jmchambers Not really -- you can slice the data into many one-shard indices, with a weekly (daily, hourly, ...) rotation. You can use aliases (or wildcards, ...) to query them in a reasonable way from your application. Would multiple one-shard indices work around the facet problem?
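
As a sketch of what I mean (the index and alias names are just examples): create each weekly index with a single shard,

{
  "settings": {
    "index": {
      "number_of_shards": 1
    }
  }
}

and add every new index to the same alias so the application always queries one name:

{
  "actions": [
    { "add": { "index": "events-2012-29", "alias": "events" } }
  ]
}

(the first body goes with the index creation request, the second one to the _aliases endpoint)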

@piskvorky

@karmi: only if the indices are large enough. To my knowledge, the scoring stats (IDF etc) are computed per-index (actually, per-shard by default). So comparing relevancy scores across indices is like comparing apples to oranges, even with dfs_query_then_fetch.

But once the indices are large enough, this doesn't matter anymore as the stats converge (assuming identical doc/word distribution in each index, for the stats to actually converge).

@ajhalani

ajhalani commented Aug 9, 2012

+1 for a fix. The workarounds (one shard, multiple one-shard indices, etc.) don't sound convincing.

@Downchuck

I'm a bit confused by this issue: I have five shards and the top value is ranked #1 on every shard, by a large margin. Still, when I facet, I get a count 20% smaller than when I simply query for that value directly.

@jrydberg

jrydberg commented Sep 1, 2012

@karmi

you can slice the data into many one-shard indices, with a weekly (daily, hourly, ...) rotation

Won't you run into the same problem when doing a faceted count over multiple indices?

@giamma

giamma commented Oct 29, 2012

Any news about this issue?

@tgruben

tgruben commented Nov 8, 2012

Any hope of this being fixed in the near future?

@danfairs

For what it's worth, the multiple index/single shard approach is working for us. Our application has a new index per week of data anyway, so it's not actually too painful for our current usage.

@markwaddle

+1 for a fix
My application has 7m+ docs of varying sizes (a 600GB index) and growing, so 1 shard is not feasible.
For my application I am willing to trade performance (and/or hardware resources) for accurate facet counts.

@webmusing

+1 for a fix

The terms stats facet seems to be affected by the same issue as well.

@piskvorky

It seems there is no fix forthcoming; how about at least making the most common scenario less annoying/more palatable?

What I mean by that: when ES returns, let's say, 20 facet terms (with the wrong counts...), it could automatically run a second request against all shards asking for an accurate count of only those exact 20 terms, then add up these accurate counts and return only that as the result. Would that make sense?
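
This is something the client can already do by hand today, at the cost of that extra request: take the terms from the first (inaccurate) response and ask for one exact filter facet per term in a second request. A sketch, ignoring the nested setup from the original report and reusing its field and terms:

{
  "size": 0,
  "query": {
    "match_all": {}
  },
  "facets": {
    "thing": {
      "filter": { "term": { "string_value": "thing" } }
    },
    "creative work": {
      "filter": { "term": { "string_value": "creative work" } }
    }
  }
}

Filter facets just count matching documents, so these totals are exact across shards.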

@vincentpoon

+1 for a fix. Incorrect facet counts are resulting in a bad user experience.

@songday

songday commented Jul 5, 2013

It seems this bug is not completely fixed.

I did a facet query sorted by count, and the results were right:
"terms" : [ {
"term" : "AAA",
"count" : 59,
"total_count" : 59,
"min" : 1.0,
"max" : 54.0,
"total" : 391.0,
"mean" : 6.627118644067797
}, {
"term" : "BBB",
"count" : 55,
"total_count" : 55,
"min" : 1.0,
"max" : 17.0,
"total" : 154.0,
"mean" : 2.8
}]

but if I sort by total (same query), the results are not right:
"terms" : [ {
"term" : "AAA",
"count" : 56, //this is not right
"total_count" : 56,
"min" : 1.0,
"max" : 54.0,
"total" : 388.0,
"mean" : 6.928571428571429
}, {
"term" : "BBB",
"count" : 56, //this is not right
"total_count" : 56,
"min" : 1.0,
"max" : 17.0,
"total" : 171.0,
"mean" : 3.0535714285714284
}]

@tommymonk

+1

@HeyBillFinn

Hi,

I am trying to run a search query using multiple facets, and I want the facet numbers to update in the same way as the site Zappos does. For example, when I search for Nike in the men's shoe section, I get facets for Brand and Color.

When I narrow my search by Brand (by selecting 'Nike'), the numbers within the Brand facet do not change, but the numbers within the Color facet change to reflect the narrowed search results.

I can widen my search by selecting 'Nike Action', and still no numbers within the Brand facet are updated, but numbers in the Color facet have been updated to reflect the additional results.

I can see the same expected results if I select a term within the Color facet.

I can think of two ways to do this using Elasticsearch, and I'm looking for any guidance/suggestions as to the best way to implement this.

  1. Filtered query using fairly complex facet filters within each facet, including global = true flag.
  2. Top-level filter (which I understand does not affect the facet results) with slightly less complex facet filters within each facet.

Which of the two options would perform better? Is there a better option that I'm not thinking of? I can add example JSON if it will help explain my thoughts.
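
For instance, option 2 could look roughly like this once the user has selected the brand 'Nike' (field names are made up):

{
  "query": { "match": { "name": "nike" } },
  "filter": { "term": { "brand": "Nike" } },
  "facets": {
    "brands": {
      "terms": { "field": "brand" }
    },
    "colors": {
      "terms": { "field": "color" },
      "facet_filter": { "term": { "brand": "Nike" } }
    }
  }
}

The top-level filter narrows the hits without touching the facets, so the Brand facet keeps its counts, while the facet_filter on the Color facet applies the brand selection to the colors only.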

Thanks!

Bill

@danfairs

danfairs commented Aug 2, 2013

@finn1317, this is something you should ask on the elasticsearch mailing list. I don't think it's related to the issue being discussed here.

@peakx

peakx commented Aug 8, 2013

+1 for a fix.

@SanderDemeester

Question: could this bug be related to the following result I'm getting?
If I request the first JSON document where the value of field x is y, I get a result back.
But when I request all distinct values for field x, I get a list of values back, and 'y' is not included.

The query uses all_terms and I match using match_all.

@igal-getrailo
Contributor

@jpountz Thank you for the explanation. I agree with @piskvorky that a parameter like "favor_accuracy_over_performance" (or whatever name) would be a good idea moving forward.

But for now my question is: is there some sort of formula, or rule of thumb, as to what the shard_size parameter should be? As I specified in my question on the mailing list -- https://groups.google.com/d/msg/elasticsearch/xYRaJa04wfc/lcmEvP53rR4J -- I have 5 shards in my index, and I see this problem when my agg result is less than 50 -- is that because shard_size has a default of 10, so num_shards * shard_size?

Also, for some reason my size parameter is ignored and I get 10 values no matter what size I pass.

@megastef

megastef commented May 3, 2014

@igal-getrailo Did you read the post above from @karmi: "Check out the new cardinality aggregation added in #5426"? Did you try to use aggregations instead of facets? There is a nice presentation from the Elasticsearch team here: https://speakerdeck.com/bleskes/deep-dive-into-aggregations
As far as I understood, aggregations will replace facets in the future, so I think the ES team listens to its users.

@igal-getrailo
Contributor

@megastef yes, I am using the aggregations and am getting the same problem (didn't try the cardinality aggs yet but will check it out).

I only found this bug report here after trying to get a response in the mailing list for over a week, but as can be seen in the example I posted at -- https://groups.google.com/forum/#!msg/elasticsearch/xYRaJa04wfc -- I am passing "aggs": {"available_tags": {"terms": {"field": "tags"}}, "size": 20} with the request.

And I actually went over Boaz's slides a couple of days ago, but I guess without the audio of the presentation the slides are not that helpful.

Anyway, at first glance the cardinality aggregation looks promising. I will look deeper into it. Thanks!

@igal-getrailo
Contributor

@megastef It looks like the cardinality aggregation can do the trick when used as a sub-agg of the terms aggregation. In my case, however, I was placing the size param in the wrong scope, and it was therefore ignored. It should have been "aggs": {"available_tags": {"terms": {"field": "tags", "size": 20}}} instead.

The docs were updated since I last checked them, and apparently setting the size param properly to a value higher than shard_size also raises shard_size, so in my specific case (I don't have a high-cardinality data set) setting the size param properly resolves the issue.

Thank you and @jpountz for your help.

@jpountz
Contributor

jpountz commented May 4, 2014

but for now my question is: is there some sort of a formula, or rule of thumb as to what the shard_size parameter should be?

There is no formula, it depends on what your data looks like. Terms that are missing or have inaccurate counts in the final response are those that are not in the top shard_size terms on one shard or more. So if your shards are uniform, even a slight increase of the shard size might improve the accuracy of counts a lot.
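
For example, with a terms aggregation you can keep the final size at 20 while asking every shard for a larger candidate list (the 100 here is arbitrary; tune it to your data):

{
  "size": 0,
  "aggs": {
    "available_tags": {
      "terms": {
        "field": "tags",
        "size": 20,
        "shard_size": 100
      }
    }
  }
}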

@igal-getrailo
Contributor

@jpountz Thank you!

@speedplane
Contributor

It seems that aggregations are the future and facets are on the way out. However, I want faceting functionality and I am trying to determine whether I should use facets with shard_size set, or a cardinality aggregation with precision_threshold set. Is there any documentation comparing the performance and accuracy of those two related options? Is one always better than the other, and if not, what factors influence that design decision?
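
For clarity, these are the two knobs I am comparing (field name made up). A terms facet with shard_size raised on every shard:

{
  "facets": {
    "top_users": {
      "terms": { "field": "user", "size": 10, "shard_size": 100 }
    }
  }
}

versus a cardinality aggregation with its precision_threshold:

{
  "aggs": {
    "unique_users": {
      "cardinality": { "field": "user", "precision_threshold": 1000 }
    }
  }
}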

@eliasah
Contributor

eliasah commented Aug 25, 2014

Have you considered using the aggregation feature within the facets feature to fix this issue?

@ajhalani

It would be nice to know whether there are plans to review/work on this in the near future.
As mentioned in some of the previous updates, an option to force an extra round trip that returns correct values would be a nice compromise for many users.

@lloy0076

Is this issue likely to be "fixed" given that it appears facets, en masse, are deprecated? See the warning on http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/search-facets.html.

@clintongormley

@lloy0076 it definitely won't be fixed in facets. However, the same problem exists in aggregations, due to the nature of distributed systems. We'll leave this issue open as a reminder - at some stage we hope to provide a slower but more accurate version.

@jodok

jodok commented Feb 13, 2015

Yes, if you use the "scatter/gather" pattern you naturally hit the limits of a two-phase approach. When building Crate.IO* (a SQL database built on Elasticsearch, https://github.com/crate/crate) we decided to extend it with a distributed map/reduce algorithm to provide accurate aggregations.
*(disclaimer: I'm one of the Crate guys)

@otisg

otisg commented Feb 14, 2015

@jodok Because @clintongormley used the words "slower" and "more accurate" in "we hope to provide a slower but more accurate version", can you comment on the performance and accuracy of your implementation? Specifically, @clintongormley is implying the numbers will be closer to the truth, but not necessarily 100% accurate, and I'm wondering whether in the case of Crate.IO they are always 100% accurate. And, of course, I'm wondering whether you've compared the performance of the Crate.IO approach vs. the ES approach (at scale)? Thanks.

@jodok

jodok commented Feb 14, 2015

@otisg The performance of "normal" queries that don't need the two-phase map/reduce approach is the same, but exact aggregations need one more cluster roundtrip. Instead of 1. collect data (all shards), 2. merge result (one node), an additional step is added: 1. collect (all shards), 2. reshuffle data (to multiple nodes) and do the aggregation, 3. merge (one node). This requires one more network roundtrip in the cluster, but usually that costs less than a millisecond. The method scales horizontally; performance heavily depends on cardinality and the size of the result set. Happy to help run tests on sample data on a demo cluster.

@otisg

otisg commented Feb 15, 2015

@clintongormley @kimchy couldn't ES "borrow" the approach/idea that Crate or Solr(Cloud) uses? Both give 100% accurate counts.

@ppf2
Member

ppf2 commented Jul 14, 2015

Facets have been deprecated for a while now and have been removed from master/2.0 (#7337) with aggregations being the replacement. Time to close this ticket?

@clintongormley

Agreed

@thanodnl
Contributor

I think it is worth keeping this ticket open, as the aggregations framework suffers from the same issue with terms aggregations on more than one shard.

@clintongormley

I've opened #12316 instead
