Add support of the count api when using field collapsing #27556

Closed
Koc opened this issue Nov 28, 2017 · 3 comments

Comments

Koc commented Nov 28, 2017

{
    "query": {
        "bool": {
            "filter": [{
                "terms": {
                    "countries_ids": [165]
                }
            }]
        }
    },
    "collapse": {
        "field": "company_id"
    }
}

It is currently impossible to execute this kind of query with the count API (https://www.elastic.co/guide/en/elasticsearch/reference/5.6/search-count.html). We can run a search, but we cannot request only a count. Also, the hits.total field in the response reports the total number of documents before they were collapsed. It would be nice to add something like hits.total_collapsed that returns the number of documents left after collapsing.

jimczi (Contributor) commented Nov 28, 2017

> Also, the hits.total field in the response reports the total number of documents before they were collapsed. It would be nice to add something like hits.total_collapsed that returns the number of documents left after collapsing.

Retrieving the total number of groups that match a query in a distributed system is not trivial and can be very costly.
Field collapsing is applied to the top documents only, not to all documents, which is why the response reports the number of matching documents rather than the number of groups. It does not need this costly computation to return accurate top hits.
If you really want to compute this number, you can use the cardinality aggregation, which was designed to handle this case more efficiently:
https://www.elastic.co/guide/en/elasticsearch/reference/current/search-aggregations-metrics-cardinality-aggregation.html#_counts_are_approximate
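
As a rough illustration of that suggestion, the sketch below derives a count-only request body from the collapse query in the original report. The helper name `build_group_count_body` and the aggregation name `collapsed_count` are made up for this example and are not part of Elasticsearch. The sketch only builds the JSON body and does not send it anywhere. The resulting count is approximate, as the linked documentation notes.

```python
import copy
import json


def build_group_count_body(search_body, agg_name="collapsed_count",
                           precision_threshold=None):
    """Turn a search body that uses `collapse` into a size-0 request that
    estimates the number of distinct collapse keys.

    Hypothetical helper for illustration; `agg_name` is an arbitrary label.
    """
    field = search_body["collapse"]["field"]

    # Work on a copy so the caller's search body stays usable for the real search.
    body = copy.deepcopy(search_body)

    # Collapsing, paging, sorting and other aggregations are irrelevant
    # when only the number of groups is wanted.
    for key in ("collapse", "from", "sort", "aggs", "aggregations"):
        body.pop(key, None)

    body["size"] = 0  # return no hits, only the aggregation result

    cardinality = {"field": field}
    if precision_threshold is not None:
        # Optional accuracy/memory trade-off of the cardinality aggregation.
        cardinality["precision_threshold"] = precision_threshold

    body["aggs"] = {agg_name: {"cardinality": cardinality}}
    return body


# The query from the original report.
search_body = {
    "query": {"bool": {"filter": [{"terms": {"countries_ids": [165]}}]}},
    "collapse": {"field": "company_id"},
}

count_body = build_group_count_body(search_body)
print(json.dumps(count_body, indent=4))
```

The printed body can be sent to the regular search endpoint in place of the collapsed search. The query part is left untouched, so the estimate covers the same documents that the collapsed search matches.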

jimczi closed this as completed Nov 28, 2017
Koc (Author) commented Nov 28, 2017

Thank you for such a detailed explanation.

Yes, as a workaround I am using the cardinality aggregation for now, but it requires a lot of changes on the application side. Would it be possible to use this aggregation under the hood when a user requests the count for a collapsed field, taking into account that the result will be approximate?

jimczi (Contributor) commented Nov 28, 2017

It is costly to compute even with the cardinality aggregation, so I think it's important to keep the computation explicit. Also, this number is not really necessary for field collapsing: you don't need it to paginate, so I am reluctant to add a slow mode to field collapsing. Sorry, but as I said, if you really want to compute this number you should use aggregations.
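
To make the "explicit" part concrete on the application side, here is a minimal sketch of reading both numbers from a search response that includes a cardinality aggregation. The function name `read_counts`, the aggregation name `collapsed_count`, and the sample response are all hypothetical. The numbers are hand-written for illustration and are not real results.

```python
def read_counts(response, agg_name="collapsed_count"):
    """Return (matching_documents, approximate_group_count) from a search
    response that contains a cardinality aggregation named `agg_name`.

    Hypothetical helper for illustration only.
    """
    total = response["hits"]["total"]
    # Elasticsearch 5.x/6.x report hits.total as an integer; 7.x and later
    # report an object with a "value" key, so accept both shapes.
    if isinstance(total, dict):
        total = total["value"]

    # A cardinality aggregation returns its estimate under "value".
    groups = response["aggregations"][agg_name]["value"]
    return total, groups


# Hand-written sample response; the numbers are invented for illustration.
sample_response = {
    "hits": {"total": 120, "hits": []},
    "aggregations": {"collapsed_count": {"value": 37}},
}

documents, groups = read_counts(sample_response)
print(documents, groups)
```

Here `documents` corresponds to hits.total (documents before collapsing) and `groups` to the approximate number of distinct collapse keys, which is the figure the issue asks for.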
