Slowdown on date range queries in ES 2.1 #15994

Closed
jrots opened this Issue Jan 14, 2016 · 16 comments

jrots commented Jan 14, 2016

On a dataset of +/- 30M documents (unoptimized, so lots of segments), the following query executes a lot faster:

curl -XGET "http://localhost:9200/users/person/_search" -d '
    {
        "from": 0,
        "size": 20,
        "_source": false,
        "query": {
            "constant_score": {
                "filter": {
                    "bool": {
                        "must": [
                            {
                                "term": {
                                    "gender": "female"
                                }
                            },
                            {
                                "terms": {
                                    "location.geohash": [
                                        "u143",
                                        "u149",
                                        "u14c",
                                        "u146",
                                        "u14d",
                                        "u14f",
                                        "u147",
                                        "u14e",
                                        "u14g"
                                    ]
                                }
                            },
                            {
                                "term": {
                                    "accountstatus": "active"
                                }
                            },
                            {
                                "terms": {
                                    "country": [
                                        "be"
                                    ]
                                }
                            }
                        ]
                    }
                }
            }
        },
        "post_filter": { 
            "range": {
                "birthdate": {
                    "gte": "1981-01-13",
                    "lte": "1997-01-13"
                }
            }
        }
    }   
    '

< 110ms on consecutive runs:
{"took":109,"timed_out":false,"_shards":{"total":4,"successful":4,"failed":0},"hits":{"total":2454,"max_score":1.0

vs. this one, which always takes 300ms to 400ms+:

curl -XGET "http://localhost:9200/users/person/_search" -d '
  {
        "from": 0,
        "size": 20,
        "_source": false,
        "query": {
            "constant_score": {
                "filter": {
                    "bool": {
                        "must": [
                            {
                                "term": {
                                    "gender": "female"
                                }
                            },
                            {
                                "terms": {
                                    "location.geohash": [
                                        "u143",
                                        "u149",
                                        "u14c",
                                        "u146",
                                        "u14d",
                                        "u14f",
                                        "u147",
                                        "u14e",
                                        "u14g"
                                    ]
                                }
                            },
                            {
                                "term": {
                                    "accountstatus": "active"
                                }
                            },
                            {
                                "terms": {
                                    "country": [
                                        "be"
                                    ]
                                }
                            },
                            {
                                "range": {
                                    "birthdate": {
                                        "gte": "1981-01-13",
                                        "lte": "1997-01-13"
                                    }
                                }
                            }
                        ]
                    }
                }
            }
        }
    }
    '

{"took":304,"timed_out":false,"_shards":{"total":4,"successful":4,"failed":0},"hits":{"total":2446,"max_score"

If this is expected behaviour, please close the issue.
But I think that ES should analyse the query and know when to do post-filtering instead of generating a lot of Lucene terms for date fields, which slows queries down drastically.

clintongormley (Member) commented Jan 14, 2016

What version of ES?
How many times did you run it?
How many segments do you have?
What hardware are you running this on? (all your search times seem very slow for such a small index)

jrots commented Jan 14, 2016

  • ES: 2.1.1 - Lucene 5.3.1, -Xms28g -Xmx28g, mlockall = true
  • 6 times consecutively for each query
  • Intel(R) Xeon(R) CPU X5650 @ 2.67GHz, 2 physical CPUs, 12 cores with hyperthreading, 24 processors in total
  • 64GB RAM, 1TB SSD
  • segments: https://gist.github.com/jrots/d7020441bf33dca4b67a , 4 shards with 30 segments on each shard
jimczi (Member) commented Jan 14, 2016

And what is the average time when you remove the range filter/post_filter? The post_filter and filter are very similar in terms of implementation; the post_filter also has to scan the inverted lists.

clintongormley (Member) commented Jan 15, 2016

@jrots It seems there may be a misunderstanding of what the post_filter does in Elasticsearch. The post filter is simply a filter that is applied to the query results AFTER aggregations have been calculated - it has nothing to do with executing the filter via the inverted index vs fielddata.

That said, something seems odd about how the query is executed - if the range query is the most costly, the bool query should figure that out and change the execution order to apply the terms queries first. We'll investigate.

Two questions:

  • Is the birthdate field mapped as a string or as a date?
  • Why does the number of results differ for the two queries? Are you actively indexing?
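To illustrate the distinction, here is a minimal sketch (reusing field names from this thread): the by_country aggregation is computed over all documents matching the query, and the post_filter then narrows only the returned hits.

```json
{
  "query": { "term": { "gender": "female" } },
  "aggs": {
    "by_country": { "terms": { "field": "country" } }
  },
  "post_filter": {
    "range": { "birthdate": { "gte": "1981-01-13", "lte": "1997-01-13" } }
  }
}
```

Moving the range into the query (or a filter clause) would instead restrict both the hits and the aggregation counts.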
jimczi (Member) commented Jan 15, 2016

Why do the number of results differ for the two queries? Are you actively indexing?

@clintongormley I guess that the number of results differs because the post-filtered documents are not taken into account in the total hits. I opened #16021 because I don't know if it's expected or not.

clintongormley (Member) commented Jan 15, 2016

@jimferenczi no - these counts should be the same whether post_filter is used or not. post_filter just affects which documents the aggs see.

jrots commented Jan 15, 2016

I was still actively indexing at that moment - that's done now, and I've also run a force merge with max_num_segments=1 to be sure that we're measuring the right things here.

with range on birthdate :

{"took":872,"timed_out":false,"_shards":{"total":4,"successful":4,"failed":0},"hits":{"total":18892}}
{"took":875,"timed_out":false,"_shards":{"total":4,"successful":4,"failed":0},"hits":{"total":18892}}
{"took":109,"timed_out":false,"_shards":{"total":4,"successful":4,"failed":0},"hits":{"total":18892}}

If I change the range of the birthdate by one day, for example, it's again 800ms+ on consecutive runs.

leaving out the birthdate completely :

{"took":37,"timed_out":false,"_shards":{"total":4,"successful":4,"failed":0},"hits":{"total":39062}}
{"took":32,"timed_out":false,"_shards":{"total":4,"successful":4,"failed":0},"hits":{"total":39062}}

and so on, even when changing some of the params in the query.

Running with the range on birthdate in a post_filter, initial + consecutive runs:

{"took":401,"timed_out":false,"_shards":{"total":4,"successful":4,"failed":0},"hits":{"total":18892}}
{"took":54,"timed_out":false,"_shards":{"total":4,"successful":4,"failed":0},"hits":{"total":18892}}
{"took":53,"timed_out":false,"_shards":{"total":4,"successful":4,"failed":0},"hits":{"total":18892}}

birthdate is stored as :

          "birthdate" : {
            "type" : "date",
            "format" : "YYYY-MM-dd"
          },

I wonder, since I don't need time precision on my dates (a birthdate, for example), whether I'm not better off storing these dates as a fixed int number => floor((epochtime - birthdate timestamp)/60/60/24).
That way fewer terms would be generated in the Lucene index; the precision_step would also be 8 in that case, delivering faster responses.
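As a sketch of that idea (hypothetical helper, not part of the mapping above), encoding each date as a whole number of days since the Unix epoch turns the date range into a plain integer range:

```python
from datetime import date

EPOCH = date(1970, 1, 1)

def days_since_epoch(d: date) -> int:
    """Encode a date as a whole number of days since 1970-01-01."""
    return (d - EPOCH).days

# The birthdate range from the query above becomes an integer range:
lo = days_since_epoch(date(1981, 1, 13))  # → 4030
hi = days_since_epoch(date(1997, 1, 13))  # → 9874
```

The field would then be indexed as an integer, and the range filter compares ints instead of walking per-day date terms.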


clintongormley (Member) commented Jan 15, 2016

I was still actively indexing at that moment

OK that explains the varying counts.

Wonder if I don't need time precision on my dates (like a birthdate f.e.), if I'm not better off storing these dates as a fixed int number => floor(( (epochtime) - (timestamp birthdate) )/60/60/24)

Before going there... query execution order has had a big rewrite, and it sounds like it is choosing a suboptimal order. @jimferenczi is going to dive into it to see if he can find something that can be improved.

Thanks for all the info.

jimczi (Member) commented Jan 15, 2016

(Edited: the previous version of this comment was wrong and misleading.)
@jrots I'll try to reproduce.


jimczi (Member) commented Jan 18, 2016

@jrots I've tried to reproduce with a 40M-document index, 5 shards; each document contains one date field with a random date chosen between "1950-01-01" and "2000-12-31", and two long fields with different cardinalities (one big and one small).
I compared the post_filter + boolean query vs one big boolean query, and the results are always the same; there is no gain with the post_filter. I checked with different types of range queries (big ranges like 1950 => 2000 and small ones like your example).
When you test performance you can see some variation depending on the activity of your machine in terms of CPU and RAM. Are you sure that you stopped the indexing and the queries when you checked the performance of each query? Can you try to re-run your tests with more than 3 trials per query?


clintongormley (Member) commented Jan 18, 2016

@jrots could you provide a more complete recreation? or would it be possible to upload your index somewhere so that we could try it out?


jrots commented Jan 19, 2016

OK, I'll have a look at putting together a testable index that doesn't contain too much private info.


jrots commented Feb 1, 2016

Hi guys,
sorry for the late reply. I retried the post_filter and the filter on date ranges, and they do indeed give the same results (both quite slow, to be honest). It must've been the indexing I was doing at the time that caused the fluctuation in search response times.

I think that if you have a lot of documents and are doing a wide numeric range, there can be a big slowdown.

I fixed my problem by rewriting the following range query :

                    "range": {
                                "birthdate": {
                                    "gte": "1967-01-28",
                                    "lte": "1996-01-28"
                                }
                            }

to :

"filter": {
                "bool": {
                    "should": [
                        {
                            "range": {
                                "birthdate": {
                                    "gte": "1967-01-28",
                                    "lte": "1967-12-31"
                                }
                            }
                        },
                        {
                            "range": {
                                "birthdate": {
                                    "gte": "1996-01-01",
                                    "lte": "1996-01-28"
                                }
                            }
                        },
                        {
                            "terms": {
                                "country_birthdate_string": [
                                    "be1-1968",
                                    "be1-1969",
                                    "be0-1970",
                                    "be0-1980",
                                    "be1-1990",
                                    "be1-1991",
                                    "be1-1992",
                                    "be1-1993",
                                    "be1-1994",
                                    "be1-1995"
                                ]
                            }
                        }

0_<year rounded down to the decade> => spans 10 years
1_<year> => spans 1 year

I added a transform to the index mapping that generates these keywords automatically for me:

    "mappings" : {
      "person" : {
        "transform" : {
          "inline" : "ctx._source[\"country_birthdate_string\"] = [ctx._source[\"country\"] + \"0_\"+(int)Math.floor(Integer.parseInt(ctx._source[\"birthdate\"].substring(0,4))/10)*10, ctx._source[\"country\"] + \"1_\"+ctx._source[\"birthdate\"].substring(0,4)];",
          "lang" : "groovy"
        },
        "_all" : {
          "enabled" : false
        },
        "properties" : {
          "birthdate" : {
            "type" : "date",
            "format" : "YYYY-MM-dd"
          },
          "country" : {
            "type" : "string",
            "index" : "not_analyzed"
          },
          "country_birthdate_string" : {
            "type" : "string",
            "index" : "not_analyzed"
          },

I get a speed-up of about 10x doing it like this for my use case.
Maybe it's worth adding something like this higher up in the Lucene internals / revisiting the numeric tree lookup for large ranges.
Regards,
J
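For reference, a Python sketch of what the Groovy transform above computes per document (assuming the underscore-separated form the transform emits):

```python
def country_birthdate_terms(country: str, birthdate: str) -> list:
    """Build the two composite keywords the mapping transform stores:
    one decade-granularity term and one year-granularity term."""
    year = int(birthdate[:4])   # "1984-06-02" -> 1984
    decade = (year // 10) * 10  # 1984 -> 1980
    return [f"{country}0_{decade}", f"{country}1_{year}"]

country_birthdate_terms("be", "1984-06-02")  # → ["be0_1980", "be1_1984"]
```

A wide query range can then match fully-covered decades with a single decade term each, cover the remaining full years with per-year terms, and leave only the two partial edge years for real range clauses.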


clintongormley (Member) commented Mar 1, 2016

The new point field encoding coming in 5.0 should improve this situation. Closing.


ChristopheBoucaut commented Apr 6, 2017

Hi @jrots.

Did you test your use case with ES 5.*? Can you confirm the improvement?

I have the same issue with ES 5.3 and I haven't found a "native solution" in ES.

Regards.


tvernum (Contributor) commented Jun 28, 2017

@hwb1992 Please ask this question on our discuss forum where we can provide better support.

