Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Slowdown on date range queries in ES 2.1 #15994

Closed
jrots opened this issue Jan 14, 2016 · 16 comments
Closed

Slowdown on date range queries in ES 2.1 #15994

jrots opened this issue Jan 14, 2016 · 16 comments
Assignees
Labels
>bug :Search/Search Search-related issues that do not fall into other categories

Comments

@jrots
Copy link

jrots commented Jan 14, 2016

On a dataset of +/- 30M (unoptimized / so lot's of segments)
The following query executes faster a lot faster :

curl -XGET "http://localhost:9200/users/person/_search" -d '
    {
        "from": 0,
        "size": 20,
        "_source": false,
        "query": {
            "constant_score": {
                "filter": {
                    "bool": {
                        "must": [
                            {
                                "term": {
                                    "gender": "female"
                                }
                            },
                            {
                                "terms": {
                                    "location.geohash": [
                                        "u143",
                                        "u149",
                                        "u14c",
                                        "u146",
                                        "u14d",
                                        "u14f",
                                        "u147",
                                        "u14e",
                                        "u14g"
                                    ]
                                }
                            },
                            {
                                "term": {
                                    "accountstatus": "active"
                                }
                            },
                            {
                                "terms": {
                                    "country": [
                                        "be"
                                    ]
                                }
                            }
                        ]
                    }
                }
            }
        },
        "post_filter": { 
            "range": {
                "birthdate": {
                    "gte": "1981-01-13",
                    "lte": "1997-01-13"
                }
            }
        }
    }   
    '

< 110ms, consecutive runs,
{"took":109,"timed_out":false,"_shards":{"total":4,"successful":4,"failed":0},"hits":{"total":2454,"max_score":1.0

vs. this is one is always : 300ms to 400ms +

curl -XGET "http://localhost:9200/users/person/_search" -d '
  {
        "from": 0,
        "size": 20,
        "_source": false,
        "query": {
            "constant_score": {
                "filter": {
                    "bool": {
                        "must": [
                            {
                                "term": {
                                    "gender": "female"
                                }
                            },
                            {
                                "terms": {
                                    "location.geohash": [
                                        "u143",
                                        "u149",
                                        "u14c",
                                        "u146",
                                        "u14d",
                                        "u14f",
                                        "u147",
                                        "u14e",
                                        "u14g"
                                    ]
                                }
                            },
                            {
                                "term": {
                                    "accountstatus": "active"
                                }
                            },
                            {
                                "terms": {
                                    "country": [
                                        "be"
                                    ]
                                }
                            },
                            {
                                "range": {
                                    "birthdate": {
                                        "gte": "1981-01-13",
                                        "lte": "1997-01-13"
                                    }
                                }
                            }
                        ]
                    }
                }
            }
        }
    }
    '

{"took":304,"timed_out":false,"_shards":{"total":4,"successful":4,"failed":0},"hits":{"total":2446,"max_score"

If this expected behaviour, please close the issue -
But I think that ES should analyse the query and know when to do post filtering instead of generating a lot of lucene terms for date fields that slow down queries drastically.

@clintongormley
Copy link

What version of ES?
How many times did you run it?
How many segments do you have?
What hardware are you running this on? (all your search times seem very slow for such a small index)

@clintongormley clintongormley added feedback_needed :Search/Search Search-related issues that do not fall into other categories labels Jan 14, 2016
@jrots
Copy link
Author

jrots commented Jan 14, 2016

  • ES: 2.1.1 - Lucene 5.3.1, -Xms28g -Xmx28g mlockall = true
  • 6 times consecutively for each query
  • Intel(R) Xeon(R) CPU X5650 @ 2.67GHz, 2 physical cpu's, 12 cores w hyperthreading, 24 processors in total
  • 64GB ram, 1TB SSD
  • segments : https://gist.github.com/jrots/d7020441bf33dca4b67a , 4 shards with 30 segments on each shard

@jimczi
Copy link
Contributor

jimczi commented Jan 14, 2016

And what is the average time when you remove the range filter/post_filter ? The post_filter vs filter are very similar in terms of implementation, the post_filter has to scan the inverted lists also.

@clintongormley
Copy link

@jrots It seems there may be a misunderstanding of what the post_filter does in Elasticsearch. The post filter is simply a filter that is applied to the query results AFTER aggregations have been calculated - it has nothing to do with executing the filter via the inverted index vs fielddata.

That said, something seems odd about how the query is executed - if the range query is the most costly, the bool query should figure that out and change the execution order to apply the terms queries first. We'll investigate.

Two questions:

  • Is the birthdate field mapped as a string or as a date?
  • Why do the number of results differ for the two queries? Are you actively indexing?

@jimczi
Copy link
Contributor

jimczi commented Jan 15, 2016

Why do the number of results differ for the two queries? Are you actively indexing?

@clintongormley I guess that the number of results differs because the post-filtered documents are not taken into account in the total hits. I opened #16021 because I don't know if it's expected or not.

@clintongormley
Copy link

@jimferenczi no - these counts should be the same whether post_filter is used or not. post_filter just affects which documents the aggs see.

@jrots
Copy link
Author

jrots commented Jan 15, 2016

I was still actively indexing at that moment - now that's done and I've also ran: force merge max_segments=1 to be sure that we're measuring the right things here.

with range on birthdate :

{"took":872,"timed_out":false,"_shards":{"total":4,"successful":4,"failed":0},"hits":{"total":18892}}
{"took":875,"timed_out":false,"_shards":{"total":4,"successful":4,"failed":0},"hits":{"total":18892}}
{"took":109,"timed_out":false,"_shards":{"total":4,"successful":4,"failed":0},"hits":{"total":18892}}

If I change the range of the birthdate with one day f.e. it's again consecutive 800ms+

leaving out the birthdate completely :

{"took":37,"timed_out":false,"_shards":{"total":4,"successful":4,"failed":0},"hits":{"total":39062}}
{"took":32,"timed_out":false,"_shards":{"total":4,"successful":4,"failed":0},"hits":{"total":39062}}

etc..
even changing some of the params in the query,

running with range on birthdate in post_filter,initial + consecutive

{"took":401,"timed_out":false,"_shards":{"total":4,"successful":4,"failed":0},"hits":{"total":18892}}
{"took":54,"timed_out":false,"_shards":{"total":4,"successful":4,"failed":0},"hits":{"total":18892}}
{"took":53,"timed_out":false,"_shards":{"total":4,"successful":4,"failed":0},"hits":{"total":18892}}

birthdate is stored as :

          "birthdate" : {
            "type" : "date",
            "format" : "YYYY-MM-dd"
          },

Wonder if I don't need time precision on my dates (like a birthdate f.e.),
if I'm not better of storing these dates as a fixed int number => floor(( (epochtime) - (timestamp birthdate) )/60/60/24)
This way it would generate less terms in the lucene index, the precision_step would also be 8 in that case, delivering faster responses.

@clintongormley
Copy link

I was still actively indexing at that moment

OK that explains the varying counts.

Wonder if I don't need time precision on my dates (like a birthdate f.e.), if I'm not better of storing these dates as a fixed int number => floor(( (epochtime) (timestamp birthdate) )/60/60/24)

Before going there... query execution order has had a big rewrite, and it sounds like it is choosing a suboptimal order. @jimferenczi is going to dive into it to see if he can find something that can be improved.

thanks for all the info

@jimczi
Copy link
Contributor

jimczi commented Jan 15, 2016

edited version of the comment which was wrong and misleading.
@jrots I'll try to reproduce.

@jimczi
Copy link
Contributor

jimczi commented Jan 18, 2016

@jrots I've tried to reproduce with a 40M documents index, 5 shards, each document contain one date field with a random date chosen between "1950-01-01" to "2000-12-31" and two long fields with different cardinalities (one big and one small).
I compared the post_filter+boolean query vs one big boolean query and the results are always the same, there is no gain with the post_filter. I checked with different type of range queries (big range like 1950 => 2000 and small ones like your example).
When you test the performance you can have some variations depending on the activity of your machine in terms of cpu and ram. Are you sure that you stopped the indexing and the queries when you checked the performance of each query ? Can you try to re-run your tests with more than 3 trials per query ?

@clintongormley
Copy link

@jrots could you provide a more complete recreation? or would it be possible to upload your index somewhere so that we could try it out?

@jrots
Copy link
Author

jrots commented Jan 19, 2016

Ok I'll have a look to have a testable index that doesn't contain too much private info

@jrots
Copy link
Author

jrots commented Feb 1, 2016

Hi guys,
sorry for the late reply.I retried the post_filter and filter on date ranges and it indeed have the same results.. (both quite slow to be honest). Must've been the indexing I was doing at that time that caused the fluctuation in search response times.

I think if you have a lot of documents and are doing a wide numeric range there can be a big slowdown. .

I fixed my problem by rewriting the following range query :

                    "range": {
                                "birthdate": {
                                    "gte": "1967-01-28",
                                    "lte": "1996-01-28"
                                }
                            }

to :

"filter": {
                "bool": {
                    "should": [
                        {
                            "range": {
                                "birthdate": {
                                    "gte": "1967-01-28",
                                    "lte": "1967-12-31"
                                }
                            }
                        },
                        {
                            "range": {
                                "birthdate": {
                                    "gte": "1996-01-01",
                                    "lte": "1996-01-28"
                                }
                            }
                        },
                        {
                            "terms": {
                                "country_birthdate_string": [
                                    "be1-1968",
                                    "be1-1969",
                                    "be0-1970",
                                    "be0-1980",
                                    "be1-1990",
                                    "be1-1991",
                                    "be1-1992",
                                    "be1-1993",
                                    "be1-1994",
                                    "be1-1995"
                                ]
                            }
                        }

0_<year rounded 0> => spans 10 years
1 => spans 1 year

I added a transform to the index mapping that generates these keywords automatically for me:

    "mappings" : {
      "person" : {
        "transform" : {
          "inline" : "ctx._source[\"country_birthdate_string\"] = [ctx._source[\"country\"] + \"0_\"+(int)Math.floor(Integer.parseInt(ctx._source[\"birthdate\"].substring(0,4))/10)*10, ctx._source[\"country\"] + \"1_\"+ctx._source[\"birthdate\"].substring(0,4)];",
          "lang" : "groovy"
        },
        "_all" : {
          "enabled" : false
        },
        "properties" : {
          "birthdate" : {
            "type" : "date",
            "format" : "YYYY-MM-dd"
          },
          "country" : {
            "type" : "string",
            "index" : "not_analyzed"
          },
          "country_birthdate_string" : {
            "type" : "string",
            "index" : "not_analyzed"
          },

I have a speed up times 10 doing it like this for my use case.
Maybe worth adding something like this higher up in lucene internal/ revisit the numeric tree lookup for large ranges. .
Regards
J

@clintongormley
Copy link

The new point field encoding coming in 5.0 should improve this situation. Closing

@ChristopheBoucaut
Copy link

ChristopheBoucaut commented Apr 6, 2017

Hi @jrots.

Did you test your use case with ES 5.* ? Can you confirm the improve ?

I have the same issue with ES 5.3 and I don't found a "native solution" in ES.

Regards.

@tvernum
Copy link
Contributor

tvernum commented Jun 28, 2017

@hwb1992 Please ask this question on our discuss forum where we can provide better support.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
>bug :Search/Search Search-related issues that do not fall into other categories
Projects
None yet
Development

No branches or pull requests

5 participants