ES 0.90 Multiple Numeric Range filters within boolean.should return incorrect results #2826

Closed
roytmana opened this issue Mar 28, 2013 · 7 comments

Comments

@roytmana

I have a build of 0.90 from source (3-4 days old), and it does not evaluate numeric range filters correctly when they are combined in bool.should.
The following query should return 35 hits (16 from the first range and 19 from the second). It works correctly in 0.20.5, but 0.90 returns the wrong number of hits, and the number changes with the order of the range filters in the bool filter (32 or 28, depending on which range is listed first).

When I remove either range from bool.should, the count is correct. The counts are also correct when I use the same ranges in a range facet.

{
  "query": {
    "match_all": {}
  },
  "from": 0,
  "filter": {
    "bool": {
      "should": [
        {
          "numeric_range": {
            "money.totals.obligationTotal": {
              "from": 20000000,
              "to": 50000000,
              "include_upper": false
            }
          }
        },
        {
          "numeric_range": {
            "money.totals.obligationTotal": {
              "from": 50000000,
              "to": 100000000,
              "include_upper": false
            }
          }
        }
      ]
    }
  }
}
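
For reference, the expected count of 35 follows from the two ranges being disjoint: both use `include_upper: false`, so a value of exactly 50000000 falls only in the second range. A minimal Python sketch of that reasoning, assuming a single-valued field and using made-up documents (the ids and values below are hypothetical, not the reporter's data):

```python
# Hypothetical documents (doc id -> obligationTotal); not the reporter's data.
docs = {
    1: 10_000_000,
    2: 20_000_000,
    3: 35_000_000,
    4: 49_999_999,
    5: 50_000_000,
    6: 75_000_000,
    7: 100_000_000,
}


def matches(value, lower, upper, include_lower=True, include_upper=True):
    """Mimic from/to with include_lower/include_upper for one numeric value."""
    above = value >= lower if include_lower else value > lower
    below = value <= upper if include_upper else value < upper
    return above and below


def range_hits(lower, upper):
    """Doc ids matching a range whose upper bound is excluded."""
    return {
        doc_id
        for doc_id, value in docs.items()
        if matches(value, lower, upper, include_upper=False)
    }


first = range_hits(20_000_000, 50_000_000)
second = range_hits(50_000_000, 100_000_000)

# A bool filter with only should clauses acts as an OR over the clauses.
should_hits = first | second

print(len(first), len(second), len(should_hits))  # prints "3 2 5"
```

Because the sets do not overlap, the union size equals the sum of the two counts, and it cannot depend on the order of the clauses.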
@martijnvg
Member

@roytmana Thanks for reporting this issue. This is a bug that manifests itself under certain circumstances (specific types of filters inside the bool filter that yield certain optimisations during filter execution). I will fix this asap.

@martijnvg
Member

@roytmana The bug is fixed by the following commit:
a89dde8

Also, in the above case it makes more sense to use the range filter instead of the numeric_range filter. The range filter should be faster in your case and is cached automatically.
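
As an illustration of that suggestion, the request from the report could be rebuilt with `range` in place of `numeric_range`, keeping the same bounds. This is only a sketch: the helper names `range_clause` and `build_request` are made up for the example, and the request body is printed as JSON rather than sent to a cluster.

```python
import json

FIELD = "money.totals.obligationTotal"


def range_clause(lower, upper, kind="range"):
    """One filter clause over FIELD with the upper bound excluded.

    `kind` is the filter name: "range" or "numeric_range".
    """
    return {kind: {FIELD: {"from": lower, "to": upper, "include_upper": False}}}


def build_request(clauses):
    """Wrap the clauses in the same match_all + bool.should body as the report."""
    return {
        "query": {"match_all": {}},
        "from": 0,
        "filter": {"bool": {"should": clauses}},
    }


request = build_request(
    [
        range_clause(20000000, 50000000),
        range_clause(50000000, 100000000),
    ]
)

print(json.dumps(request, indent=2))
```

Passing `kind="numeric_range"` reproduces the original request, which makes it easy to compare the two filters on the same data.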

@roytmana
Author

@martijnvg Thanks a lot. The docs say the numeric_range filter is more memory intensive but faster. My use case is:

  1. Facet on the entire range of obligationTotal values using a range facet with about a dozen ranges from 0 to infinity.
  2. The user selects one or more facet ranges, and I apply a filter based on the selection.

Do you think range or numeric_range is faster here, given that all values have probably already been loaded for faceting?
Also, does the regular range filter recognize and handle numeric fields properly? I want numerical ranges, not lexicographical ones.

Thanks,
alex
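
The two-step use case above can be sketched roughly as follows. The bucket boundaries, the facet name, and the helper functions are all hypothetical; only the field name and the `from`/`to`/`include_upper` keys come from the query in the report.

```python
FIELD = "money.totals.obligationTotal"

# Hypothetical bucket boundaries; the real application uses about a dozen.
BOUNDARIES = [0, 20_000_000, 50_000_000, 100_000_000]


def facet_ranges(boundaries):
    """Consecutive from/to buckets plus one open-ended bucket at the top."""
    ranges = [{"from": lo, "to": hi} for lo, hi in zip(boundaries, boundaries[1:])]
    ranges.append({"from": boundaries[-1]})
    return ranges


def facet_request(name, boundaries):
    """Step 1: a range facet over FIELD (goes under "facets" in the request)."""
    return {name: {"range": {"field": FIELD, "ranges": facet_ranges(boundaries)}}}


def selection_filter(selected, kind="numeric_range"):
    """Step 2: OR together one range clause per bucket the user selected."""
    clauses = []
    for bucket in selected:
        bounds = {"from": bucket["from"]}
        if "to" in bucket:
            # Exclude the upper bound so adjacent buckets do not overlap.
            bounds["to"] = bucket["to"]
            bounds["include_upper"] = False
        clauses.append({kind: {FIELD: bounds}})
    return {"bool": {"should": clauses}}


buckets = facet_ranges(BOUNDARIES)
facets = facet_request("obligation_ranges", BOUNDARIES)
# Suppose the user picked the second and third buckets.
chosen_filter = selection_filter(buckets[1:3])
```

With these boundaries, picking the second and third buckets yields the same two clauses as the query in the original report.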

@martijnvg
Member

@roytmana The range filter also works for numeric fields. Since you're faceting on these fields, it may make sense to use the numeric_range filter; just make sure you execute it in combination with other filters (like a term filter).
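
One way to read "in combination with other filters" is a bool filter whose must clauses hold both a term filter and the numeric_range filter. The sketch below assumes that reading; the term field `status` and its value are invented for the example and do not appear in the report.

```python
FIELD = "money.totals.obligationTotal"


def combined_filter(term_field, term_value, lower, upper):
    """A bool filter that requires both a term match and a numeric range match."""
    return {
        "bool": {
            "must": [
                {"term": {term_field: term_value}},
                {
                    "numeric_range": {
                        FIELD: {"from": lower, "to": upper, "include_upper": False}
                    }
                },
            ]
        }
    }


# "status" / "active" are hypothetical; substitute a real field from the index.
example = combined_filter("status", "active", 20_000_000, 50_000_000)
```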

@roytmana
Author

The user may well select just one of my range facet values, and I will then filter on that range alone. Do you think that may be an issue?

@martijnvg
Member

I expect the range filter to execute a bit faster in the case it is the only filter driving the search request. I don't expect using the numeric_range filter to be a problem.

@roytmana
Author

many thanks!

On Fri, Mar 29, 2013 at 1:58 PM, Martijn van Groningen wrote:

> I expect the range filter to execute a bit faster in the case it is the only filter driving the search request. I don't expect using the numeric_range filter to be a problem.
