Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Query DSL: Cache range filter on date field by default #7114

Closed
clintongormley opened this issue Jul 31, 2014 · 6 comments · Fixed by #7122
Closed

Query DSL: Cache range filter on date field by default #7114

clintongormley opened this issue Jul 31, 2014 · 6 comments · Fixed by #7122

Comments

@clintongormley
Copy link

A range filter on a date field with a numeric from/to value is not cached by default:

DELETE /test 

PUT /test/t/1
{
  "date": "2014-01-01"
}

GET /_validate/query?explain
{
  "query": {
    "filtered": {
      "filter": {
        "range": {
          "date": {
            "from": 0
          }
        }
      }
    }
  }
}

Returns:

"explanation": "ConstantScore(no_cache(date:[0 TO *]))"
@dadoonet
Copy link
Member

Indeed, in DateFieldMapper:

        if (lowerTerm != null) {
            if (lowerTerm instanceof Number) {
                lowerVal = ((Number) lowerTerm).longValue();
            } else {
                String value = convertToString(lowerTerm);
                cache = explicitCaching || !hasNowExpressionWithNoRounding(value);
                lowerVal = parseToMilliseconds(value, context, false);
            }
        }
        if (upperTerm != null) {
            if (upperTerm instanceof Number) {
                upperVal = ((Number) upperTerm).longValue();
            } else {
                String value = convertToString(upperTerm);
                cache = explicitCaching || !hasNowExpressionWithNoRounding(value);
                upperVal = parseToMilliseconds(value, context, includeUpper);
            }
        }

@dadoonet
Copy link
Member

dadoonet commented Aug 6, 2014

While looking at this issue, I found that if you use a now as the from value and a constant value as to, the range is cached although I think it should not:

DELETE /test 

PUT /test/t/1
{
  "date": "2014-01-01"
}

GET /test/_validate/query?explain
{
  "query": {
    "filtered": {
      "filter": {
        "range": {
          "date": {
            "from": "now",
            "to":"now/d+d"
          }
        }
      }
    }
  }
}

gives:

"explanations": [
      {
         "index": "test",
         "valid": true,
         "explanation": "ConstantScore(cache(date:[1407341732658 TO 1407456000000]))"
      }
   ]

@clintongormley Do you agree that this should be fixed as well?

@clintongormley
Copy link
Author

@dadoonet yes i do

@dadoonet
Copy link
Member

dadoonet commented Aug 7, 2014

Here are the results of my findings so far (test agains master branch so before applying PR #7122):

I created a range filter like this:

{
    "constant_score" : {
        "filter" : {
            "range" : {
                "born" : {
                    "gte": <GTE>,
                    "lte": <LTE>
                },
                "_cache" : <CACHE IF NOT NULL>
            }
        }
    }
}

Failures

The most critical ones performance wise are:

[WARN ] gte [null], lte [1577836800], _cache [null] should be cached
[WARN ] gte [1325376000], lte [null], _cache [null] should be cached
[WARN ] gte [1325376000], lte [1577836800], _cache [null] should be cached
[WARN ] gte [now], lte [2012-01-01], _cache [null] should not be cached

Those are minor issues performance wise.

[WARN ] gte [now/d], lte [null], _cache [false] should not be cached
[WARN ] gte [null], lte [now/d], _cache [false] should not be cached
[WARN ] gte [now/d], lte [now/d], _cache [false] should not be cached
[WARN ] gte [2012-01-01], lte [null], _cache [false] should not be cached
[WARN ] gte [null], lte [2012-01-01], _cache [false] should not be cached
[WARN ] gte [2012-01-01], lte [2012-01-01], _cache [false] should not be cached
[WARN ] gte [now], lte [2012-01-01], _cache [false] should not be cached
[WARN ] gte [2012-01-01], lte [now/d], _cache [false] should not be cached
[WARN ] gte [now/d], lte [2012-01-01], _cache [false] should not be cached
[WARN ] gte [1325376000], lte [now/d], _cache [false] should not be cached
[WARN ] gte [now/d], lte [1577836800], _cache [false] should not be cached

Extreme use case: I don't know if we should cache or not this one. It produces a range filter: born:[* TO *]. Should this be cached? @clintongormley WDYT?

[WARN ] gte [null], lte [null], _cache [null] should be cached

Passing tests

For information, here are the successful tests:

[INFO ] gte [null], lte [null], _cache [true] is cached
[INFO ] gte [null], lte [null], _cache [false] is not cached
[INFO ] gte [now], lte [null], _cache [null] is not cached
[INFO ] gte [null], lte [now], _cache [null] is not cached
[INFO ] gte [now], lte [now], _cache [null] is not cached
[INFO ] gte [now/d], lte [null], _cache [null] is cached
[INFO ] gte [null], lte [now/d], _cache [null] is cached
[INFO ] gte [now/d], lte [now/d], _cache [null] is cached
[INFO ] gte [2012-01-01], lte [null], _cache [null] is cached
[INFO ] gte [null], lte [2012-01-01], _cache [null] is cached
[INFO ] gte [2012-01-01], lte [2012-01-01], _cache [null] is cached
[INFO ] gte [2012-01-01], lte [now], _cache [null] is not cached
[INFO ] gte [2012-01-01], lte [now/d], _cache [null] is cached
[INFO ] gte [now/d], lte [2012-01-01], _cache [null] is cached
[INFO ] gte [now], lte [1577836800], _cache [null] is not cached
[INFO ] gte [1325376000], lte [now], _cache [null] is not cached
[INFO ] gte [1325376000], lte [now/d], _cache [null] is cached
[INFO ] gte [now/d], lte [1577836800], _cache [null] is cached
[INFO ] gte [now], lte [null], _cache [true] is cached
[INFO ] gte [null], lte [now], _cache [true] is cached
[INFO ] gte [now], lte [now], _cache [true] is cached
[INFO ] gte [now/d], lte [null], _cache [true] is cached
[INFO ] gte [null], lte [now/d], _cache [true] is cached
[INFO ] gte [now/d], lte [now/d], _cache [true] is cached
[INFO ] gte [2012-01-01], lte [null], _cache [true] is cached
[INFO ] gte [null], lte [2012-01-01], _cache [true] is cached
[INFO ] gte [2012-01-01], lte [2012-01-01], _cache [true] is cached
[INFO ] gte [now], lte [2012-01-01], _cache [true] is cached
[INFO ] gte [2012-01-01], lte [now], _cache [true] is cached
[INFO ] gte [2012-01-01], lte [now/d], _cache [true] is cached
[INFO ] gte [now/d], lte [2012-01-01], _cache [true] is cached
[INFO ] gte [null], lte [1577836800], _cache [true] is cached
[INFO ] gte [1325376000], lte [null], _cache [true] is cached
[INFO ] gte [1325376000], lte [1577836800], _cache [true] is cached
[INFO ] gte [now], lte [1577836800], _cache [true] is cached
[INFO ] gte [1325376000], lte [now], _cache [true] is cached
[INFO ] gte [1325376000], lte [now/d], _cache [true] is cached
[INFO ] gte [now/d], lte [1577836800], _cache [true] is cached
[INFO ] gte [now], lte [null], _cache [false] is not cached
[INFO ] gte [null], lte [now], _cache [false] is not cached
[INFO ] gte [now], lte [now], _cache [false] is not cached
[INFO ] gte [2012-01-01], lte [now], _cache [false] is not cached
[INFO ] gte [null], lte [1577836800], _cache [false] is not cached
[INFO ] gte [1325376000], lte [null], _cache [false] is not cached
[INFO ] gte [1325376000], lte [1577836800], _cache [false] is not cached
[INFO ] gte [now], lte [1577836800], _cache [false] is not cached
[INFO ] gte [1325376000], lte [now], _cache [false] is not cached

@clintongormley
Copy link
Author

Extreme use case: I don't know if we should cache or not this one. It produces a range filter: born:[* TO *]. Should this be cached? @clintongormley WDYT?

This should be rewritten as a match_all filter instead.

@dadoonet
Copy link
Member

dadoonet commented Aug 8, 2014

After talking with @martijnvg and @clintongormley, I changed the test scenario.
Basically, even if user explicitly choose to cache a filter which contains now, we don't cache it.

Here are new list of failing tests:

[WARN ][org.elasticsearch.index.query] gte [now], lte [2012-01-01], _cache [null] should not be cached
[WARN ][org.elasticsearch.index.query] gte [null], lte [1577836800], _cache [null] should be cached
[WARN ][org.elasticsearch.index.query] gte [1325376000], lte [null], _cache [null] should be cached
[WARN ][org.elasticsearch.index.query] gte [1325376000], lte [1577836800], _cache [null] should be cached
[WARN ][org.elasticsearch.index.query] gte [now], lte [null], _cache [true] should not be cached
[WARN ][org.elasticsearch.index.query] gte [null], lte [now], _cache [true] should not be cached
[WARN ][org.elasticsearch.index.query] gte [now], lte [now], _cache [true] should not be cached
[WARN ][org.elasticsearch.index.query] gte [now], lte [2012-01-01], _cache [true] should not be cached
[WARN ][org.elasticsearch.index.query] gte [2012-01-01], lte [now], _cache [true] should not be cached
[WARN ][org.elasticsearch.index.query] gte [now], lte [1577836800], _cache [true] should not be cached
[WARN ][org.elasticsearch.index.query] gte [1325376000], lte [now], _cache [true] should not be cached

About:

Extreme use case: I don't know if we should cache or not this one. It produces a range filter: born:[* TO *].

#7204 is opened for that.

dadoonet added a commit to dadoonet/elasticsearch that referenced this issue Aug 12, 2014
A range filter on a date field with a numeric `from`/`to` value is **not** cached by default:

    DELETE /test

    PUT /test/t/1
    {
      "date": "2014-01-01"
    }

    GET /_validate/query?explain
    {
      "query": {
        "filtered": {
          "filter": {
            "range": {
              "date": {
                "from": 0
              }
            }
          }
        }
      }
    }

Returns:

    "explanation": "ConstantScore(no_cache(date:[0 TO *]))"

This patch fixes as well not caching `from`/`to` when using `now` value not rounded.
Previously, a query like:

    GET /_validate/query?explain
    {
      "query": {
        "filtered": {
          "filter": {
            "range": {
              "date": {
                "from": "now"
                "to": "now/d+1"
              }
            }
          }
        }
      }
    }

was cached.

Also, this patch does not cache anymore `now` even if the user asked for caching it.
As it won't be cached at all by definition.

Added as well tests for all possible combinations.

Closes elastic#7114.
dadoonet added a commit that referenced this issue Aug 12, 2014
A range filter on a date field with a numeric `from`/`to` value is **not** cached by default:

    DELETE /test

    PUT /test/t/1
    {
      "date": "2014-01-01"
    }

    GET /_validate/query?explain
    {
      "query": {
        "filtered": {
          "filter": {
            "range": {
              "date": {
                "from": 0
              }
            }
          }
        }
      }
    }

Returns:

    "explanation": "ConstantScore(no_cache(date:[0 TO *]))"

This patch fixes as well not caching `from`/`to` when using `now` value not rounded.
Previously, a query like:

    GET /_validate/query?explain
    {
      "query": {
        "filtered": {
          "filter": {
            "range": {
              "date": {
                "from": "now"
                "to": "now/d+1"
              }
            }
          }
        }
      }
    }

was cached.

Also, this patch does not cache anymore `now` even if the user asked for caching it.
As it won't be cached at all by definition.

Added as well tests for all possible combinations.

Closes #7114.

(cherry picked from commit 9e68687)
dadoonet added a commit that referenced this issue Aug 12, 2014
A range filter on a date field with a numeric `from`/`to` value is **not** cached by default:

    DELETE /test

    PUT /test/t/1
    {
      "date": "2014-01-01"
    }

    GET /_validate/query?explain
    {
      "query": {
        "filtered": {
          "filter": {
            "range": {
              "date": {
                "from": 0
              }
            }
          }
        }
      }
    }

Returns:

    "explanation": "ConstantScore(no_cache(date:[0 TO *]))"

This patch fixes as well not caching `from`/`to` when using `now` value not rounded.
Previously, a query like:

    GET /_validate/query?explain
    {
      "query": {
        "filtered": {
          "filter": {
            "range": {
              "date": {
                "from": "now"
                "to": "now/d+1"
              }
            }
          }
        }
      }
    }

was cached.

Also, this patch does not cache anymore `now` even if the user asked for caching it.
As it won't be cached at all by definition.

Added as well tests for all possible combinations.

Closes #7114.

(cherry picked from commit 9e68687)
dadoonet added a commit that referenced this issue Aug 12, 2014
A range filter on a date field with a numeric `from`/`to` value is **not** cached by default:

    DELETE /test

    PUT /test/t/1
    {
      "date": "2014-01-01"
    }

    GET /_validate/query?explain
    {
      "query": {
        "filtered": {
          "filter": {
            "range": {
              "date": {
                "from": 0
              }
            }
          }
        }
      }
    }

Returns:

    "explanation": "ConstantScore(no_cache(date:[0 TO *]))"

This patch fixes as well not caching `from`/`to` when using `now` value not rounded.
Previously, a query like:

    GET /_validate/query?explain
    {
      "query": {
        "filtered": {
          "filter": {
            "range": {
              "date": {
                "from": "now"
                "to": "now/d+1"
              }
            }
          }
        }
      }
    }

was cached.

Also, this patch does not cache anymore `now` even if the user asked for caching it.
As it won't be cached at all by definition.

Added as well tests for all possible combinations.

Closes #7114.

(cherry picked from commit 9e68687)
dadoonet added a commit that referenced this issue Sep 8, 2014
A range filter on a date field with a numeric `from`/`to` value is **not** cached by default:

    DELETE /test

    PUT /test/t/1
    {
      "date": "2014-01-01"
    }

    GET /_validate/query?explain
    {
      "query": {
        "filtered": {
          "filter": {
            "range": {
              "date": {
                "from": 0
              }
            }
          }
        }
      }
    }

Returns:

    "explanation": "ConstantScore(no_cache(date:[0 TO *]))"

This patch fixes as well not caching `from`/`to` when using `now` value not rounded.
Previously, a query like:

    GET /_validate/query?explain
    {
      "query": {
        "filtered": {
          "filter": {
            "range": {
              "date": {
                "from": "now"
                "to": "now/d+1"
              }
            }
          }
        }
      }
    }

was cached.

Also, this patch does not cache anymore `now` even if the user asked for caching it.
As it won't be cached at all by definition.

Added as well tests for all possible combinations.

Closes #7114.
mute pushed a commit to mute/elasticsearch that referenced this issue Jul 29, 2015
A range filter on a date field with a numeric `from`/`to` value is **not** cached by default:

    DELETE /test

    PUT /test/t/1
    {
      "date": "2014-01-01"
    }

    GET /_validate/query?explain
    {
      "query": {
        "filtered": {
          "filter": {
            "range": {
              "date": {
                "from": 0
              }
            }
          }
        }
      }
    }

Returns:

    "explanation": "ConstantScore(no_cache(date:[0 TO *]))"

This patch fixes as well not caching `from`/`to` when using `now` value not rounded.
Previously, a query like:

    GET /_validate/query?explain
    {
      "query": {
        "filtered": {
          "filter": {
            "range": {
              "date": {
                "from": "now"
                "to": "now/d+1"
              }
            }
          }
        }
      }
    }

was cached.

Also, this patch does not cache anymore `now` even if the user asked for caching it.
As it won't be cached at all by definition.

Added as well tests for all possible combinations.

Closes elastic#7114.

(cherry picked from commit 9e68687)
mute pushed a commit to mute/elasticsearch that referenced this issue Jul 29, 2015
A range filter on a date field with a numeric `from`/`to` value is **not** cached by default:

    DELETE /test

    PUT /test/t/1
    {
      "date": "2014-01-01"
    }

    GET /_validate/query?explain
    {
      "query": {
        "filtered": {
          "filter": {
            "range": {
              "date": {
                "from": 0
              }
            }
          }
        }
      }
    }

Returns:

    "explanation": "ConstantScore(no_cache(date:[0 TO *]))"

This patch fixes as well not caching `from`/`to` when using `now` value not rounded.
Previously, a query like:

    GET /_validate/query?explain
    {
      "query": {
        "filtered": {
          "filter": {
            "range": {
              "date": {
                "from": "now"
                "to": "now/d+1"
              }
            }
          }
        }
      }
    }

was cached.

Also, this patch does not cache anymore `now` even if the user asked for caching it.
As it won't be cached at all by definition.

Added as well tests for all possible combinations.

Closes elastic#7114.

(cherry picked from commit 9e68687)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants