You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
Druid Broker Result Level Cache Not Working Properly When Different Query Intervals Cover the Same Set of Segments.
Affected Version
0.13.0-incubating
Description
We have a segment with Segment Granularity of DAY and Query Granularity of HOUR.
When we send the same queries with only a different query interval, the result level cache will return the wrong result.
Example: QueryA and QueryB has same query body.
QueryA Interval: 2019-03-15T14:00:00/ 2019-03-16T14:00:00
QueryB Interval: 2019-03-15T18:00:00/ 2019-03-16T18:00:00
Response:
QueryA gets the correct result.
QueryB gets the result of QueryA.
Some Investigations:
Result level cache will use the query without interval as key. When the key match, the E-Tag is used to check if the cached result should be returned. However, E-Tag is only calculated with the Identifiers of involved segments.
The text was updated successfully, but these errors were encountered:
I looked into this a bit and believe the ETag calculation needs to be re-thought. It has the bug you mentioned - the one you mentioned, where it is calculated based on segment identifiers (which include the entire segment interval) and does not include a sub-interval within a segment. This causes the behavior you are seeing, and could be fixed by mixing in the SegmentDescriptor interval to the ETag.
But a potential larger bug is that it is leveraging QueryToolChest.computeCacheKey, which is designed for segment-level caching and, importantly, only includes query parameters that might affect the segment-level results. This is done to promote higher cache hit rates. For example, it doesn't include things like:
grandTotal or postAggregations for timeseries
any non-sorting postAggregations for topN
having, subtotalsSpec, postAggregations, or limitSpec for groupBy (although the lack of limitSpec is a bug, I think, since groupBy supports limit push down in some cases now)
I think this means we need a new method in the QueryToolChest for computing result-level cache keys.
the lack of limitSpec is a bug, I think, since groupBy supports limit push down in some cases now
Ah, maybe it's not. I don't remember if limit push down goes all the way to the per-segment results, or if it only affects the merge buffer. If it's the latter, it doesn't need to be in the per-segment cache key.
Druid Broker Result Level Cache Not Working Properly When Different Query Intervals Cover the Same Set of Segments.
Affected Version
0.13.0-incubating
Description
We have a segment with Segment Granularity of DAY and Query Granularity of HOUR.
When we send the same queries with only a different query interval, the result level cache will return the wrong result.
Example: QueryA and QueryB has same query body.
QueryA Interval: 2019-03-15T14:00:00/ 2019-03-16T14:00:00
QueryB Interval: 2019-03-15T18:00:00/ 2019-03-16T18:00:00
Involved Segments:
QueryA: segment2019-03-15_2019-03-16
QueryB: segment2019-03-15_2019-03-16
Response:
QueryA gets the correct result.
QueryB gets the result of QueryA.
Some Investigations:
Result level cache will use the query without interval as key. When the key match, the E-Tag is used to check if the cached result should be returned. However, E-Tag is only calculated with the Identifiers of involved segments.
The text was updated successfully, but these errors were encountered: