Cache: Add maxEntrySize config, make groupBy cacheable by default. #5108
Conversation
The idea is that this makes it more feasible to cache query types that can potentially generate large result sets, like groupBy and select, without fear of writing too much to the cache per query. Includes a refactor of the cache population code in CachingQueryRunner and CachingClusteredClient, such that they now use the same CachePopulator interface with two implementations: one for foreground and one for background. The main reason for splitting the foreground / background impls is that the foreground impl can have a more efficient implementation of maxEntrySize: it can stop retaining subvalues for the cache early.
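To make the early-exit idea concrete, here is a hypothetical sketch of a foreground populator. The class and method names echo the PR, but the signatures are simplified for illustration and do not match Druid's actual CachePopulator API:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

// Illustrative sketch only: a foreground cache populator that serializes
// results on the query thread and stops retaining subvalues as soon as the
// running total exceeds maxEntrySize.
class ForegroundCachePopulator {
  private final long maxEntrySize;
  private boolean oversized = false;

  ForegroundCachePopulator(long maxEntrySize) {
    this.maxEntrySize = maxEntrySize;
  }

  // Pass results through to the caller while accumulating serialized
  // subvalues for the cache. Once the total exceeds maxEntrySize, drop the
  // buffer and skip the cache write entirely -- the early exit that a
  // background (deferred) impl cannot do as cheaply.
  <T> List<T> wrap(List<T> results, Function<T, byte[]> serialize) {
    List<byte[]> retained = new ArrayList<>();
    long totalSize = 0;
    for (T value : results) {
      if (!oversized) {
        byte[] bytes = serialize.apply(value);
        totalSize += bytes.length;
        if (totalSize > maxEntrySize) {
          retained.clear();  // entry is too large; stop buffering subvalues
          oversized = true;
        } else {
          retained.add(bytes);
        }
      }
    }
    if (!oversized) {
      // a real impl would concatenate `retained` and write it to the cache
    }
    return results;
  }

  boolean wasOversized() {
    return oversized;
  }
}
```

The key property is that query results still flow to the caller unchanged; only the cache write is abandoned when the entry grows past the limit.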
Removed WIP tag -- I have added tests and resolved conflicts.
overall lgtm 👍
@@ -52,10 +52,13 @@
private int cacheBulkMergeLimit = Integer.MAX_VALUE;

@JsonProperty
private int resultLevelCacheLimit = Integer.MAX_VALUE;

private int maxEntrySize = 1_000_000;
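Since the field sits on CacheConfig alongside the other @JsonProperty-mapped cache settings, the limit should be settable through the usual cache runtime properties. A hedged example (the exact property prefix is an assumption here and depends on node type, e.g. broker vs. historical):

```properties
# Assumed runtime.properties fragment: cap cache entries at 1 MB (the default
# shown in the diff above), following the same druid.broker.cache.* naming as
# the existing useCache / populateCache settings.
druid.broker.cache.useCache=true
druid.broker.cache.populateCache=true
druid.broker.cache.maxEntrySize=1000000
</imports>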
What is a typical cache entry size? (how did you pick this number)
In our cluster the average entry size is about 1KB, but I am not sure how representative that is. I chose 1MB because it is a large number that should still block egregiously large cache entries.
@gianm please fix the licenses.
Oops, missed those. I pushed with new license headers.
LGTM after Travis.
Also includes: put/ok, put/errors, put/oversized.