
LRU cache guarantee to keep size under limit #3510

Conversation

@xiaoyao1991 (Member) commented Sep 26, 2016

@xiaoyao1991 xiaoyao1991 force-pushed the xiaoyao1991:lru_cache_guarantee_to_keep_size_under_limit branch from 6c117fb to c0f8b08 Sep 26, 2016
Review comment on the diff (code context):

);
numBytes.addAndGet(key.remaining() + value.length);

if (numBytes.get() > sizeInBytes) {

@guobingkun (Contributor) commented Sep 26, 2016

Remove the "if" condition since it's covered by the "while".

@guobingkun (Contributor) left a review

LGTM

@binlijin (Contributor) commented Sep 26, 2016

LGTM

@ben-manes commented Sep 26, 2016

Alternatively, ConcurrentLinkedHashMap is a concurrent LRU cache that supports weighted entries. It might be a good stop-gap that offers much of the Caffeine-based extension's performance, minus the improved eviction policy. The cache was fairly successful, e.g. adopted by Cassandra and ported into Guava (though at a large performance loss). It is also Java 6 compatible.

@xiaoyao1991 (Member, Author) commented Sep 26, 2016

@ben-manes Thanks for the suggestion. Is it easy to implement the eviction policy in ConcurrentLinkedHashMap?

@ben-manes commented Sep 27, 2016

CLHM is an LRU policy on top of ConcurrentHashMap. It's intentionally bare-bones because it took me a long time to stumble on the right design ideas. The challenge with LRU is that every read is a write to global, shared state. The solution is to use a database-style commit log to record and replay events. We then tried to retrofit the ideas onto Guava, and I rewrote it in Caffeine.

So if you're asking whether it's easy to use here for an LRU cache, then yes.

ConcurrentMap<K, V> cache = new ConcurrentLinkedHashMap.Builder<K, V>()
    .maximumWeightedCapacity(1000)
    .weigher(...)
    .build();

If you're asking whether it's easy to implement the ideas from scratch, then it's moderate. But doing it with all the features and tuning in Caffeine was hard. However, that also replaced LRU with a modern policy, which is one of the reasons Druid 0.9.2 adopted it as an extension. Maybe when Druid is JDK8-based they'll make it the default.
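
For the byte-weighted use case discussed in this PR, a fuller sketch might look like the following. This is a hypothetical adaptation, not code from the PR; it assumes the com.googlecode.concurrentlinkedhashmap artifact and weighs each entry by key bytes plus value bytes, mirroring the accounting in the patch above.

import java.nio.ByteBuffer;
import java.util.concurrent.ConcurrentMap;

import com.googlecode.concurrentlinkedhashmap.ConcurrentLinkedHashMap;
import com.googlecode.concurrentlinkedhashmap.EntryWeigher;

public class ClhmCacheSketch {
  public static void main(String[] args) {
    final long sizeInBytes = 10L * 1024 * 1024;  // hypothetical 10 MiB limit

    // Weigh each entry by its approximate byte footprint, mirroring the
    // key.remaining() + value.length accounting in the patch above.
    EntryWeigher<ByteBuffer, byte[]> weigher = new EntryWeigher<ByteBuffer, byte[]>() {
      @Override
      public int weightOf(ByteBuffer key, byte[] value) {
        return key.remaining() + value.length;
      }
    };

    ConcurrentMap<ByteBuffer, byte[]> cache =
        new ConcurrentLinkedHashMap.Builder<ByteBuffer, byte[]>()
            .maximumWeightedCapacity(sizeInBytes)
            .weigher(weigher)
            .build();

    ByteBuffer key = ByteBuffer.wrap(new byte[]{1, 2, 3});
    cache.put(key, new byte[]{4, 5, 6});
    System.out.println(cache.get(key).length);  // prints 3
  }
}

With this approach the weighted-size bookkeeping and eviction live inside the map itself, so the calling code needs no eviction loop of its own.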

@drcrallen (Contributor) commented Sep 27, 2016

IMHO we shouldn't try to reinvent the wheel here. The local cache in Druid has tons of problems that make it unsuitable once you get to any appreciable scale, which is why I took @ben-manes's advice and checked out Caffeine, which is working great for us.

The simple local cache does, however, provide a great way to spin up a small Druid cluster with minimal overhead.

@drcrallen (Contributor) commented Sep 27, 2016

I'm not saying it shouldn't be fixed, but I think the focus should be on making the default local cache more functional and usable, rather than on trying to write the next big cache.

@ben-manes commented Sep 27, 2016

@drcrallen oh I don't think anyone is suggesting that :)

It was unclear whether @xiaoyao1991's question was about usage or implementation. A LinkedHashMap is extended to become a cache, whereas CLHM simply is one (as any other form of a concurrent LinkedHashMap makes little sense). But I suspect he meant the former rather than the internal details.

Anyway, I think @xiaoyao1991's changes are good and wanted to suggest CLHM as an option if Druid needs to remain JDK7-based.

@drcrallen (Contributor) commented Sep 27, 2016

OK, so this forgoes LinkedHashMap's removeEldestEntry hook in order to ensure the cache size never exceeds the configured value within the put call itself. More precisely, it evicts until either there is enough room or the cache is empty.

Thread safety is retained through io.druid.client.cache.MapCache#clearLock, but the contract for LinkedHashMap already mandates external synchronization, so that's OK.

Looks good from here.
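
As a side note on the forgone hook: removeEldestEntry is consulted once per insertion and can evict at most one entry, so it can bound an entry count but cannot by itself keep a variable byte total under a limit, which is why the eviction loop in put is needed. A hypothetical illustration of the hook-based, count-bounded variant (not code from this PR):

import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical illustration of the forgone approach: removeEldestEntry
// is consulted once per put() and can evict only the single eldest entry,
// which suffices for an entry-count bound but not for a byte bound.
class CountBoundedLruMap<K, V> extends LinkedHashMap<K, V> {
  private final int maxEntries;

  CountBoundedLruMap(int maxEntries) {
    super(16, 0.75f, true);         // accessOrder = true -> LRU ordering
    this.maxEntries = maxEntries;
  }

  @Override
  protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
    return size() > maxEntries;     // at most one eviction per insertion
  }

  public static void main(String[] args) {
    CountBoundedLruMap<String, Integer> m = new CountBoundedLruMap<>(2);
    m.put("a", 1);
    m.put("b", 2);
    m.put("c", 3);                  // "a" is evicted
    System.out.println(m.keySet()); // prints [b, c]
  }
}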

@drcrallen drcrallen added the Bug label Sep 27, 2016
@drcrallen drcrallen added this to the 0.9.3 milestone Sep 27, 2016
@drcrallen (Contributor) commented Sep 27, 2016

👍

it's not letting me start and approve a review :-P

@drcrallen (Contributor) commented Sep 27, 2016

Thanks for the contribution, @xiaoyao1991. Would you mind filling out the Druid CLA? http://druid.io/community/cla.html

@fjy (Contributor) commented Sep 27, 2016

@drcrallen he should be covered already

@xiaoyao1991 (Member, Author) commented Sep 28, 2016

@drcrallen Yes, the idea is just as you described. Thread safety is guaranteed through the synchronized map (see the sketch after this comment).

I've just filled out the CLA.

Need me to squash the commits?
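
A minimal sketch of the synchronized-map arrangement mentioned above (hypothetical; it assumes an access-ordered LinkedHashMap wrapped via Collections.synchronizedMap, in the spirit of io.druid.client.cache.MapCache):

import java.nio.ByteBuffer;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;

public class SynchronizedLruSketch {
  public static void main(String[] args) {
    // An access-ordered LinkedHashMap is not thread-safe on its own, per
    // its documented contract, so wrap it for external synchronization.
    Map<ByteBuffer, byte[]> lru =
        new LinkedHashMap<ByteBuffer, byte[]>(16, 0.75f, true);
    Map<ByteBuffer, byte[]> cache = Collections.synchronizedMap(lru);

    ByteBuffer key = ByteBuffer.wrap(new byte[]{1});
    cache.put(key, new byte[]{2, 3});
    System.out.println(cache.get(key).length);  // prints 2
  }
}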

@drcrallen (Contributor) commented Sep 28, 2016

Nope! GitHub will squash them.

@drcrallen merged commit 91e6ab4 into apache:master Sep 28, 2016

1 check passed: continuous-integration/travis-ci/pr (The Travis CI build passed)
fundead pushed a commit to fundead/druid that referenced this pull request Dec 7, 2016
* LRU cache guarantee to keep size under limit

* address comments

* fix failed tests in jdk7
dgolitsyn added a commit to metamx/druid that referenced this pull request Feb 14, 2017
* LRU cache guarantee to keep size under limit

* address comments

* fix failed tests in jdk7