Eviction listener #59

Closed
ksperling opened this issue Mar 31, 2017 · 15 comments

ksperling commented Mar 31, 2017

Similar to #38, it would be really useful to have an eviction listener.

In my use case I'm using an in-memory Cache<MyKey, File> to manage an on-disk cache of files, so whenever an entry is expired, removed, or evicted from the cache I need to delete the corresponding file.

I'm currently using Caffeine but would like to try cache2k to see if it supports our access patterns better.

Member

cruftex commented Apr 1, 2017

Thanks for the use case. At the current state an eviction listener isn't too complicated to provide, but I am hesitating: once it is there, I have to support it forever.

I am a little in doubt whether "this is it". Maybe it's an XY problem?

Thinking deeper about your use case, I get the impression that you actually want to use the eviction algorithm and not the cache. So one question is what data is actually in memory and what features the "whole cache" adds. Maybe a reusable library for the eviction is a better alternative?

If it's an interactive application, there is also the restart problem. With a restart the whole history is lost. I don't know how critical this is for your application. If this use case is supported, the next logical request would be to somehow preserve/persist the cache state over a restart. What happens to the disk files on shutdown?

Thoughts?

Author

ksperling commented Apr 1, 2017

Yeah, I have thought of that. However, it is not just the eviction that I am using; it's also the read-through capability that guarantees that multiple concurrent requests for the same entry result in only a single attempt to load it. If there was an "eviction" library that supported all this, I imagine it would still have to manage a certain amount of per-item metadata storage internally, in addition to whatever storage of the actual content I am doing myself, so I'm not sure it would look that different in practice from just using a cache to store a 'handle' to the actual data.
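The single-load guarantee described above can be illustrated with plain JDK classes. This is only a sketch and not how cache2k or Caffeine implement read-through; the class name `ReadThroughSketch` and the load counter are made up for illustration. It relies on `ConcurrentHashMap.computeIfAbsent`, which applies the mapping function at most once per key.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;

/** Hypothetical read-through map: each key is loaded at most once, even under concurrent access. */
class ReadThroughSketch<K, V> {
    private final ConcurrentHashMap<K, V> entries = new ConcurrentHashMap<>();
    private final Function<K, V> loader;
    // Counts loader invocations; only here to make the guarantee observable.
    private final AtomicInteger loadCount = new AtomicInteger();

    ReadThroughSketch(Function<K, V> loader) {
        this.loader = loader;
    }

    /** Returns the cached value, loading it on a miss. The loader must not return null. */
    V get(K key) {
        // computeIfAbsent is atomic per key: concurrent callers for the same key
        // wait until the first caller's load has completed and then see its result.
        return entries.computeIfAbsent(key, k -> {
            loadCount.incrementAndGet();
            return loader.apply(k);
        });
    }

    int loadCount() {
        return loadCount.get();
    }
}
```

Eviction is deliberately left out here; the point is only that the cache entry doubles as the synchronization point for the load.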

As far as restarting the application goes, the files remain on disk and get re-added to the cache on startup, so any frequency / recency meta-data is lost. In practice that doesn't seem to be a huge issue so far. In our use case data actually has an intrinsic timestamp and recent data is much more likely to be accessed than older data, so if it was possible to expose the intrinsic age of each item to the cache in some way (rather than the cache using time since load/write to define recency) this would be very powerful for our use case and make the loss of access frequency meta-data even less of an issue.

In terms of eviction, another use case might be having a cache of Bitmap objects on Android (or anything else that references non-JVM memory) that require a recycle() or similar deallocation call. Intuitively that feels like less of an 'abuse' of an in-memory cache, but conceptually there's really very little difference between a JNI pointer to native memory and a file name / handle. For any of these cases it would be nice to have a generic removal listener that can be notified for replacement / removal / expiry / eviction in a uniform way.
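To make the idea of a uniform removal listener concrete, here is a minimal JDK-only sketch. It is not cache2k API: `RemovalCause`, `RemovalListener` and `NotifyingLruCache` are hypothetical names, strict LRU via `LinkedHashMap` stands in for a real eviction algorithm, and expiry is omitted to keep it short.

```java
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.Map;

/** Why a value left the cache. Expiry is omitted in this sketch. */
enum RemovalCause { REPLACED, REMOVED, EVICTED }

/** Hypothetical callback, invoked once for every value that leaves the cache. */
interface RemovalListener<K, V> {
    void onRemoval(K key, V value, RemovalCause cause);
}

/** Size-bounded LRU cache that reports every departing value to a single listener. */
class NotifyingLruCache<K, V> {
    private final int capacity;
    private final RemovalListener<K, V> listener;
    // accessOrder = true keeps the least recently used entry at the head.
    private final LinkedHashMap<K, V> entries = new LinkedHashMap<>(16, 0.75f, true);

    NotifyingLruCache(int capacity, RemovalListener<K, V> listener) {
        this.capacity = capacity;
        this.listener = listener;
    }

    synchronized V get(K key) {
        return entries.get(key);
    }

    synchronized void put(K key, V value) {
        V old = entries.put(key, value);
        if (old != null) {
            listener.onRemoval(key, old, RemovalCause.REPLACED);
        }
        // Evict from the least recently used end until the size limit holds.
        Iterator<Map.Entry<K, V>> it = entries.entrySet().iterator();
        while (entries.size() > capacity && it.hasNext()) {
            Map.Entry<K, V> eldest = it.next();
            K evictedKey = eldest.getKey();
            V evictedValue = eldest.getValue();
            it.remove();
            listener.onRemoval(evictedKey, evictedValue, RemovalCause.EVICTED);
        }
    }

    synchronized void remove(K key) {
        V old = entries.remove(key);
        if (old != null) {
            listener.onRemoval(key, old, RemovalCause.REMOVED);
        }
    }
}
```

For the file use case, the listener could simply be `(key, file, cause) -> file.delete()`; for native resources it would call the deallocation method instead.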

Member

cruftex commented Apr 3, 2017

Thanks for the detailed thoughts on this. Looks reasonable.

> so if it was possible to expose the intrinsic age of each item to the cache

cache2k honors the least recently used aspect. So when you insert the oldest ones first, that will do.
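As a rough illustration of the "insert the oldest ones first" suggestion, the sketch below re-adds files in ascending order of their intrinsic timestamp. Everything in it is an assumption made for illustration: `DataFile` and `warmUp` are hypothetical, and a strict-LRU `LinkedHashMap` stands in for the cache, whose real eviction algorithm may behave differently.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class WarmUpOrderSketch {
    /** Hypothetical description of a file on disk with its intrinsic data timestamp. */
    static final class DataFile {
        final String name;
        final long dataTimestamp;

        DataFile(String name, long dataTimestamp) {
            this.name = name;
            this.dataTimestamp = dataTimestamp;
        }
    }

    /** Re-adds files oldest first, so the newest data ends up most recently used. */
    static Map<String, DataFile> warmUp(List<DataFile> filesOnDisk, int capacity) {
        // Strict LRU stand-in: access-ordered map that drops its eldest entry when full.
        LinkedHashMap<String, DataFile> lru =
                new LinkedHashMap<String, DataFile>(16, 0.75f, true) {
                    @Override
                    protected boolean removeEldestEntry(Map.Entry<String, DataFile> eldest) {
                        return size() > capacity;
                    }
                };
        List<DataFile> sorted = new ArrayList<>(filesOnDisk);
        sorted.sort(Comparator.comparingLong((DataFile f) -> f.dataTimestamp));
        for (DataFile f : sorted) {
            lru.put(f.name, f);
        }
        return lru;
    }
}
```

If more files exist than fit into the cache, the ones with the oldest timestamps are the first to be dropped during warm-up.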

I plan to release 1.0 now with the current feature set. Then I'd like to start a 1.1 dev version. This allows testing new features, while being able to change and improve the API or even drop features again that didn't prove useful. I cannot promise, but probably something testable will pop up in the next months. It would be nice if you then took a look at it and tested it.

cruftex added this to the v1.1.x milestone Apr 3, 2017
cruftex added the feature label Apr 3, 2017

srikrishnacj commented Nov 30, 2017

I too need this feature badly. My scenario is to use the cache like swap memory in an OS.

Member

cruftex commented Nov 30, 2017

@srikrishnacj
Can you describe more about your use case? What should happen when something is read or written to the cache? What should happen on eviction?


srikrishnacj commented Nov 30, 2017

I cannot explain my business use case, but I will try to describe it as closely as possible. I have to process a huge number of records from a file. Each record creates a new object or updates one or more existing objects created by previous records. My computer's memory is too small to hold all the objects. If I use disk-based storage like Mongo or Couch, it takes a very long time, on the order of hours, because of the huge number of lookup and update calls.

So I want to use the cache as primary storage and the hard disk as swap space, and do all operations on the cache. All creations and updates are done on the cache. If the cache is full, I will save the evicted or expired items to disk. If the cache does not have an object, I will load it from disk via a CacheLoader. Once the process is completed, I will iterate over all the objects in the cache and flush them to disk.

I have my own implementation of this scenario with a HashMap, but it is not advanced: it does not track least used elements or timeout expiry. Once the memory is full, it flushes everything to disk and then caches the data one by one from disk, if required, via get(). Another drawback is that getting the memory size via a system call is painful. So I am trying to replace my custom cache with a standard JCache implementation. Basically I want a CacheLoader, EvictionListener, ExpireListener, and an eager expiry and eviction policy.

While searching for standard libraries I came across cache2k and Caffeine. Caffeine seems to fulfill all my needs and, as a bonus, it seamlessly evicts least used objects during garbage collection with the soft values configuration enabled. cache2k is more interesting since it supports JCache out of the box and has performance bonuses. The only thing lacking is the eviction listener.


ivanviragine commented Nov 30, 2017

My use case is: I have a lot of drivers that take a long time to load and take up a lot of memory. So, every time I need a new driver, I get it from the cache or load it and put it in the cache.
As these drivers take a lot of memory, I keep them in the cache only for some minutes, to save loading time in case I need them again. When the time comes, a driver should be evicted, but for a proper eviction I should run the close method on the cached driver, so it shuts down gracefully.
I ended up setting a very large capacity to avoid eviction, and added a cache expiry listener like this:

.addListener(new CacheEntryExpiredListener<String, CachedDriver>() {
    @Override
    public void onEntryExpired(final Cache<String, CachedDriver> cache, final CacheEntry<String, CachedDriver> entry) {
        // Shut the driver down gracefully once its entry has expired.
        entry.getValue().close();
    }
})

This works as expected; the only drawback is that I can't use a small or specific capacity.

Member

cruftex commented Dec 4, 2017

Thanks for the more detailed descriptions!

@srikrishnacj
You describe something like storage tiers. Using an eviction listener can be tricky, since there might be race conditions. For your use case to work it is important that the eviction listener is synchronous and blocks if a new request wants to load the entry again.

I can also think of alternative support for your use case, e.g. a write-behind-like feature that uses the cache writer for writing changes, but only when entries are evicted or there is spare storage bandwidth. However, this will take more time to implement.

@ivanviragine
That is the "free additional resources" use case, same as @ksperling described. Yes, seems useful.


srikrishnacj commented Feb 10, 2018

@cruftex Sorry for the long-delayed response.

There is a race condition, but I will guard it with a reentrant read-write lock. Exactly as you said, I will write entries when they are evicted, using the cache like swap storage. That's why I want an eviction listener on the cache.
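The swap-storage idea and the race discussed above can be sketched with JDK classes only. This is an illustration under stated assumptions, not cache2k code: `SwapCacheSketch` is a hypothetical class, a `HashMap` stands in for the disk tier, and a single `synchronized` monitor is used instead of a read-write lock to keep it short. The important part is that the write-back of an evicted entry happens under the same lock as a later reload, so a reload can never observe stale disk data.

```java
import java.util.HashMap;
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.Map;

/** Two-tier store: a bounded in-memory LRU tier backed by a slower "disk" tier. */
class SwapCacheSketch<K, V> {
    private final int capacity;
    // accessOrder = true keeps the least recently used entry at the head.
    private final LinkedHashMap<K, V> memory = new LinkedHashMap<>(16, 0.75f, true);
    // Stand-in for real disk storage, to keep the sketch self-contained.
    private final Map<K, V> disk = new HashMap<>();

    SwapCacheSketch(int capacity) {
        this.capacity = capacity;
    }

    /** Returns the value from memory, or swaps it in from disk on a miss. */
    synchronized V get(K key) {
        V value = memory.get(key);
        if (value == null) {
            value = disk.get(key);
            if (value != null) {
                insert(key, value);
            }
        }
        return value;
    }

    synchronized void put(K key, V value) {
        insert(key, value);
    }

    /** Writes everything still in memory to the disk tier, e.g. at the end of processing. */
    synchronized void flush() {
        disk.putAll(memory);
    }

    synchronized int inMemory() {
        return memory.size();
    }

    // Callers hold the monitor, so eviction and write-back are synchronous with get().
    private void insert(K key, V value) {
        memory.put(key, value);
        Iterator<Map.Entry<K, V>> it = memory.entrySet().iterator();
        while (memory.size() > capacity && it.hasNext()) {
            Map.Entry<K, V> eldest = it.next();
            // Swap out: persist the evicted entry before it disappears from memory.
            disk.put(eldest.getKey(), eldest.getValue());
            it.remove();
        }
    }
}
```

A single lock serializes all access, which is simple but limits concurrency; a per-key lock or an eviction listener that blocks loads for the same key would be the finer-grained variant.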

Member

cruftex commented Sep 6, 2018

Just checked in code for the eviction listener.

There are now three different events for when an entry leaves the cache: CacheEntryRemovedListener (by command), CacheEntryEvictedListener (by eviction), and CacheEntryExpiredListener (by expiry). In case this is used for resource cleanup, it would be nicer to have just one event for all three cases, since there is no need to differentiate. Other caches send an eviction event and a cause. In cache2k I'd like to reserve the wording "eviction" for eviction by space limitations only.
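For the resource cleanup case, one object can implement all three listeners and route them to the same action. The sketch below shows only the pattern: the three interfaces are simplified, hypothetical stand-ins with made-up signatures, not the cache2k listener interfaces named above.

```java
import java.util.function.BiConsumer;

// Simplified stand-ins for the three event types; the real interfaces differ.
interface RemovedListener<K, V> { void onEntryRemoved(K key, V value); }
interface EvictedListener<K, V> { void onEntryEvicted(K key, V value); }
interface ExpiredListener<K, V> { void onEntryExpired(K key, V value); }

/** Routes removal, eviction and expiry to one cleanup action. */
class CleanupListener<K, V>
        implements RemovedListener<K, V>, EvictedListener<K, V>, ExpiredListener<K, V> {
    private final BiConsumer<K, V> cleanup;

    CleanupListener(BiConsumer<K, V> cleanup) {
        this.cleanup = cleanup;
    }

    @Override
    public void onEntryRemoved(K key, V value) {
        cleanup.accept(key, value);
    }

    @Override
    public void onEntryEvicted(K key, V value) {
        cleanup.accept(key, value);
    }

    @Override
    public void onEntryExpired(K key, V value) {
        cleanup.accept(key, value);
    }
}
```

The same instance would then be registered once per event type, so the cleanup logic lives in a single place.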


gsaxena888 commented May 13, 2019

@cruftex Awesome... do you know which release version this new CacheEntryEvictedListener will appear in? (I checked the latest 1.2.1 and it does not appear to be there...)

Member

cruftex commented Aug 26, 2019

I'll release what I have now with 1.3.1.Alpha and, after a couple more tests, version 1.4 in about two weeks.

cruftex closed this as completed Aug 26, 2019

cr-orilibhaber commented Apr 20, 2020

Hello,
Any ETA time for version 1.4? I find the new CacheEntryEvictedListener feature highly valuable for my business flow but I don't want to introduce an Alpha version (1.3.1.Alpha) into my code base.

So I would like to ask how much of an "Alpha version" is in fact version 1.3.1?
Is it absolutely not production grade? or is it just lacking some unit coverage but was sanity tested for E2E?

If release of 1.4 is too far away I would be forced to switch to Caffeine just for the sake of having this functionality available.
That would be a shame, because I find Cache2K's usage/API/documentation and performance superior in all other aspects.

Member

cruftex commented Apr 22, 2020

@cr-orilibhaber

> That would be a shame, because I find Cache2K's usage/API/documentation and performance superior in all other aspects.

Thanks for the praise. That makes me very happy! I am a bit held up with other stuff at the moment, but feedback like this motivates me to dedicate more time to cache2k again.

I just opened another issue; it would be a great help and highly appreciated if you could give a bit more detailed feedback there: #139

> So I would like to ask how much of an "Alpha version" is in fact version 1.3.1?

There is a comprehensive test suite. Each alpha version can be expected to run without issues most of the time, when no new features are used. New features in Alpha versions may be buggy and change their interface in the next version.

> Is it absolutely not production grade? or is it just lacking some unit coverage but was sanity tested for E2E?

I started to work on an async loader, which added a lot of complexity, especially with the listener interaction. The code that calls the listeners now has much more thorough testing and got a lot of cleanup. Please give me the week to check and release another alpha or beta version for you. Using the latest code base, I can support you better if you run into an issue.

Using a synchronous listener in 1.3.1 is probably okay.

> Any ETA time for version 1.4?

That will depend on your feedback and happiness and how much time I can mobilize. 2-4 weeks seem doable.


cr-orilibhaber commented Apr 27, 2020

Well, I took 1.3.1 for a test drive and I admit it looks mostly promising. The thing is, I noticed that the eviction listener invocation was flaky: it either works flawlessly or not at all.
I'll expand on this:
I use a messaging service to deliver events to a Spring Boot based service.
I use Cache2K as a read-through cache (LRU) to load results from a DB. It is instantiated lazily and manually (not through Spring's DI), as this is a multi-tenant service and the key set variations are unknown.
I test the LRU capability of the cache by using a maxSize limit of 1 to trigger eviction constantly (I use a key set > 3).
When the message queue is empty and the whole service is given enough time to start (along with the Cache2K instance), the eviction listener is able to intercept evicted events once traffic starts coming in.
But when I start the service and the queue already has pending messages that start coming in as soon as the consumer thread is ready, it seems like the Cache2K instance didn't have "enough time" to start up correctly, and as a result the eviction listener doesn't get invoked at all.
