cache maintains size greater than the defined maximumSize for a long time (1h+) #420
The cache does use a clean-up process that is scheduled on the ForkJoinPool. It uses read and write buffers to decouple the hash table from the eviction policy, which allows us to make the policy work concurrently rather than locking all operations. The write buffer is sized at 32768, but should be aggressively drained on any read/write if not empty. The scheduling of this work is guarded by a flag indicating whether it was submitted to the ForkJoinPool. There was a bug in some JDKs where ForkJoinPool would drop tasks rather than running them. The cache is less trusting of the configured Executor now, so it will fall back to having writers block and perform this work themselves if the write buffer is full. However, even in a synthetic stress test it is difficult to fill the write buffer and cause this. From the trace it looks like FJP is blocked trying to signal a free worker thread after submitting into its work queue. We submit after obtaining a tryLock that guards the eviction policy, so unfortunately our recovery measures would also block since they need this lock. It appears to be a JVM bug where it is unable to unpark the thread and blocks the submitter. There isn't any way for us to recover or be resilient to this type of error, as it looks like a data race problem within their logic. The only workarounds would be to specify a different executor, e.g. same-thread, or to upgrade to a later JVM (ideally the latest patch release of your major version).
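For reference, a minimal sketch of the same-thread executor workaround mentioned above (the key/value types and maximum size are placeholders, not taken from the issue):

import com.github.benmanes.caffeine.cache.Cache;
import com.github.benmanes.caffeine.cache.Caffeine;

class SameThreadMaintenance {
  // Maintenance runs on the calling thread rather than ForkJoinPool.commonPool(),
  // removing any dependence on the JVM's ForkJoinPool behavior.
  static Cache<String, byte[]> newCache() {
    return Caffeine.newBuilder()
        .maximumSize(32_767)
        .executor(Runnable::run)
        .build();
  }
}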
Another possibility is if you are on an old Linux kernel version. There was a nasty bug that they quietly fixed, which fundamentally broke all locks (futexes) under load. So you might be affected by a kernel bug, or by a lack of a workaround for a processor bug (e.g. Intel's TSX was very buggy and was used to optimize locks).
Thanks for your prompt responses.
So far I have run into this issue on Oracle's 1.8.0_241 (and some older versions too). I might check a couple of older versions, but I suppose not before I find some small reproduction case.
There is only a read load (
Actually, I just selected one specific stack trace that shows that some cleanup is scheduled. To me, this means that:
Oh yeah, I remember that infamous futex issue :) I verified our kernel version against known synchronization issues. No findings here.
On my side, I can probably detect the stall and manually call cleanup. Similarly, it would be possible in caffeine to execute cleanup directly if e.g. the most recent scheduled cleanup has not completed after some nanos. But these are lame workarounds for an issue that shouldn't happen... I'll try to create a simple test that reproduces the issue to possibly learn more... Naive highly concurrent
Great, the issue I knew of was #77 (JDK-8078490) which was fixed by then.
In that stacktrace the cache held its eviction lock and was stuck trying to schedule the task. All cleanups have to be performed under that lock, so if that's an error condition a manual cleanUp would block as well and be broken.
Yes, the caller just needs to schedule the work and move on. The read buffer is lossy so when full it just drops that event. The write buffer can't be, so it blocks.
Yes, since it thinks the FJP task was scheduled, it won't bother scheduling another one. The improvements in later versions won't schedule in that case, but will have writers assist by doing the work themselves if the write buffer is full. This means that the FJP may still be broken by dropping our task, but each time the write buffer is full the cache corrects itself by having the writer do the full cleanup. So it is broken but periodically fixed. The easiest workaround would be to simply set a same-thread executor on the builder.
If manual
Good ideas, thanks! My removal listener is cheap enough to make testing both approaches worthwhile. At this point, I'm mostly concerned about correctness.
great, please let me know the results with an alternative executor. I think the latest release is correct, but would very much appreciate your feedback from these experiments.
I managed to reproduce the issue in a simple example. It mimics the behavior of the original code.
My expectation is that if
The problem reproduces with the default and
https://gist.github.com/jakubzytka/349c6738a71f249743171a78864a795d
Reproduced with:
under
and also
Example output (cache with
reproduces also with 2.8.3:
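For readers without access to the gist, a rough sketch of this style of reproduction (this is not the gist's code; the bound, thread count, and iteration counts are made up):

import com.github.benmanes.caffeine.cache.Cache;
import com.github.benmanes.caffeine.cache.Caffeine;
import com.github.benmanes.caffeine.cache.RemovalCause;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

public class SizeStallRepro {
  public static void main(String[] args) throws InterruptedException {
    Cache<Integer, Integer> cache = Caffeine.newBuilder()
        .maximumSize(32_767)
        .removalListener((Integer k, Integer v, RemovalCause c) -> { })
        .build();

    // Fill past the maximum so eviction work is pending.
    for (int i = 0; i < 40_000; i++) {
      cache.put(i, i);
    }

    // Apply a read-only load of cache misses, mimicking the constant getIfPresent traffic.
    ExecutorService pool = Executors.newFixedThreadPool(8);
    for (int t = 0; t < 8; t++) {
      pool.execute(() -> {
        int i = 0;
        while (!Thread.currentThread().isInterrupted()) {
          cache.getIfPresent(-(++i % 1_000_000) - 1);
        }
      });
    }

    // Watch whether the estimated size drops back toward the configured maximum.
    for (int i = 0; i < 60; i++) {
      System.out.println("estimatedSize = " + cache.estimatedSize());
      TimeUnit.SECONDS.sleep(1);
    }
    pool.shutdownNow();
  }
}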
Thanks for the test case. Now I understand the confusion, which you correctly identified originally. I had thought you meant it was broken and never catching up, blocking future writes, as that had problems in the past due to FJP bugs. But now I see what you said all along: there is an incorrect assumption that the maximum size is a strict threshold that the cache will never exceed. Solr's BlockCache used to be implemented on ConcurrentLinkedHashMap, which, like Caffeine, did not offer a strict guarantee. It sounds like you found a long-standing bug in their assumptions and a reasonable misunderstanding.
An option would be to switch to Guava's cache, which does provide a strict bound and often evicts prematurely to do so. A strict bound is only possible if the writers are serialized with the eviction policy, e.g. atomic under an exclusive lock. Guava does this by splitting the hash table into multiple segments, each with its own LRU policy. This approach limits throughput, since writes to independent keys block each other, and makes it more difficult to implement more advanced operations.
In Caffeine and CLHM, we instead decouple the hash table and the eviction policy under two different locks. This is a pub-sub style approach where our write buffer captures the event and replays it on the policy. This allows the buffer to absorb a burst of writes and schedule the drain asynchronously.
The commit you referenced does help mask this race when the FJP.commonPool is used. It won't trigger for a custom executor. Most of the time a small excess is okay, e.g. like GCs have garbage that is cleared out when activity occurs again. The main case where this caused trouble was for expiration, when application code expected a prompt notification (e.g. to close an idle session) but no cache activity meant no cleanup occurred. If there is no activity then you could schedule a task that calls cleanUp.
Sorry for my misunderstandings early on. Does that answer your questions?
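A minimal sketch of that last suggestion, assuming the application already has a ScheduledExecutorService available (the one-second interval is arbitrary):

import com.github.benmanes.caffeine.cache.Cache;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

class PeriodicCleanup {
  // Periodically run the cache's maintenance work so evictions and removal
  // notifications happen even when there is no read/write activity.
  static void scheduleCleanup(Cache<?, ?> cache, ScheduledExecutorService scheduler) {
    scheduler.scheduleWithFixedDelay(cache::cleanUp, 1, 1, TimeUnit.SECONDS);
  }
}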
I'm sorry, I still think we may not be on the same page :) There is a hard limit on cache size in Solr's code, but it is enforced differently (I used
IIUC that expectation is OK as long as
Thanks for the hint about a
We can certainly add a hint when absent to check and drain if needed. It simply never came up before as a problem :) We would change the current logic,

public @Nullable V getIfPresent(Object key, boolean recordStats) {
  Node<K, V> node = data.get(nodeFactory.newLookupKey(key));
  if (node == null) {
    if (recordStats) {
      statsCounter().recordMisses(1);
    }
    return null;
  }
... to something like,

public @Nullable V getIfPresent(Object key, boolean recordStats) {
  Node<K, V> node = data.get(nodeFactory.newLookupKey(key));
  if (node == null) {
    if (recordStats) {
      statsCounter().recordMisses(1);
    }
    if (drainStatus() == REQUIRED) {
      scheduleDrainBuffers();
    }
    return null;
  }

... and write a unit test to assert the behavior. If that would solve your problem then I can look into it tonight.
yeah, I can perfectly see why :) I'd appreciate it if this use case could be supported by caffeine. There is one more tricky thing... I do have a stack trace where draining buffers was scheduled during the perceived "stuck cache"... I'll recheck the metrics, maybe I missed something (like an accidental cache hit)
This won't hurt performance, and we similarly trigger it if the retrieved entry is expired. This will of course be best effort, so a possibly stale read due to memory-barrier piggybacking on an initial call or other races. You can build locally to test with that snippet if desired, or else I'll have a snapshot for you tonight to verify with prior to a new release.
Please try the snapshot, e.g. against your test case, to see if that satisfies your needs. If so then I'll cut a release. Otherwise reopen if there's more for us to dig into.
I confirmed 2.8.4-SNAPSHOT works both in the synthetic example given above and in my real-life scenario. Thank you very much for your outstanding help!
Released 2.8.4!
I'm observing a weird behavior of caffeine cache v2.4.0. This issue presumably goes away when caffeine is upgraded to v2.8.1.
I'm seeking a confirmation whether this is an old, fixed issue (possibly by e8ff6d3) or if that's a new issue and I was just lucky to not see it after the upgrade.
I'd greatly appreciate any comments.
The issue:
I have a cache with a removalListener and a maximumSize, say 32767. I'm observing that the cache size (measured by the number of put calls minus the number of removalListener calls) sometimes gets stuck at 32768. There is a constant getIfPresent load, but no put happens while the cache size stays at 32768. Sometimes this condition gets magically resolved, but it usually takes hours.
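For illustration, the setup being described is roughly the following (a sketch only; names, types, and the exact bookkeeping are placeholders, not Solr's actual BlockCache code):

import com.github.benmanes.caffeine.cache.Cache;
import com.github.benmanes.caffeine.cache.Caffeine;
import com.github.benmanes.caffeine.cache.RemovalCause;
import java.util.concurrent.atomic.AtomicLong;

class BlockCacheLikeSetup {
  final AtomicLong puts = new AtomicLong();
  final AtomicLong removals = new AtomicLong();

  // Evictions fire the removal listener, so (puts - removals) approximates the size.
  final Cache<Long, byte[]> cache = Caffeine.newBuilder()
      .maximumSize(32_767)
      .removalListener((Long key, byte[] value, RemovalCause cause) -> removals.incrementAndGet())
      .build();

  void put(long key, byte[] block) {
    cache.put(key, block);
    puts.incrementAndGet();
  }

  long observedSize() {
    return puts.get() - removals.get();
  }
}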
The moment of resolution is often correlated with a reduction in the number of cache accesses (as if some cleanup task was being starved? On the other hand, I do see parked ForkJoinPool threads).
This is a problem because a certain component (specifically, Solr's BlockCache) relies on the cache size getting down to maximumSize in a reasonably short time. Is such an expectation reasonable?
Side note: I ensured immutability of the cache key (BlockCacheKey in the case of Solr), so no hanky-panky with hashCode/equals is involved.
I gathered stack traces during the period when the cache is "stuck". They show that some sort of cleanup is triggered:
I also can see that there are unutilized ForkJoinPool threads afterward, e.g.: