loadAll being called once for each element of getAll #1697

Open
pschmeltzer opened this issue Dec 9, 2016 · 10 comments

Comments

@pschmeltzer

pschmeltzer commented Dec 9, 2016

When using Ehcache 3 (3.1.1) with an on-heap store, calls to getAll with a key set larger than 1 on a cold cache result in loadAll being called for each key separately. The Iterable passed to loadAll in CacheLoaderWriter always appears to have a size of 1, regardless of the number of keys passed to getAll.

Here is my cache configuration:
cache = cacheManager.createCache(name,
    CacheConfigurationBuilder.newCacheConfigurationBuilder(key, value, ResourcePoolsBuilder.heap(entries))
        .withLoaderWriter(loader)
        .build());

See the comments on Stack Overflow.

@spierre99

Exactly the same problem for me. Can you advise us on a possible date for a fix?
It is very difficult for us to optimize database loading by avoiding multiple SQL requests.

@henri-tremblay henri-tremblay self-assigned this Feb 15, 2017
@henri-tremblay
Contributor

henri-tremblay commented Feb 20, 2017

I was able to reproduce this easily with the following code. We will now look for a fix.

// Imports assumed for this repro (Ehcache 3.x and JUnit 4):
import static org.ehcache.config.builders.CacheManagerBuilder.newCacheManagerBuilder;

import java.util.AbstractMap;
import java.util.Map;
import java.util.stream.Collectors;
import java.util.stream.IntStream;
import java.util.stream.StreamSupport;

import org.ehcache.Cache;
import org.ehcache.CacheManager;
import org.ehcache.config.builders.CacheConfigurationBuilder;
import org.ehcache.config.builders.CacheManagerBuilder;
import org.ehcache.config.builders.ResourcePoolsBuilder;
import org.ehcache.spi.loaderwriter.CacheLoaderWriter;
import org.junit.Test;

@Test
public void test() {
    CacheLoaderWriter<Integer, String> loaderWriter = new CacheLoaderWriter<Integer, String>() {
        @Override
        public String load(Integer integer) throws Exception {
            return integer.toString();
        }

        @Override
        public Map<Integer, String> loadAll(Iterable<? extends Integer> iterable) {
            // Expected to be called once with all requested keys; observed to be called once per key.
            System.out.println("loadAll called");
            return StreamSupport.stream(iterable.spliterator(), false)
                .peek(i -> System.out.println("Id: " + i))
                .map(e -> new AbstractMap.SimpleImmutableEntry<>(e, e.toString()))
                .collect(Collectors.toMap(Map.Entry::getKey, Map.Entry::getValue));
        }

        @Override
        public void write(Integer integer, String s) {
        }

        @Override
        public void writeAll(Iterable<? extends Map.Entry<? extends Integer, ? extends String>> iterable) {
        }

        @Override
        public void delete(Integer integer) throws Exception {
        }

        @Override
        public void deleteAll(Iterable<? extends Integer> iterable) {
        }
    };
    CacheManagerBuilder<CacheManager> cacheManagerBuilder = newCacheManagerBuilder();
    try(CacheManager cacheManager = cacheManagerBuilder.build(true)) {
        Cache<Integer, String> cache = cacheManager.createCache("test",
            CacheConfigurationBuilder.newCacheConfigurationBuilder(Integer.class, String.class,
                ResourcePoolsBuilder.heap(10)).withLoaderWriter(loaderWriter).build());

        // Bulk read of nine keys on a cold cache: "loadAll called" is printed once per key.
        cache.getAll(IntStream.range(1, 10).mapToObj(i -> Integer.valueOf(i)).collect(Collectors.toSet()));
    }
}

@spierre99

Hi Henri,

Do you have any clue about how to solve this challenge?

@henri-tremblay
Contributor

henri-tremblay commented Mar 16, 2017 via email

@henryptung

@henri-tremblay Any update on this? Would be interested in a workaround too, if available.

@henri-tremblay
Contributor

It is still complicated and was put on ice for now.

The best workaround I can see is to do the loadAll externally: you load everything you need, then putAll, and then do the getAll whenever you need it. I'm assuming here that you want to pre-load some data.
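A minimal sketch of that pre-loading workaround, reusing the cache and key set from the repro above; fetchFromDatabase is a hypothetical bulk query (one round trip to the database for all keys), not part of Ehcache:

Set<Integer> keys = IntStream.range(1, 10).boxed().collect(Collectors.toSet());

// Hypothetical bulk query against the system of record: one round trip for all keys.
Map<Integer, String> preloaded = fetchFromDatabase(keys);

// Populate the cache in one shot instead of relying on the loader.
cache.putAll(preloaded);

// Subsequent bulk reads find the entries already cached, so loadAll is not invoked for them.
Map<Integer, String> values = cache.getAll(keys);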

@alexsnaps
Member

alexsnaps commented Aug 21, 2017

Actually this isn't a bug, but rather a feature… by design. Possibly an undesired feature though... In the meantime, as I share some responsibility in this design, let me try to explain why the behavior is the way it is.

What the CacheLoader API aims at solving

Besides the consistency problems that a cache-through pattern addresses, the loader API means readers do not have to synchronize around populating the cache. The loader will make sure only a single thread populates the cache for a given key, while other threads accessing the same key simply block until the value is installed.
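As a rough illustration of that per-key blocking behaviour (this is not Ehcache code, just an analogy, and loadFromSystemOfRecord is a hypothetical single-key query): the loader acts much like computeIfAbsent on a ConcurrentHashMap, where the first thread asking for a key runs the load while concurrent callers for the same key block until the mapping is installed.

// Analogy only: java.util.concurrent.ConcurrentHashMap, not the Ehcache store.
ConcurrentHashMap<Integer, String> values = new ConcurrentHashMap<>();
// The first caller for key 42 runs the load; other callers for key 42 block until the value is installed.
String value = values.computeIfAbsent(42, key -> loadFromSystemOfRecord(key));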

The idea behind the design

The CacheLoaderWriter.loadAll() takes the side of optimizing the cache, not the underlying SoR. This means that the set of keys passed to the .loadAll method will match a given locked segment of the underlying Cache's Store. In the case of the on-heap store, there are no such "segments"; keys are locked on a per-key basis, not per segment.

That's why, in the example above, you only ever get an Iterable with a single entry. I think this was discussed at length in one of the early dev meetings (not quite sure which one though).

Simplicity

The current design enables entries being loaded to be made available as quickly as possible. It could be that something else currently makes this impossible (I haven't read the code in a long while now), but this enables two threads to each populate one of two keys being loaded:

t1: cache.getAll(a, b);
t2: cache.get(b);

The first thread (t1) populates a, but b is already loaded by the time t1 accesses it, since t2 has already populated key b.
If t1 had locked both a & b prior to loading them, t2 couldn't have parallelized the load of b while a was being loaded.

Ordering

Things can get somewhat trickier to deal with as well; imagine the case below:

t1: cache.getAll(a, b);
t2: cache.getAll(b, a);

These two threads can now deadlock each other. Obviously, ordering the lock acquisitions would be the way to solve this, but there is no constraint on the type of a & b to be ? extends Comparable either.

Now this can all be mitigated with other locking strategies, like installing some lock object in the mapping, so that t1 could see that a is already being loaded. Then the cache could either have each thread load a single key, or transfer the responsibility for loading all keys onto a single thread (and hence onto CacheLoaderWriter.loadAll()), so as to provide more options for the loader to optimize data access against the underlying system of record.
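For what it's worth, the usual shape of that ordered-locking mitigation is sketched below. It assumes the keys are Comparable (which, as noted above, the cache cannot assume for arbitrary key types), and lockFor is a hypothetical per-key lock lookup; the point is that getAll(a, b) and getAll(b, a) then acquire locks in the same global order and cannot deadlock, at the cost of extra coordination before any loading starts.

// Sketch only, under the assumptions stated above (Comparable keys, hypothetical lockFor).
Map<Integer, String> loadAllOrdered(Set<Integer> keys) throws Exception {
    List<Integer> ordered = new ArrayList<>(keys);
    Collections.sort(ordered);               // one global lock-acquisition order shared by all threads
    for (Integer key : ordered) {
        lockFor(key).lock();                 // hypothetical per-key lock lookup
    }
    try {
        // A single bulk call can now hand every key to the loader at once;
        // the loaded values would then be installed into the store.
        return loaderWriter.loadAll(ordered);
    } finally {
        for (Integer key : ordered) {
            lockFor(key).unlock();
        }
    }
}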

So... what?

All this to say that the route chosen was to optimize for latency of the cache. I hope that the examples above show how that approach makes access to the cache somewhat fairer to all... While other options exist, they come at a higher price in terms of runtime because of the additional coordination requirements.

Now maybe there are better ways of going about this, though... I haven't given this any further thought in years, to be honest. Nonetheless, that's a vague attempt at explaining why things are the way they are; what challenges need to be accounted for when trying to "address this issue"; and, most importantly, what tradeoffs they may bring...

@henri-tremblay
Contributor

Highly interesting explanation. I'll take that into consideration. The problem is really the loader writer: doing tons of queries to the database instead of one to load a cache isn't efficient at all. So we will have to find a way.

@alexsnaps
Member

alexsnaps commented Aug 23, 2017 via email

@henri-tremblay henri-tremblay removed their assignment Sep 18, 2017
@dsylenko

Any news on this issue?
