Filter subgroups before paginating #27513

Merged: 4 commits merged into keycloak:main on Mar 15, 2024

Conversation

@pkeuter (Contributor) commented Mar 4, 2024

Closes #27512

@sguilhen (Contributor) left a comment

Thanks @pkeuter, the fix makes sense to me. Handling the pagination before filtering may result in fewer entries being returned than expected, so it makes sense to filter before paginating.

I've added a suggestion to use the no-param variation of getSubGroupsStream along with StreamsUtil.paginatedStream, which handles the pagination (including some basic validation of the values passed as first and max).
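
For illustration only, the filter-before-paginate shape being suggested would look roughly like this (a sketch, not the actual diff; canViewGlobal and auth.groups().canView(g) are the permission checks already used by the endpoint, and firstResult/maxResults are the query params):

    // Sketch: apply the permission filter first, then paginate with
    // StreamsUtil.paginatedStream, which also applies basic validation
    // to the first/max values.
    Stream<GroupModel> subGroups = StreamsUtil.paginatedStream(
            group.getSubGroupsStream()
                 .filter(g -> canViewGlobal || auth.groups().canView(g)),
            firstResult, maxResults);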

@sguilhen (Contributor) left a comment

It looks good to me! @pedroigor @ahus1 would any of you have time to check this one and possibly merge it? Thanks!

@ahus1 (Contributor) left a comment

Approving based on @sguilhen's review. Thank you for this PR!

@ahus1 ahus1 merged commit e26a261 into keycloak:main Mar 15, 2024
65 checks passed
@pkeuter (Contributor, Author) commented Mar 15, 2024

Thanks for merging @ahus1! Is there any chance we could get this in a patch version of 24 instead of in the next major release? That would be great.

@ahus1 (Contributor) commented Mar 15, 2024

@pkeuter - sure, that's possible. Please create a PR against release/24.0 which cherry-picks the commit from main. Once you've created the PR and the checks are green, I can have a look and merge it.

@pkeuter (Contributor, Author) commented Mar 15, 2024

Here you go @ahus1. Thanks for accepting this for 24.0.x!

#27942

@mhajas (Contributor) commented Mar 15, 2024

@sguilhen @ahus1 This PR can have performance implications on the getSubGroups endpoint. The pagination was added as part of PR #22700, which made some performance optimizations. With this change, the optimization is not used (we always load all child groups from the database when this endpoint is called). This can have a big impact, for example, when LDAP groups are stored as subgroups of some parent group.

Also, this endpoint is used by the admin console. The group page for a group with many child groups may take a long time to load even though pagination is used.

@alice-wondered Will this have a performance impact in your use case?

cc: @pedroigor

@pkeuter (Contributor, Author) commented Mar 15, 2024

@mhajas I understand your concern, but I would think that having a correct response is more important than having a fast response. Even in the case of thousands of child groups, IMO the performance impact will still be minimal, since it will still return only a small portion and the rest of the processing is done on the server.

@alice-wondered (Contributor)

I can foresee there being some interesting performance implications for our use case as we use a structured group approach to implement some form of multi-tenant organizations at the moment. As a result we have an extremely large number of groups nested within a top-level group. Loading all of them into memory will definitely have an impact on our services, especially since any group operation other than a read will also require rebuilding the entire cache.

It looks like the organization work has pivoted towards using a first class entity rather than relying on groups, so our use case would eventually be resolved by switching our implementation to that approach. However we track upstream for our releases so the impact would be immediate.

Here are some of my thoughts about this problem in general....

At the time of this work and while looking into the multi-tenant organization effort I was actually thinking about this particular scenario a bit. I'm of the opinion that the only scalable way to implement this behavior would be to change out the dynamic "just-in-time" evaluation of the permission entities into something that can be done upfront and then used as a filter for db requests.

If the permissions system were able to return a list of group ids that match the permission (I believe SpiceDB supports behavior like this using graph optimizations on their lookup API, for example), then it would be easy enough to just get pages of group ids at once and then fetch all of them from the database. That's an easy enough operation to cache, and we get the added benefit of the entire search being on an indexed column.
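
A very rough sketch of that idea follows; every name here (permissionIndex, lookupViewableGroupIds, getGroupsByIds) is hypothetical, since no such API exists in Keycloak today:

    // Hypothetical: resolve the permitted ids up front, page over the ids, then do one
    // indexed fetch for just that page instead of filtering after loading everything.
    List<String> allowedIds = permissionIndex.lookupViewableGroupIds(user, parentGroup);
    List<String> pageOfIds = allowedIds.stream()
            .skip(firstResult)
            .limit(maxResults)
            .collect(Collectors.toList());
    List<GroupModel> page = groupStore.getGroupsByIds(realm, pageOfIds);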

I could also see Java 21 virtual threads making shorter work of this problem: spinning up N virtual threads to "divide and conquer" the work and then ultimately forming the results into pages. The main performance concern here would be choosing N such that you don't keep high memory requirements: with a basic Collection approach you wouldn't be able to free memory until the Collection in question goes out of scope, and you'd still be preserving the results that pass the filter as well.

@sguilhen (Contributor) commented Mar 15, 2024

@mhajas yeah, I noticed that now that I looked into the JPA GroupAdapter implementation, thanks for bringing this to my attention. If the change is a big step back in terms of performance, then let's revert it and discuss the issue this PR aims to fix and how we could actually fix it without undoing @alice-wondered's work. As it stands now, the endpoint is broken - it returns the wrong number of entries (or even none) when the user doesn't have permission to retrieve all groups.

@pkeuter (Contributor, Author) commented Mar 15, 2024

I think I now understand the problem you are referring to, since the adapter can get the groups dynamically via LDAP. That could indeed have a big performance impact. Could, because I think the performance impact for the current situation will still be minimal.

This is because the current implementation presumably did not yield any issues when either canViewGlobal or auth.groups().canView(g) is true for every group. Therefore, nothing will change: when filtering, if every item passes the filter anyway, the limit in paginatedStream will still kick in after 10 items, making no difference at all.

Unless the user does not have rights to see all groups and the first item the user can see is at position 11 or beyond; in that case it will still need to read past the first 10 items from the adapter. But I think this is the expected behaviour, because the endpoint would otherwise be completely useless.

So to make a long story short:

With this change, the optimization is not used (we always load all child groups from the database when this endpoint is called). This can have a big impact, for example, when LDAP groups are stored as subgroups of some parent group.

Unless I am completely missing something here: I don't think this is true, since the stream is lazy and will stop pulling results after hitting the limit, therefore making this not an issue. But please correct me if I am wrong.
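
To illustrate the reasoning (plain JDK streams, not Keycloak code): with a genuinely lazy source, filter plus limit only pulls as many elements as needed. The open question in the rest of the thread is whether the source here is actually lazy.

    import java.util.List;
    import java.util.concurrent.atomic.AtomicInteger;
    import java.util.stream.Collectors;
    import java.util.stream.IntStream;

    public class LazyLimitDemo {
        public static void main(String[] args) {
            AtomicInteger pulled = new AtomicInteger();
            List<Integer> page = IntStream.range(0, 10_000).boxed()
                    .peek(i -> pulled.incrementAndGet())  // counts elements pulled from the source
                    .filter(i -> true)                    // stand-in for the permission check
                    .limit(10)
                    .collect(Collectors.toList());
            // Prints "page=10, pulled=10": limit() short-circuits, so only 10 elements are read.
            System.out.println("page=" + page.size() + ", pulled=" + pulled.get());
        }
    }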

@sguilhen (Contributor)

This can have a big impact, for example, when LDAP groups are stored as subgroups of some parent group.

Honest question: how exactly can that happen? I don't see the search going into LDAP anywhere during the invocation of the endpoint; the LDAP groups were already imported when the group mapper synced the groups from LDAP. This search only goes into Keycloak's DB as far as I can see. I'm probably missing something, but I just don't see LDAP involved in this search at all.

@pkeuter (Contributor, Author) commented Mar 15, 2024

Unless I am completely missing something here: I don't think this is true, since the stream is lazy and will stop pulling results after hitting the limit, therefore making this not an issue. But please correct me if I am wrong.

The problem, as I understand it, is that getSubGroupsStream() loads all subgroups from the DB, and then the result is filtered based on the permission, and then it is limited. The variant that takes the first and max params propagates those to the query that is performed on the DB, so the number of entities retrieved and stored in memory is significantly lower. The only problem with this approach is that any filter that you can't bake into the JPA query has to be applied afterwards in the stream, resulting in possibly fewer results than intended.

But this has not changed? I've switched from this implementation to calling getSubGroupsStream() directly. What would that change performance-wise? It is still a lazy stream.

@sguilhen (Contributor)

It has changed: this method is overridden in the JPA GroupAdapter: https://github.com/pkeuter/keycloak/blob/fe16e959451be7cf904958a2ec4c1d36b7437a8a/model/jpa/src/main/java/org/keycloak/models/jpa/GroupAdapter.java#L130

I'm not sure why exactly there's such a big performance difference between the two approaches, as we are retrieving the entities via getResultStream(), which should use the JDBC scrolling capabilities so we don't load everything. But again, I haven't dug into this deeply enough to know why this is a problem in terms of performance; perhaps @alice-wondered can share more details.
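
For context, the variant that takes first and max pushes those bounds into the database query itself, roughly along these lines (a sketch with assumed entity and helper names, not the actual GroupAdapter code):

    // Sketch only: GroupEntity, parentId and toModel() are assumptions for illustration.
    public Stream<GroupModel> getSubGroupsStream(Integer firstResult, Integer maxResults) {
        TypedQuery<GroupEntity> query = em.createQuery(
                "select g from GroupEntity g where g.parentId = :parent order by g.name",
                GroupEntity.class);
        query.setParameter("parent", getId());
        if (firstResult != null && firstResult >= 0) {
            query.setFirstResult(firstResult);  // OFFSET handled by the database
        }
        if (maxResults != null && maxResults >= 0) {
            query.setMaxResults(maxResults);    // LIMIT handled by the database
        }
        return query.getResultStream().map(this::toModel);
    }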

@pkeuter (Contributor, Author) commented Mar 15, 2024

Ah I see, thanks for the clarification, and sorry for the potential confusion I have brought in 😅 I agree, the difference should theoretically still be minimal. We'll await some more details.

@alice-wondered (Contributor) commented Mar 15, 2024

we are retrieving the entities via getResultStream(), which should use the JDBC scrolling capabilities so we don't load everything

I actually didn't know this was going on under the hood, which is where my concern originates. Researching it a bit, it seems contentious to use it this way (EDIT: to use it haphazardly), since databases are really good at filtering, and moving the filtering to a Java stream bypasses that and is also a bit non-obvious about what's going on.

That being said, in this scenario, where we can't really avoid the filtering that we have to do on every single entity, I'm not entirely sure there's a better optimization to be made.

I still personally think the ideal would be more robust permissions evaluation that would let us not fetch records that we're going to throw away anyway, but that's definitely out of the scope of this issue.

@mhajas (Contributor) commented Mar 18, 2024

Unless I am completely missing something here: I don't think this is true, since the stream is lazy and will stop pulling results after hitting the limit, therefore making this not an issue. But please correct me if I am wrong.

The missing part is that the paginatedStream, in this case, does not affect how many objects are loaded from the database. We always load all subgroups (line 227), collect the result into memory, and then call the limit method on the stream from the cached collection. See the cache implementation:

public Stream<GroupModel> getSubGroupsStream() {
    if (isUpdated()) return updated.getSubGroupsStream();
    Set<GroupModel> subGroups = new HashSet<>();
    // every subgroup id is resolved and collected into memory before the stream is
    // returned, so a downstream limit() cannot reduce how many groups are loaded
    for (String id : cached.getSubGroups(modelSupplier)) {
        GroupModel subGroup = keycloakSession.groups().getGroupById(realm, id);
        if (subGroup == null) {
            // chance that role was removed, so just delegate to persistence and get user invalidated
            getDelegateForUpdate();
            return updated.getSubGroupsStream();
        }
        subGroups.add(subGroup);
    }
    return subGroups.stream().sorted(GroupModel.COMPARE_BY_NAME);
}

I still personally think the ideal would be more robust permissions evaluation that would let us not fetch records that we're going to throw away anyway, but that's definitely out of the scope of this issue.

This could be a solution from a long-term perspective; however, I don't think it is a simple solution. Such big changes need to be designed and then planned accordingly to find a unified solution for all the other places. This issue is more immediate and we should find a fix that can be added quickly.

I would vote for reverting this PR and finding another solution to this issue, as in my opinion the performance drop is more serious than returning fewer groups. But I would be happy to hear other suggestions.

@pkeuter (Contributor, Author) commented Mar 18, 2024

@mhajas Thanks for the clarification. Because this function is still only used in the admin-ui (group picker modal and group tree), I would personally still think that it's better to have a "mostly fast, always correct" result than an "always fast, sometimes incorrect" response. But that is, of course, not my decision.

Just chiming in with my limited knowledge of Keycloak in its entirety; maybe looping through the subgroups is not necessary, because the update actions already call the getDelegateForUpdate function and this seems to be a sort of sanity check, so something like this would also work? In that case the actual stream can be used.

@Override
public Stream<GroupModel> getSubGroupsStream() {
    if (isUpdated()) return updated.getSubGroupsStream();
    return modelSupplier.get().getSubGroupsStream();
}

Or hopefully someone with better knowledge of the application can figure out something better, like you mentioned, because reverting to a broken situation doesn't seem like the best option here.

@mhajas (Contributor) commented Mar 18, 2024

Can you elaborate more on the "mostly fast" part? I don't think I am following.

Also, what do you mean by the update-actions? When you load subgroups it does not need to involve any update.

Are you suggesting removing caching from that method completely?

@pkeuter (Contributor, Author) commented Mar 18, 2024

Can you elaborate more on the "mostly fast" part? I don't think I am following.

Well, I think most groups won't have thousands of subgroups, so this function will only be slow when the number of subgroups is huge.

Also, what do you mean by the update-actions? When you load subgroups it does not need to involve any update.

Sorry for not explaining that too well. I was thinking these functions already call getDelegateForUpdate, so maybe this call would not be necessary:


Are you suggesting removing caching from that method completely?

That was not what I intended with my suggested change, but looking more closely I see what you mean. The call to cached.getSubGroups will hit the database lazily, and on subsequent calls the results of that call will be used.

I am not sure what the best option would be. With the cache removed, the entire list of subgroups would not have to be fetched, so this could be a viable option. But again: I don't know all the details this change would impact, so maybe it's good to have someone with more detailed knowledge look at this. But going back to a broken situation also doesn't look like the right solution to me.

@sguilhen (Contributor)

@mhajas question: if the endpoint calls getSubGroupsStream(-1, -1), the infinispan adapter won't collect all results; instead the call will go all the way to the JPA adapter, right? Then filtering and paginating the results after that shouldn't cause the whole collection of groups to be stored in memory. If I got it right, only calling getSubGroupsStream() causes the stream to be collected into a collection of subgroup ids. Just trying to understand how the cache works in each case.

@alice-wondered (Contributor)

@mhajas

This could be a solution from a long-term perspective; however, I don't think it is a simple solution. Such big changes need to be designed and then planned accordingly to find a unified solution for all the other places. This issue is more immediate and we should find a fix that can be added quickly.

Completely agree. It's definitely not a trivial change

@mhajas (Contributor) commented Mar 18, 2024

@mhajas question: if the endpoint calls getSubGroupsStream(-1, -1), the infinispan adapter won't collect all results; instead the call will go all the way to the JPA adapter, right? Then filtering and paginating the results after that shouldn't cause the whole collection of groups to be stored in memory. If I got it right, only calling getSubGroupsStream() causes the stream to be collected into a collection of subgroup ids. Just trying to understand how the cache works in each case.

Sounds like something that could work. We should probably do some performance comparisons. Maybe a manual test with 10k groups would do the trick.

@sschu (Contributor) commented Mar 19, 2024

I am wondering if this wouldn't be a good opportunity to test common table expressions to get the group structure recursively from the DB: https://in.relation.to/2023/02/20/hibernate-orm-62-ctes/
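
A hedged sketch of what that could look like as a recursive query (the table and column names keycloak_group and parent_group are assumptions about the schema, and this uses a native query for brevity rather than the Hibernate 6.2 CTE support the linked article describes):

    // Sketch: fetch the whole subtree of a group in one round trip via a recursive CTE,
    // still pushing pagination down to the database.
    String sql = """
            WITH RECURSIVE subgroups(id) AS (
                SELECT g.id FROM keycloak_group g WHERE g.parent_group = :parent
                UNION ALL
                SELECT c.id FROM keycloak_group c JOIN subgroups s ON c.parent_group = s.id
            )
            SELECT id FROM subgroups ORDER BY id
            """;
    @SuppressWarnings("unchecked")
    List<String> subtreeIds = em.createNativeQuery(sql)
            .setParameter("parent", parentGroupId)
            .setFirstResult(firstResult)
            .setMaxResults(maxResults)
            .getResultList();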

@sguilhen (Contributor)

@mhajas @ahus1 @alice-wondered As we're inching towards Keycloak 25, I wanted to get back to this discussion, because if this needs to get reverted we need to do that sooner rather than later.

I've tested some things as Michal suggested: I created a group with 10k subgroups and took some time measurements when calling the getSubGroups method in the GroupResource endpoint to fetch 10 groups. Here's what I found out:

1- The previous version, implemented by @alice-wondered, is the most efficient because it only fetches the required number of entities from the DB, then processes those entities (caches them, etc.) and returns them. Jumping around the list - for example, requesting 10 subgroups starting from the 5000th subgroup - yields the same results in terms of response time. Again, this is expected because the DB is always queried for a small batch of subgroups.

Here are some time measurements for this scenario. Numbers in parentheses reflect the first and max params.

Time taken to fetch 10 subgroups 43 ms    (0, 10)
Time taken to fetch 10 subgroups 34 ms   (1000, 10)
Time taken to fetch 10 subgroups 47 ms   (5000, 10)
Time taken to fetch 10 subgroups 47 ms   (9000, 10)

Numbers are similar when fetching sequential pages, although a little better than "randomly" jumping around the list:

Time taken to fetch 10 subgroups starting at 0: 51 ms
Time taken to fetch 10 subgroups starting at 10:  35 ms
Time taken to fetch 10 subgroups starting at 20:  33 ms
Time taken to fetch 10 subgroups starting at 30:  34 ms
Time taken to fetch 10 subgroups starting at 40:  32 ms
Time taken to fetch 10 subgroups starting at 50:  33 ms
Time taken to fetch 10 subgroups starting at 60:  27 ms
Time taken to fetch 10 subgroups starting at 70:  29 ms
Time taken to fetch 10 subgroups starting at 80:  28 ms
Time taken to fetch 10 subgroups starting at 90:  28 ms
Time taken to fetch 10 subgroups starting at 100: 30 ms

2- The implementation in this PR, calling getSubGroupsStream() with no params, ends up loading every subgroup in the infinispan adapter. Times taken when fetching 10 groups starting at "random" positions follow:

Time taken to fetch 10 subgroups 1657 ms    (0, 10)
Time taken to fetch 10 subgroups 655 ms   (1000, 10)
Time taken to fetch 10 subgroups 54 ms   (5000, 10)
Time taken to fetch 10 subgroups 49 ms   (9000, 10)

The first request takes the longest as it fetches and caches every subgroup. Subsequent requests end up using the cache, but as I understand it, this initial request fetching everything is what prompted the changes made by @alice-wondered, so it is something we don't want to do.

Fetching sequential pages showed very similar results: first request takes more time, subsequent ones take around 50 ms:

Time taken to fetch 10 subgroups starting at 0: 1318 ms
Time taken to fetch 10 subgroups starting at 10: 576 ms
Time taken to fetch 10 subgroups starting at 20: 48 ms
Time taken to fetch 10 subgroups starting at 30: 65 ms
Time taken to fetch 10 subgroups starting at 40: 68 ms
Time taken to fetch 10 subgroups starting at 50: 47 ms
Time taken to fetch 10 subgroups starting at 60: 49 ms
Time taken to fetch 10 subgroups starting at 70: 54 ms
Time taken to fetch 10 subgroups starting at 80: 58 ms
Time taken to fetch 10 subgroups starting at 90: 51 ms
Time taken to fetch 10 subgroups starting at 100: 45 ms

3- If we change things in this PR to call getSubGroupsStream(-1, -1) instead of the no-arg version, it does prevent infinispan from loading all entries. BUT the JPA adapter adds a sort(comparator) to the stream that makes no sense, as the DB query already has an ORDER BY, and this comparator ends up loading all subgroups just to be able to sort them. In the end, this not only loads everything like the no-arg version, but it has worse performance. See, for example, the numbers for random fetches:

Time taken to fetch 10 subgroups 1479 ms (0, 10)
Time taken to fetch 10 subgroups 636 ms  (1000, 10)
Time taken to fetch 10 subgroups 249 ms  (5000, 10)
Time taken to fetch 10 subgroups 330 ms  (9000, 10)

Fetching sequential pages is not much better either, and worse than the no-arg version:

Time taken to fetch 10 subgroups starting at 0: 1488 ms
Time taken to fetch 10 subgroups starting at 10: 409 ms
Time taken to fetch 10 subgroups starting at 20: 175 ms
Time taken to fetch 10 subgroups starting at 30: 335 ms
Time taken to fetch 10 subgroups starting at 40: 96 ms
Time taken to fetch 10 subgroups starting at 50: 147 ms
Time taken to fetch 10 subgroups starting at 60: 169 ms
Time taken to fetch 10 subgroups starting at 70: 128 ms
Time taken to fetch 10 subgroups starting at 80: 239 ms
Time taken to fetch 10 subgroups starting at 90: 114 ms
Time taken to fetch 10 subgroups starting at 100: 162 ms

4- Still using the idea of calling getSubGroupsStream(-1, -1), but also removing the sort(comparator) from the JPA adapter, the numbers are a lot better as it doesn't load the whole thing just to be able to sort the subgroups. Here are the time measurements for the random fetches:

Time taken to fetch 10 subgroups 73 ms    (0, 10)
Time taken to fetch 10 subgroups 134 ms  (1000, 10)
Time taken to fetch 10 subgroups 281 ms  (5000, 10)
Time taken to fetch 10 subgroups 333 ms  (9000, 10)

Notice how the first request was a lot faster. The other ones don't look a lot better, but that's because when we jump, say, to position 5000, it has to process and cache everything until it reaches that point. Compared to Alice's version, where this is done when querying the DB, it is worse, for sure. Jumping back, say, to position 2000, yields much better results because of the cached groups (around 40 ms).

However, when fetching sequential pages, it has shown good numbers, comparable to what we see from Alice's version:

Time taken to fetch 10 subgroups starting at 0: 96 ms
Time taken to fetch 10 subgroups starting at 10: 55 ms
Time taken to fetch 10 subgroups starting at 20: 51 ms
Time taken to fetch 10 subgroups starting at 30: 55 ms
Time taken to fetch 10 subgroups starting at 40: 50 ms
Time taken to fetch 10 subgroups starting at 50: 50 ms
Time taken to fetch 10 subgroups starting at 60: 48 ms
Time taken to fetch 10 subgroups starting at 70: 50 ms
Time taken to fetch 10 subgroups starting at 80: 56 ms
Time taken to fetch 10 subgroups starting at 90: 56 ms
Time taken to fetch 10 subgroups starting at 100: 62 ms

@sguilhen (Contributor)

What we have to decide is what to do with this information. In all of the scenarios above, the user has permission to read the returned groups.

Alice's version is the most efficient for sure, but it makes the endpoint return the wrong number of results if the user doesn't have permissions for all the fetched groups. In some cases, it may even return zero subgroups if the first 10 fetched by the DB are all groups the user doesn't have permission to view. It is a regression as in previous versions the endpoint returned the expected number of entries.

The last idea I've explored (number 4 above) yielded some interesting results. It prevents infinispan from caching every subgroup in the first request, and still has decent response time when fetching pages sequentially. It also properly filters the groups and returns the expected number of entries. Notice that with the sort call removed from the stream, it will load from the DB only the number of entities needed to fulfill the search criteria: it reads them one by one until 10 groups the user has permission to see are processed.

Of course, if the user has permissions to see only a handful of groups out of the 10k, it might end up loading a good part of the DB and the response time will probably not be so good, but it is what it is. Unless we find a way to resolve the permissions in a different way, I don't see how we can avoid this situation.

So I would really like to know what you would prefer to do here. The way I see it, we can:

1- revert this commit, but the pagination will continue to be broken (not only here, but for top-level groups as well - we have an issue raised by the community about that too) when the user doesn't have permissions for every subgroup returned. We then must find a different way to tackle this, possibly reworking how the permissions are evaluated;

2- go with option number 4 above, which seems to perform decently in the default case (where the user has permissions to read all subgroups) and returns the proper number of results from the method. The drawback is that, compared to Alice's version, performance will probably drop if the user has permissions to see only a small subset of the subgroups.

WDYT?

@alice-wondered (Contributor) commented Apr 18, 2024

but it makes the endpoint return the wrong number of results if the user doesn't have permissions for all the fetched groups.

I agree that this is not a good solution when the output is visibly incorrect. My solution would work best with a more upfront way of getting an ordered batch of allowed ids and then only fetching those, but as mentioned above that would require more work and is out of immediate scope. When we get into very fine-grained access with 10k+ entries, I think this is the best solution long term.

go with option number 4 above

This seems like the best immediate trade-off to me!

Thanks for doing all of the benchmarking; the results are really interesting to read through.

@sguilhen (Contributor)

Thanks for the feedback, @alice-wondered! I also personally find option number 4 a good solution in the short term. It might be good to revisit the whole permission resolution at some point, because several endpoints end up filtering results from the DB based on the permissions, and that's really not ideal when fetching results from the DB in a stream: you basically have to give up on using the first and max parameters in the queries to be able to return the expected number of results, which hurts performance. Perhaps there's a way to, like you suggested, resolve the permissions before going to the DB.

@mhajas (Contributor) commented Apr 22, 2024

@sguilhen Just to understand correctly: is option 4 faster than option 3 because the sorting is not done, or because getResultStream does not load all the results from the database?

I am wondering about the "has to process and cache everything until it reaches that point" part. Maybe I missed something, but where is getSubGroupsStream(-1, -1) processed and cached? It seems we are not caching that call:

public Stream<GroupModel> getSubGroupsStream(Integer firstResult, Integer maxResults) {
    if (isUpdated()) return updated.getSubGroupsStream(firstResult, maxResults);
    return modelSupplier.get().getSubGroupsStream(firstResult, maxResults);
}

@sguilhen (Contributor)

Hey @mhajas! The sorting not being done is the key difference: when it is there, it basically requires the whole stream to be available for the sort function, so it ends up loading everything. When it is not there, getResultStream loads things on demand until the stream requirements are met (the number of entries you want to return is reached in the endpoint, for example).

So, if you need to "jump" in the stream to fetch 10 subgroups starting at position 5000, it will process every subgroup from the DB, discarding the first 5000 results, and then it gets the 10 it wants and returns them. This is why I said that it is not so great when you jump around the stream too much.

We are indeed not caching that call, but what I meant by my comment is that we still create a cached group for every subgroup that is read from the DB. And once the cached groups are there, subsequent requests run faster - probably creating and caching them individually is still a costly operation.

For example, suppose you do a search starting at position 1000, fetching the next 10. You end up processing the first 1010 entities loaded from the DB, creating cached group instances for each one of them. This has a cost: the response time will be a bit higher if you didn't previously cache all those subgroups.

Now, if you do a search for 10 groups starting at position 500, it runs a lot faster - it still has to process 510 entities from the DB, but they have all been cached individually in the previous call and the overall response time is significantly lower.
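
The effect is easy to reproduce with plain JDK streams (illustration only, not Keycloak code): a sorted() stage must buffer the entire source before limit() can act, while without it the pipeline stops as soon as the limit is reached.

    import java.util.concurrent.atomic.AtomicInteger;
    import java.util.function.Supplier;
    import java.util.stream.Stream;

    public class SortedVsLazyDemo {
        public static void main(String[] args) {
            AtomicInteger pulled = new AtomicInteger();
            Supplier<Stream<Integer>> source = () ->
                    Stream.iterate(0, i -> i + 1).limit(10_000)
                          .peek(i -> pulled.incrementAndGet());  // counts elements read from the source

            source.get().limit(10).forEach(i -> {});
            System.out.println("without sorted(): pulled=" + pulled.get()); // 10

            pulled.set(0);
            source.get().sorted().limit(10).forEach(i -> {});
            System.out.println("with sorted():    pulled=" + pulled.get()); // 10000
        }
    }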

@mhajas (Contributor) commented Apr 22, 2024

We are indeed not caching that call, but what I meant by my comment is that we still create a cached group for every subgroup that is read from the DB.

I am not sure about that statement. Where do we create the cached group adapter?

@sguilhen (Contributor) commented Apr 23, 2024

This ends up in the infinispan RealmAdapter, and that one delegates to RealmCacheSession.getGroupById, which creates the adapters:

public GroupModel getGroupById(RealmModel realm, String id) {
    CachedGroup cached = cache.get(id, CachedGroup.class);
    if (cached != null && !cached.getRealm().equals(realm.getId())) {
        cached = null;
    }
    if (cached == null) {
        // cache miss: load the group from the delegate and create a CachedGroup for it
        Long loaded = cache.getCurrentRevision(id);
        GroupModel model = getGroupDelegate().getGroupById(realm, id);
        if (model == null) return null;
        if (invalidations.contains(id)) return model;
        cached = new CachedGroup(loaded, realm, model);
        cache.addRevisioned(cached, startupRevision);
    } else if (invalidations.contains(id)) {
        return getGroupDelegate().getGroupById(realm, id);
    } else if (managedGroups.containsKey(id)) {
        return managedGroups.get(id);
    }
    // a GroupAdapter is created and tracked for every id resolved through this path
    GroupAdapter adapter = new GroupAdapter(cached, this, session, realm);
    managedGroups.put(id, adapter);
    return adapter;
}
