Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Group scalability upgrades #22700

Merged
merged 27 commits into from
Oct 26, 2023

Conversation

alice-wondered
Copy link
Contributor

Resolves #22372

@eatikrh
Copy link
Contributor

eatikrh commented Aug 25, 2023

We have also been working on the UI changes that go along with these changes. Having the group providers work separated from the UI work procedurally makes sense. But without the UI changes Groups UI will break. Would you agree?
Loading the Groups tree in its entirety is due to the current UI expects it to be loaded that way. This has been the source of the performance problems we have come across with the Groups pages.
Expanding the group providers will be necessary, but as long as we don't change the UI, the performance problem will remain because UI will still ask for the whole group tree to be loaded and we would have to keep the functionality that loads the whole group tree in t

}
subGroups.add(subGroup);
}
return subGroups.stream().sorted(GroupModel.COMPARE_BY_NAME);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Isn't better to leave sorting to the caller?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm totally fine with that! I was trying to maintain parity with the existing method

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you mean ordering at the storage layer? Let's also see what the @keycloak/store-maintainers think about it too.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I just mean that other methods in this class sort the stream at the end so I just followed that same pattern

// TODO: this batch size could potentially be stored somewhere (config?) as a variable
// calculate out batches of subgroups to delete from the parent group to avoid grinding the server to a halt at large scale
// especially helpful for freeing up some amount of database time for other requests
long batches = (long) Math.ceil(session.groups().getSubGroupsCount(realm, group.getId()) / 1000.0);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Isn't necessary to also flush at the end of the batch statement? Otherwise, they are just being added to the EntityManager without any real benefit during the commit phase.

If we really want running batches I would agree we need to resolve the size dynamically as it depends a lot on how you are using the server. Also, do we really want to be smart and calculate the size based on the subgroup count or just set a default integer and allow changing it?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I didn't pick up on that feature of the entity manager but it totally makes sense, will resolve that.

As far as the calculations go, I used the count as an indicator of where to stop but we could also just check if there are any groups left after each batch, it would result in more calls to the DB I think, though

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's see what the store team thinks about it. Perhaps there is a mechanism/pattern there already for batching.

What I was saying is to just include a provider configuration option with a default batch size and allow users to change it accordingly. We have this approach in other places.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not a fan of calculating the number of batches beforehand - with "read committed" isolation, the items listed can change in a concurrent setup, so I'd prefer to loop until the result is empty.

I wonder if you've added em.flush() and the batching here to avoid performance problems. Calling em.flush() will write the changes to the current database transaction, still the entities will remain the Hibernate persistence context. Usually the large persistence context causes problems in performance, and only calling em.flush() doesn't solve this. Whenever one selects data from the database, a em.flush() is called anyway, so there is no need to call it here again IMHO.

There is currently no good solution to deletion and batching available. I could imagine an async deletion of groups, that would iteratively create a group and the subgroups in a background task, but that would require more thoughts and would be out-of-scope for this.

edewit
edewit previously approved these changes Aug 28, 2023
@cypress

This comment was marked as outdated.

ghost

This comment was marked as outdated.

alice-wondered and others added 3 commits October 24, 2023 15:47
…ocs to reflect changes in next version of keycloak, move deprecated usages of RealmModel methods to the corresponding KeycloakSession method
Signed-off-by: Michal Hajas <mhajas@redhat.com>
Signed-off-by: Michal Hajas <mhajas@redhat.com>
@mhajas
Copy link
Contributor

mhajas commented Oct 24, 2023

@Redhat-Alice there were some test failures. I added some fixes. I hope it will be ok now. I need to drop now I will be able to have a look tomorrow if anything breaks. Sorry for the delays.

@alice-wondered
Copy link
Contributor Author

No worries! I'll keep an eye out on GHA for the next hour or so and try to fix anything that comes up in the meantime!

@jonkoops
Copy link
Contributor

Looks like everything is green now 🎉

Copy link
Contributor

@jonkoops jonkoops left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changes to the front-end still look good to me.

Copy link
Contributor

@mhajas mhajas left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you @Redhat-Alice! This is a great performance improvement when working with groups!

@mhajas
Copy link
Contributor

mhajas commented Oct 25, 2023

FYI @sguilhen is away this and the majority of the next week. We don't need to wait for his review until he is back. Let's wait for @mposolda and we should be good to go.

@mposolda
Copy link
Contributor

@mhajas @Redhat-Alice @jonkoops Thanks everyone!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Expand Group providers to allow for paginated lookup of subgroups
10 participants