improve validatePublicUsername() performance #3091

Merged
merged 1 commit into from Sep 27, 2023

JonathanTreffler
Contributor

@JonathanTreffler JonathanTreffler commented Sep 27, 2023

On instances with many groups, the validatePublicUsername endpoint ends in a timeout because the code that checks whether the anonymous user's chosen username matches any group on the server is quite inefficient. It does not use the search parameter of the Group::search() function, so it loads all groups from all group backends from the database into RAM, creates a PHP entity for each of them, and calls two functions on each of those entities.
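To illustrate the difference, here is a minimal sketch in Python rather than the app's PHP. `FakeGroupBackend`, `Group`, and `name_taken` are hypothetical stand-ins, not the Nextcloud or Polls API; the backend only counts how many group objects a search hands back, which approximates the per-group entity cost described above.

```python
from dataclasses import dataclass


@dataclass
class Group:
    gid: str
    display_name: str


class FakeGroupBackend:
    """Hypothetical stand-in for a group backend; not the Nextcloud API."""

    def __init__(self, groups):
        self._groups = groups
        self.entities_created = 0

    def search(self, term=""):
        # An empty term matches every group, like an unfiltered query.
        needle = term.lower()
        hits = [g for g in self._groups
                if needle in g.gid.lower() or needle in g.display_name.lower()]
        self.entities_created += len(hits)
        return hits


def name_taken(backend, username, use_search_term):
    # Old behaviour: empty search term. Patched behaviour: pass the username.
    term = username if use_search_term else ""
    wanted = username.lower()
    # The exact-match check runs in both variants, because the search
    # may also return partial matches.
    return any(wanted in (g.gid.lower(), g.display_name.lower())
               for g in backend.search(term))


groups = [Group(f"group{i}", f"Group {i}") for i in range(12000)]

slow = FakeGroupBackend(groups)
fast = FakeGroupBackend(groups)
print(name_taken(slow, "group42", use_search_term=False), slow.entities_created)
print(name_taken(fast, "group42", use_search_term=True), fast.entities_created)
```

Both calls give the same answer, but the unfiltered variant materialises every group while the filtered one only touches the groups whose gid or display name contains the username.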

I created a performance profile of a simple call of the /check/username endpoint on an instance with ~12k groups (which is certainly a lot, but I assure you every one of those has an important role on our instance and other apps have no problems working with this amount of groups):

In total, this request took 71.6 seconds, causing the client connection to drop with a timeout during the request (the PHP process still runs to completion).

In the performance profile you can see that most of the time (73%) is spent in Nextcloud core, just fetching the data from the database and putting it into PHP entities.

image

In total, this one request causes 54 THOUSAND calls to QueryBuilder->execute(), which AFAIK sits below the cache layer, meaning every one of those is an SQL query to the database. I hope we can all agree that this kind of amplification from a single request is not great.

But the current implementation makes sense when you look at the timeline: the first usage of this empty search parameter that I could track down through various refactors is the initial implementation of Public Pages back in 2019 in #664.
This was before the default database group backend started searching by display names in 2020 (introduced in nextcloud/server#21358), so it makes sense that a custom search was implemented at this high level. But now we can use the MUCH faster filtering at the database query level 🥳.

It is still true that every group backend decides for itself how it handles the search parameter, and theoretically a group backend could decide to search only by gid. But I think it is reasonable to assume that any group backend that uses display names (ours with the 12k groups does not) follows the core implementation and also searches display names.

On our instance, this small patch (9 characters of code) brought the request time of the /check/username endpoint down from 71.6 seconds to 714 ms, roughly a 100x speedup 🥳.
image
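As a quick check of the arithmetic, using only the two timings reported above (71.6 seconds before, 714 ms after), the speedup factor and the relative time saved can be computed like this:

```python
before_s = 71.6   # request time before the patch, in seconds
after_s = 0.714   # request time after the patch, in seconds

speedup = before_s / after_s
reduction_pct = (before_s - after_s) / before_s * 100

print(f"{speedup:.1f}x faster, {reduction_pct:.1f}% less time per request")
```

In other words, the request is about two orders of magnitude faster, which corresponds to roughly 99% less time spent per request.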

… groups

Signed-off-by: Jonathan Treffler <jonathan.treffler@rwth-aachen.de>
@dartcafe
Collaborator

First of all, thanks for your excellent analysis.

12k groups. Wow. I never thought that this was a valid use case. Your suggestion will speed it up for sure, but it will eliminate the idea of preventing spoofing through the use of existing display names.

But I have to think again about @svenseeberg's comment. Maybe it would be better to remove the check against group names, because groups can't be participants. The check as it is will probably lose its value.

To be honest: I have to dig in my memory why we introduced it back then.

@dartcafe
Collaborator

Ah, wait. Reading helps. I did not get the fact that the search also covers the group's display name. In this case, adding the search term to the group search will indeed improve it.

Nevertheless, we have to rethink the check.

@dartcafe
Collaborator

dartcafe commented Sep 27, 2023

Tracking this down a little bit further: Since the IGroup:search() does an exact case insensitive search, it would be enough to simply check (OCA\Polls\Model\Group\Group:exists(query)), if there is a return from the search.

This way the object creation could be skipped also.

The same goes for usernames in Nextcloud and among the shares.

Collaborator

@dartcafe dartcafe left a comment

Since this should work...

@dartcafe dartcafe added this to the 5.4 milestone Sep 27, 2023
@dartcafe
Collaborator

Please merge by yourself, unless you plan further improvements.

@dartcafe dartcafe linked an issue Sep 27, 2023 that may be closed by this pull request
18 tasks
@JonathanTreffler
Contributor Author

JonathanTreffler commented Sep 27, 2023

Tracking this down a little bit further: Since the IGroup:search() does an exact case insensitive search

Not exactly. IGroup:search() on the default database group backend does a SQL iLike query with '%' . $this->dbConn->escapeLikeParameter($search) . '%' (see https://github.com/nextcloud/server/blob/a636aef585a6e3e6cc628be6952cbec526148d70/lib/private/Group/Database.php#L291).
This means the returned groups only have to contain the search string anywhere in the gid or display name (case-insensitive).

That is why I left in the for loop checking the returned entries, because both exact match and partial matches will get returned from the search.
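A small sketch of that behaviour, using Python's `sqlite3` instead of the Nextcloud database layer: the table layout, the sample rows, and the helpers `escape_like`, `search_groups`, and `is_group_name` are assumptions made for illustration only. SQLite's `LIKE` is case-insensitive for ASCII by default, so it stands in for the iLike query here.

```python
import sqlite3


def escape_like(term, esc="\\"):
    """Escape LIKE wildcards so user input is matched literally (hypothetical helper)."""
    return (term.replace(esc, esc + esc)
                .replace("%", esc + "%")
                .replace("_", esc + "_"))


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE groups (gid TEXT, displayname TEXT)")
conn.executemany(
    "INSERT INTO groups VALUES (?, ?)",
    [("admin", "Administrators"), ("admins-eu", "EU Admins"),
     ("staff", "Admin"), ("guests", "Guests")],
)


def search_groups(term):
    # Substring search on gid and display name, like a '%term%' pattern.
    pattern = "%" + escape_like(term) + "%"
    return conn.execute(
        "SELECT gid, displayname FROM groups "
        "WHERE gid LIKE ? ESCAPE '\\' OR displayname LIKE ? ESCAPE '\\'",
        (pattern, pattern),
    ).fetchall()


def is_group_name(username):
    # The search narrows the candidates; the loop keeps only exact matches.
    wanted = username.lower()
    return any(wanted in (gid.lower(), name.lower())
               for gid, name in search_groups(username))


print(search_groups("admin"))    # partial and exact hits together
print(is_group_name("Admin"))    # exact match on a gid or display name
print(is_group_name("admins"))   # only partial hits, so not an exact match
```

The search for "admins" still returns a row, because "admins-eu" and "EU Admins" contain the term, which is why a non-empty result alone cannot be treated as a name collision.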

@JonathanTreffler
Contributor Author

Please merge by yourself, unless you plan further improvements.

If we detect other issues during the rollout of the app onto our huge instance, we will probably be back for other optimizations :), but for now I think this can be merged.

@JonathanTreffler JonathanTreffler merged commit 4f32195 into master Sep 27, 2023
15 checks passed
@delete-merged-branch delete-merged-branch bot deleted the publicUsername-performance-fix branch September 27, 2023 21:35
@dartcafe
Collaborator

Sure. All contributions are welcome here. Feeling less lonely. 😆

@svenseeberg

Thank you all :)


Hello there,
Thank you so much for taking the time and effort to create a pull request to our Nextcloud project.

We hope that the review process is going smoothly and is helpful for you. We want to ensure your pull request is reviewed to your satisfaction. If you have a moment, our community management team would very much appreciate your feedback on your experience with this PR review process.

Your feedback is valuable to us as we continuously strive to improve our community developer experience. Please take a moment to complete our short survey by clicking on the following link: https://cloud.nextcloud.com/apps/forms/s/i9Ago4EQRZ7TWxjfmeEpPkf6

Thank you for contributing to Nextcloud and we hope to hear from you soon!

Development

Successfully merging this pull request may close these issues.

Public Shared Polls queries for users in a loop?
3 participants