Performance regression with permission graph on big instances #30474
Labels: Administration/Permissions, Misc/API, .Performance, Priority:P2, .Team/AdminWebapp, Type:Bug
Describe the bug
Product doc
The permission graph grows exponentially as the number of schemas/tables, groups and permissions increases, which causes a performance regression. I ran a test creating 500 DB connections (sample DB with only 1 schema and 4 tables) and 500 users in 500 groups, and I got the following performance distribution:
Number of users created per minute:
Number of permissions changed per minute:
This could be due to several factors:
SELECT * FROM "permissions_revision" ORDER BY "id" DESC
and then we take the first item of the result of that query. That forces the DB to do a sequential scan of the table, so this gets worse as the table grows. If we changed that query to
SELECT "id" FROM "permissions_revision" ORDER BY "id" DESC LIMIT 1
we could get massive performance improvements, as we would pull only 1 field instead of massive JSONs saved as text, and we would also use the index. E.g.:
EXPLAIN ANALYZE SELECT * FROM "permissions_revision" ORDER BY "id" DESC
vs.
EXPLAIN ANALYZE SELECT "id" FROM "permissions_revision" ORDER BY "id" DESC LIMIT 1
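To make the difference between the two queries concrete, here is a minimal sketch that can be run without a Metabase instance. It uses Python's built-in `sqlite3` as a stand-in for the application database, so the timings and query plans will not match Postgres. The `payload` column, the row count and the payload size are assumptions chosen for illustration, not the real `permissions_revision` schema. It only shows that both approaches return the same latest id while one of them pulls every row and every large text value to the client.

```python
import sqlite3
import time


def build_db(n_rows=200, payload_size=50_000):
    """Create an in-memory table with large text payloads (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        'CREATE TABLE "permissions_revision" '
        '("id" INTEGER PRIMARY KEY, "payload" TEXT NOT NULL)'
    )
    payload = "x" * payload_size  # stands in for a big serialized graph
    conn.executemany(
        'INSERT INTO "permissions_revision" ("payload") VALUES (?)',
        [(payload,) for _ in range(n_rows)],
    )
    conn.commit()
    return conn


def latest_id_select_star(conn):
    """Mirror the query from the issue: fetch all rows and columns, keep the first."""
    rows = conn.execute(
        'SELECT * FROM "permissions_revision" ORDER BY "id" DESC'
    ).fetchall()
    return rows[0][0] if rows else None


def latest_id_limit_1(conn):
    """Proposed query: fetch only the id of the newest row."""
    row = conn.execute(
        'SELECT "id" FROM "permissions_revision" ORDER BY "id" DESC LIMIT 1'
    ).fetchone()
    return row[0] if row else None


def timed(fn, conn):
    """Return (result, elapsed seconds) for a single call."""
    start = time.perf_counter()
    result = fn(conn)
    return result, time.perf_counter() - start


if __name__ == "__main__":
    conn = build_db()
    for fn in (latest_id_select_star, latest_id_limit_1):
        result, elapsed = timed(fn, conn)
        print(f"{fn.__name__}: id={result} in {elapsed * 1000:.2f} ms")
```

On a real instance, the `EXPLAIN ANALYZE` comparison above is still the authoritative measurement. This sketch is only meant to show the shape of the change.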
To Reproduce
Expected behavior
Performance should be consistent across the scale
Logs
NA
Information about your Metabase installation
Severity
P2
Additional context
I recreated this because we have customers with massive 6MB payloads on the graph endpoint, so my test is really lightweight compared to those, but this is a great start.
EDIT: did some more tests to see which endpoint is the culprit and yes, it's the permission graph. Response time per endpoint