Edge performance issues #2881
One suggestion would be to not call ListStorageSpaces for every lookup: just verify that the provider ID or mountpoint matches, return that, and assume the provider will handle the actual storage request. I tried to implement something along these lines, but since generating the space path requires attributes of the space a reference belongs to, I couldn't. The gateway storage provider constructs a relative reference using this path and then uses that in the request. IMO this should be delegated to the storage provider itself.
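To illustrate the idea, here is a minimal sketch of such a lookup. The types and the function are made up for this example and are not the real CS3/reva registry interfaces:

```go
package registry

import "strings"

// Provider is a simplified, hypothetical view of what the space registry
// already knows about a storage provider: its id and the path prefix
// (mount point) it serves.
type Provider struct {
	ID         string
	MountPoint string
	Address    string
}

// FindProvider routes a request without calling ListStorageSpaces on the
// provider: it only checks whether the request carries a matching provider
// id or falls under a known mount point, and trusts the provider to handle
// the actual storage request.
func FindProvider(providers []Provider, providerID, path string) *Provider {
	for i := range providers {
		p := &providers[i]
		if providerID != "" && providerID == p.ID {
			return p
		}
		if path != "" && strings.HasPrefix(path, p.MountPoint) {
			return p
		}
	}
	return nil
}
```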
I always wanted to keep all path handling in the registry: it should know the full path to every space root. That would have been a very simple design, but it only works for a spaces- and id-first approach. For path based requests the space registry might only be responsible for the prefix that leads to the storage provider. But this is misleading, as a storage provider can provide both. The storage drivers, however, would have to be able to deal with path based requests, something we deliberately got rid of. Maybe the storageprovider can deal with the space lookup for a path: it could make a ListStorageSpaces call with a path filter to get the space and then call the actual method.
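A rough sketch of what that could look like inside the storage provider. StorageSpace, Driver and resolveSpaceForPath are simplified stand-ins for this example, not the actual reva storage driver API:

```go
package provider

import (
	"context"
	"errors"
	"strings"
)

// StorageSpace and Driver are simplified stand-ins; the real storage driver
// interface in reva looks different.
type StorageSpace struct {
	ID   string
	Root string // absolute path of the space root
}

type Driver interface {
	// ListStorageSpaces with a path filter returns the spaces whose root
	// contains the given path.
	ListStorageSpaces(ctx context.Context, pathFilter string) ([]StorageSpace, error)
}

// resolveSpaceForPath looks up the space a path belongs to and turns the
// absolute path into a reference relative to the space root, so the rest of
// the provider can keep working with relative references only.
func resolveSpaceForPath(ctx context.Context, d Driver, p string) (StorageSpace, string, error) {
	spaces, err := d.ListStorageSpaces(ctx, p)
	if err != nil {
		return StorageSpace{}, "", err
	}
	for _, s := range spaces {
		if strings.HasPrefix(p, s.Root) {
			rel := strings.TrimPrefix(p, s.Root)
			return s, "." + rel, nil
		}
	}
	return StorageSpace{}, "", errors.New("no space found for path")
}
```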
We have to differentiate two types of requests, as they follow different lookup semantics:
1. id based requests
2. path based requests
For 1. we added #2792, which should use the providerids that are provided in ocis since owncloud/ocis#3619. ocdav does a LookUpStorageSpaceByID call which contains an ID filter. The gateway storageprovider splits the spaceid into a providerid and a nodeid. The space registry's ListProvider then makes a findProvidersForResource call, which currently makes a ListStorageSpaces call on the storageprovider with the matching providerid, or on a provider without an id set. I think we mixed responsibilities: the reva storage registry may not need to do an actual ListStorageSpaces call on the provider, as it is only responsible for routing. To route a PROPFIND or MKCOL request the ocdav service does not need to know the etag or mtime of the storage space, and the gateway certainly does not need it for routing. I may have failed to clarify that the graph service, which lists the spaces available to a user, really is the only service that needs to be aware of etags and mtimes... 🤔

For 2. the routing happens based on path prefixes. The reva storage registry can match requests based on path prefix, and the gateway should IMO truncate the path segments it can consume before forwarding the request to the storage provider. The storage provider should then use the storage driver interface to look up the space for the given path (as in the running example from above).

Now, while I would prefer making the storage driver interface always require relative references, with the storage provider always using a yet-to-be-specified ResolveSpaceAlias or a ListStorageSpaces() call with a filter on the alias, I see the tradeoff when using path-only navigation: all the path-to-space navigation may be unnecessary to reach a resource. But paths are fragile, and the web UI as well as all clients will always use relative requests in the future. I think both types of requests can be made efficient without an additional storage space call. Only the ListStorageSpaces call needs to make actual calls on the providers ... which it currently does anyway... 🤔 I'll check if I can kill the ListStorageSpaces call in the spaces registry. It should be possible because we know the providerid.
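As a sketch of the two lookup paths described above: the "$" separator and both helpers are assumptions made for this example, not the actual gateway code:

```go
package gateway

import "strings"

// splitSpaceID splits a combined space id into the provider id and the node
// id of the space root. The "$" separator is an assumption for this sketch;
// the actual encoding is defined by the storage registry configuration.
func splitSpaceID(spaceID string) (providerID, nodeID string) {
	parts := strings.SplitN(spaceID, "$", 2)
	if len(parts) != 2 {
		return "", spaceID
	}
	return parts[0], parts[1]
}

// truncateMountPoint removes the path segments the gateway has already
// consumed for routing, so the storage provider only sees the part of the
// path that is relative to its mount point.
func truncateMountPoint(fullPath, mountPoint string) string {
	rel := strings.TrimPrefix(fullPath, mountPoint)
	return "./" + strings.TrimPrefix(rel, "/")
}
```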
@butonic can you link the PR which reduces the stat requests here?
When calling the `GET /ocs/v1.php/apps/files_sharing/api/v1/shares?include_tags=false&state=all&shared_with_me=true` endpoint, each received share is statted 23 times (twice via the `Stat` call, the rest of the times via the `ListStorageSpaces` call), since `ListStorageSpaces` is called every time any type of operation needs to be performed.

Enabling the storageprovider cache in the gateway reduces this to 12 stats for subsequent calls. For an actual heterogeneous, distributed system, this makes the platform unusable.
The performance is further worsened by two calls to list all the received shares for EVERY received share in the shares storage provider.
Note that these numbers have been obtained after the change to route requests by mount IDs was merged.
cc @labkode @butonic @micbar @dragotin