[WIP] Speed up assets and accounts pages #988

victorgarcia98 · 2024-02-26T12:57:33Z

Description

Running against the simulation DB with 196 assets and 59 users, the existing code takes about 134s to load the asset page. With the changes introduced in this PR, the asset page's loading time is brought down to 53s, which is still pretty high.

Further Improvements

Time profiling to identify slow routines.

Related Items

Closes #964

I agree to contribute to the project under Apache 2 License.
To the best of my knowledge, the proposed patch is not based on code under GPL or other license that is incompatible with FlexMeasures

…is relies on the fact that the index of the internal API already takes care of auth. Running against the simulation database, it takes 52s to load the asset page on average. Signed-off-by: Victor Garcia Reolid <victor@seita.nl>

flexmeasures/api/v3_0/users.py

Signed-off-by: Victor Garcia Reolid <victor@seita.nl>

…oint' into feature/crud/get-from-index-endpoint

…-index-endpoint

Flix6x · 2024-03-11T11:32:09Z

The loading time for the asset page is still unacceptable. How we process our internal API responses needs some work, I believe. For example, for assets, we are going through considerable trouble to recreate Asset objects, from what our API gives us, without querying our database and keeping them out of our db session. But then we also assign to it an owner (via a db query), child assets (via recursively processing our internal API response), a parent (via a db query) and sensors (via a db query). This is all costing considerable time when loading the assets page.

I see a couple of options:

1. It would be a lot quicker to just query the asset table and have the objects live in the session.
2. Bypass the internal API altogether.
3. Stop loading owners, children and parents unless the UI needs them (the assets page doesn't use them), and have the response of the internal API include a sensor count so we don't need to query the sensor table.

Before you start work, know that I did a tech spike on these options, so I can quickly push some code once we have decided.

@nhoening your input is desired.

…-index-endpoint

nhoening · 2024-03-12T13:30:40Z

All I want to maintain is that if possible we call the internal API, as then we know we use our one layer where authorization is defined.

One or two internal API calls per UI page load should suffice, of course. We can definitely change anything after that.
We should study if that approach is feasible for what the asset and account page needs to do.

…-index-endpoint

Signed-off-by: F.N. Claessen <felix@seita.nl>

…l using internal API Signed-off-by: F.N. Claessen <felix@seita.nl>

Signed-off-by: F.N. Claessen <felix@seita.nl>

Flix6x · 2024-03-18T15:11:59Z

I implemented option 1, such that we still call the internal API for the auth check, but then reload the Assets from the db instead of separately querying their parents, owners, children and sensors. This further reduces the loading times on the assets page (I saw loading times of 17 seconds now).
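A minimal sketch of the idea, under stated assumptions: the IDs returned by the (already authorized) internal API call are used to fetch all rows in one query, instead of issuing separate queries per asset. It uses the standard-library sqlite3 module and a toy schema rather than the SQLAlchemy models of this PR; the table names, column names and `load_assets` helper are hypothetical. It also shows how a sensor count (option 3) could be obtained in the same query.

```python
import sqlite3


def setup_db() -> sqlite3.Connection:
    """Create a toy in-memory schema (hypothetical, not the FlexMeasures schema)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE asset (id INTEGER PRIMARY KEY, name TEXT, account_id INTEGER);
        CREATE TABLE sensor (id INTEGER PRIMARY KEY, asset_id INTEGER);
        """
    )
    conn.executemany(
        "INSERT INTO asset (id, name, account_id) VALUES (?, ?, ?)",
        [(1, "battery", 10), (2, "solar panel", 10), (3, "wind turbine", 20)],
    )
    conn.executemany("INSERT INTO sensor (asset_id) VALUES (?)", [(1,), (1,), (2,)])
    return conn


def load_assets(conn: sqlite3.Connection, asset_ids: list) -> list:
    """Fetch the given assets and their sensor counts in a single query."""
    if not asset_ids:
        return []
    # One "?" placeholder per ID keeps the query parameterized.
    placeholders = ", ".join("?" for _ in asset_ids)
    rows = conn.execute(
        f"""
        SELECT a.id, a.name, COUNT(s.id)
        FROM asset AS a
        LEFT JOIN sensor AS s ON s.asset_id = a.id
        WHERE a.id IN ({placeholders})
        GROUP BY a.id, a.name
        ORDER BY a.id
        """,
        list(asset_ids),
    ).fetchall()
    return [{"id": r[0], "name": r[1], "sensor_count": r[2]} for r in rows]
```

Usage would look like `load_assets(conn, [ad["id"] for ad in api_response])`, where `api_response` is the list of asset dicts the internal API returned.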

Flix6x

I noticed that there are fewer assets in the /assets listing compared to before this PR, so there must still be something wrong. Also, the asset icons (which are based on the asset type) are gone.

nhoening · 2024-03-18T16:22:57Z

Should we run a profiler to check where the 17 seconds are spent?
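One way to do this with the standard library is cProfile plus pstats. The sketch below is only an illustration: `load_assets_page` and `count_sensors` are hypothetical stand-ins for the real view and its per-asset queries, and the sleep merely simulates query latency.

```python
import cProfile
import io
import pstats
import time


def count_sensors(asset_id: int) -> int:
    """Hypothetical stand-in for a per-asset database query."""
    time.sleep(0.0005)  # simulate query latency
    return asset_id % 5


def load_assets_page() -> list:
    """Hypothetical stand-in for the view function that renders /assets."""
    assets = [{"id": i} for i in range(200)]
    for asset in assets:
        asset["sensor_count"] = count_sensors(asset["id"])
    return assets


def profile(func, top: int = 10) -> str:
    """Run func under cProfile and return the top entries by cumulative time."""
    profiler = cProfile.Profile()
    profiler.runcall(func)
    stream = io.StringIO()
    stats = pstats.Stats(profiler, stream=stream)
    stats.sort_stats("cumulative").print_stats(top)
    return stream.getvalue()


if __name__ == "__main__":
    print(profile(load_assets_page))
```

Sorting by cumulative time puts the routines that dominate the page load at the top of the report, which is usually the quickest way to see whether the time goes into database queries, API response processing or template rendering.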

Signed-off-by: Victor Garcia Reolid <victor@seita.nl>

…x-endpoint

Signed-off-by: Victor Garcia Reolid <victor@seita.nl>

victorgarcia98 · 2024-03-27T22:33:21Z

I noticed that there are fewer assets in the /assets listing compared to before this PR, so there must still be something wrong. Also, the asset icons (which are based on the asset type) are gone.

I tried it out locally and found the same. It turns out that we were only considering the assets of the current_user and the PUBLIC assets were missing, as well. Added these two and updated the tests.

flexmeasures/ui/crud/assets.py

Flix6x · 2024-03-28T17:48:59Z

flexmeasures/api/v3_0/assets.py

    @as_json
-    def index(self, account: Account):
+    def index(self, account: Account | None):
        """List all assets owned by a certain account.


Owned or accessible? I think by now this index endpoint morphed into the latter. In which case, should we include the public assets? Unless an account is passed, of course.

The docstring would need a rewrite, too.

Technically, it's also a breaking change in the API. We could maintain backwards compatibility if we add a new optional field that signals the caller wants to get all accessible assets instead of the assets their own account owns. Or a new endpoint, but I don't favour that option, because it's still an asset index endpoint.

True, good point. I've included the parameter all_accessible (default=False).
This parameter signals whether we want to get all the assets that the requesting user has read permissions for or, by default, only the ones that it owns.

What might nicely round this off would be to revise check_access to accept an as_account argument that defaults to the current user, in combination with putting back the permissions decorator.

So then we'd have the following four cases:

1. No account is passed, and all_accessible=false: caller wants to list their own assets

2. No account is passed, and all_accessible=true: caller wants to list all assets that they can access

3. An account is passed, and all_accessible=false: caller wants to list some other account's assets

4. An account is passed, and all_accessible=true: caller wants to list all assets that some other account can access. Caller requires read access on the passed account (checked using the decorator) and the passed account requires read access on the assets (to be checked using the revised check_access I suggested above).

One open point would be whether we should implicitly assume that, if the caller has access to account, it also has access to all accounts that account has access to. That is, we assume consultant permissions propagate.

Alternatively, we could call check_access on the assets for both the caller and the account (in case they are not the same). That is, we do not assume consultant permissions propagate.

Propagation has my preference. Whichever the case, we also might want to mention the choice of policy explicitly in the docs.

An unlikely case but possible is the user to have consultancy role but no read role. Thus, I think we can't really bring permission_required_for_context back. Perhaps, if len(accounts) == 0, we should raise Forbidden or Unauthorized.

Regarding the changes to the check_access, to my (superficial) understanding, they are already covered by the authorization policy:

No account is passed, and all_accessible=false: caller wants to list their own assets

Given that account is not passed, the marshmallow field will default to AccountIdField.load_current and check_access will make sure that the current user has read permissions for the account.

No account is passed, and all_accessible=true: caller wants to list all assets that they can access

In this scenario, the user may or may not have read access to their own account. Yet, they could have a consultant role and thereby have access to the accounts that have the consultant_account_id set to the user's account.

An account is passed, and all_accessible=false: caller wants to list some other account's assets

check_access makes sure that the requesting user has read permissions on its own account.

An account is passed, and all_accessible=true: caller wants to list all assets that some other account can access. Caller requires read access on the passed account (checked using the decorator) and the passed account requires read access on the assets (to be checked using the revised check_access I suggested above).

Here we should probably raise if the current user has no read permissions for their own account, even if they have the consultant role.

One open point would be whether we should implicitly assume that, if the caller has access to account, it also has access to all accounts that account has access to. That is, we assume consultant permissions propagate.

This is an interesting point, indeed. I think we should open a new issue for this and handle that on the Account __acl__ method.

Given that this touches auth policies, I want to make sure we get @nhoening's opinion, too.
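To make the consultant scenario in this comment concrete, here is a small sketch of which accounts a user could read if consultancy does not propagate beyond one level. It is not the FlexMeasures authorization code: the `Account` and `User` classes, the role names "reader" and "consultant", and `readable_account_ids` are all hypothetical; only the consultant_account_id relation is taken from the discussion.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Account:
    id: int
    # ID of the account acting as consultant for this account, if any.
    consultant_account_id: Optional[int] = None


@dataclass(frozen=True)
class User:
    account_id: int
    roles: frozenset = frozenset()  # hypothetical role names, see below


def readable_account_ids(user: User, accounts: list) -> set:
    """Return the IDs of the accounts this user may read (one level only)."""
    readable = set()
    if "reader" in user.roles:  # assumption: reading one's own account needs this role
        readable.add(user.account_id)
    if "consultant" in user.roles:
        # Only direct clients count: clients of clients are not included.
        readable |= {
            a.id for a in accounts if a.consultant_account_id == user.account_id
        }
    return readable
```

With accounts 1, 2 (client of 1) and 3 (client of 2), a consultant-only user in account 1 can read account 2 but neither their own account nor account 3.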

Interesting solution!

With the new structure, I guess we could deprecate the /public endpoint (as we can return public assets here), but if one *only* wants public assets, that would require loading a potentially long list and parsing the result. Or we add another parameter only_public.

Consultancy is not propagating across multiple levels of accounts, and this is not the time or place to go there. We don't need it. Let's leave authorization (and check_access) where it is right now.

"An unlikely case but possible is the user to have consultancy role but no read role." I don't understand this. If you have that role, and your account is consultant of a client account, you can read client account data.

I would also not do Case 4 ("An account is passed, and all_accessible=true: caller wants to list all assets that some other account can access."). Or is there a real use case? Why should we go to this trouble? We could say that both are not possible together and raise 422.

Docstring can still be better: "List all assets that the user can access, on a given account (default: own) or all that the user has permission to read (excluding public)".

"An unlikely case but possible is the user to have consultancy role but no read role." I don't understand this. If you have that role, and your account is consultant of a client account, you can read client account data.

This is the case where a user has the consultant role but doesn't have the read role. This means that they can read their client's account but not their own.

I would also not do Case 4 ("An account is passed, and all_accessible=true: caller wants to list all assets that some other account can access."). Or is there a real use case? Why should we go to this trouble? We could say that both are not possible together and raise 422.

I agree, currently we don't need to check what a user belonging to a different account can list. Also, if we were to implement this, it would get more complex, because the user roles, account roles and consultancy relationship together determine whether a user can read an asset.
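The outcome of this thread (three supported cases, plus a 422 when an account is combined with all_accessible=true) could be sketched as follows. This is an illustration only: `resolve_scope`, `AssetScope` and `UnprocessableEntity` are hypothetical names, and the real endpoint would rely on its marshmallow fields and check_access rather than plain function arguments.

```python
from dataclasses import dataclass
from typing import Optional


class UnprocessableEntity(Exception):
    """Hypothetical stand-in for an HTTP 422 response."""


@dataclass(frozen=True)
class AssetScope:
    account_id: Optional[int]  # None means "not restricted to a single account"
    all_accessible: bool


def resolve_scope(
    account_id: Optional[int], all_accessible: bool, current_account_id: int
) -> AssetScope:
    """Map the two request parameters onto the cases discussed above."""
    if account_id is not None and all_accessible:
        # Case 4 is deliberately unsupported.
        raise UnprocessableEntity(
            "Pass either an account or all_accessible=true, not both."
        )
    if all_accessible:
        # Case 2: everything the caller can access.
        return AssetScope(account_id=None, all_accessible=True)
    if account_id is None:
        # Case 1: default to the caller's own account.
        return AssetScope(account_id=current_account_id, all_accessible=False)
    # Case 3: another account's assets (read access still has to be checked).
    return AssetScope(account_id=account_id, all_accessible=False)
```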

…count Signed-off-by: Victor Garcia Reolid <victor@seita.nl>

…oint' into feature/crud/get-from-index-endpoint

Signed-off-by: Victor Garcia Reolid <victor@seita.nl>

victorgarcia98 · 2024-04-02T15:48:16Z

Reminder: this PR requires an API changelog entry.

Signed-off-by: Victor Garcia Reolid <victor@seita.nl>

…oint' into feature/crud/get-from-index-endpoint

nhoening

Two small refactoring ideas.

I tested it and I observed some speedup, so this is not bad.

With many, many assets (like when we do simulations), the real solution is to not ask the API for all assets at once, but to use server-side pagination. We do it on the client side now.
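A rough sketch of what server-side pagination means here: the server returns one page plus the total count, so the client never receives the full list. It uses sqlite3 with LIMIT/OFFSET on a toy table; the `page` and `per_page` parameter names, the response shape and the `get_asset_page` helper are assumptions, not an existing FlexMeasures API.

```python
import sqlite3


def make_db(n_assets: int) -> sqlite3.Connection:
    """Create a toy in-memory asset table with n_assets rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE asset (id INTEGER PRIMARY KEY, name TEXT)")
    conn.executemany(
        "INSERT INTO asset (id, name) VALUES (?, ?)",
        [(i, f"asset-{i}") for i in range(1, n_assets + 1)],
    )
    return conn


def get_asset_page(conn: sqlite3.Connection, page: int = 1, per_page: int = 10) -> dict:
    """Return one page of assets, plus the total number of assets."""
    if page < 1 or per_page < 1:
        raise ValueError("page and per_page must be positive")
    total = conn.execute("SELECT COUNT(*) FROM asset").fetchone()[0]
    # A stable ORDER BY is needed so that pages do not overlap or skip rows.
    rows = conn.execute(
        "SELECT id, name FROM asset ORDER BY id LIMIT ? OFFSET ?",
        (per_page, (page - 1) * per_page),
    ).fetchall()
    return {
        "data": [{"id": r[0], "name": r[1]} for r in rows],
        "page": page,
        "per_page": per_page,
        "total": total,
    }
```

For very large tables, OFFSET-based paging gets slower on later pages; keyset pagination (filtering on the last seen ID) is a common alternative, at the cost of not being able to jump to an arbitrary page.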

nhoening · 2024-05-22T17:21:24Z

flexmeasures/ui/crud/assets.py

+
+    asset_ids_filter = [GenericAsset.id.in_(ad["id"] for ad in assets)]
+
+    return db.session.scalars(select(GenericAsset).where(*asset_ids_filter)).all()


What fields are we displaying which are not in the initial API response?

I'd wager it's the account details. Maybe a good idea to mention why we go to the database again (to gather all data, not just direct asset fields).

nhoening · 2024-05-23T09:43:13Z

flexmeasures/api/v3_0/users.py

@@ -89,7 +91,25 @@ def index(self, account: Account, include_inactive: bool = False):
        :status 403: INVALID_SENDER
        :status 422: UNPROCESSABLE_ENTITY
        """
-        users = get_users(account_name=account.name, only_active=not include_inactive)
+
+        if account is not None:


Couldn't the code in here which deals with getting the accounts be similar to the code currently in the API endpoint for accounts?

It seems that one has two features we could also use here: taking the current user's account if None is passed, and maybe raising if the user doesn't have access to the account they sent in.

Maybe worth a small utility function.
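Such a utility might look roughly like the sketch below. All names are hypothetical: `resolve_account` is not an existing FlexMeasures function, `Forbidden` stands in for whatever exception maps to a 403, and `can_read` is a placeholder for the real permission check.

```python
from typing import Callable, Optional


class Forbidden(Exception):
    """Hypothetical stand-in for an HTTP 403 response."""


def resolve_account(
    requested_account_id: Optional[int],
    current_account_id: int,
    can_read: Callable[[int], bool],
) -> int:
    """Pick the account to list for, defaulting to the caller's own account.

    Raises Forbidden if the caller may not read the resulting account.
    """
    account_id = (
        current_account_id if requested_account_id is None else requested_account_id
    )
    if not can_read(account_id):
        raise Forbidden(f"No read access to account {account_id}.")
    return account_id
```

Both the users and the accounts index endpoints could then call the same helper, so the defaulting and the access check live in one place.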

victorgarcia98 added 2 commits February 26, 2024 13:41

Merge branch 'main' into feature/crud/get-from-index-endpoint

70c5ff8

victorgarcia98 added the API label Feb 26, 2024

Flix6x mentioned this pull request Feb 26, 2024

Slow loading /users and /assets listings #964

Open

Flix6x reviewed Feb 26, 2024

flexmeasures/api/v3_0/users.py

victorgarcia98 added 3 commits February 28, 2024 19:15

check permissions

7f43acf

Signed-off-by: Victor Garcia Reolid <victor@seita.nl>

Merge branch 'main' into feature/crud/get-from-index-endpoint

ecd4f91

Merge remote-tracking branch 'origin/feature/crud/get-from-index-endp…

2acfb7f

…oint' into feature/crud/get-from-index-endpoint

victorgarcia98 requested a review from Flix6x February 29, 2024 10:10

Merge remote-tracking branch 'origin/main' into feature/crud/get-from…

31a5803

…-index-endpoint

Merge remote-tracking branch 'origin/main' into feature/crud/get-from…

3d3bece

…-index-endpoint

Flix6x added 5 commits March 18, 2024 14:20

Merge remote-tracking branch 'origin/main' into feature/crud/get-from…

4ab2eea

…-index-endpoint

fix: harmless, but redundant plus

fa54b6c

Signed-off-by: F.N. Claessen <felix@seita.nl>

fix: no dumb children

6c84bd3

Signed-off-by: F.N. Claessen <felix@seita.nl>

feature: create Asset objects faster by selecting from db, while stil…

88086a6

…l using internal API Signed-off-by: F.N. Claessen <felix@seita.nl>

style: black

021bd79

Signed-off-by: F.N. Claessen <felix@seita.nl>

Flix6x requested changes Mar 18, 2024

Flix6x assigned victorgarcia98 Mar 25, 2024

victorgarcia98 added 3 commits March 27, 2024 10:14

Merge branch 'main' into feature/crud/get-from-index-endpoint

67a0d60

Signed-off-by: Victor Garcia Reolid <victor@seita.nl>

Merge remote-tracking branch 'origin' into feature/crud/get-from-inde…

38f0f11

…x-endpoint

add public assets to the call and fix tests

096921a

Signed-off-by: Victor Garcia Reolid <victor@seita.nl>

victorgarcia98 marked this pull request as ready for review March 27, 2024 22:26

victorgarcia98 requested a review from Flix6x March 27, 2024 22:26

add future

02ac27e

Signed-off-by: Victor Garcia Reolid <victor@seita.nl>

Merge branch 'main' into feature/crud/get-from-index-endpoint

fc00b72

Flix6x requested changes Mar 28, 2024

victorgarcia98 added 4 commits April 2, 2024 17:40

add all_accessible to get all the accessible assets by a certain ac…

f1f5ac0

…count Signed-off-by: Victor Garcia Reolid <victor@seita.nl>

Merge remote-tracking branch 'origin/feature/crud/get-from-index-endp…

7fcb4a5

…oint' into feature/crud/get-from-index-endpoint

Improve docstring.

eec4ad4

Signed-off-by: Victor Garcia Reolid <victor@seita.nl>

Merge branch 'main' into feature/crud/get-from-index-endpoint

a99e5cb

victorgarcia98 requested a review from Flix6x April 2, 2024 15:46

victorgarcia98 added 3 commits April 3, 2024 21:54

raise if the account is provided and all_accessible=True

e98dc72

Signed-off-by: Victor Garcia Reolid <victor@seita.nl>

Merge remote-tracking branch 'origin/feature/crud/get-from-index-endp…

5b9a2b3

…oint' into feature/crud/get-from-index-endpoint

Merge branch 'main' into feature/crud/get-from-index-endpoint

012983d

victorgarcia98 requested a review from nhoening May 17, 2024 16:42

Merge branch 'main' into feature/crud/get-from-index-endpoint

edeb490

nhoening requested changes May 23, 2024

Flix6x Mar 28, 2024

victorgarcia98 Apr 2, 2024

Flix6x Apr 2, 2024

victorgarcia98 Apr 3, 2024

Flix6x Apr 4, 2024

nhoening Apr 8, 2024 • edited by Flix6x

victorgarcia98 Apr 18, 2024
