Fix performance issues in Model Serving Global #2334

lucferbux · 2024-01-12T12:15:32Z

Description

Closes https://issues.redhat.com/browse/RHOAIENG-549

How Has This Been Tested?

As a cluster admin user

Open the network tab on Chrome
Go to Model Serving Global
Select "All projects"
Check that it's listing InferenceServices and Servingruntimes are being listed as cluster wide fetches

As a regular admin user

Open the network tab on Chrome
Go to Model Serving Global
Select "All projects"
Check that it first requests all projects and then it does several calls to fecth all InferenceServices and ServingRuntimes in those projects

General

Go to Model Serving Global
Select "All projects"
Check that you can cancel the request

Test Impact

Added e2e tests

Request review criteria:

Self checklist (all need to be checked):

The developer has manually tested the changes and verified that the changes work
Commits have been squashed into descriptive, self-contained units of work (e.g. 'WIP' and 'Implements feedback' style messages have been removed)
Testing instructions have been added in the PR body (for PRs involving changes that are not immediately obvious).
The developer has added tests or explained why testing cannot be added (unit tests & storybook for related changes)

If you have UI changes:

Included any necessary screenshots or gifs if it was a UI change.
Included tags to the UX team if it was a UI/UX change (find relevant UX in the SMEs section).

After the PR is posted & before it merges:

The developer has tested their solution on a cluster by using the image produced by the PR to main

lucferbux · 2024-01-12T12:16:16Z

@vconzola let me know if this is what you had in mind to cancel the "All projects" request

vconzola · 2024-01-12T14:24:51Z

@lucferbux The video goes by really quickly. Is the gist that you're showing a spinner until the first pieces of data come back then you're loading the table with skeletons in cells where the data is still being fetched?

lucferbux · 2024-01-17T17:38:40Z

@lucferbux The video goes by really quickly. Is the gist that you're showing a spinner until the first pieces of data come back then you're loading the table with skeletons in cells where the data is still being fetched?

Yes, the flow will be:

You go to "All projects"
You'll have the loading skeleton with the "Cancel" button
If you press it, you'll be redirected to the last namespace/preferred namespace
You'll get that namespace loaded

mturley · 2024-01-17T22:41:11Z

@vconzola here's that loading state with the cancel button (used throttling in the network dev tools to make it load slowly)

mturley

LGTM and works as described. I did notice with throttled dev tools that when you press the cancel button to navigate away it doesn't actually cancel the pending network requests (as far as I can tell) and they continue in the background. Maybe that's fine? I imagine it would be a pain to figure out how to actually send a cancel signal to them (I haven't looked at the fetch implementation).

That's probably out of scope here, so this looks good ✅

vconzola · 2024-01-18T15:05:31Z

@mturley Thanks, but not sure what that screenshot is showing me. I guess I don't understand the gist of the issue or what was done to fix it. Is the problem that the user was stuck on the loading page with no way to cancel out if fetching the data from all projects takes too long? Or is the problem that we're waiting for all data from all projects to be returned before exiting the loading page? If it's the latter we should close the loading page without canceling the fetch requests as soon as the first data is returned from a project and just show skeletons in the table cells while we're waiting for the rest of the data. Is something like that possible?

mturley · 2024-01-18T15:24:41Z

@vconzola I'm not sure exactly, that's a question for @lucferbux . I just figured I'd share a screenshot of a state that's hard to see in that video. From what I can tell here there's nothing that affects the loading skeleton state within the displayed table itself, this change just makes the fetches themselves more efficient and adds the cancel button to the existing spinner in case it still takes too long.

lucferbux · 2024-01-19T11:41:06Z

@mturley Thanks, but not sure what that screenshot is showing me. I guess I don't understand the gist of the issue or what was done to fix it. Is the problem that the user was stuck on the loading page with no way to cancel out if fetching the data from all projects takes too long? Or is the problem that we're waiting for all data from all projects to be returned before exiting the loading page? If it's the latter we should close the loading page without canceling the fetch requests as soon as the first data is returned from a project and just show skeletons in the table cells while we're waiting for the rest of the data. Is something like that possible?

@vconzola I'm sorry to confuse you, don't worry about the implementation, the UI will do the following:

Show you the cancel button in "All projects"
If you press cancel it will navigate to the next project

If that's ok we can lgtm this pr

lucferbux · 2024-01-19T11:42:26Z

@vconzola I'm not sure exactly, that's a question for @lucferbux . I just figured I'd share a screenshot of a state that's hard to see in that video. From what I can tell here there's nothing that affects the loading skeleton state within the displayed table itself, this change just makes the fetches themselves more efficient and adds the cancel button to the existing spinner in case it still takes too long.

Yes, you are right, I took a look at that the other day, but that's something we might wanna fix in a follow up issue, it's not related to this enhancement (more or less) and it's more of an structural issue, we are not implementing well the useFetchData so we don't abort the requests on change, I've just tested it and it seems to be working great in pipelines, but not here.

lucferbux · 2024-01-19T12:01:21Z

I've created https://issues.redhat.com/browse/RHOAIENG-2002 as a follow up, we need to fix this asap.

vconzola · 2024-01-19T14:19:10Z

If you press cancel it will navigate to the next project

Not sure what you mean by 'next project". My expectation would be (1) if the "all projects" page is loading from another page link (if that's even possible) or from the main nav link, then Cancel will navigate to the first project in the dropdown list; (2) if the "all projects" page is loading from being selected in the dropdown, then Cancel will navigate to the previous project view that was displayed. But if that's not the way it works, and you need to merge today, we can get it in a follow-up PR.

So consider this my LGTM for this PR.

andrewballantyne · 2024-01-19T14:23:23Z

@lucferbux we may want to use history back -- it's a common practice in OpenShift Console to do a history back when canceling out of a form -- but it has drawbacks -- like if someone navigates from a bookmark or copy paste of a URL -- it can be awkward to kick them outside of our app or to the empty state (or just flat out doesn't work).

But I think we can improve on this with another PR.

lucferbux · 2024-01-19T15:22:02Z

@lucferbux we may want to use history back -- it's a common practice in OpenShift Console to do a history back when canceling out of a form -- but it has drawbacks -- like if someone navigates from a bookmark or copy paste of a URL -- it can be awkward to kick them outside of our app or to the empty state (or just flat out doesn't work).

But I think we can improve on this with another PR.

Ok, great, let me log another issue for that, I think that's the preferred behaviour, but we can improve it in the next version.

lucferbux · 2024-01-19T15:22:16Z

/approve

openshift-ci · 2024-01-19T15:22:22Z

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: lucferbux, mturley

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

~~OWNERS~~ [lucferbux]

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

openshift-ci bot requested review from alexcreasy and dpanshug January 12, 2024 12:15

lucferbux requested review from mturley, vconzola, DaoDaoNoCode and christianvogt and removed request for alexcreasy and dpanshug January 12, 2024 12:15

Fix performance issues in Model Serving Global

2522c0f

lucferbux force-pushed the rhoaieng-549 branch from a411af5 to 2522c0f Compare January 12, 2024 12:20

mturley approved these changes Jan 17, 2024

View reviewed changes

openshift-ci bot assigned mturley Jan 17, 2024

openshift-ci bot added the lgtm label Jan 17, 2024

openshift-ci bot added the approved label Jan 19, 2024

openshift-merge-bot bot merged commit 2dd7f08 into opendatahub-io:main Jan 19, 2024
6 checks passed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Fix performance issues in Model Serving Global #2334

Fix performance issues in Model Serving Global #2334

lucferbux commented Jan 12, 2024

lucferbux commented Jan 12, 2024

vconzola commented Jan 12, 2024

lucferbux commented Jan 17, 2024

mturley commented Jan 17, 2024

mturley left a comment

vconzola commented Jan 18, 2024

mturley commented Jan 18, 2024

lucferbux commented Jan 19, 2024

lucferbux commented Jan 19, 2024

lucferbux commented Jan 19, 2024

vconzola commented Jan 19, 2024 •

edited

andrewballantyne commented Jan 19, 2024

lucferbux commented Jan 19, 2024

lucferbux commented Jan 19, 2024

openshift-ci bot commented Jan 19, 2024

Fix performance issues in Model Serving Global #2334

Fix performance issues in Model Serving Global #2334

Conversation

lucferbux commented Jan 12, 2024

Description

How Has This Been Tested?

As a cluster admin user

As a regular admin user

General

Test Impact

Request review criteria:

lucferbux commented Jan 12, 2024

vconzola commented Jan 12, 2024

lucferbux commented Jan 17, 2024

mturley commented Jan 17, 2024

mturley left a comment

Choose a reason for hiding this comment

vconzola commented Jan 18, 2024

mturley commented Jan 18, 2024

lucferbux commented Jan 19, 2024

lucferbux commented Jan 19, 2024

lucferbux commented Jan 19, 2024

vconzola commented Jan 19, 2024 • edited

andrewballantyne commented Jan 19, 2024

lucferbux commented Jan 19, 2024

lucferbux commented Jan 19, 2024

openshift-ci bot commented Jan 19, 2024

vconzola commented Jan 19, 2024 •

edited