Update ag-grid and implement getRowId to improve runs table performance #5725

adamreeve · 2022-04-20T01:12:21Z

What changes are proposed in this pull request?

Upgrades to ag-grid 27.2.0 and makes the following changes to improve the performance of the runs table when loading more rows:

Move the "Load more" button out of the grid and just render it below the grid in a separate element
Implement getRowId using the run uuid so that previously rendered rows aren't re-rendered (see https://www.ag-grid.com/react-data-grid/row-ids/)
- In order to correctly re-render cells when data has changed, this required making sure each column definition had a field specified that corresponded to an actual field present in the row data. Eg. previously the models column referred to the models field that didn't actually exist, so models were always considered equal and weren't re-rendered when they changed. Similarly, the date column actually used many different fields that affected how the cell should be rendered, so just comparing startTime values wasn't sufficient to decide whether the cell should be re-rendered.
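
A minimal sketch of the getRowId idea, assuming each row object carries a `runUuid` property. The row shape and the `gridOptions` variable are illustrative and not copied from the MLflow code; only the `getRowId` callback receiving a params object with a `data` property reflects ag-grid's documented behaviour.

```javascript
// Hypothetical row data; in this sketch only the runUuid property matters.
const rows = [
  { runUuid: 'a1', startTime: 1650000000000 },
  { runUuid: 'b2', startTime: 1650000005000 },
];

// ag-grid calls getRowId with a params object whose `data` is the row.
// Returning a stable id lets the grid match rows across data updates
// instead of assigning fresh ids and re-rendering everything.
const gridOptions = {
  rowData: rows,
  getRowId: (params) => params.data.runUuid,
};

// Simulate what the grid would ask for each row.
const ids = rows.map((data) => gridOptions.getRowId({ data }));
console.log(ids); // [ 'a1', 'b2' ]
```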

Fixes #5653 (see that issue for some performance numbers)

How is this patch tested?

Running existing unit tests, manual testing of the UI to check that behaviour seems correct.

Does this PR change the documentation?

No. You can skip the rest of this section.
Yes. Make sure the changed pages / sections render correctly by following the steps below.

Check the status of the ci/circleci: build_doc check. If it's successful, proceed to the
next step, otherwise fix it.
Click Details on the right to open the job page of CircleCI.
Click the Artifacts tab.
Click docs/build/html/index.html.
Find the changed pages / sections and make sure they render correctly.

Release Notes

Is this a user-facing change?

No. You can skip the rest of this section.
Yes. Give a description of this change to be included in the release notes for MLflow users.

Improve the performance of the runs table when loading a large number of runs.

What component(s), interfaces, languages, and integrations does this PR affect?

Components

Interface

area/uiux: Front-end, user experience, plotting, JavaScript, JavaScript dev server
area/docker: Docker use across MLflow's components, such as MLflow Projects and MLflow Models
area/sqlalchemy: Use of SQLAlchemy in the Tracking Service or Model Registry
area/windows: Windows support

Language

language/r: R APIs and clients
language/java: Java APIs and clients
language/new: Proposals for new client languages

Integrations

integrations/azure: Azure and Azure ML integrations
integrations/sagemaker: SageMaker integrations
integrations/databricks: Databricks integrations

How should the PR be classified in the release notes? Choose one:

rn/breaking-change - The PR will be mentioned in the "Breaking Changes" section
rn/none - No description will be included. The PR will be mentioned only by the PR number in the "Small Bugfixes and Documentation Updates" section
rn/feature - A new user-facing feature worth mentioning in the release notes
rn/bug-fix - A user-facing bug fix worth mentioning in the release notes
rn/documentation - A user-facing documentation change worth mentioning in the release notes

(I'm not sure about this classification, feel free to change it)

Signed-off-by: Adam Reeve <adreeve@gmail.com>

dbczumar · 2022-04-28T22:04:17Z

@harupy @sunishsheth2009 Can you take a look?

sunishsheth2009

I have some questions about how we improve performance here.

I thought setting the getRowId to runId is enough. What are all the other changes needed for?

sunishsheth2009 · 2022-04-28T23:45:52Z

mlflow/server/js/src/experiment-tracking/components/ExperimentRunsTableMultiColumnView2.js

@@ -179,10 +176,26 @@ export class ExperimentRunsTableMultiColumnView2 extends React.Component {
        },
        {
          headerName: ATTRIBUTE_COLUMN_LABELS.DATE,
-          field: 'startTime',
+          field: 'runDateInfo',


I'm curious about this. Are we getting the same data from a new field now?

Sort of. I've added a corresponding new runDateInfo property in the data returned from getRowData, which is an object aggregating all of the data needed to render the date cell rather than just the start time. And DateCellRenderer now accesses these values through its value prop instead of the data prop that contains data for the whole row. This is needed so that we can correctly implement the equals property to handle deciding whether the cell needs to be re-rendered.
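
As a rough sketch of the aggregation described above: the `toRow` helper, the shape of the `run` argument, and the `'Start Time'` header string are assumptions made for illustration, while the `runDateInfo` property names follow those visible in the diff.

```javascript
// Hypothetical helper: builds one grid row from a run, gathering everything
// the date cell needs into a single runDateInfo object.
function toRow(run, referenceTime) {
  return {
    runUuid: run.runUuid,
    runDateInfo: {
      referenceTime,
      startTime: run.startTime,
      runStatus: run.status,
      expanderOpen: run.expanderOpen,
    },
  };
}

// Column definition: `field` names the row property that becomes the cell value.
const dateColumn = { headerName: 'Start Time', field: 'runDateInfo' };

const row = toRow(
  { runUuid: 'a1', startTime: 1000, status: 'FINISHED', expanderOpen: false },
  5000,
);

// This is the object a cell renderer would receive as its `value` prop.
const cellValue = row[dateColumn.field];
console.log(cellValue.startTime, cellValue.runStatus);
```

Because the cell value now holds every input to the renderer, a single comparison of old and new values is enough to decide whether the cell is stale.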

sunishsheth2009 · 2022-04-28T23:46:29Z

mlflow/server/js/src/experiment-tracking/components/ExperimentRunsTableMultiColumnView2.js

+          equals: (dateInfo1, dateInfo2) => {
+            return (
+              dateInfo1.referenceTime === dateInfo2.referenceTime &&
+              dateInfo1.startTime === dateInfo2.startTime &&
+              dateInfo1.experimentId === dateInfo2.experimentId &&
+              dateInfo1.runUuid === dateInfo2.runUuid &&
+              dateInfo1.runStatus === dateInfo2.runStatus &&
+              dateInfo1.isParent === dateInfo2.isParent &&
+              dateInfo1.hasExpander === dateInfo2.hasExpander &&
+              dateInfo1.expanderOpen === dateInfo2.expanderOpen &&
+              _.isEqual(dateInfo1.childrenIds, dateInfo2.childrenIds)
+            );


Why do we need to check equality here? Where is this function invoked from, and how do we use it?

It isn't used directly within mlflow code, but is used by ag-grid to decide whether a cell needs to be re-rendered. After the data is updated, ag-grid can decide not to re-render a cell if the row id is the same and the cell values are considered to be equal. If we didn't implement this then a lot of cells would unnecessarily be re-rendered because by default reference equality is used.

This wasn't needed previously when row ids were assigned by ag-grid as a new set of ids is assigned after data is updated.

Since we're using _.isEqual here already, what's stopping us from doing _.isEqual(dateInfo1, dateInfo2)?

I was concerned there might be more performance overhead of using _.isEqual but I haven't tested that to verify. Using _.isEqual(dateInfo1, dateInfo2) would definitely simplify things so I'll switch to that if I don't see a big performance difference.

Yeah using _.isEqual for the top-level comparisons is slower but it's pretty insignificant compared to the time everything else takes. Eg. when loading 100 more rows with 1000 already loaded I get 25 ms in the value comparisons with the current code and then 100 ms when changing all the comparisons to just use _.isEqual, but the whole loading operation takes multiple seconds so I'll make this change.
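
A sketch of what a column-level `equals` looks like once the whole value is compared deeply. To keep it dependency-free, the small `deepEqual` helper below stands in for lodash's `_.isEqual` and only handles plain objects, arrays, and primitives; the column definition is illustrative rather than the exact MLflow one.

```javascript
// Stand-in for _.isEqual, limited to plain objects, arrays and primitives.
function deepEqual(a, b) {
  if (a === b) return true;
  if (typeof a !== 'object' || typeof b !== 'object' || a === null || b === null) {
    return false;
  }
  if (Array.isArray(a) !== Array.isArray(b)) return false;
  const keysA = Object.keys(a);
  const keysB = Object.keys(b);
  if (keysA.length !== keysB.length) return false;
  return keysA.every(
    (key) => Object.prototype.hasOwnProperty.call(b, key) && deepEqual(a[key], b[key]),
  );
}

// ag-grid consults colDef.equals to decide whether a cell value has changed.
const dateColumn = {
  field: 'runDateInfo',
  equals: (dateInfo1, dateInfo2) => deepEqual(dateInfo1, dateInfo2),
};

const oldInfo = { startTime: 1000, expanderOpen: false, childrenIds: ['c1'] };
const sameInfo = { startTime: 1000, expanderOpen: false, childrenIds: ['c1'] };
const toggled = { startTime: 1000, expanderOpen: true, childrenIds: ['c1'] };

console.log(dateColumn.equals(oldInfo, sameInfo)); // true: no re-render needed
console.log(dateColumn.equals(oldInfo, toggled)); // false: cell must refresh
```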

adamreeve · 2022-05-02T01:47:26Z

I thought setting the getRowId to runId is enough. What are all the other changes needed for?

Implementing getRowId is the main change that improves performance, and most of the other changes are needed for the grid to work correctly with application assigned row ids and re-render cells when needed.

Eg. previously for the "Start Time" column, the value of the cell was set to the startTime field, but the DateCellRenderer actually used a bunch of other properties from the row data to render this cell. This meant that when data was updated, the "Start Time" cells were never re-rendered even if things like expanderOpen had changed, because ag-grid only tested for equality of startTime to decide whether to re-render this cell. I've also had to add a referenceDate field here so that cells are re-rendered when time passes due to start time being rendered as something like "x seconds ago".

Similarly for the "Models" column, this was configured to use a non-existent models field, which would always evaluate to undefined so values for this column were always considered equal and not re-rendered, so I've had to add a new models field to fix the rendering behaviour for this column.

There are similar changes to other columns so that every column is mapped to a value field that can be tested for equality to decide whether cells need to be re-rendered when data is updated. This wasn't a problem previously when using grid assigned row ids as new ids were assigned whenever data was updated and every row was completely re-rendered.
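
To illustrate why a column mapped to a missing field never refreshes, here is a small sketch. The row shapes and the `registeredModels` property are hypothetical; the point is only that reading a non-existent property yields `undefined` on both the old and new row, so the two values always compare as equal.

```javascript
// Two snapshots of the same run; the models changed between them.
const before = { runUuid: 'a1', registeredModels: ['m1'] };
const after = { runUuid: 'a1', registeredModels: ['m1', 'm2'] };

// A column whose field does not exist on the row data.
const missingField = 'models';
const oldValue = before[missingField]; // undefined
const newValue = after[missingField]; // undefined
const looksUnchanged = oldValue === newValue;
console.log(looksUnchanged); // true, so the cell would be left as-is

// Pointing the column at a real property exposes the change.
const realField = 'registeredModels';
const detectsChange = before[realField] !== after[realField];
console.log(detectsChange); // true
```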

The two changes that aren't directly related to implementing getRowId are the change to the runInfosByUuid reducer, which is a minor performance improvement I added after seeing this take a fairly long time when profiling, and moving the LoadMore button outside of the grid.

Moving the load more button out gave a performance improvement on its own, as I guess ag-grid then didn't have to test each row to see if it needed to use the FullWidthCellRenderer and could simplify its rendering logic. I'm not sure if this would still show as big a performance improvement when done after the getRowId change. When I tried just implementing getRowId without moving the load more button and using ag-grid 27.1.0, I was getting errors within ag-grid. Possibly these have been fixed in 27.2.0 but I'd argue it's still better to keep the load more button outside of the grid as rendering it within a row just seems to add unnecessary complication without any obvious benefit.

Signed-off-by: Adam Reeve <adreeve@gmail.com>

sunishsheth2009

Thanks for the detailed explanation. It makes sense. :)
Also, thank you for making these changes and the upgrade. I appreciate your help.

@xanderwebs can you take a look at it as well?

xanderwebs

Thanks for the contribution, code generally looks good, the only thing from my end is that it would be good to keep the function signature of Utils.renderSource the same (even if the params are unused here).

xanderwebs · 2022-05-04T22:44:20Z

mlflow/server/js/src/common/utils/Utils.js

@@ -103,8 +103,8 @@ class Utils {
    return dateFormat(d, format);
  }

-  static timeSinceStr(date) {
-    const seconds = Math.max(0, Math.floor((new Date() - date) / 1000));
+  static timeSinceStr(date, referenceDate) {


Since this changes the function signature, do you mind adding a default here to referenceDate such that it will default to new Date() if someone calls it the old way?
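
A minimal sketch of the suggested default, using a hypothetical `secondsSince` helper that mirrors only the first line of `timeSinceStr` shown in the diff (the rest of the real function body is not reproduced here):

```javascript
// referenceDate defaults to "now", so existing one-argument calls keep working.
function secondsSince(date, referenceDate = new Date()) {
  return Math.max(0, Math.floor((referenceDate - date) / 1000));
}

// New style: an explicit reference makes the result deterministic.
console.log(secondsSince(new Date(0), new Date(5000))); // 5

// Old style: omitted reference falls back to the current time.
const inTheFuture = new Date(Date.now() + 60000);
console.log(secondsSince(inTheFuture)); // 0, negative differences are clamped
```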

xanderwebs · 2022-05-04T22:45:43Z

mlflow/server/js/src/common/utils/Utils.js

   */
-  static renderSource(tags, queryParams, runUuid) {


For edge purposes, it would be better to keep the signature of the function the same.

xanderwebs · 2022-05-04T22:49:55Z

mlflow/server/js/src/experiment-tracking/components/RunView.js

@@ -276,7 +276,7 @@ export class RunViewImpl extends Component {
          >
            <div style={{ display: 'flex', alignItems: 'center' }}>
              {Utils.renderSourceTypeIcon(tags)}
-              {Utils.renderSource(tags, queryParams, runUuid)}


Again, would be good to keep this for edge purposes.

xanderwebs · 2022-05-04T23:12:38Z

mlflow/server/js/src/experiment-tracking/components/ExperimentRunsTableMultiColumnView2.js

+          equals: (dateInfo1, dateInfo2) => {
+            return (
+              dateInfo1.referenceTime === dateInfo2.referenceTime &&
+              dateInfo1.startTime === dateInfo2.startTime &&
+              dateInfo1.experimentId === dateInfo2.experimentId &&
+              dateInfo1.runUuid === dateInfo2.runUuid &&
+              dateInfo1.runStatus === dateInfo2.runStatus &&
+              dateInfo1.isParent === dateInfo2.isParent &&
+              dateInfo1.hasExpander === dateInfo2.hasExpander &&
+              dateInfo1.expanderOpen === dateInfo2.expanderOpen &&
+              _.isEqual(dateInfo1.childrenIds, dateInfo2.childrenIds)
+            );


Since we're using _.isEqual here already, what's stopping us from doing _.isEqual(dateInfo1, dateInfo2)?

xanderwebs · 2022-05-04T23:24:50Z

mlflow/server/js/src/experiment-tracking/components/ExperimentRunsTableMultiColumnView2.js

+  const { experimentId } = props.data;
+  const { name, basename } = props.value;


What's the difference between data and value here? ==> The question behind this question is really, why are we reading experimentId in DateCellRenderer off of value there, but off of data here?

data contains the data for the whole row, whereas value is the value for this cell (selected from the row data using the field configured for the column). The value here doesn't include the experimentId because I'm using the experiment name values directly from the map returned by Utils.getExperimentNameMap in getRowData.

I could probably create new value objects that also include the experimentId which might be tidier, but that seems unnecessary as the experiment id for a row is never going to change (if it could change it would be important to include it so that the equality comparison was correct).

I've added a comment in the function to explain this too.
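
The data/value distinction can be sketched as follows. The `experimentNameText` function is a hypothetical stand-in that returns a string rather than JSX, and the row shape is assumed; `experimentName` as the column field follows the commit title "Use experimentName field for ExperimentNameRenderer".

```javascript
// Hypothetical stand-in for a cell renderer.
function experimentNameText(props) {
  const { experimentId } = props.data; // whole row
  const { name, basename } = props.value; // this cell's value only
  return `${basename} (${name}) in experiment ${experimentId}`;
}

const row = {
  experimentId: '3',
  experimentName: { name: 'team/demo', basename: 'demo' },
};

// The grid would pass the full row as `data` and row[field] as `value`.
const field = 'experimentName';
const text = experimentNameText({ data: row, value: row[field] });
console.log(text); // demo (team/demo) in experiment 3
```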

adamreeve · 2022-05-06T10:00:20Z

Thanks for the feedback @xanderwebs, I think I've addressed all of your comments now.

xanderwebs

LGTM!

adamreeve added 9 commits April 20, 2022 11:51

Upgrade to ag-grid-community version 27.2.0

47c9a2b

Signed-off-by: Adam Reeve <adreeve@gmail.com>

Move LoadMoreBar out of data grid

6d308e3

Signed-off-by: Adam Reeve <adreeve@gmail.com>

Implement getRowId for AgGridReact

4f7fc74

Signed-off-by: Adam Reeve <adreeve@gmail.com>

Improve performance of runInfosByUuid reducer

7ade758

Signed-off-by: Adam Reeve <adreeve@gmail.com>

Implement equals for run models column

b6b052e

Signed-off-by: Adam Reeve <adreeve@gmail.com>

Implement equals for run start time columns

eb5bb0f

Signed-off-by: Adam Reeve <adreeve@gmail.com>

Use experimentName field for ExperimentNameRenderer

3a913a5

Signed-off-by: Adam Reeve <adreeve@gmail.com>

Add version field for version cell renderer

873c791

Signed-off-by: Adam Reeve <adreeve@gmail.com>

Use tags as source column field and remove unused parameter

9a33197

Signed-off-by: Adam Reeve <adreeve@gmail.com>

github-actions bot added the area/tracking, area/uiux, and rn/bug-fix labels Apr 20, 2022

dbczumar requested review from harupy and sunishsheth2009 April 21, 2022 23:27

Bump ag-grid to 27.2.1 to pick up fix for incorrect peer dependency

b20190d

Signed-off-by: Adam Reeve <adreeve@gmail.com>

sunishsheth2009 reviewed Apr 28, 2022

Merge branch 'master' into grid-update

914618b

Signed-off-by: Adam Reeve <adreeve@gmail.com>

sunishsheth2009 approved these changes May 4, 2022

xanderwebs requested changes May 4, 2022

adamreeve added 4 commits May 6, 2022 11:33

Add default value for referenceDate parameter

c0b995e

Signed-off-by: Adam Reeve <adreeve@gmail.com>

Revert change to Utils.renderSource signature

59189e8

Signed-off-by: Adam Reeve <adreeve@gmail.com>

Add comment explaining data vs value difference in ExperimentNameRenderer

439bd96

Signed-off-by: Adam Reeve <adreeve@gmail.com>

Use more _.isEqual

57f3cd2

Signed-off-by: Adam Reeve <adreeve@gmail.com>

xanderwebs approved these changes May 6, 2022

dbczumar merged commit 0a88eab into mlflow:master May 6, 2022

adamreeve deleted the grid-update branch May 8, 2022 22:03

harupy mentioned this pull request Jun 1, 2022

[BUG] Wrong link to model in Runs table of the Experiments view #6011

Closed
