PERF-5358 Improve CompoundIndexes.yml workload #1214

dstorch · 2024-05-08T17:49:46Z

Jira Ticket: PERF-5358

What's Changed

There are three related changes:

Change the data so that there are 300,000 documents, with approximately 100,000 associated with each of three tenants. Then change how the "selectivity" value is calculated so that the query returns ~101 results. By looking at the mongod logs, I confirmed that the query previously returned 0 results and now returns 87 (which is much closer to the expected 101).
Change the predicate to have a conjunct for each indexed field. This makes the test much more similar to "Simple.yml", and it's not clear to me why it wasn't written this way in the first place.
Reduce the number of repetitions of each query from 1000 to 500. I did this because adding the additional conjuncts made multi-planning run for significantly longer, so I wanted to avoid making the workload run for longer than necessary.
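
As a rough sanity check on the numbers in the first item, the arithmetic can be sketched in Python. This is only an illustration: the function names and the idea of a single per-tenant match fraction are assumptions made here, not necessarily how CompoundIndexes.yml computes its "selectivity" value.

```python
TOTAL_DOCS = 300_000   # total documents, from the description above
NUM_TENANTS = 3        # number of tenants, from the description above
TARGET_RESULTS = 101   # approximate number of results the query should return


def expected_docs_per_tenant(total_docs, num_tenants):
    """Average number of documents per tenant under a uniform assignment."""
    return total_docs / num_tenants


def required_match_fraction(total_docs, num_tenants, target_results):
    """Fraction of a single tenant's documents that the remaining
    predicates would need to match to hit the target result count
    (hypothetical model, not taken from the workload)."""
    return target_results / expected_docs_per_tenant(total_docs, num_tenants)


docs_per_tenant = expected_docs_per_tenant(TOTAL_DOCS, NUM_TENANTS)
fraction = required_match_fraction(TOTAL_DOCS, NUM_TENANTS, TARGET_RESULTS)
print(f"documents per tenant: {docs_per_tenant:.0f}")
print(f"required match fraction within one tenant: {fraction:.5f}")
```

Under this model the non-tenant predicates would need to match roughly 0.1% of one tenant's documents. Since the data is generated randomly, the actual result count is only approximately 101.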

Patch Testing Results

https://spruce.mongodb.com/version/663bb9ac889ffa0007942ecb/tasks

jimoleary

It doesn't look like you have any tasks in the patch yet.

jimoleary · 2024-05-08T19:29:31Z

src/workloads/query/multiplanner/CompoundIndexes.yml

@@ -278,10 +284,70 @@ Actors:
      collection: *coll
      query: &query {
        Filter: {
-          tenantId: {$eq : 0},
+          tenantId: {$eq : 1},


~~I was wondering why the tenantId changed but you are still picking the tenantId.~~

~~Is the overall effect of the query change to roughly retain the selectivity of the original?~~

You have answered this in the description.

This particular change was because I wanted to have a constant above which determines the number of unique tenantIds, which in this case is 3. The {distribution: uniform, min: x, max: y} expects the min and max bounds to be inclusive, so it's easiest to express using a min of 1 rather than 0. That meant I had to update the filter, since we used to generate tenant ids starting at 0.
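
The effect of inclusive bounds can be illustrated with a small Python sketch. It uses Python's `random.randint`, which also includes both endpoints, as a stand-in for the workload's generator; `NUM_TENANTS` and `generate_tenant_ids` are hypothetical names chosen for this example.

```python
import random

NUM_TENANTS = 3  # hypothetical name for the constant described above


def generate_tenant_ids(n, min_id, max_id, seed=0):
    """Draw n tenant ids uniformly from the inclusive range [min_id, max_id]."""
    rng = random.Random(seed)
    # random.randint includes both endpoints, like the inclusive bounds above.
    return [rng.randint(min_id, max_id) for _ in range(n)]


# With min = 1 and max = NUM_TENANTS, the constant is the number of unique ids.
one_based = set(generate_tenant_ids(10_000, 1, NUM_TENANTS))

# With min = 0 and the same max, there is one more unique id than the constant.
zero_based = set(generate_tenant_ids(10_000, 0, NUM_TENANTS))

print(sorted(one_based))
print(sorted(zero_based))
```

Starting at 0 would require a max of `NUM_TENANTS - 1` to keep exactly three tenants, which is why a min of 1 is the simpler way to express it.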

dstorch · 2024-05-08T20:23:48Z

It doesn't look like you have any tasks in the patch yet.

Oops, thanks for the reminder. I just reconfigured the patch to run the multiplanner_compound_index task on the relevant build variants.

jimoleary

The Description is out of step with the test now.

Specifically the 'predicates a small number of fields'. Can you correct that and add something about the selectivity?

The workload is being executed on the following ARM variants:

Classic Query Engine Standalone ARM AWS 2023-11
SBE Standalone ARM AWS 2023-11

These have very similar performance. Do these variants correspond to 'classic' and 'SBE' or 'classic' and 'classic planner + SBE execution'? And what does this mean for the following statement?

  We expect classic to have better latency and throughput than SBE on this workload,
  and we expect the combination of classic planner + SBE execution (PM-3591) to perform about as well as classic.

Also: there's a fairly large drop in performance ... we should make sure the build barons are aware of it beforehand.

dstorch · 2024-05-09T14:52:40Z

The Description is out of step with the test now.

Specifically the 'predicates a small number of fields'. Can you correct that and add something about the selectivity?

I didn't fix this originally because this would conflict with @dpercy's work in #1213. I think it makes more sense to fix the description as part of this patch though, so I've pushed a new commit which does so. @dpercy heads up that this will conflict with your patch.

Do these variants correspond to 'classic' and 'SBE' or 'classic' and 'classic planner + SBE execution'? And what does this mean for the following statement?

Classic Query Engine Standalone ARM AWS 2023-11 will use the classic multi-planner and classic execution. SBE Standalone ARM AWS 2023-11 will also use the classic multi-planner but will use SBE as the execution engine. There used to be a third configuration which used an SBE-specific multi-planner followed by SBE as the execution engine. However, now that the work from PM-3591 (Classic multi-planner with SBE) is done, that third configuration is no longer tested (in either correctness or performance contexts) and should be removed from the code base soon.

Regarding this:

We expect classic to have better latency and throughput than SBE on this workload,
and we expect the combination of classic planner + SBE execution (PM-3591) to perform about as well as classic.

I cleaned up the description to not focus on the differences between the three configurations -- this is an artifact of the context under which the workload was originally developed and seems less relevant as we continue to run this workload going forward. I think we should do similar cleanup across all of the multi-planning workloads as part of #1213. Since both SBE and classic always use the classic multi-planner now, and this is a multi-planning workload, it's not surprising that these two configurations are exhibiting similar performance.

I've re-requested review, please take another look!

dstorch · 2024-05-09T14:55:51Z

Oh, one other thing. @dpercy @jimoleary your comment above made me realize that Classic Query Engine Standalone ARM AWS 2023-11 and SBE Standalone Intel AWS 2023-11 are the only build variants on which we are running the multi-planning workloads. This seems possibly accidental, since all of the AutoRun sections specify all-feature-flags variants. See here, for example:

genny/src/workloads/query/multiplanner/CompoundIndexes.yml

Line 352 in 414f74e

- standalone-all-feature-flags # At time of writing this will enable PM-3591.

Should we file and schedule a ticket to make sure that the multi-planning genny workloads are actually running on all-feature-flags variants in the master branch?

jimoleary · 2024-05-09T15:28:48Z

Should we file and schedule a ticket to make sure that the multi-planning genny workloads are actually running on all-feature-flags variants in the master branch?

Yeah I think so ... we could fix it in #1213 but this PR is fairly targeted and it would be nice to fix this with a ticket.

jimoleary

LGTM with a new ticket for the autoruns and informing the barons of the change in performance.

dstorch · 2024-05-09T17:21:24Z

Thanks for the reviews! The Evergreen patch failed due to t_check_generated_docs failing; I've pushed a commit to check in the necessary changes to the generated documentation. No need to take another look since this change was generated by running ./run-genny generate-docs.

Here's a performance analyzer link for reference: https://performance-analyzer.server-tig.prod.corp.mongodb.com/perf-analyzer-viz/?comparison_id=0af281c3-dcd4-4d70-b441-8bbf4d6b42a2

I've alerted #performance-build-baroning in this thread: https://mongodb.slack.com/archives/C8CT98KL4/p1715274831473279

I filed DEVPROD-7308 about fixing the build variants on which these workloads run and will look for an assignee.

dstorch · 2024-05-09T20:44:56Z

I had to merge with the linting changes recently made in 414f74e#diff-5dd88d22cd5fe6b47267656c0c3d7c1bb1096e1a2f6bc975bce19d57977f57cb. This passes the linter locally now and I will let Evergreen re-run. I've chosen to keep the more compact formatting that the workload had prior to 414f74e#diff-5dd88d22cd5fe6b47267656c0c3d7c1bb1096e1a2f6bc975bce19d57977f57cb, since the linter still seems to be happy with it and I find it a bit more readable.

Auto-merge is enabled, so this will merge if Evergreen remains happy.

PERF-5358 Improve CompoundIndexes.yml workload

882e10b

dstorch requested review from jimoleary and dpercy May 8, 2024 17:49

dstorch requested a review from a team as a code owner May 8, 2024 17:49

jimoleary reviewed May 8, 2024

dstorch requested a review from jimoleary May 8, 2024 20:23

jimoleary reviewed May 9, 2024

improve workload description

b8b8fb6

dstorch requested a review from jimoleary May 9, 2024 14:52

jimoleary approved these changes May 9, 2024

dpercy approved these changes May 9, 2024

fix generated docs

36f81a7

dstorch enabled auto-merge May 9, 2024 18:56

dstorch added 4 commits May 9, 2024 20:08

Merge branch 'master' into dstorch/PERF-5358

14af262

re-autoformat CompountIndexes.yml

51a506c

auto-format again, but better

534518a

fix generated docs

e947a8f

empty commit

cc6b49a

dstorch added this pull request to the merge queue May 9, 2024

Merged via the queue into master with commit c9868ba May 10, 2024
11 checks passed

dstorch deleted the dstorch/PERF-5358 branch May 10, 2024 13:19
