PERF-5358 Improve CompoundIndexes.yml workload #1214

Merged
merged 8 commits into from
May 10, 2024

Conversation

dstorch
Contributor

@dstorch dstorch commented May 8, 2024

Jira Ticket: PERF-5358

What's Changed

There are three related changes:

  1. Change the data so that there are 300,000 documents, with approximately 100,000 associated with each of three tenants. Then change how the "selectivity" value is calculated so that the query returns ~101 results. By looking at the mongod logs, I confirmed that the query previously returned 0 results and now returns 87 (which is much closer to the expected 101).
  2. Change the predicate to have a conjunct for each indexed field. This makes the test much more similar to "Simple.yml", and it's not clear to me why it wasn't written this way in the first place.
  3. Reduce the number of repetitions of each query from 1000 to 500. I did this because adding the additional conjuncts made multi-planning run for significantly longer, so I wanted to avoid making the workload run for longer than necessary.
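
As a rough check on the numbers in item 1, the sketch below works through the arithmetic under assumptions that are not stated in this PR: every indexed field is generated independently and uniformly, and each conjunct is a range predicate that keeps the same fraction of its field. The function names and the field count of 4 are hypothetical and chosen only for illustration; the actual calculation lives in CompoundIndexes.yml and may differ.

```python
import math

DOCS_PER_TENANT = 100_000  # from the description: ~100,000 documents per tenant
TARGET_RESULTS = 101       # from the description: the query should return ~101 results


def per_field_selectivity(docs_per_tenant, target, num_fields):
    """Fraction of each field's range that one conjunct must keep so that the
    conjunction of num_fields independent uniform predicates matches roughly
    `target` documents for a single tenant (hypothetical model)."""
    return (target / docs_per_tenant) ** (1.0 / num_fields)


def expected_matches(docs_per_tenant, selectivity, num_fields):
    """Expected number of matching documents under the independence assumption."""
    return docs_per_tenant * selectivity ** num_fields


def sampling_stddev(docs_per_tenant, target):
    """Standard deviation of the match count when each document matches
    independently with probability target / docs_per_tenant (binomial model)."""
    p = target / docs_per_tenant
    return math.sqrt(docs_per_tenant * p * (1.0 - p))


if __name__ == "__main__":
    num_fields = 4  # hypothetical; the PR does not state the number of indexed fields
    s = per_field_selectivity(DOCS_PER_TENANT, TARGET_RESULTS, num_fields)
    print(f"per-field selectivity: {s:.4f}")
    print(f"expected matches: {expected_matches(DOCS_PER_TENANT, s, num_fields):.1f}")
    print(f"sampling stddev: {sampling_stddev(DOCS_PER_TENANT, TARGET_RESULTS):.1f}")
```

Under this model the match count has a standard deviation of about 10, so an observed 87 is roughly 1.4 standard deviations below the expected 101, which is plausible for randomly generated data.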

Patch Testing Results

https://spruce.mongodb.com/version/663bb9ac889ffa0007942ecb/tasks

@dstorch dstorch requested review from jimoleary and dpercy May 8, 2024 17:49
@dstorch dstorch requested a review from a team as a code owner May 8, 2024 17:49
Contributor

@jimoleary jimoleary left a comment

It doesn't look like you have any tasks in the patch yet.

@@ -278,10 +284,70 @@ Actors:
       collection: *coll
       query: &query {
         Filter: {
-          tenantId: {$eq : 0},
+          tenantId: {$eq : 1},
Contributor

I was wondering why the tenantId changed but you are still picking the tenantId.

Is the overall effect of the query change to roughly retain the selectivity of the original?

You have answered this in the description.

Contributor Author

@dstorch dstorch May 8, 2024

This particular change was because I wanted to have a constant above which determines the number of unique tenantIds, which in this case is 3. The {distribution: uniform, min: x, max: y} generator expects the min and max bounds to be inclusive, so it's easiest to express using a min of 1 rather than 0. That meant I had to update the filter, since we used to generate tenant ids starting at 0.
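
The effect of inclusive bounds can be illustrated with a small Python analogue. This is only a sketch, not the genny generator itself: `random.randint` is used because it also includes both endpoints, and the names `NUM_TENANTS` and `generate_tenant_ids` are hypothetical.

```python
import random

NUM_TENANTS = 3  # stands in for the constant described in the comment above


def generate_tenant_ids(num_docs, num_tenants, seed=0):
    """Draw one tenant id per document from an inclusive uniform range."""
    rng = random.Random(seed)
    # randint(a, b) includes both a and b, so min=1 and max=num_tenants
    # yields exactly num_tenants distinct ids: 1, 2, ..., num_tenants.
    return [rng.randint(1, num_tenants) for _ in range(num_docs)]


if __name__ == "__main__":
    ids = generate_tenant_ids(10_000, NUM_TENANTS)
    print(sorted(set(ids)))
```

Starting at 0 instead would require a max of `NUM_TENANTS - 1` to keep the same number of tenants, which is why a min of 1 is the simpler way to reuse the constant directly. It also means a filter on tenant id 0 would match nothing.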

Contributor Author

dstorch commented May 8, 2024

  It doesn't look like you have any tasks in the patch yet.

Oops, thanks for the reminder. I just reconfigured the patch to run the multiplanner_compound_index task on the relevant build variants.

@dstorch dstorch requested a review from jimoleary May 8, 2024 20:23
Contributor

@jimoleary jimoleary left a comment

The Description is out of step with the test now.

Specifically the 'predicates a small number of fields'. Can you correct that and add something about the selectivity?

The workload is being executed on the following ARM variants:

  • Classic Query Engine Standalone ARM AWS 2023-11
  • SBE Standalone ARM AWS 2023-11

These have very similar performance. Do these variants correspond to 'classic' and 'SBE' or 'classic' and 'classic planner + SBE execution'? And what does this mean for the following statement?

  We expect classic to have better latency and throughput than SBE on this workload,
  and we expect the combination of classic planner + SBE execution (PM-3591) to perform about as well as classic.

Also: there's a fairly large drop in performance ... we should make sure the build barons are aware of it beforehand.

Contributor Author

dstorch commented May 9, 2024

  The Description is out of step with the test now.

  Specifically the 'predicates a small number of fields'. Can you correct that and add something about the selectivity?

I didn't fix this originally because this would conflict with @dpercy's work in #1213. I think it makes more sense to fix the description as part of this patch though, so I've pushed a new commit which does so. @dpercy heads up that this will conflict with your patch.

  Do these variants correspond to 'classic' and 'SBE' or 'classic' and 'classic planner + SBE execution'? And what does this mean for the following statement?

Classic Query Engine Standalone ARM AWS 2023-11 will use the classic multi-planner and classic execution. SBE Standalone ARM AWS 2023-11 will also use the classic multi-planner but will use SBE as the execution engine. There used to be a third configuration which used an SBE-specific multi-planner followed by SBE as the execution engine. However, now that the work from PM-3591 (Classic multi-planner with SBE) is complete, that third configuration is no longer tested (in either correctness or performance contexts) and should be removed from the code base soon.

Regarding this:

  We expect classic to have better latency and throughput than SBE on this workload,
  and we expect the combination of classic planner + SBE execution (PM-3591) to perform about as well as classic.

I cleaned up the description to not focus on the differences between the three configurations -- this is an artifact of the context under which the workload was originally developed and seems less relevant as we continue to run this workload going forward. I think we should do similar cleanup across all of the multi-planning workloads as part of #1213. Since both SBE and classic always use the classic multi-planner now, and this is a multi-planning workload, it's not surprising that these two configurations are exhibiting similar performance.

I've re-requested review, please take another look!

@dstorch dstorch requested a review from jimoleary May 9, 2024 14:52
Contributor Author

dstorch commented May 9, 2024

Oh, one other thing. @dpercy @jimoleary your comment above made me realize that Classic Query Engine Standalone ARM AWS 2023-11 and SBE Standalone Intel AWS 2023-11 are the only build variants on which we are running the multi-planning workloads. This seems possibly accidental, since all of the AutoRun sections specify all-feature-flags variants. See here, for example:

- standalone-all-feature-flags # At time of writing this will enable PM-3591.

Should we file and schedule a ticket to make sure that the multi-planning genny workloads are actually running on all-feature-flags variants in the master branch?

@jimoleary
Contributor

  Should we file and schedule a ticket to make sure that the multi-planning genny workloads are actually running on all-feature-flags variants in the master branch?

Yeah I think so ... we could fix it in #1213 but this PR is fairly targeted and it would be nice to fix this with a ticket.

Contributor

@jimoleary jimoleary left a comment

LGTM with a new ticket for the autoruns and informing the barons of the change in performance.

Contributor Author

dstorch commented May 9, 2024

Thanks for the reviews! The Evergreen patch failed due to t_check_generated_docs failing; I've pushed a commit to check in the necessary changes to the generated documentation. No need to take another look since this change was generated by running ./run-genny generate-docs.

Here's a performance analyzer link for reference: https://performance-analyzer.server-tig.prod.corp.mongodb.com/perf-analyzer-viz/?comparison_id=0af281c3-dcd4-4d70-b441-8bbf4d6b42a2

I've alerted #performance-build-baroning in this thread: https://mongodb.slack.com/archives/C8CT98KL4/p1715274831473279

I filed DEVPROD-7308 about fixing the build variants on which these workloads run and will look for an assignee.

@dstorch dstorch enabled auto-merge May 9, 2024 18:56
Contributor Author

dstorch commented May 9, 2024

I had to merge with the linting changes recently made in 414f74e#diff-5dd88d22cd5fe6b47267656c0c3d7c1bb1096e1a2f6bc975bce19d57977f57cb. This passes the linter locally now and I will let Evergreen re-run. I've chosen to keep the more compact formatting that the workload previously had prior to 414f74e#diff-5dd88d22cd5fe6b47267656c0c3d7c1bb1096e1a2f6bc975bce19d57977f57cb since the linter still seems to be happy with it and I find it to be a bit more readable.

Auto-merge is enabled, so this will merge if Evergreen remains happy.

@dstorch dstorch added this pull request to the merge queue May 9, 2024
Merged via the queue into master with commit c9868ba May 10, 2024
11 checks passed
@dstorch dstorch deleted the dstorch/PERF-5358 branch May 10, 2024 13:19