Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

PERF-5374 Improve comment headers for multiplanner/ workloads #1213

Open
wants to merge 27 commits into
base: master
Choose a base branch
from

Conversation

dpercy
Copy link
Contributor

@dpercy dpercy commented May 7, 2024

Jira Ticket: PERF-5374

Whats Changed

Mostly I updated the first sentence about each test's goal. In some cases I also added some more detail.

In one case, 'NoResults.yml', I realized that the original description was wrong because I hadn't thought through what would actually happen. The results at full-results.ipynb are consistent with the new description.

Patch Testing Results

I have not tested this, because I'm only changing comments.

Mostly I updated the first sentence about each test's goal. In some
cases I also added some more detail.

In one case, 'NoResults.yml', I realized that the original description
was wrong because I hadn't thought through what would actually happen.
The results at [full-results.ipynb](https://github.com/10gen/product-perf-experimentations/blob/master/investigations/PERF-5121-compare-multiplanners/full-results.ipynb)
are consistent with the new description.
@dpercy dpercy requested a review from a team as a code owner May 7, 2024 18:58
@dpercy dpercy requested review from jimoleary and dstorch May 7, 2024 18:58
Copy link
Contributor

@jimoleary jimoleary left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall looks good I've left some small suggestions and I think you'll need to regenerate the docs before you can merge this.

src/workloads/query/multiplanner/CompoundIndexes.yml Outdated Show resolved Hide resolved
src/workloads/query/multiplanner/NoGoodPlan.yml Outdated Show resolved Hide resolved
Copy link
Contributor

@dstorch dstorch left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for taking this on @dpercy. I left a few comments on the specifics.

One overall comment: A lot of the workload descriptions were written in the context of evaluating classic+classic against classic+sbe ("mix") against sbe+sbe. But the pure SBE case is no longer tested and we seem on track for deleting it after shipping 8.0. That means that for most of their lifetime, these workloads will just stand on their own as scenarios to test the performance of multi-planning and will no longer serve the original purpose of comparing the SBE multiplanner against the classic multi-planner. For this purpose, I think it's useful to (at least for the most part) scrub the workload descriptions of commentary related to the behaviors of the SBE runtime planners. What do you think?

src/workloads/query/multiplanner/BlockingSort.yml Outdated Show resolved Hide resolved
src/workloads/query/multiplanner/BlockingSort.yml Outdated Show resolved Hide resolved
src/workloads/query/multiplanner/BlockingSort.yml Outdated Show resolved Hide resolved
src/workloads/query/multiplanner/CompoundIndexes.yml Outdated Show resolved Hide resolved
src/workloads/query/multiplanner/ClusteredCollection.yml Outdated Show resolved Hide resolved
src/workloads/query/multiplanner/NonBlockingVsBlocking.yml Outdated Show resolved Hide resolved
src/workloads/query/multiplanner/NonBlockingVsBlocking.yml Outdated Show resolved Hide resolved
src/workloads/query/multiplanner/UseClusteredIndex.yml Outdated Show resolved Hide resolved
src/workloads/query/multiplanner/VariedSelectivity.yml Outdated Show resolved Hide resolved
src/workloads/query/multiplanner/VariedSelectivity.yml Outdated Show resolved Hide resolved
dpercy and others added 10 commits May 10, 2024 13:44
Co-authored-by: David Storch <dstorch@users.noreply.github.com>
Co-authored-by: David Storch <dstorch@users.noreply.github.com>
Co-authored-by: David Storch <dstorch@users.noreply.github.com>
Co-authored-by: David Storch <dstorch@users.noreply.github.com>
Co-authored-by: David Storch <dstorch@users.noreply.github.com>
@dpercy dpercy requested a review from dstorch June 4, 2024 18:31
Copy link
Contributor Author

@dpercy dpercy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I've addressed everything, and I've just updated this on top of your recent changes @dstorch. Do you want to take another look?

src/workloads/query/multiplanner/NoSuchField.yml Outdated Show resolved Hide resolved
src/workloads/query/multiplanner/NonBlockingVsBlocking.yml Outdated Show resolved Hide resolved
src/workloads/query/multiplanner/NonBlockingVsBlocking.yml Outdated Show resolved Hide resolved
Copy link
Contributor

@dstorch dstorch left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Still a couple small things here.

Note that I'm out on vacation tomorrow and Friday and then we're at SIGMOD. So can this wait until the week of June 17?

src/workloads/query/multiplanner/BlockingSort.yml Outdated Show resolved Hide resolved
src/workloads/query/multiplanner/ClusteredCollection.yml Outdated Show resolved Hide resolved
src/workloads/query/multiplanner/CompoundIndexes.yml Outdated Show resolved Hide resolved
src/workloads/query/multiplanner/CompoundIndexes.yml Outdated Show resolved Hide resolved
src/workloads/query/multiplanner/MultiplannerWithGroup.yml Outdated Show resolved Hide resolved
src/workloads/query/multiplanner/NonBlockingVsBlocking.yml Outdated Show resolved Hide resolved
src/workloads/query/multiplanner/NonBlockingVsBlocking.yml Outdated Show resolved Hide resolved
src/workloads/query/multiplanner/NonBlockingVsBlocking.yml Outdated Show resolved Hide resolved
src/workloads/query/multiplanner/NonBlockingVsBlocking.yml Outdated Show resolved Hide resolved
clustered collection and add a selective predicate on _id, so that the clustered index is a viable candidate plan.
The goal of this test is to exercise multiplanning in the presence of clustered indexes. We
create as many indexes as possible, and run a query that makes all of them eligible, so we get
as many competing plans as possible. The collection is clustered and has very large strings as
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It doesn't look like this note about very large strings is accurate. It looks like we don't explicitly mention the _id field during data generation, so we probably end up with ObjectIds, not large strings.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ping!

@dpercy dpercy requested a review from dstorch June 17, 2024 21:37
Copy link
Contributor

@dstorch dstorch left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This LGTM with a few final thoughts. Don't forget to run run-genny generate-docs again if you make additional changes to the workload descriptions!

@@ -7,10 +7,6 @@ Description: |
multi-planner will behave more optimally than the SBE multiplanner because it will cut off execution
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[optional] We are still comparing the classic multi-planner to the SBE multi-planner here. Could consider cleaning that up.

We expect classic to have better latency and throughput than SBE on this workload,
and we expect the combination of classic planner + SBE execution (PM-3591) to perform about
as well as classic.
This test was created to show how three different multiplanners handle $group.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
This test was created to show how three different multiplanners handle $group.
This test was created to show how the multiplanner handles $group.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, why did the linter complain for this workload but not the others? In your branch, only two of the multiplanner workloads have the keyword specified. I guess you should either add Keywords to all the multi-planner workloads in this patch or file a ticket about doing so later. (If we file a ticket, it's probably something we would stick into the neweng bucket?)

Comment on lines 6 to 7
are very selective (match 0% of the documents). With zero results, we do no hit the EOF optimization
and all competing plans hit the works limit instead of document limit.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wait, reading this again I'm confused about the purpose of this workload. Won't every plan hit EOF immediately because the interval for the index scan is empty? If I'm reading things correctly, this note about not hitting the EOF optimization is wrong.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oops, you're right and we must have discussed this previously, because I had left the same comment on the google doc: https://docs.google.com/document/d/1hBJoIOwDMPZItTCvYTebfbACB62YwvhRCo7pP8Q_iy4/edit?disco=AAABM1bxvks

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If the goal were to hit max works, then a better test would be to have all the indexed predicates select 100% of documents, and have one residual predicate that selects 0%. This is what NoSuchField.yml already does, so this case is covered.

I'll just update the comment here to describe what this file actually does.

clustered collection and add a selective predicate on _id, so that the clustered index is a viable candidate plan.
The goal of this test is to exercise multiplanning in the presence of clustered indexes. We
create as many indexes as possible, and run a query that makes all of them eligible, so we get
as many competing plans as possible. The collection is clustered and has very large strings as
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ping!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants