-
Notifications
You must be signed in to change notification settings - Fork 83
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
PERF-5374 Improve comment headers for multiplanner/ workloads #1213
base: master
Are you sure you want to change the base?
Conversation
Mostly I updated the first sentence about each test's goal. In some cases I also added some more detail. In one case, 'NoResults.yml', I realized that the original description was wrong because I hadn't thought through what would actually happen. The results at [full-results.ipynb](https://github.com/10gen/product-perf-experimentations/blob/master/investigations/PERF-5121-compare-multiplanners/full-results.ipynb) are consistent with the new description.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Overall looks good I've left some small suggestions and I think you'll need to regenerate the docs before you can merge this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for taking this on @dpercy. I left a few comments on the specifics.
One overall comment: A lot of the workload descriptions were written in the context of evaluating classic+classic against classic+sbe ("mix") against sbe+sbe. But the pure SBE case is no longer tested and we seem on track for deleting it after shipping 8.0. That means that for most of their lifetime, these workloads will just stand on their own as scenarios to test the performance of multi-planning and will no longer serve the original purpose of comparing the SBE multiplanner against the classic multi-planner. For this purpose, I think it's useful to (at least for the most part) scrub the workload descriptions of commentary related to the behaviors of the SBE runtime planners. What do you think?
Co-authored-by: David Storch <dstorch@users.noreply.github.com>
Co-authored-by: David Storch <dstorch@users.noreply.github.com>
Co-authored-by: David Storch <dstorch@users.noreply.github.com>
Co-authored-by: David Storch <dstorch@users.noreply.github.com>
Co-authored-by: David Storch <dstorch@users.noreply.github.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think I've addressed everything, and I've just updated this on top of your recent changes @dstorch. Do you want to take another look?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Still a couple small things here.
Note that I'm out on vacation tomorrow and Friday and then we're at SIGMOD. So can this wait until the week of June 17?
clustered collection and add a selective predicate on _id, so that the clustered index is a viable candidate plan. | ||
The goal of this test is to exercise multiplanning in the presence of clustered indexes. We | ||
create as many indexes as possible, and run a query that makes all of them eligible, so we get | ||
as many competing plans as possible. The collection is clustered and has very large strings as |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It doesn't look like this note about very large strings is accurate. It looks like we don't explicitly mention the _id field during data generation, so we probably end up with ObjectIds, not large strings.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ping!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This LGTM with a few final thoughts. Don't forget to run run-genny generate-docs
again if you make additional changes to the workload descriptions!
@@ -7,10 +7,6 @@ Description: | | |||
multi-planner will behave more optimally than the SBE multiplanner because it will cut off execution |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
[optional] We are still comparing the classic multi-planner to the SBE multi-planner here. Could consider cleaning that up.
We expect classic to have better latency and throughput than SBE on this workload, | ||
and we expect the combination of classic planner + SBE execution (PM-3591) to perform about | ||
as well as classic. | ||
This test was created to show how three different multiplanners handle $group. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This test was created to show how three different multiplanners handle $group. | |
This test was created to show how the multiplanner handles $group. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh, why did the linter complain for this workload but not the others? In your branch, only two of the multiplanner workloads have the keyword specified. I guess you should either add Keywords
to all the multi-planner workloads in this patch or file a ticket about doing so later. (If we file a ticket, it's probably something we would stick into the neweng bucket?)
are very selective (match 0% of the documents). With zero results, we do no hit the EOF optimization | ||
and all competing plans hit the works limit instead of document limit. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Wait, reading this again I'm confused about the purpose of this workload. Won't every plan hit EOF immediately because the interval for the index scan is empty? If I'm reading things correctly, this note about not hitting the EOF optimization is wrong.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oops, you're right and we must have discussed this previously, because I had left the same comment on the google doc: https://docs.google.com/document/d/1hBJoIOwDMPZItTCvYTebfbACB62YwvhRCo7pP8Q_iy4/edit?disco=AAABM1bxvks
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If the goal were to hit max works, then a better test would be to have all the indexed predicates select 100% of documents, and have one residual predicate that selects 0%. This is what NoSuchField.yml
already does, so this case is covered.
I'll just update the comment here to describe what this file actually does.
clustered collection and add a selective predicate on _id, so that the clustered index is a viable candidate plan. | ||
The goal of this test is to exercise multiplanning in the presence of clustered indexes. We | ||
create as many indexes as possible, and run a query that makes all of them eligible, so we get | ||
as many competing plans as possible. The collection is clustered and has very large strings as |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ping!
Jira Ticket: PERF-5374
Whats Changed
Mostly I updated the first sentence about each test's goal. In some cases I also added some more detail.
In one case, 'NoResults.yml', I realized that the original description was wrong because I hadn't thought through what would actually happen. The results at full-results.ipynb are consistent with the new description.
Patch Testing Results
I have not tested this, because I'm only changing comments.