PERF-5374 Improve comment headers for multiplanner/ workloads #1213

dpercy · 2024-05-07T18:58:03Z

Jira Ticket: PERF-5374

What's Changed

Mostly I updated the first sentence about each test's goal. In some cases I also added more detail.

In one case, 'NoResults.yml', I realized that the original description was wrong because I hadn't thought through what would actually happen. The results at [full-results.ipynb](https://github.com/10gen/product-perf-experimentations/blob/master/investigations/PERF-5121-compare-multiplanners/full-results.ipynb) are consistent with the new description.

Patch Testing Results

I have not tested this, because I'm only changing comments.

jimoleary

Overall looks good. I've left some small suggestions, and I think you'll need to regenerate the docs before you can merge this.

src/workloads/query/multiplanner/CompoundIndexes.yml

src/workloads/query/multiplanner/NoGoodPlan.yml

dstorch

Thanks for taking this on @dpercy. I left a few comments on the specifics.

One overall comment: A lot of the workload descriptions were written in the context of evaluating classic+classic against classic+sbe ("mix") against sbe+sbe. But the pure SBE case is no longer tested, and we seem on track to delete it after shipping 8.0. That means that for most of their lifetime, these workloads will just stand on their own as scenarios to test the performance of multi-planning and will no longer serve the original purpose of comparing the SBE multi-planner against the classic multi-planner. For this reason, I think it's useful to (at least for the most part) scrub the workload descriptions of commentary related to the behaviors of the SBE runtime planners. What do you think?

src/workloads/query/multiplanner/BlockingSort.yml

src/workloads/query/multiplanner/CompoundIndexes.yml

src/workloads/query/multiplanner/ClusteredCollection.yml

src/workloads/query/multiplanner/NonBlockingVsBlocking.yml

src/workloads/query/multiplanner/UseClusteredIndex.yml

src/workloads/query/multiplanner/VariedSelectivity.yml

dpercy

I think I've addressed everything, and I've just updated this on top of your recent changes @dstorch. Do you want to take another look?

src/workloads/query/multiplanner/NoSuchField.yml

src/workloads/query/multiplanner/NonBlockingVsBlocking.yml

dstorch

Still a couple small things here.

Note that I'm out on vacation tomorrow and Friday and then we're at SIGMOD. So can this wait until the week of June 17?

src/workloads/query/multiplanner/BlockingSort.yml

src/workloads/query/multiplanner/ClusteredCollection.yml

src/workloads/query/multiplanner/CompoundIndexes.yml

src/workloads/query/multiplanner/MultiplannerWithGroup.yml

src/workloads/query/multiplanner/NonBlockingVsBlocking.yml

dstorch · 2024-06-05T17:53:33Z

src/workloads/query/multiplanner/UseClusteredIndex.yml

-   clustered collection and add a selective predicate on _id, so that the clustered index is a viable candidate plan.
+  The goal of this test is to exercise multiplanning in the presence of clustered indexes. We
+  create as many indexes as possible, and run a query that makes all of them eligible, so we get
+  as many competing plans as possible. The collection is clustered and has very large strings as


It doesn't look like this note about very large strings is accurate. It looks like we don't explicitly mention the _id field during data generation, so we probably end up with ObjectIds, not large strings.

dstorch

This LGTM with a few final thoughts. Don't forget to run `run-genny generate-docs` again if you make additional changes to the workload descriptions!

dstorch · 2024-06-18T13:06:36Z

src/workloads/query/multiplanner/MultikeyIndexes.yml

@@ -7,10 +7,6 @@ Description: |
  multi-planner will behave more optimally than the SBE multiplanner because it will cut off execution


[optional] We are still comparing the classic multi-planner to the SBE multi-planner here. Could consider cleaning that up.

dstorch · 2024-06-18T13:07:16Z

src/workloads/query/multiplanner/MultiplannerWithGroup.yml

-  We expect classic to have better latency and throughput than SBE on this workload,
-  and we expect the combination of classic planner + SBE execution (PM-3591) to perform about
-  as well as classic.
+  This test was created to show how three different multiplanners handle $group.


Suggested change

-  This test was created to show how three different multiplanners handle $group.
+  This test was created to show how the multiplanner handles $group.

dstorch · 2024-06-18T13:10:13Z

src/workloads/query/multiplanner/NoSuchField.yml

Oh, why did the linter complain for this workload but not the others? In your branch, only two of the multiplanner workloads have the keyword specified. I guess you should either add Keywords to all the multi-planner workloads in this patch or file a ticket about doing so later. (If we file a ticket, it's probably something we would stick into the neweng bucket?)
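
For illustration, here is a minimal sketch of the kind of check that would catch the inconsistency described above. It is not the actual genny linter: `missing_keywords`, the sample file names, and the sample YAML text are all hypothetical, and the only assumption taken from this thread is that a workload declares a top-level `Keywords` key.

```python
import re

# Hypothetical helper, not part of genny. It only looks for a top-level
# `Keywords:` key, using a line-based match so that the sketch needs no
# YAML library. An indented "Keywords:" inside a Description block scalar
# does not count.
_KEYWORDS_KEY = re.compile(r"^Keywords:", re.MULTILINE)


def missing_keywords(workloads):
    """Return the sorted names of workloads whose text has no top-level Keywords key.

    `workloads` maps a file name to the YAML text of that file.
    """
    return sorted(
        name for name, text in workloads.items() if not _KEYWORDS_KEY.search(text)
    )


# Made-up sample data: A.yml declares Keywords, B.yml only mentions the
# word inside its Description.
sample = {
    "A.yml": "Description: |\n  Some text.\nKeywords:\n  - multiplanner\n",
    "B.yml": "Description: |\n  Keywords: mentioned in prose only.\n",
}

print(missing_keywords(sample))
```

Running a check like this over every file in the directory would report all workloads without the key in one pass, instead of only the files a patch happens to touch.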

dstorch · 2024-06-18T13:15:52Z

src/workloads/query/multiplanner/NoResults.yml

  are very selective (match 0% of the documents). With zero results, we do no hit the EOF optimization
  and all competing plans hit the works limit instead of document limit.


Wait, reading this again I'm confused about the purpose of this workload. Won't every plan hit EOF immediately because the interval for the index scan is empty? If I'm reading things correctly, this note about not hitting the EOF optimization is wrong.

dpercy

Oops, you're right and we must have discussed this previously, because I had left the same comment on the google doc: https://docs.google.com/document/d/1hBJoIOwDMPZItTCvYTebfbACB62YwvhRCo7pP8Q_iy4/edit?disco=AAABM1bxvks

If the goal were to hit max works, then a better test would be to have all the indexed predicates select 100% of documents, and have one residual predicate that selects 0%. This is what NoSuchField.yml already does, so this case is covered.

I'll just update the comment here to describe what this file actually does.
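
To make the distinction in this thread concrete, here is a toy model of a multi-planning trial, written as a sketch under stated assumptions rather than as the server's actual algorithm. `ToyPlan`, `run_trial`, and the default limits are placeholders invented for the example. Each plan is modeled as an index scan that visits one key per unit of work, followed by a residual filter.

```python
from dataclasses import dataclass


@dataclass
class ToyPlan:
    name: str
    keys_in_bounds: int         # index keys that fall inside the scan bounds
    residual_match_rate: float  # fraction of scanned keys passing the residual filter


def run_trial(plans, max_works=10_000, max_results=101):
    """Advance all plans round-robin; return (stop_reason, works_used).

    The limits are illustrative placeholders, not values read from the server.
    """
    examined = {p.name: 0 for p in plans}
    for work in range(1, max_works + 1):
        for p in plans:
            # A plan with nothing left to scan reports EOF, ending the trial.
            if examined[p.name] >= p.keys_in_bounds:
                return ("EOF", work)
            examined[p.name] += 1
            # Deterministic stand-in for "how many results so far".
            if int(examined[p.name] * p.residual_match_rate) >= max_results:
                return ("max_results", work)
    return ("works_limit", max_works)


# Indexed predicates match 0% of documents: the bounds are empty.
empty_bounds = [ToyPlan("a", 0, 1.0), ToyPlan("b", 0, 1.0)]

# Indexed predicates match 100%, one residual predicate matches 0%.
residual_rejects_all = [ToyPlan("a", 1_000_000, 0.0), ToyPlan("b", 1_000_000, 0.0)]

print(run_trial(empty_bounds))
print(run_trial(residual_rejects_all))
```

In this model the empty-bounds case stops on the first unit of work with EOF, while the residual-predicate case never produces a result and runs until the works limit, which is the behavior the thread attributes to NoSuchField.yml.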

dstorch · 2024-06-18T13:20:53Z

src/workloads/query/multiplanner/UseClusteredIndex.yml

-   clustered collection and add a selective predicate on _id, so that the clustered index is a viable candidate plan.
+  The goal of this test is to exercise multiplanning in the presence of clustered indexes. We
+  create as many indexes as possible, and run a query that makes all of them eligible, so we get
+  as many competing plans as possible. The collection is clustered and has very large strings as


dpercy requested a review from a team as a code owner May 7, 2024 18:58

dpercy requested review from jimoleary and dstorch May 7, 2024 18:58

dpercy assigned dstorch May 7, 2024

jimoleary reviewed May 8, 2024

dstorch requested changes May 8, 2024

dstorch mentioned this pull request May 9, 2024

PERF-5358 Improve CompoundIndexes.yml workload #1214

Merged

dpercy and others added 10 commits May 10, 2024 13:44

typo of -> or

9ac18f3

Co-authored-by: David Storch <dstorch@users.noreply.github.com>

no need to split on "classic and SBE"

727c8f5

Co-authored-by: David Storch <dstorch@users.noreply.github.com>

typo "prepence" -> presence

3966b55

Co-authored-by: David Storch <dstorch@users.noreply.github.com>

don't split on "choice of multiplanner"

608dbb0

Co-authored-by: David Storch <dstorch@users.noreply.github.com>

collectionSize one word

46a82aa

Co-authored-by: David Storch <dstorch@users.noreply.github.com>

Merge branch 'master' into PERF-5374-comments

d813797

state explicitly ClusteredCollection doesn't do a clustered scan

bb2a774

other -> remaining

dc9b3af

however, however however; however.

70fd686

equally bad

9953a7b

dpercy requested a review from dstorch June 4, 2024 18:31

dpercy commented Jun 5, 2024

dstorch requested changes Jun 5, 2024

dpercy added 9 commits June 17, 2024 17:31

Merge remote-tracking branch 'origin/master' into PERF-5374-comments

9948cf8

fix bad merge

d691dd8

remove "We expect ..." comment about SBE multiplanner

e12b6eb

avoid "empty data" wording

2b0b7bb

typo

c9d3933

update docs

d0b801d

rephrase SBE multiplanner as "historical"

69cbdc6

note about residual selectivity

8524dec

update docs

d61a02b

dpercy requested a review from dstorch June 17, 2024 21:37

appease linter by adding keyword to unchanged file

e52494b

dstorch approved these changes Jun 18, 2024

dpercy added 6 commits June 18, 2024 15:28

dont compare

d5c7d76

the multiplanner handles group

1830416

empty bounds

42a536b

not large strings

b5ba2ba

update docs

539fad1

Merge branch 'master' into PERF-5374-comments

324a0c3
