New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
opt: reduce planning time for queries with many joins #114623
Conversation
This commit moves the schemas used for slow-query benchmarks into a separate SQL file so that the schemas can continue to grow without making a mess of `bench_test.go`. Release note: None
Release note: None
`BenchmarkSlowQueries` now runs each slow query benchmark with `reorder_join_limit` set to `0` and `8`. This will allow us to analyze the range of query optimization performance with respect to this setting which is commonly adjusted to reduce planning latency. Release note: None
Prior to this commit, `FuncDepSet.tryToReduceKey` was called at the end of `FuncDepSet.addEquivalency`. This was inefficient because `FuncDepSet.AddFrom` and `FuncDepSet.AddEquivFrom` can call `addEquivalency` multiple times, and the FD's key only needs to be reduced after all equivalencies and dependencies have been added. Now, `tryToReduceKey` is no longer called in `addEquivalency` and callers are responsible for ensuring that it is called. This mimics the behavior of the similar function `addDependency`. Release note: None
Your pull request contains more than 1000 changes. It is strongly encouraged to split big PRs into smaller chunks. 🦉 Hoot! I am a Blathers, a bot for CockroachDB. My owner is dev-inf. |
Benchmark results comparing master to the combination of all commits here on a GCE worker are below. I was mainly focusing on the profile from
|
This commit adds a new method `FuncDepSet.addConstantsNoKeyReduction` which is the same as `FuncDepSet.AddConstants`, only differing because it not attempt to reduce the key like `AddConstants does. This allows us to avoid key-reducing computation in a few places. Release note: None
This commit changes the invocation of `ComputeEquivClosure` within `FuncDepSet.addEquivalency` to `ComputeEquivClosureNoCopy`. This is valid because `addEquivalency` is only called with sets that have already been copied, so mutating the input set is safe. Release note: None
a6bbdf0
to
a3a8f74
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @DrewKimball, @mgartner, and @michae2)
pkg/sql/opt/bench/bench_test.go
line 1459 at r8 (raw file):
b.Run(query.name, func(b *testing.B) { for _, reorderJoinLimit := range []int64{0, 8} { b.Run(fmt.Sprintf("reorder-join-%d", reorderJoinLimit), func(b *testing.B) {
Note that changing the benchmark names will make it harder to compare against old SHAs that use the old name. If we plan to backport this to at least 23.2, then it seems fine.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 5 of 5 files at r1, 2 of 2 files at r2, 1 of 1 files at r3, 2 of 2 files at r4, 1 of 1 files at r5, 1 of 1 files at r6, 1 of 1 files at r7, 3 of 3 files at r8, all commit messages.
Reviewable status: complete! 1 of 0 LGTMs obtained (waiting on @mgartner and @michae2)
-- commits
line 6 at r1:
Thanks for doing this!
pkg/sql/opt/memo/statistics_builder.go
line 1655 at r7 (raw file):
switch joinType { case opt.LeftJoinOp, opt.LeftJoinApplyOp: // if right cols is not empty, then we'll add
[nit] This looks incomplete.
This commit reduces allocation in `statisticsBuilder.colStatJoin`. Previously, it was creating intersections of two sets, which, in some cases, were only useful for checking their emptiness. Now we use the `ColSet.Intersects` method which returns a boolean and does not build a new set. Release note: None
This commit reduces allocations in the statistics builder by avoiding creating singleton column sets. A new `ColStatsMap.LookupSingleton` method allows columns stats for a single column to be fetched without building a column set and potentially allocating memory. Release note: None
a3a8f74
to
97112ff
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: complete! 1 of 0 LGTMs obtained (waiting on @DrewKimball, @michae2, and @yuzefovich)
pkg/sql/opt/memo/statistics_builder.go
line 1655 at r7 (raw file):
Previously, DrewKimball (Drew Kimball) wrote…
[nit] This looks incomplete.
Removed.
pkg/sql/opt/bench/bench_test.go
line 1459 at r8 (raw file):
Previously, yuzefovich (Yahor Yuzefovich) wrote…
Note that changing the benchmark names will make it harder to compare against old SHAs that use the old name. If we plan to backport this to at least 23.2, then it seems fine.
One option would be to make a BenchmarkSlowQueries and a BenchmarkSlowQueriesNew where the former always uses the same benchmark names as the previous version. But backporting might be simpler, and I don't see any harm in doing so — it's not a bad time to do it since we're about to cut a release and we've already verified the regressions for this benchmark.
TFTRs! I'm tentatively adding the backport labels so this isn't lost, though I understand there's non-zero risk in backporting these. One thing that would be wise if we do backport is to run the test suite with both bors r+ |
Build failed (retrying...): |
Build succeeded: |
Encountered an error creating backports. Some common things that can go wrong:
You might need to create your backport manually using the backport tool. error creating merge commit from 09bab66 to blathers/backport-release-23.1-114623: POST https://api.github.com/repos/cockroachdb/cockroach/merges: 409 Merge conflict [] you may need to manually resolve merge conflicts with the backport tool. Backport to branch 23.1.x failed. See errors above. error creating merge commit from 09bab66 to blathers/backport-release-23.2-114623: POST https://api.github.com/repos/cockroachdb/cockroach/merges: 409 Merge conflict [] you may need to manually resolve merge conflicts with the backport tool. Backport to branch 23.2.x failed. See errors above. 🦉 Hoot! I am a Blathers, a bot for CockroachDB. My owner is dev-inf. |
If we backport the last commit here with |
opt: move slow-query benchmark schemas to SQL file
This commit moves the schemas used for slow-query benchmarks into a
separate SQL file so that the schemas can continue to grow without
making a mess of
bench_test.go
.Release note: None
opt: add slow-query-6
Release note: None
opt: add two reorder_join_limit cases for BenchmarkSlowQueries
BenchmarkSlowQueries
now runs each slow query benchmark withreorder_join_limit
set to0
and8
. This will allow us to analyzethe range of query optimization performance with respect to this setting
which is commonly adjusted to reduce planning latency.
Release note: None
opt: avoid FuncDepSet.tryToReduceKey in FuncDepSet.addEquivalency
Prior to this commit,
FuncDepSet.tryToReduceKey
was called at the endof
FuncDepSet.addEquivalency
. This was inefficient becauseFuncDepSet.AddFrom
andFuncDepSet.AddEquivFrom
can calladdEquivalency
multiple times, and the FD's key only needs to bereduced after all equivalencies and dependencies have been added. Now,
tryToReduceKey
is no longer called inaddEquivalency
and callers areresponsible for ensuring that it is called. This mimics the behavior of
the similar function
addDependency
.Release note: None
opt: add
FuncDepSet.addConstantsNoKeyReduction
This commit adds a new method
FuncDepSet.addConstantsNoKeyReduction
which is the same as
FuncDepSet.AddConstants
, only differing becauseit not attempt to reduce the key like `AddConstants does. This allows us
to avoid key-reducing computation in a few places.
Release note: None
opt: use
ComputeEquivClosureNoCopy
inFuncDepSet.addEquivalency
This commit changes the invocation of
ComputeEquivClosure
withinFuncDepSet.addEquivalency
toComputeEquivClosureNoCopy
. This isvalid because
addEquivalency
is only called with sets that havealready been copied, so mutating the input set is safe.
Release note: None
opt: avoid ColSet allocations in statisticsBuilder.colStatJoin
This commit reduces allocation in
statisticsBuilder.colStatJoin
.Previously, it was creating intersections of two sets, which, in some
cases, were only useful for checking their emptiness. Now we use the
ColSet.Intersects
method which returns a boolean and does notbuild a new set.
Release note: None
opt: reduce allocations in statistic builder
This commit reduces allocations in the statistics builder by avoiding
creating singleton column sets. A new
ColStatsMap.LookupSingleton
method allows columns stats for a single column to be fetched without
building a column set and potentially allocating memory.
Epic: None
Release note: None