New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
sql: fast path uniqueness checks for single-row insert #102995
Conversation
f888d7b
to
8f6b764
Compare
1475d2a
to
4748ce3
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 11 of 14 files at r1, 1 of 2 files at r3.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @msirek and @rytaft)
pkg/sql/insert_fast_path.go
line 191 at r1 (raw file):
// Combine them together to get the final index prefix key to search. for _, templateRow := range c.DatumsFromConstraint { // Each row must have its own memory as it is part of a batch.
I'm confused by this - I'd assume that the BatchRequest would only contain the span generated based on the combined row which should internally copy datums into roachpb.Key
s, no?
pkg/sql/insert_fast_path.go
line 487 at r3 (raw file):
// Perform the FK checks. // TODO(radu): we could run the FK batch in parallel with the main batch (if
nit: do you plan on addressing this TODO as well? It seems like it'd be nice to do it since you're in this area.
pkg/sql/opt/exec/factory.opt
line 468 at r1 (raw file):
# - all FK checks can be performed using direct lookups into unique indexes. # - all UNIQUE WITHOUT INDEX checks can be performed using direct lookups # - into an index (could be a key prefix and not the entire key).
nit: probably we don't want -
before "into an index"?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @rytaft and @yuzefovich)
pkg/sql/insert_fast_path.go
line 191 at r1 (raw file):
Previously, yuzefovich (Yahor Yuzefovich) wrote…
I'm confused by this - I'd assume that the BatchRequest would only contain the span generated based on the combined row which should internally copy datums into
roachpb.Key
s, no?
Good catch. I updated the logic to allocate the combined row once, and updated the comment.
pkg/sql/insert_fast_path.go
line 487 at r3 (raw file):
Previously, yuzefovich (Yahor Yuzefovich) wrote…
nit: do you plan on addressing this TODO as well? It seems like it'd be nice to do it since you're in this area.
There was more work I wanted to do in uniqueness checks in general, to support more cases (more forms of INSERT statements). But I cut it short in favor of switching to partial stats work. If we were to defer the partial stats work a bit longer, I'd rather give priority to those missing cases above this TODO. Also, only the FK checks could be done in parallel with the insert, because uniqueness checks could be trying to read from the same index as being inserted, and we don't want to "see" the newly-inserted row. So, in that case we'd have to run the uniqueness check before the insert(s) anyway. So, this would be an enhancement for cases which only have FK checks and no uniqueness checks, which this PR is not covering.
pkg/sql/opt/exec/factory.opt
line 468 at r1 (raw file):
Previously, yuzefovich (Yahor Yuzefovich) wrote…
nit: probably we don't want
-
before "into an index"?
Fixed
The last 2 pushes update the code to the latest master and add the Locking attribute to fast path checks, copied over from the Scan which the check was built upon. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Wow, this is impressive work! I've reviewed 5/9 commits so far with mostly just clarification questions to understand it better. If you wanted to I think you could split some of the commits off into their own PRs (e.g., the Autocommit commit) but I'm not sure that would make the whole thing more manageable, and splitting off some of the meaty commits would lose some comment history, so not sure that's desirable.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @mgartner, @michae2, @msirek, @rytaft, and @yuzefovich)
pkg/sql/explain_bundle.go
line 338 at r57 (raw file):
memo.ExprFmtHideQualifications|memo.ExprFmtHideScalars|memo.ExprFmtHideTypes|memo.ExprFmtHideNotVisibleIndexInfo|memo.ExprFmtHideFastPathChecks, )) b.z.AddFile("opt-vv.txt", formatOptPlan(memo.ExprFmtHideQualifications|memo.ExprFmtHideNotVisibleIndexInfo|memo.ExprFmtHideFastPathChecks))
Why omit it from opt-vv.txt? Seems like it would be useful to include the plan for the fast path check in stmt bundles. If it will be used for some checks in production, how will we debug if it turns out there are issues with the fast path?
Including it would also be consistent with the treatment of EXPLAIN(TYPES)
below.
pkg/sql/opt/exec/factory.go
line 256 at r59 (raw file):
// ConstructInsertFastPath). It identifies the index into which we can perform // the lookup. type InsertFastPathFKUniqCheck struct {
optional nit: maybe rename this and others to InsertFastPathCheck
instead.
pkg/sql/opt/memo/interner.go
line 1197 at r48 (raw file):
} for i := range l { if l[i].ReferencedTableID != r[i].ReferencedTableID {
Should we also use h.IsRelExprEqual
to compare the Check
RelExpr
, too?
pkg/sql/opt/optbuilder/testdata/unique-checks-insert
line 2134 at r57 (raw file):
│ └── t.a:24 = 10 └── fast-path-unique-checks-item: &{0 0 [] [] { record best-effort}} └── values
Is this arm of the explain tree from making a non-nil fast path so we don't break RelExpr
? Trying to understand what
fast-path-unique-checks-item: &{0 0 [] [] { record best-effort}}
└── values
means, and why we have both this and a fleshed out fast path check above it.
pkg/sql/opt/xform/testdata/rules/insert
line 187 at r58 (raw file):
│ │ ├── columns: t.k:26!null t.r:27!null t.c:30 │ │ ├── constraint: /27/30/26 │ │ │ ├── [/'east' - /'east']
I thought that fast path unique checks were characterized by a single KV lookup, but this looks like at least 2 KV look ups (for east and west) plus an index join. Am I not reading or understanding this right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've reviewed the remaining files and don't have any additional feedback, but I'd like to check out the benchmark results (are the ones in the commit message up-to-date?) and have responses to the open comments before giving a stamp of approval.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @mgartner, @michae2, @msirek, @rytaft, and @yuzefovich)
pkg/sql/opt/bench/uniq_test.go
line 54 at r63 (raw file):
} func BenchmarkUniqInsert(b *testing.B) {
This might have gotten lost in the sea of comments, but could you please re-share the latest benchmark results? Thanks!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've left my comments for the first commit. The first commit also needs:
- optbuilder tests that show the new expressions being built for the cases covered by the new code.
- norm tests for the new pruning normalization rules.
- TestInterner tests for the new checks in interner_test.go - this might have caught the missing check that Rachael found
If you wanted to I think you could split some of the commits off into their own PRs
+1 building this incrementally would speed up time-to-merge. It doesn't help that Reviewable is struggling with so many commit revisions.
Reviewed 3 of 28 files at r32, 1 of 3 files at r38, 37 of 37 files at r40, 3 of 5 files at r47, 49 of 49 files at r48.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @michae2, @msirek, and @yuzefovich)
pkg/sql/opt/norm/rules/prune_cols.opt
line 542 at r40 (raw file):
) # PruneReturningCols removes columns from the Insert operator's ReturnCols
nit: PruneInsertReturnCols
pkg/sql/opt/ops/mutation.opt
line 302 at r40 (raw file):
# columns for a uniqueness check, in the order of the ReferencedIndexOrdinal # columns. For each, the value in the array indicates the index of the # column in the input table.
nit: This last sentence confuses me. A ColList is a list of column IDs, not ordinals. I believe that these column IDs contain the values being inserted, and the position in this list corresponds with the ordinal of the referenced index columns. Is that correct?
pkg/sql/opt/ops/mutation.opt
line 305 at r40 (raw file):
InsertCols ColList # DatumsFromConstraint contains constant values from the WHERE clause
This WHERE
clause is from partial unique constraints? Or somewhere else?
pkg/sql/opt/optbuilder/mutation_builder_unique.go
line 501 at r34 (raw file):
Previously, msirek (Mark Sirek) wrote…
It may make things more complicated if we try to specifically search for and process a projection of a virtual computed column, and my understanding is we wish to target a very narrow set of cases in the first pass. In some cases, where we build a Values expression, the projection may be merged with the Values by a rewrite, in which case we'll find the value and handle it. In other cases where we have a WithScanExpr of a Values expression, and a projection, the projection may not be merged. I think that if and when foreign key fast path checking code gets refactored, the need to build a WithScanExpr may go away and the current simple handling will also handle those cases.
It sounds like calling
uniqueChecksHelper.buildTableScan
is preferred overDuplicateScanPrivate
though, so I've updated the code to use the former.
I was genuinely curious, not necessarily preferring buildTableScan
. DuplicateScanPrivate would be fine and seems simpler.
pkg/sql/opt/optbuilder/mutation_builder_unique.go
line 96 at r40 (raw file):
// any existing rows. The check prevents rows from matching themselves by // adding a filter based on the primary key. uniqueCheckItems, _ := h.buildInsertionCheck()
Can you leave a TODO to avoid the extra work of building the fast paths for this case?
pkg/sql/opt/optbuilder/mutation_builder_unique.go
line 136 at r40 (raw file):
// any existing rows. The check prevents rows from matching themselves by // adding a filter based on the primary key. uniqueCheckItems, _ := h.buildInsertionCheck()
ditto: add a TODO here to avoid the extra work of building the fast paths for this case
pkg/sql/opt/optbuilder/mutation_builder_unique.go
line 334 at r40 (raw file):
if withScanScopeIsValues { valuesExpr = withScanScopeValues isValues = true
nit: simplify this to
withScanExpr, isWithScan := withScanScope.expr.(*memo.WithScanExpr)
values, isValues := withScanScope.expr.(*memo.ValuesExpr)
pkg/sql/opt/optbuilder/mutation_builder_unique.go
line 428 at r40 (raw file):
)) } // Find the ScanExpr which reads from the table this unique check applies to.
nit: mention what cases there would be a project here and why it's safe to skip over the projects - I presume because we won't support tables with computed columns yet?
pkg/sql/opt/optbuilder/mutation_builder_unique.go
line 513 at r40 (raw file):
} else { // Things blow up if a RelExpr is nil, so construct a minimal dummy relation // that will not be used.
Can we just return nil
as the second return value of this function instead?
pkg/sql/opt/optbuilder/mutation_builder_unique.go
line 315 at r48 (raw file):
// path uniqueness check). func (h *uniqueCheckHelper) buildFiltersForFastPathCheck( withScanScope *scope, scanExpr *memo.ScanExpr,
This deserves some more explanation of how it works. It looks like you're either pulling out a Values expression from the unique check that's already been inlined, or you're trying to find a Values expression in the Insert's input. Is that correct?
Also, I think the naming like withScanScope
is adding confusion here because it's not necessarily a WithScan expression. You could consider the names uniqueCheckScope
. And insertInputValues
instead of possibleValues
, for example.
pkg/sql/opt/optbuilder/mutation_builder_unique.go
line 323 at r48 (raw file):
// Skip to the WithScan or Values. for skipProjectExpr, ok := possibleValues.(*memo.ProjectExpr); ok; skipProjectExpr, ok = possibleValues.(*memo.ProjectExpr) { possibleValues = skipProjectExpr.Input
This code relies on h.mb.outScope.expr
to be the input of the insert expression, correct? That invariant should be made clear, at the least in a comment somewhere. Even better would be an assertion of some type to check this, or to use another method to find the Insert's input, though I'm not sure at the moment what that would be. Maybe it'd be possible to store a pointer to that expression when it is built in mutationBuilder
and access it directly.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @mgartner, @michae2, @rharding6373, and @yuzefovich)
pkg/sql/explain_bundle.go
line 338 at r57 (raw file):
Previously, rharding6373 (Rachael Harding) wrote…
Why omit it from opt-vv.txt? Seems like it would be useful to include the plan for the fast path check in stmt bundles. If it will be used for some checks in production, how will we debug if it turns out there are issues with the fast path?
Including it would also be consistent with the treatment of
EXPLAIN(TYPES)
below.
Good point. Modified it to include fast path check expressions.
pkg/sql/opt/exec/factory.go
line 256 at r59 (raw file):
Previously, rharding6373 (Rachael Harding) wrote…
optional nit: maybe rename this and others to
InsertFastPathCheck
instead.
Done
pkg/sql/opt/memo/interner.go
line 1197 at r48 (raw file):
Previously, rharding6373 (Rachael Harding) wrote…
Should we also use
h.IsRelExprEqual
to compare theCheck
RelExpr
, too?
Done
pkg/sql/opt/norm/rules/prune_cols.opt
line 542 at r40 (raw file):
Previously, mgartner (Marcus Gartner) wrote…
nit: PruneInsertReturnCols
Done
pkg/sql/opt/ops/mutation.opt
line 302 at r40 (raw file):
Previously, mgartner (Marcus Gartner) wrote…
nit: This last sentence confuses me. A ColList is a list of column IDs, not ordinals. I believe that these column IDs contain the values being inserted, and the position in this list corresponds with the ordinal of the referenced index columns. Is that correct?
Reworded this to:
# InsertCols contains the table column ordinals of the referenced index key
# columns. The position in this list corresponds with the ordinal of the
# referenced index column (its position in the index key).
pkg/sql/opt/ops/mutation.opt
line 305 at r40 (raw file):
Previously, mgartner (Marcus Gartner) wrote…
This
WHERE
clause is from partial unique constraints? Or somewhere else?
Mention of a WHERE clause is confusing, perhaps. The values are from the insert row or check constraint. Updated it to:
# DatumsFromConstraint contains constant values from the insert row for the
# columns in the unique constraint. Columns not available directly from the
# insert row or computed from insert row values (computed column) may be
# filled in from a CHECK constraint on the column. The number of entries
# corresponds with the number of KV lookups. For example, when built from
# analyzing a locality-optimized operation accessing 1 local region and 2
# remote regions, the resulting DatumsFromConstraint would have 3 entries.
pkg/sql/opt/optbuilder/mutation_builder_unique.go
line 501 at r34 (raw file):
Previously, mgartner (Marcus Gartner) wrote…
I was genuinely curious, not necessarily preferring
buildTableScan
. DuplicateScanPrivate would be fine and seems simpler.
OK. Looking at this again, it seems like calling buildTableScan
may be more robust as it will always build the correct fields under the Scan, and DuplicateScanPrivate is not coded to copy all scan private fields.
pkg/sql/opt/optbuilder/mutation_builder_unique.go
line 96 at r40 (raw file):
Previously, mgartner (Marcus Gartner) wrote…
Can you leave a TODO to avoid the extra work of building the fast paths for this case?
Added a buildFastPathCheck parameter to buildInsertionCheck to handle this.
pkg/sql/opt/optbuilder/mutation_builder_unique.go
line 136 at r40 (raw file):
Previously, mgartner (Marcus Gartner) wrote…
ditto: add a TODO here to avoid the extra work of building the fast paths for this case
Added a buildFastPathCheck parameter to buildInsertionCheck to handle this.
pkg/sql/opt/optbuilder/mutation_builder_unique.go
line 334 at r40 (raw file):
Previously, mgartner (Marcus Gartner) wrote…
nit: simplify this to
withScanExpr, isWithScan := withScanScope.expr.(*memo.WithScanExpr) values, isValues := withScanScope.expr.(*memo.ValuesExpr)
Done
pkg/sql/opt/optbuilder/mutation_builder_unique.go
line 428 at r40 (raw file):
Previously, mgartner (Marcus Gartner) wrote…
nit: mention what cases there would be a project here and why it's safe to skip over the projects - I presume because we won't support tables with computed columns yet?
OK, added:
// Projections may have been added such as for computed column expressions.
// Skip over these to find the Scan on the target table of the Insert.
// Fast path uniqueness checks on computed columns aren't supported
// currently. We just need to access the Scan defining regular columns.
pkg/sql/opt/optbuilder/mutation_builder_unique.go
line 513 at r40 (raw file):
Previously, mgartner (Marcus Gartner) wrote…
Can we just return
nil
as the second return value of this function instead?
The 2nd return value is not a pointer, so nil can't be returned. Modified it so we return memo.FastPathUniqueChecksItem{}
when buildFastPathCheck is false. When it's true we still need to build a valid relation that can be optimized, otherwise we'll hit an error during exploration.
pkg/sql/opt/optbuilder/mutation_builder_unique.go
line 315 at r48 (raw file):
Previously, mgartner (Marcus Gartner) wrote…
This deserves some more explanation of how it works. It looks like you're either pulling out a Values expression from the unique check that's already been inlined, or you're trying to find a Values expression in the Insert's input. Is that correct?
Also, I think the naming like
withScanScope
is adding confusion here because it's not necessarily a WithScan expression. You could consider the namesuniqueCheckScope
. AndinsertInputValues
instead ofpossibleValues
, for example.
Renamed items as suggested. Added more details to the description and more comments. Let me know if more details are needed of if anything is not clear:
// buildFiltersForFastPathCheck builds ANDed equality filters between the
// columns in the uniqueness check defined by h.uniqueOrdinals and scalar
// expressions present in a single Values row being inserted. It is expected
// that buildCheckInputScan has been called and has set up in
// uniqueCheckExpr the columns corresponding with the scalars in the
// insert row. buildCheckInputScan has either inlined the insert row as a Values
// expression, or embedded it within a WithScanExpr, in which case `h.mb.inputForInsertExpr`
// holds the input to the WithScanExpr. In the latter case, for a
// given table column ordinal `i` in `h.uniqueOrdinals`, instead of finding the
// matching scalar in the Values row via uniqueCheckCols[i],
// withScanExpr.InCols[i] holds the column ID to match on. scanExpr is
// the scan on the insert target table used on the right side of the semijoins
// in the non-fast-path uniqueness checks, with column ids matching h.scanScope.cols.
//
// The purpose of this function is to build filters representing a uniqueness
// check on a given insert row, which can be applied as a Select from a Scan,
// and optimized during exploration when all placeholders have been filled in.
// The goal is to find a constrained Scan of an index, which consumes all
// filters (meaning it could also be safely executed via a KV lookup in a fast
// path uniqueness check).
pkg/sql/opt/optbuilder/mutation_builder_unique.go
line 323 at r48 (raw file):
Previously, mgartner (Marcus Gartner) wrote…
This code relies on
h.mb.outScope.expr
to be the input of the insert expression, correct? That invariant should be made clear, at the least in a comment somewhere. Even better would be an assertion of some type to check this, or to use another method to find the Insert's input, though I'm not sure at the moment what that would be. Maybe it'd be possible to store a pointer to that expression when it is built inmutationBuilder
and access it directly.
I guess there's no guarantee someone won't re-use h.mb.outScope.expr
, so I've created a inputForInsertExpr
in mutationBuilder
to guarantee it won't change since the most recent call to buildInputForInsert
and added a comment.
pkg/sql/opt/optbuilder/testdata/unique-checks-insert
line 2134 at r57 (raw file):
Previously, rharding6373 (Rachael Harding) wrote…
Is this arm of the explain tree from making a non-nil fast path so we don't break
RelExpr
? Trying to understand whatfast-path-unique-checks-item: &{0 0 [] [] { record best-effort}} └── values
means, and why we have both this and a fleshed out fast path check above it.
fast-path-unique-checks
holds two fast-path-unique-checks-item
s here. One for each check:
UNIQUE WITHOUT INDEX (a),
UNIQUE WITHOUT INDEX (b),
For the check on b
, that is a computed column, so the dummy values
check is built, meaning fast path is not possible for that check. For the check on a
, we build a full fast-path-unique-checks-item
on it with a Select expression, meaning fast path is possible on it. Since not all fast-path-unique-checks-item
s are fully built out, fast path will not be picked for this query.
pkg/sql/opt/xform/testdata/rules/insert
line 187 at r58 (raw file):
Previously, rharding6373 (Rachael Harding) wrote…
I thought that fast path unique checks were characterized by a single KV lookup, but this looks like at least 2 KV look ups (for east and west) plus an index join. Am I not reading or understanding this right?
Your read of the tree is correct, and there will be more than one KV lookup for this check. Check out the comments on DatumsFromConstraint
. Basically, there will be one KV lookup per constraint span.
pkg/sql/opt/bench/uniq_test.go
line 54 at r63 (raw file):
Previously, rharding6373 (Rachael Harding) wrote…
This might have gotten lost in the sea of comments, but could you please re-share the latest benchmark results? Thanks!
Sure.
Here are a few runs of it:
BenchmarkUniqInsert/SingleRow/NoFastPath-10 1690 600194 ns/op
BenchmarkUniqInsert/SingleRow/FastPath-10 3080 381874 ns/op
BenchmarkUniqInsert/SingleRow/NoFastPath-10 1774 596803 ns/op
BenchmarkUniqInsert/SingleRow/FastPath-10 3147 375995 ns/op
BenchmarkUniqInsert/SingleRow/NoFastPath-10 1726 586618 ns/op
BenchmarkUniqInsert/SingleRow/FastPath-10 3145 376511 ns/op
BenchmarkUniqInsert/SingleRow/NoFastPath-10 1760 594442 ns/op
BenchmarkUniqInsert/SingleRow/FastPath-10 3129 374240 ns/op
BenchmarkUniqInsert/SingleRow/NoFastPath-10 1710 595147 ns/op
BenchmarkUniqInsert/SingleRow/FastPath-10 3184 378356 ns/op
9877135
to
d8b99d0
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The first commit also needs:
- optbuilder tests that show the new expressions being built for the cases covered by the new code.
The commit titled "memo: optbuild unit tests for fast path unique check exprs" has these tests, in file pkg/sql/opt/optbuilder/testdata/unique-checks-insert.
- norm tests for the new pruning normalization rules.
I found PruneInsertInputCols is never fired and there doesn't seem to be a use case for it, so removed it. I've added some norm tests for PruneInsertReturnCols.
- TestInterner tests for the new checks in interner_test.go
Added TestInterner tests
If you wanted to I think you could split some of the commits off into their own PRs
+1 building this incrementally would speed up time-to-merge. It doesn't help that Reviewable is struggling with so many commit revisions.
I've split the first 3 commits off into a separate PR: #111403
Once that merges I can either rebase this PR, or split off the next couple commits, etc.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @mgartner, @michae2, @rharding6373, @rytaft, and @yuzefovich)
… checks This creates a `FastPathUniqueChecks` list as a child of `InsertExpr`, similar to `UniqueChecks`, but consisting of relational expressions which can be used to build a fast path uniqueness check using a particular index. Each item in the `FastPathUniqueChecks` list is a `FastPathUniqueChecksItem` whose `Check` expression, if the unique constraint is provisionally eligible to use a fast path uniqueness check, is a filtered scan from the target table of an `INSERT INTO ... VALUES` statement. It is only built if the VALUES clause has a single row. The filter equates each of the unique constraint key columns with its corresponding value from the VALUES clause. The values may either be a `ValuesExpr` or a `WithScanExpr` whose source is a `ValuesExpr`. During exploration, GenerateConstrainedScans will find all valid constrained index scans which are equivalent to the original filtered Scan. The built relations are not executed or displayed in the EXPLAIN, but will be analyzed in a later commit to specify the index and index key values to use for insert fast path uniqueness checks. FastPathUniqueChecksItemPrivate is added with elements which are designed to communicate details of how to build a fast path uniqueness check to the execbuilder. Epic: CRDB-26290 Informs: cockroachdb#58047 Release note: None
This commit adds the `ExprFmtHideFastPathChecks` flag, which hides fast path check expressions from the output of expression formatting. It is used in all places except TestBuilder. Also, unit tests for `buildFiltersForFastPathCheck` are added. Epic: CRDB-26290 Informs: cockroachdb#58047 Release note: None
This commit adds an insert transformation rules test which displays optimized fast path checks which are built in the optbuilder and later explored and optimized. This test is modified in a later commit to test for rule firing. Epic: CRDB-26290 Informs: cockroachdb#58047 Release note: None
This commit wires up insert fast path execution to support fast path uniqueness checks via KV Scan requests in a similar method as is currently done for foreign key checks. Except with uniqueness checks, the statement errors out if a row is found instead of if a row is not found. Both `addUniqChecks` and `runUniqChecks` functions are provided to initialize the checks and run them. Interfaces will be modified in a subsequent commit to utilize the new methods. `exec.InsertFastPathFKCheck` is renamed to `exec.InsertFastPathFKUniqCheck` to reflect that it will be used to build both foreign key and uniqueness fast path checks. A `DatumsFromConstraint` field is added to `exec.InsertFastPathFKUniqCheck`. DatumsFromConstraint contains constant values from the WHERE clause constraint which are part of the index key to look up. The number of entries corresponds with the number of KV lookups. For example, when built from analyzing a locality-optimized operation accessing 1 local region and remote regions, the resulting DatumsFromConstraint would have 3 entries. A `TestAddUniqChecks` unit test is added. Epic: CRDB-26290 Informs: cockroachdb#58047 Release note: None
This commit adds an AutoCommit method to the Planner interface AutoCommit indicates whether the Planner has flagged the current statement as eligible for transaction auto-commit. Epic: CRDB-26290 Informs: cockroachdb#58047 Release note: None
This adds support for building and executing simple INSERT statements with a single-row VALUES clause where any required uniqueness constraint checks can be handled via a constrained scan on an index. This includes INSERT cases such as: - a single-row VALUES clause into a REGIONAL BY ROW table with a PRIMARY KEY which has a UUID column generated by default, ie. `id UUID PRIMARY KEY DEFAULT gen_random_uuid()`, where the crdb_region column is not specified in the VALUES clause; either the gateway region is used or it is computed based on other column values. - a single-row VALUES clause into a REGIONAL BY ROW table with a hash sharded unique index where the crdb_region column is not specified in the VALUES clause - a single-row VALUES clause into a table with an experimental UNIQUE WITHOUT INDEX constraint which also has an index with leading key columns matching the constraint columns. In optbuild, when creating a uniqueness check for rows which are added to a table, a fast path index check relation is also built when the mutation source is a single-row values expression or WithScan from a single-row values expression. That relation is a filtered Select of a Scan from the target table, where the filters equate all of the unique check columns with their corresponding constants or placeholders from the Values expression. If there is a uniqueness check with a partial index predicate, fast path is disallowed. A new exploration rule called InsertFastPath is added to walk the memo group members created during exploration in `FastPathUniqueChecks` of the `InsertExpr`, to find any which have been rewritten as a constrained `ScanExpr`. If found, that means that Scan fully represents the lookup needed to check for duplicate entries and the Scan constraint can be used to identify the constants to use in a KV lookup on the scanned index in a fast path check. Function CanUseUniqueChecksForInsertFastPath walks the expressions generated during exploration of the `FastPathUniqueChecks.Check` relation. If a constrained scan is found, it is used to build elements of the `FastPathUniqueChecksItemPrivate` structure to communicate to the execbuilder the table and index to use for the check, and the column ids in the insert row to use for building the fast path KV request. In addition, a new `DatumsFromConstraint` field is added, consisting of a ScalarListExpr of TupleExprs specifying the index key, which allows an index key column to be matched with multiple Datums. One insert row may result in more than one KV lookup for a given uniqueness constraint. These items are used to build the `InsertFastPathFKUniqCheck` structure in the execbuilder. The new `FastPathUniqueChecksItemPrivate` is built into a new the corresponding `FastPathUniqueChecksItem`s of a new `FastPathUniqueChecksExpr` and communicated to the caller via return value `newFastPathUniqueChecks`. A small adjustment is made in the coster to make the fast path unique constraint slightly cheaper, so it should always be chosen over the original non-fast path check. Epic: CRDB-26290 Fixes: cockroachdb#58047 Release note (performance improvement): This patch adds support for insert fast-path uniqueness checks on REGIONAL BY ROW tables where the source is a VALUES clause with a single row. This results in a reduction in latency for single-row inserts to REGIONAL BY ROW tables and hash-sharded REGIONAL BY ROW tables with unique indexes.
This commit combines the foreign key checks of insert fast-path with the uniqueness checks into a single batch so that they may be processed in parallel for reduced latency. Epic: CRDB-26290 Informs: cockroachdb#58047 Release note: None
This commit adds a small benchmark for single-row insert fast path with a UNIQUE WITHOUT INDEX constraint. ``` BenchmarkUniqInsert/SingleRow BenchmarkUniqInsert/SingleRow/NoFastPath BenchmarkUniqInsert/SingleRow/NoFastPath-10 1623 634627 ns/op BenchmarkUniqInsert/SingleRow/FastPath BenchmarkUniqInsert/SingleRow/FastPath-10 3058 399215 ns/op ``` Epic: CRDB-26290 Informs: cockroachdb#58047 Release note: None
This commit optimizes fast-path uniqueness checks by picking index scans which minimize the number of constraint spans which in turn minimizes the number of KV lookups performed by the check. A latency cost is associated with each KV lookup, so it's desirable to use as few KV requests as possible. A TODO is added to make this a cost-based decision in the future, where latency costs are considered. Epic: CRDB-26290 Informs: cockroachdb#58047 Release note: None
RFAL, either in this PR, or in the split off PR: #111403 |
It's been a while since I looked at this PR, is there a particular commit(s) that you'd like me to take a look at? |
The changes in this PR have been merged as separate PRs: The last commit, xform: minimize the number of KV lookups for fast-path unique checks, was not merged, but could be opened as a possible future improvement. |
sql: add new check relation to help build insert fast path uniqueness checks
This creates a
FastPathUniqueChecks
list as a child ofInsertExpr
,similar to
UniqueChecks
, but consisting of relational expressionswhich can be used to build a fast path uniqueness check using a
particular index. Each item in the
FastPathUniqueChecks
list is aFastPathUniqueChecksItem
whoseCheck
expression, if the uniqueconstraint is provisionally eligible to use a fast path uniqueness
check, is a filtered scan from the target table of an
INSERT INTO ... VALUES
statement. It is only built if theVALUES clause has a single row. The filter equates each of the unique
constraint key columns with its corresponding value from the VALUES
clause. The values may either be a
ValuesExpr
or aWithScanExpr
whose source is a
ValuesExpr
. During exploration,GenerateConstrainedScans will find all valid constrained index scans
which are equivalent to the original filtered Scan. The built relations
are not executed or displayed in the EXPLAIN, but will be analyzed in a
later commit to specify the index and index key values to use for insert
fast path uniqueness checks.
FastPathUniqueChecksItemPrivate is added with elements which are
designed to communicate details of how to build a fast path uniqueness
check to the execbuilder.
Epic: CRDB-26290
Informs: #58047
Release note: None
memo: optbuild unit tests for fast path unique check exprs
This commit adds the
ExprFmtHideFastPathChecks
flag, which hides fastpath check expressions from the output of expression formatting. It is
used in all places except TestBuilder. Also, unit tests for
buildFiltersForFastPathCheck
are added.Epic: CRDB-26290
Informs: #58047
Release note: None
xform: opt test for insert fast path checks
This commit adds an
insert
transformation rules test whichdisplays optimized fast path checks which are built in the optbuilder
and later explored and optimized. This test is modified in a later
commit to test for rule firing.
Epic: CRDB-26290
Informs: #58047
Release note: None
sql: insert fast path uniq checks, execution support
This commit wires up insert fast path execution to support fast path
uniqueness checks via KV Scan requests in a similar method as is
currently done for foreign key checks. Except with uniqueness checks, the
statement errors out if a row is found instead of if a row is not found.
Both
addUniqChecks
andrunUniqChecks
functions are provided toinitialize the checks and run them. Interfaces will be modified in a
subsequent commit to utilize the new methods.
exec.InsertFastPathFKCheck
is renamed toexec.InsertFastPathFKUniqCheck
to reflect that it will be used tobuild both foreign key and uniqueness fast path checks.
A
DatumsFromConstraint
field is added toexec.InsertFastPathFKUniqCheck
.DatumsFromConstraint contains constant values from the WHERE clause
constraint which are part of the index key to look up. The number of
entries corresponds with the number of KV lookups. For example, when built
from analyzing a locality-optimized operation accessing 1 local region and
remote regions, the resulting DatumsFromConstraint would have 3 entries.
A
TestAddUniqChecks
unit test is added.Epic: CRDB-26290
Informs: #58047
Release note: None
eval: add AutoCommit method to Planner interface.
This commit adds an AutoCommit method to the Planner interface
AutoCommit indicates whether the Planner has flagged the current
statement as eligible for transaction auto-commit.
Epic: CRDB-26290
Informs: #58047
Release note: None
sql: fast path uniqueness checks for single-row insert
This adds support for building and executing simple INSERT statements
with a single-row VALUES clause where any required uniqueness constraint
checks can be handled via a constrained scan on an index.
This includes INSERT cases such as:
PRIMARY KEY which has a UUID column generated by default, ie.
id UUID PRIMARY KEY DEFAULT gen_random_uuid()
, where thecrdb_region column is not specified in the VALUES clause; either
the gateway region is used or it is computed based on other column
values.
hash sharded unique index where the crdb_region column is not specified
in the VALUES clause
WITHOUT INDEX constraint which also has an index with leading key
columns matching the constraint columns.
In optbuild, when creating a uniqueness check for rows which are added
to a table, a fast path index check relation is also built when the
mutation source is a single-row values expression or WithScan from
a single-row values expression. That relation is a filtered
Select of a Scan from the target table, where the filters equate
all of the unique check columns with their corresponding
constants or placeholders from the Values expression. If there is a
uniqueness check with a partial index predicate, fast path is
disallowed.
A new exploration rule called InsertFastPath is added to walk the memo
group members created during exploration in
FastPathUniqueChecks
ofthe
InsertExpr
, to find any which have been rewritten as a constrainedScanExpr
. If found, that means that Scan fully represents the lookupneeded to check for duplicate entries and the Scan constraint can be
used to identify the constants to use in a KV lookup on the scanned
index in a fast path check.
Function CanUseUniqueChecksForInsertFastPath walks the expressions
generated during exploration of the
FastPathUniqueChecks.Check
relation. If a constrained scan is found, it is used to build elements
of the
FastPathUniqueChecksItemPrivate
structure to communicate to theexecbuilder the table and index to use for the check, and the column ids
in the insert row to use for building the fast path KV request. In
addition, a new
DatumsFromConstraint
field is added, consisting of aScalarListExpr of TupleExprs specifying the index key, which allows an
index key column to be matched with multiple Datums. One insert row may
result in more than one KV lookup for a given uniqueness constraint.
These items are used to build the
InsertFastPathFKUniqCheck
structurein the execbuilder. The new
FastPathUniqueChecksItemPrivate
is builtinto a new the corresponding
FastPathUniqueChecksItem
s of a newFastPathUniqueChecksExpr
and communicated to the caller via returnvalue
newFastPathUniqueChecks
.A small adjustment is made in the coster to make the fast path unique
constraint slightly cheaper, so it should always be chosen over the
original non-fast path check.
Epic: CRDB-26290
Fixes: #58047
Release note (performance improvement): This patch adds support for
insert fast-path uniqueness checks on REGIONAL BY ROW tables where
the source is a VALUES clause with a single row. This results in a
reduction in latency for single-row inserts to REGIONAL BY ROW tables
and hash-sharded REGIONAL BY ROW tables with unique indexes.
sql: run insert fast-path FK checks in parallel with uniqueness checks
This commit combines the foreign key checks of insert fast-path with the
uniqueness checks into a single batch so that they may be processed in
parallel for reduced latency.
Epic: CRDB-26290
Informs: #58047
Release note: None
bench: insert fast path single row unique check benchmark
This commit adds a small benchmark for single-row insert fast path with
a UNIQUE WITHOUT INDEX constraint.
Epic: CRDB-26290
Informs: #58047
Release note: None
xform: minimize the number of KV lookups for fast-path unique checks
This commit optimizes fast-path uniqueness checks by picking index scans
which minimize the number of constraint spans which in turn minimizes
the number of KV lookups performed by the check. A latency cost is
associated with each KV lookup, so it's desirable to use as few KV
requests as possible.
A TODO is added to make this a cost-based decision in the future, where
latency costs are considered.
Epic: CRDB-26290
Informs: #58047
Release note: None