
sql: ensure that Upsert will never update the same row twice #45372

Merged Feb 26, 2020 (4 commits)

Conversation

andy-kimball (Contributor)

Previously, there were cases where an UPSERT or INSERT..ON CONFLICT
statement could operate on the same row more than once. This is a
problem because these statements are designed to read once to get
existing rows, and then to insert or update based on that information.
They don't expect that previous writes for the same statement will
affect the correctness of subsequent writes. This can lead to index
corruption in cases where a row "moves" from one location to another
when one of its index keys changes. See issue #44466 for an example.

This PR fixes the problem by introducing a new variation of the
DistinctOn operator that ensures that the input to the Upsert
operator never has duplicates. It differs from the regular DistinctOn
operator in two ways:

  1. Null behavior: UpsertDistinctOn treats NULL values as not equal
    to one another for purposes of grouping. Two rows having a NULL-
    valued grouping column will be placed in different groups. This
    differs from DistinctOn behavior, where the two rows would be
    grouped together. This behavior difference reflects SQL semantics,
    in which a unique index key still allows multiple NULL values.

  2. Duplicate behavior: UpsertDistinctOn raises an error if any
    distinct grouping contains more than one row. It has "input must
    be distinct" semantics rather than "make the input distinct"
    semantics. This is used to ensure that no row will be updated
    more than once.
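The two behaviors above can be sketched with a minimal Python helper (illustrative only; the real operator is implemented in Go inside the optimizer and execution engine, and the names here are hypothetical):

```python
def upsert_distinct_on(rows, key_cols):
    """Sketch of UpsertDistinctOn semantics (illustrative, not CockroachDB
    code): NULL (None) key values never match one another, and any true
    duplicate grouping raises an error instead of being collapsed."""
    seen = set()
    null_counter = 0
    out = []
    for row in rows:
        key = []
        for col in key_cols:
            v = row[col]
            if v is None:
                # Each NULL gets a unique sentinel, so NULL keys land in
                # different groups (behavior 1 above).
                null_counter += 1
                v = ("__null__", null_counter)
            key.append(v)
        key = tuple(key)
        if key in seen:
            # "Input must be distinct" semantics (behavior 2 above).
            raise ValueError("UPSERT cannot affect row a second time")
        seen.add(key)
        out.append(row)
    return out
```

Two NULL-keyed rows pass through unchanged, while a genuine duplicate key raises an error rather than being de-duplicated.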

The optbuilder now wraps the input to the Upsert operator with this
new UpsertDistinctOn operator. In addition, several commits add
optimization rules that remove these distinct operators when they
can be proven unnecessary.

Fixes #44466


@andy-kimball (Contributor Author)

@jordanlewis, not sure who on your team should review the SQL execution changes (part of the 1st commit). I added @yuzefovich as a guess.

@rytaft (Collaborator) left a comment

:lgtm: very nice!

Reviewed 39 of 39 files at r1, 2 of 2 files at r2, 9 of 9 files at r3, 9 of 9 files at r4.
Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (waiting on @andy-kimball, @jordanlewis, @RaduBerinde, and @yuzefovich)


pkg/sql/opt/norm/groupby.go, line 23 at r3 (raw file):

// RemoveGroupingCols returns a new grouping private struct with the given
// columns removed from the window partition column set.

window partition -> grouping


pkg/sql/opt/norm/groupby.go, line 222 at r4 (raw file):

}

// MakeRowsDistinct tries to ensure that the given rows are unique with respect

MakeRowsDistinct -> areRowsDistinct

Also need to change the rest of the text in this comment


pkg/sql/opt/norm/groupby.go, line 247 at r4 (raw file):

		forceDistinct := false
		for iCol, colID := range cols {

[nit] Why not just use i instead of iCol?


pkg/sql/opt/norm/rules/groupby.opt, line 120 at r1 (raw file):

#
# Note that this rule does not apply to UpsertDistinctOn, since that will raise
# error if there are duplicate rows.

raise error -> raise an error


pkg/sql/opt/norm/testdata/rules/groupby, line 1347 at r4 (raw file):

           └── variable: column1 [type=int]

# DistinctOn treats NULL values as distinct, so it can't be eliminated.

Isn't it the opposite?


pkg/sql/opt/norm/testdata/rules/groupby, line 1362 at r4 (raw file):

      └── (NULL,) [type=tuple{unknown}]

# UpsertDistinctOn treats NULL values as not distinct, so it can be eliminated.

Isn't it the opposite?


pkg/sql/opt/optbuilder/insert.go, line 139 at r1 (raw file):

//     CASE WHEN fetch_a IS NULL ins_c ELSE fetch_c END AS ups_c,
//   FROM (
//     SELECT DISTINCT ON (ins_a) *

would help to emphasize somewhere below that this is not a normal distinct on (or somehow make that clear with syntax here)


pkg/sql/rowexec/distinct.go, line 179 at r1 (raw file):

		// row.
		if d.nullsAreDistinct && d.lastGroupKey[colIdx].IsNull() {
			return false, err

return false, nil

@andy-kimball (Contributor Author) left a comment

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @jordanlewis, @RaduBerinde, @rytaft, and @yuzefovich)


pkg/sql/opt/norm/groupby.go, line 23 at r3 (raw file):

Previously, rytaft (Rebecca Taft) wrote…

window partition -> grouping

Done.


pkg/sql/opt/norm/groupby.go, line 222 at r4 (raw file):

Previously, rytaft (Rebecca Taft) wrote…

MakeRowsDistinct -> areRowsDistinct

Also need to change the rest of the text in this comment

Done.


pkg/sql/opt/norm/groupby.go, line 247 at r4 (raw file):

Previously, rytaft (Rebecca Taft) wrote…

[nit] Why not just use i instead of iCol?

At one point, I had an iRow as well. Fixed.


pkg/sql/opt/norm/rules/groupby.opt, line 120 at r1 (raw file):

Previously, rytaft (Rebecca Taft) wrote…

raise error -> raise an error

Done.


pkg/sql/opt/norm/testdata/rules/groupby, line 1347 at r4 (raw file):

Previously, rytaft (Rebecca Taft) wrote…

Isn't it the opposite?

Done.


pkg/sql/opt/norm/testdata/rules/groupby, line 1362 at r4 (raw file):

Previously, rytaft (Rebecca Taft) wrote…

Isn't it the opposite?

Done.


pkg/sql/opt/optbuilder/insert.go, line 139 at r1 (raw file):

Previously, rytaft (Rebecca Taft) wrote…

would help to emphasize somewhere below that this is not a normal distinct on (or somehow make that clear with syntax here)

Done.


pkg/sql/rowexec/distinct.go, line 179 at r1 (raw file):

Previously, rytaft (Rebecca Taft) wrote…

return false, nil

Done.

@jordanlewis (Member) left a comment

Shouldn't this PR also close #37880?

How does it relate to #44434? Do we support IGNORE?

The execution stuff looks good, I just have a few comments.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @andy-kimball, @RaduBerinde, @rytaft, and @yuzefovich)


pkg/sql/distinct.go, line 50 at r5 (raw file):

	// A distinct operation on column "c" will result in one output row if
	// nullsAreDistinct is false, or two output rows if true.
	nullsAreDistinct bool

Could you include some context in this comment? We tend to grow lots of flags like this one over time that change the behavior of the operators significantly. Code readers tend to want to know what the flags are for. Example:

"For ordinary SQL DISTINCT, this flag is true/false. It's set to false/true when doing XYZ, for example"


pkg/sql/distinct.go, line 55 at r5 (raw file):

	// the distinct operation finds two rows with duplicate grouping column
	// values. This is used to implement the UPSERT and INSERT..ON CONFLICT
	// statements, both of which prohibit the same row from being changed twice.

This is a good example of the context I was looking for in the first comment. Maybe just duplicate it above?


pkg/sql/execinfrapb/processors_sql.proto, line 287 at r5 (raw file):

  //
  // A distinct operation on column "c" will result in one output row if
  // NullsAreDistinct is false, or two output rows if true.

ditto on the context comment from above


pkg/sql/logictest/testdata/logic_test/upsert, line 1077 at r5 (raw file):

# Ensure this test stays synchronized to the EXPLAIN in exec/execbuilder/upsert,
# so that use of streaming group-by is confirmed.
# ------------------------------------------------------------------------------

I know we do this all over, but it's fragile. Can you also add unit tests to rowexec/distinct_test.go? That file probably could use more tests in general, but also it'll let you test the functionality you added to the execution engine more tightly.


pkg/sql/opt/exec/execbuilder/testdata/upsert, line 289 at r5 (raw file):

 WHERE operation != 'dist sender send'
----
join reader                           Scan /Table/57/1/2{-/#}

Why did these change?

@rytaft (Collaborator) left a comment

Reviewed 8 of 21 files at r5, 4 of 9 files at r7, 9 of 9 files at r8.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @andy-kimball, @RaduBerinde, and @yuzefovich)


Release note (sql change): UPSERT and INSERT..ON CONFLICT statements
will sometimes need to do an extra check to ensure that they never
update the same row twice. This may adversely affect performance in
cases where the optimizer cannot statically prove the extra check is
unnecessary.

Add support for UpsertDistinctOn to the existing rule that operates over
various grouping operators. The EliminateGroupByProject rule eliminates
an unnecessary input Project operator.

Release note: None

This commit adds rules which simplify the UpsertDistinctOn grouping
columns. These are similar to rules already in place for DistinctOn,
but need some modifications related to null and error handling.
UpsertDistinctOn treats NULL values as not equal to one another, and
so puts them in different groups. In addition, UpsertDistinctOn
raises an error when there is more than one row in a group.

Release note: None
@andy-kimball (Contributor Author) left a comment

This PR does not handle the DO NOTHING case. That's a separate PR that will follow, but it will use some of the machinery that this PR introduces. #44434 is just a duplicate of #37880, so the follow-on PR will close them both.

Note that the two cases are related but different: the index corruption bug occurs because duplicate rows do not raise an error, while the DO NOTHING bug occurs because duplicate rows do raise an error. Implementation-wise, both will use UpsertDistinctOn, but DO NOTHING will set ErrorOnDup=False, whereas ON CONFLICT DO UPDATE will set ErrorOnDup=True.
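The ErrorOnDup distinction can be sketched like this (a hypothetical Python helper, not the actual Go operator):

```python
def distinct_on(rows, key_cols, error_on_dup):
    """Illustrative sketch of the ErrorOnDup flag described above: when True
    (ON CONFLICT DO UPDATE), a duplicate grouping is an error; when False
    (DO NOTHING), later duplicates are silently dropped."""
    seen = set()
    out = []
    for row in rows:
        key = tuple(row[c] for c in key_cols)
        if key in seen:
            if error_on_dup:
                raise ValueError("duplicate row in upsert input")
            continue  # DO NOTHING: drop the duplicate row
        seen.add(key)
        out.append(row)
    return out
```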

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @andy-kimball, @jordanlewis, @RaduBerinde, and @yuzefovich)


pkg/sql/distinct.go, line 50 at r5 (raw file):

Previously, jordanlewis (Jordan Lewis) wrote…

Could you include some context in this comment? We tend to grow lots of flags like this one over time that change the behavior of the operators significantly. Code readers tend to want to know what the flags are for. Example:

"For ordinary SQL DISTINCT, this flag is true/false. It's set to false/true when doing XYZ, for example"

Done.


pkg/sql/distinct.go, line 55 at r5 (raw file):

Previously, jordanlewis (Jordan Lewis) wrote…

This is a good example of the context I was looking for in the first comment. Maybe just duplicate it above?

Done.


pkg/sql/execinfrapb/processors_sql.proto, line 287 at r5 (raw file):

Previously, jordanlewis (Jordan Lewis) wrote…

ditto on the context comment from above

Done.


pkg/sql/logictest/testdata/logic_test/upsert, line 1077 at r5 (raw file):

Previously, jordanlewis (Jordan Lewis) wrote…

I know we do this all over, but it's fragile. Can you also add unit tests to rowexec/distinct_test.go? That file probably could use more tests in general, but also it'll let you test the functionality you added to the execution engine more tightly.

Ah, I didn't notice those. Added some tests.

Add new rule that eliminates a distinct operator that has a constant input
Values operator that is already distinct with respect to the grouping columns.
The Values operator may be the immediate input, or it may be wrapped by
Select, Project, LeftJoin, and/or other operators. These are common patterns
that are generated by the optbuilder's upsert construction code, which must
ensure the same row cannot be updated twice.

Here is an example of the rule in action:

   SELECT DISTINCT ON (x) * FROM (VALUES (1), (2)) t(x)
   =>
   SELECT * FROM (VALUES (1), (2)) t(x)

Release note: None
@andy-kimball (Contributor Author) left a comment

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @jordanlewis, @RaduBerinde, @rytaft, and @yuzefovich)


pkg/sql/opt/exec/execbuilder/testdata/upsert, line 289 at r5 (raw file):

Previously, jordanlewis (Jordan Lewis) wrote…

Why did these change?

Just intra-commit change. No overall change. There was an extra distinct operator in the mix, until a later commit added an optimization to get rid of it.

@jordanlewis (Member) left a comment

:lgtm:

Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (and 1 stale) (waiting on @RaduBerinde, @rytaft, and @yuzefovich)

@andy-kimball (Contributor Author)

bors r+

@craig (Contributor)

craig bot commented Feb 26, 2020

Build succeeded

@craig craig bot merged commit 7e9b00d into cockroachdb:master Feb 26, 2020
@andy-kimball andy-kimball deleted the upsert branch February 26, 2020 14:56
@RaduBerinde (Member) left a comment

Late to the party - but this is great stuff! Obviously was a lot of work, but I think it's a pretty clean solution. Just have one question below.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (and 2 stale) (waiting on @andy-kimball, @RaduBerinde, @rytaft, and @yuzefovich)


pkg/sql/opt/norm/groupby.go, line 216 at r12 (raw file):

	case *memo.LeftJoinExpr:
		if groupingCols.SubsetOf(t.Left.Relational().OutputCols) {
			return c.AreValuesDistinct(t.Left, groupingCols, nullsAreDistinct)

Why is this ok? A left join could duplicate the value from the left side (unless the eq columns are a key, which is probably the case we care about).

@andy-kimball (Contributor Author)


pkg/sql/opt/norm/groupby.go, line 216 at r12 (raw file):

Previously, RaduBerinde wrote…

Why is this ok? A left join could duplicate the value from the left side (unless the eq columns are a key, which is probably the case we care about).

You're absolutely right. My mind was stuck on thinking of this like a semi-join, which it basically is in this particular upsert case (since the right side always has a key that's part of the join). I'll post another PR to fix.
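The concern can be seen with a toy left join (hypothetical row shapes; not optimizer code): even if the left input is distinct on the grouping columns, the join output need not be, unless the equality columns form a key on the right side.

```python
def left_join(left, right, key):
    """Toy nested-loop left join over lists of dicts. A left row is emitted
    once per matching right row, so a non-key match duplicates it."""
    out = []
    for l in left:
        matches = [r for r in right if r[key] == l[key]]
        if matches:
            out.extend({**l, **r} for r in matches)
        else:
            out.append(dict(l))  # unmatched left row passes through once
    return out
```

With two right rows sharing the same key value, a single distinct left row appears twice in the output, which is exactly why distinctness of the left input alone is not sufficient.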

bobvawter added a commit to cockroachdb/replicator that referenced this pull request May 31, 2022
This change works around a CockroachDB implementation quirk wherein an UPSERT
is unable to read its own writes, which prevents the same row from being
upserted twice within the same statement. This also corrects a potential source
of data inconsistency in the applier code, if a single call to Apply contains
both upserts and deletes of the same row.

Both of the above issues can be addressed by sorting the mutations according to
their HLC time and then de-duplicating them by primary key.

The newly-added test in the apply package checks for the specific upsert
behavior, so that this workaround can be reconsidered if a future version of
CockroachDB allows duplicate rows within an UPSERT or CTE.

X-Ref: cockroachdb/cockroach#44466
X-Ref: cockroachdb/cockroach#70731
X-Ref: cockroachdb/cockroach#45372
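The sort-and-de-duplicate workaround described in that commit message can be sketched as follows (hypothetical mutation shape with `pk` and `hlc` fields; not the replicator code):

```python
def dedup_mutations(mutations):
    """Sort mutations by HLC timestamp, then keep only the latest mutation
    per primary key, so no row is upserted twice in one statement."""
    latest = {}
    for m in sorted(mutations, key=lambda m: m["hlc"]):
        latest[m["pk"]] = m  # a later HLC time overwrites an earlier one
    # Emit the survivors in HLC order.
    return sorted(latest.values(), key=lambda m: m["hlc"])
```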
Successfully merging this pull request may close these issues.

Multi-record UPSERT inserts duplicate values in PRIMARY KEY, resulting in inconsistent results