Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

optbuilder: disallow locking tables on null-extended side of outer join #115795

Merged
merged 1 commit into from
Dec 8, 2023

Conversation

rytaft
Copy link
Collaborator

@rytaft rytaft commented Dec 7, 2023

This commit disallows locking tables on the null-extended side of outer joins. It does so by adding a new isNullExtended field to the lockingContext in the optbuilder.

Fixes #97434

Release note (bug fix): Locking tables (e.g., with SELECT ... FOR UPDATE) on the null-extended side of outer joins (e.g., the right side of a LEFT JOIN) is now disallowed and returns an error. This improves compatibility with Postgres and prevents ambiguity in locking semantics. This bug has existed since locking with FOR UPDATE was introduced.

@rytaft rytaft requested a review from michae2 December 7, 2023 17:48
@rytaft rytaft requested a review from a team as a code owner December 7, 2023 17:48
@cockroach-teamcity
Copy link
Member

This change is Reviewable

@DrewKimball
Copy link
Collaborator

DrewKimball commented Dec 7, 2023

Here's a potentially interesting test case:

postgres=# create table xy (x int primary key, y int);
CREATE TABLE
postgres=# create table ab (a int primary key, b int);
CREATE TABLE
postgres=# select * from xy left join ab on x = a for update;
ERROR:  FOR UPDATE cannot be applied to the nullable side of an outer join
postgres=# select * from xy left join ab on x = a where a is not null for update;
 x | y | a | b
---+---+---+---
(0 rows)

postgres=# explain select * from xy left join ab on x = a for update;
ERROR:  FOR UPDATE cannot be applied to the nullable side of an outer join
postgres=# explain select * from xy left join ab on x = a where a is not null for update;
                               QUERY PLAN
-------------------------------------------------------------------------
 LockRows  (cost=60.71..121.75 rows=2249 width=28)
   ->  Hash Join  (cost=60.71..99.26 rows=2249 width=28)
         Hash Cond: (xy.x = ab.a)
         ->  Seq Scan on xy  (cost=0.00..32.60 rows=2260 width=14)
         ->  Hash  (cost=32.60..32.60 rows=2249 width=14)
               ->  Seq Scan on ab  (cost=0.00..32.60 rows=2249 width=14)
                     Filter: (a IS NOT NULL)
(7 rows)

Does this mean postgres only cares about null extension if the nulls "make it" to the lock?

@mgartner
Copy link
Collaborator

mgartner commented Dec 7, 2023

Does this mean postgres only cares about null extension the nulls "make it" to the lock?

Isn't that because that left join is converted to an inner join due to the a is not null filter?

Copy link
Collaborator

@mgartner mgartner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:lgtm: But @michae2 should also take a look.

Reviewed 4 of 4 files at r1, all commit messages.
Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (waiting on @michae2)

@DrewKimball
Copy link
Collaborator

Isn't that because that left join is converted to an inner join due to the a is not null filter?

Probably, but it means they check after optimization has occurred or after not null properties have been calculated, rather than the AST directly.

This commit disallows locking tables on the null-extended side of outer
joins. It does so by adding a new isNullExtended field to the
lockingContext in the optbuilder.

Fixes cockroachdb#97434

Release note (bug fix): Locking tables (e.g., with SELECT ... FOR UPDATE)
on the null-extended side of outer joins (e.g., the right side of a LEFT
JOIN) is now disallowed and returns an error. This improves compatibility
with Postgres and prevents ambiguity in locking semantics. This bug has
existed since locking with FOR UPDATE was introduced.
Copy link
Collaborator Author

@rytaft rytaft left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TFTRs! @DrewKimball I added the test case -- nice find! I'm inclined to just leave it as a TODO for now, though, since I think matching Postgres' behavior in this case would add a lot more complexity (unless you have an idea about how to do this cleanly...).

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @mgartner and @michae2)

@DrewKimball
Copy link
Collaborator

@DrewKimball I added the test case -- nice find! I'm inclined to just leave it as a TODO for now, though, since I think matching Postgres' behavior in this case would add a lot more complexity (unless you have an idea about how to do this cleanly...).

Maybe we could check if the primary key columns are nullable when building the lock operator? I agree that this is probably fine for now, though.

@mgartner
Copy link
Collaborator

mgartner commented Dec 7, 2023

I'm inclined to just leave it as a TODO for now, though, since I think matching Postgres' behavior in this case would add a lot more complexity (unless you have an idea about how to do this cleanly...).

I'm fine leaving as TODO, but is there a reason the check cannot be moved from buildDataSource to after constructJoin is called in buildJoin? The join coming out of constructJoin should be normalized to an inner join.

Copy link
Collaborator Author

@rytaft rytaft left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe we could check if the primary key columns are nullable when building the lock operator?

I tried this and unfortunately it doesn't fix the test case you found (it also breaks a few other cases). Even if it did work, it makes me uneasy since it feels like this approach would violate the don't inspect what you build principle of optbuilder.

I added this code to lockingItem.validate() (after passing in the scope):

notNullCols := outScope.expr.Relational().NotNullCols  
for _, lb := range item.builders {  
  for _, col := range lb.keyCols {  
    if !notNullCols.Contains(col) {  
      panic(pgerror.Newf(  
        pgcode.FeatureNotSupported, "%s cannot be applied to the nullable side of an outer join",  
        li.Strength))  
    }  
  }  
}

Maybe I'm missing something?

is there a reason the check cannot be moved from buildDataSource to after constructJoin is called in buildJoin? The join coming out of constructJoin should be normalized to an inner join.

I don't think this would work, since we don't actually know if there should be an error until we get down to the data source and determine that the lock is set. That's why we need the isNullExtended field in the lockCtx. If we wanted this to work, we'd need some way to propagate the error back up to the original join that set the isNullExtended flag (which might not be the join immediately above the data source with the error).

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @mgartner and @michae2)

Copy link
Collaborator

@DrewKimball DrewKimball left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tried this and unfortunately it doesn't fix the test case you found (it also breaks a few other cases). Even if it did work, it makes me uneasy since it feels like this approach would violate the don't inspect what you build principle of optbuilder.

I see, that's too bad. I agree we should respect the invariant, and it seems ok to be slightly more strict than postgres. Thanks for trying this out!

:lgtm:

Reviewed 3 of 4 files at r1, 1 of 1 files at r2, all commit messages.
Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (and 1 stale) (waiting on @mgartner and @michae2)

Copy link
Collaborator

@mgartner mgartner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ahh, thanks for the explanation and sorry for sending you down that rabbit hole! :lgtm_strong:

Reviewed 1 of 1 files at r2, all commit messages.
Reviewable status: :shipit: complete! 2 of 0 LGTMs obtained (waiting on @michae2)

@mgartner mgartner added backport-23.2.x Flags PRs that need to be backported to 23.2. backport-23.2.0-rc labels Dec 8, 2023
Copy link
Collaborator Author

@rytaft rytaft left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No worries, thanks to you both for the suggestions! Always good to try to avoid discrepancies with Postgres if possible...

bors r+

Reviewable status: :shipit: complete! 2 of 0 LGTMs obtained (waiting on @michae2)

@craig
Copy link
Contributor

craig bot commented Dec 8, 2023

Build succeeded:

Copy link
Collaborator

@michae2 michae2 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 2 of 0 LGTMs obtained


pkg/sql/opt/optbuilder/select.go line 270 at r2 (raw file):

		lockCtx.filter(source.As.Alias)
		if lockCtx.locking.isSet() {

Oh, I just realized: we also need to check lockCtx.isNullExtended here to disallow statements like:

CREATE TABLE a (a INT);
SELECT 'a'::regclass::int;
SELECT * FROM [104 a1] LEFT JOIN [104 a2] USING (a) FOR UPDATE;

@mgartner
Copy link
Collaborator

Nice catch @michae2!

@rytaft
Copy link
Collaborator Author

rytaft commented Dec 11, 2023

Thanks for the catch! I'll open a new PR shortly to fix.

rytaft added a commit to rytaft/cockroach that referenced this pull request Dec 13, 2023
… join

In cockroachdb#115795 we disallowed locking tables on the null-extended side of outer
joins, but did not make the same change for numeric table references. This
commit fixes that oversight.

Informs cockroachdb#97434

There is no release note since cockroachdb#115795 has not yet been included in any
release.

Release note: None
craig bot pushed a commit that referenced this pull request Dec 14, 2023
116103: logictest: uncomment composite types test that passes r=rafiss a=rafiss

This was broken for the legacy schema changer, but there's no need to skip it now that it works in the new schema changer.

informs #91972
Release note: None

116383: opt: disallow locking table references on null-extended side of outer join r=rytaft a=rytaft

In #115795 we disallowed locking tables on the null-extended side of outer joins, but did not make the same change for numeric table references. This commit fixes that oversight.

Informs #97434

There is no release note since #115795 has not yet been included in any release.

Release note: None

116385: ui: default to descending hot ranges order r=koorosh a=kvoli

The hot ranges page default sort setting changed from descending to ascending in b06b6f0. Change it back to descending, so that the hottest ranges appear first by default.

Epic: none
Release note: None

116452: roachtest: pass startopts to SetDefaultPort helpers by reference r=srosenberg,RaduBerinde,renatolabs a=DarrylWong

Startopts was being passed by value, effectively making the helpers noops.

Release note: none
Epic: none
Fixes: #116432 
Fixes: #116433 
Fixes: #116485

116463: go.mod: bump Pebble to ab4952c5f87b r=jbowens,aadityasondhi a=sumeerbhola

cockroachdb/pebble@ab4952c5 metamorphic: track object key bounds
cockroachdb/pebble@288bf0fb db: add compaction callbacks for unexpected SingleDelete behavior

Release note: None

Informs #115881
Informs #114421

Epic: none

116477: bazel: remove spare line r=rail a=rickystewart

--incompatible_strict_action_env is set regardless, so we don't need to specify it again.

Epic: CRDB-8308
Release note: None

Co-authored-by: Rafi Shamim <rafi@cockroachlabs.com>
Co-authored-by: Rebecca Taft <becca@cockroachlabs.com>
Co-authored-by: Austen McClernon <austen@cockroachlabs.com>
Co-authored-by: DarrylWong <darryl@cockroachlabs.com>
Co-authored-by: sumeerbhola <sumeer@cockroachlabs.com>
Co-authored-by: Ricky Stewart <ricky@cockroachlabs.com>
blathers-crl bot pushed a commit that referenced this pull request Dec 14, 2023
… join

In #115795 we disallowed locking tables on the null-extended side of outer
joins, but did not make the same change for numeric table references. This
commit fixes that oversight.

Informs #97434

There is no release note since #115795 has not yet been included in any
release.

Release note: None
blathers-crl bot pushed a commit that referenced this pull request Dec 14, 2023
… join

In #115795 we disallowed locking tables on the null-extended side of outer
joins, but did not make the same change for numeric table references. This
commit fixes that oversight.

Informs #97434

There is no release note since #115795 has not yet been included in any
release.

Release note: None
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
backport-23.2.x Flags PRs that need to be backported to 23.2.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

opt: don't allow locking null-extended rows
5 participants