sql: allow creation of subset-unique foreign keys#169132
sql: allow creation of subset-unique foreign keys#169132andyyang890 wants to merge 1 commit intocockroachdb:masterfrom
Conversation
|
Merging to
After your PR is submitted to the merge queue, this comment will be automatically updated with its status. If the PR fails, failure details will also be posted here |
|
Your pull request contains more than 1000 changes. It is strongly encouraged to split big PRs into smaller chunks. 🦉 Hoot! I am a Blathers, a bot for CockroachDB. My owner is dev-inf. |
00f85eb to
87aa8c2
Compare
This comment was marked as resolved.
This comment was marked as resolved.
3934701 to
1810c2a
Compare
This comment was marked as resolved.
This comment was marked as resolved.
480dfc1 to
4eb4103
Compare
This comment was marked as resolved.
This comment was marked as resolved.
64135a4 to
54e6022
Compare
ZhouXing19
left a comment
There was a problem hiding this comment.
Great work! Putting some comments i have had so far.
@ZhouXing19 reviewed 24 files and made 6 comments.
Reviewable status:complete! 0 of 0 LGTMs obtained (waiting on andyyang890, DrewKimball, and michae2).
-- commits line 10 at r1:
I wonder if we should default to allowing it, given it's not standard of major sql dbms, also the possible performance implications. As in, I think the performance penalty and portability risk are things users should consciously choose. I don't feel very strongly about this though, so feel free to push back.
-- commits line 27 at r1:
I think schema policies should be uniform cluster-wide and DBA-controlled, not just per-session. The conventional patterns, such as for experimental_alter_column_type.enabled and sql.defaults.experimental_enable_unique_without_index_constraints.enabled, is that cluster setting sets the cluster-wide default, session variable allows per-session override. Should we do something similar? (Maybe this is a question to the foundations team as well)
pkg/sql/create_table.go line 1149 at r1 (raw file):
return err } _, isStrictSubset, err := catalog.FindFKReferencedUniqueConstraint(target, fkConstraint, true /* allowSubset */)
nit: this can be consolidated into one call of FindFKReferencedUniqueConstraint, consider something like:
_, isStrictSubset, err := FindFKReferencedUniqueConstraint(target, fk, allowSubset)
if err != nil { return err }
if isStrictSubset { /* emit notice */ }pkg/sql/create_table.go line 1156 at r1 (raw file):
evalCtx.ClientNoticeSender.BufferClientNotice( ctx, pgnotice.Newf("foreign key %q is currently backed by a unique constraint "+
Great notice!
pkg/sql/catalog/table_elements.go line 1408 at r1 (raw file):
) (_ UniqueConstraint, isStrictSubset bool, _ error) { // Try exact match first. for _, uwi := range referencedTable.UniqueConstraintsWithIndex() {
super nit: here we iterate the with index & without index dual-loops for both the general and allow-subset case. I think we can collapse it without hurting readability. E.g.
allUniqueConstraints := func() []UniqueConstraint {referencedTable.UniqueConstraintsWithIndex()..., referencedTable.UniqueConstraintsWithoutIndex()...}
for _, c := range allUniqueConstraints {...}
If we collapse it into a single loop like this, need to fix isBetter as well to explicitly shows that Unique-with-index wins over unique-without-index.
|
great catch! I wonder if this is a fix for a pre-existing bug but we bundled into the PR, rather than something required by the subset-matching feature itself. Maybe worth extracting it as a separate commit. |
andyyang890
left a comment
There was a problem hiding this comment.
Thanks for the review!
@andyyang890 made 3 comments.
Reviewable status:complete! 0 of 0 LGTMs obtained (waiting on DrewKimball, michae2, and ZhouXing19).
Previously, ZhouXing19 (Jane Xing) wrote…
I wonder if we should default to allowing it, given it's not standard of major sql dbms, also the possible performance implications. As in, I think the performance penalty and portability risk are things users should consciously choose. I don't feel very strongly about this though, so feel free to push back.
I also don't feel that strongly about it and I'd be okay with defaulting to not allowing it (though I'd be curious if anyone else has strong feelings). My initial thinking was that people would generally be deliberately creating indexes like that and the notice they get when they create it would be enough of a warning if it was accidental. Though one thing I should probably also add is a notice when the feature is off and they want to create an index like this that tells them it'll work if they enable the session var.
Previously, ZhouXing19 (Jane Xing) wrote…
I think schema policies should be uniform cluster-wide and DBA-controlled, not just per-session. The conventional patterns, such as for
experimental_alter_column_type.enabledandsql.defaults.experimental_enable_unique_without_index_constraints.enabled, is that cluster setting sets the cluster-wide default, session variable allows per-session override. Should we do something similar? (Maybe this is a question to the foundations team as well)
I originally had a matching sql.defaults.* cluster setting but when I was looking into it, it seemed like that was deprecated and the new preferred way for someone to set it cluster-wide is ALTER ROLE ALL SET require_fk_unique_constraint_on_all_columns = true
spilchen
left a comment
There was a problem hiding this comment.
@spilchen reviewed 1 file and made 5 comments.
Reviewable status:complete! 0 of 0 LGTMs obtained (waiting on andyyang890, DrewKimball, michae2, and ZhouXing19).
pkg/sql/schemachanger/scbuild/internal/scbuildstmt/alter_table_drop_column.go line 542 at r1 (raw file):
constraintElems := b.QueryByID(fkOriginTableID).Filter(hasConstraintIDAttrFilter(fkConstraintID)) if fkOriginTableID == tbl.TableID { _, _, constraintName := scpb.FindConstraintWithoutIndexName(constraintElems.Filter(publicTargetFilter))
nit: these versions of the functions are deprecated
pkg/sql/schemachanger/scbuild/internal/scbuildstmt/alter_table_drop_column.go line 551 at r1 (raw file):
} if behavior != tree.DropCascade { _, _, originName := scpb.FindNamespace(b.QueryByID(fkOriginTableID))
nit: FindNamespace and FindConstraintWithoutIndexName is deprecated
pkg/sql/create_table.go line 1143 at r1 (raw file):
// filter on the parent lookup. fkConstraint := c.(catalog.ForeignKeyConstraint) allowSubset := evalCtx.Settings.Version.IsActive(ctx, clusterversion.V26_3) &&
Won't this be confusing for the user? Say they haven't finalized v26.3 yet but they've set the session variable, thinking they're getting the new behaviour. Yet we silently fall back instead.
If allowSubset is off because of the version and they don't have a compliant index, we error out. Could we make that error include a hint based on the setting + version state?
pkg/sql/create_table.go line 1158 at r1 (raw file):
pgnotice.Newf("foreign key %q is currently backed by a unique constraint "+ "that covers a subset of the referenced columns of %s; this is a "+ "CockroachDB extension and may not be portable to other databases. "+
The mention of a not begin a portable extension is odd. We have many cases where we diverge from postgres but we don't have similar notices (e.g. ASOF time, RBT tables, etc.). Why is this special?
pkg/sql/create_table.go line 1159 at r1 (raw file):
"that covers a subset of the referenced columns of %s; this is a "+ "CockroachDB extension and may not be portable to other databases. "+ "Foreign key enforcement on insert may incur additional latency; "+
I find mentioning potential latency here a bit alarming. If the latency is real and significant, why have on it by default. But if it's negligible why even mention it in the notice. I wonder if we shouldn't mention this at all and just have it in our documentation.
rafiss
left a comment
There was a problem hiding this comment.
@rafiss made 1 comment.
Reviewable status:complete! 0 of 0 LGTMs obtained (waiting on andyyang890, DrewKimball, michae2, and ZhouXing19).
Previously, andyyang890 (Andy Yang) wrote…
I originally had a matching
sql.defaults.*cluster setting but when I was looking into it, it seemed like that was deprecated and the new preferred way for someone to set it cluster-wide isALTER ROLE ALL SET require_fk_unique_constraint_on_all_columns = true
to answer this, it would help to think about the persona who would be using such a setting. for example, if the idea is that an operator would it to prevent all the application developers in their company from using this, then a cluster setting (without a session variable) would make sense. if the idea is that it's just a guardrail that provides a reminder to someone trying to do this that this feature can be configured, then a session variable makes sense, and i don't think we need a cluster setting.
rafiss
left a comment
There was a problem hiding this comment.
@rafiss made 1 comment.
Reviewable status:complete! 0 of 0 LGTMs obtained (waiting on andyyang890, DrewKimball, michae2, and ZhouXing19).
Previously, rafiss (Rafi Shamim) wrote…
to answer this, it would help to think about the persona who would be using such a setting. for example, if the idea is that an operator would it to prevent all the application developers in their company from using this, then a cluster setting (without a session variable) would make sense. if the idea is that it's just a guardrail that provides a reminder to someone trying to do this that this feature can be configured, then a session variable makes sense, and i don't think we need a cluster setting.
in general, i think a session variable makes more sense for opt-in features that we want to keep off by default, but let someone use it if they toggle the setting on. i see less of a reason to have a session variable for opt-out features, but like Jane said, it's fine to keep since it can be globally configured by ALTER ROLE ALL SET ...
andyyang890
left a comment
There was a problem hiding this comment.
@andyyang890 made 1 comment.
Reviewable status:complete! 0 of 0 LGTMs obtained (waiting on DrewKimball, michae2, and ZhouXing19).
Previously, rafiss (Rafi Shamim) wrote…
in general, i think a session variable makes more sense for opt-in features that we want to keep off by default, but let someone use it if they toggle the setting on. i see less of a reason to have a session variable for opt-out features, but like Jane said, it's fine to keep since it can be globally configured by
ALTER ROLE ALL SET ...
I guess an additional reason we might want to have a cluster setting is it requires privileges to change so an admin could disable it if they wanted to prevent foreign keys like this from being created whereas even with ALTER ROLE ALL SET ..., a user could override it in their session. I'll think about it some more but I'm leaning towards changing it to a cluster setting and getting rid of the session var.
575c736 to
4e3382b
Compare
This comment was marked as resolved.
This comment was marked as resolved.
4e3382b to
cf13de4
Compare
This comment was marked as resolved.
This comment was marked as resolved.
789c59a to
f4db866
Compare
This patch updates both the legacy and declarative schema changer to allow creating a foreign key when a subset of the referenced columns is covered by a unique constraint on the parent table. This is safe from a correctness standpoint because uniqueness on a set of columns implies uniqueness on a superset of those same columns. The creation of these relaxed foreign keys is gated on `V26_3` to prevent nodes running older binaries from corrupting them. Release note (sql change): Foreign keys are now allowed to be created when the referenced table has a unique constraint on a subset of the referenced columns.
f4db866 to
bffd71f
Compare
andyyang890
left a comment
There was a problem hiding this comment.
Thanks for the reviews! Ready for another look
@andyyang890 made 9 comments and resolved 1 discussion.
Reviewable status:complete! 0 of 0 LGTMs obtained (waiting on DrewKimball, michae2, spilchen, and ZhouXing19).
Previously, andyyang890 (Andy Yang) wrote…
I guess an additional reason we might want to have a cluster setting is it requires privileges to change so an admin could disable it if they wanted to prevent foreign keys like this from being created whereas even with
ALTER ROLE ALL SET ..., a user could override it in their session. I'll think about it some more but I'm leaning towards changing it to a cluster setting and getting rid of the session var.
I ended up removing the session variable/cluster setting altogether. It adds a bunch of complexity for a feature that users can simply choose to not use. I can add one in a follow-up PR if we really want one.
pkg/sql/create_table.go line 1143 at r1 (raw file):
Previously, spilchen wrote…
Won't this be confusing for the user? Say they haven't finalized v26.3 yet but they've set the session variable, thinking they're getting the new behaviour. Yet we silently fall back instead.
If allowSubset is off because of the version and they don't have a compliant index, we error out. Could we make that error include a hint based on the setting + version state?
Now that there isn't a session var, we only gate on V26_3
pkg/sql/create_table.go line 1158 at r1 (raw file):
Previously, spilchen wrote…
The mention of a not begin a portable extension is odd. We have many cases where we diverge from postgres but we don't have similar notices (e.g. ASOF time, RBT tables, etc.). Why is this special?
Removed
pkg/sql/create_table.go line 1159 at r1 (raw file):
Previously, spilchen wrote…
I find mentioning potential latency here a bit alarming. If the latency is real and significant, why have on it by default. But if it's negligible why even mention it in the notice. I wonder if we shouldn't mention this at all and just have it in our documentation.
Removed
pkg/sql/catalog/table_elements.go line 1408 at r1 (raw file):
Previously, ZhouXing19 (Jane Xing) wrote…
super nit: here we iterate the
with index&without indexdual-loops for both the general and allow-subset case. I think we can collapse it without hurting readability. E.g.allUniqueConstraints := func() []UniqueConstraint {referencedTable.UniqueConstraintsWithIndex()..., referencedTable.UniqueConstraintsWithoutIndex()...} for _, c := range allUniqueConstraints {...}If we collapse it into a single loop like this, need to fix
isBetteras well to explicitly shows thatUnique-with-index wins over unique-without-index.
Rewritten to remove duplication
pkg/sql/schemachanger/scbuild/internal/scbuildstmt/alter_table_drop_column.go line 530 at r1 (raw file):
Previously, ZhouXing19 (Jane Xing) wrote…
great catch! I wonder if this is a fix for a pre-existing bug but we bundled into the PR, rather than something required by the subset-matching feature itself. Maybe worth extracting it as a separate commit.
I thought about doing this too but it wasn't a problem in the old code because any inbound foreign keys would've been filtered out already before it got here (in the old code, every column in the foreign key would have to be in a unique index and so when the unique index got torn down earlier by maybeDropDependentFKConstraints it would've dropped the FK at the same time)
pkg/sql/schemachanger/scbuild/internal/scbuildstmt/alter_table_drop_column.go line 542 at r1 (raw file):
Previously, spilchen wrote…
nit: these versions of the functions are deprecated
Replaced
pkg/sql/schemachanger/scbuild/internal/scbuildstmt/alter_table_drop_column.go line 551 at r1 (raw file):
Previously, spilchen wrote…
nit:
FindNamespaceandFindConstraintWithoutIndexNameis deprecated
Replaced
spilchen
left a comment
There was a problem hiding this comment.
@spilchen reviewed 33 files and all commit messages, made 1 comment, and resolved 5 discussions.
Reviewable status:complete! 1 of 0 LGTMs obtained (waiting on andyyang890, DrewKimball, michae2, rafiss, and ZhouXing19).
This patch updates both the legacy and declarative schema changer to
allow creating a foreign key when a subset of the referenced columns
is covered by a unique constraint on the parent table. This is safe
from a correctness standpoint because uniqueness on a set of columns
implies uniqueness on a superset of those same columns.
The creation of these relaxed foreign keys is gated on
V26_3toprevent nodes running older binaries from corrupting them.
Fixes #160013
Release note (sql change): Foreign keys are now allowed to be created
when the referenced table has a unique constraint on a subset of the
referenced columns.