Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

sql: Allow dropping enum value when referenced by UDF #118780

Merged
merged 2 commits into from
Mar 5, 2024

Conversation

rimadeodhar
Copy link
Collaborator

@rimadeodhar rimadeodhar commented Feb 5, 2024

This PR extends the current validation code executed
while dropping an enum value to also check for the value
being referenced in a UDF. Additionally, we re-enable
testing for alterTypeDropValue within the random
schema change workload.

Prior to dropping an enum value, we confirm that the value
is unused in tables, constraints, indices etc. However,
when we added support for UDFs, we did not extend the
scope of this to also check for an enum value being
referenced in a UDF. This introduced a bug where we were
unable to drop an unused enum value if some other value
from within the enum was being referenced by a UDF.
For example, if an enum 'e' contained values {'1', '2', '3'},
with the value '1' being referenced in a UDF, we were
unable to drop the values '2' and '3' as well.
This PR fixes this bug by expanding the usage check for
enum values to also include UDFs. With this fix, we can drop
an enum value as long as the value itself is unreferenced.

The first commit address this bug and and refactors some of
the common code into its own methods to avoid code duplication.
It extends the existing test suite to test for multiple scenarios
when an enum value is dropped.

The second commit re-enables testing for alterTypeDropValue
within the random schema workload. This was disabled
due to the bug which is being addressed in the first commit.

Epic: none

Fixes: #115612, #114844

Release note (bug fix): Fix an existing bug where we are
unable to drop an unused value from an enum if the enum is being
referenced in a UDF. With this bug fix, we can drop a value
from an enum as long as the value is not being referenced
by a UDF. Note, the enum can still be referenced by a UDF.
We only allow a value to be dropped if the value is not being
referenced by any other data element including UDFs.

@cockroach-teamcity
Copy link
Member

This change is Reviewable

@rimadeodhar rimadeodhar requested a review from a team February 5, 2024 20:59
@rimadeodhar rimadeodhar marked this pull request as ready for review February 5, 2024 20:59
@rimadeodhar rimadeodhar requested a review from a team February 6, 2024 17:06
Copy link
Collaborator

@rafiss rafiss left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @rimadeodhar)


pkg/sql/type_change_test.go line 523 at r1 (raw file):

	if _, err := sqlDB.Exec(`
CREATE DATABASE d;

one thing to be aware of: if the statements are all sent in a single string like this, they are all executed in the same implicit transaction. was that intentional for this test? or should we break it into multiple transactions? (it would be good to add a comment either way)

Copy link
Collaborator

@rafiss rafiss left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @rimadeodhar)


-- commits line 14 at r1:
does this also resolve #114844 ?

if so, let's make another commit in this PR that enables alterTypeDropValue in the RSW:

alterTypeDropValue: 0, // Disabled and tracked with #114844, #113859, and #115612.

@rafiss rafiss requested a review from Xiang-Gu February 7, 2024 14:14
Copy link
Collaborator

@rafiss rafiss left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @rimadeodhar and @Xiang-Gu)


-- commits line 20 at r1:
just wanted to make sure i follow correctly. here it says: "we can drop a value from an enum as long as the value is not being referenced by a UDF."

but the sentence before is "Fix an existing bug where we are unable to drop a value from an enum if the enum is being referenced in a UDF." should this second sentence instead say something about how what we are fixing is actually the error handling? IIUC, both before and after this patch, the DROP would fail.

also the commit message title is "Allow dropping enum value when referenced by UDF." should it be "Allow dropping enum value when not referenced by UDF"

Copy link
Collaborator Author

@rimadeodhar rimadeodhar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @rafiss and @Xiang-Gu)


-- commits line 14 at r1:

Previously, rafiss (Rafi Shamim) wrote…

does this also resolve #114844 ?

if so, let's make another commit in this PR that enables alterTypeDropValue in the RSW:

alterTypeDropValue: 0, // Disabled and tracked with #114844, #113859, and #115612.

Yeah, it likely will. I'll do this as a follow up.


-- commits line 20 at r1:

Previously, rafiss (Rafi Shamim) wrote…

just wanted to make sure i follow correctly. here it says: "we can drop a value from an enum as long as the value is not being referenced by a UDF."

but the sentence before is "Fix an existing bug where we are unable to drop a value from an enum if the enum is being referenced in a UDF." should this second sentence instead say something about how what we are fixing is actually the error handling? IIUC, both before and after this patch, the DROP would fail.

also the commit message title is "Allow dropping enum value when referenced by UDF." should it be "Allow dropping enum value when not referenced by UDF"

Not quite. We are not just fixing the error handling. This will allow the value to be dropped even if the enum is referenced as long as the value being dropped isn't referenced. I'll make the wording consistent in the messaging.


pkg/sql/type_change_test.go line 523 at r1 (raw file):

Previously, rafiss (Rafi Shamim) wrote…

one thing to be aware of: if the statements are all sent in a single string like this, they are all executed in the same implicit transaction. was that intentional for this test? or should we break it into multiple transactions? (it would be good to add a comment either way)

Yeah, this is intentional as the statements are all for the test setup. I'll add a clarifying comment.

Copy link
Collaborator

@rafiss rafiss left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @rimadeodhar and @Xiang-Gu)


-- commits line 20 at r1:

Previously, rimadeodhar (Rima Deodhar) wrote…

Not quite. We are not just fixing the error handling. This will allow the value to be dropped even if the enum is referenced as long as the value being dropped isn't referenced. I'll make the wording consistent in the messaging.

ah i understand the distinction now, thanks for explaining

Copy link
Collaborator

@rafiss rafiss left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nice, the new code is much more readable.

i had some nits, and just one question to make sure a case is tested

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @rimadeodhar and @Xiang-Gu)


pkg/sql/type_change.go line 635 at r2 (raw file):

}

func visitExprToCheckEnumValueUsage(

nice cleanup.. it was more confusing with that inner function closure


pkg/sql/type_change.go line 707 at r2 (raw file):

		if foundUsageInCurrentWalk {
			// Set foundUsage to true if enum usage is detected in any expression in the AST walk.
			foundUsage = true

super nit: could change this to foundUsage = foundUsage || foundUsageInCurrentWalk

also technically, we could stop the recursion as soon as we find a usage, but it seems a bit harder to read that way. just mentioning it in case you think there's a way to short-circuit but still have this easy to understand


pkg/sql/type_change.go line 729 at r2 (raw file):

		if foundUsageInCurrentWalk {
			// Set foundUsage to true if enum usage is detected in any expression in the AST walk.
			foundUsage = true

super nit: could change this to foundUsage = foundUsage || foundUsageInCurrentWalk


pkg/sql/type_change.go line 760 at r2 (raw file):

		if foundUsageInCurrentWalk {
			// Set foundUsage to true if enum usage is detected in any expression in the AST walk.
			foundUsage = true

super nit: could change this to foundUsage = foundUsage || foundUsageInCurrentWalk


pkg/sql/type_change.go line 789 at r2 (raw file):

		plpgsqltree.Walk(&v, stmt.AST)
		if v.Err != nil {
			return errors.Wrapf(err, "failed to parse UDF %s", udfDesc.GetName())

nit: should be v.Err, not err


pkg/sql/type_change.go line 1009 at r2 (raw file):

) error {
	descGetter := descsCol.ByID(txn.KV()).WithoutNonPublic().Get()
	for _, ID := range typeDesc.ReferencingDescriptorIDs {

super super nit: mostly in this code we use id instead of ID


pkg/sql/type_change.go line 1016 at r2 (raw file):

		}
		// An enum value can be used within a table and a UDF.
		if desc.DescriptorType() == catalog.Table {

nit: this would be a good place to use switch:

switch desc := desc.(type) {
case catalog.TableDescriptor:
 	if err := t.canRemoveEnumValueFromTable(ctx, typeDesc, txn, member, descsCol, desc, id); err != nil {
		return err
	}
case catalog.FunctionDescriptor:
 	if err := t.canRemoveEnumValueFromUDF(typeDesc, member, desc); err != nil {
		return err
	}
default:
// ...
}

it helps remove a lot of boilerplate


pkg/sql/type_change.go line 1189 at r2 (raw file):

		// execute the following code when its referenced with a relation. So check for descriptor
		// type and skip if it is not a relation.
		// TODO(rima): Is the above a valid assumption?

hm, i wonder if we would need to also check UDFs to see if any functions have a default parameter that is an enum array, which contains the element being dropped? is that covered in a test?


pkg/sql/type_change.go line 1194 at r2 (raw file):

		}
		tblDesc, ok := desc.(catalog.TableDescriptor)
		if !ok {

we could avoid the check on DescriptorType() and instead do:

tblDesc, isTable := desc.(catalog.TableDescriptor)
if !isTable {
  continue
}

we use the type assertion pattern in most other places

Copy link
Collaborator Author

@rimadeodhar rimadeodhar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @rafiss and @Xiang-Gu)


pkg/sql/type_change.go line 707 at r2 (raw file):

Previously, rafiss (Rafi Shamim) wrote…

super nit: could change this to foundUsage = foundUsage || foundUsageInCurrentWalk

also technically, we could stop the recursion as soon as we find a usage, but it seems a bit harder to read that way. just mentioning it in case you think there's a way to short-circuit but still have this easy to understand

Done. As for short circuiting the recursion, we do that but the statement walker continues to evaluate any other unvisited parts of the expression. So its important to set foundUsage to true if enum value usage is detected at any point.


pkg/sql/type_change.go line 729 at r2 (raw file):

Previously, rafiss (Rafi Shamim) wrote…

super nit: could change this to foundUsage = foundUsage || foundUsageInCurrentWalk

Done.


pkg/sql/type_change.go line 760 at r2 (raw file):

Previously, rafiss (Rafi Shamim) wrote…

super nit: could change this to foundUsage = foundUsage || foundUsageInCurrentWalk

Done.


pkg/sql/type_change.go line 1009 at r2 (raw file):

Previously, rafiss (Rafi Shamim) wrote…

super super nit: mostly in this code we use id instead of ID

Done.


pkg/sql/type_change.go line 1189 at r2 (raw file):

Previously, rafiss (Rafi Shamim) wrote…

hm, i wonder if we would need to also check UDFs to see if any functions have a default parameter that is an enum array, which contains the element being dropped? is that covered in a test?

I do have array usage as values within a UDF. We don't support default values yet for UDF arguments https://www.cockroachlabs.com/docs/stable/user-defined-functions#:~:text=CockroachDB%20does%20not%20support%20default,implicit%20record%20type%2C%20or%20VOID%20.


pkg/sql/type_change.go line 1194 at r2 (raw file):

Previously, rafiss (Rafi Shamim) wrote…

we could avoid the check on DescriptorType() and instead do:

tblDesc, isTable := desc.(catalog.TableDescriptor)
if !isTable {
  continue
}

we use the type assertion pattern in most other places

Done.

@rimadeodhar rimadeodhar requested a review from a team as a code owner February 16, 2024 20:46
@rimadeodhar rimadeodhar requested review from DarrylWong and renatolabs and removed request for a team February 16, 2024 20:46
Copy link
Collaborator

@rafiss rafiss left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm! but there is one cleanup we should still add

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @DarrylWong, @renatolabs, @rimadeodhar, and @Xiang-Gu)


pkg/sql/type_change.go line 1013 at r5 (raw file):

		switch desc.DescriptorType() {
		case catalog.Table:
			tblDesc, err := descGetter.Table(ctx, id)

nit: calling descGetter.Table and descGetter.Function causes it to re-fetch the descriptor. that's not necessary, since we already have it. so we can make the switch like this instead (which is also the convention we use in most parts of sql/catalog):

switch desc := desc.(type) {
case catalog.TableDescriptor:
 	if err := t.canRemoveEnumValueFromTable(ctx, typeDesc, txn, member, descsCol, desc, id); err != nil {
		return err
	}
case catalog.FunctionDescriptor:
 	if err := t.canRemoveEnumValueFromUDF(typeDesc, member, desc); err != nil {
		return err
	}
default:
// ...
}

Copy link
Collaborator Author

@rimadeodhar rimadeodhar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @DarrylWong, @rafiss, @renatolabs, and @Xiang-Gu)


pkg/sql/type_change.go line 1013 at r5 (raw file):

Previously, rafiss (Rafi Shamim) wrote…

nit: calling descGetter.Table and descGetter.Function causes it to re-fetch the descriptor. that's not necessary, since we already have it. so we can make the switch like this instead (which is also the convention we use in most parts of sql/catalog):

switch desc := desc.(type) {
case catalog.TableDescriptor:
 	if err := t.canRemoveEnumValueFromTable(ctx, typeDesc, txn, member, descsCol, desc, id); err != nil {
		return err
	}
case catalog.FunctionDescriptor:
 	if err := t.canRemoveEnumValueFromUDF(typeDesc, member, desc); err != nil {
		return err
	}
default:
// ...
}

TIL! Thanks for the pointer.

This commit  extends the current validation code executed
while dropping an enum value to also check for the value
being referenced in a UDF.

Prior to dropping an enum value, we confirm that the value
is unused in tables, constraints, indices etc. However,
when we added support for UDFs, we did not extend the
scope of this to also check for an enum value being
referenced in a UDF. This introduced a bug where we were
unable to drop an unused enum value if some other value
from within the enum was  being referenced by a UDF.
For example, if an enum 'e' contained values {'1', '2', '3'},
with the value '1' being referenced in a UDF, we were
unable to drop the values '2' and '3' as well.
This PR fixes this bug by expanding the usage check for
enum values to also include UDFs. With this fix, we can drop
an enum value as long as the value itself is unreferenced.

Additionally, the PR also refactors some of the common code into its
own methods to avoid code duplication. Finally, it extends the existing
test suite to test for multiple scenarios when an enum value
is dropped.

Fixes: cockroachdb#115612
@rimadeodhar
Copy link
Collaborator Author

TFTR

bors r=rafiss

@craig
Copy link
Contributor

craig bot commented Mar 5, 2024

Build succeeded:

@craig craig bot merged commit ba61280 into cockroachdb:master Mar 5, 2024
18 checks passed
@fqazi
Copy link
Collaborator

fqazi commented Mar 27, 2024

blathers backport 23.2

Copy link

blathers-crl bot commented Mar 27, 2024

Encountered an error creating backports. Some common things that can go wrong:

  1. The backport branch might have already existed.
  2. There was a merge conflict.
  3. The backport branch contained merge commits.

You might need to create your backport manually using the backport tool.


error creating merge commit from 01414b0 to blathers/backport-release-23.2-118780: POST https://api.github.com/repos/cockroachdb/cockroach/merges: 409 Merge conflict []

you may need to manually resolve merge conflicts with the backport tool.

Backport to branch 23.2 failed. See errors above.


🦉 Hoot! I am a Blathers, a bot for CockroachDB. My owner is dev-inf.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

sql: allow dropping enum value when this type is referenced by a function
4 participants