Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

sql: enable dependency tracking and support for UDFs calling other UDFs #119932

Merged
merged 4 commits into from Mar 13, 2024

Conversation

fqazi
Copy link
Collaborator

@fqazi fqazi commented Mar 5, 2024

Previously, we did not support UDFs invoking other UDFs because the dependency tracking infrastructure was missing. That meant it was possible for a UDF to be broken if a dependent was dropped. To address this this patch does the following:

  • Inside optbuilder enable support for this feature (Note: This doesn't mean its production ready but to allow us to test things out).
  • Add dependency tracking support for invocation to other UDFs
  • Use the dependency tracing inside the legacy / declarative CREATE FUNCTION so that the descriptor reflects this information
  • If any dependent object exists block DROP FUNCTION operations

This PR does not add DROP FUNCTION CASCADE yet, which will need a bit more work since it needs to deal with all sorts of other dependencies.

Informs: #88198
Fixes: #120353
Fixes: #119930

Copy link

blathers-crl bot commented Mar 5, 2024

Your pull request contains more than 1000 changes. It is strongly encouraged to split big PRs into smaller chunks.

🦉 Hoot! I am a Blathers, a bot for CockroachDB. My owner is dev-inf.

@cockroach-teamcity
Copy link
Member

This change is Reviewable

@fqazi fqazi force-pushed the spCallingSpTrack branch 4 times, most recently from 3f5d50e to b896271 Compare March 6, 2024 02:29
@fqazi fqazi marked this pull request as ready for review March 6, 2024 02:30
@fqazi fqazi requested review from a team as code owners March 6, 2024 02:30
@fqazi fqazi requested review from yuzefovich and removed request for a team March 6, 2024 02:30
Copy link
Collaborator

@DrewKimball DrewKimball left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is exciting work! I have mostly nits and test requests, apart from a bug in the optbuilder code.

Reviewed 4 of 4 files at r1, 21 of 21 files at r2, 29 of 29 files at r3, all commit messages.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @fqazi, @mgartner, and @yuzefovich)


pkg/sql/create_function.go line 455 at r2 (raw file):

	// Add forward references to UDF descriptor.
	udfDesc.DependsOn = backrefTblIDs.Ordered()
	//udfDesc.DependsOnFunction =

[nit] Left over line here.


pkg/sql/catalog/descpb/structured.proto line 1657 at r2 (raw file):

  repeated uint32 depends_on_types = 14 [(gogoproto.casttype) = "ID"];

[nit] extra newline


pkg/sql/catalog/descpb/structured.proto line 1674 at r2 (raw file):

  // DependsOnFunctions are the user defined functions that this function
  // qdepends on.

[nit] qdepends -> depends


pkg/sql/catalog/funcdesc/func_desc.go line 292 at r2 (raw file):

	// Currently, we don't support cross function references yet.
	// So here we assume that all inbound references are from tables.

[nit] This comment is outdated.


pkg/sql/opt/optbuilder/routine.go line 98 at r2 (raw file):

	if b.trackSchemaDeps {
		b.schemaFunctionDeps.Add(int(o.Oid))

This will miss a UDF used as a data source like SELECT * FROM f(), as well as a stored proc like CALL p(). This would probably be the best location:

o := f.ResolvedOverload()
b.factory.Metadata().AddUserDefinedFunction(o, f.Func.ReferenceByName)

Also, we should set trackSchemaDeps to false (and then defer resetting it back to true) like we do for views here, so that we don't pick up dependencies from within the nested routine's body.


pkg/sql/schemachanger/scpb/elements.proto line 727 at r3 (raw file):

  repeated uint32 uses_sequence_ids = 6 [(gogoproto.customname) = "UsesSequenceIDs", (gogoproto.casttype) = "github.com/cockroachdb/cockroach/pkg/sql/sem/catid.DescID"];
  repeated uint32 uses_type_ids = 7 [(gogoproto.customname) = "UsesTypeIDs", (gogoproto.casttype) = "github.com/cockroachdb/cockroach/pkg/sql/sem/catid.DescID"];
  repeated FunctionReference uses_functions = 8  [(gogoproto.nullable) = false];

[nit] Would it be nicer to do an ID slice instead like for type tracking? Do you anticipate we'll need to track other fields in the future?


pkg/sql/logictest/testdata/logic_test/drop_function line 288 at r2 (raw file):


statement ok
DROP FUNCTION f_called_by_b2;

Can we have similar tests for a depended-on function where we:

  • try to CREATE OR REPLACE it
  • try to rename it
  • try to change its schema

Here are some tests I wrote a while ago while thinking about this issue: https://gist.github.com/DrewKimball/47390e110fc3dc3fae11cc96adcd9579

Copy link
Member

@yuzefovich yuzefovich left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice work! I only have some nits and will defer to Drew for approval.

Reviewed 4 of 4 files at r1, 21 of 21 files at r2, 29 of 29 files at r3, all commit messages.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @DrewKimball, @fqazi, and @mgartner)


pkg/sql/create_function.go line 77 at r2 (raw file):

		}
		if dbID := funcDesc.GetParentID(); dbID != n.dbDesc.GetID() && dbID != keys.SystemDatabaseID {
			return pgerror.Newf(pgcode.FeatureNotSupported, "the function cannot refer to other databases")

nit: should we make this error message more user-friendly? Something like "dependent function %s cannot refer to other databases", funcDesc.Name().


pkg/sql/create_function.go line 180 at r2 (raw file):

func (n *createFunctionNode) replaceFunction(udfDesc *funcdesc.Mutable, params runParams) error {
	// TODO(chengxiong): add validation that the function is not referenced. This

nit: this TODO should probably be removed.


pkg/sql/opt_exec_factory.go line 1871 at r2 (raw file):

	}

	planDeps, typeDepSet, _, err := toPlanDependencies(deps, typeDeps, intsets.Fast{})

nit: add inlined comment /* funcDeps */.


pkg/sql/catalog/validate.go line 125 at r2 (raw file):

// ValidateOutboundFunctionRef validates outbound reference to a function descriptor
// depID from descriptor selfID.

nit: I don't see a mention of selfID in the argument list nor in the code (ditto for the comment on ValidateOutboundTableRef).


pkg/sql/catalog/descpb/structured.proto line 1676 at r2 (raw file):

  // qdepends on.
  repeated uint32 depends_on_functions = 22  [(gogoproto.casttype) = "ID"];
  // Next field id is 22

nit: now it's 23


pkg/sql/logictest/testdata/logic_test/drop_function line 288 at r2 (raw file):

Previously, DrewKimball (Drew Kimball) wrote…

Can we have similar tests for a depended-on function where we:

  • try to CREATE OR REPLACE it
  • try to rename it
  • try to change its schema

Here are some tests I wrote a while ago while thinking about this issue: https://gist.github.com/DrewKimball/47390e110fc3dc3fae11cc96adcd9579

We probably should add some test cases to udf_rewrite files too.


pkg/sql/logictest/testdata/logic_test/udf_oid_ref line 101 at r1 (raw file):


statement ok
CREATE FUNCTION f_using_udf() RETURNS INT LANGUAGE SQL AS $$ SELECT [FUNCTION $fn_oid]() $$;

nit: we should also invoke this UDF now.


pkg/sql/logictest/testdata/logic_test/udf_unsupported line 39 at r1 (raw file):



subtest disallow_udf_in_views_and_udf

nit: the subtest name should be adjusted. Also probably some of the test cases should be moved into a different file (since they are no longer "unsupported").

@fqazi fqazi force-pushed the spCallingSpTrack branch 2 times, most recently from f082e1a to 7332b0e Compare March 12, 2024 16:46
Copy link
Collaborator Author

@fqazi fqazi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @DrewKimball, @mgartner, and @yuzefovich)


pkg/sql/create_function.go line 455 at r2 (raw file):

Previously, DrewKimball (Drew Kimball) wrote…

[nit] Left over line here.

Done.


pkg/sql/catalog/descpb/structured.proto line 1657 at r2 (raw file):

Previously, DrewKimball (Drew Kimball) wrote…

[nit] extra newline

Done.


pkg/sql/catalog/descpb/structured.proto line 1674 at r2 (raw file):

Previously, DrewKimball (Drew Kimball) wrote…

[nit] qdepends -> depends

Done.


pkg/sql/catalog/funcdesc/func_desc.go line 292 at r2 (raw file):

Previously, DrewKimball (Drew Kimball) wrote…

[nit] This comment is outdated.

Done.


pkg/sql/opt/optbuilder/routine.go line 98 at r2 (raw file):

Previously, DrewKimball (Drew Kimball) wrote…

This will miss a UDF used as a data source like SELECT * FROM f(), as well as a stored proc like CALL p(). This would probably be the best location:

o := f.ResolvedOverload()
b.factory.Metadata().AddUserDefinedFunction(o, f.Func.ReferenceByName)

Also, we should set trackSchemaDeps to false (and then defer resetting it back to true) like we do for views here, so that we don't pick up dependencies from within the nested routine's body.

Done.


pkg/sql/schemachanger/scpb/elements.proto line 727 at r3 (raw file):

Previously, DrewKimball (Drew Kimball) wrote…

[nit] Would it be nicer to do an ID slice instead like for type tracking? Do you anticipate we'll need to track other fields in the future?

Done.


pkg/sql/logictest/testdata/logic_test/drop_function line 288 at r2 (raw file):

Previously, yuzefovich (Yahor Yuzefovich) wrote…

We probably should add some test cases to udf_rewrite files too.

Good point, I added tests to confirm this is broken. I have opened: #120351 for fixing it.

@fqazi fqazi force-pushed the spCallingSpTrack branch 2 times, most recently from d848b5c to 6e05b46 Compare March 12, 2024 18:12
@fqazi fqazi requested a review from yuzefovich March 12, 2024 19:25
Copy link
Collaborator

@DrewKimball DrewKimball left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:lgtm: Great work!

Reviewed 14 of 63 files at r4, 55 of 55 files at r8, 2 of 2 files at r9, 24 of 24 files at r10, 30 of 30 files at r11, 4 of 4 files at r12, all commit messages.
Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (waiting on @fqazi, @mgartner, and @yuzefovich)


pkg/sql/opt/optbuilder/builder.go line 147 at r10 (raw file):

	// insideUDF is true when the current expressions are being built within a
	// UDF.
	insideUDF int

[nit] Comment is outdated.

Also, I think it'd be fine to keep as a bool, and combine restoration of the previous value with trackSchemaDeps:

oldTrackingSchemaDeps := b.trackSchemaDeps
oldInsideUDF := b.insideUDF
defer func() {
  b.trackSchemaDeps = oldTrackingSchemaDeps
  b.insideUDF = oldInsideUDF
}()

Up to you, though.


pkg/sql/opt/optbuilder/routine.go line 264 at r10 (raw file):

	// boolean will not be sufficient to track whether or not we are in a UDF.
	// We'll need to track the depth of the UDFs we are building expressions
	// within.

[nit] Outdated comment.

Copy link

blathers-crl bot commented Mar 13, 2024

Your pull request contains more than 1000 changes. It is strongly encouraged to split big PRs into smaller chunks.

Thank you for updating your pull request.

Before a member of our team reviews your PR, I have some potential action items for you:

  • We notice you have more than one commit in your PR. We try break logical changes into separate commits, but commits such as "fix typo" or "address review commits" should be squashed into one commit and pushed with --force
  • When CI has completed, please ensure no errors have appeared.

🦉 Hoot! I am a Blathers, a bot for CockroachDB. My owner is dev-inf.

@blathers-crl blathers-crl bot added the O-community Originated from the community label Mar 13, 2024
@cockroach-teamcity
Copy link
Member

cockroach-teamcity commented Mar 13, 2024

CLA assistant check
All committers have signed the CLA.

Previously, we blocked the ability of one UDF to call another UDF due to
a lack of dependency tracking. This patch removes this limitation by
allowing user-defined functions to reference other user-defined functions
in their bodies.

Informs: cockroachdb#88198
Release note: None
Previously, we lacked a way to track the functions potentially invoked
by a UDF. Consequently, only built-in functions could be called from
within a UDF. This patch addresses this issue by adding dependency
tracking for function calls within the optbuilder. These dependencies
are passed to the legacy schema changer, where they are stored in the
function descriptor. This information prevents DROP FUNCTION operations
that would break dependent functions. Similarly, CREATE OR REPLACE
updates and corrects these dependencies.

Informs: cockroachdb#119930
Epic: CRDB-19398
Release note: none
@fqazi fqazi removed the O-community Originated from the community label Mar 13, 2024
@fqazi fqazi force-pushed the spCallingSpTrack branch 3 times, most recently from ef0c185 to 6aebcb8 Compare March 13, 2024 01:54
Previously, we did not allow user-defined functions to invoke other
user-defined functions due to a lack of dependency tracking. This patch
adds support for dependency tracking within the declarative schema
changer and includes an end-to-end test to confirm correct updates to
the descriptors.

Fixes: cockroachdb#119930
Epic: CRDB-19398
Release note: none
Previously, the DROP SCHEMA CASCADE command could leave dangling
function references. This would cause tables or functions that
referenced functions in the dropped schema to become unusable. The issue
stemmed from the legacy schema changer not preventing the deletion of
functions with cross-schema dependencies. This patch addresses the
problem by blocking such drops from executing with a dependent object
error.

The declarative schema changer implements this cascade behaviour
properly.

Fixes: cockroachdb#120353
Release note (bug fix): DROP SCHEMA CASCADE could lead to dangling
function references in other schemas accessing any functions.
@fqazi
Copy link
Collaborator Author

fqazi commented Mar 13, 2024

@DrewKimball @yuzefovich TFTR!

bors r+

@craig
Copy link
Contributor

craig bot commented Mar 13, 2024

@craig craig bot merged commit 2ac5657 into cockroachdb:master Mar 13, 2024
19 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
4 participants