New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
sql: enable dependency tracking and support for UDFs calling other UDFs #119932
Conversation
Your pull request contains more than 1000 changes. It is strongly encouraged to split big PRs into smaller chunks. 🦉 Hoot! I am a Blathers, a bot for CockroachDB. My owner is dev-inf. |
3f5d50e
to
b896271
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is exciting work! I have mostly nits and test requests, apart from a bug in the optbuilder code.
Reviewed 4 of 4 files at r1, 21 of 21 files at r2, 29 of 29 files at r3, all commit messages.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @fqazi, @mgartner, and @yuzefovich)
pkg/sql/create_function.go
line 455 at r2 (raw file):
// Add forward references to UDF descriptor. udfDesc.DependsOn = backrefTblIDs.Ordered() //udfDesc.DependsOnFunction =
[nit] Left over line here.
pkg/sql/catalog/descpb/structured.proto
line 1657 at r2 (raw file):
repeated uint32 depends_on_types = 14 [(gogoproto.casttype) = "ID"];
[nit] extra newline
pkg/sql/catalog/descpb/structured.proto
line 1674 at r2 (raw file):
// DependsOnFunctions are the user defined functions that this function // qdepends on.
[nit] qdepends
-> depends
pkg/sql/catalog/funcdesc/func_desc.go
line 292 at r2 (raw file):
// Currently, we don't support cross function references yet. // So here we assume that all inbound references are from tables.
[nit] This comment is outdated.
pkg/sql/opt/optbuilder/routine.go
line 98 at r2 (raw file):
if b.trackSchemaDeps { b.schemaFunctionDeps.Add(int(o.Oid))
This will miss a UDF used as a data source like SELECT * FROM f()
, as well as a stored proc like CALL p()
. This would probably be the best location:
cockroach/pkg/sql/opt/optbuilder/routine.go
Lines 164 to 165 in f4dc2b5
o := f.ResolvedOverload() | |
b.factory.Metadata().AddUserDefinedFunction(o, f.Func.ReferenceByName) |
Also, we should set trackSchemaDeps
to false (and then defer resetting it back to true) like we do for views here, so that we don't pick up dependencies from within the nested routine's body.
pkg/sql/schemachanger/scpb/elements.proto
line 727 at r3 (raw file):
repeated uint32 uses_sequence_ids = 6 [(gogoproto.customname) = "UsesSequenceIDs", (gogoproto.casttype) = "github.com/cockroachdb/cockroach/pkg/sql/sem/catid.DescID"]; repeated uint32 uses_type_ids = 7 [(gogoproto.customname) = "UsesTypeIDs", (gogoproto.casttype) = "github.com/cockroachdb/cockroach/pkg/sql/sem/catid.DescID"]; repeated FunctionReference uses_functions = 8 [(gogoproto.nullable) = false];
[nit] Would it be nicer to do an ID slice instead like for type tracking? Do you anticipate we'll need to track other fields in the future?
pkg/sql/logictest/testdata/logic_test/drop_function
line 288 at r2 (raw file):
statement ok DROP FUNCTION f_called_by_b2;
Can we have similar tests for a depended-on function where we:
- try to CREATE OR REPLACE it
- try to rename it
- try to change its schema
Here are some tests I wrote a while ago while thinking about this issue: https://gist.github.com/DrewKimball/47390e110fc3dc3fae11cc96adcd9579
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice work! I only have some nits and will defer to Drew for approval.
Reviewed 4 of 4 files at r1, 21 of 21 files at r2, 29 of 29 files at r3, all commit messages.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @DrewKimball, @fqazi, and @mgartner)
pkg/sql/create_function.go
line 77 at r2 (raw file):
} if dbID := funcDesc.GetParentID(); dbID != n.dbDesc.GetID() && dbID != keys.SystemDatabaseID { return pgerror.Newf(pgcode.FeatureNotSupported, "the function cannot refer to other databases")
nit: should we make this error message more user-friendly? Something like "dependent function %s cannot refer to other databases", funcDesc.Name()
.
pkg/sql/create_function.go
line 180 at r2 (raw file):
func (n *createFunctionNode) replaceFunction(udfDesc *funcdesc.Mutable, params runParams) error { // TODO(chengxiong): add validation that the function is not referenced. This
nit: this TODO should probably be removed.
pkg/sql/opt_exec_factory.go
line 1871 at r2 (raw file):
} planDeps, typeDepSet, _, err := toPlanDependencies(deps, typeDeps, intsets.Fast{})
nit: add inlined comment /* funcDeps */
.
pkg/sql/catalog/validate.go
line 125 at r2 (raw file):
// ValidateOutboundFunctionRef validates outbound reference to a function descriptor // depID from descriptor selfID.
nit: I don't see a mention of selfID
in the argument list nor in the code (ditto for the comment on ValidateOutboundTableRef
).
pkg/sql/catalog/descpb/structured.proto
line 1676 at r2 (raw file):
// qdepends on. repeated uint32 depends_on_functions = 22 [(gogoproto.casttype) = "ID"]; // Next field id is 22
nit: now it's 23
pkg/sql/logictest/testdata/logic_test/drop_function
line 288 at r2 (raw file):
Previously, DrewKimball (Drew Kimball) wrote…
Can we have similar tests for a depended-on function where we:
- try to CREATE OR REPLACE it
- try to rename it
- try to change its schema
Here are some tests I wrote a while ago while thinking about this issue: https://gist.github.com/DrewKimball/47390e110fc3dc3fae11cc96adcd9579
We probably should add some test cases to udf_rewrite
files too.
pkg/sql/logictest/testdata/logic_test/udf_oid_ref
line 101 at r1 (raw file):
statement ok CREATE FUNCTION f_using_udf() RETURNS INT LANGUAGE SQL AS $$ SELECT [FUNCTION $fn_oid]() $$;
nit: we should also invoke this UDF now.
pkg/sql/logictest/testdata/logic_test/udf_unsupported
line 39 at r1 (raw file):
subtest disallow_udf_in_views_and_udf
nit: the subtest name should be adjusted. Also probably some of the test cases should be moved into a different file (since they are no longer "unsupported").
f082e1a
to
7332b0e
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @DrewKimball, @mgartner, and @yuzefovich)
pkg/sql/create_function.go
line 455 at r2 (raw file):
Previously, DrewKimball (Drew Kimball) wrote…
[nit] Left over line here.
Done.
pkg/sql/catalog/descpb/structured.proto
line 1657 at r2 (raw file):
Previously, DrewKimball (Drew Kimball) wrote…
[nit] extra newline
Done.
pkg/sql/catalog/descpb/structured.proto
line 1674 at r2 (raw file):
Previously, DrewKimball (Drew Kimball) wrote…
[nit]
qdepends
->depends
Done.
pkg/sql/catalog/funcdesc/func_desc.go
line 292 at r2 (raw file):
Previously, DrewKimball (Drew Kimball) wrote…
[nit] This comment is outdated.
Done.
pkg/sql/opt/optbuilder/routine.go
line 98 at r2 (raw file):
Previously, DrewKimball (Drew Kimball) wrote…
This will miss a UDF used as a data source like
SELECT * FROM f()
, as well as a stored proc likeCALL p()
. This would probably be the best location:cockroach/pkg/sql/opt/optbuilder/routine.go
Lines 164 to 165 in f4dc2b5
o := f.ResolvedOverload() b.factory.Metadata().AddUserDefinedFunction(o, f.Func.ReferenceByName) Also, we should set
trackSchemaDeps
to false (and then defer resetting it back to true) like we do for views here, so that we don't pick up dependencies from within the nested routine's body.
Done.
pkg/sql/schemachanger/scpb/elements.proto
line 727 at r3 (raw file):
Previously, DrewKimball (Drew Kimball) wrote…
[nit] Would it be nicer to do an ID slice instead like for type tracking? Do you anticipate we'll need to track other fields in the future?
Done.
pkg/sql/logictest/testdata/logic_test/drop_function
line 288 at r2 (raw file):
Previously, yuzefovich (Yahor Yuzefovich) wrote…
We probably should add some test cases to
udf_rewrite
files too.
Good point, I added tests to confirm this is broken. I have opened: #120351 for fixing it.
d848b5c
to
6e05b46
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 14 of 63 files at r4, 55 of 55 files at r8, 2 of 2 files at r9, 24 of 24 files at r10, 30 of 30 files at r11, 4 of 4 files at r12, all commit messages.
Reviewable status: complete! 1 of 0 LGTMs obtained (waiting on @fqazi, @mgartner, and @yuzefovich)
pkg/sql/opt/optbuilder/builder.go
line 147 at r10 (raw file):
// insideUDF is true when the current expressions are being built within a // UDF. insideUDF int
[nit] Comment is outdated.
Also, I think it'd be fine to keep as a bool, and combine restoration of the previous value with trackSchemaDeps
:
oldTrackingSchemaDeps := b.trackSchemaDeps
oldInsideUDF := b.insideUDF
defer func() {
b.trackSchemaDeps = oldTrackingSchemaDeps
b.insideUDF = oldInsideUDF
}()
Up to you, though.
pkg/sql/opt/optbuilder/routine.go
line 264 at r10 (raw file):
// boolean will not be sufficient to track whether or not we are in a UDF. // We'll need to track the depth of the UDFs we are building expressions // within.
[nit] Outdated comment.
Your pull request contains more than 1000 changes. It is strongly encouraged to split big PRs into smaller chunks. Thank you for updating your pull request. Before a member of our team reviews your PR, I have some potential action items for you:
🦉 Hoot! I am a Blathers, a bot for CockroachDB. My owner is dev-inf. |
Previously, we blocked the ability of one UDF to call another UDF due to a lack of dependency tracking. This patch removes this limitation by allowing user-defined functions to reference other user-defined functions in their bodies. Informs: cockroachdb#88198 Release note: None
Previously, we lacked a way to track the functions potentially invoked by a UDF. Consequently, only built-in functions could be called from within a UDF. This patch addresses this issue by adding dependency tracking for function calls within the optbuilder. These dependencies are passed to the legacy schema changer, where they are stored in the function descriptor. This information prevents DROP FUNCTION operations that would break dependent functions. Similarly, CREATE OR REPLACE updates and corrects these dependencies. Informs: cockroachdb#119930 Epic: CRDB-19398 Release note: none
ef0c185
to
6aebcb8
Compare
Previously, we did not allow user-defined functions to invoke other user-defined functions due to a lack of dependency tracking. This patch adds support for dependency tracking within the declarative schema changer and includes an end-to-end test to confirm correct updates to the descriptors. Fixes: cockroachdb#119930 Epic: CRDB-19398 Release note: none
Previously, the DROP SCHEMA CASCADE command could leave dangling function references. This would cause tables or functions that referenced functions in the dropped schema to become unusable. The issue stemmed from the legacy schema changer not preventing the deletion of functions with cross-schema dependencies. This patch addresses the problem by blocking such drops from executing with a dependent object error. The declarative schema changer implements this cascade behaviour properly. Fixes: cockroachdb#120353 Release note (bug fix): DROP SCHEMA CASCADE could lead to dangling function references in other schemas accessing any functions.
@DrewKimball @yuzefovich TFTR! bors r+ |
Build succeeded: |
Previously, we did not support UDFs invoking other UDFs because the dependency tracking infrastructure was missing. That meant it was possible for a UDF to be broken if a dependent was dropped. To address this this patch does the following:
This PR does not add DROP FUNCTION CASCADE yet, which will need a bit more work since it needs to deal with all sorts of other dependencies.
Informs: #88198
Fixes: #120353
Fixes: #119930