-
Notifications
You must be signed in to change notification settings - Fork 28k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[SPARK-30039][SQL] CREATE FUNCTION should do multi-catalog resolution #26890
[SPARK-30039][SQL] CREATE FUNCTION should do multi-catalog resolution #26890
Conversation
ok to test. |
} | ||
|
||
private def parseSessionCatalogFunctionIdentifier(sql: String, catalog: CatalogPlugin, | ||
functionIdent: Identifier): (Option[String], String) = { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- private def parseSessionCatalogFunctionIdentifier(sql: String, catalog: CatalogPlugin,
+ private def parseSessionCatalogFunctionIdentifier(
+ sql: String,
+ catalog: CatalogPlugin,
} else { | ||
throw new AnalysisException ("SHOW FUNCTIONS is only supported in v1 catalog") | ||
} | ||
val (db, fn) = |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe, Oh. Got it. It's used in the above.val (database, function) =
for consistency.
} | ||
DescribeFunctionCommand(functionIdentifier, extended) | ||
val (database, function) = | ||
parseSessionCatalogFunctionIdentifier("DESCRIBE FUNCTION", catalog, functionIdent) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just a question. Is it hard to make SessionCatalogAndFunction
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It would be possible in the case that we would have v2 catalog functions to resolve it. But for now we need to handle an specific error for this case
throw new AnalysisException(s"$sql is only supported in v1 catalog")
If we don't handle it here, we find and unresolved exception that is not clear.
It is the same situation that we have with the commands that use parseV1Table function.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we can do it if we make the error message generic (Function command is only supported in v1 catalog.
)? Either approach sounds reasonable to me, but +1 for using extractor.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure what are you proposing exactly. We could have something like that:
case DescribeFunctionStatement(SessionCatalogAndFunction(catalog, functionIdent), extended) =>
...
case DropFunctionStatement(SessionCatalogAndFunction(catalog, functionIdent), ifExists, isTemp) =>
...
case CreateFunctionStatement(SessionCatalogAndFunction(catalog, functionIdent),
className, resources, isTemp, ignoreIfExists, replace) =>
...
where SessionCatalogAndFunction is an extractor that tests if isSessionCatalog, something like SessionCatalogAndTable do.
In the case that !isSessionCatalog it implies that any of this cases match and the result is
unresolved operator 'CreateFunctionStatement [testcat, ns1, ns2, fun]
for Create function
How could we catch this case generically and throw a generic exception like you propose?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah OK. This requires throwing an exception inside the extractor, which may not be ideal.
Test build #115339 has finished for PR 26890 at commit
|
Test build #115355 has finished for PR 26890 at commit
|
} | ||
DescribeFunctionCommand(functionIdentifier, extended) | ||
val (database, function) = | ||
parseSessionCatalogFunctionIdentifier("DESCRIBE FUNCTION", catalog, functionIdent) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we can do it if we make the error message generic (Function command is only supported in v1 catalog.
)? Either approach sounds reasonable to me, but +1 for using extractor.
replace) | ||
} | ||
|
||
private def parseSessionCatalogFunctionIdentifier( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would return FunctionIdentifier
to match the function name, and also using FunctionIdentifier
makes the intention / usage clear on the caller site.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
oh, yes, it sounds good!
parseSessionCatalogFunctionIdentifier("DROP FUNCTION", catalog, functionIdent) | ||
DropFunctionCommand(database, function, ifExists, isTemp) | ||
|
||
case CreateFunctionStatement(CatalogAndIdentifier(catalog, functionIdent), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would use ident
when you use CatalogAndIdentifier
instead of functionIdent
. (And use functionIdent
when it references a FunctionIdentifier
object).
Test build #115368 has finished for PR 26890 at commit
|
case Seq(db, fn) => FunctionIdentifier(fn, Some(db)) | ||
case Seq(fn) => FunctionIdentifier(fn, None) | ||
case _ => | ||
throw new AnalysisException(s"Unsupported function name '${functionIdent.quoted}'") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is not caused by this PR. This error message looks too general, and I do not see much info from it.
Maybe we can make it clearer? Like v1 function name should not have multiple namespaces?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In the show columns statement we decide this message
Namespace name should have only one part if specified
How about the same message or similar? Both are clearer
Test build #115409 has finished for PR 26890 at commit
|
[SPARK-30039][SQL] Pull requests comments [SPARK-30039][SQL] Pull request comments [SPARK-30039][SQL] Pull request comments
707b037
to
3796a3a
Compare
Test build #116184 has finished for PR 26890 at commit
|
Test build #116185 has finished for PR 26890 at commit
|
Hi @cloud-fan @viirya @imback82 @dongjoon-hyun! Any more comments on this? |
thanks, merging to master! |
What changes were proposed in this pull request?
Add CreateFunctionStatement and make CREATE FUNCTION go through the same catalog/table resolution framework of v2 commands.
Why are the changes needed?
It's important to make all the commands have the same table resolution behavior, to avoid confusing
CREATE FUNCTION namespace.function
Does this PR introduce any user-facing change?
Yes. When running CREATE FUNCTION namespace.function Spark fails the command if the current catalog is set to a v2 catalog.
How was this patch tested?
Unit tests.