New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[SPARK-39579][SQL][PYTHON][R] Make ListFunctions/getFunction/functionExists compatible with 3 layer namespace #36977
[SPARK-39579][SQL][PYTHON][R] Make ListFunctions/getFunction/functionExists compatible with 3 layer namespace #36977
Conversation
aef63ca
to
80ebb13
Compare
f7acee4
to
67e812c
Compare
67e812c
to
fe583c7
Compare
fe583c7
to
eb98e66
Compare
775970c
to
6d2e578
Compare
@@ -2139,16 +2139,16 @@ class Analyzer(override val catalogManager: CatalogManager) | |||
} | |||
|
|||
def lookupBuiltinOrTempFunction(name: Seq[String]): Option[ExpressionInfo] = { | |||
if (name.length == 1) { | |||
v1SessionCatalog.lookupBuiltinOrTempFunction(name.head) | |||
if (name.length == 1 || name.length == 3) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
to support look up temp function with a 3-layer ident.
b21756d
to
93c8793
Compare
all tests passed, just rebase to fix conflicts |
cc @cloud-fan @amaliujia could you please take a look when you find some time? |
@zhengruifeng sorry it has conflicts now... |
@cloud-fan dont worry, let me update the pr |
93c8793
to
a3a50b9
Compare
@@ -2080,7 +2080,7 @@ class Analyzer(override val catalogManager: CatalogManager) | |||
throw QueryCompilationErrors.expectPersistentFuncError( | |||
nameParts.head, cmd, mismatchHint, u) | |||
} else { | |||
ResolvedNonPersistentFunc(nameParts.head, V1Function(info)) | |||
ResolvedNonPersistentFunc(nameParts.last, V1Function(info)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is it a bug?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it should be a bug. should always be the name
:
/**
* A plan containing resolved non-persistent (temp or built-in) function.
*/
case class ResolvedNonPersistentFunc(
name: String,
func: UnboundFunction)
extends LeafNodeWithoutStats {
override def output: Seq[Attribute] = Nil
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can we have a separate PR to fix this bug?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sure!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sorry, my change was confusing, it is not a bug:
lookupBuiltinOrTempFunction(nameParts)
.orElse(lookupBuiltinOrTempTableFunction(nameParts))
could be non-empty only when nameparts.length == 1
, so nameParts.head
is fine.
@@ -2139,16 +2139,16 @@ class Analyzer(override val catalogManager: CatalogManager) | |||
} | |||
|
|||
def lookupBuiltinOrTempFunction(name: Seq[String]): Option[ExpressionInfo] = { | |||
if (name.length == 1) { | |||
v1SessionCatalog.lookupBuiltinOrTempFunction(name.head) | |||
if (name.length == 1 || name.length == 3) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why is this change?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
to look up temp function with name
= spark_default.default.add
to look up temp function with name
= spark_default.default.+
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
let me try to handle system functions and user functions seperately, then the changes in Analyzer
could be reverted.
// a qualified namespace with catalog name. We assume it's a single database name | ||
// and check if we can find the dbName in sessionCatalog. If so we listFunctions under | ||
// that database. Otherwise we try 3-part name parsing and locate the database. | ||
if (sessionCatalog.databaseExists(dbName) || sessionCatalog.isGlobalTempViewDB(dbName)) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we don't need to check global temp view db, as we are listing functions.
} | ||
|
||
private def makeFunction(funcIdent: FunctionIdentifier): Function = { | ||
val metadata = sessionCatalog.lookupFunctionInfo(funcIdent) | ||
val metadata = try { | ||
Some(sessionCatalog.getFunctionMetadata(funcIdent)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why do we make this change?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
let me revert this
1ca922c
to
1a491a3
Compare
val ident = sparkSession.sessionState.sqlParser.parseMultipartIdentifier(functionName) | ||
val catalog = | ||
sparkSession.sessionState.catalogManager.catalog(ident(0)).asFunctionCatalog | ||
catalog.functionExists(Identifier.of(Array(ident(1)), ident(2))) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This assumes the first name part is always the catalog, which is not the case. We may use the current catalog. I think we should rely on the analyzer and use UnresolvedFunc
.
ec0784b
to
4ea562c
Compare
@cloud-fan could you please take another look? thanks! also cc @HyukjinKwon if you are intrested in this. |
gentle ping @cloud-fan |
@@ -288,19 +301,75 @@ def functionExists(self, functionName: str, dbName: Optional[str] = None) -> boo | |||
name of the database to check function existence in. | |||
If no database is specified, the current database is used | |||
|
|||
.. deprecated:: 3.4.0 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@amaliujia we need to do the same deprecation in scala
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ack will follow up on the scala side.
thanks, merging to master! |
@cloud-fan Thank you for reivew |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM2
"a future version. Use functionExists(`dbName.tableName`) instead.", | ||
FutureWarning, | ||
) | ||
return self._jcatalog.functionExists(self.currentDatabase(), functionName) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
oops, here should be self._jcatalog.functionExists(dbName, functionName)
let me fix it in a follow up
…me) when dbName is not None ### What changes were proposed in this pull request? fix functionExists(functionName, dbName) ### Why are the changes needed? #36977 introduce a bug in `functionExists(functionName, dbName)`, when dbName is not None, should call `self._jcatalog.functionExists(dbName, functionName)` ### Does this PR introduce _any_ user-facing change? No ### How was this patch tested? existing testsuite Closes #37088 from zhengruifeng/py_3l_fix_functionExists. Authored-by: Ruifeng Zheng <ruifengz@apache.org> Signed-off-by: Ruifeng Zheng <ruifengz@apache.org>
What changes were proposed in this pull request?
Make ListFunctions/getFunction/functionExists compatible with 3 layer namespace
Why are the changes needed?
to support 3 layer namespace
Does this PR introduce any user-facing change?
yes
How was this patch tested?
added UT