-
Notifications
You must be signed in to change notification settings - Fork 29.1k
[SPARK-28710][SQL]to fix replace function, spark should call drop and create function #25452
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
sql/hive/src/test/scala/org/apache/spark/sql/hive/execution/SQLQuerySuite.scala
Outdated
Show resolved
Hide resolved
sql/core/src/main/scala/org/apache/spark/sql/execution/command/functions.scala
Outdated
Show resolved
Hide resolved
dongjoon-hyun
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I left a few comments, @sandeep-katta . Thank you for your contribution.
| """.stripMargin) | ||
| val cnt = | ||
| sql("SELECT customAdd(2, 'A', 10, date '2015-01-01', 'B', 20, date '2016-01-01')").count() | ||
| assert(cnt === 2) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: How about checking actual answers instread of the row count?
| } | ||
| } | ||
| } | ||
| test("SPARK-28710: Replace permanent function ") { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This test should be in HiveUDFSuite?
| // Handles `CREATE OR REPLACE FUNCTION AS ... USING ...` | ||
| if (replace && catalog.functionExists(func.identifier)) { | ||
| // alter the function in the metastore | ||
| catalog.alterFunction(func) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Instead of changing CreateFunctionCommand, should we change the behavior at HiveExternalCatalog.alterFunction? If this is for Hive only?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
HiveExternalCatalog is for Hive only, currently alter function of Hive does not update the resource
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@viirya do I need to handle any other cases for this PR ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, I think @viirya's point is valid. Isn't the root cause that HiveExternalCatalog.alterFunction does not handle?
If this is for Hive only?
I think he meant the behaviours at InMemoryCatalog.alterFunction.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I got it, yes this problem is only for HiveExternalCatalog, I will update the code and push
I have fixed the comments |
|
ok to test |
|
Test build #110697 has finished for PR 25452 at commit
|
|
retest this please |
1 similar comment
|
retest this please |
|
Test build #110861 has finished for PR 25452 at commit
|
|
retest this please |
|
Test build #110879 has finished for PR 25452 at commit
|
Remove jar and move testcase to HiveUDFSuite. Also added check for change in the resource
f9ba185 to
05f3b7b
Compare
05f3b7b to
ebf7868
Compare
|
Test build #115391 has finished for PR 25452 at commit
|
| client.listFunctions(db, pattern) | ||
| } | ||
|
|
||
| def isResourceChanged(db: String, newFunction: CatalogFunction): Boolean = { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
private?
| case true => client.dropFunction(db, functionName) | ||
| client.createFunction(db, newDefinition) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this style looks uncommon in Spark. We have it usually:
isResourceChanged(db, newDefinition) match {
case true =>
client.dropFunction(db, functionName)
client.createFunction(db, newDefinition)
...
}| val changedResources = newFunction.resources.filter( | ||
| newResource => !oldFunction.resources.contains(newResource)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will case sensitivity be problem in this comparison? Shall we lower case before comparison?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it should be case sensitive, cz in case of unix based system /opt/some.jar is different from /OPT/SOME.jar. Isn't it ?
| client.createFunction(db, newDefinition) | ||
| // replace the function in the metastore if there is no change in the resource | ||
| case _ => | ||
| client.alterFunction(db, newDefinition) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
From Hive DDL, looks like it does not have "replace" in create function:
CREATE FUNCTION [db_name.]function_name AS class_name
[USING JAR|FILE|ARCHIVE 'file_uri' [, JAR|FILE|ARCHIVE 'file_uri'] ];
Maybe it is why this API does not do resource replacement?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes, as of now Hive only alter name, owner, class name, type but not resource URI. There is no replace DDL in Hive
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
And also initially in the jira, I proposed 2 solutions for this bug
solution 1: throw Unsupported Error for permanent function
solution 2: instead of alter function, do drop and create
|
Test build #115390 has finished for PR 25452 at commit
|
|
Test build #115427 has finished for PR 25452 at commit
|
|
ping @viirya |
|
We're closing this PR because it hasn't been updated in a while. This isn't a judgement on the merit of the PR in any way. It's just a way of keeping the PR queue manageable. |
What changes were proposed in this pull request?
create or replace function X as 'Y' using jar Z; does not work if the X is already present and is created with some other jar lets say xyz.
As per current implementation spark calls alter Function API of Hive, as of now Hive only alter name, owner, class name, type but not resource URI. After calling alter function only X and Y are updated but not Z.
So when the select is performed on X UDF it throws class not found exception.
Observation 1: Temporary function does not have this problem as it is handled by spark logic
Observation 2: For permanent function Spark calls the Hive to alter function , as of now Hive only alter name, owner, class name, type but not resource URI.
[SparkCode]

[Hive Code]

As per Hive Documentation it does not supports create or replace command for function
How was this patch tested?
Added UT and also manually tested
Before Fix:

After Fix:
