[SPARK-6747] [SQL] Throw an AnalysisException when unsupported Java list types used in Hive UDF #7248

Closed

maropu (Member) commented Jul 7, 2015

The current implementation can't handle List<> as a return type in a Hive UDF and
throws an uninformative scala.MatchError.
Consider the UDF below:
public class UDFToListString extends UDF {
  public List<String> evaluate(Object o) {
    return Arrays.asList("xxx", "yyy", "zzz");
  }
}
When this UDF is used, a scala.MatchError is thrown as follows:
scala.MatchError: interface java.util.List (of class java.lang.Class)
    at org.apache.spark.sql.hive.HiveInspectors$class.javaClassToDataType(HiveInspectors.scala:174)
    at org.apache.spark.sql.hive.HiveSimpleUdf.javaClassToDataType(hiveUdfs.scala:76)
    at org.apache.spark.sql.hive.HiveSimpleUdf.dataType$lzycompute(hiveUdfs.scala:106)
    at org.apache.spark.sql.hive.HiveSimpleUdf.dataType(hiveUdfs.scala:106)
    at org.apache.spark.sql.catalyst.expressions.Alias.toAttribute(namedExpressions.scala:131)
    at org.apache.spark.sql.catalyst.planning.PhysicalOperation$$anonfun$collectAliases$1.applyOrElse(patterns.scala:95)
    at org.apache.spark.sql.catalyst.planning.PhysicalOperation$$anonfun$collectAliases$1.applyOrElse(patterns.scala:94)
    at scala.runtime.AbstractPartialFunction.apply(AbstractPartialFunction.scala:33)
    at scala.collection.TraversableLike$$anonfun$collect$1.apply(TraversableLike.scala:278)
    ...
To make the problem clearer to UDF developers, we need to throw a more suitable exception.
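
As a rough illustration of the idea (not the code in this PR), the sketch below maps a UDF's declared return class to a type name and adds an explicit case for java.util.List that fails with a descriptive message instead of falling through. The class name UdfReturnTypeSketch, the exception type, the type names, and the message text are all hypothetical.

```java
import java.util.List;

// Hypothetical stand-in for a class-to-type mapping; not Spark's actual code.
final class UdfReturnTypeSketch {

    /** Hypothetical exception type used only in this sketch. */
    static final class UnsupportedUdfTypeException extends RuntimeException {
        UnsupportedUdfTypeException(String message) {
            super(message);
        }
    }

    /** Maps a UDF's declared Java return class to a SQL-like type name. */
    static String toTypeName(Class<?> clazz) {
        if (clazz == String.class) {
            return "string";
        }
        if (clazz == Integer.class || clazz == int.class) {
            return "int";
        }
        if (clazz == Double.class || clazz == double.class) {
            return "double";
        }
        if (clazz == List.class) {
            // Explicit case: explain the problem instead of falling through.
            throw new UnsupportedUdfTypeException(
                "java.util.List is not supported as a UDF return type: "
                    + "the element type cannot be determined from the raw class");
        }
        throw new UnsupportedUdfTypeException(
            "Unsupported UDF return type: " + clazz.getName());
    }

    public static void main(String[] args) {
        System.out.println(toTypeName(String.class));
        try {
            toTypeName(List.class);
        } catch (UnsupportedUdfTypeException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

The point of the explicit case is that the failure names the offending type and the reason, so a UDF author does not have to read a stack trace from the analyzer to find out what went wrong.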

maropu (Member Author) commented Jul 7, 2015

@marmbrus Following the discussion in #5395, I think it is hard to support Java List<> types in Spark SQL because of type erasure. ISTM that UDF developers who need this type would be better off using the GenericUDF interface instead of the UDF one. So I re-created a PR that throws a meaningful exception when such types are used.

Any thoughts?
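
To make the type-erasure point concrete, here is a small self-contained sketch. ListReturningUdf is a hypothetical stand-in that mirrors the UDF from the description but does not extend Hive's UDF class, so the example needs no external dependency. It shows that the raw return class seen through reflection is just java.util.List, and that list objects with different element types share one runtime class.

```java
import java.lang.reflect.Method;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class ErasureSketch {

    // Hypothetical stand-in for a simple UDF; deliberately has no Hive dependency.
    static final class ListReturningUdf {
        public List<String> evaluate(Object o) {
            return Arrays.asList("xxx", "yyy", "zzz");
        }
    }

    /** Returns the raw return class of the stand-in UDF's evaluate method. */
    static Class<?> rawReturnType() {
        try {
            Method m = ListReturningUdf.class.getMethod("evaluate", Object.class);
            return m.getReturnType();
        } catch (NoSuchMethodException e) {
            throw new IllegalStateException(e);
        }
    }

    /** True if lists with different element types share one runtime class. */
    static boolean sameRuntimeClass() {
        List<String> strings = new ArrayList<>();
        List<Integer> ints = new ArrayList<>();
        return strings.getClass() == ints.getClass();
    }

    public static void main(String[] args) {
        System.out.println(rawReturnType());    // prints "interface java.util.List"
        System.out.println(sameRuntimeClass()); // prints "true"
    }
}
```

The first printed line is the same text that appears in the MatchError message above: a mapping keyed on the raw class receives only java.util.List, with no element type attached.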

testData.registerTempTable("inputTable")

sql(s"CREATE TEMPORARY FUNCTION testUDFToListInt AS '${classOf[UDFToListInt].getName}'")
intercept[AnalysisException] {
Contributor

Assign the result of this function to a variable and check that the message is correct.
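
The pattern being requested can be sketched in plain Java as follows. The intercept helper here is a hypothetical, hand-rolled analogue of the test helper used in the Scala snippet above, and the exception type and message are placeholders rather than the ones in this PR.

```java
final class InterceptSketch {

    /** Runs body and returns the thrown exception if it has the expected type; fails otherwise. */
    static <T extends Throwable> T intercept(Class<T> expected, Runnable body) {
        try {
            body.run();
        } catch (Throwable t) {
            if (expected.isInstance(t)) {
                return expected.cast(t);
            }
            throw new AssertionError("Unexpected exception type: " + t.getClass().getName(), t);
        }
        throw new AssertionError("Expected " + expected.getName() + " but nothing was thrown");
    }

    public static void main(String[] args) {
        // Keep the intercepted exception in a variable...
        IllegalArgumentException e = intercept(IllegalArgumentException.class, () -> {
            throw new IllegalArgumentException("List type is unsupported");
        });
        // ...so the message can be checked, not just the exception type.
        if (!e.getMessage().contains("List type")) {
            throw new AssertionError("Unexpected message: " + e.getMessage());
        }
        System.out.println("message verified: " + e.getMessage());
    }
}
```

Checking the message guards against the test passing because some unrelated exception of the same type was thrown.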

Member Author

Fixed. Does this address your comment?

marmbrus (Contributor) commented Jul 7, 2015

ok to test

marmbrus (Contributor) commented Jul 7, 2015

This looks great! One minor comment on the tests.

maropu (Member Author) commented Jul 7, 2015

@marmbrus OK, thanks.
After this patch is merged, I'll make a similar patch for Map<>, since it has the same issue.

SparkQA commented Jul 7, 2015

Test build #36628 has finished for PR 7248 at commit 56305de.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

marmbrus (Contributor) commented Jul 7, 2015

Thanks! Merging to master.

@asfgit asfgit closed this in 1821fc1 Jul 7, 2015

SparkQA commented Jul 7, 2015

Test build #36629 has finished for PR 7248 at commit 1c3df2a.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

asfgit pushed a commit that referenced this pull request Jul 8, 2015
…ap<K,V> types used in Hive UDF

To make the problem clear to UDF developers, throw an exception when unsupported Map<K,V> types are used in a Hive UDF. This fix is the same as #7248.

Author: Takeshi YAMAMURO <linguin.m.s@gmail.com>

Closes #7257 from maropu/ThrowExceptionWhenMapUsed and squashes the following commits:

916099a [Takeshi YAMAMURO] Fix style errors
7886dcc [Takeshi YAMAMURO] Throw an exception when Map<> used in Hive UDF
@maropu maropu deleted the FixBugInHiveInspectors branch July 5, 2017 11:41