Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[SPARK-8572] [SQL] Type coercion for ScalaUDFs #7203

Closed
wants to merge 3 commits into from

Conversation

piaozhexiu
Copy link

Implemented type coercion for udf arguments in Scala. The changes include-

  • Add with ExpectsInputTypes to ScalaUDF class.
  • Pass down argument types info from UDFRegistration and functions.

With this patch, the example query in SPARK-8572 no longer throws a type cast error at runtime.

Also added a unit test to UDFSuite in which a decimal type is passed to a udf that expects an int.

@SparkQA
Copy link

SparkQA commented Jul 3, 2015

Test build #36457 has finished for PR 7203 at commit 066deed.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
    • case class ScalaUDF(
    • case class UserDefinedFunction protected[sql] (

@@ -126,7 +126,8 @@ class UDFRegistration private[sql] (sqlContext: SQLContext) extends Logging {
*/
def register[RT: TypeTag](name: String, func: Function0[RT]): UserDefinedFunction = {
val dataType = ScalaReflection.schemaFor[RT].dataType
def builder(e: Seq[Expression]) = ScalaUDF(func, dataType, e)
val inputTypes = Seq[DataType]()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you update the script earlier to include this new line?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it's just a few lines above this

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@rxin Thank you for reviewing! I updated the script.

@@ -87,6 +87,7 @@ class UDFRegistration private[sql] (sqlContext: SQLContext) extends Logging {
(0 to 22).map { x =>
val types = (1 to x).foldRight("RT")((i, s) => {s"A$i, $s"})
val typeTags = (1 to x).map(i => s"A${i}: TypeTag").foldLeft("RT: TypeTag")(_ + ", " + _)
val inputTypes = (1 to x).foldLeft("Nil")((s, i) => {s"$s :+ ScalaReflection.schemaFor[A$i].dataType"})
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the convention is usually

a :: b :: c :: Nil

rather than

Nil :+ a :+ ...

Do you mind updating it?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed.

@rxin
Copy link
Contributor

rxin commented Jul 3, 2015

Looks good. Just two nitpicks.

@rxin
Copy link
Contributor

rxin commented Jul 3, 2015

LGTM.

@SparkQA
Copy link

SparkQA commented Jul 3, 2015

Test build #36479 has finished for PR 7203 at commit dce1efd.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
    • case class ScalaUDF(
    • case class UserDefinedFunction protected[sql] (

@SparkQA
Copy link

SparkQA commented Jul 3, 2015

Test build #36486 has finished for PR 7203 at commit 2d0ed15.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
    • case class ScalaUDF(
    • case class UserDefinedFunction protected[sql] (

@cloud-fan
Copy link
Contributor

lgtm

@rxin
Copy link
Contributor

rxin commented Jul 4, 2015

Thanks. I've merged this.

@asfgit asfgit closed this in 4a22bce Jul 4, 2015
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
4 participants