[SPARK-28158][SQL][FOLLOWUP] HiveUserDefinedTypeSuite: don't use RandomDataGenerator to create row for UDT backed by ArrayType #26287

Closed
wants to merge 3 commits into from


HeartSaVioR
Contributor

@HeartSaVioR HeartSaVioR commented Oct 29, 2019

What changes were proposed in this pull request?

Several issues were observed in HiveUserDefinedTypeSuite."Support UDT in Hive UDF":

  1. Neither the function (TestUDF) nor the test takes a nullable "point" column into account.
  2. ExamplePointUDT.sqlType is ArrayType, which carries no information about how many elements are expected. RandomDataGenerator may produce fewer elements than needed.

This patch fixes HiveUserDefinedTypeSuite."Support UDT in Hive UDF" by making the "point" column non-nullable and by no longer using RandomDataGenerator to create rows for a UDT backed by ArrayType.
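
As a rough illustration of the idea behind the fix, the plain-Scala sketch below builds the input values explicitly, so every serialized point has exactly two elements and no input is null. `Point`, `serialize`, `deserialize`, `identityUdf` and `input` are hypothetical stand-ins invented for this sketch; they are not the Spark classes (ExamplePoint, ExamplePointUDT, TestUDF), and no Spark API is used.

```scala
object NonNullablePointSketch {
  // Hypothetical stand-in for ExamplePoint; not the Spark test class.
  final case class Point(x: Double, y: Double)

  // Serialized form assumed in this sketch: always exactly two doubles.
  def serialize(p: Point): Seq[Double] = Seq(p.x, p.y)

  // Rejects arrays of the wrong length instead of failing on an index lookup.
  def deserialize(values: Seq[Double]): Point = {
    require(values.length == 2, s"expected 2 elements, got ${values.length}")
    Point(values(0), values(1))
  }

  // Stand-in for a UDF that does not handle null input.
  def identityUdf(p: Point): Point = {
    require(p != null, "point must not be null")
    p
  }

  // Input built explicitly rather than generated randomly:
  // never null, and always serializable to two elements.
  val input: Seq[Point] = Seq(Point(1.0, 2.0), Point(-3.5, 0.0))

  def main(args: Array[String]): Unit =
    input.foreach { p =>
      val roundTripped = deserialize(serialize(identityUdf(p)))
      println(s"$p -> $roundTripped")
    }
}
```

The trade-off is that the input is no longer random, which is acceptable here only if the test does not rely on random coverage of the point values.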

Why are the changes needed?

CI builds are failing frequently.

Does this PR introduce any user-facing change?

No.

How was this patch tested?

Manually tested by running tests locally multiple times.

@HeartSaVioR
Contributor Author

HeartSaVioR commented Oct 29, 2019

I've also found another kind of intermittent test failure locally after submitting the patch. Looking into it.

java.lang.ArrayIndexOutOfBoundsException: 1
	at org.apache.spark.sql.catalyst.util.GenericArrayData.getAs(GenericArrayData.scala:64)
	at org.apache.spark.sql.catalyst.util.GenericArrayData.getDouble(GenericArrayData.scala:73)
	at org.apache.spark.sql.test.ExamplePointUDT.deserialize(ExamplePointUDT.scala:60)
	at org.apache.spark.sql.test.ExamplePointUDT.deserialize(ExamplePointUDT.scala:44)
	at org.apache.spark.sql.RandomDataGenerator$.$anonfun$forType$26(RandomDataGenerator.scala:265)
	at org.apache.spark.sql.RandomDataGenerator$.$anonfun$forType$24(RandomDataGenerator.scala:249)
	at scala.collection.TraversableLike.$anonfun$map$1(TraversableLike.scala:238)
	at scala.collection.mutable.ArraySeq.foreach(ArraySeq.scala:75)
	at scala.collection.TraversableLike.map(TraversableLike.scala:238)
	at scala.collection.TraversableLike.map$(TraversableLike.scala:231)
	at scala.collection.AbstractTraversable.map(Traversable.scala:108)
	at org.apache.spark.sql.RandomDataGenerator$.$anonfun$forType$23(RandomDataGenerator.scala:249)
	at org.apache.spark.sql.hive.HiveUserDefinedTypeSuite.$anonfun$new$1(HiveUserDefinedTypeSuite.scala:40)
	at org.scalatest.OutcomeOf.outcomeOf(OutcomeOf.scala:85)
	at org.scalatest.OutcomeOf.outcomeOf$(OutcomeOf.scala:83)
	at org.scalatest.OutcomeOf$.outcomeOf(OutcomeOf.scala:104)
	at org.scalatest.Transformer.apply(Transformer.scala:22)
	at org.scalatest.Transformer.apply(Transformer.scala:20)
	at org.scalatest.FunSuiteLike$$anon$1.apply(FunSuiteLike.scala:186)
	at org.apache.spark.SparkFunSuite.withFixture(SparkFunSuite.scala:149)
	at org.scalatest.FunSuiteLike.invokeWithFixture$1(FunSuiteLike.scala:184)
	at org.scalatest.FunSuiteLike.$anonfun$runTest$1(FunSuiteLike.scala:196)
	at org.scalatest.SuperEngine.runTestImpl(Engine.scala:286)
	at org.scalatest.FunSuiteLike.runTest(FunSuiteLike.scala:196)
	at org.scalatest.FunSuiteLike.runTest$(FunSuiteLike.scala:178)
	at org.apache.spark.SparkFunSuite.org$scalatest$BeforeAndAfterEach$$super$runTest(SparkFunSuite.scala:56)
	at org.scalatest.BeforeAndAfterEach.runTest(BeforeAndAfterEach.scala:221)
	at org.scalatest.BeforeAndAfterEach.runTest$(BeforeAndAfterEach.scala:214)
	at org.apache.spark.SparkFunSuite.runTest(SparkFunSuite.scala:56)
	at org.scalatest.FunSuiteLike.$anonfun$runTests$1(FunSuiteLike.scala:229)
	at org.scalatest.SuperEngine.$anonfun$runTestsInBranch$1(Engine.scala:393)
	at scala.collection.immutable.List.foreach(List.scala:392)
	at org.scalatest.SuperEngine.traverseSubNodes$1(Engine.scala:381)
	at org.scalatest.SuperEngine.runTestsInBranch(Engine.scala:376)
	at org.scalatest.SuperEngine.runTestsImpl(Engine.scala:458)
	at org.scalatest.FunSuiteLike.runTests(FunSuiteLike.scala:229)
	at org.scalatest.FunSuiteLike.runTests$(FunSuiteLike.scala:228)
	at org.scalatest.FunSuite.runTests(FunSuite.scala:1560)
	at org.scalatest.Suite.run(Suite.scala:1124)
	at org.scalatest.Suite.run$(Suite.scala:1106)
	at org.scalatest.FunSuite.org$scalatest$FunSuiteLike$$super$run(FunSuite.scala:1560)
	at org.scalatest.FunSuiteLike.$anonfun$run$1(FunSuiteLike.scala:233)
	at org.scalatest.SuperEngine.runImpl(Engine.scala:518)
	at org.scalatest.FunSuiteLike.run(FunSuiteLike.scala:233)
	at org.scalatest.FunSuiteLike.run$(FunSuiteLike.scala:232)
	at org.apache.spark.SparkFunSuite.org$scalatest$BeforeAndAfterAll$$super$run(SparkFunSuite.scala:56)
	at org.scalatest.BeforeAndAfterAll.liftedTree1$1(BeforeAndAfterAll.scala:213)
	at org.scalatest.BeforeAndAfterAll.run(BeforeAndAfterAll.scala:210)
	at org.scalatest.BeforeAndAfterAll.run$(BeforeAndAfterAll.scala:208)
	at org.apache.spark.SparkFunSuite.run(SparkFunSuite.scala:56)
	at org.scalatest.tools.SuiteRunner.run(SuiteRunner.scala:45)
	at org.scalatest.tools.Runner$.$anonfun$doRunRunRunDaDoRunRun$13(Runner.scala:1349)
	at org.scalatest.tools.Runner$.$anonfun$doRunRunRunDaDoRunRun$13$adapted(Runner.scala:1343)
	at scala.collection.immutable.List.foreach(List.scala:392)
	at org.scalatest.tools.Runner$.doRunRunRunDaDoRunRun(Runner.scala:1343)
	at org.scalatest.tools.Runner$.$anonfun$runOptionallyWithPassFailReporter$24(Runner.scala:1033)
	at org.scalatest.tools.Runner$.$anonfun$runOptionallyWithPassFailReporter$24$adapted(Runner.scala:1011)
	at org.scalatest.tools.Runner$.withClassLoaderAndDispatchReporter(Runner.scala:1509)
	at org.scalatest.tools.Runner$.runOptionallyWithPassFailReporter(Runner.scala:1011)
	at org.scalatest.tools.Runner$.run(Runner.scala:850)
	at org.scalatest.tools.Runner.run(Runner.scala)
	at org.jetbrains.plugins.scala.testingSupport.scalaTest.ScalaTestRunner.runScalaTest2(ScalaTestRunner.java:133)
	at org.jetbrains.plugins.scala.testingSupport.scalaTest.ScalaTestRunner.main(ScalaTestRunner.java:27)

@HeartSaVioR
Contributor Author

HeartSaVioR commented Oct 29, 2019

It looks like RandomDataGenerator is not appropriate for creating the input row: the sqlType of ExamplePointUDT is ArrayType, which cannot express how many elements are needed, while deserialization requires at least two. I saw a case where RandomDataGenerator created Nil as the serialized data of ExamplePointUDT.
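
The failure mode can be reproduced outside Spark with a small model. The sketch below is only an approximation under stated assumptions: `Point`, `deserialize`, `randomArray` and `countFailures` are hypothetical names, and `randomArray` merely imitates a generator that picks an arbitrary length for an array type. It is not RandomDataGenerator itself.

```scala
import scala.util.{Random, Try}

object RandomArrayLengthSketch {
  // Hypothetical stand-in for ExamplePoint; not the Spark test class.
  final case class Point(x: Double, y: Double)

  // Reads two doubles by index, so it needs at least two elements.
  def deserialize(values: Seq[Double]): Point = Point(values(0), values(1))

  // Imitates a random generator for an "array of double" type:
  // the type says nothing about length, so any length from 0 to maxLen may appear.
  def randomArray(rng: Random, maxLen: Int = 4): Seq[Double] =
    Seq.fill(rng.nextInt(maxLen + 1))(rng.nextDouble())

  // Counts how many of `trials` random arrays cannot be deserialized.
  def countFailures(seed: Long, trials: Int): Int = {
    val rng = new Random(seed)
    (1 to trials).count(_ => Try(deserialize(randomArray(rng))).isFailure)
  }

  def main(args: Array[String]): Unit =
    println(s"failed deserializations: ${countFailures(42L, 1000)} of 1000")
}
```

Any array with fewer than two elements, including the empty one, fails with an index-out-of-bounds error, which matches the kind of exception in the stack trace above.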

@HeartSaVioR HeartSaVioR changed the title from "[SPARK-28158][SQL][FOLLOWUP] HiveUserDefinedTypeSuite: don't let 'ExamplePoint' in input row be nullable" to "[SPARK-28158][SQL][FOLLOWUP] HiveUserDefinedTypeSuite: don't use RandomDataGenerator to create row for UDT backed by ArrayType" Oct 29, 2019
@HeartSaVioR
Contributor Author

I've run the test 1000 times (by adding a loop) with the latest commit and it passed every time.

Please let me know if the patch removes randomness that was intentional. Thanks!
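
For reference, a loop of the kind mentioned above might look like the following sketch. `firstFailure` is a hypothetical helper written for this illustration, not part of ScalaTest or Spark; it reruns a check and reports the first iteration that throws.

```scala
import scala.util.{Failure, Try}

object RepeatCheckSketch {
  // Runs `body` up to `times` times and returns the 1-based iteration number
  // and the error of the first failing run, or None if every run succeeds.
  // The iterator is lazy, so the loop stops at the first failure.
  def firstFailure(times: Int)(body: Int => Unit): Option[(Int, Throwable)] =
    (1 to times).iterator
      .map(i => i -> Try(body(i)))
      .collectFirst { case (i, Failure(e)) => (i, e) }

  def main(args: Array[String]): Unit =
    firstFailure(1000)(_ => ()) match {
      case Some((i, e)) => println(s"run $i failed: $e")
      case None         => println("all runs passed")
    }
}
```

Reporting the failing iteration makes an intermittent failure easier to distinguish from a deterministic one.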

@SparkQA

SparkQA commented Oct 29, 2019

Test build #112807 has finished for PR 26287 at commit 70459ad.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA

SparkQA commented Oct 29, 2019

Test build #112810 has finished for PR 26287 at commit 8d0929b.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA

SparkQA commented Oct 29, 2019

Test build #112811 has finished for PR 26287 at commit d1ba290.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@cloud-fan cloud-fan closed this in fb80dfe Oct 29, 2019
@cloud-fan
Contributor

thanks, merging to master!

@HeartSaVioR
Contributor Author

Thanks all for reviewing and merging!

@HeartSaVioR HeartSaVioR deleted the SPARK-28158-FOLLOWUP branch October 29, 2019 05:33
@dongjoon-hyun
Member

Late LGTM! Thank you, @HeartSaVioR and all!

7 participants