Unable to use with AWS Glue 4.0 #630

Closed
imjaleel opened this issue Mar 24, 2023 · 2 comments
Labels
duplicate This issue or pull request already exists

Comments

@imjaleel

imjaleel commented Mar 24, 2023

I'm running a PySpark job that applies some transformations and writes the output to Delta tables in S3.
AWS Glue 4.0 uses Spark 3.3 and Scala 2.12.
I added spark-3.3-spline-agent-bundle_2.12-1.0.6.jar to the job's Dependent JARs
and passed the following parameter:
--packages za.co.absa.spline.agent.spark:spark-3.3-spline-agent-bundle_2.12:1.0.6

I also added the following settings to the Spark session:
.config("spark.sql.queryExecutionListeners", "za.co.absa.spline.harvester.listener.SplineQueryExecutionListener")
.config("spark.spline.lineageDispatcher", "console")
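
For readers who want to see the two settings in context, here is a minimal sketch of how they might be applied to a session builder. It assumes a plain `SparkSession.builder` rather than whatever session setup the Glue job actually uses. The names `SPLINE_CONF`, `apply_spline_conf`, `build_session`, and the app name are hypothetical and only illustrate the idea.

```python
# Spline settings exactly as quoted in this issue.
SPLINE_CONF = {
    "spark.sql.queryExecutionListeners":
        "za.co.absa.spline.harvester.listener.SplineQueryExecutionListener",
    "spark.spline.lineageDispatcher": "console",
}


def apply_spline_conf(builder):
    """Chain .config(key, value) on the builder for every Spline setting."""
    for key, value in SPLINE_CONF.items():
        builder = builder.config(key, value)
    return builder


def build_session(app_name="spline-lineage-job"):
    """Create a session with the Spline listener registered (hypothetical helper)."""
    # pyspark is imported lazily so apply_spline_conf can be used without it.
    from pyspark.sql import SparkSession

    return apply_spline_conf(SparkSession.builder.appName(app_name)).getOrCreate()
```

Keeping the settings in one dictionary makes it easy to log or diff them when comparing a working environment with a failing one.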

The job starts, but it fails with the error below while the output is being written to S3.

ERROR [spark-listener-group-shared] util.Utils (Logging.scala:logError(98)): uncaught error in thread spark-listener-group-shared, stopping SparkContext
java.lang.ExceptionInInitializerError: null
	at za.co.absa.spline.harvester.HashBasedUUIDGenerator.nextId(idGenerators.scala:57) ~[spark-3.3-spline-agent-bundle_2.12-1.0.6.jar:?]
	at za.co.absa.spline.harvester.DataTypeIdGenerator.nextId(idGenerators.scala:83) ~[spark-3.3-spline-agent-bundle_2.12-1.0.6.jar:?]
	at za.co.absa.spline.harvester.converter.DataTypeConverter.convert(DataTypeConverter.scala:39) ~[spark-3.3-spline-agent-bundle_2.12-1.0.6.jar:?]
	at za.co.absa.spline.agent.SplineAgent$$anon$1$$anon$2.za$co$absa$commons$lang$CachingConverter$$super$convert(SplineAgent.scala:76) ~[spark-3.3-spline-agent-bundle_2.12-1.0.6.jar:?]
	at za.co.absa.commons.lang.CachingConverter.$anonfun$convert$1(converters.scala:47) ~[spark-3.3-spline-agent-bundle_2.12-1.0.6.jar:?]
	at scala.collection.mutable.MapLike.getOrElseUpdate(MapLike.scala:206) ~[scala-library.jar:?]
	at scala.collection.mutable.MapLike.getOrElseUpdate$(MapLike.scala:203) ~[scala-library.jar:?]
	at scala.collection.mutable.AbstractMap.getOrElseUpdate(Map.scala:80) ~[scala-library.jar:?]
	at za.co.absa.commons.lang.CachingConverter.convert(converters.scala:47) ~[spark-3.3-spline-agent-bundle_2.12-1.0.6.jar:?]
	at za.co.absa.commons.lang.CachingConverter.convert$(converters.scala:44) ~[spark-3.3-spline-agent-bundle_2.12-1.0.6.jar:?]
	at za.co.absa.spline.agent.SplineAgent$$anon$1$$anon$2.convert(SplineAgent.scala:76) ~[spark-3.3-spline-agent-bundle_2.12-1.0.6.jar:?]
	at za.co.absa.spline.agent.SplineAgent$$anon$1$$anon$2.convert(SplineAgent.scala:76) ~[spark-3.3-spline-agent-bundle_2.12-1.0.6.jar:?]
	at za.co.absa.spline.harvester.converter.DataTypeConverter.convert(DataTypeConverter.scala:43) ~[spark-3.3-spline-agent-bundle_2.12-1.0.6.jar:?]
	at za.co.absa.spline.harvester.converter.AttributeConverter.convert(AttributeConverter.scala:42) ~[spark-3.3-spline-agent-bundle_2.12-1.0.6.jar:?]
	at za.co.absa.spline.harvester.builder.plan.PlanOperationNodeBuilder$$anon$1.za$co$absa$commons$lang$CachingConverter$$super$convert(PlanOperationNodeBuilder.scala:37) ~[spark-3.3-spline-agent-bundle_2.12-1.0.6.jar:?]
	at za.co.absa.commons.lang.CachingConverter.$anonfun$convert$1(converters.scala:47) ~[spark-3.3-spline-agent-bundle_2.12-1.0.6.jar:?]
	at scala.collection.mutable.MapLike.getOrElseUpdate(MapLike.scala:206) ~[scala-library.jar:?]
	at scala.collection.mutable.MapLike.getOrElseUpdate$(MapLike.scala:203) ~[scala-library.jar:?]
	at scala.collection.mutable.AbstractMap.getOrElseUpdate(Map.scala:80) ~[scala-library.jar:?]
	at za.co.absa.commons.lang.CachingConverter.convert(converters.scala:47) ~[spark-3.3-spline-agent-bundle_2.12-1.0.6.jar:?]
	at za.co.absa.commons.lang.CachingConverter.convert$(converters.scala:44) ~[spark-3.3-spline-agent-bundle_2.12-1.0.6.jar:?]
	at za.co.absa.spline.harvester.builder.plan.PlanOperationNodeBuilder$$anon$1.convert(PlanOperationNodeBuilder.scala:37) ~[spark-3.3-spline-agent-bundle_2.12-1.0.6.jar:?]
	at za.co.absa.spline.harvester.builder.plan.PlanOperationNodeBuilder.$anonfun$outputAttributes$1(PlanOperationNodeBuilder.scala:67) ~[spark-3.3-spline-agent-bundle_2.12-1.0.6.jar:?]
	at scala.collection.TraversableLike.$anonfun$map$1(TraversableLike.scala:233) ~[scala-library.jar:?]
	at scala.collection.immutable.List.foreach(List.scala:388) ~[scala-library.jar:?]
	at scala.collection.TraversableLike.map(TraversableLike.scala:233) ~[scala-library.jar:?]
	at scala.collection.TraversableLike.map$(TraversableLike.scala:226) ~[scala-library.jar:?]
	at scala.collection.immutable.List.map(List.scala:294) ~[scala-library.jar:?]
	at za.co.absa.spline.harvester.builder.plan.PlanOperationNodeBuilder.outputAttributes(PlanOperationNodeBuilder.scala:67) ~[spark-3.3-spline-agent-bundle_2.12-1.0.6.jar:?]
	at za.co.absa.spline.harvester.builder.plan.PlanOperationNodeBuilder.outputAttributes$(PlanOperationNodeBuilder.scala:66) ~[spark-3.3-spline-agent-bundle_2.12-1.0.6.jar:?]
	at za.co.absa.spline.harvester.builder.plan.read.ReadNodeBuilder.outputAttributes$lzycompute(ReadNodeBuilder.scala:28) ~[spark-3.3-spline-agent-bundle_2.12-1.0.6.jar:?]
	at za.co.absa.spline.harvester.builder.plan.read.ReadNodeBuilder.outputAttributes(ReadNodeBuilder.scala:28) ~[spark-3.3-spline-agent-bundle_2.12-1.0.6.jar:?]
	at za.co.absa.spline.harvester.builder.plan.read.ReadNodeBuilder.build(ReadNodeBuilder.scala:42) ~[spark-3.3-spline-agent-bundle_2.12-1.0.6.jar:?]
	at za.co.absa.spline.harvester.builder.plan.read.ReadNodeBuilder.build(ReadNodeBuilder.scala:28) ~[spark-3.3-spline-agent-bundle_2.12-1.0.6.jar:?]
	at za.co.absa.spline.harvester.LineageHarvester.$anonfun$harvest$6(LineageHarvester.scala:68) ~[spark-3.3-spline-agent-bundle_2.12-1.0.6.jar:?]
	at scala.collection.TraversableLike.$anonfun$map$1(TraversableLike.scala:233) ~[scala-library.jar:?]
	at scala.collection.immutable.List.foreach(List.scala:388) ~[scala-library.jar:?]
	at scala.collection.TraversableLike.map(TraversableLike.scala:233) ~[scala-library.jar:?]
	at scala.collection.TraversableLike.map$(TraversableLike.scala:226) ~[scala-library.jar:?]
	at scala.collection.immutable.List.map(List.scala:294) ~[scala-library.jar:?]
	at za.co.absa.spline.harvester.LineageHarvester.$anonfun$harvest$4(LineageHarvester.scala:68) ~[spark-3.3-spline-agent-bundle_2.12-1.0.6.jar:?]
	at scala.Option.flatMap(Option.scala:171) ~[scala-library.jar:?]
	at za.co.absa.spline.harvester.LineageHarvester.harvest(LineageHarvester.scala:61) ~[spark-3.3-spline-agent-bundle_2.12-1.0.6.jar:?]
	at za.co.absa.spline.agent.SplineAgent$$anon$1.$anonfun$handle$1(SplineAgent.scala:91) ~[spark-3.3-spline-agent-bundle_2.12-1.0.6.jar:?]
	at za.co.absa.spline.agent.SplineAgent$$anon$1.withErrorHandling(SplineAgent.scala:100) ~[spark-3.3-spline-agent-bundle_2.12-1.0.6.jar:?]
	at za.co.absa.spline.agent.SplineAgent$$anon$1.handle(SplineAgent.scala:72) ~[spark-3.3-spline-agent-bundle_2.12-1.0.6.jar:?]
	at za.co.absa.spline.harvester.listener.QueryExecutionListenerDelegate.onSuccess(QueryExecutionListenerDelegate.scala:28) ~[spark-3.3-spline-agent-bundle_2.12-1.0.6.jar:?]
	at za.co.absa.spline.harvester.listener.SplineQueryExecutionListener.$anonfun$onSuccess$1(SplineQueryExecutionListener.scala:41) ~[spark-3.3-spline-agent-bundle_2.12-1.0.6.jar:?]
	at za.co.absa.spline.harvester.listener.SplineQueryExecutionListener.$anonfun$onSuccess$1$adapted(SplineQueryExecutionListener.scala:41) ~[spark-3.3-spline-agent-bundle_2.12-1.0.6.jar:?]
	at scala.Option.foreach(Option.scala:257) ~[scala-library.jar:?]
	at za.co.absa.spline.harvester.listener.SplineQueryExecutionListener.onSuccess(SplineQueryExecutionListener.scala:41) ~[spark-3.3-spline-agent-bundle_2.12-1.0.6.jar:?]
	at org.apache.spark.sql.util.ExecutionListenerBus.doPostEvent(QueryExecutionListener.scala:165) ~[spark-sql_2.12-3.3.0-amzn-1.jar:3.3.0-amzn-1]
	at org.apache.spark.sql.util.ExecutionListenerBus.doPostEvent(QueryExecutionListener.scala:135) ~[spark-sql_2.12-3.3.0-amzn-1.jar:3.3.0-amzn-1]
	at org.apache.spark.util.ListenerBus.postToAll(ListenerBus.scala:117) ~[spark-core_2.12-3.3.0-amzn-1.jar:3.3.0-amzn-1]
	at org.apache.spark.util.ListenerBus.postToAll$(ListenerBus.scala:101) ~[spark-core_2.12-3.3.0-amzn-1.jar:3.3.0-amzn-1]
	at org.apache.spark.sql.util.ExecutionListenerBus.postToAll(QueryExecutionListener.scala:135) ~[spark-sql_2.12-3.3.0-amzn-1.jar:3.3.0-amzn-1]
	at org.apache.spark.sql.util.ExecutionListenerBus.onOtherEvent(QueryExecutionListener.scala:147) ~[spark-sql_2.12-3.3.0-amzn-1.jar:3.3.0-amzn-1]
	at org.apache.spark.scheduler.SparkListenerBus.doPostEvent(SparkListenerBus.scala:100) ~[spark-core_2.12-3.3.0-amzn-1.jar:3.3.0-amzn-1]
	at org.apache.spark.scheduler.SparkListenerBus.doPostEvent$(SparkListenerBus.scala:28) ~[spark-core_2.12-3.3.0-amzn-1.jar:3.3.0-amzn-1]
	at org.apache.spark.scheduler.AsyncEventQueue.doPostEvent(AsyncEventQueue.scala:37) ~[spark-core_2.12-3.3.0-amzn-1.jar:3.3.0-amzn-1]
	at org.apache.spark.scheduler.AsyncEventQueue.doPostEvent(AsyncEventQueue.scala:37) ~[spark-core_2.12-3.3.0-amzn-1.jar:3.3.0-amzn-1]
	at org.apache.spark.util.ListenerBus.postToAll(ListenerBus.scala:117) ~[spark-core_2.12-3.3.0-amzn-1.jar:3.3.0-amzn-1]
	at org.apache.spark.util.ListenerBus.postToAll$(ListenerBus.scala:101) ~[spark-core_2.12-3.3.0-amzn-1.jar:3.3.0-amzn-1]
	at org.apache.spark.scheduler.AsyncEventQueue.super$postToAll(AsyncEventQueue.scala:105) ~[spark-core_2.12-3.3.0-amzn-1.jar:3.3.0-amzn-1]
	at org.apache.spark.scheduler.AsyncEventQueue.$anonfun$dispatch$1(AsyncEventQueue.scala:105) ~[spark-core_2.12-3.3.0-amzn-1.jar:3.3.0-amzn-1]
	at scala.runtime.java8.JFunction0$mcJ$sp.apply(JFunction0$mcJ$sp.java:12) ~[scala-library.jar:?]
	at scala.util.DynamicVariable.withValue(DynamicVariable.scala:58) ~[scala-library.jar:?]
	at org.apache.spark.scheduler.AsyncEventQueue.org$apache$spark$scheduler$AsyncEventQueue$$dispatch(AsyncEventQueue.scala:100) ~[spark-core_2.12-3.3.0-amzn-1.jar:3.3.0-amzn-1]
	at org.apache.spark.scheduler.AsyncEventQueue$$anon$2.$anonfun$run$1(AsyncEventQueue.scala:96) ~[spark-core_2.12-3.3.0-amzn-1.jar:3.3.0-amzn-1]
	at org.apache.spark.util.Utils$.tryOrStopSparkContext(Utils.scala:1447) ~[spark-core_2.12-3.3.0-amzn-1.jar:3.3.0-amzn-1]
	at org.apache.spark.scheduler.AsyncEventQueue$$anon$2.run(AsyncEventQueue.scala:96) ~[spark-core_2.12-3.3.0-amzn-1.jar:3.3.0-amzn-1]
Caused by: scala.tools.reflect.ToolBoxError: reflective compilation has failed: cannot initialize the compiler due to java.lang.NoSuchMethodError: scala.tools.reflect.package$$anon$4.INFO()Lscala/reflect/internal/Reporter$Severity;
	at scala.tools.reflect.ToolBoxFactory$ToolBoxImpl$withCompilerApi$api$.liftedTree1$1(ToolBoxFactory.scala:360) ~[scala-compiler-2.12.15.jar:?]
	at scala.tools.reflect.ToolBoxFactory$ToolBoxImpl$withCompilerApi$api$.compiler$lzycompute(ToolBoxFactory.scala:346) ~[scala-compiler-2.12.15.jar:?]
	at scala.tools.reflect.ToolBoxFactory$ToolBoxImpl$withCompilerApi$api$.compiler(ToolBoxFactory.scala:345) ~[scala-compiler-2.12.15.jar:?]
	at scala.tools.reflect.ToolBoxFactory$ToolBoxImpl$withCompilerApi$.apply(ToolBoxFactory.scala:372) ~[scala-compiler-2.12.15.jar:?]
	at scala.tools.reflect.ToolBoxFactory$ToolBoxImpl.parse(ToolBoxFactory.scala:429) ~[scala-compiler-2.12.15.jar:?]
	at za.co.absa.spline.harvester.json.HarvesterJsonSerDe$.<init>(HarvesterJsonSerDe.scala:49) ~[spark-3.3-spline-agent-bundle_2.12-1.0.6.jar:?]
	at za.co.absa.spline.harvester.json.HarvesterJsonSerDe$.<clinit>(HarvesterJsonSerDe.scala) ~[spark-3.3-spline-agent-bundle_2.12-1.0.6.jar:?]
	... 71 more
Caused by: java.lang.NoSuchMethodError: scala.tools.reflect.package$$anon$4.INFO()Lscala/reflect/internal/Reporter$Severity;
	at scala.tools.reflect.package$$anon$4.<init>(package.scala:85) ~[scala-compiler-2.12.15.jar:?]
	at scala.tools.reflect.package$.frontEndToReporter(package.scala:77) ~[scala-compiler-2.12.15.jar:?]
	at scala.tools.reflect.ToolBoxFactory$ToolBoxImpl$withCompilerApi$api$.liftedTree1$1(ToolBoxFactory.scala:350) ~[scala-compiler-2.12.15.jar:?]
	at scala.tools.reflect.ToolBoxFactory$ToolBoxImpl$withCompilerApi$api$.compiler$lzycompute(ToolBoxFactory.scala:346) ~[scala-compiler-2.12.15.jar:?]
	at scala.tools.reflect.ToolBoxFactory$ToolBoxImpl$withCompilerApi$api$.compiler(ToolBoxFactory.scala:345) ~[scala-compiler-2.12.15.jar:?]
	at scala.tools.reflect.ToolBoxFactory$ToolBoxImpl$withCompilerApi$.apply(ToolBoxFactory.scala:372) ~[scala-compiler-2.12.15.jar:?]
	at scala.tools.reflect.ToolBoxFactory$ToolBoxImpl.parse(ToolBoxFactory.scala:429) ~[scala-compiler-2.12.15.jar:?]
	at za.co.absa.spline.harvester.json.HarvesterJsonSerDe$.<init>(HarvesterJsonSerDe.scala:49) ~[spark-3.3-spline-agent-bundle_2.12-1.0.6.jar:?]
	at za.co.absa.spline.harvester.json.HarvesterJsonSerDe$.<clinit>(HarvesterJsonSerDe.scala) ~[spark-3.3-spline-agent-bundle_2.12-1.0.6.jar:?]
	... 71 more
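
A trace this long is easier to read once it is reduced to its exception chain and the JARs each frame was loaded from. The sketch below is one way to do that. It assumes the log uses the same `~[jar-name:version]` suffix seen above, and the function name `summarize_trace` is hypothetical. In the trace above it would surface `scala-compiler-2.12.15.jar` next to an unversioned `scala-library.jar`, which may hint at mixed Scala artifacts on the classpath, though that is an inference from the trace and is not confirmed in this thread.

```python
import re


def summarize_trace(text):
    """Return (caused_by_lines, sorted_jar_names) for a Java stack trace."""
    # Each "Caused by:" line names one link in the exception chain.
    causes = [
        line.strip()
        for line in text.splitlines()
        if line.startswith("Caused by:")
    ]
    # Frames end with "~[<jar>:<version>]"; capture the jar part only.
    jars = sorted(set(re.findall(r"~\[([^:\]]+):", text)))
    return causes, jars


if __name__ == "__main__":
    import sys

    causes, jars = summarize_trace(sys.stdin.read())
    for cause in causes:
        print(cause)
    for jar in jars:
        print(jar)
```

Run as a script, it reads the log from standard input and prints the cause lines followed by the distinct JAR names.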
@cerveada
Contributor

This is most probably a duplicate of #602.

@imjaleel
Author

> This is most probably duplicate of #602

Yes @cerveada, it's the same issue. I looked for similar issues but somehow missed that one. Thank you for pointing it out.

However, I'm not able to access the JAR file referred to in that issue. Could you please share it so that I can build my solution around it until 1.1.0 is publicly released?

@wajda wajda added the duplicate This issue or pull request already exists label Mar 27, 2023
@wajda wajda closed this as completed Mar 27, 2023