Failed to run on Google cloud dataflow when built with Java 10 #25

Closed
hilliao opened this issue May 24, 2018 · 4 comments

hilliao commented May 24, 2018

I can't figure out how to run DBeam on Google Cloud Dataflow. I dug through Scio's docs and tried to run DBeam with the Dataflow runner from the sbt shell. I had to add the following line to build.sbt:
"org.apache.beam" % "beam-runners-google-cloud-dataflow-java" % beamVersion,
under libraryDependencies ++= Seq(, but I still got these errors:

hil@macbook13i72017 ~/c/dbeam> sbt
[info] Loading settings from idea.sbt ...
[info] Loading global plugins from /Users/hil/.sbt/1.0/plugins
[info] Loading settings from plugins.sbt ...
[info] Loading project definition from /Users/hil/cbsi/dbeam/project
[info] Loading settings from version.sbt,build.sbt ...
[info] Set current project to dbeam-foss-parent (in build file:/Users/hil/cbsi/dbeam/)
[info] sbt server started at local:///Users/hil/.sbt/1.0/server/b6db3491d7efae758331/sock
sbt:dbeam-foss-parent> project dbeamCore
[info] Set current project to dbeam-core (in build file:/Users/hil/cbsi/dbeam/)
sbt:dbeam-core> runMain com.spotify.dbeam.JdbcAvroJob --project=i-ingest-poc --zone=us-west1-c --runner=DataflowRunner --connectionUrl=jdbc:mysql://localhost:3306/dbeamtest --table=pet --username=hil --password=password --output=gs://dbeam-test/tmp
[warn] Multiple main classes detected. Run 'show discoveredMainClasses' to see the list
[info] Running (fork) com.spotify.dbeam.JdbcAvroJob --project=i-ingest-poc --zone=us-west1-c --runner=DataflowRunner --connectionUrl=jdbc:mysql://localhost:3306/dbeamtest --table=pet --username=hil --password=password --output=gs://dbeam-test/tmp
[error] Wed May 23 17:21:17 PDT 2018 WARN: Establishing SSL connection without server's identity verification is not recommended. According to MySQL 5.5.45+, 5.6.26+ and 5.7.6+ requirements SSL connection must be established by default if explicit option isn't set. For compliance with existing applications not using SSL the verifyServerCertificate property is set to 'false'. You need either to explicitly disable SSL by setting useSSL=false, or set useSSL=true and provide truststore for server certificate verification.
[error] [main] INFO JdbcAvroConversions - Creating Avro schema based on the first read row from the database
[error] [main] INFO JdbcAvroConversions - Schema created successfully. Generated schema: {"type":"record","name":"pet","namespace":"dbeam_generated","doc":"Generate schema from JDBC ResultSet from 'pet' or the --sqlFile with jdbc:mysql://localhost:3306/dbeamtest","fields":[{"name":"name","type":["null","string"],"doc":"From sqlType 12 VARCHAR","default":null,"typeName":"VARCHAR","sqlCode":"12","columnName":"name"},{"name":"owner","type":["null","string"],"doc":"From sqlType 12 VARCHAR","default":null,"typeName":"VARCHAR","sqlCode":"12","columnName":"owner"},{"name":"species","type":["null","string"],"doc":"From sqlType 12 VARCHAR","default":null,"typeName":"VARCHAR","sqlCode":"12","columnName":"species"},{"name":"sex","type":["null","string"],"doc":"From sqlType 1 CHAR","default":null,"typeName":"CHAR","sqlCode":"1","columnName":"sex"},{"name":"birth","type":["null","long"],"doc":"From sqlType 91 DATE","default":null,"typeName":"DATE","sqlCode":"91","columnName":"birth"},{"name":"death","type":["null","long"],"doc":"From sqlType 91 DATE","default":null,"typeName":"DATE","sqlCode":"91","columnName":"death"}],"connectionUrl":"jdbc:mysql://localhost:3306/dbeamtest","tableName":"pet"}
[error] [main] INFO com.spotify.dbeam.JdbcAvroJob$ - Elapsed time to schema 0.585 seconds
[error] Exception in thread "main" java.lang.IllegalArgumentException: requirement failed: Current ClassLoader is 'jdk.internal.loader.ClassLoaders$AppClassLoader@4b9af9a9' only URLClassLoaders are supported
[error] at scala.Predef$.require(Predef.scala:277)
[error] at com.spotify.scio.runners.dataflow.DataflowContext$.detectClassPathResourcesToStage(DataflowContext.scala:58)
[error] at com.spotify.scio.runners.dataflow.DataflowContext$.getFilesToStage(DataflowContext.scala:49)
[error] at com.spotify.scio.runners.dataflow.DataflowContext$.prepareOptions(DataflowContext.scala:39)
[error] at com.spotify.scio.RunnerContext$.prepareOptions(ScioContext.scala:104)
[error] at com.spotify.scio.ScioContext.pipeline(ScioContext.scala:287)
[error] at com.spotify.scio.ScioContext$$anonfun$parallelize$1.apply(ScioContext.scala:857)
[error] at com.spotify.scio.ScioContext$$anonfun$parallelize$1.apply(ScioContext.scala:856)
[error] at com.spotify.scio.ScioContext.requireNotClosed(ScioContext.scala:419)
[error] at com.spotify.scio.ScioContext.parallelize(ScioContext.scala:856)
[error] at com.spotify.dbeam.JdbcAvroJob$.createSchema(JdbcAvroJob.scala:63)
[error] at com.spotify.dbeam.JdbcAvroJob$.prepareExport(JdbcAvroJob.scala:131)
[error] at com.spotify.dbeam.JdbcAvroJob$.runExport(JdbcAvroJob.scala:151)
[error] at com.spotify.dbeam.JdbcAvroJob$.main(JdbcAvroJob.scala:160)
[error] at com.spotify.dbeam.JdbcAvroJob.main(JdbcAvroJob.scala)
[error] java.lang.RuntimeException: Nonzero exit code returned from runner: 1
[error] at sbt.ForkRun.processExitCode$1(Run.scala:33)
[error] at sbt.ForkRun.run(Run.scala:42)
[error] at sbt.Defaults$.$anonfun$bgRunMainTask$6(Defaults.scala:1147)
[error] at sbt.Defaults$.$anonfun$bgRunMainTask$6$adapted(Defaults.scala:1142)
[error] at sbt.internal.BackgroundThreadPool.$anonfun$run$1(DefaultBackgroundJobService.scala:366)
[error] at scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.java:12)
[error] at scala.util.Try$.apply(Try.scala:209)
[error] at sbt.internal.BackgroundThreadPool$BackgroundRunnable.run(DefaultBackgroundJobService.scala:289)
[error] at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1135)
[error] at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635)
[error] at java.base/java.lang.Thread.run(Thread.java:844)
[error] (Compile / runMain) Nonzero exit code returned from runner: 1
[error] Total time: 7 s, completed May 23, 2018, 5:21:18 PM

The error was: Current ClassLoader is 'jdk.internal.loader.ClassLoaders$AppClassLoader@4b9af9a9' only URLClassLoaders are supported
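
For context on this error: on Java 8 the system class loader is a `URLClassLoader`, whereas from Java 9 onward it is `jdk.internal.loader.ClassLoaders$AppClassLoader`, which is not. The following is a minimal sketch, not code from DBeam or Scio, that reports which case applies on the running JVM and lists classpath entries from the `java.class.path` system property instead. The class name `ClassLoaderCheck` and its helper methods are hypothetical, chosen only for illustration.

```java
import java.io.File;
import java.net.URLClassLoader;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical helper for illustration; not part of DBeam, Scio, or Beam.
final class ClassLoaderCheck {

    /** Returns true when the loader is a URLClassLoader (the Java 8 default). */
    static boolean isUrlClassLoader(ClassLoader loader) {
        return loader instanceof URLClassLoader;
    }

    /** Splits a classpath string (e.g. the java.class.path property) into entries. */
    static List<String> classPathEntries(String classPath, String separator) {
        List<String> entries = new ArrayList<>();
        for (String entry : classPath.split(Pattern.quote(separator))) {
            if (!entry.isEmpty()) {
                entries.add(entry);
            }
        }
        return entries;
    }

    public static void main(String[] args) {
        ClassLoader loader = ClassLoader.getSystemClassLoader();
        System.out.println("Java version: " + System.getProperty("java.version"));
        System.out.println("System class loader: " + loader.getClass().getName());
        System.out.println("Is a URLClassLoader: " + isUrlClassLoader(loader));

        // java.class.path does not depend on the class loader's type.
        String classPath = System.getProperty("java.class.path");
        for (String entry : classPathEntries(classPath, File.pathSeparator)) {
            System.out.println("Classpath entry: " + entry);
        }
    }
}
```

Running this on Java 8 and then on Java 10 should show the `Is a URLClassLoader` line change from true to false, which matches the requirement that fails in the stack trace.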

Changing --runner to DirectRunner succeeded:

sbt:dbeam-core> runMain com.spotify.dbeam.JdbcAvroJob --project=i-ingest-poc --zone=us-west1-c --runner=DirectRunner --connectionUrl=jdbc:mysql://localhost:3306/dbeamtest --table=pet --username=hil --password=password --output=gs://dbeam-test/tmp
[warn] Multiple main classes detected. Run 'show discoveredMainClasses' to see the list
[info] Running (fork) com.spotify.dbeam.JdbcAvroJob --project=i-ingest-poc --zone=us-west1-c --runner=DirectRunner --connectionUrl=jdbc:mysql://localhost:3306/dbeamtest --table=pet --username=hil --password=password --output=gs://dbeam-test/tmp
[error] Wed May 23 17:28:20 PDT 2018 WARN: Establishing SSL connection without server's identity verification is not recommended. According to MySQL 5.5.45+, 5.6.26+ and 5.7.6+ requirements SSL connection must be established by default if explicit option isn't set. For compliance with existing applications not using SSL the verifyServerCertificate property is set to 'false'. You need either to explicitly disable SSL by setting useSSL=false, or set useSSL=true and provide truststore for server certificate verification.
[error] [main] INFO JdbcAvroConversions - Creating Avro schema based on the first read row from the database
[error] [main] INFO JdbcAvroConversions - Schema created successfully. Generated schema: {"type":"record","name":"pet","namespace":"dbeam_generated","doc":"Generate schema from JDBC ResultSet from 'pet' or the --sqlFile with jdbc:mysql://localhost:3306/dbeamtest","fields":[{"name":"name","type":["null","string"],"doc":"From sqlType 12 VARCHAR","default":null,"typeName":"VARCHAR","sqlCode":"12","columnName":"name"},{"name":"owner","type":["null","string"],"doc":"From sqlType 12 VARCHAR","default":null,"typeName":"VARCHAR","sqlCode":"12","columnName":"owner"},{"name":"species","type":["null","string"],"doc":"From sqlType 12 VARCHAR","default":null,"typeName":"VARCHAR","sqlCode":"12","columnName":"species"},{"name":"sex","type":["null","string"],"doc":"From sqlType 1 CHAR","default":null,"typeName":"CHAR","sqlCode":"1","columnName":"sex"},{"name":"birth","type":["null","long"],"doc":"From sqlType 91 DATE","default":null,"typeName":"DATE","sqlCode":"91","columnName":"birth"},{"name":"death","type":["null","long"],"doc":"From sqlType 91 DATE","default":null,"typeName":"DATE","sqlCode":"91","columnName":"death"}],"connectionUrl":"jdbc:mysql://localhost:3306/dbeamtest","tableName":"pet"}
[error] [main] INFO com.spotify.dbeam.JdbcAvroJob$ - Elapsed time to schema 0.726 seconds
[error] [main] INFO com.spotify.dbeam.JdbcAvroJob$ - Running queries: List(SELECT * FROM pet)
[error] WARNING: An illegal reflective access operation has occurred
[error] WARNING: Illegal reflective access by org.apache.beam.runners.direct.repackaged.com.google.protobuf.UnsafeUtil (file:/private/var/folders/q3/rf49by096192ckdl4j7dt56w0000gn/T/sbt_d9330199/target/b187293c/beam-runners-direct-java-2.4.0.jar) to field java.nio.Buffer.address
[error] WARNING: Please consider reporting this to the maintainers of org.apache.beam.runners.direct.repackaged.com.google.protobuf.UnsafeUtil
[error] WARNING: Use --illegal-access=warn to enable warnings of further illegal reflective access operations
[error] WARNING: All illegal access operations will be denied in a future release
[error] [direct-runner-worker] INFO org.apache.beam.sdk.io.WriteFiles - Opening writer 191b3dde-2394-4c6b-a022-5bab0f73dd00 for window org.apache.beam.sdk.transforms.windowing.GlobalWindow@fe7b6b0 pane PaneInfo{isFirst=true, isLast=true, timing=ON_TIME, index=0, onTimeIndex=0} destination null
[error] [direct-runner-worker] INFO com.spotify.dbeam.JdbcAvroIO$JdbcAvroWriter - jdbcavroio : Preparing write...
[error] Wed May 23 17:28:23 PDT 2018 WARN: Establishing SSL connection without server's identity verification is not recommended. According to MySQL 5.5.45+, 5.6.26+ and 5.7.6+ requirements SSL connection must be established by default if explicit option isn't set. For compliance with existing applications not using SSL the verifyServerCertificate property is set to 'false'. You need either to explicitly disable SSL by setting useSSL=false, or set useSSL=true and provide truststore for server certificate verification.
[error] Wed May 23 17:28:23 PDT 2018 WARN: Establishing SSL connection without server's identity verification is not recommended. According to MySQL 5.5.45+, 5.6.26+ and 5.7.6+ requirements SSL connection must be established by default if explicit option isn't set. For compliance with existing applications not using SSL the verifyServerCertificate property is set to 'false'. You need either to explicitly disable SSL by setting useSSL=false, or set useSSL=true and provide truststore for server certificate verification.
[error] [direct-runner-worker] INFO com.spotify.dbeam.JdbcAvroIO$JdbcAvroWriter - jdbcavroio : Write prepared
[error] [direct-runner-worker] INFO com.spotify.dbeam.JdbcAvroIO$JdbcAvroWriter - jdbcavroio : Starting write...
[error] [direct-runner-worker] INFO com.spotify.dbeam.JdbcAvroIO$JdbcAvroWriter - jdbcavroio : Executing query (this can take a few minutes) ...
[error] [direct-runner-worker] INFO com.spotify.dbeam.JdbcAvroIO$JdbcAvroWriter - jdbcavroio : Execute query took 0.01 seconds
[error] [direct-runner-worker] INFO com.spotify.dbeam.JdbcAvroIO$JdbcAvroWriter - jdbcavroio : Read 1 rows, took 0.01 seconds
[error] [direct-runner-worker] INFO com.spotify.dbeam.JdbcAvroIO$JdbcAvroWriter - jdbcavroio : Closing connection, flushing writer...
[error] [direct-runner-worker] INFO com.spotify.dbeam.JdbcAvroIO$JdbcAvroWriter - jdbcavroio : Write finished
[error] [direct-runner-worker] INFO org.apache.beam.sdk.io.FileBasedSink$Writer - Successfully wrote temporary file gs://dbeam-test/tmp/.temp-beam-2018-05-24_00-28-22-1/191b3dde-2394-4c6b-a022-5bab0f73dd00
[error] [direct-runner-worker] INFO org.apache.beam.sdk.io.WriteFiles - Finalizing 1 file results
[error] [direct-runner-worker] INFO org.apache.beam.sdk.io.FileBasedSink - Finalizing for destination null num shards 1.
[error] [direct-runner-worker] INFO org.apache.beam.sdk.io.FileBasedSink - Will copy temporary file FileResult{tempFilename=gs://dbeam-test/tmp/.temp-beam-2018-05-24_00-28-22-1/191b3dde-2394-4c6b-a022-5bab0f73dd00, shard=0, window=org.apache.beam.sdk.transforms.windowing.GlobalWindow@fe7b6b0, paneInfo=PaneInfo{isFirst=true, isLast=true, timing=ON_TIME, index=0, onTimeIndex=0}} to final location gs://dbeam-test/tmp/part-00000-of-00001.avro
[error] [direct-runner-worker] INFO org.apache.beam.sdk.io.FileBasedSink - Will remove known temporary file gs://dbeam-test/tmp/.temp-beam-2018-05-24_00-28-22-1/191b3dde-2394-4c6b-a022-5bab0f73dd00
[error] [main] INFO com.spotify.dbeam.JdbcAvroJob$ - Metrics Metrics(0.5.4,2.12.4,JdbcAvroJob,DONE,BeamMetrics(List(BeamMetric(com.spotify.scio.ScioMetrics,schemaElapsedTimeMs,MetricValue(726,Some(726))), BeamMetric(com.spotify.dbeam.JdbcAvroIO.JdbcAvroWriter,writeElapsedMs,MetricValue(7,Some(7))), BeamMetric(com.spotify.dbeam.JdbcAvroIO.JdbcAvroWriter,recordCount,MetricValue(1,Some(1))), BeamMetric(com.spotify.dbeam.JdbcAvroIO.JdbcAvroWriter,executeQueryElapsedMs,MetricValue(9,Some(9)))),List(),List(BeamMetric(com.spotify.dbeam.JdbcAvroIO.JdbcAvroWriter,msPerMillionRows,MetricValue(BeamGauge(7000000,2018-05-24T00:28:23.895Z),Some(BeamGauge(7000000,2018-05-24T00:28:23.895Z)))), BeamMetric(com.spotify.dbeam.JdbcAvroIO.JdbcAvroWriter,rowsPerMinute,MetricValue(BeamGauge(8571,2018-05-24T00:28:23.895Z),Some(BeamGauge(8571,2018-05-24T00:28:23.895Z)))))))
[error] [main] INFO com.spotify.dbeam.JdbcAvroJob$ - all counters and gauges Map(MetricName{namespace=com.spotify.dbeam.JdbcAvroIO.JdbcAvroWriter, name=rowsPerMinute} -> MetricValue(GaugeResult{value=8571, timestamp=2018-05-24T00:28:23.895Z},Some(GaugeResult{value=8571, timestamp=2018-05-24T00:28:23.895Z})), MetricName{namespace=com.spotify.scio.ScioMetrics, name=schemaElapsedTimeMs} -> MetricValue(726,Some(726)), MetricName{namespace=com.spotify.dbeam.JdbcAvroIO.JdbcAvroWriter, name=recordCount} -> MetricValue(1,Some(1)), MetricName{namespace=com.spotify.dbeam.JdbcAvroIO.JdbcAvroWriter, name=executeQueryElapsedMs} -> MetricValue(9,Some(9)), MetricName{namespace=com.spotify.dbeam.JdbcAvroIO.JdbcAvroWriter, name=writeElapsedMs} -> MetricValue(7,Some(7)), MetricName{namespace=com.spotify.dbeam.JdbcAvroIO.JdbcAvroWriter, name=msPerMillionRows} -> MetricValue(GaugeResult{value=7000000, timestamp=2018-05-24T00:28:23.895Z},Some(GaugeResult{value=7000000, timestamp=2018-05-24T00:28:23.895Z})))
[success] Total time: 12 s, completed May 23, 2018, 5:28:27 PM
sbt:dbeam-core>

hilliao changed the title from "How to run on Google cloud dataflow" to "Failed to run on Google cloud dataflow" on May 24, 2018
hilliao changed the title from "Failed to run on Google cloud dataflow" to "Failed to run on Google cloud dataflow when built with Java 10" on May 25, 2018

prideloki commented

I got the same error. Any suggestions?

hilliao (Author) commented Jun 11, 2018

As the title says, Java 10 is what causes the Current ClassLoader error. My solution was to downgrade to Java 8 on the build machine, which is usually your local Linux or macOS development machine. Google Cloud Dataflow supports only Java 8, and code built with Java 10 hits errors like this one. The steps I took: uninstall Java 10 along with every other JDK and JRE so that nothing Java-related is left on the build machine, then install the latest Java 8 release.
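
To confirm which Java version a build or job is really running on after the downgrade, a fail-fast check can help. The following is a minimal sketch that assumes the Java 8 requirement described in this thread; the class `JavaVersionGuard` and its methods are hypothetical names, not part of DBeam. It reads the standard `java.specification.version` property, which is "1.8" on Java 8 and "9", "10", "11" on later releases.

```java
// Hypothetical guard for illustration; not part of DBeam, Scio, or Beam.
final class JavaVersionGuard {

    /** Parses the major version from a java.specification.version value. */
    static int majorVersion(String specVersion) {
        String[] parts = specVersion.split("\\.");
        // Java 8 and earlier report "1.x"; Java 9 and later report "x".
        if (parts.length > 1 && parts[0].equals("1")) {
            return Integer.parseInt(parts[1]);
        }
        return Integer.parseInt(parts[0]);
    }

    /** Throws if the given specification version is not Java 8. */
    static void requireJava8(String specVersion) {
        int major = majorVersion(specVersion);
        if (major != 8) {
            throw new IllegalStateException(
                "Expected Java 8 but found Java " + major
                    + "; see the URLClassLoader error in this thread");
        }
    }

    public static void main(String[] args) {
        requireJava8(System.getProperty("java.specification.version"));
        System.out.println("Running on Java 8");
    }
}
```

Calling `requireJava8` at the start of a `main` method would turn the ClassLoader failure into an explicit message about the Java version.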

labianchin (Collaborator) commented Jun 12, 2018

That seems to be an issue with the Google Dataflow SDK. Maybe open an issue here: https://github.com/GoogleCloudPlatform/DataflowJavaSDK/issues/ ?

labianchin (Collaborator) commented

Closing this because the Beam SDK does not yet support JDK 9/10/11. Once a Beam SDK version with JDK 11 support is available, DBeam will be upgraded to that version.

See the following for more details:
https://beam.apache.org/roadmap/java-sdk/
https://issues.apache.org/jira/browse/BEAM-2530
