-
Notifications
You must be signed in to change notification settings - Fork 190
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
can not read from BQ table #35
Comments
com.google.api.gax.* should be shaded/relocated in our connector, but isn't shaded in your stack trace. Are you using gs://spark-lib/bigquery/spark-bigquery-latest.jar or building it yourself? If you build it yourself you should use sbt assembly. |
This is the same as #36. I think I will update the compilation instructions to compile against the seaded profile. |
@pmkc I am building it myself using sbt assembly, even though I am seeing this error. |
Can you give me your approximate build and run commands. |
my build command is and this is how my build.sbt looks like,
|
Could you try this?
|
I'm not actually sure why that is giving you a no such method for grpc, you could try sbt dependencyTree to find a conflict, but I would just side step that. I would just mark "spark-bigquery" provided and use gs://spark-lib/bigquery/spark-bigquery-latest.jar OR compile against the shaded qualifier. |
@achelimed I am seeing same error even after applying shadeRules you mentioned above. |
This issue seems to be resolved in 0.8.0 release. Thank you @achelimed and @pmkc for your help. |
I am trying to read data from BigQuery in my dataproc spark job
using below code
val df_bpp: DataFrame = spark.read.bigquery("mintreporting.TEST_DS.spr_sample") df_bpp.printSchema(); df_bpp.show(10);
The schema for my table is getting printed properly but the code fails while executing the last line (show()). Below is the stack trace.
Any kind of help is highly appreciated.
9/07/02 08:02:47 ERROR io.grpc.internal.ManagedChannelImpl: [Channel<1>: (bigquerystorage.googleapis.com:443)] Uncaught exception in the SynchronizationContext. Panic!
java.lang.NoSuchMethodError: io.grpc.internal.ClientTransportFactory$ClientTransportOptions.getProxyParameters()Lio/grpc/internal/ProxyParameters;
at io.grpc.netty.shaded.io.grpc.netty.NettyChannelBuilder$NettyTransportFactory.newClientTransport(NettyChannelBuilder.java:542)
at io.grpc.internal.CallCredentialsApplyingTransportFactory.newClientTransport(CallCredentialsApplyingTransportFactory.java:48)
at io.grpc.internal.InternalSubchannel.startNewTransport(InternalSubchannel.java:263)
at io.grpc.internal.InternalSubchannel.obtainActiveTransport(InternalSubchannel.java:216)
at io.grpc.internal.ManagedChannelImpl$SubchannelImpl.requestConnection(ManagedChannelImpl.java:1452)
at io.grpc.internal.PickFirstLoadBalancer.handleResolvedAddressGroups(PickFirstLoadBalancer.java:59)
at io.grpc.internal.AutoConfiguredLoadBalancerFactory$AutoConfiguredLoadBalancer.handleResolvedAddressGroups(AutoConfiguredLoadBalancerFactory.java:148)
at io.grpc.internal.ManagedChannelImpl$NameResolverListenerImpl$1NamesResolved.run(ManagedChannelImpl.java:1326)
at io.grpc.SynchronizationContext.drain(SynchronizationContext.java:101)
at io.grpc.SynchronizationContext.execute(SynchronizationContext.java:130)
at io.grpc.internal.ManagedChannelImpl$NameResolverListenerImpl.onAddresses(ManagedChannelImpl.java:1331)
at io.grpc.internal.DnsNameResolver$Resolve.resolveInternal(DnsNameResolver.java:318)
at io.grpc.internal.DnsNameResolver$Resolve.run(DnsNameResolver.java:220)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
Exception in thread "main" com.google.api.gax.rpc.InternalException: io.grpc.StatusRuntimeException: INTERNAL: Panic! This is a bug!
at com.google.api.gax.rpc.ApiExceptionFactory.createException(ApiExceptionFactory.java:67)
at com.google.api.gax.grpc.GrpcApiExceptionFactory.create(GrpcApiExceptionFactory.java:72)
at com.google.api.gax.grpc.GrpcApiExceptionFactory.create(GrpcApiExceptionFactory.java:60)
at com.google.api.gax.grpc.GrpcExceptionCallable$ExceptionTransformingFuture.onFailure(GrpcExceptionCallable.java:97)
at com.google.api.core.ApiFutures$1.onFailure(ApiFutures.java:68)
at repackaged.com.google.common.util.concurrent.Futures$CallbackListener.run(Futures.java:1070)
at repackaged.com.google.common.util.concurrent.DirectExecutor.execute(DirectExecutor.java:30)
at repackaged.com.google.common.util.concurrent.AbstractFuture.executeListener(AbstractFuture.java:1139)
at repackaged.com.google.common.util.concurrent.AbstractFuture.complete(AbstractFuture.java:958)
at repackaged.com.google.common.util.concurrent.AbstractFuture.setException(AbstractFuture.java:748)
at io.grpc.stub.ClientCalls$GrpcFuture.setException(ClientCalls.java:507)
at io.grpc.stub.ClientCalls$UnaryStreamToFuture.onClose(ClientCalls.java:482)
at io.grpc.PartialForwardingClientCallListener.onClose(PartialForwardingClientCallListener.java:39)
at io.grpc.ForwardingClientCallListener.onClose(ForwardingClientCallListener.java:23)
at io.grpc.ForwardingClientCallListener$SimpleForwardingClientCallListener.onClose(ForwardingClientCallListener.java:40)
at io.grpc.internal.CensusStatsModule$StatsClientInterceptor$1$1.onClose(CensusStatsModule.java:699)
at io.grpc.PartialForwardingClientCallListener.onClose(PartialForwardingClientCallListener.java:39)
at io.grpc.ForwardingClientCallListener.onClose(ForwardingClientCallListener.java:23)
at io.grpc.ForwardingClientCallListener$SimpleForwardingClientCallListener.onClose(ForwardingClientCallListener.java:40)
at io.grpc.internal.CensusTracingModule$TracingClientInterceptor$1$1.onClose(CensusTracingModule.java:397)
at io.grpc.internal.ClientCallImpl.closeObserver(ClientCallImpl.java:459)
at io.grpc.internal.ClientCallImpl.access$300(ClientCallImpl.java:63)
at io.grpc.internal.ClientCallImpl$ClientStreamListenerImpl.close(ClientCallImpl.java:546)
at io.grpc.internal.ClientCallImpl$ClientStreamListenerImpl.access$600(ClientCallImpl.java:467)
at io.grpc.internal.ClientCallImpl$ClientStreamListenerImpl$1StreamClosed.runInContext(ClientCallImpl.java:584)
at io.grpc.internal.ContextRunnable.run(ContextRunnable.java:37)
at io.grpc.internal.SerializingExecutor.run(SerializingExecutor.java:123)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
Suppressed: com.google.api.gax.rpc.AsyncTaskException: Asynchronous task failed
at com.google.api.gax.rpc.ApiExceptions.callAndTranslateApiException(ApiExceptions.java:57)
at com.google.api.gax.rpc.UnaryCallable.call(UnaryCallable.java:112)
at com.google.cloud.bigquery.storage.v1beta1.BigQueryStorageClient.createReadSession(BigQueryStorageClient.java:237)
at com.google.cloud.spark.bigquery.direct.DirectBigQueryRelation.buildScan(DirectBigQueryRelation.scala:84)
at org.apache.spark.sql.execution.datasources.DataSourceStrategy$$anonfun$10.apply(DataSourceStrategy.scala:293)
at org.apache.spark.sql.execution.datasources.DataSourceStrategy$$anonfun$10.apply(DataSourceStrategy.scala:293)
at org.apache.spark.sql.execution.datasources.DataSourceStrategy$$anonfun$pruneFilterProject$1.apply(DataSourceStrategy.scala:326)
at org.apache.spark.sql.execution.datasources.DataSourceStrategy$$anonfun$pruneFilterProject$1.apply(DataSourceStrategy.scala:325)
at org.apache.spark.sql.execution.datasources.DataSourceStrategy.pruneFilterProjectRaw(DataSourceStrategy.scala:403)
at org.apache.spark.sql.execution.datasources.DataSourceStrategy.pruneFilterProject(DataSourceStrategy.scala:321)
at org.apache.spark.sql.execution.datasources.DataSourceStrategy.apply(DataSourceStrategy.scala:289)
at org.apache.spark.sql.catalyst.planning.QueryPlanner$$anonfun$1.apply(QueryPlanner.scala:63)
at org.apache.spark.sql.catalyst.planning.QueryPlanner$$anonfun$1.apply(QueryPlanner.scala:63)
at scala.collection.Iterator$$anon$12.nextCur(Iterator.scala:435)
at scala.collection.Iterator$$anon$12.hasNext(Iterator.scala:441)
at scala.collection.Iterator$$anon$12.hasNext(Iterator.scala:440)
at org.apache.spark.sql.catalyst.planning.QueryPlanner.plan(QueryPlanner.scala:93)
at org.apache.spark.sql.catalyst.planning.QueryPlanner$$anonfun$2$$anonfun$apply$2.apply(QueryPlanner.scala:78)
at org.apache.spark.sql.catalyst.planning.QueryPlanner$$anonfun$2$$anonfun$apply$2.apply(QueryPlanner.scala:75)
at scala.collection.TraversableOnce$$anonfun$foldLeft$1.apply(TraversableOnce.scala:157)
at scala.collection.TraversableOnce$$anonfun$foldLeft$1.apply(TraversableOnce.scala:157)
at scala.collection.Iterator$class.foreach(Iterator.scala:891)
at scala.collection.AbstractIterator.foreach(Iterator.scala:1334)
at scala.collection.TraversableOnce$class.foldLeft(TraversableOnce.scala:157)
at scala.collection.AbstractIterator.foldLeft(Iterator.scala:1334)
at org.apache.spark.sql.catalyst.planning.QueryPlanner$$anonfun$2.apply(QueryPlanner.scala:75)
at org.apache.spark.sql.catalyst.planning.QueryPlanner$$anonfun$2.apply(QueryPlanner.scala:67)
at scala.collection.Iterator$$anon$12.nextCur(Iterator.scala:435)
at scala.collection.Iterator$$anon$12.hasNext(Iterator.scala:441)
at org.apache.spark.sql.catalyst.planning.QueryPlanner.plan(QueryPlanner.scala:93)
at org.apache.spark.sql.execution.QueryExecution.sparkPlan$lzycompute(QueryExecution.scala:72)
at org.apache.spark.sql.execution.QueryExecution.sparkPlan(QueryExecution.scala:68)
at org.apache.spark.sql.execution.QueryExecution.executedPlan$lzycompute(QueryExecution.scala:77)
at org.apache.spark.sql.execution.QueryExecution.executedPlan(QueryExecution.scala:77)
at org.apache.spark.sql.Dataset.withAction(Dataset.scala:3359)
at org.apache.spark.sql.Dataset.head(Dataset.scala:2544)
at org.apache.spark.sql.Dataset.take(Dataset.scala:2758)
at org.apache.spark.sql.Dataset.getRows(Dataset.scala:254)
at org.apache.spark.sql.Dataset.showString(Dataset.scala:291)
at org.apache.spark.sql.Dataset.show(Dataset.scala:745)
at org.apache.spark.sql.Dataset.show(Dataset.scala:704)
at transformations.SPRBarcPPJob$.flattenSPRBARC(SPRBarcPPJob.scala:42)
at LoadData$.main(LoadData.scala:34)
at LoadData.main(LoadData.scala)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.spark.deploy.JavaMainApplication.start(SparkApplication.scala:52)
at org.apache.spark.deploy.SparkSubmit.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:849)
at org.apache.spark.deploy.SparkSubmit.doRunMain$1(SparkSubmit.scala:167)
at org.apache.spark.deploy.SparkSubmit.submit(SparkSubmit.scala:195)
at org.apache.spark.deploy.SparkSubmit.doSubmit(SparkSubmit.scala:86)
at org.apache.spark.deploy.SparkSubmit$$anon$2.doSubmit(SparkSubmit.scala:924)
at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:933)
at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
Caused by: io.grpc.StatusRuntimeException: INTERNAL: Panic! This is a bug!
at io.grpc.Status.asRuntimeException(Status.java:532)
... 23 more
Caused by: java.lang.NoSuchMethodError: io.grpc.internal.ClientTransportFactory$ClientTransportOptions.getProxyParameters()Lio/grpc/internal/ProxyParameters;
at io.grpc.netty.shaded.io.grpc.netty.NettyChannelBuilder$NettyTransportFactory.newClientTransport(NettyChannelBuilder.java:542)
at io.grpc.internal.CallCredentialsApplyingTransportFactory.newClientTransport(CallCredentialsApplyingTransportFactory.java:48)
at io.grpc.internal.InternalSubchannel.startNewTransport(InternalSubchannel.java:263)
at io.grpc.internal.InternalSubchannel.obtainActiveTransport(InternalSubchannel.java:216)
at io.grpc.internal.ManagedChannelImpl$SubchannelImpl.requestConnection(ManagedChannelImpl.java:1452)
at io.grpc.internal.PickFirstLoadBalancer.handleResolvedAddressGroups(PickFirstLoadBalancer.java:59)
at io.grpc.internal.AutoConfiguredLoadBalancerFactory$AutoConfiguredLoadBalancer.handleResolvedAddressGroups(AutoConfiguredLoadBalancerFactory.java:148)
at io.grpc.internal.ManagedChannelImpl$NameResolverListenerImpl$1NamesResolved.run(ManagedChannelImpl.java:1326)
at io.grpc.SynchronizationContext.drain(SynchronizationContext.java:101)
at io.grpc.SynchronizationContext.execute(SynchronizationContext.java:130)
at io.grpc.internal.ManagedChannelImpl$NameResolverListenerImpl.onAddresses(ManagedChannelImpl.java:1331)
at io.grpc.internal.DnsNameResolver$Resolve.resolveInternal(DnsNameResolver.java:318)
at io.grpc.internal.DnsNameResolver$Resolve.run(DnsNameResolver.java:220)
... 3 more
19/07/02 08:02:47 INFO org.spark_project.jetty.server.AbstractConnector: Stopped Spark@6fca5907{HTTP/1.1,[http/1.1]}{0.0.0.0:4040}
ERROR: (gcloud.dataproc.jobs.submit.spark) Job [920e05ddb57643fbbc6bf980e7b351bb] failed with error:
Google Cloud Dataproc Agent reports job failure. If logs are available, they can be found in 'gs://uat-mint-dataproc/google-cloud-dataproc-metainfo/9d999556-1033-457c-a954-05c6c4b3dcaf/jobs/920e05ddb57643fbbc6bf980e7b351bb/driveroutput'.
The text was updated successfully, but these errors were encountered: