
java.io.IOException: The Application Default Credentials are not available in DataFlow pipeline #475

Closed
derjust opened this issue Sep 9, 2015 · 9 comments
Labels: api: bigtable · 🚨 (needs some love) · triage me

derjust (Contributor) commented Sep 9, 2015

Create a simple GCS-file-to-Bigtable Dataflow pipeline:

        Pipeline p = Pipeline.create(options);

        CloudBigtableIO.initializeForWrite(p)
                .apply(TextIO.Read.from(options.getInputFile()))
                .apply(ParDo.named("Parse").of(new DoFn<String, Mutation>() {
                    @Override
                    public void processElement(DoFn<String, Mutation>.ProcessContext c) {
                        // generateStatements() builds the Puts for one input element.
                        List<Put> putStatements = generateStatements();
                        for (Mutation mutation : putStatements) {
                            c.output(mutation);
                        }
                    }
                }))
                .apply(CloudBigtableIO.writeToTable(
                        CloudBigtableTableConfiguration.fromCBTOptions(options)));

        p.run();

The job is executed over 10 files of roughly 3 GB each (±1.5 GB) with autoscaling, which jumps to 100 and then 328 workers within 4 minutes.

We saw this many times, and it looks like a corresponding number of messages did not make it to Bigtable (more precise investigation required):

Sep 9, 2015, 6:43:37 PM
(d8a351d35056269a): java.io.IOException: The Application Default Credentials are not available. They are available if running in Google Compute Engine. Otherwise, the environment variable GOOGLE_APPLICATION_CREDENTIALS must be defined pointing to a file defining the credentials. See https://developers.google.com/accounts/docs/application-default-credentials for more information. 
at com.google.auth.oauth2.DefaultCredentialsProvider.getDefaultCredentials(DefaultCredentialsProvider.java:62) 
at com.google.auth.oauth2.GoogleCredentials.getApplicationDefault(GoogleCredentials.java:66) 
at com.google.cloud.bigtable.config.CredentialFactory.getApplicationDefaultCredential(CredentialFactory.java:173) 
at com.google.cloud.bigtable.config.CredentialFactory.getCredentials(CredentialFactory.java:100) 
at com.google.cloud.bigtable.grpc.BigtableSession$4.call(BigtableSession.java:249) 
at com.google.cloud.bigtable.grpc.BigtableSession$4.call(BigtableSession.java:246) 
at java.util.concurrent.FutureTask.run(FutureTask.java:266) 
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) 
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) 
at java.lang.Thread.run(Thread.java:745)
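For context on what the IOException above is checking, here is a simplified, stdlib-only sketch of the Application Default Credentials lookup order described in the error message. The real logic lives in com.google.auth.oauth2.DefaultCredentialsProvider; the class name, method, and "source" strings here are purely illustrative. The puzzle in this issue is that Dataflow workers do run on GCE, so the metadata-server fallback should have succeeded:

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

/**
 * Illustrative sketch (NOT the real implementation) of the three sources
 * that Application Default Credentials resolution tries, in order.
 */
public class AdcLookupSketch {

    /** Returns a label for where credentials would come from, or null if nowhere. */
    static String resolveCredentialSource(String envPath, Path wellKnownFile, boolean onGce) {
        // 1. Explicit override: GOOGLE_APPLICATION_CREDENTIALS must point
        //    to a readable JSON key file.
        if (envPath != null && Files.isReadable(Paths.get(envPath))) {
            return "env:" + envPath;
        }
        // 2. Well-known file written by the gcloud CLI for the logged-in user.
        if (Files.isReadable(wellKnownFile)) {
            return "gcloud:" + wellKnownFile;
        }
        // 3. On GCE (and thus on Dataflow workers) the metadata server supplies
        //    credentials; the real check is an HTTP probe, sketched here as a flag.
        if (onGce) {
            return "metadata-server";
        }
        // None of the three sources applies -> the IOException in the trace above.
        return null;
    }

    public static void main(String[] args) {
        Path wellKnown = Paths.get(System.getProperty("user.home"),
                ".config", "gcloud", "application_default_credentials.json");
        System.out.println("credential source: " + resolveCredentialSource(
                System.getenv("GOOGLE_APPLICATION_CREDENTIALS"), wellKnown, false));
    }
}
```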
sduskis (Contributor) commented Sep 9, 2015

Is that error happening on your local machine or in the Dataflow environment?

derjust (Contributor, Author) commented Sep 9, 2015

Executed on Dataflow. The job ID is 2015-09-09_15_35_56-9067137410121152824

This warning also appears in the logs very often; unsure whether it is related:

{
  "message": "Exception processing message",
  "work": "7395070837218891579",
  "thread": "59",
  "worker": "ingestionpipeline2-sebast-09091535-322d-harness-7uup",
  "exception": "java.util.concurrent.RejectedExecutionException: Task io.grpc.SerializingExecutor$TaskRunner@7a4baff4 rejected from java.util.concurrent.ThreadPoolExecutor@6da41f95[Shutting down, pool size = 4, active threads = 4, queued tasks = 0, completed tasks = 145980]
    at java.util.concurrent.ThreadPoolExecutor$AbortPolicy.rejectedExecution(ThreadPoolExecutor.java:2047)
    at java.util.concurrent.ThreadPoolExecutor.reject(ThreadPoolExecutor.java:823)
    at java.util.concurrent.ThreadPoolExecutor.execute(ThreadPoolExecutor.java:1369)
    at io.grpc.SerializingExecutor.execute(SerializingExecutor.java:112)
    at io.grpc.ChannelImpl$CallImpl$ClientStreamListenerImpl.closed(ChannelImpl.java:398)
    at io.grpc.transport.AbstractClientStream.closeListener(AbstractClientStream.java:256)
    at io.grpc.transport.AbstractClientStream.transportReportStatus(AbstractClientStream.java:230)
    at io.grpc.transport.AbstractClientStream.remoteEndClosed(AbstractClientStream.java:180)
    at io.grpc.transport.AbstractStream$1.endOfStream(AbstractStream.java:121)
    at io.grpc.transport.MessageDeframer.deliver(MessageDeframer.java:253)
    at io.grpc.transport.MessageDeframer.deframe(MessageDeframer.java:168)
    at io.grpc.transport.AbstractStream.deframe(AbstractStream.java:285)
    at io.grpc.transport.AbstractClientStream.inboundTrailersReceived(AbstractClientStream.java:175)
    at io.grpc.transport.Http2ClientStream.transportTrailersReceived(Http2ClientStream.java:162)
    at io.grpc.transport.netty.NettyClientStream.transportHeadersReceived(NettyClientStream.java:110)
    at io.grpc.transport.netty.NettyClientHandler.onHeadersRead(NettyClientHandler.java:179)
    at io.grpc.transport.netty.NettyClientHandler.access$800(NettyClientHandler.java:69)
    at io.grpc.transport.netty.NettyClientHandler$LazyFrameListener.onHeadersRead(NettyClientHandler.java:424)
    at com.google.bigtable.repackaged.io.netty.handler.codec.http2.DefaultHttp2ConnectionDecoder$FrameReadListener.onHeadersRead(DefaultHttp2ConnectionDecoder.java:316)
    at com.google.bigtable.repackaged.io.netty.handler.codec.http2.DefaultHttp2ConnectionDecoder$FrameReadListener.onHeadersRead(DefaultHttp2ConnectionDecoder.java:265)
    at com.google.bigtable.repackaged.io.netty.handler.codec.http2.Http2InboundFrameLogger$1.onHeadersRead(Http2InboundFrameLogger.java:54)
    at com.google.bigtable.repackaged.io.netty.handler.codec.http2.DefaultHttp2FrameReader$2.processFragment(DefaultHttp2FrameReader.java:450)
    at com.google.bigtable.repackaged.io.netty.handler.codec.http2.DefaultHttp2FrameReader.readHeadersFrame(DefaultHttp2FrameReader.java:459)
    at com.google.bigtable.repackaged.io.netty.handler.codec.http2.DefaultHttp2FrameReader.processPayloadState(DefaultHttp2FrameReader.java:226)
    at com.google.bigtable.repackaged.io.netty.handler.codec.http2.DefaultHttp2FrameReader.readFrame(DefaultHttp2FrameReader.java:130)
    at com.google.bigtable.repackaged.io.netty.handler.codec.http2.Http2InboundFrameLogger.readFrame(Http2InboundFrameLogger.java:39)
    at com.google.bigtable.repackaged.io.netty.handler.codec.http2.DefaultHttp2ConnectionDecoder.decodeFrame(DefaultHttp2ConnectionDecoder.java:100)
    at com.google.bigtable.repackaged.io.netty.handler.codec.http2.Http2ConnectionHandler$FrameDecoder.decode(Http2ConnectionHandler.java:293)
    at com.google.bigtable.repackaged.io.netty.handler.codec.http2.Http2ConnectionHandler.decode(Http2ConnectionHandler.java:336)
    at com.google.bigtable.repackaged.io.netty.handler.codec.ByteToMessageDecoder.callDecode(ByteToMessageDecoder.java:327)
    at com.google.bigtable.repackaged.io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:230)
    at com.google.bigtable.repackaged.io.netty.channel.ChannelHandlerInvokerUtil.invokeChannelReadNow(ChannelHandlerInvokerUtil.java:83)
    at com.google.bigtable.repackaged.io.netty.channel.DefaultChannelHandlerInvoker.invokeChannelRead(DefaultChannelHandlerInvoker.java:153)
    at com.google.bigtable.repackaged.io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:157)
    at com.google.bigtable.repackaged.io.netty.handler.ssl.SslHandler.unwrap(SslHandler.java:1069)
    at com.google.bigtable.repackaged.io.netty.handler.ssl.SslHandler.decode(SslHandler.java:944)
    at com.google.bigtable.repackaged.io.netty.handler.codec.ByteToMessageDecoder.callDecode(ByteToMessageDecoder.java:327)
    at com.google.bigtable.repackaged.io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:230)
    at com.google.bigtable.repackaged.io.netty.channel.ChannelHandlerInvokerUtil.invokeChannelReadNow(ChannelHandlerInvokerUtil.java:83)
    at com.google.bigtable.repackaged.io.netty.channel.DefaultChannelHandlerInvoker.invokeChannelRead(DefaultChannelHandlerInvoker.java:153)
    at com.google.bigtable.repackaged.io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:157)
    at com.google.bigtable.repackaged.io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:946)
    at com.google.bigtable.repackaged.io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:127)
    at com.google.bigtable.repackaged.io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:510)
    at com.google.bigtable.repackaged.io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:467)
    at com.google.bigtable.repackaged.io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:381)
    at com.google.bigtable.repackaged.io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:353)
    at com.google.bigtable.repackaged.io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:703)
    at java.lang.Thread.run(Thread.java:745)",
  "logger": "io.grpc.transport.AbstractClientStream",
  "stage": "s01",
  "job": "2015-09-09_15_35_56-9067137410121152824"
}

sduskis (Contributor) commented Sep 10, 2015

The RejectedExecutionException happens when a connection is closed while more requests are still arriving. I'm not sure why that would happen here. I'll see if I can reproduce the problem outside of Dataflow.
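For reference, the shutdown behavior described here can be reproduced in isolation with plain java.util.concurrent, no gRPC involved. This is just a stand-alone sketch of the failure mode (an executor in the "Shutting down" state rejecting late submissions via the default AbortPolicy), not the Bigtable client's actual code:

```java
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

/**
 * Minimal demonstration: once shutdown() has been called, a
 * ThreadPoolExecutor's default AbortPolicy throws
 * RejectedExecutionException for any further execute() call, which is
 * what surfaces in the gRPC SerializingExecutor stack trace above.
 */
public class RejectedAfterShutdownDemo {

    /** Returns true if the pool rejects a task submitted after shutdown(). */
    static boolean rejectsAfterShutdown() {
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                1, 1, 0L, TimeUnit.MILLISECONDS, new LinkedBlockingQueue<>());
        pool.shutdown();  // pool enters the "Shutting down" state
        try {
            pool.execute(() -> { });  // late submission
            return false;
        } catch (RejectedExecutionException e) {
            return true;  // default AbortPolicy rejected the task
        }
    }

    public static void main(String[] args) {
        System.out.println("rejected after shutdown: " + rejectsAfterShutdown());
    }
}
```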

derjust (Contributor, Author) commented Sep 10, 2015

OK. Since the RejectedExecutionException also appears without the credentials problem, I created a dedicated ticket for it: #478

derjust (Contributor, Author) commented Sep 23, 2015

We were able to reproduce it. It looks like it can be forced, or at least becomes more likely, when the Bigtable servers are rebalancing (or when the number of Bigtable servers changes, which also triggers a rebalancing). Thanks to running with very few workers, this should be the whole stack trace:

Uncaught exception in main thread. Exiting with status code 1.
java.lang.RuntimeException: Unable to get application default credentials. Please see https://developers.google.com/accounts/docs/application-default-credentials for details on how to specify credentials. This version of the SDK is dependent on the gcloud core component version 2015.02.05 or newer to be able to get credentials from the currently authorized user via gcloud auth.
    at com.google.cloud.dataflow.sdk.util.Credentials.getCredential(Credentials.java:122)
    at com.google.cloud.dataflow.sdk.util.GcpCredentialFactory.getCredential(GcpCredentialFactory.java:43)
    at com.google.cloud.dataflow.sdk.options.GcpOptions$GcpUserCredentialsFactory.create(GcpOptions.java:260)
    at com.google.cloud.dataflow.sdk.options.GcpOptions$GcpUserCredentialsFactory.create(GcpOptions.java:250)
    at com.google.cloud.dataflow.sdk.options.ProxyInvocationHandler.getDefault(ProxyInvocationHandler.java:290)
    at com.google.cloud.dataflow.sdk.options.ProxyInvocationHandler.invoke(ProxyInvocationHandler.java:129)
    at com.sun.proxy.$Proxy23.getGcpCredential(Unknown Source)
    at com.google.cloud.dataflow.sdk.util.Transport.newDataflowClient(Transport.java:135)
    at com.google.cloud.dataflow.sdk.runners.worker.DataflowWorkerHarness$DataflowWorkUnitClient.fromOptions(DataflowWorkerHarness.java:233)
    at com.google.cloud.dataflow.sdk.runners.worker.DataflowWorkerHarness.create(DataflowWorkerHarness.java:196)
    at com.google.cloud.dataflow.sdk.runners.worker.DataflowWorkerHarness.main(DataflowWorkerHarness.java:118)
Caused by: java.io.IOException: The Application Default Credentials are not available. They are available if running in Google Compute Engine. Otherwise, the environment variable GOOGLE_APPLICATION_CREDENTIALS must be defined pointing to a file defining the credentials. See https://developers.google.com/accounts/docs/application-default-credentials for more information.
    at com.google.api.client.googleapis.auth.oauth2.DefaultCredentialProvider.getDefaultCredential(DefaultCredentialProvider.java:93)   
    at com.google.api.client.googleapis.auth.oauth2.GoogleCredential.getApplicationDefault(GoogleCredential.java:213)   
    at com.google.api.client.googleapis.auth.oauth2.GoogleCredential.getApplicationDefault(GoogleCredential.java:191)   
    at com.google.cloud.dataflow.sdk.util.Credentials.getCredential(Credentials.java:120)   
    ... 10 more 

I see where the IOException is created, but I do not understand why the credential is still null at that point: com.google.api.client.googleapis.auth.oauth2.DefaultCredentialProvider.getDefaultCredentialUnsynchronized(HttpTransport, JsonFactory) is beyond my understanding of what should be OK there and what is not.

This was logged to STDOUT. Can it be intercepted at a better place and logged via SLF4J/commons-logging so that it appears as a single log entry in Cloud Logging?
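One way to achieve what is being asked here is to install a default uncaught-exception handler that routes crashes through a logger before the process exits. The sketch below uses java.util.logging to stay self-contained; in a Dataflow worker the equivalent SLF4J call would be what feeds Cloud Logging. This is purely illustrative and not what the Dataflow SDK actually does:

```java
import java.util.logging.Level;
import java.util.logging.Logger;

/**
 * Hypothetical sketch: route uncaught exceptions through a logger so they
 * land as one structured SEVERE record (message plus attached stack trace)
 * instead of an unstructured multi-line dump on STDOUT.
 */
public class UncaughtToLogger {
    private static final Logger LOG = Logger.getLogger(UncaughtToLogger.class.getName());

    /** Installs a JVM-wide handler for exceptions no thread caught. */
    public static void install() {
        Thread.setDefaultUncaughtExceptionHandler((thread, throwable) ->
                LOG.log(Level.SEVERE,
                        "Uncaught exception in thread " + thread.getName(), throwable));
    }

    public static void main(String[] args) throws InterruptedException {
        install();
        // This worker thread dies with an uncaught RuntimeException; the
        // handler logs it as a single SEVERE record rather than printing it.
        Thread t = new Thread(() -> { throw new RuntimeException("boom"); });
        t.start();
        t.join();
    }
}
```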

kevinsi4508 (Contributor) commented:

According to the log, you are missing the Application Default Credentials. Is this still happening? We need to talk to the Dataflow team to figure out why.

derjust (Contributor, Author) commented Oct 9, 2015

We never saw it happen again after that specific day.
Hopefully it is related to #504 and gets fixed along with it.

kevinsi4508 (Contributor) commented:

I don't believe #504 is related to this issue; the problem here occurs at startup time. If you don't see it anymore, we should close this.

sduskis (Contributor) commented Oct 9, 2015

My best guess is that this was an intermittent Dataflow system configuration issue relating to Google Cloud Credential setup. Let's close this for now, since there is nothing actionable for Cloud Bigtable. We'll reopen if this becomes a problem. We'll let the Dataflow team know about this.

sduskis closed this as completed Oct 9, 2015
google-cloud-label-sync bot added the api: bigtable label Jan 31, 2020
yoshi-automation added the 🚨 (needs some love) and triage me labels Apr 6, 2020