Bigtable: INTERNAL: HTTP/2 error code: INTERNAL_ERROR Received Rst Stream as a retryable status code #106

Closed
igorbernstein2 opened this issue Jun 19, 2019 · 10 comments · Fixed by #586
Labels: api: bigtable, needs more info, type: feature request

Comments

@igorbernstein2
Contributor

Long-lived connections are sometimes disconnected via an RST_STREAM frame. These should be handled similarly to UNAVAILABLE. We also need to make sure that GOAWAYs are handled similarly.

This is the same issue as googleapis/java-bigtable-hbase#2177

@igorbernstein2 igorbernstein2 self-assigned this Jun 19, 2019
@sduskis
Contributor

sduskis commented Jun 19, 2019

Did this come from Dataflow? If so, which version of gRPC was used?

@igorbernstein2
Contributor Author

Not sure if this is Dataflow, but the customer said that this happened during batch jobs using the HBase client 1.11. I'll try to replicate it next week.

@igorbernstein2
Contributor Author

I think we should try to remap this to an UNAVAILABLE error so that it's similar to the more common GFE reset:

UNAVAILABLE: io exception
io.grpc.StatusRuntimeException: UNAVAILABLE: io exception
	at io.grpc.Status.asRuntimeException(Status.java:533)
	at io.grpc.stub.ClientCalls$BlockingResponseStream.hasNext(ClientCalls.java:584)
	at Main.main(Main.java:36)
Caused by: io.grpc.netty.shaded.io.netty.channel.unix.Errors$NativeIoException: syscall:read(..) failed: Connection reset by peer
	at io.grpc.netty.shaded.io.netty.channel.unix.FileDescriptor.readAddress(..)(Unknown Source)
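
The remapping proposed above could look roughly like the following. This is a minimal sketch and not the client's actual fix: `Code`, `Status`, and `RstStreamRemapper` are hypothetical stand-ins for the gRPC status types so the example runs with only the JDK, and matching on the description text `Received Rst Stream` is an assumption based on the error message in this issue's title.

```java
// Hypothetical stand-ins for the gRPC status types, so the sketch needs only the JDK.
enum Code { OK, INTERNAL, UNAVAILABLE }

final class Status {
    final Code code;
    final String description;

    Status(Code code, String description) {
        this.code = code;
        this.description = description;
    }
}

final class RstStreamRemapper {
    // Assumption: the stream reset is recognizable by this text in the status description.
    private static final String RST_MARKER = "Received Rst Stream";

    // Returns an UNAVAILABLE status for an INTERNAL status that carries the RST marker;
    // any other status is returned unchanged.
    static Status remap(Status status) {
        if (status.code == Code.INTERNAL
                && status.description != null
                && status.description.contains(RST_MARKER)) {
            return new Status(Code.UNAVAILABLE, status.description);
        }
        return status;
    }
}
```

With a remap like this in place, retry logic that already treats UNAVAILABLE as transient would pick up the RST case without further changes. String matching on the description is brittle, which is one reason the fix might belong in the gRPC layer instead.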

@kmjung

kmjung commented Oct 3, 2019

Any updates here?

@igorbernstein2
Contributor Author

We haven't been able to reproduce this, so I'm not sure if this should be paved over in the Bigtable client or if the fix needs to be pushed down to the gRPC layer. Is this something you are experiencing? If so, can you offer some details as to when you are experiencing this issue?

@chingor13 chingor13 transferred this issue from googleapis/google-cloud-java Dec 4, 2019
@chingor13 chingor13 added type: bug Error or flaw in code with unintended results or allowing sub-optimal usage patterns. priority: p2 Moderately-important priority. Fix may not be included in next release. labels Dec 4, 2019
@yoshi-automation yoshi-automation added the 🚨 This issue needs some love. label Dec 16, 2019
@kolea2 kolea2 added needs more info This issue needs more information from the customer to proceed. type: feature request ‘Nice-to-have’ improvement, new feature or different behavior or design. and removed 🚨 This issue needs some love. priority: p2 Moderately-important priority. Fix may not be included in next release. type: bug Error or flaw in code with unintended results or allowing sub-optimal usage patterns. labels Dec 23, 2019
@google-cloud-label-sync google-cloud-label-sync bot added the api: bigtable Issues related to the googleapis/java-bigtable API. label Jan 29, 2020
@shikhar

shikhar commented May 15, 2020

We are experiencing a "goaway", which is mentioned above in a comment. This is after the process has been running for over an hour.

Caused by: com.google.api.gax.rpc.UnavailableException: io.grpc.StatusRuntimeException: UNAVAILABLE: HTTP/2 error code: NO_ERROR
Received Goaway
max_age
        at com.google.api.gax.rpc.ApiExceptionFactory.createException(ApiExceptionFactory.java:69)
        at com.google.api.gax.grpc.GrpcApiExceptionFactory.create(GrpcApiExceptionFactory.java:72)
        at com.google.api.gax.grpc.GrpcApiExceptionFactory.create(GrpcApiExceptionFactory.java:60)
        ... 23 more

We are performing checkAndMutateRow operations. Are we being thrown this error because checkAndMutateRow is considered non-idempotent (via BigtableStubSettings), and so is not retried internally? When such an error happens, can it be safely concluded that the operation was not attempted at all, so that the library could retry it safely?
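
To make the question concrete, here is a small model of per-method retry settings. It is only a sketch of the hypothesis in this comment: `RpcCode`, `RpcMethod`, and the chosen code sets are hypothetical and are not the real BigtableStubSettings API or its actual defaults.

```java
import java.util.EnumSet;
import java.util.Set;

// Hypothetical subset of status codes, for illustration only.
enum RpcCode { UNAVAILABLE, DEADLINE_EXCEEDED, INTERNAL }

// Hypothetical model of per-method retry settings.
enum RpcMethod {
    // Assumption: reads are idempotent, so some transient codes may be retried.
    READ_ROWS(EnumSet.of(RpcCode.UNAVAILABLE, RpcCode.DEADLINE_EXCEEDED)),
    // Assumption: conditional writes are treated as non-idempotent, so nothing is retried.
    CHECK_AND_MUTATE_ROW(EnumSet.noneOf(RpcCode.class));

    private final Set<RpcCode> retryableCodes;

    RpcMethod(Set<RpcCode> retryableCodes) {
        this.retryableCodes = retryableCodes;
    }

    // True when this method's settings allow an automatic retry for the given code.
    boolean shouldRetry(RpcCode code) {
        return retryableCodes.contains(code);
    }
}
```

Under this model, an UNAVAILABLE caused by a GOAWAY is retried for reads but surfaces to the caller for checkAndMutateRow, which would match the behavior described here. Whether the request reached the service before the connection closed is a separate question that this model does not answer.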

@Koriit

Koriit commented Aug 10, 2020

We are facing the same issue as @shikhar with checkAndMutateRow operations. However, we have a retry mechanism on top of the BT call and our logic depends on the return value (whether the mutation was applied or not; duplicated mutation would return "not applied"). In the several cases that I checked after several of our retries, the mutation was finally applied without duplication.

However, I would much appreciate it if there were no errors or at least if the error was more descriptive. I would be more than happy to take some measures to prevent it from happening in our setup but at the moment I am clueless as to what is the cause.
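
For anyone wanting a similar workaround, an application-level retry of the kind described above might be sketched as follows. `TransientRpcException` and `BoundedRetry` are hypothetical names, and the sketch assumes the caller has already translated the client's UNAVAILABLE error into that exception type.

```java
import java.util.function.Supplier;

// Hypothetical marker for errors the caller considers transient
// (for example a GOAWAY surfaced as UNAVAILABLE).
final class TransientRpcException extends RuntimeException {
    TransientRpcException(String message) {
        super(message);
    }
}

final class BoundedRetry {
    // Runs the call up to maxAttempts times, sleeping with exponential backoff
    // after each transient failure. Non-transient exceptions propagate immediately.
    static <T> T call(Supplier<T> rpc, int maxAttempts, long initialBackoffMillis) {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        long backoff = initialBackoffMillis;
        for (int attempt = 1; ; attempt++) {
            try {
                return rpc.get();
            } catch (TransientRpcException e) {
                if (attempt >= maxAttempts) {
                    throw e; // Out of attempts: surface the last transient error.
                }
                try {
                    Thread.sleep(backoff);
                } catch (InterruptedException ie) {
                    Thread.currentThread().interrupt();
                    throw new IllegalStateException("interrupted while backing off", ie);
                }
                backoff *= 2;
            }
        }
    }
}
```

Retrying a conditional mutation this way is only safe when the application can interpret a "not applied" result on a retry as a possible duplicate of an earlier attempt that did succeed, as described in this comment.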

@mutianf
Contributor

mutianf commented Jan 27, 2021

GOAWAY is a different issue from the "Received Rst Stream" error. For checkAndMutateRow, we'll need to investigate whether the requests ever reached the Cloud Bigtable service when these errors happen.

@naveg

naveg commented May 19, 2021

We are still seeing this issue using the latest Bigtable library. I can provide a list of timestamps and a GCP Bigtable instance if that is helpful.

StatusRuntimeException: INTERNAL: HTTP/2 error code: INTERNAL_ERROR
Received Rst Stream
    at io.grpc.Status.asRuntimeException(Status.java:521)
    at com.google.cloud.bigtable.grpc.async.AbstractRetryingOperation.onError(AbstractRetryingOperation.java:210)
    at com.google.cloud.bigtable.grpc.async.AbstractRetryingOperation.onClose(AbstractRetryingOperation.java:175)
    at com.google.cloud.bigtable.grpc.scanner.RetryingReadRowsOperation.onClose(RetryingReadRowsOperation.java:211)
    at com.google.cloud.bigtable.grpc.io.ChannelPool$InstrumentedChannel$2.onClose(ChannelPool.java:210)
...
(26 additional frame(s) were not displayed)

IOExceptionWithStatus: Error in response stream
    at com.google.cloud.bigtable.grpc.scanner.ResultQueueEntry$ExceptionResultQueueEntry.getResponseOrThrow(ResultQueueEntry.java:100)
    at com.google.cloud.bigtable.grpc.scanner.ResponseQueueReader.getNextMergedRow(ResponseQueueReader.java:105)
    at com.google.cloud.bigtable.grpc.scanner.ResponseQueueReader.getNextMergedRow(ResponseQueueReader.java:111)
    at com.google.cloud.bigtable.grpc.scanner.ResumingStreamingResultScanner.next(ResumingStreamingResultScanner.java:80)
    at com.google.cloud.bigtable.grpc.scanner.ResumingStreamingResultScanner.next(ResumingStreamingResultScanner.java:35)
...
(24 additional frame(s) were not displayed)

RuntimeException: com.google.cloud.bigtable.grpc.io.IOExceptionWithStatus: Error in response stream
    at org.apache.hadoop.hbase.client.AbstractClientScanner$1.hasNext(AbstractClientScanner.java:95)
    at com.alloymetrics.service.compute.query.aggregation.AggregationQuerySpliterator.tryAdvance(AggregationQuerySpliterator.java:166)
    at java.util.stream.StreamSpliterators$WrappingSpliterator.lambda$initPartialTraversalState$0(StreamSpliterators.java:294)
    at java.util.stream.StreamSpliterators$AbstractWrappingSpliterator.fillBuffer(StreamSpliterators.java:206)
    at java.util.stream.StreamSpliterators$AbstractWrappingSpliterator.doAdvance(StreamSpliterators.java:169)
...
(18 additional frame(s) were not displayed)

com.google.cloud.bigtable.grpc.io.IOExceptionWithStatus: Error in response stream

@mutianf
Contributor

mutianf commented May 26, 2021

Working on a fix for client-core: googleapis/java-bigtable-hbase#3000
