SpannerSampleIT Deadline_Exceeded #2571

Closed
lesv opened this issue Apr 5, 2020 · 3 comments · Fixed by googleapis/java-spanner#141
Labels
api: spanner Issues related to the Spanner API. priority: p1 Important issue which blocks shipping the next release. Will be fixed prior to next release. type: bug Error or flaw in code with unintended results or allowing sub-optimal usage patterns.

Comments

@lesv
Contributor

lesv commented Apr 5, 2020

com.google.cloud.spanner.SpannerException: DEADLINE_EXCEEDED: io.grpc.StatusRuntimeException: DEADLINE_EXCEEDED: Deadline expired before operation could complete.
	at com.example.spanner.SpannerSampleIT.runSample(SpannerSampleIT.java:60)
	at com.example.spanner.SpannerSampleIT.testSample(SpannerSampleIT.java:268)
Caused by: io.grpc.StatusRuntimeException: DEADLINE_EXCEEDED: Deadline expired before operation could complete.
@lesv lesv added type: bug Error or flaw in code with unintended results or allowing sub-optimal usage patterns. priority: p1 Important issue which blocks shipping the next release. Will be fixed prior to next release. api: spanner Issues related to the Spanner API. labels Apr 5, 2020
@lesv lesv assigned dmitry-s and skuruppu and unassigned nnegrey Apr 5, 2020
@lesv
Contributor Author

lesv commented Apr 5, 2020

Found in PR #2563

@skuruppu
Contributor

skuruppu commented Apr 6, 2020

@olavloite, would you please be able to take a look? Thanks very much.

@olavloite
Contributor

I've seen similar flakes in other test cases as well, and the team developing the C++ client for Spanner is running into a similar issue. The problem is that the CreateBackup and CreateDatabase operations are non-idempotent, so the client library does not retry them automatically. They can therefore sometimes fail with DEADLINE_EXCEEDED and/or UNAVAILABLE when there are temporary network problems or other transient issues.

I'll have a look and see whether we can fix this in the client library using approximately the following logic:

  1. In case of a DEADLINE_EXCEEDED or UNAVAILABLE error for these Create... calls, the client library will query the backend for the corresponding long-running operation for the call.
  2. If the long-running operation is found, the call will return an OperationFuture for the operation.
  3. If the long-running operation was not found, the Create... call will be retried.
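
The three steps above could be sketched roughly as follows. This is only an illustration of the control flow, not the client library's code: `LroRetrySketch`, `TransientRpcException`, `createWithResume`, and the `maxAttempts` bound are all hypothetical names introduced here, and the two suppliers stand in for the real Create... RPC and the backend lookup of the long-running operation.

```java
import java.util.Optional;
import java.util.function.Supplier;

/** Illustrative sketch only; every name in this class is hypothetical. */
final class LroRetrySketch {

  /** Stand-in for a transient failure such as DEADLINE_EXCEEDED or UNAVAILABLE. */
  public static class TransientRpcException extends RuntimeException {
    public TransientRpcException(String message) {
      super(message);
    }
  }

  /**
   * Runs a non-idempotent create call. On a transient error it first looks for an
   * operation that the failed call may already have started (steps 1 and 2) and
   * only retries the create call when none is found (step 3).
   *
   * @param createCall the Create... call, returning a handle to the operation
   * @param findExistingOperation lookup of the operation started by an earlier attempt
   * @param maxAttempts assumed upper bound on create attempts (not from the issue)
   */
  public static <T> T createWithResume(
      Supplier<T> createCall,
      Supplier<Optional<T>> findExistingOperation,
      int maxAttempts) {
    if (maxAttempts < 1) {
      throw new IllegalArgumentException("maxAttempts must be at least 1");
    }
    TransientRpcException lastError = null;
    for (int attempt = 1; attempt <= maxAttempts; attempt++) {
      try {
        return createCall.get();
      } catch (TransientRpcException e) {
        lastError = e;
        // Step 1: the call may have succeeded on the server even though the
        // client saw an error, so look for the operation before retrying.
        Optional<T> existing = findExistingOperation.get();
        if (existing.isPresent()) {
          // Step 2: resume the existing operation instead of creating a second one.
          return existing.get();
        }
        // Step 3: no operation was found, so the loop retries the create call.
      }
    }
    throw lastError;
  }
}
```

Checking for an existing operation before retrying is what makes the retry safe for a non-idempotent call: a blind retry could otherwise try to create the same database or backup twice.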

olavloite added a commit to googleapis/java-spanner that referenced this issue Apr 7, 2020
RPCs returning a long-running operation, such as CreateDatabase,
CreateBackup and RestoreDatabase, are non-idempotent and cannot be
retried automatically by gax. This means that these RPCs sometimes fail
with transient errors, such as UNAVAILABLE or DEADLINE_EXCEEDED. This
change introduces automatic retries of these RPCs using the following
logic:
1. Execute the RPC and wait for the operation to be returned.
2. If a transient error occurs while waiting for the operation, the
   client library queries the backend for the corresponding operation.
   If the operation is found, the client library resumes tracking the
   existing operation and returns it to the user.
3. If no corresponding operation is found in step 2, the client library
   retries the RPC from step 1.

Fixes GoogleCloudPlatform/java-docs-samples#2571
olavloite added a commit to googleapis/java-spanner that referenced this issue Apr 10, 2020
olavloite added a commit to googleapis/java-spanner that referenced this issue Apr 13, 2020