Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

spanner: ReadWriteTransaction reties on io.ErrUnexpectedEOF #1384

Closed
ashi009 opened this issue Apr 9, 2019 · 7 comments
Closed

spanner: ReadWriteTransaction reties on io.ErrUnexpectedEOF #1384

ashi009 opened this issue Apr 9, 2019 · 7 comments
Assignees
Labels
api: spanner Issues related to the Spanner API. type: question Request for information or clarification. Not an issue.

Comments

@ashi009
Copy link

ashi009 commented Apr 9, 2019

ReadWriteTransaction function returned io.ErrUnexpectedEOF will cause spanner client to retry the transaction repeatedly.

_, err := client.ReadWriteTransaction(ctx, func(ctx context.Context, txn *spanner.ReadWriteTransaction) error {
    var b []byte
    row, err := txn.ReadRow(ctx, "Accounts", spanner.Key{"alice"}, []string{"proto"})
    if err != nil {
        return err
    }
    if err := row.Column(0, &b); err != nil {
        return err
    }
 	var msg pb.Message
	if err := proto.Unmarshal(b, &msg); err != nil {
        return err // which might be io.ErrUnexpectedEOF
    }
	...
    return nil
})

Even wrapping the returned error will cause it to retry, due to the way isRetryable works:

func isErrorUnexpectedEOF(err error) bool {
if err == nil {
return false
}
// Unexpected EOF is a transport layer issue that could be recovered by
// retries. The most likely scenario is a flaky RecvMsg() call due to
// network issues.
// For grpc version >= 1.14.0, the error code is Internal.
// (https://github.com/grpc/grpc-go/releases/tag/v1.14.0)
if ErrCode(err) == codes.Internal && strings.Contains(ErrDesc(err), "unexpected EOF") {
return true
}
// For grpc version < 1.14.0, the error code in Unknown.
if ErrCode(err) == codes.Unknown && strings.Contains(ErrDesc(err), "unexpected EOF") {
return true
}
return false
}

@jeanbza
Copy link
Member

jeanbza commented Apr 9, 2019

Could you provide a little more information? Which method returned io.ErrUnexpectedEOF, and why should this not be retried?

@jeanbza jeanbza self-assigned this Apr 9, 2019
@jeanbza jeanbza added api: spanner Issues related to the Spanner API. needs more info This issue needs more information from the customer to proceed. type: question Request for information or clarification. Not an issue. labels Apr 9, 2019
@ashi009
Copy link
Author

ashi009 commented Apr 10, 2019

porto.Unmarshal might. It's necessary to do so within an RWT to manipulate it atomically if the data stored in spanner is a proto blob.

@jeanbza jeanbza removed the needs more info This issue needs more information from the customer to proceed. label Apr 15, 2019
@jeanbza jeanbza assigned olavloite and unassigned jeanbza Apr 23, 2019
@olavloite
Copy link
Contributor

@ashi009 Wrapping the error in a custom error with a custom error message should work. Running the following test works for me, in the sense that the transaction does not retry but fails with the custom error:

func TestReadWriteTransactionCustomError(t *testing.T) {
// setup test client
....

  _, err = client.ReadWriteTransaction(ctx,
      func(ctx context.Context, transaction *spanner.ReadWriteTransaction) error {
    _, err = unmarshal()
    if err == io.ErrUnexpectedEOF {
      err = &CustomError{
        Description:"Custom error",
        Err:io.ErrUnexpectedEOF,
      }
    }
    if err != nil {
      return err
    }
    return nil
  })
  if err == nil {
    t.Fatalf("missing exception")
  }
  if err.Error() != "Custom error" {
    t.Fatalf("unexpected exception: %s, expected %s", err.Error(), "Custom error")
  }
  e, ok := err.(*CustomError)
  if !ok {
    t.Fatalf("expected CustomError")
  }
  if e.Err != io.ErrUnexpectedEOF {
    t.Fatalf("expected io.ErrUnexpectedEOF as underlying error")
  }
}

type CustomError struct {
  Description string
  Err error
}

func (e *CustomError) Error() string {
  return e.Description
}

func unmarshal() (int, error) {
  return 0, io.ErrUnexpectedEOF
}

Running the same test without wrapping io.ErrUnexpectedEOF with a custom error message does indeed cause the transaction to retry indefinitely. That is however as expected, as this specific error is treated by the Spanner library as a retryable error code because it is an error that might happen at the transport level of gRPC.

@ashi009
Copy link
Author

ashi009 commented Apr 29, 2019

@olavloite This is indeed a working solution. However, it's unintuitive, as the convention for wrapping an error is to prepend the context. In that case, the "unexpected EOF" message will always present in the final error message and trigger the retry.

@broady
Copy link
Contributor

broady commented May 2, 2019

Might want to remove the bottom part of this retry logic, which seems obsolete:

// For grpc version >= 1.14.0, the error code is Internal.
// (https://github.com/grpc/grpc-go/releases/tag/v1.14.0)
if ErrCode(err) == codes.Internal && strings.Contains(ErrDesc(err), "unexpected EOF") {
return true
}
// For grpc version < 1.14.0, the error code in Unknown.
if ErrCode(err) == codes.Unknown && strings.Contains(ErrDesc(err), "unexpected EOF") {
return true
}

@ashi009
Copy link
Author

ashi009 commented May 3, 2019 via email

gopherbot pushed a commit that referenced this issue Jul 3, 2019
This change contains the following global changes:
1. spanner.Client uses the generated gapic client for gRPC
calls, instead of a gRPC connection and a
spannerpb.SpannerClient.
2. The gapic client uses the default gax retry logic.
3. Most custom retry logic has been removed, except:
   * retry on aborted transactions
   * retry for resumableStreamDecoder.next()

The change also includes an in-memory Spanner server for test
purposes. The server requires the user to mock the result of queries
and update statements. Sessions and transactions are handled
automatically. It also allows the user to register specific errors
to be returned for each gRPC function.
This test server makes it easier to develop test cases that verify
the behavior of the client library for an entire transaction for
situations that cannot easily be created in an integration test
using a real Cloud Spanner instance, such as aborted transactions
or temporary retryable errors.
The test cases can use the standard Spanner client withouth the
need to mock any of the server functions, other than specifying the
results for queries and updates.

Fixes #1418 and #1384

Change-Id: If0a8bbed50b512b32d73a8ef7ad74cdb1192294b
Reviewed-on: https://code-review.googlesource.com/c/gocloud/+/41131
Reviewed-by: kokoro <noreply+kokoro@google.com>
Reviewed-by: Jean de Klerk <deklerk@google.com>
@olavloite
Copy link
Contributor

This issue was also solved with the change mentioned above.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
api: spanner Issues related to the Spanner API. type: question Request for information or clarification. Not an issue.
Projects
None yet
Development

No branches or pull requests

4 participants