Storage client write requests should retry on "node not found" errors #3424

Closed
ptrus opened this issue Oct 16, 2020 · 0 comments · Fixed by #3425
Assignees
Labels
c:bug Category: bug c:storage Category: storage

Comments

@ptrus
Member

ptrus commented Oct 16, 2020

The storage write client already retries on some errors, such as ErrPreviousVersionMismatch and ErrRootNotFound, but it currently treats ErrNodeNotFound as permanent. That error should also be retried:

var rerr error
resp, rerr = fn(ctx, api.NewStorageClient(conn.ClientConn), conn.Node)
switch {
case status.Code(rerr) == codes.Unavailable:
	// Storage node may be temporarily unavailable.
	return rerr
case status.Code(rerr) == codes.PermissionDenied:
	// Writes can fail around an epoch transition due to policy errors.
	return rerr
case errors.Is(rerr, api.ErrPreviousVersionMismatch):
	// Storage node may not have yet processed the epoch transition.
	return rerr
case errors.Is(rerr, api.ErrRootNotFound):
	// Storage node may not have yet processed the epoch transition.
	return rerr
default:
	// All other errors are permanent.
	return backoff.Permanent(rerr)
}

TODO: Check the other storage API errors to see whether any of them should also be treated as non-permanent.
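
For illustration, here is a minimal, self-contained sketch of the classification this issue asks for. It is not the actual fix: the sentinel errors are local stand-ins for api.ErrPreviousVersionMismatch, api.ErrRootNotFound and api.ErrNodeNotFound, permanentError is a hypothetical stand-in for the wrapper produced by backoff.Permanent, and the gRPC status code cases are left out so the example needs only the standard library.

```go
package main

import (
	"errors"
	"fmt"
)

// Hypothetical stand-ins for the storage API sentinel errors named in the issue.
var (
	errPreviousVersionMismatch = errors.New("storage: previous version mismatch")
	errRootNotFound            = errors.New("storage: root not found")
	errNodeNotFound            = errors.New("storage: node not found")
	errOther                   = errors.New("storage: some other error")
)

// permanentError is a local stand-in (an assumption for this sketch) for the
// wrapper that backoff.Permanent returns to stop a retry loop.
type permanentError struct{ err error }

func (p *permanentError) Error() string { return p.err.Error() }
func (p *permanentError) Unwrap() error { return p.err }

// classify returns rerr unchanged when the write should be retried and wraps
// it in permanentError otherwise.
func classify(rerr error) error {
	switch {
	case rerr == nil:
		return nil
	case errors.Is(rerr, errPreviousVersionMismatch),
		errors.Is(rerr, errRootNotFound),
		errors.Is(rerr, errNodeNotFound): // the case this issue proposes adding
		return rerr
	default:
		// All other errors are permanent.
		return &permanentError{err: rerr}
	}
}

// isPermanent reports whether err was marked as non-retryable by classify.
func isPermanent(err error) bool {
	var p *permanentError
	return errors.As(err, &p)
}

func main() {
	wrapped := fmt.Errorf("write failed: %w", errNodeNotFound)
	fmt.Println(isPermanent(classify(wrapped)))  // prints false: retried
	fmt.Println(isPermanent(classify(errOther))) // prints true: permanent
}
```

Matching with errors.Is rather than == keeps the check working when the error has been wrapped with %w on its way up to the retry loop.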

@ptrus ptrus added c:storage Category: storage c:bug Category: bug labels Oct 16, 2020
@ptrus ptrus self-assigned this Oct 19, 2020