rgw: RGWMetaSyncShardControlCR retries with backoff on all error codes #13546
RGWBackoffControlCR only treats EBUSY and EAGAIN as 'temporary' error
to RGWMetaSyncShardControlCR, a 'fatal' error means that no further sync
this changes RGWMetaSyncShardControlCR to set exit_on_error to false, so
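To illustrate the behavior described above, here is a minimal sketch of a retry loop with an `exit_on_error` switch. It is not the actual RGWBackoffControlCR code: `is_temporary_error`, `run_with_retry`, and the `max_attempts` bound are hypothetical names introduced for this example, and it assumes operations return negative errno values on failure.

```cpp
#include <cerrno>
#include <functional>

// Hypothetical helper: only EBUSY and EAGAIN count as 'temporary' errors.
inline bool is_temporary_error(int ret) {
  return ret == -EBUSY || ret == -EAGAIN;
}

// Hypothetical retry loop. Runs `op` until it succeeds, or until a
// non-temporary error is seen while exit_on_error is true.
// max_attempts only exists to keep this sketch bounded.
inline int run_with_retry(const std::function<int()>& op,
                          bool exit_on_error,
                          int max_attempts) {
  int ret = -EINVAL;  // returned if max_attempts <= 0
  for (int attempt = 0; attempt < max_attempts; ++attempt) {
    ret = op();
    if (ret >= 0) {
      return ret;  // success
    }
    if (exit_on_error && !is_temporary_error(ret)) {
      return ret;  // 'fatal' error: give up immediately
    }
    // a real implementation would sleep with backoff before retrying
  }
  return ret;
}
```

With `exit_on_error` set to true, an operation that fails with `-EIO` is abandoned after one call; with it set to false, the same operation is retried until it succeeds or the attempt bound is reached.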
@yehudasa can you think of any cases where we really do want to give up on a shard?
the only case i can think of is for #13070, where we need to signal that we're done with the current period - but i'd rather make that an explicit part of the
there was some discussion about the risks of continuing to retry on non-transient errors. i'd like to find a way to address these while keeping the ability to recover from errors at this level
one risk is related to
are there other issues that need to be addressed before we can safely retry on all errors?
we might also want to raise the maximum backoff time above 30 seconds to deal with extreme cases
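To see what raising the maximum would change, here is a small sketch of a capped backoff schedule. It assumes the wait starts at 1 second and doubles on each retry up to the cap; that doubling policy and the `backoff_schedule` helper are assumptions made for this example, not a copy of the rgw backoff code.

```cpp
#include <algorithm>
#include <vector>

// Hypothetical helper: returns the wait (in seconds) before each retry,
// starting at 1 second and doubling until it reaches max_secs.
inline std::vector<int> backoff_schedule(int retries, int max_secs) {
  std::vector<int> waits;
  int cur = 0;
  for (int i = 0; i < retries; ++i) {
    int next = (cur == 0) ? 1 : cur * 2;
    cur = std::min(next, max_secs);  // never wait longer than the cap
    waits.push_back(cur);
  }
  return waits;
}
```

Under these assumptions, `backoff_schedule(7, 30)` yields waits of 1, 2, 4, 8, 16, 30, 30 seconds, so the 30 second cap is reached on the sixth retry. A larger cap would let a persistently failing shard back off further instead of retrying every 30 seconds indefinitely.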