Retryable writes spec does not mention the time limits within which driver should retry a write #842

sar-gup · 2020-08-05T19:53:40Z

The driver specification for retryable writes (retryable-writes.rst) does not mention a time limit within which the mongo driver should issue a retry of a write (if applicable). The spec only limits the number of times a write can be retried (once).

Does this mean that it is legal for a driver to issue the retry after waiting for an arbitrary amount of time?

p-mongo · 2020-08-05T20:11:59Z

Read and write retries generally do not have the kind of forced delay that you seem to be alluding to. The retries are performed as soon as they can be performed.

For example, if you are reading from a secondary, and one secondary goes down but another one is available, the driver would immediately retry the read on the available secondary.

There are various conditions that may result in an application-perceived delay:

- There isn't a suitable server to which to send the query. For example, if performing primary reads or writes and the primary becomes unavailable, the driver would wait until a primary is available again (up to the server selection timeout) before retrying.
- There may not be (enough or any) existing connections established to the server selected for the retry. If so, the retry needs to wait until there is an established (and authenticated, if appropriate) connection. If the server selected is at connection pool capacity, the retry may need to wait for some other operation to complete.

But in these situations, there isn't a forced delay added by the driver.
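The behavior described above can be sketched roughly as follows. This is a toy model, not any driver's actual code: `select_server`, `run_with_one_retry`, `NetworkError`, and `NoSuitableServer` are hypothetical names invented for illustration. The point it tries to show is that the only waiting happens inside server selection, and nothing sleeps between the failure and the retry.

```python
import time


class NoSuitableServer(Exception):
    """Hypothetical error: server selection timed out."""


class NetworkError(Exception):
    """Hypothetical error: the attempt failed in a retryable way."""


def select_server(find_server, timeout_s, poll_s=0.01):
    """Poll find_server() until it returns a server or timeout_s elapses."""
    deadline = time.monotonic() + timeout_s
    while True:
        server = find_server()
        if server is not None:
            return server
        if time.monotonic() >= deadline:
            raise NoSuitableServer("server selection timed out")
        time.sleep(poll_s)


def run_with_one_retry(operation, find_server, selection_timeout_s=30.0):
    """Run operation(server); on a NetworkError, retry exactly once."""
    server = select_server(find_server, selection_timeout_s)
    try:
        return operation(server)
    except NetworkError:
        # No forced delay here: the retry starts as soon as a server
        # can be selected again. Any wait happens inside select_server.
        server = select_server(find_server, selection_timeout_s)
        return operation(server)


# Example: the first secondary fails, the second one serves the retry.
calls = []


def flaky_read(server):
    calls.append(server)
    if len(calls) == 1:
        raise NetworkError("connection reset")
    return "ok via " + server


available = iter(["secondary-a", "secondary-b"])
result = run_with_one_retry(flaky_read, lambda: next(available))
```

If no server ever becomes available, `select_server` raises after the selection timeout, which is the application-perceived delay mentioned in the first bullet.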

divjotarora · 2020-08-05T20:21:33Z

@p-mongo's response covers the current state of drivers very well. To add onto it, there is a drivers specification in progress regarding client-side timeouts. Part of this specification will include a forced backoff period during retries to avoid spamming the server with consecutive requests.

sar-gup · 2020-08-06T00:03:33Z

Thanks for the responses @p-mongo and @divjotarora.

My concern is more around cases when the client side cannot ensure that the retry request is sent as soon as a write errors out. Here's an example scenario:

T = 0sec
Retry writes enabled. Initial write request sent. TransactionId = Tx

T = 30sec
TCP response not received. Client throws a timeout exception.
At this point, the client will try to retry the write.

Assume the client application freezes for 20 mins.

T = 10000sec
Client retries the write request. TransactionId = Tx.

This request might end up processing the write again at the backend, thus violating the write-once semantics.
To be safe in such scenarios, shouldn't drivers enforce some time limit on retries?

p-mongo · 2020-08-06T02:00:09Z

"Write once" means the write isn't performed multiple times. Specifically that refers to the scenario when the first write succeeded and the server responded with the success, but the response got lost due to e.g. a network problem and subsequently the client retried the write. In this case (because the same transaction number is used) the server would know that the write was already performed and won't perform it again.
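As a rough illustration of the deduplication described above, here is a toy in-memory model. `ToyServer` is a hypothetical class and says nothing about how the real server stores its history. It only shows the idea that a retry carrying the same session id and transaction number gets the original result back instead of being applied again.

```python
class ToyServer:
    """Toy model: remembers the result of each (lsid, txn_number) pair."""

    def __init__(self):
        self.documents = []
        self.history = {}  # (lsid, txn_number) -> result of the first attempt

    def insert(self, lsid, txn_number, document):
        key = (lsid, txn_number)
        if key in self.history:
            # Same session and transaction number: this is a retry, so
            # return the stored result without applying the write again.
            return self.history[key]
        self.documents.append(document)
        result = {"ok": 1, "n": 1}
        self.history[key] = result
        return result


server = ToyServer()
first = server.insert("session-1", 1, {"x": 1})   # response is "lost"
retry = server.insert("session-1", 1, {"x": 1})   # client retries
fresh = server.insert("session-1", 2, {"x": 2})   # a genuinely new write
```

After these three calls the toy server holds two documents, not three, because the retry was recognized.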

With regard to your scenario, different programming languages provide various facilities to impose time limits on operations. Both Go and Ruby, to my knowledge, provide general-purpose timeouts, for example. As Divjot mentioned, work is also in progress to provide similar operation-level timeout functionality for driver operations.
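For the frozen-client scenario, an application-level guard could look something like the sketch below. This is not driver behavior and not part of any spec: `write_with_bounded_retry`, `RetryWindowExceeded`, and the `retry_window_s` parameter are hypothetical names, and the built-in `ConnectionError` stands in for whatever timeout error a real client would raise.

```python
import time


class RetryWindowExceeded(Exception):
    """Hypothetical error: the retry would start too late to be trusted."""


def write_with_bounded_retry(attempt_write, retry_window_s, clock=time.monotonic):
    """Try attempt_write(); retry once, but only inside retry_window_s."""
    started = clock()
    try:
        return attempt_write()
    except ConnectionError as first_error:
        elapsed = clock() - started
        if elapsed > retry_window_s:
            # Too much time has passed (e.g. the process was frozen),
            # so surface the failure instead of retrying blindly.
            raise RetryWindowExceeded(
                f"retry skipped after {elapsed:.0f}s"
            ) from first_error
        return attempt_write()
```

The `clock` argument exists only so the elapsed time can be simulated. In the scenario above, a clock that jumps from 0 to 10000 seconds would make the function raise `RetryWindowExceeded` rather than send the second attempt.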

sar-gup · 2020-08-07T15:43:18Z

@p-mongo, I'm referring to the same scenario that you mentioned, where the request was processed by the server but the response couldn't reach the client:

> "Write once" means the write isn't performed multiple times. Specifically that refers to the scenario when the first write succeeded and the server responded with the success, but the response got lost due to e.g. a network problem and subsequently the client retried the write. In this case (because the same transaction number is used) the server would know that the write was already performed and won't perform it again.

After the client retries, will the server be able to tell that the write was already performed even if the retried request reaches the server after an interval of 1 hour?

p-mongo · 2020-08-07T17:15:52Z

That is my impression, but for practical reasons we don't have spec test coverage for this specific scenario.

You could hand craft a write command and send it via the generic command helper and then rerun the command an hour later with the same txnum to see what would happen.
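A sketch of what such a hand-crafted command document might look like, using plain Python values. `build_retryable_insert` and the collection name `test` are made up for illustration. The field names `lsid` and `txnNumber` are the ones retryable writes use, but a real driver would encode the session id as a BSON UUID and the transaction number as a 64-bit integer, and how a given driver's command helper treats a caller-supplied `lsid` is not covered here.

```python
import uuid


def build_retryable_insert(collection, documents, session_id, txn_number):
    """Build an insert command document carrying lsid and txnNumber."""
    return {
        "insert": collection,          # command name goes first
        "documents": documents,
        "lsid": {"id": session_id},    # session the write belongs to
        "txnNumber": txn_number,       # must be identical on the retry
    }


session_id = uuid.uuid4()
first_attempt = build_retryable_insert("test", [{"_id": 1}], session_id, 1)

# An hour later, the retry is the same document: same lsid, same txnNumber.
retry_attempt = build_retryable_insert("test", [{"_id": 1}], session_id, 1)
```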

jmikola · 2020-08-07T18:22:12Z

> You could hand craft a write command and send it via the generic command helper and then rerun the command an hour later with the same txnum to see what would happen.

FYI, the session for that retryable write (i.e. lsid) would still need to be active on the server. Otherwise, the retry attempt would be indistinguishable from a new write. If the write has previously been committed, a successful retry (which would be a no-op with the server returning the original write result) might also depend on the oplog retention.
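To make the session dependency concrete, here is a toy model in which the server forgets a session's write history once the session has been idle longer than a timeout. `ToyServerWithSessions` is hypothetical, the 30-minute default is an assumption chosen for the example, and neither the real server's session reaping schedule nor oplog retention is modeled.

```python
class ToyServerWithSessions:
    """Toy model: write history lives only as long as the session does."""

    def __init__(self, session_timeout_s=30 * 60):  # assumed default
        self.session_timeout_s = session_timeout_s
        self.documents = []
        self.sessions = {}  # lsid -> {"last_used": float, "history": dict}

    def insert(self, lsid, txn_number, document, now):
        session = self.sessions.get(lsid)
        if session is not None:
            idle = now - session["last_used"]
            if idle > self.session_timeout_s:
                session = None  # expired: its history is gone
        if session is None:
            session = {"last_used": now, "history": {}}
            self.sessions[lsid] = session
        session["last_used"] = now
        if txn_number in session["history"]:
            # Known retry: no-op, return the original result.
            return session["history"][txn_number]
        # Unknown transaction number: indistinguishable from a new write.
        self.documents.append(document)
        result = {"ok": 1, "n": 1}
        session["history"][txn_number] = result
        return result


server = ToyServerWithSessions()
server.insert("session-1", 1, {"x": 1}, now=0)
server.insert("session-1", 1, {"x": 1}, now=60)       # retry after 1 minute
count_after_quick_retry = len(server.documents)

server.insert("session-1", 1, {"x": 1}, now=60 + 3600)  # retry 1 hour later
count_after_late_retry = len(server.documents)
```

In this model the quick retry is a no-op, while the retry an hour later lands after the session has expired and is applied as a second write.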
