proposal: database/sql: add SetMaxBadConnRetries #52886

Open
jinzhu opened this issue May 13, 2022 · 17 comments · May be fixed by #52771

Comments

@jinzhu (Contributor) commented May 13, 2022

This method adds the ability to set the maximum number of retries when the driver returns driver.ErrBadConn.

Currently, database/sql hardcodes maxBadConnRetries to 2 and reruns the SQL statement if the driver returns a bad connection error, which causes problems in many cases.

For example, a long-running INSERT statement might be killed on the first attempt by a TCP timeout or some other reason, and the automatic retry then inserts duplicate data, which is confusing to end users.

With this option we can set the maximum number of bad-connection retries to 0 and write our own retry logic to avoid the issue. It also opens the door to optimizations like https://cacm.acm.org/magazines/2013/2/160173-the-tail-at-scale/fulltext

Related issues:

#48309
#15608

Implementation:

I have tried to submit a possible implementation: https://go-review.googlesource.com/c/go/+/404935
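
For illustration, a rough sketch of how the proposed setting might be used (the method name follows the CL and may change; the driver import and DSN are placeholders):

```go
package main

import (
	"database/sql"
	"log"

	_ "github.com/go-sql-driver/mysql" // any driver works; used here only as an example
)

func main() {
	db, err := sql.Open("mysql", "user:pass@tcp(127.0.0.1:3306)/app")
	if err != nil {
		log.Fatal(err)
	}
	defer db.Close()

	// Proposed setter from the CL: disable the automatic rerun on
	// driver.ErrBadConn so the application can implement its own
	// retry policy (e.g. hedged requests as in "The Tail at Scale").
	db.SetMaxBadConnRetries(0)
}
```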

@gopherbot added this to the Proposal milestone May 13, 2022
@jinzhu linked a pull request May 13, 2022 that will close this issue
@gopherbot commented May 13, 2022

Change https://go.dev/cl/404935 mentions this issue: database/sql: add SetMaxBadConnRetries

@ianlancetaylor (Contributor) commented May 13, 2022

CC @kardianos

@ianlancetaylor added this to Incoming in Proposals May 13, 2022
@kardianos (Contributor) commented May 18, 2022

To prevent a lock on max retries, I would prefer this to be set in OpenDB from the Connector. Maybe the connector could have an optional configuration interface. Unsure. However, the associated CL is not correct: it doesn't lock correctly, and I think it also changes the BadConn error.

Should we do this? Probably. I had ideally wanted to create a v2 sql package, but realistically my time is very limited. So if this were set when the DB is created, so that no lock is required, I'd be alright.

Another option would be to allow only two settings: 2 retries or no retries.

I'm okay with accepting this, on the condition that a good design can be found.
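
A sketch of the kind of optional interface that could carry this setting from the Connector (the interface name and method are hypothetical, not an accepted design):

```go
package sqlretry

import "database/sql/driver"

// MaxBadConnRetrier is a hypothetical optional interface. If a driver's
// Connector implemented it, sql.OpenDB could read the retry limit once,
// at creation time, so the query path would never need a lock.
type MaxBadConnRetrier interface {
	driver.Connector
	MaxBadConnRetries() int
}
```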

@rsc (Contributor) commented May 18, 2022

We can avoid the lock by setting this value atomically, I think.
So the API should be OK either way.
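
A minimal sketch of the lock-free approach, assuming the setter stores into an atomic field (field and method names are illustrative, not the final CL):

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// DB stands in for sql.DB; only the retry counter is shown.
type DB struct {
	// Read and written atomically, so the query path needs no mutex.
	// A negative value means "unset: keep the historical default of 2".
	numBadConnRetries int64
}

// SetMaxBadConnRetries is the proposed setter (name from the CL).
func (db *DB) SetMaxBadConnRetries(n int) {
	atomic.StoreInt64(&db.numBadConnRetries, int64(n))
}

func main() {
	db := &DB{numBadConnRetries: -1}
	db.SetMaxBadConnRetries(0)
	fmt.Println(atomic.LoadInt64(&db.numBadConnRetries)) // prints 0
}
```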

@rsc moved this from Incoming to Active in Proposals May 18, 2022
@rsc (Contributor) commented May 18, 2022

This proposal has been added to the active column of the proposals project
and will now be reviewed at the weekly proposal review meetings.
— rsc for the proposal review group

@jinzhu (Contributor, Author) commented May 19, 2022

Thank you for your review; I've fixed the lock issue with atomics.

@MatthewJamesBoyle commented May 21, 2022

This seems like a good proposal to me.

A couple of thoughts:

  • Does it make sense to have an upper bound on the number of retries a user is allowed to configure?
  • As far as I can see, the retry logic today does not perform any exponential backoff or jitter. That was likely fine for most use cases while the maximum was 2, but if users can set higher numbers here, we may want to do more to prevent a thundering-herd problem (a rough sketch follows this list).
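
A minimal sketch of exponential backoff with jitter, in case higher retry counts are allowed (the helper is illustrative, not part of database/sql):

```go
package main

import (
	"fmt"
	"math/rand"
	"time"
)

// backoff returns an exponentially growing delay with random jitter,
// capping the exponent so large attempt counts do not overflow.
func backoff(attempt int) time.Duration {
	if attempt > 6 {
		attempt = 6
	}
	base := 10 * time.Millisecond << attempt
	return base/2 + time.Duration(rand.Int63n(int64(base/2)))
}

func main() {
	for i := 0; i < 5; i++ {
		fmt.Println(backoff(i))
	}
}
```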

@rsc (Contributor) commented May 25, 2022

The CL would be a lot clearer if the code were not being refactored at the same time. Can you instead add a db.maxBadConnRetries() method and replace all the i < maxBadConnRetries with i < db.maxBadConnRetries()?
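
For illustration, the suggested refactor could look roughly like this, reusing an atomic field such as the one sketched earlier (db.numBadConnRetries is an assumed name, not the final CL):

```go
// maxBadConnRetries reports how many times a statement may be retried
// on driver.ErrBadConn before forcing a brand-new connection.
func (db *DB) maxBadConnRetries() int {
	if n := atomic.LoadInt64(&db.numBadConnRetries); n >= 0 {
		return int(n)
	}
	return 2 // historical default
}

// Call sites then change from
//	for i := 0; i < maxBadConnRetries; i++ { ... }
// to
//	for i := 0; i < db.maxBadConnRetries(); i++ { ... }
```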

@jinzhu (Contributor, Author) commented May 26, 2022

> The CL would be a lot clearer if the code were not being refactored at the same time. Can you instead add a db.maxBadConnRetries() method and replace all the i < maxBadConnRetries with i < db.maxBadConnRetries()?

Ok, will do that.

@jinzhu (Contributor, Author) commented May 26, 2022

> The CL would be a lot clearer if the code were not being refactored at the same time. Can you instead add a db.maxBadConnRetries() method and replace all the i < maxBadConnRetries with i < db.maxBadConnRetries()?

But in the old style of the code we can't change maxBadConnRetries to 0, or it will never reuse a connection from the pool. Is that OK?

@jinzhu (Contributor, Author) commented Jun 3, 2022

> The CL would be a lot clearer if the code were not being refactored at the same time. Can you instead add a db.maxBadConnRetries() method and replace all the i < maxBadConnRetries with i < db.maxBadConnRetries()?

Hello @rsc, which connReuseStrategy should we use if maxBadConnRetries is set to 0: alwaysNewConn or cachedOrNewConn?

@rsc (Contributor) commented Jun 8, 2022

@kardianos Do you have any suggestions or preferences here?

@kardianos (Contributor) commented Jun 21, 2022

@jinzhu Can you explain to me when you would want a connection retry setting other than the default of 2 or zero?

This "retry" tries to take a connection out of the connection pool; if it is not a valid connection, the driver reports it with BadConn and another connection is tried. It is fundamentally about connection pooling, though the side effect can be to re-run a query if a driver mis-reports a connection as bad when the query actually had side effects.

Turning off this connection retry I can understand: the driver is now provided many ways to report a connection as bad, and when we actually attempt a query we may want to give the user an option not to retry.

Alternatively, the option to not retry could be implemented as a query parameter, say a const NoRetry ..., passed as a parameter of the query.

So questions:

  1. Can you justify any other option besides Default and Off?
  2. Would the use case always be for the entire connection pool? Or would this be better per query?

To answer your question:

> Hello @rsc, which connReuseStrategy should we use if maxBadConnRetries is set to 0: alwaysNewConn or cachedOrNewConn?

If max retries is set to zero, you must use cachedOrNewConn. If you used alwaysNewConn, it would negate the entire point of a connection pool.
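
For context, the current retry helper in database/sql looks roughly like this (simplified; connReuseStrategy, cachedOrNewConn, and alwaysNewConn are unexported identifiers inside the package). Setting the loop bound to zero in this structure would skip the cachedOrNewConn attempts and fall straight through to alwaysNewConn, which is the behaviour being discussed above:

```go
// Simplified from database/sql (Go 1.18); not the proposed change.
func (db *DB) retry(fn func(strategy connReuseStrategy) error) error {
	for i := int64(0); i < maxBadConnRetries; i++ {
		err := fn(cachedOrNewConn) // prefer a pooled connection
		if err == nil || !errors.Is(err, driver.ErrBadConn) {
			return err
		}
	}
	// All pooled attempts reported ErrBadConn: force a new connection.
	return fn(alwaysNewConn)
}
```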

@jinzhu (Contributor, Author) commented Jun 22, 2022

Hi @kardianos,

Thank you for your response.

> @jinzhu Can you explain to me when you would want a connection retry setting other than the default of 2 or zero?

For my use case, the default or zero should be enough.

> Alternatively, the option to not retry could be implemented as a query parameter, say a const NoRetry ..., passed as a parameter of the query.

That sounds like it would make things even trickier ;)

Compared to a per-query solution, I would prefer to set up two database pools with different retry options in this case, which works perfectly for read-write splitting.
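
For illustration, a sketch of that two-pool setup (DSNs and the driver are placeholders, and SetMaxBadConnRetries is the proposed method, not an existing API):

```go
package rwsplit

import (
	"database/sql"

	_ "github.com/go-sql-driver/mysql" // placeholder driver
)

// OpenPools sets up read-write splitting: the read pool keeps the
// default bad-connection retries, while the write pool disables them so
// a long-running INSERT is never silently rerun on driver.ErrBadConn.
func OpenPools(readDSN, writeDSN string) (readDB, writeDB *sql.DB, err error) {
	if readDB, err = sql.Open("mysql", readDSN); err != nil {
		return nil, nil, err
	}
	if writeDB, err = sql.Open("mysql", writeDSN); err != nil {
		readDB.Close()
		return nil, nil, err
	}
	writeDB.SetMaxBadConnRetries(0) // proposed method: no automatic reruns for writes
	return readDB, writeDB, nil
}
```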

@kardianos (Contributor) commented Jun 22, 2022

So it sounds like you do have a case where some connections (reads) would retry, and writes would not.

Do you mean tricky from a db-user or implementation perspective?

@jinzhu (Contributor, Author) commented Jun 22, 2022

> So it sounds like you do have a case where some connections (reads) would retry, and writes would not.

Yes, that sounds reasonable in some cases, but in our case we would like to disable retries entirely, for https://cacm.acm.org/magazines/2013/2/160173-the-tail-at-scale/fulltext

> Do you mean tricky from a db-user or implementation perspective?

For a db user.

@jinzhu (Contributor, Author) commented Jun 27, 2022

Hi @kardianos,

For the implementation https://go-review.googlesource.com/c/go/+/404935, do you have any suggestions?
