Backdrop encounters transient MySQL/MariaDB errors under high concurrency #7092
Description of the bug
This is something Drupal 7 (D7) did too, at least on my projects.
On sites with a lot of user activity, Backdrop can sometimes encounter transient MySQL errors such as:
- Deadlocks (err 1213)
- Lock wait timeouts (err 1205)
- Table definition changes (err 1412)
See some recent screenshots I got while working:
These entries are from the watchdog log, but as a user I actually get an error screen, which is no good.
This can be fixed by simply retrying the query/transaction. I intend to submit a PR which does this optionally.
Steps To Reproduce
This is a little hard to reproduce on demand. When it happened to me, I was creating a new content type and loaded another content type in a new tab while the first tab was still trying to save.
But, I've seen similar "deadlock" errors occur in other situations in D7, especially when a lot of users were all doing things at the same time.
I have noticed this especially happens when performing long batch operations while other users are on the site doing things.
Expected behavior / My plan for the PR
Before throwing an exception, we should retry up to a couple of times, with a random short delay between attempts, to avoid further collisions.
My plan for the PR is that this has to be enabled in the settings.php file, with something similar to the following:
$settings['database_error_retry'] = [
  'enabled' => TRUE,
  'max_attempts' => 3,
  'min_delay_ms' => 100,
  'max_delay_ms' => 800,
  'retry_error_codes' => [1213, 1205, 1412],
];
Your average user need never bother messing with this, but power users would be able to enable it if they run into similar problems as I have.
Ideally, if accepted, this could ship commented out in the default settings.php file in a future release, with an explanation for devs of what it does and why.
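To make the idea concrete, here is a rough sketch of the retry loop, assuming the settings array shown above. The function name `retry_on_transient_error()` is hypothetical and not an existing Backdrop API. The sketch also assumes the failure surfaces as a `PDOException` whose `errorInfo[1]` holds the MySQL error number. The actual PR may hook into the database layer quite differently.

```php
<?php

/**
 * Hypothetical helper (not part of Backdrop): runs $operation and retries it
 * when it fails with one of the configured MySQL/MariaDB error codes.
 *
 * $config is assumed to have the same keys as the proposed
 * $settings['database_error_retry'] array.
 */
function retry_on_transient_error(callable $operation, array $config) {
  // When the feature is disabled, run the operation exactly once.
  $max_attempts = !empty($config['enabled']) ? max(1, (int) $config['max_attempts']) : 1;

  for ($attempt = 1; ; $attempt++) {
    try {
      return $operation();
    }
    catch (PDOException $e) {
      // For driver-level errors, PDO exposes [SQLSTATE, driver code, message]
      // in errorInfo; the driver code is the MySQL error number (e.g. 1213).
      $code = isset($e->errorInfo[1]) ? (int) $e->errorInfo[1] : 0;
      $retryable = in_array($code, $config['retry_error_codes'], TRUE);
      if (!$retryable || $attempt >= $max_attempts) {
        // Not a transient error, or out of attempts: rethrow as before.
        throw $e;
      }
      // Random short delay so colliding requests do not retry in lockstep.
      usleep(mt_rand((int) $config['min_delay_ms'], (int) $config['max_delay_ms']) * 1000);
    }
  }
}
```

One design note: a deadlock (1213) rolls back the whole transaction, so the callable passed in would need to wrap the entire unit of work (open the transaction, run the queries, commit), not just the single statement that failed.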