x/pkgsite: select healthy DB per request #40444
Comments
Change https://golang.org/cl/244603 mentions this issue:
Reorganize the server so that each request gets its own DataSource, instead of using a single DataSource for every request.

Currently, the behavior doesn't change because we do in fact use the same DataSource for every request. But this paves the way to having a pool of health-checked DB connections, while still having each request work with a single connection.

For golang/go#40444.

Change-Id: I717450593a8dcfd5689a8d28f634324776305042
Reviewed-on: https://go-review.googlesource.com/c/pkgsite/+/244603
Reviewed-by: Julie Qiu <julie@golang.org>
Alternative to picking a healthy DB connection per request:
This will cause the health check watcher (AppEngine or the GKE load balancer) to quickly kill processes that are connected to a bad DB, and start new ones.
That approach will work well assuming a failure mode where one database becomes completely unavailable. If instead all the databases become 5% unavailable, e.g. because of a bug that overloads them all, restarted instances won't find a database that works all the time. Depending on the exact characteristics of the failure, how fast the instances get killed, and how fast they restart, that could turn a minor outage into a very large one. There's also some risk in pointing each instance at a single database: after an outage on one replica, all the frontend instances will be pointed at the other replica. Since the failures we've seen so far are (AFAIK) total outages of one replica, this might be fine. But an approach that doesn't crash would be better in all respects, except perhaps ease of development.
Currently, our service processes connect to a single DB on startup. If that DB enters a bad state while the process is running, the process keeps running and serves 500s.
Instead, we should pick from a set of healthy DBs on each request.