Fail elegantly when expected database schema is missing #31
Comments
Agreed. All three endpoints should catch exceptions, return a 500, and write the exception details to stderr.
It makes sense to avoid an unexpected halt. But if we catch these exceptions and return 500s instead, the service will stay up, continue trying to connect to the database for no reason, and continue returning 500s. Are we satisfied with that? Is that the best practice in this situation?
The case of src/github.com/azavea/ecobenefits/main.go:140 looks like it occurs while the app is initializing, before starting the HTTP event loop. So failing abruptly is probably sane behavior, unless we code around the failures with restart logic. I think it makes sense to panic and fail when the db schema doesn't match: better to write ops code to restart the process after a db upgrade than to have the app keep trying to discover the correct schema. As for the database being offline, you could imagine the app trying repeatedly to reconnect and initialize before passing control to the HTTP process. Or, since all it's doing is closing over and caching some db state for performance, we could punt that to the next request (and the next, and the next), with each one returning 500 if it fails to connect and caching if it succeeds.
I think that most web applications with a database dependency make use of database connections lazily. For example, I created a test Rails app locally that is pointed at a nonexistent MySQL database. When I start the app, it starts, but doesn't fail until I attempt to make a request. In this application, it appears as though some data is loaded once at startup. That data is then reused to deal with subsequent requests. Panicking in this situation makes sense, but usually the argument to
The Upstart script for this service has a
Are the API requests leading to additional database queries each time, or is the service only using data pulled during the startup process?
Most of the API requests lead to additional queries against the DB. The data we get at startup are things used for every request, which do not change very frequently.
Got it. Then from my perspective, those requests should expose database connectivity failures to HTTP API consumers (via some HTTP status code) and to operators (via a log entry), but not attempt any restart logic. It's not clear whether any of that happens now, but none of the logs I've found around this service contain failures.
Status codes are problematic because the service is built on
If the database is offline or the schema is not set up properly, the service crashes hard: