Use local DB and clock process in every container #5236
Conversation
@jgmize I've made some changes that might seem unrelated, but this effort has uncovered some things. For example, the way we handle Git repos for management commands has changed: since the initial population of the DB doesn't have the app config, it could set the wrong remote. We now skip git remotes entirely, since remotes are really just conveniences and you can use the repo URL directly. I also further improved the cron health check to get the health check task names and expected run intervals from the job configs themselves. It's possibly a tad hacky, but it has the advantage of ensuring that the first run of the health check view sets the last-run file mtimes, so it won't fail immediately on deploy if the app starts before the cron process. tl;dr: coming along nicely
You can see the new fancy task health check on my demo:
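The remote-free workflow might look something like the following sketch, which fetches by repository path/URL instead of a configured remote name (the throwaway repos here are purely illustrative, not bedrock's actual repos):

```shell
# Sketch: fetch directly from a repo URL/path; no "origin" remote needed,
# so a clone made under a different config can't point at the wrong remote.
set -e
tmp=$(mktemp -d)
git init -q "$tmp/upstream"
git -C "$tmp/upstream" -c user.email=ci@example.com -c user.name=ci \
    commit -q --allow-empty -m "initial"
branch=$(git -C "$tmp/upstream" symbolic-ref --short HEAD)
git init -q "$tmp/clone"
# Pass the repository location directly instead of a remote name:
git -C "$tmp/clone" fetch -q "$tmp/upstream" "$branch"
git -C "$tmp/clone" rev-parse FETCH_HEAD
```

`git fetch` stores the fetched tip in `FETCH_HEAD`, so no remote-tracking branch configuration is required at all.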
Fixes issue #5235
Did some testing with a local container and ab. I had the cron process set to constantly reload the security advisories data from disk into the db, and ran the following while that was happening:
Looks pretty good to me.
After talking with @jgmize we'd like to try another direction. The current implementation has every bedrock container fetching data from all of the sources. This is pretty inefficient, as well as potentially more error prone, since they're all fetching data across the internet. We should be able to scale down the number of running containers with this change, but it's still not great.

**New Proposal**

We should have a single container (or Jenkins job) running in a single region that does the DB updates. This container will upload the file to S3 when the file changes, naming it after the git-sha of the running bedrock code. Naming it after the git-sha ensures that we don't run into any schema change issues. The bedrock containers, meanwhile, will have a process that checks for updated database files on a schedule. When the check returns a new file, it will save that file and swap it in for the old one. It will do the swap via symlinks. It should work something like:
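A minimal sketch of the symlink swap described above (the function and file names are hypothetical, not bedrock's actual code):

```python
import os

def swap_db(new_db_path, link_path):
    """Point link_path (a symlink the app opens its DB through) at
    new_db_path. os.replace renames atomically on POSIX, so readers
    always see either the old target or the new one, never a broken
    link. Hypothetical sketch of the proposed swap step."""
    tmp_link = link_path + ".tmp"
    if os.path.lexists(tmp_link):
        os.remove(tmp_link)
    # Build the new symlink under a temp name, then atomically rename
    # it over the live one.
    os.symlink(new_db_path, tmp_link)
    os.replace(tmp_link, link_path)
```

Since the app only ever opens the DB through the stable `link_path`, the downloader can write the git-sha-named file anywhere and flip the link when it's complete.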
This will mean that we have only a single updater process to monitor rather than one in every container.
* Removes our reliance on an external database server.
* Adds a /healthz-cron/ URL that will 500 if no data updates have happened in more than 10 minutes.
Will now check and report on each cron task individually. Also includes the data git repos for prod-details, security advisories, and release notes in the Docker image.
Fixes a strange state that could occur when the initial setup of the git repos happened under a different config than the running container, by using repo URLs directly.
cron.py will now write a CSV of its config to a tmp file at startup, which the health check view will read.
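The health check logic might look something like this sketch (the CSV layout and per-task last-run files are assumptions for illustration, not bedrock's actual implementation):

```python
import csv
import os
import time

def check_cron_health(config_csv, run_dir, grace=2.0):
    """Return (ok, failing_tasks). Assumes config_csv holds rows of
    (task_name, interval_seconds) written by the cron process at
    startup, and that run_dir contains one file per task whose mtime
    is the task's last successful run time."""
    failures = []
    with open(config_csv) as fh:
        for task_name, interval in csv.reader(fh):
            last_run_file = os.path.join(run_dir, task_name)
            if not os.path.exists(last_run_file):
                failures.append(task_name)
                continue
            age = time.time() - os.path.getmtime(last_run_file)
            # Allow some slack (grace * interval) before calling a
            # task stale, so a slightly late run doesn't trip a 500.
            if age > float(interval) * grace:
                failures.append(task_name)
    return (not failures, failures)
```

Because the intervals come from the cron config itself, adding a new scheduled task automatically extends the health check with no extra wiring.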
Moves us from 4 dockerfiles to 1 \o/
Closing in favor of #5334
Removes our reliance on an external database server. A.K.A:
THE SQLITENING