Room for improvement on handling failures in migrations when deploying Pycon #55
We deployed a fix on Friday afternoon to our staging server, but we were still getting errors showing tracebacks from the previous code this morning. I verified that the code checked out on the server was the updated code, so apparently the server processes (or at least one of them) were still running the old code. Here's a bit of the minion log from Friday:
I think here's what happened:
I still don't have any wonderful ideas for fixing this kind of problem. See my previous comments in this issue.
We now deploy the pycon site via Heroku, so this is no longer relevant.
I noticed this yesterday: running migrations during a deploy (highstate) is conditional on the code having changed, which makes a lot of sense given that the highstate runs many, many times a day. However, if the migration fails, it fails that highstate run, but subsequent runs don't try the migration again (the code was updated in the previous run, so in later runs it isn't changing), and so subsequent runs appear to succeed even though things are actually no longer in the proper state.
I'm not sure what the best fix is though. We could hack up the way we run migrations so we run them when the code has changed or the previous migration failed (keeping track of that somehow), but that's pretty kludgey. And anyway, unless someone has fixed something manually, migrations aren't suddenly going to start working without a code change.
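For the record, the kludge would look something like this in the state tree. This is just a rough sketch: the state IDs, paths, and the marker-file trick are all made up here for illustration, not taken from our actual SLS files. Since Salt requisites AND together rather than OR, you'd need two states, one triggered by the code change and one retrying after a recorded failure:

```yaml
# Sketch only -- hypothetical state IDs and paths. Assumes a
# git.latest state with the ID deploy_code elsewhere in the tree.
migrate_on_code_change:
  cmd.run:
    - name: >
        /srv/env/bin/python manage.py migrate --noinput
        && rm -f /srv/.migration_failed
        || { touch /srv/.migration_failed; exit 1; }
    # Only runs when the checkout actually changed.
    - onchanges:
      - git: deploy_code

migrate_retry_after_failure:
  cmd.run:
    - name: >
        /srv/env/bin/python manage.py migrate --noinput
        && rm -f /srv/.migration_failed
        || { touch /srv/.migration_failed; exit 1; }
    # Retries on every highstate until the marker is cleared.
    - onlyif: test -f /srv/.migration_failed
```

Which pretty much demonstrates why I called it kludgey: duplicated command, a marker file to babysit, and two states to keep in sync.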
Or we could just bite the bullet and remove the condition, so Django checks whether any migrations need to run on each highstate. Maybe we should also consider whether deploys should run in a frequent periodic highstate at all... but at least this way, if something were wrong, each highstate would fail until it was fixed.
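The unconditional version is much simpler. Again a sketch with made-up IDs and paths, relying on the fact that Django's migrate is idempotent (a no-op when nothing is pending):

```yaml
# Sketch only -- hypothetical state IDs and paths. Runs migrate on
# every highstate; a failing migration fails every highstate until
# someone fixes it, which is the behavior we actually want.
run_migrations:
  cmd.run:
    - name: /srv/env/bin/python manage.py migrate --noinput
    # Order after the checkout, but don't gate on it changing.
    - require:
      - git: deploy_code
```

The cost is an extra migrate invocation on every run, which seems cheap relative to silently-passing highstates.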
What we'd want ideally would be for all the changes in a deploy to happen in a transaction (somehow), so if anything fails, no changes take effect. The previous system with Chef was set up that way with regard to the source code, but not the database or the virtualenv, so things could still get out of sync when there was a failure. And I don't think anyone has a great solution for that.