fix(deployment): run migrations using init containers instead #325
Conversation
I think this will be an issue when using the sql-proxy. Since the migration is run as an init container, the sql-proxy sidecar is not yet started, so the migration will not be able to reach the database.
oh, that's a good point, let me think... This proxy makes things so tricky :(
There might be another way of running the proxy. Maybe we can run the proxy as its own deployment, and then talk to it through its internal DNS, what do you think? What I mean is:
1 - Deploy the proxy as its own deployment in the same namespace
If we do that, we could keep either the job approach or the init container approach; either should work, I think. I like the init container approach, it makes the setup more predictable since the controlplane will not work if the DB settings are not properly set.
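A minimal sketch, assuming the Google Cloud SQL Auth Proxy (a GKE setup is mentioned later in the thread), of what running the proxy as its own Deployment plus Service could look like. The names (`sql-proxy`), the image tag, and the instance connection string are placeholders, not values from the actual Chart:

```yaml
# Hypothetical sketch: run the SQL proxy as a standalone Deployment and expose it
# through a Service so other pods (including init containers) can reach it via
# cluster DNS, e.g. sql-proxy.<namespace>.svc.cluster.local:5432.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: sql-proxy
spec:
  replicas: 1
  selector:
    matchLabels:
      app: sql-proxy
  template:
    metadata:
      labels:
        app: sql-proxy
    spec:
      containers:
        - name: cloud-sql-proxy
          image: gcr.io/cloud-sql-connectors/cloud-sql-proxy:2.8.0   # placeholder version
          args:
            - "--address=0.0.0.0"                 # listen beyond localhost so other pods can connect
            - "--port=5432"
            - "my-project:my-region:my-instance"  # placeholder instance connection name
---
apiVersion: v1
kind: Service
metadata:
  name: sql-proxy
spec:
  selector:
    app: sql-proxy
  ports:
    - port: 5432
      targetPort: 5432
```

With something along these lines, both the migration (job or init container) and the controlplane pods would reach the database through the `sql-proxy` Service name instead of a per-pod localhost sidecar.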
I think the approach of having a single deployment for the SQL proxy is the right one. There is no need for the sql-proxy to run multiple times in each deployment of the control-plane.
Would it be possible that the migration code changes somehow, then a new control-plane pod starts and runs the migration while other control-plane pods are still alive, making the pods that did not restart incompatible with the latest version of the database schema?
Yes, I guess that's indeed a possibility. But I do not see it as different from having a job that modifies the DB schema while we are still running an old version of the code, as we currently do, no? In any case, to me, schema changes need to be non-destructive and take into account both the rollout process and possible rollbacks. We can achieve that now through the review process, since the migrations are now part of the code. What I could see, though, is an old version of the migration code trying to update a DB that might be in a newer state; the good news is that in that case … Does it make sense? BTW, I'll hold off on merging this code until we figure out how to proceed re: proxy, thanks for raising the issue.
Indeed this could also happen with the current setup.
Yes it does. I think we can go ahead with your proposal.
Perfect, let me know if updating the Chart to run the proxy as its own component is something you could give a try since you have a GKE setup. Otherwise it will take me more time. Happy either way. Thanks
Yes I can work on that and test on GKE. |
Signed-off-by: Miguel Martinez Trivino <miguel@chainloop.dev>
c244786 to 4f5ea9c
Instead of running the migration as a separate job, which had some issues (#309, #271), we now run the migration as an init container.
I also double-checked that our backend, Postgres, can safely handle concurrent migration requests from different replicas using
advisory locking
as stated in Atlas' upstream documentation.
Closes #309 and #271
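A minimal sketch of the init container approach described above, assuming an Atlas-based migration image; the image names, secret name, and connection URL are placeholders rather than the Chart's actual values:

```yaml
# Hypothetical sketch: run the schema migration as an init container so the
# controlplane container only starts once the database schema is up to date.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: controlplane
spec:
  replicas: 2
  selector:
    matchLabels:
      app: controlplane
  template:
    metadata:
      labels:
        app: controlplane
    spec:
      initContainers:
        - name: migrate
          image: example/controlplane-migrations:latest   # placeholder image bundling the migrations
          args: ["migrate", "apply", "--url", "$(DB_URL)"]
          env:
            - name: DB_URL
              valueFrom:
                secretKeyRef:
                  name: controlplane-db                   # placeholder secret holding the Postgres URL
                  key: url
      containers:
        - name: controlplane
          image: example/controlplane:latest              # placeholder application image
```

Since, as noted above, concurrent migration runs from different replicas are serialized through a Postgres advisory lock, only one init container applies the migrations at a time; the others wait and then find the schema already up to date.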