SIGTERM during migration does not cleanly abort migration #37866
Comments
@devurandom I'm looking into this, and unfortunately the library we're using to run migrations does not support being interrupted or aborted. I'm doing some research into whether it's possible to extend it to add this support, but I'm wondering if you have thoughts on either of the following simpler solutions:
So I've found a way to skip the migrations remaining after the one in progress, but I'm concerned about just terminating the current migration, since there's no telling what state that would leave the database in. I'm inclined to block on the current migration with a long timeout.
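The "skip the remaining migrations" idea can be sketched roughly as follows. This is only an illustration, not Metabase or Liquibase code: the class `SkippableMigrationRunner` and its methods are hypothetical names, and each migration is modelled as a plain `Runnable`. The in-progress migration is never interrupted; the stop flag is only checked between migrations.

```java
import java.util.List;
import java.util.concurrent.atomic.AtomicBoolean;

/** Hypothetical runner used for illustration; not part of Metabase or Liquibase. */
class SkippableMigrationRunner {
    // Set from the shutdown path (e.g. when SIGTERM arrives).
    private final AtomicBoolean stopRequested = new AtomicBoolean(false);

    /** Asks the runner to skip every migration that has not started yet. */
    void requestStop() {
        stopRequested.set(true);
    }

    /**
     * Runs migrations in order and returns how many were executed.
     * A migration that has already started always runs to completion.
     */
    int runAll(List<Runnable> migrations) {
        int completed = 0;
        for (Runnable migration : migrations) {
            if (stopRequested.get()) {
                break; // skip this and all remaining migrations
            }
            migration.run();
            completed++;
        }
        return completed;
    }
}
```

With this shape, a stop request that arrives while migration 2 of 3 is running lets migration 2 finish and skips migration 3.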
Is it possible to make it run migrations in transactions? Is that feasible for our setup?
That would work.
@imrkd found these related Liquibase issues:
From a very cursory reading, their approach to solve this appears to be to release the migration lock when the process receives
From reading the Liquibase documentation I think this should be safe:
Thanks for doing all the digging! Interesting to see all the proposals that had been made upstream, I hadn't thought to look there and just fumbled my way through the source code in my IDE.
Unfortunately our migrations typically involve DDL queries, which are not transactional or even idempotent for all our app databases. The favored approach in that thread was something I hadn't thought of: having the locks released by a connection-closed hook on the database side. That's really clever, but it requires us not only to re-implement that functionality, which could be brittle when upgrading, but also to find driver-specific ways to register it. That seems like a lot of complexity, especially since we're uncertain about sticking with Liquibase. I'll have a look at whether that third-party plugin would be simple for us to integrate. For now I'm going to go with something really simple to ship with 50.0: waiting on the current migrations for, say, 20s before quitting, and automatically releasing the lock when we do so.
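A minimal sketch of that "wait, then release the lock" behaviour, under stated assumptions: `GracefulMigrationShutdown` is a hypothetical class, `releaseLock` stands in for whatever actually clears the migration lock, and releasing is assumed to be safe to call even if the lock is already free. It is not the code that shipped. It relies on the JVM running shutdown hooks when the process receives SIGTERM.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

/** Hypothetical helper used for illustration; not part of Metabase or Liquibase. */
class GracefulMigrationShutdown {
    private final CountDownLatch migrationsDone = new CountDownLatch(1);
    private final Runnable releaseLock; // assumption: idempotent lock release
    private final long timeoutMillis;   // e.g. 20_000 for the 20s mentioned above

    GracefulMigrationShutdown(Runnable releaseLock, long timeoutMillis) {
        this.releaseLock = releaseLock;
        this.timeoutMillis = timeoutMillis;
    }

    /** Called by the migration thread once the in-progress migration has finished. */
    void markMigrationsFinished() {
        migrationsDone.countDown();
    }

    /**
     * Waits up to the timeout for the current migration, then releases the lock.
     * Returns true if the migration finished in time, false on timeout or interrupt.
     */
    boolean awaitThenReleaseLock() {
        boolean finished = false;
        try {
            finished = migrationsDone.await(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve interrupt status
        }
        releaseLock.run(); // release either way so the next startup is not blocked
        return finished;
    }

    /** Registers the wait-and-release step as a JVM shutdown hook. */
    void installShutdownHook() {
        Runtime.getRuntime().addShutdownHook(
            new Thread(this::awaitThenReleaseLock, "migration-shutdown"));
    }
}
```

The trade-off is the one raised earlier in the thread: if the timeout expires, the lock is released while a non-transactional migration may still be half-applied, so the timeout should be generous relative to typical migration times.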
Describe the bug
When Metabase v48.3 receives a SIGTERM while it is running migrations (after "Migration lock is cleared. Running ${n} migrations ...", but before "Migration complete in ${x} s"), Metabase will shut down ("Metabase Shutting Down ...") but will not cleanly abort the migration or clear the migration lock. Subsequent runs of Metabase will report "liquibase.exception.LockException: Database has migration lock; cannot run migrations."
Once this has happened, the workaround is to run the mentioned command
java -jar metabase.jar migrate release-locks
but until this has been done Metabase cannot start again.
To Reproduce
SIGTERM (before "Migration complete")
Expected behavior
Upon SIGTERM Metabase aborts the migration and clears the migration lock
Logs
Example logs of the migration termination:
Example logs of subsequent startup failure:
Information about your Metabase installation
Metabase v48.3 on Linux.
Severity
can cause downtime, but easy workaround
Additional context
https://metaboat.slack.com/archives/C013N8XL286/p1705573653217009
Different issue, not the problem I ran into, but similar cause: #30360