Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add repeat conditional migrations #39

Merged
merged 8 commits into from
Aug 24, 2017
Merged

Add repeat conditional migrations #39

merged 8 commits into from
Aug 24, 2017

Conversation

karenc
Copy link
Owner

@karenc karenc commented Aug 10, 2017

If the migration file has should_run defined, it is considered a
repeat migration. should_run is called every time migrate is run,
and if it returns true, up is run.

Close Connexions/db-migrator#1

@coveralls
Copy link

coveralls commented Aug 10, 2017

Coverage Status

Coverage increased (+0.9%) to 97.056% when pulling 0e30aee on repeat-migrations into 5091c38 on master.

@karenc
Copy link
Owner Author

karenc commented Aug 10, 2017

@reedstrm Can you have a look at this? We can update Connexions/db-migrator when the changes are merged.

@karenc
Copy link
Owner Author

karenc commented Aug 10, 2017

Dynamic Migrations

@coveralls
Copy link

coveralls commented Aug 11, 2017

Coverage Status

Coverage increased (+0.7%) to 96.851% when pulling 7cfa9fb on repeat-migrations into 5091c38 on master.

Copy link

@philschatz philschatz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just had a couple questions, but I like the ideas! (didn't look at how extensive the tests were though)

@@ -69,10 +69,16 @@ Example usage::
generates a file called ``migrations/20151217170514_add_id_to_users.py``
with content::

# Uncomment should_run if this is a repeat migration
# def should_run(cursor):

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In which case would a migration run more than once? I thought that the DB would just have a table with all the migrations that successfully ran, no?

Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is not a case for schema migrations, more for data migrations. Please see the original issue.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For something like a data-migration, you could write an up that stored the previous data in a side table, then down would copy that back in place. And set a should_run to refuse to run if the backup exists (or does something clever with timestamps for the name of the backup) Not too different than the existing down scripts, which tend to be frozen copies of old schema.

cursor.execute('INSERT INTO table VALUES (%s)', (data,))


def down(cursor):

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This might be out-of-scope but I don't think there is a way to roll back all migrations. For example, CNXML->HTML would need to check out the previous version of the XSLT files and run them in order to roll back.

Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For any migrations that cannot be rolled back. You can do a pass for down.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, me thinks I'd want to put "has this migration already been applied" tests in "should_run", which then makes then idem-potent to some extent.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(somehow my comments got reversed :-) )

Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's what "should_run" is, you can add a check to "up" for non repeatable migrations?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah true, true.

@reedstrm
Copy link
Contributor

👍

@reedstrm
Copy link
Contributor

It occurs to me that eventually, we'll have a huge number of "repeatable" migrations, which are marked "applied" (with the most recent application timestamp, I presume?) But that must be tested every time any migration happens, even 10 years later when they no longer apply. Even the test may no longer apply. I guess there's nothing that says we can't "cosolidate" migrations by deleting them, meaning we no longer support migrating from the state "just in front of" that migration. I'd like some way to tag a migration as "no need to use this one ever again on this database" - a terminal "no more repeats" .For the cases where the DBA/devops can say "the code that generates that bad data no longer runs, we do not need to migrate anymore"

Since the existing table is "schema_migrations" does it make sense to instead track these as "data_migrations". The "should_run" function could be described as "run tests against existing data to determine if there is work for this data migration to do". That's what I intend to write, BTW - @deferred + repeatable with a should_run that checks for markers of what data has been converted, and an up that converts a subset of data, and marks it (by making a backup?), so down can put it back. :-)

@reedstrm
Copy link
Contributor

If presence of should_run makes a migration repeatable, how about having the absence of a down mark a migration as irreversible, and make it only run if --forced ?

@karenc
Copy link
Owner Author

karenc commented Aug 15, 2017

dbmigrator migrate already has an option --version for running migrations up to a certain version, we can alias it to --to-version and add a --from-version so we can exclude some old repeatable migrations.

@karenc
Copy link
Owner Author

karenc commented Aug 15, 2017

I think if we add irreversible migrations that are only run if --forced is used, we're adding complexity to the order of the migrations. The migrations are meant to be ordered by the timestamp. We are probably not going to test running the migrations in different order, it might fail if we have a large number of migrations that need to be run.

@coveralls
Copy link

coveralls commented Aug 15, 2017

Coverage Status

Coverage increased (+0.7%) to 96.848% when pulling b4839b8 on repeat-migrations into 4a82ebf on master.

@coveralls
Copy link

coveralls commented Aug 16, 2017

Coverage Status

Coverage increased (+0.7%) to 96.841% when pulling 00e4473 on repeat-migrations into 4a82ebf on master.

@karenc
Copy link
Owner Author

karenc commented Aug 17, 2017

Now that I think about it, if you comment out should_run, the repeatable migration won't run anymore. Will that be enough?

@reedstrm
Copy link
Contributor

Turns out that, in order to be able to do a "down" for the data migration, I needed to store my own state anyway, so no need for extra table stuff. This seems to be working as_is_ for me as well. I haven't ran it a bunch yet on a larger test data set, which is my actual use-case: I want to have an 'up' that only does part of the work, so is worth calling repeatedly. I'll assume that the date applied will update to reflect the last_date applied. THe only problem is with that isapplied is no longer enough info in list to know if a migrate will do anything, so we need an indication of repeatability. OH, ow about an asterisk? True* means will try to run again (could even have False*,but it's not needed) And it looks like a regex or glob, meaning "many" :-)

@coveralls
Copy link

coveralls commented Aug 21, 2017

Coverage Status

Coverage increased (+0.8%) to 96.988% when pulling 68e4c2f on repeat-migrations into 4a82ebf on master.

@coveralls
Copy link

coveralls commented Aug 22, 2017

Coverage Status

Coverage increased (+0.8%) to 97.009% when pulling 0dae398 on repeat-migrations into 4a82ebf on master.

If the migration file has `should_run` defined, it is considered a
repeat migration.  `should_run` is called every time `migrate` is run,
and if it returns true, `up` is run.
When a migration's up function has @deferred, the migration is
automatically set to "deferred".  It will run when `migrate
--run-deferred` is called.
If migration is not found, "mark" was displaying an error message but
exiting normally.

The error message is still printed and dbmigrator will now exits with
code 1.
There is no way to show migrations as "repeated" when listing them.  It
is useful to know which migrations might run when running "migrate".
@karenc karenc merged commit 32ddb9e into master Aug 24, 2017
@karenc karenc deleted the repeat-migrations branch August 24, 2017 14:28
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants