Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

RFC: Repeatable migrations #1046

Closed
rquadling opened this issue Feb 27, 2017 · 9 comments
Closed

RFC: Repeatable migrations #1046

rquadling opened this issue Feb 27, 2017 · 9 comments
Labels

Comments

@rquadling
Copy link
Collaborator

Scenario: Multiple developers developing features in branches and using migrations to affect views, stored procedures, functions, triggers or events.

If 2 developers create a migrations that affect the same view, there is no way to combine/verify that the up and down contain all the right amendments - after all, the migrations are separate and would require all members of the dev team to be fully aware of all the migrations of other branches that relate to the view that they are altering. Not particularly secure way of working.

This proposal is to introduce the idea of repeatable migrations.

At its core, a repeatable migration can be run as often as required. Commonly a checksum is used to determine if a repeatable migration is required to be run, but, even without that, all repeatable migrations should be repeatable, so, even if in the first instance, no checksum exists, no corruption or loss of data would exist if all repeatable migrations were run.

As a reference to where this idea has come from, see https://flywaydb.org/documentation/migration/repeatable.

Some thoughts on implementation

A separate directory containing migrations that are based upon a separate class (must implement AbstractRepeatableMigration).
A new config path for the repeatable migrations.
A new config for repeatable_migration_base_class.
A new create-repeatable command.

Currently, a normal migration classes uses up()/down() or change() methods to provide the logic of a migration. For a repeatable migration, I think provideHash() and repeatMigration() methods would provide the functionality required.

The provideHash method would return the hash that Phinx will use to determine if this repeatable migration is expected to be run. The AbstractRepeatableMigration class would stub this method and return true by default. Returning a non boolean would result in a comparison, returning true or false would cause the migration to run (or not) without a hash check. Ideally, a dev can generate the hash (for views, stored procedures, etc.) from a hash of the SQL. For repeatable migrations that are calculations, then an alternative method - maybe md5(microtome(true)) - would be required.

The repeatMigration() method would run the migration code.

There would be no rollback concept.

A separate table would be used to log the repeated migrations, allowing this data to be merged into a phinx status.

Possible structure:

CREATE TABLE `phinxlog_repeatables` (
	`start_time` TIMESTAMP(6) NOT NULL DEFAULT CURRENT_TIMESTAMP(6),
	`end_time` TIMESTAMP(6) NOT NULL DEFAULT '0000-00-00 00:00:00.000000',
	`migration_name` VARCHAR(100) NOT NULL,
        `hash` VARCHAR(32) NULL
);

Maybe this should be normalised into 2 tables (phinxlog_repeatables and phinxlog_repeatables_history).

Running a migration would, after all the standard migrations were run, run the repeatable migrations automatically.

The option to NOT run repeatables automatically, as well as running repeatables only would exist, thus providing the greatest degree of flexibility for when a manual migration is being run.

If tagging is in play (see #1044), then the ability to control the order of several groups of repeatable migrations would exist if that was necessary.

@rquadling
Copy link
Collaborator Author

@coatesap, @dignat, @hkwak, @joeHickson, @pedanticantic, @antriver

Any input you may have on this RFC would be greatly appreciated.

@mvrhov
Copy link
Contributor

mvrhov commented Feb 27, 2017

Along #1044 and #1046 there is the 3rd thing. Imagine multiple libraries where each library has it's own migrations. Now you as the author of your app would like to bring all this together.
This means,that you somehow need to list the order in which the migrations should happen. So the natural order which is currently time is not enough.

@rquadling
Copy link
Collaborator Author

@mvrhov, the primary purpose of repeatables is that they can be blindly repeated. Of course, if you've also tagged your repeatables, you can control the ordering using tags and a suitable deployment process. As for multiple directories, well, again, you could tag each directory and control the order that way.

@mvrhov
Copy link
Contributor

mvrhov commented Apr 4, 2017

@rquadling: Take a look at how sqitch solves this problem. IMO it's quite brilliant.

@rquadling
Copy link
Collaborator Author

@mvrhov, from the reading the reviews, sqitch seems to fail for many of the reasons for this RFC. Multiple developers with non-delta/patchable views/stored procedures/etc.

Unless I've read things wrong of course.

If you've used sqitch, I'd really like to know how to solve this problem.

@Koriit
Copy link

Koriit commented Apr 17, 2017

In my opinion, the way flyway handles this case is simple and solves the problem. Why not do it the same way?
I have only two concerns. I think that repeatMigration is overly verbose, wouldn't migrate be good enough? Second is the fact that with your proposal provideHash would have two responsibilities. First, actually providing a hash as name suggests and second, controlling whether the migration should be run. I believe that those should be separated. I would like to propose isMigrated and getHash. By default isMigrated would check the hash and getHash would return md5 of script contents, for example:

$reflector = new ReflectionClass(static::class);
return md5(file_get_contents($reflector->getFileName()));

I believe it to be sensible defaults which could actually handle some cases instead of making users override them each time. Actually, in the same way plain SQL scripts would be handled too.

@rquadling
Copy link
Collaborator Author

@Koriit, thank you for your comments. As you will see, this proposal specifically references Flyway's Repeatables (https://flywaydb.org/documentation/migration/repeatable) as the reason for this RFC. Essentially unversioned migrations executed when some comparison fails. If something is missing regarding this RFC, please add it to this issue. It is far easier to see and understand an RFC if all the details are present in one place.

I am hopefully matching the expectation of repeatables as they exist within Flyway, but also adding in the flexibility of multiple templates that Phinx currently possess and that I find great use for within my day-to-day work for versioned migrations.

I'm a verbose kind of developer ( as my colleagues will attest ), so if repeatMigration is too verbose, .... My issue is that migrate is already a known concept and I don't want there to be any confusion between a migration and a repeatable migration, especially as some people may well be working with external templates and extensions to the base abstract classes provided by Phinx or already have migrations for views, stored procedures, etc. and clarity would be required between what would be the legacy approach and the new repeatables approach.

I can see the advantage of having 2 methods, both of which would be abstract protected methods within the AbstractRepeatableMigration class. It would then allow a dev to control both aspects that you correctly identified.

The implementation of these methods would be specific to the template used and the class from which the template is coded to extend (similar to how AbstractMigration is the class a migration extends and is determined by Migration.template.php.dist).

At this stage, I am unsure what implementation to use. I know my use case will have external SQL files, so my template will call upon a specific sub class of AbstractRepeatableMigration that only really requires a filename.

But equally, some may embed statements into the repeatable class and therefore, some reflection on the current class would be required.

So, with that, I think 2 templates (with corresponding abstract class both of which extend AbstractRepeatableMigration) should cover a lot of use cases.

AbstractRepeatableSqlFileMigration would cover the use case of an external SQL file being the source of the repeatable content, and therefore a simple hash of that file would be provided by getHash(). This would almost certainly be my preferred use case. 1 SQL file per view, stored procedure, etc., along with a repeatable migration class per SQL file.

AbstractRepeatableClassMigration would cover the use case of a class containing all the code for the repeatable migration and would use reflection of the class to generate the hash (similar to the example you provided).

@othercorey
Copy link
Member

Closing due to lack of activity or out of scope. If there is new interest, please open a PR.

@rquadling
Copy link
Collaborator Author

After much consideration, we implemented our own repeatables logic outside of Phinx.

Unfortunately, at this time, the code is not open source, but if anyone wants a quick understanding of how we work with migrations, repeatables, and code deployment, then I'd like to suggest reading about our very silly acronym called ProMiReD (Top of the list if you Google 'promired'!).

I'm happy to talk more about this on the gist, or you can hunt down my email if you fancy a chat.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

4 participants