Migration? #15

Open
FCO opened this issue Aug 18, 2018 · 79 comments
Labels
hacktoberfest, help wanted (Extra attention is needed), question (Further information is requested)

Comments

@FCO
Owner

FCO commented Aug 18, 2018

Use :ver<>?

@FCO FCO added the help wanted, hacktoberfest, and question labels on Oct 2, 2018
@FCO
Owner Author

FCO commented Mar 15, 2019

#7 (comment)

It has a new idea that could be used for migrations...

@Xliff
Contributor

Xliff commented Mar 15, 2019

I think migrations should be their own thing, especially since migrations are done at the Model level, not the collection one. Collection versioning is a great idea, though!

@Xliff
Contributor

Xliff commented Mar 15, 2019

Maybe it's best explained this way: migrations are an operation between two versions of the same model. So how would that best work? Maybe have something that encapsulates models like a collection does, but is NOT a collection of models; rather, a collection of conversion operations. A big problem would be naming conventions.

Say we have two versions of model A. How would a migration of A:ver<1.2> to A:ver<1.3> be named?

Is something like this possible?

migration A:from<1.2>:to<1.3> { ... }

If so, I have an idea to do this in a very clean manner.

@FCO
Owner Author

FCO commented Mar 15, 2019

no, it isn't, I think...

what about:

migration A:ver<1.3> {
   method from:ver<1.2> {...}
}

@Xliff
Contributor

Xliff commented Mar 15, 2019

Actually, that's not a bad idea. However, I was hoping to use methods for individual fields. In that case, field-level conversions could use another mechanism.

Try this:

# Pseudo
migration A:ver<1.3> {

  # Specific field-level conversions. Key layout is:
  #  <from_version><model><attribute> => -> $old_model, $new_model { ... }
  has %!conversions;

  has CollectionA<1.2> $!old_a;
  has CollectionA<1.3> $!new_a;

  method from:ver<1.2> {
     # Iterate over each model. 
     for $!old_a.^models Z $!new_a.^models -> ($old_model, $new_model) {
        for $new_model.^attributes -> $attr {
          with %!conversions<1.2>{$new_model}{$attr} {
             $_($old_model, $new_model)
           } else {
             # By default, we move the value from the old model to the new.
             $new_model."&{ $attr }"() = $old_model."&{ $attr }"();
           }
        }
     }
  }
}

@federico-razzoli

Note that usually DBMSs provide non-standard SQL syntaxes for migrations, like:

  • CREATE TABLE IF NOT EXISTS
  • DROP TABLE IF EXISTS
  • CREATE OR REPLACE TABLE
  • ALTER TABLE IF EXISTS
  • ALTER TABLE t ADD COLUMN IF NOT EXISTS, DROP COLUMN IF EXISTS, etc.

And some databases allow running CREATE/ALTER/DROP in transactions.

I believe it would be nice to make use of these features where available. For example, the migration itself could have an on_conflict property that could be "ignore" (IF NOT EXISTS), "replace" (OR REPLACE), or "fail".
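
A rough sketch of how such an on_conflict property could steer the generated SQL (purely illustrative names, not Red's API):

enum OnConflict <Fail Ignore Replace>;

sub create-table-sql(Str $table, OnConflict $on-conflict = Fail) {
    given $on-conflict {
        when Ignore  { "CREATE TABLE IF NOT EXISTS $table (...)" }
        when Replace { "CREATE OR REPLACE TABLE $table (...)" }
        default      { "CREATE TABLE $table (...)" }
    }
}

say create-table-sql("admin", Ignore);   # CREATE TABLE IF NOT EXISTS admin (...)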

@FCO
Owner Author

FCO commented Mar 19, 2019

I was wondering about migrations and I think I came to something interesting...

If I have 2 versions of the same model, for example:

model Bla:ver<0.1> {
   has Int $.a is column;
}

model Bla:ver<0.2> {
   has Str $.b is column;
}

It's relatively easy to say that it should create a column b of type string and drop the column a. The problem is trying to guess what should be done with the data... if the content of b should be generated based on the old data in a, we have a problem, since we already dropped a.

We could fix that by explaining to the migration how to generate the data. The other migration tools that I know manipulate the data using plain SQL. But we already have a way to manipulate data: the AST!

I don't think it would be impossible to make something like this generate the data for a new column:

method #`{or sub, idk} migrate:<a> {
   String: { .a * 3 }
}

And it would run a:

UPDATE
   my_table
SET
   b = 'String: ' || a * 3

Or something that would be better for that migration (@Santec, please help me!)

Maybe something should be a bit different, because it is possible for a new column on a table to use data from different tables.


@Xliff
Contributor

Xliff commented Mar 19, 2019

@FCO:

This is why I had the %!conversions attribute.

So for something like this situation, you'd have

submethod BUILD {
  %!conversions<0.2><Bla><b> = -> $old_model, $new_model { $new_model.b = $old_model.a };
}

So %!conversions handles all special casing at the field level.

@FCO
Owner Author

FCO commented Mar 19, 2019

Do we need the new model? Won’t we always use only the old one?

What about?

migration MySchema:ver<0.2> {
   has Bla:ver<0.1> $.old-model1;
   has Ble:ver<0.1> $.old-model2;
   has Bla:ver<0.2> $.new-model1;
   has Ble:ver<0.2> $.new-model2;

   method Bla:<a> { { $!old-model1.b } & { $!old-model2.c } }
}

And it would use the return value to create the update...


@FCO
Owner Author

FCO commented Mar 20, 2019

Now I see that it doesn't make sense to use another table here... I don't have a join here...

It should be done with a relationship...

@FCO
Owner Author

FCO commented Mar 20, 2019

So maybe it makes sense to have migrations per model...

@FCO
Owner Author

FCO commented Mar 20, 2019

Now I think I got it!

Bla.^migration: :from<0.1>, {
   .a = .b * 42 + 3;
   .c = .d - .e;
}

@FCO
Owner Author

FCO commented Mar 21, 2019

I just started playing with migrations... now (since a185e2f) it's possible:

$ perl6 -Ilib -e '
use Red "experimental migrations";
model Bla {
   has Int $.a is column;
}

model Ble {
   has Int $.b is column;
}
Ble.^migration: {
   .b = .a * 3
}
Ble.^migrate: :from(Bla);
Ble.^dump-migrations
'
b => (bla.a)::int * 3

Please pay attention to the use Red "experimental migrations"; line...

@FCO FCO closed this as completed Mar 21, 2019
@FCO FCO reopened this Mar 21, 2019
@FCO
Owner Author

FCO commented Mar 21, 2019

any thoughts about it?

@Xliff
Contributor

Xliff commented Mar 21, 2019

Yes. That's not bad. It accomplishes the basics. However, I would prefer if we did migration as a ClassHOW so that we can split complex operations up into encapsulated (self-contained) pieces.

You can still do it with this method, but everything has to be written out.

Give me a few days to think about ways to alleviate this issue and if I find any, I will post.

@FCO
Owner Author

FCO commented Mar 22, 2019

The solution I started implementing does not handle:

  • table rename
  • model rename
  • table population
  • table truncation
  • table deletion

Maybe a solution with a migration type with a collection of models could help with it...
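
Just to illustrate the idea (hypothetical names, not an API proposal): a migration object that owns a list of schema-level steps, so renames, truncations and deletions have somewhere to live:

class Migration {
    has Str $.name;
    has     @.steps;                                # SQL strings or step objects

    method rename-table(Str $from, Str $to) { @!steps.push: "ALTER TABLE $from RENAME TO $to" }
    method truncate-table(Str $table)       { @!steps.push: "DELETE FROM $table" }
    method drop-table(Str $table)           { @!steps.push: "DROP TABLE $table" }
}

my $m = Migration.new: :name<rename-bla>;
$m.rename-table: "bla", "blah";
.say for $m.steps;                                  # ALTER TABLE bla RENAME TO blah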

(My next step is creating the migration models to save the state of a migration on the database...)

@FCO
Owner Author

FCO commented Jul 20, 2020

I like the option of Red generating the SQL and the developer being able to review/edit it. But what is the advantage of the YAML instead of saving it directly in Red's syntax?

@Altreus
Contributor

Altreus commented Jul 20, 2020

There is no specific advantage to YAML; they use it because it is convenient. I thought we could avoid this step, but I've convinced myself the step is valuable!

The YAML is used in order to accurately diff the two versions of the schema. They convert the schema to an internal format which makes it easier to diff.

The YAML is simply a serialisation of this data structure. This helps, because it means you can jump directly to this format for the previous version, instead of trying to figure it out from the code. Note that the previous version of the schema probably only exists in git. Therefore, they put it in the project directory for later!

Any format will do.
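
For illustration only (plain lists standing in for the real serialised structures), the diff itself can be as simple as two set differences:

my @old-columns = <id a>;   # columns from the stored snapshot of the previous version
my @new-columns = <id b>;   # columns from the current schema

say "add:  ", (@new-columns (-) @old-columns).keys;   # add:  (b)
say "drop: ", (@old-columns (-) @new-columns).keys;   # drop: (a)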

@FCO
Owner Author

FCO commented Jul 20, 2020

In Red we can use objects of these types. They were made with that content in mind.

@Altreus
Contributor

Altreus commented Jul 20, 2020

That would be super.

Here's my proposal then.

  • Version 1 (or zero, doesn't matter to me) is a deployment. We create the SQL file for the whole database and serialise all the "types" that you mentioned
  • Version 2 and onwards is a migration. We read the previous serialised types, compare to the current types, and output SQL and serialise the new types.
  • When a migration is performed:
    • We read the version from a special table. If it's not there we deploy, and put 1 in the version table. If that goes wrong we tell the user off.
    • If it is there, we start with that version and run each successive version's SQL until we run out (sketched below).
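
Roughly, with the version lookup and SQL execution stubbed out (the table and file names are just placeholders):

sub current-version(--> UInt) { ... }                  # e.g. SELECT version FROM red_migration_version; 0 if missing
sub run-sql-file(IO::Path $file) { ... }               # hand the file's SQL to the driver

sub apply(IO::Path :$migrations-dir = "migrations".IO) {
    my $version = current-version;
    for $version + 1 .. * -> $v {
        my $up = $migrations-dir.add($v.Str).add("up.sql");
        last unless $up.e;                             # no more prepared migrations
        run-sql-file($up);
        # ... and record $v as the new current version
    }
}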

How is this sounding so far?

@FCO
Owner Author

FCO commented Jul 22, 2020

Why store the actual version of the database schema? Why not just compare your classes with the current database schema?

@Altreus
Contributor

Altreus commented Jul 22, 2020

You need to be able to run all the migrations between version X and the latest, because migrations might not simply be to update the schema; they might have data migrations in between as well. Therefore, you need to be able to discover what X is.

You might as well store a serialised version of the data structures used for comparing, essentially as a cache and a historical record.

@FCO
Owner Author

FCO commented Mar 7, 2024

I just thought of a way that could work that I think would be perfect for my way of working. I'm not sure if it would be good for everyone, but I'll describe it here; please let me know what you think. We may generalise it...

The user creates the models the way they like, then run the command:

red migrate update # may need to define what models to use

It will update the DB to reflect the models, read the DB after the changes, and generate and save Red::Migration::* objects in a migrations.sqlite file.

After that, when the user is sure that's how the DB should look, they can run:

red migrate prepare

It will create two new files: migrations/1/up.sql with the SQL used to create the tables and migrations/1/down.sql with the drop tables.

Then, on the target system, the user can run:

red migrate apply

It will create the DB using the SQL in migrations/1/up.sql, and it will also create a table to store the DB version.

When the user wants to change something, they edit the models and then run:

red migrate update

It will validate that the DB is still in sync with the Red::Migration::* objects stored in migrations.sqlite; if not, it breaks. If everything is OK, it will calculate the diff between the DB and the models, update the DB, read the DB to generate the Red::Migration::* objects, and save them in migrations.sqlite.

The user can make new changes and run red migrate update as many times as needed; it will always validate the database against the last stored version in migrations.sqlite and update it (and the DB itself).

When the user is happy, they can run:

red migrate prepare

That will get the diff between the last version in migrations.sqlite and the previous one, generate the file migrations/2/up.sql with the SQL applying that diff, and migrations/2/down.sql with the SQL applying the reverse diff (from the last version back to the previous one).
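
After this second prepare, the project would contain something like this (layout illustrative only):

migrations.sqlite        # serialised Red::Migration::* objects + version history
migrations/
    1/
        up.sql           # full deployment SQL
        down.sql         # drops everything
    2/
        up.sql           # diff: version 1 -> version 2
        down.sql         # reverse diff: version 2 -> version 1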

Running on the target system:

red migrate apply

will run migrations/2/up.sql and increment the version number, and running:

red migrate unapply

would run migrations/2/down.sql and decrement the version.

It should also be possible to run:

red migrate update -e 'MyModel.^migration: { .new-column = .old-col1 ~ " " ~ .old-col2 }'

That would do the diff and also include in the SQL something like:

update my_model set new_column = old_col1 || " " || old_col2; -- probably doing that on batches, but you got it...

It should also be possible to run something like:

red migrate update --sql 'ALTER TABLE my_model ...'

And that would apply the SQL to the local database and then read it to create the Red::Migration::* objects in migrations.sqlite.

After a red migrate update the user can run red migrate downgrade to return the DB to the previous state.

After a red migrate prepare, the user can modify the generated SQL and call red migrate update-from-prepare, which will grab the modified SQL and run red migrate downgrade && red migrate update --sql-file <modified sql>.

The generated SQL files may need a SHA1 so we know whether they were changed (it could be stored in migrations.sqlite).

The dir where we store the SQL should also be different depending on what driver we are using...

When validating whether the DB is still in sync, if it's not, it could/should be possible to generate an intermediate version (if an option is passed, something like --ignore-async) to make it work.

When running update, prepare, and apply, if it detects that it's going to lose data, it should break. To run anyway, it should need a very verbose option like --i-am-ok-on-losing-data.

Please let me know if there are any comments/suggestions...

@FCO
Owner Author

FCO commented Mar 7, 2024

Of course, all command/sub-command names can be changed...

@FCO
Owner Author

FCO commented Mar 7, 2024

As voldenet suggested on the #raku IRC channel, it can be used by many different people collaborating over git... Maybe, instead of using ints for the migration paths, we could use UUIDs to avoid merge conflicts, but we would need to find another way to decide order.

@voldenet

voldenet commented Mar 8, 2024

I think there should be a red migrate prepare command that would generate the commands needed to migrate the current db in a high-level language:

migration v2 {
   method created-on { "2024-03-07T23:42:03.430480Z" } # used for order of migrations
   # method using high-level language of operations to perform
   method update { 
       .remove-column(:table<user> :name<type>);
       .create-table({
            .set-name('admin');
              # Guid will get converted into uniqueidentifier, char(36) or something supported by current db
            .add-column(:type(Guid) :name<userId>);
       });
   }
   # method describing changes to data model, this lets Red assume what this migration does
   # even if performed operations are different
   method Red-update {
       .remove-column(:table<user> :name<type>);
       .create-table({
            .set-name('admin');
            .add-column(:type(Guid) :name<userId>);
       });
   }
   method revert { ... }
   method Red-revert { ... }
}

After the migration is generated and applied to the current database, the user can review this high-level-language change, reorder it, and replace the scripting (in the example, users stop having a "type" column indicating admins and get moved into a table containing admins):

migration v2 {
   method created-on { "2024-03-07T23:42:03.430480Z" } # used for order of migrations
   method upgrade { 
       # user can reorder schema modifications
       .create-table({
            .set-name('admin');
            # user can choose data type explicitly, which is useful for floats of different precision
            .add-column(:type("uniqueidentifier") :name<userId>);
       });
       # user can add sql code to move admins to the new table
       .sql('insert into admin(userId) select id from user where type = 1');
       # user can choose to risk not removing the column,
       # in which case Red would assume, that this column is considered as removed
       # .remove-column(:table<user> :name<type>);
   }
   # this mustn't change
   method Red-upgrade {
       .remove-column(:table<user> :name<type>);
       .create-table({
            .set-name('admin');
            .add-column(:type(Guid) :name<userId>);
       });
   }
   method revert { ... }
   method Red-revert { ... }
}

With the above code, red migrate generate-script should generate the following (example for mssql):

CREATE TABLE admin ([userId] uniqueidentifier);
insert into admin(userId) select id from user where type = 1;
INSERT INTO [$migrations]([Name], [Date]) VALUES ('v2', SYSDATETIME());

@FCO
Owner Author

FCO commented Mar 8, 2024

I'm not sure I got the Red-update. We will have the last representation of the models stored in migrations.sqlite and the current one as a .rakumod file (the code itself). Do we also need to store the steps to change the model? Why not just change the file?

@voldenet

voldenet commented Mar 8, 2024

A few notes regarding my previous post:

  • migrations should probably be named by the user; then the path could be /2/some-name.migration
  • created-on could be a version number taken from the folder name - it would force users to always resolve them manually (by moving migrations to another path by hand)
  • the revert operation may not always be possible; an is unsupported trait could be useful (see the sketch after this list)
  • Some migrations may require additional computation:
    • the db has https://rest-api/entities/42 and wants to change the format to the identifier taken from that url for every entity
    • the db stores json in nvarchar(*) fields and the database has no support for json
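
A possible shape for such a trait (sketch only; the trait and its behaviour are made up here to illustrate the point):

multi sub trait_mod:<is>(Method $m, :$unsupported!) {
    $m.wrap: -> |c { die "{ $m.name } is not supported for this migration" };
}

class SomeMigration {
    method revert() is unsupported { }
}

SomeMigration.revert;   # dies: revert is not supported for this migration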

@voldenet

voldenet commented Mar 8, 2024

Regarding Red-*:
This lets you ask for the model at any version, which makes any tooling, bugfixing, and analysis easier. I could see keeping the file with the migration code separate from the calculated model changes (/2/some-name.rakumod, /2/some-name.up, /2/some-name.down), so this doesn't get in the way too much.

The current-model.rakumod approach would work too, but there would be no way to tell what the model versions were in the past (unless all the versions were stored in git). I'd say that instead of sqlite, plaintext could be used to make merge conflict resolution easier.

@FCO
Owner Author

FCO commented Mar 8, 2024

If we are going down that path, I was thinking about that "class"... I am thinking of something like this:

migration CreateAdminDropTypeFrmUser:ver<2> {
   method up {
       User.^migration: *.drop-column("type");
       Schema.^migration: {                                         # There is no model for that, so we need to use the schema
            .create-table("admin", { userId => %( :type<Guid> ) });
       };
   }

   method down {
      User.^migration: *.create-column("type", :type<text>);        # maybe, when deleting on the local DB
                                                                    # we could store the data somewhere in case we
                                                                     # want to undo it (this down method)

      Schema.^migration: *.drop-table: "admin";                     # again, no model, so we need to use the schema
   }
}

@FCO
Owner Author

FCO commented Mar 8, 2024

I'd say that instead of sqlite, plaintext could be used to make merge conflict resolution easier.

That makes sense

@voldenet

voldenet commented Mar 8, 2024

The last example is nice; it would require keeping versioned entities (/migrations/entities/User/1/entity.raku), grouping them for a migration:

class Schema:ver<2> does RedSchema { has @.models = ("Model1:ver<1>", "Model2:ver<3>", "Model3:ver<5>") }

and then having use RedSchema inside the target migration. Separate per-entity versioning would reduce the need to duplicate unchanged entities in migrations.

@FCO
Owner Author

FCO commented Mar 8, 2024

Each migration dir may have a Schema.rakumod file which will have:

use RedSchema;

red-schema "Model1:ver<1>", "Model2:ver<3>", "Model3:ver<5>";

and in the migration file we would use it, and it would import those models at those versions and a \Schema with the schema object.

@FCO
Owner Author

FCO commented Mar 8, 2024

No, it would need to be something like:

use RedSchema ("Model1:ver<1>", "Model2:ver<3>", "Model3:ver<5>");

@FCO
Owner Author

FCO commented Mar 9, 2024

I've been wondering... would it make sense, when updating the local db, to save a dump of the DB to be able to restore it in case the user wants to revert it?

@FCO
Owner Author

FCO commented Mar 9, 2024

We will probably need our own version of META6.provides. A file can contain multiple models. And maybe we shouldn't use the model version as the path, to avoid merge conflicts...
Also, would it make sense to make it possible to use resources for that, in case the application is installed as a Raku module?

@FCO
Owner Author

FCO commented Mar 9, 2024

Should we have a different module called Red::Migration instead of doing that inside Red?

@FCO
Owner Author

FCO commented Mar 9, 2024

We'll probably need a configuration file... should we use Configuration?

@FCO
Owner Author

FCO commented Mar 9, 2024

Not sure if we are going to use Configuration, but I'm just thinking out loud here about what we will probably need:

class MigrationConfig {
    has UInt  $.current-version = self!find-current-version;
    has UInt  %.model-current-version;
    has IO()  $.base-path where *.add("META6.json") ~~ :f = $*CWD;
    has IO()  $.migration-base-path = $!base-path.add: "migrations";
    has IO()  $.dump-path = $!base-path.add: ".db-dumps";
    has IO()  $.model-storage-path = $!migration-base-path.add: "models";
    has IO()  $.version-storage-path = $!migration-base-path.add: "versions";
    has Str() $.sql-subdir = "sql";
    has Str() @.drivers = <SQLite Pg>;

    has %!versions-cache = self!compute-version-cache;
    has %!models-cache = self!compute-models-cache;


    method !compute-version-cache {...}
    method !compute-models-cache {...}
    method !find-current-version {...}
    method !random-string {...}

    multi method version-path($version) {
        $!version-storage-path.add: ...
    }

    multi method version-path {
        $.version-path: $!current-version
    }

    method new-model-version(Str $model-name) {
        ++%!model-current-version{$model-name}
    }

    multi method model-path(Str() $model-name) {
        $.model-path: $model-name, %!model-current-version{$model-name}
    }

    multi method model-path(Str() $model-name, UInt $version) {
        $.model-version-path: $model-name, $version
    }

    multi method model-version-path(Str() $model-name, UInt() $version) {
        $!model-storage-path.add: %!models-cache{$model-name}{$version}
    }

    multi method model-version-path(Str() $model-name-version) {
        if $model-name-version ~~ /^$<name>=[<[\w:]>*\w] ":" ["ver<" ~ ">" $<ver>=[\d+]|\w+ "<" ~ ">" \w+]* % ":"/ {
            $.model-version-path: $<name>, $<ver> // 0
        }
    }

    multi method migration-sql-path(Str $driver where @!drivers.one) {
        $.version-path.add($!sql-subdir).add: $driver
    }

    multi method migration-sql-path(Str $driver where @!drivers.one, UInt $version) {
        $.version-path($version).add($!sql-subdir).add: $driver
    }
}
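
Once the stubbed private methods are implemented, the intended usage would be roughly:

my $config = MigrationConfig.new;
say $config.version-path;                   # directory of the current version
say $config.migration-sql-path("SQLite");   # where the SQLite SQL for that version lives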

@arunvickram
Contributor

We could use Configuration for this. Based on what I'm looking at, I like where this is going.

@FCO
Owner Author

FCO commented Mar 11, 2024

I had a bit of time now and started playing with that idea... I'm still only planning what to do... but here's how it looks now:

(screenshot)

on branch https://github.com/FCO/Red/tree/red-migration

@FCO
Owner Author

FCO commented Mar 24, 2024

(screenshots)

FCO pushed a commit that referenced this issue Mar 24, 2024
@FCO
Owner Author

FCO commented Mar 24, 2024

The way I'm seeing it now is:

  • update: updates your local database based on your current models
  • prepare: creates the new SQL files needed to make your old model become your current local DB schema
  • apply: applies those SQL files to your production DB (example flow below)
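
For example, the day-to-day flow with these commands would be something like:

red migrate update     # evolve the local DB to match the current models
red migrate prepare    # write migrations/<n>/up.sql and down.sql from the recorded diff
red migrate apply      # on the target system: run the pending up.sql files and bump the version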

FCO pushed a commit that referenced this issue Mar 24, 2024