New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add a reverse migration to distribution change #495
Conversation
|
It turns out, that this operation fails to perform backwards. I have no idea yet, how to solve this. migrations.AlterField(
model_name='filedistribution',
name='distribution_ptr',
field=models.OneToOneField(auto_created=True,
on_delete=django.db.models.deletion.CASCADE,
parent_link=True, primary_key=True,
related_name='file_filedistribution', serialize=False,
to='core.Distribution'),
preserve_default=False,
),I raises an error: |
|
WARNING!!! This PR is not attached to an issue. In most cases this is not advisable. Please see our PR docs for more information about how to attach this PR to an issue. |
|
It turns out, that the following statements in the resulting sql-backward migration (pulpcore-manager sqlmigrate file 0009 --backwards) are in the wrong order: ALTER TABLE "file_filedistribution" ALTER COLUMN "distribution_ptr_id" DROP NOT NULL;
ALTER TABLE "file_filedistribution" DROP CONSTRAINT "file_filedistribution_distribution_ptr_id_5ce5564b_pk";I think, this is a django bug. edit: Filed an issue: https://code.djangoproject.com/ticket/32634 |
f24e23f
to
7619803
Compare
|
This looks good to me. |
It does not work. Once loaded with data, you get: |
This is actually a complete redesign of the migration. It not only makes it possible to have the backward migration, but it makes the process much more symmetric and easier. [noissue]
7619803
to
327ee9b
Compare
|
This should work now. I tested forward and backward several times with the following data: |
|
I created some distributions against |
|
I guess we need to update this @pulp/rpm migration too? https://github.com/pulp/pulp_rpm/blob/master/pulp_rpm/app/migrations/0032_new_distribution_model.py |
|
@daviddavis Yup |
| FileDistribution = apps.get_model('file', 'FileDistribution') | ||
| CoreDistribution = apps.get_model('core', 'Distribution') | ||
| for old_file_distribution in FileDistribution.objects.all(): | ||
| NewFileDistribution = apps.get_model('file', 'NewFileDistribution') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this just an attempt at making it more generic and proper or is it strictly necessary for functional reasons for the file plugin?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure, what you mean. The apps.get_model is Django's way to provide you the historical model with a schema that matches the current state of the migration.
In this case, it provides you the transient NewFileDistribution that only pops into existance to later be renamed to FileDistribution. Its evanescent nature manifests in its complete lack of presence in the actual code.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What I mean is that NewFileDistribution and core.Distribution are effectively the same synce the former adds no new fields over the latter (except for _ptr). I agree that it should be used since it's more specific, I'm just trying to make sure my mental model is correct.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
NewFileDistribution here is more than core.Distribution, as it's a detail table. So i am copying objects from one detail table to the other.
IMHO you should never add objects to the core.Distribution table directly.
| new_master_model_entry.save() | ||
| old_file_distribution.distribution_ptr = new_master_model_entry | ||
| old_file_distribution.save() | ||
| pks_to_delete.append(old_file_distribution.pulp_id) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm glad we can get rid of this, I'm just concerned that perhaps there was some necessary reason it was done in such a convoluted fashion originally? I know Brian ran into a lot of problems throughout the process.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believe the problem of being unable to delete the old base objects was, that the basedistribution_base_ptr had a foreign key constraint that was only lifted by removing the field from the table.
In the new approach there is no such ambiguity of inheritance that prevents the immediate deletion of old objects.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1
|
|
||
|
|
||
| class Migration(migrations.Migration): | ||
| atomic = False |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why is this necessary?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This solves the aforementioned trigger error.
My theory: When inserting new items into the new table, Django or Postgres add a trigger to check for constraints upon leaving the transaction. Renaming the table however breaks the relation and the trigger cannot be found anymore. Ending the transaction earlier fixes the problem.
BTW: Django suggests making data migrations non atomic:
https://docs.djangoproject.com/en/2.2/howto/writing-migrations/#non-atomic-migrations
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I know that constraints are checked last, I don't know the specific implementation details. But this sounds plausible. It also sounds potentially like a bug - maybe we should bring it up in #django and what people say.
Are there portions of the migration that we can make atomic or non-atomic selectively rather than making the entire migration non-atomic in a blanket way?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have not seen a way to make several steps of the migration atomic.
What we could do is move the last two steps (Deleting and renaming Tables) in a separate migration. But i am unsure whether this would gain much.
| migrations.RemoveField( | ||
| model_name='filedistribution', | ||
| name='basedistribution_ptr', | ||
| migrations.RunPython( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What I mean is, I believe we can put atomic=True here rather than having our own transactions inside. Maybe for other operations also.
Sidenote: There's no such thing as "nested transactions", so if the step is already atomic (run in a transaction) by default, it means that the "transaction" in the migration was actually not a "real transaction", it's a "savepoint". I don't remember precisely how the semantics differ, but it might potentially be part of the issue.
I remember one of the things the SqlAlchemy author wrote is that they hate this particular part of the Django ORM because the semantics can bite you sometimes. I don't know that this is one of those times I just want to point it out.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
transactions in postgres can have multiple savepoints, and rollback can go back to a specific savepoint (rather than rolling back the entire transaction). You have to specifically do a 'ROLLBACK TO savepoint', iirc, to take advantage of this, and I don't know if/how Django does so.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I had atomic=true with and without the inner transactions and it produced the same error.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good to know
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm still interested in figuring out of this is an upstream bug, but LGTM
|
@mdellweg I'm planning on swapping the distribution models today on many of the plugins. If you can copy this PR to the RPM plugin and fix the template to use the new model, then I'll handle all of the rest. |
What template do you mean? |
|
Oh yea, I totally forgot about the plugin template. I can take care of it unless @mdellweg you want to. |
|
@daviddavis that would be great! |
No description provided.