Switches FileDistribution use Distribution #481

Closed

bmbouter
Member

@bmbouter bmbouter commented Mar 19, 2021

pulpcore==3.12 introduces a new MasterModel named Distribution
designed to replace the BaseDistribution MasterModel. This PR switches
the FileDistribution to use Distribution. It also ships a migration
which moves the data from the BaseDistribution table to the
Distribution table.

Required PR: pulp/pulpcore#1198

closes #8387

@pulpbot
Member

pulpbot commented Mar 19, 2021

WARNING!!! This PR is not attached to an issue. In most cases this is not advisable. Please see our PR docs for more information about how to attach this PR to an issue.

@bmbouter
Member Author

An example of a Distribution with an object label before migration:

pulp=> select * from core_label;
               pulp_id                |              object_id               | key | value | content_type_id 
--------------------------------------+--------------------------------------+-----+-------+-----------------
 ac6bc77e-a791-436d-9cd8-e57c390eed62 | d7c8b95a-27a2-4373-bcb2-7e4f4740c36c | foo | bar   |              53
(1 row)

pulp=> select * from core_basedistribution;
               pulp_id                |         pulp_created          |       pulp_last_updated       | pulp_type |   name   | base_path | content_guard_id | remote_id 
--------------------------------------+-------------------------------+-------------------------------+-----------+----------+-----------+------------------+-----------
 d7c8b95a-27a2-4373-bcb2-7e4f4740c36c | 2021-03-24 18:00:31.513068+00 | 2021-03-24 18:00:34.304217+00 | file.file | mydistro | foo       |                  | 
(1 row)

pulp=> select * from core_distribution;
ERROR:  relation "core_distribution" does not exist
LINE 1: select * from core_distribution;
                      ^
pulp=> select * from file_filedistribution;
       basedistribution_ptr_id        | publication_id 
--------------------------------------+----------------
 d7c8b95a-27a2-4373-bcb2-7e4f4740c36c | 
(1 row)

The same example after migration (the pk differs because this is a different run; within a single run it would have stayed the same):

pulp=> select * from core_label;
               pulp_id                |              object_id               | key | value | content_type_id 
--------------------------------------+--------------------------------------+-----+-------+-----------------
 124613bf-c15c-4abe-a115-245c47fde17c | 49c9e3d5-2df7-4b42-80fb-5d81ccaec7d2 | foo | bar   |              53
(1 row)

pulp=> select * from core_basedistribution;
 pulp_id | pulp_created | pulp_last_updated | pulp_type | name | base_path | content_guard_id | remote_id 
---------+--------------+-------------------+-----------+------+-----------+------------------+-----------
(0 rows)

pulp=> select * from core_distribution;
               pulp_id                |         pulp_created          |      pulp_last_updated       | pulp_type |   name   | base_path | content_guard_id | publication_id | remote_id | repository_id | repository_version_id 
--------------------------------------+-------------------------------+------------------------------+-----------+----------+-----------+------------------+----------------+-----------+---------------+-----------------------
 49c9e3d5-2df7-4b42-80fb-5d81ccaec7d2 | 2021-03-24 18:22:11.374041+00 | 2021-03-24 18:22:11.37406+00 | file.file | mydistro | foo       |                  |                |           |               | 
(1 row)

pulp=> select * from file_filedistribution;
         distribution_ptr_id          
--------------------------------------
 49c9e3d5-2df7-4b42-80fb-5d81ccaec7d2
(1 row)
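
The before and after table dumps can be reproduced in miniature. The sketch below is not the PR's migration: it uses Python's built-in sqlite3 with a simplified, hypothetical schema (four columns per master table, shortened ids, and a made-up helper named move_to_new_master). It only illustrates why a label keeps resolving, assuming the row's pk is carried over unchanged into core_distribution.

```python
import sqlite3

# Simplified stand-ins for the real tables; the column set is an assumption.
SCHEMA = """
CREATE TABLE core_label (object_id TEXT, key TEXT, value TEXT);
CREATE TABLE core_basedistribution (
    pulp_id TEXT PRIMARY KEY, pulp_type TEXT, name TEXT, base_path TEXT);
CREATE TABLE core_distribution (
    pulp_id TEXT PRIMARY KEY, pulp_type TEXT, name TEXT, base_path TEXT);
"""


def move_to_new_master(conn, pulp_type="file.file"):
    """Copy one plugin's rows to the new master table under the same pk,
    then remove them from the old master table."""
    conn.execute(
        "INSERT INTO core_distribution (pulp_id, pulp_type, name, base_path) "
        "SELECT pulp_id, pulp_type, name, base_path "
        "FROM core_basedistribution WHERE pulp_type = ?",
        (pulp_type,),
    )
    conn.execute(
        "DELETE FROM core_basedistribution WHERE pulp_type = ?", (pulp_type,)
    )


conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)
conn.execute(
    "INSERT INTO core_basedistribution VALUES ('d7c8b95a', 'file.file', 'mydistro', 'foo')"
)
conn.execute("INSERT INTO core_label VALUES ('d7c8b95a', 'foo', 'bar')")

move_to_new_master(conn)

# The label still resolves because object_id matches the preserved pk.
labels = conn.execute(
    "SELECT d.name, l.key, l.value "
    "FROM core_distribution AS d JOIN core_label AS l ON l.object_id = d.pulp_id"
).fetchall()
old_rows = conn.execute("SELECT COUNT(*) FROM core_basedistribution").fetchone()[0]
print(labels, old_rows)
```

Because core_label refers to its target only through object_id, nothing in the label table has to change as long as the pk survives the move.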

@bmbouter
Member Author

Before and after show the same API output:

(pulp) [vagrant@pulp3-source-fedora33 ~]$ pulp file distribution list
[
  {
    "pulp_href": "/pulp/api/v3/distributions/file/file/49c9e3d5-2df7-4b42-80fb-5d81ccaec7d2/",
    "pulp_created": "2021-03-24T18:22:11.374041Z",
    "base_path": "foo",
    "base_url": "http://pulp3-source-fedora33.localhost.example.com/pulp/content/foo/",
    "content_guard": null,
    "pulp_labels": {
      "foo": "bar"
    },
    "name": "mydistro",
    "publication": null
  }
]

@bmbouter bmbouter changed the title Switch to use new pulpcore.plugin.models.Distribution as MasterModel Switches FileDistribution use Distribution. Mar 24, 2021
@bmbouter bmbouter changed the title Switches FileDistribution use Distribution. Switches FileDistribution use Distribution Mar 24, 2021
pks_to_delete.append(old_file_distribution.pulp_id)


def delete_remaining_old_master_model_entries(apps, schema_editor):
Collaborator

Can't this just be done in a normal way, e.g. BaseDistribution.objects.filter(pk__in=pks_to_delete).delete()

And what is the benefit to using two separate migration steps rather than doing the deletion at the end of the first step?

Member Author

I tried several ways, and as far as I can tell it cannot.

There seems to be some kind of bug in the migration system here. I have to use BaseDistribution directly, because if I used FileDistribution I would also be deleting from the detail table. So if you run code like this:

def delete_remaining_old_master_model_entries(apps, schema_editor):
    BaseDistribution = apps.get_model('core', 'BaseDistribution')
    BaseDistribution.objects.filter(pk__in=pks_to_delete).delete()

You get this error:

  Applying file.0009_move_data_to_new_master_distribution_model...Traceback (most recent call last):
  File "/usr/local/lib/pulp/lib64/python3.9/site-packages/django/db/backends/utils.py", line 84, in _execute
    return self.cursor.execute(sql, params)
psycopg2.errors.UndefinedColumn: column file_filedistribution.basedistribution_ptr_id does not exist
LINE 1: DELETE FROM "file_filedistribution" WHERE "file_filedistribu...
                                                  ^


The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/usr/local/lib/pulp/bin/pulpcore-manager", line 33, in <module>
    sys.exit(load_entry_point('pulpcore', 'console_scripts', 'pulpcore-manager')())
  File "/home/vagrant/devel/pulpcore/pulpcore/app/manage.py", line 11, in manage
    execute_from_command_line(sys.argv)
  File "/usr/local/lib/pulp/lib64/python3.9/site-packages/django/core/management/__init__.py", line 381, in execute_from_command_line
    utility.execute()
  File "/usr/local/lib/pulp/lib64/python3.9/site-packages/django/core/management/__init__.py", line 375, in execute
    self.fetch_command(subcommand).run_from_argv(self.argv)
  File "/usr/local/lib/pulp/lib64/python3.9/site-packages/django/core/management/base.py", line 323, in run_from_argv
    self.execute(*args, **cmd_options)
  File "/usr/local/lib/pulp/lib64/python3.9/site-packages/django/core/management/base.py", line 364, in execute
    output = self.handle(*args, **options)
  File "/usr/local/lib/pulp/lib64/python3.9/site-packages/django/core/management/base.py", line 83, in wrapped
    res = handle_func(*args, **kwargs)
  File "/usr/local/lib/pulp/lib64/python3.9/site-packages/django/core/management/commands/migrate.py", line 232, in handle
    post_migrate_state = executor.migrate(
  File "/usr/local/lib/pulp/lib64/python3.9/site-packages/django/db/migrations/executor.py", line 117, in migrate
    state = self._migrate_all_forwards(state, plan, full_plan, fake=fake, fake_initial=fake_initial)
  File "/usr/local/lib/pulp/lib64/python3.9/site-packages/django/db/migrations/executor.py", line 147, in _migrate_all_forwards
    state = self.apply_migration(state, migration, fake=fake, fake_initial=fake_initial)
  File "/usr/local/lib/pulp/lib64/python3.9/site-packages/django/db/migrations/executor.py", line 245, in apply_migration
    state = migration.apply(state, schema_editor)
  File "/usr/local/lib/pulp/lib64/python3.9/site-packages/django/db/migrations/migration.py", line 124, in apply
    operation.database_forwards(self.app_label, schema_editor, old_state, project_state)
  File "/usr/local/lib/pulp/lib64/python3.9/site-packages/django/db/migrations/operations/special.py", line 190, in database_forwards
    self.code(from_state.apps, schema_editor)
  File "/home/vagrant/devel/pulp_file/pulp_file/app/migrations/0009_move_data_to_new_master_distribution_model.py", line 34, in delete_remaining_old_master_model_entries
    BaseDistribution.objects.filter(pk__in=pks_to_delete).delete()
  File "/usr/local/lib/pulp/lib64/python3.9/site-packages/django/db/models/query.py", line 711, in delete
    deleted, _rows_count = collector.delete()
  File "/usr/local/lib/pulp/lib64/python3.9/site-packages/django/db/models/deletion.py", line 294, in delete
    count = qs._raw_delete(using=self.using)
  File "/usr/local/lib/pulp/lib64/python3.9/site-packages/django/db/models/query.py", line 725, in _raw_delete
    return sql.DeleteQuery(self.model).delete_qs(self, using)
  File "/usr/local/lib/pulp/lib64/python3.9/site-packages/django/db/models/sql/subqueries.py", line 75, in delete_qs
    cursor = self.get_compiler(using).execute_sql(CURSOR)
  File "/usr/local/lib/pulp/lib64/python3.9/site-packages/django/db/models/sql/compiler.py", line 1142, in execute_sql
    cursor.execute(sql, params)
  File "/usr/local/lib/pulp/lib64/python3.9/site-packages/django/db/backends/utils.py", line 67, in execute
    return self._execute_with_wrappers(sql, params, many=False, executor=self._execute)
  File "/usr/local/lib/pulp/lib64/python3.9/site-packages/django/db/backends/utils.py", line 76, in _execute_with_wrappers
    return executor(sql, params, many, context)
  File "/usr/local/lib/pulp/lib64/python3.9/site-packages/django/db/backends/utils.py", line 84, in _execute
    return self.cursor.execute(sql, params)
  File "/usr/local/lib/pulp/lib64/python3.9/site-packages/django/db/utils.py", line 89, in __exit__
    raise dj_exc_value.with_traceback(traceback) from exc_value
  File "/usr/local/lib/pulp/lib64/python3.9/site-packages/django/db/backends/utils.py", line 84, in _execute
    return self.cursor.execute(sql, params)
django.db.utils.ProgrammingError: column file_filedistribution.basedistribution_ptr_id does not exist
LINE 1: DELETE FROM "file_filedistribution" WHERE "file_filedistribu...

You can, however, do the following without error, but I believe we can't ship it: if this plugin is installed with a much later pulpcore release, BaseDistribution won't even be importable.

def delete_remaining_old_master_model_entries(apps, schema_editor):
    from pulpcore.app.models import BaseDistribution
    BaseDistribution.objects.filter(pk__in=pks_to_delete).delete()

I believe this is a problem only during migration because once migrations are finished running I can go into a shell and run:

from django.apps import apps
BaseDistribution = apps.get_model('core', 'BaseDistribution')
BaseDistribution.objects.all().delete()

I have a good setup to test any idea, but I don't have any others. Any suggestions welcome. Oh one other idea is: Don't worry about deleting the BaseDistribution at all and eventually the pulpcore migration that removes BaseDistribution will delete the table.
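
The traceback above shows the failing statement is a DELETE against file_filedistribution that filters on basedistribution_ptr_id, a column that no longer exists at that point. The sketch below mimics that situation with Python's built-in sqlite3. It is an illustration under assumptions, not Django's deletion code: the two-table schema is reduced to pk columns, and the helper names cascade_style_delete and parent_only_delete are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE core_basedistribution (pulp_id TEXT PRIMARY KEY);
    -- Detail table as it looks once the pointer column has been switched over.
    CREATE TABLE file_filedistribution (distribution_ptr_id TEXT PRIMARY KEY);
    INSERT INTO core_basedistribution VALUES ('old-1');
    """
)


def cascade_style_delete(conn, pks):
    """Visit the detail table through the old pointer column first,
    the way the failing DELETE in the traceback does."""
    marks = ",".join("?" for _ in pks)
    conn.execute(
        f"DELETE FROM file_filedistribution WHERE basedistribution_ptr_id IN ({marks})",
        pks,
    )
    conn.execute(f"DELETE FROM core_basedistribution WHERE pulp_id IN ({marks})", pks)


def parent_only_delete(conn, pks):
    """Touch only the old master table, like the raw SQL in the PR."""
    for pk in pks:
        conn.execute("DELETE FROM core_basedistribution WHERE pulp_id = ?", [pk])


try:
    cascade_style_delete(conn, ["old-1"])
    cascade_error = None
except sqlite3.OperationalError as exc:
    # SQLite reports the stale column reference here.
    cascade_error = str(exc)

parent_only_delete(conn, ["old-1"])
remaining = conn.execute("SELECT COUNT(*) FROM core_basedistribution").fetchone()[0]
print(cascade_error, remaining)
```

The raw statement never mentions the detail table, so it does not depend on which pointer column that table currently has.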

Collaborator

> I have a good setup to test any idea, but I don't have any others. Any suggestions welcome. Oh one other idea is: Don't worry about deleting the BaseDistribution at all and eventually the pulpcore migration that removes BaseDistribution will delete the table.

Honestly this sounds good to me

Member

I've been running into problems with stray BaseContent entries while experimenting with uninstalling plugins.
I don't know exactly what the problem was, but skipping the deletion doesn't sound like such a good idea to me.

Collaborator

If we're deleting the entire table then I don't think we'd have that problem. BaseContent is still around

Member

I'm just saying I had strange problems with Content. I don't know if they extend to Distributions. But we don't want to delete the whole table right away. It will exist for at least one release cycle after this change.

Member Author

I'm not 100% sure we'll run into the problem @mdellweg identified, but I'd rather not find out. So I'm in favor of keeping the raw SQL removal of those entries as it is now. @dralley are you ok with this?

Also FYI this will be the same thing that all plugins will do.

view_name_pattern=r"publications(-.*/.*)?-detail",
queryset=models.Publication.objects.exclude(complete=False),
allow_null=True,
)
Collaborator

Is the intention that plugins would just copy this code rather than inheriting? It's a reasonable position and I'm fine with it, just checking.

Member Author

Actually I want to switch this to having Pulpcore provide mixins to avoid the copying. I'm going to do that now.

I had a moment of pause with this idea, thinking 'oh no, am I recreating the very problem we are trying to solve?' I realize that no, it's not recreating the same problem, because these cross-package imports wouldn't automatically be defining fields that are unavoidably created at the db level. In fact, they're just serializer fields.

def delete_remaining_old_master_model_entries(apps, schema_editor):
    with connection.cursor() as cursor:
        for pk in pks_to_delete:
            cursor.execute("DELETE from core_basedistribution WHERE pulp_id = %s", [pk])
Member

Could you run that delete statement in the same transaction that moves the single distribution?
It sounds cleaner to me than collecting those IDs in a global variable.
Also it's more resilient to running out of memory in the migration. (Thinking of a lot of distributions here.)
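
One possible middle ground for the memory concern, which is not something proposed in this thread, is to drop the global list and delete the old rows in bounded batches selected by pulp_type. The sketch below uses Python's built-in sqlite3 and a hypothetical helper named delete_old_rows_in_batches; it assumes every remaining core_basedistribution row of the plugin's type has already been moved and is safe to delete.

```python
import sqlite3


def delete_old_rows_in_batches(conn, pulp_type="file.file", batch_size=1000):
    """Delete old master rows of one type while holding at most
    batch_size pks in memory at a time. Returns the number deleted."""
    deleted = 0
    while True:
        pks = [
            row[0]
            for row in conn.execute(
                "SELECT pulp_id FROM core_basedistribution "
                "WHERE pulp_type = ? LIMIT ?",
                (pulp_type, batch_size),
            )
        ]
        if not pks:
            return deleted
        conn.executemany(
            "DELETE FROM core_basedistribution WHERE pulp_id = ?",
            [(pk,) for pk in pks],
        )
        deleted += len(pks)


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE core_basedistribution (pulp_id TEXT PRIMARY KEY, pulp_type TEXT)"
)
conn.executemany(
    "INSERT INTO core_basedistribution VALUES (?, 'file.file')",
    [(f"file-{i}",) for i in range(2500)],
)
# A row owned by some other (made-up) plugin type must be left alone.
conn.execute("INSERT INTO core_basedistribution VALUES ('other-1', 'other.other')")

deleted = delete_old_rows_in_batches(conn)
left = conn.execute("SELECT COUNT(*) FROM core_basedistribution").fetchone()[0]
print(deleted, left)
```

Each pass re-queries the table, so the loop ends once no rows of that type remain.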

Member Author

I tried this, but it didn't work. I applied the following diff:

diff --git a/pulp_file/app/migrations/0009_move_data_to_new_master_distribution_model.py b/pulp_file/app/migrations/0009_move_data_to_new_master_distribution_model.py
index 9f714d5..24d6485 100644
--- a/pulp_file/app/migrations/0009_move_data_to_new_master_distribution_model.py
+++ b/pulp_file/app/migrations/0009_move_data_to_new_master_distribution_model.py
@@ -26,13 +26,19 @@ def migrate_data_from_old_master_model_to_new_master_model(apps, schema_editor):
             new_master_model_entry.save()
             old_file_distribution.distribution_ptr = new_master_model_entry
             old_file_distribution.save()
-            pks_to_delete.append(old_file_distribution.pulp_id)
+            # pks_to_delete.append(old_file_distribution.pulp_id)
+            with connection.cursor() as cursor:
+                cursor.execute(
+                    "DELETE from core_basedistribution WHERE pulp_id = %s",
+                    [old_file_distribution.pulp_id]
+                )
 
 
 def delete_remaining_old_master_model_entries(apps, schema_editor):
-    with connection.cursor() as cursor:
-        for pk in pks_to_delete:
-            cursor.execute("DELETE from core_basedistribution WHERE pulp_id = %s", [pk])
+    print('qqq')
+    # with connection.cursor() as cursor:
+    #     for pk in pks_to_delete:
+    #         cursor.execute("DELETE from core_basedistribution WHERE pulp_id = %s", [pk])

And at migration time I get this error:

psycopg2.errors.ForeignKeyViolation: update or delete on table "core_basedistribution" violates foreign key constraint "file_filedistributio_basedistribution_ptr_ba2e0c52_fk_core_base" on table "file_filedistribution"
DETAIL:  Key (pulp_id)=(ffd35cf3-19e0-46d3-b45b-9e437cd76913) is still referenced from table "file_filedistribution".


The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/usr/local/lib/pulp/bin/pulpcore-manager", line 33, in <module>
    sys.exit(load_entry_point('pulpcore', 'console_scripts', 'pulpcore-manager')())
  File "/home/vagrant/devel/pulpcore/pulpcore/app/manage.py", line 11, in manage
    execute_from_command_line(sys.argv)
  File "/usr/local/lib/pulp/lib64/python3.9/site-packages/django/core/management/__init__.py", line 381, in execute_from_command_line
    utility.execute()
  File "/usr/local/lib/pulp/lib64/python3.9/site-packages/django/core/management/__init__.py", line 375, in execute
    self.fetch_command(subcommand).run_from_argv(self.argv)
  File "/usr/local/lib/pulp/lib64/python3.9/site-packages/django/core/management/base.py", line 323, in run_from_argv
    self.execute(*args, **cmd_options)
  File "/usr/local/lib/pulp/lib64/python3.9/site-packages/django/core/management/base.py", line 364, in execute
    output = self.handle(*args, **options)
  File "/usr/local/lib/pulp/lib64/python3.9/site-packages/django/core/management/base.py", line 83, in wrapped
    res = handle_func(*args, **kwargs)
  File "/usr/local/lib/pulp/lib64/python3.9/site-packages/django/core/management/commands/migrate.py", line 232, in handle
    post_migrate_state = executor.migrate(
  File "/usr/local/lib/pulp/lib64/python3.9/site-packages/django/db/migrations/executor.py", line 117, in migrate
    state = self._migrate_all_forwards(state, plan, full_plan, fake=fake, fake_initial=fake_initial)
  File "/usr/local/lib/pulp/lib64/python3.9/site-packages/django/db/migrations/executor.py", line 147, in _migrate_all_forwards
    state = self.apply_migration(state, migration, fake=fake, fake_initial=fake_initial)
  File "/usr/local/lib/pulp/lib64/python3.9/site-packages/django/db/migrations/executor.py", line 245, in apply_migration
    state = migration.apply(state, schema_editor)
  File "/usr/local/lib/pulp/lib64/python3.9/site-packages/django/db/migrations/migration.py", line 124, in apply
    operation.database_forwards(self.app_label, schema_editor, old_state, project_state)
  File "/usr/local/lib/pulp/lib64/python3.9/site-packages/django/db/migrations/operations/fields.py", line 178, in database_forwards
    schema_editor.remove_field(from_model, from_model._meta.get_field(self.name))
  File "/usr/local/lib/pulp/lib64/python3.9/site-packages/django/db/backends/base/schema.py", line 481, in remove_field
    self.execute(self._delete_fk_sql(model, fk_name))
  File "/usr/local/lib/pulp/lib64/python3.9/site-packages/django/db/backends/base/schema.py", line 137, in execute
    cursor.execute(sql, params)
  File "/usr/local/lib/pulp/lib64/python3.9/site-packages/django/db/backends/utils.py", line 67, in execute
    return self._execute_with_wrappers(sql, params, many=False, executor=self._execute)
  File "/usr/local/lib/pulp/lib64/python3.9/site-packages/django/db/backends/utils.py", line 76, in _execute_with_wrappers
    return executor(sql, params, many, context)
  File "/usr/local/lib/pulp/lib64/python3.9/site-packages/django/db/backends/utils.py", line 84, in _execute
    return self.cursor.execute(sql, params)
  File "/usr/local/lib/pulp/lib64/python3.9/site-packages/django/db/utils.py", line 89, in __exit__
    raise dj_exc_value.with_traceback(traceback) from exc_value
  File "/usr/local/lib/pulp/lib64/python3.9/site-packages/django/db/backends/utils.py", line 84, in _execute
    return self.cursor.execute(sql, params)
django.db.utils.IntegrityError: update or delete on table "core_basedistribution" violates foreign key constraint "file_filedistributio_basedistribution_ptr_ba2e0c52_fk_core_base" on table "file_filedistribution"
DETAIL:  Key (pulp_id)=(ffd35cf3-19e0-46d3-b45b-9e437cd76913) is still referenced from table "file_filedistribution".
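
In the traceback above the violation surfaces in the later remove_field step rather than at the DELETE itself, which is consistent with a foreign key whose check is deferred until later in the transaction. The sketch below reproduces that timing with Python's built-in sqlite3. It rests on assumptions: the DEFERRABLE INITIALLY DEFERRED clause stands in for however the constraint is declared in the PostgreSQL setup from this thread, the schema is reduced to pk columns, and delete_parent_while_referenced is a hypothetical name.

```python
import sqlite3


def delete_parent_while_referenced():
    """Delete a master row that a detail row still points at and
    return the error raised when the deferred check finally runs."""
    # isolation_level=None leaves transaction control to explicit BEGIN/COMMIT.
    conn = sqlite3.connect(":memory:", isolation_level=None)
    conn.execute("PRAGMA foreign_keys = ON")
    conn.execute("CREATE TABLE core_basedistribution (pulp_id TEXT PRIMARY KEY)")
    conn.execute(
        "CREATE TABLE file_filedistribution ("
        " basedistribution_ptr_id TEXT PRIMARY KEY"
        " REFERENCES core_basedistribution (pulp_id)"
        " DEFERRABLE INITIALLY DEFERRED)"
    )
    conn.execute("INSERT INTO core_basedistribution VALUES ('d1')")
    conn.execute("INSERT INTO file_filedistribution VALUES ('d1')")

    conn.execute("BEGIN")
    # Accepted for now: the deferred constraint is not checked yet.
    conn.execute("DELETE FROM core_basedistribution WHERE pulp_id = 'd1'")
    try:
        conn.execute("COMMIT")  # the pending violation is reported here
    except sqlite3.IntegrityError as exc:
        conn.execute("ROLLBACK")
        return str(exc)
    return None


result = delete_parent_while_referenced()
print(result)
```

The detail row still holds the old pointer when the old master row is deleted, so the delete only becomes valid once that reference is gone.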

to='core.Distribution'),
preserve_default=False,
),
migrations.RunPython(migrate_data_from_old_master_model_to_new_master_model),
Member

I know, this is asking a lot, but please provide a backwards migration.

@bmbouter bmbouter closed this Mar 25, 2021
@bmbouter bmbouter reopened this Mar 25, 2021
@bmbouter
Member Author

Well, this is unfortunate. GitHub won't let me force push my new ref because it says:

 ! [remote rejected] switch-pulp-file-distribution-to-use-new-mastermodel -> switch-pulp-file-distribution-to-use-new-mastermodel (cannot lock ref 'refs/heads/switch-pulp-file-distribution-to-use-new-mastermodel': is at eab39265b0e7f32271c69d12df8cf4b2674eb0e5 but expected 0312ce2e209573f492a306fb42bd695956fa5809)
error: failed to push some refs to 'git@github.com:bmbouter/pulp_file.git'

So now we're going to have to use this PR instead... #482

@bmbouter bmbouter closed this Mar 25, 2021
@bmbouter bmbouter deleted the switch-pulp-file-distribution-to-use-new-mastermodel branch March 25, 2021 15:19
@bmbouter
Member Author

Ok now this PR seems to allow me to force push (after deleting my fork and recreating it). #484

@bmbouter
Member Author

I'm leaving comments here still because there was such good substantive discussion here. I'll reply to comments in either place though, whatever works for reviewers.
