Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Remove import/export's dependency on pulp_ids. #745

Merged
merged 1 commit into from Jul 14, 2020

Conversation

ggainey
Copy link
Contributor

@ggainey ggainey commented Jun 6, 2020

Dealing this required slimming down the fields we export/import,
so #6515 got addressed along the way as well.

closes #6807
closes #6515

@pulpbot
Copy link
Member

pulpbot commented Jun 6, 2020

Attached issue: https://pulp.plan.io/issues/6807

Attached issue: https://pulp.plan.io/issues/6515

ggainey added a commit to ggainey/pulp_rpm that referenced this pull request Jun 6, 2020
ggainey added a commit to ggainey/pulp_file that referenced this pull request Jun 6, 2020
ggainey added a commit to ggainey/pulp_rpm that referenced this pull request Jun 6, 2020
ggainey added a commit to ggainey/pulp_rpm that referenced this pull request Jun 6, 2020
ggainey added a commit to ggainey/pulp_file that referenced this pull request Jun 6, 2020
ggainey added a commit to ggainey/pulp_rpm that referenced this pull request Jun 6, 2020
ggainey added a commit to ggainey/pulp_file that referenced this pull request Jun 6, 2020
@daviddavis
Copy link
Contributor

This change looks good to me. So export_id on the Content in the downstream Pulp stores the pulp_id of the upstream Pulp server? Does it handle the case where the pulp_id in the upstream Pulp changes? I'm imagining this could happen if a Content gets removed/orphan cleaned up, and then created again later.

@ggainey
Copy link
Contributor Author

ggainey commented Jun 8, 2020

This change looks good to me. So export_id on the Content in the downstream Pulp stores the pulp_id of the upstream Pulp server? Does it handle the case where the pulp_id in the upstream Pulp changes? I'm imagining this could happen if a Content gets removed/orphan cleaned up, and then created again later.

For Content, export_id is Just Another Field, that will be updated when the matching natural-key-content gets imported/updated. The flow is a little complicated, but should Just Work.

Consider the following scenario:

  1. In upstream, package foo-1.2.3-4 has pulp_id 100, relative_path '/foo.rpm', and is exported and then imported.
  2. There now exists in downstream a core_content entry, pulp_id 2000, export_id 100, and a matching rpm_rpmpackage entry. core_contentartifact has content_id 2000 and relative_path '/foo.rpm'.
  3. Upstream deletes and recreates foo-1.2.3-4, so that it now has export_id 157.
  4. On import, we end up updating core_content 2000 (since natural-key works), export_id 157.
  5. Core_contentartifact import happens. We attempt to create a core_contentartifact by looking for export_id=157. We find pulp_id 2000, find the existing core_contentartifact record, and the update happens correctly.

I think we're good in this case.

@daviddavis
Copy link
Contributor

Code LGTM. I am going to test this out. @bmbouter or @dkliban can we get a second review?

CHANGES/6515.bugfix Outdated Show resolved Hide resolved
@@ -257,6 +257,8 @@ class Content(MasterModel, QueryMixin):

objects = BulkCreateManager()

export_id = models.UUIDField(null=True) # Used by PulpImport/Export processing
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a significant change because all plugins will have to make migrations. The docstring above needs it to be explained. What does this do and why here?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So it's not as significant as it looks. It's on the base content table and doesn't need to be populated. It's essentially data that's stored during an import to link up Content to ContentArtifact since the Content table doesn't have a natural key.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

None of the plugins need migrations, it's happening in core_content only - pulp_file and pulp_rpm already work with it, only pulpcore needed a migration.

See #745 (comment) for an explanation. The net is, core_contentartifact has no 'natural key' other than pulp_id, and doesn't know what it's 'pointing to' at this part of the process. export_id gives us a way to teach ContentArtifactResource a way to find the right ContentResource to point itself to when being imported.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see what you all are saying about the detail models not needing a migration. Is the data only stored while an import is running and then when the import completes it's no longer stored? I'm trying to determine if this is a field to be stored in the long-term or not.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the database, a given exported Detail ends up in two tables - core_content, and the detail-specific table, like say file_filecontent. This field exists only in core_content. It is filled when/if a given detail is imported - which action results in both the core_content and the content table-entries being created and filled in, because that's how django-import-export works. It then exists in the table (and the model, but nobody cares about its existence other than import/export). Import of ContentArtifact relies on export_id existing, to have any hope of recreating the core/artifact mapping being imported.

ggainey added a commit to ggainey/pulp_rpm that referenced this pull request Jun 9, 2020
ggainey added a commit to ggainey/pulp_file that referenced this pull request Jun 9, 2020
ggainey added a commit to ggainey/pulp_rpm that referenced this pull request Jun 9, 2020
ggainey added a commit to ggainey/pulp_file that referenced this pull request Jun 9, 2020
ggainey added a commit to ggainey/pulp_file that referenced this pull request Jun 9, 2020
ggainey added a commit to ggainey/pulp_rpm that referenced this pull request Jun 9, 2020
ggainey added a commit to ggainey/pulp_rpm that referenced this pull request Jul 13, 2020
ggainey added a commit to ggainey/pulp_rpm that referenced this pull request Jul 14, 2020
@@ -0,0 +1 @@
Cured import/export's addiction to pulp_ids.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it clear for a user who would read a changelog what's fixed?
maybe it can be .misc, so implementation details are not exposed to a user.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This entire PR is nothing but low-level implementation change. From the user's POV, there isn't anything to see here - import/export works as before. From a plugin-author's POV, that is a meaningful changelog (or at least meaningful enough to go look at the attached issue). I honestly don't see how moving this to 'misc' changes anything?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Moving to .misc makes the changelog not published/visible.
I'm not sure if it's a good proposal or not. We have a dedicated plugin-api changelog, though this change is not plugin api per se.

We can leave it as is, it's just quite confusing for users, imo, and the changelog is for them.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As a side comment - ".misc makes the changelog not published/visible" was news to me. Our docs say nothing about it, and I see that there is one line in the towncrier doc that could be taken as implying that, but it never occurred to me that it means "not emitted".

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll rename the file and repush.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was going to leave a similar comment. I also thought a user reading this would not understand it.

Dealing this required slimming down the fields we export/import,
so #6515 got addressed along the way as well.

Required PR: pulp/pulp_file#406
closes #6807
closes #6515
ggainey added a commit to ggainey/pulp_rpm that referenced this pull request Jul 14, 2020
ggainey added a commit to ggainey/pulp_file that referenced this pull request Jul 14, 2020
Copy link
Member

@bmbouter bmbouter left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This all looks great to me. Thanks @ggainey !

daviddavis pushed a commit to pulp/pulp_file that referenced this pull request Jul 14, 2020
@daviddavis daviddavis merged commit 8707c40 into pulp:master Jul 14, 2020
ggainey added a commit to ggainey/pulp_rpm that referenced this pull request Jul 23, 2020
daviddavis pushed a commit to pulp/pulp_rpm that referenced this pull request Jul 24, 2020
@ggainey ggainey deleted the natural_keys_POC branch April 27, 2021 13:21
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants