Skip to content
This repository has been archived by the owner on Jan 9, 2023. It is now read-only.

Fixed migration of docker content when corrupted content has been #352

Merged
merged 1 commit into from Apr 23, 2021

Conversation

ipanova
Copy link
Member

@ipanova ipanova commented Apr 14, 2021

identified.

[noissue]

@ipanova ipanova marked this pull request as draft April 14, 2021 17:46
@@ -129,6 +129,7 @@ def pipeline_stages(self):
DockerContentSaver(),
ResolveContentFutures(),
InterrelateContent(),
#RemoveEmtpyTags(),
Copy link
Member Author

@ipanova ipanova Apr 14, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

leaving post_hook only would not solve problem for upgraded installations that already contain problem with tagged_manifest=null. In this case a migration is needed to remove existing tags that point to null manifest andd then leave all the subsequent job for future runs on the post_save hook.
or - I get implemented this stage that would loop trough dc and check type(dc.content) == Tag and remove those that have tagged_manifest=null.

@@ -407,61 +408,62 @@ def get_remote_by_importer_id(importer_id):
downloaded=pulp2content.downloaded
)
if artifact is None:
Copy link
Member Author

@ipanova ipanova Apr 14, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i am pretty sure i have screwed up the indentation since at least the counter sduring the migration got off, but the gist of this change is that we cannot continue when artifact is None due to the content that has futures, those futures need to be resolved otherwise we would get into the problem where in one batch there would be manifests+tags( in the ContentSaverStage), and since tags rely on manifests being already saved at the tags _pre_save_hook() tags won't have needed information at the save() time.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we also need to check with @quba42 on how this change will impact debian pipeline.
I have checked rpm one - it does have futures -Rpms, that are getting resolved and then in the interrelate stage relation betwen modules and rpms are created. For the case when a modular rpm would miss, there would be created one relation less. tldr - rpm plugin is not impacted.

Copy link
Contributor

@quba42 quba42 Apr 22, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not 100% sure.
Here is what I do rely on: Later types in the order given by the ordered dict DebMigrator.content_models, may query for the objects created by earlier types from their create_pulp3_content() method. e.g.:

release = pulp3_models.Release.objects.filter(<some_uniquely_identifying_filter>).first()

Currently, if such queries return None, the pipeline will fail.

I understand I need to add graceful handling for the case that this query will return None because the target content was never created because it's artifact was corrupt, and it was therefore skipped (should be trivial to do).

Edit: See #353 (perhaps not quite as trivial as I had hoped).

I am not sure if there can also be a case were the query returns None because the target content has not yet passed sufficiently far through the pipeline, which would be a bigger problem...

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it should not be the case because debian package is a future type content and will be resolved/saved/skipped if corrupted/ by the time PackageReleaseComponent content will be saved.

@ipanova ipanova force-pushed the empty-tags-migration branch 4 times, most recently from 16a0850 to 3e57bca Compare April 20, 2021 11:16
@ipanova ipanova marked this pull request as ready for review April 20, 2021 12:46
Comment on lines 411 to 412
skipped_content += 1
else:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As discussed, maybe let's not nest it more and try to keep continue in place.
It will mean that the futures needs to be set for resolution at the end, outside of the loop.

if pb:
pb.increment()

if has_future and dc:
futures.append(dc)
resolve_futures = len(futures) >= DEFAULT_BATCH_SIZE or pb.done == pb.total
all_content_done = pb.done == pb.total or (pb.done + skipped_content) == pb.total
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems like that skipped content is counted as a part of pb.done anyway, becasue of the line 464.

if pb:
pb.increment()

if has_future and dc:
futures.append(dc)
resolve_futures = len(futures) >= DEFAULT_BATCH_SIZE or pb.done == pb.total
all_content_done = pb.done == pb.total or (pb.done + skipped_content) == pb.total
resolve_futures = len(futures) >= DEFAULT_BATCH_SIZE or all_content_done
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you decide to have set future resolution outside of the loop as well, we can get rid off the second condition completely.


"""
for dc in batch:
if type(dc.content) == Tag and not dc.content.tagged_manifest:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not critical here so it;s up to you, but I'd vote for the isinstance instead of type. Taught myself not to use type in Python at all. Just a as a good habit, so it doesn't bite in certain situations.

@ipanova ipanova force-pushed the empty-tags-migration branch 2 times, most recently from 6a76c2f to c70cd6b Compare April 22, 2021 13:23
@@ -0,0 +1 @@
Handle skipping of corrupted docker content that has futures.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about Handled properly skipping of corrupted docker content.?
Past tense for changelog entries + I think users have no idea what futures are, so I'm trying to make the changelog entry more user oriented.

Copy link
Member

@goosemania goosemania left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks a lot!
Please see my only comment about the changelog entry.

@ipanova ipanova merged commit dcaefac into pulp:master Apr 23, 2021
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants