This repository has been archived by the owner on Jan 9, 2023. It is now read-only.

Fixed migration of docker content when corrupted content has been #352

Merged
merged 1 commit on Apr 23, 2021
1 change: 1 addition & 0 deletions CHANGES/8594.bugfix
@@ -0,0 +1 @@
Handled properly skipping of corrupted or missing docker content.
16 changes: 13 additions & 3 deletions pulp_2to3_migration/app/plugin/content.py
@@ -407,6 +407,8 @@ def get_remote_by_importer_id(importer_id):
downloaded=pulp2content.downloaded
)
if artifact is None:

@ipanova (Member Author) commented on Apr 14, 2021

I am pretty sure I have screwed up the indentation, since at least the counter during the migration got off, but the gist of this change is that we cannot just continue when the artifact is None for content that has futures. Those futures need to be resolved; otherwise we get into the problem where one batch contains both manifests and tags (in the ContentSaverStage), and since tags rely on manifests being already saved, at the tags' _pre_save_hook() the tags won't have the needed information at save() time.

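To make the batching concern above concrete, here is a small, self-contained sketch of the resolution pattern the diff further down moves to: futures are flushed whenever a full batch has accumulated, and once more after the loop. `FakeDC`, the unit names, and the batch-size value are illustrative stand-ins, not the real `DeclarativeContent` API.

```python
import asyncio

DEFAULT_BATCH_SIZE = 3  # illustrative value, not the one pulpcore uses


class FakeDC:
    """Toy stand-in for a DeclarativeContent whose future must be awaited."""

    def __init__(self, name):
        self.name = name

    async def resolution(self):
        # pretend to wait until a later stage has fully saved this content
        await asyncio.sleep(0)
        return self.name


async def migrate(units):
    futures = []
    for unit in units:
        dc = FakeDC(unit)
        futures.append(dc)
        # Resolve once a full batch has accumulated (the PR also drops the old
        # 'pb.done == pb.total' trigger seen in the diff below).
        if len(futures) >= DEFAULT_BATCH_SIZE:
            for pending in futures:
                print('resolved', await pending.resolution())
            futures.clear()

    # After the loop, flush whatever never filled a whole batch (the block the
    # PR adds at the end of the method).
    for pending in futures:
        print('resolved', await pending.resolution())
    futures.clear()


asyncio.run(migrate(['unit-{}'.format(i) for i in range(7)]))
```

With seven units and a batch size of three, the first six resolve in two batches and the last one in the final flush.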

@ipanova (Member Author) commented:

We also need to check with @quba42 on how this change will impact the Debian pipeline.
I have checked the RPM one; it does have futures (Rpms) that get resolved, and then in the interrelate stage the relations between modules and RPMs are created. In the case where a modular RPM is missing, one relation fewer is created. tl;dr: the RPM plugin is not impacted.

@quba42 (Contributor) commented on Apr 22, 2021

I am not 100% sure.
Here is what I do rely on: later types, in the order given by the ordered dict DebMigrator.content_models, may query for the objects created by earlier types from their create_pulp3_content() method, e.g.:

release = pulp3_models.Release.objects.filter(<some_uniquely_identifying_filter>).first()

Currently, if such queries return None, the pipeline will fail.

I understand that I need to add graceful handling for the case where this query returns None because the target content was never created (its artifact was corrupt, so it was skipped); that should be trivial to do.

Edit: See #353 (perhaps not quite as trivial as I had hoped).

I am not sure whether there can also be a case where the query returns None because the target content has not yet passed sufficiently far through the pipeline, which would be a bigger problem...
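
A minimal sketch of the graceful handling discussed here, assuming the lookup from the quoted query lives in a later Debian type's create_pulp3_content(); the filter keyword, the logger, and returning None as the "skip" signal are placeholder assumptions rather than the actual pulp_deb migration API, and the exact method signature may differ in the real code (see #353 for the real follow-up):

```python
def create_pulp3_content(self):
    # An earlier type in DebMigrator.content_models should already have created
    # and saved this Release; if its artifact was corrupt, it was skipped.
    release = pulp3_models.Release.objects.filter(
        codename=self.codename,  # placeholder for <some_uniquely_identifying_filter>
    ).first()
    if release is None:
        # The referenced content never made it into Pulp 3, so skip this
        # dependent unit instead of letting the pipeline fail.
        _logger.warn(_('Skipping {}; the Release it references is missing.'.format(self)))
        return None
    ...  # build and return the Pulp 3 content as before
```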

@ipanova (Member Author) commented:

It should not be the case, because the Debian package is a future-type content and will be resolved/saved (or skipped if corrupted) by the time the PackageReleaseComponent content is saved.

if pb:
pb.increment()
continue

if is_lazy_type and pulp2lazycatalog:
@@ -439,8 +441,11 @@ def get_remote_by_importer_id(importer_id):
await self.put(dc)

if not at_least_one_lce_migrated:
_logger.warn(_('On_demand content cannot be migrated without a remote '
'pulp2 unit_id: {}'.format(pulp2content.pulp2_id)))
_logger.warn(
_('On_demand content cannot be migrated without a remote '
'pulp2 unit_id: {}'.format(pulp2content.pulp2_id)
)
)
future_relations.update({'lces': list(pulp2lazycatalog)})
else:
relative_path = (
@@ -461,12 +466,17 @@ def get_remote_by_importer_id(importer_id):

if has_future and dc:
futures.append(dc)
resolve_futures = len(futures) >= DEFAULT_BATCH_SIZE or pb.done == pb.total
resolve_futures = len(futures) >= DEFAULT_BATCH_SIZE
if resolve_futures:
for dc in futures:
await dc.resolution()
futures.clear()

# resolve futures if there are any left
for dc in futures:
await dc.resolution()
futures.clear()


class UpdateLCEs(Stage):
"""
15 changes: 14 additions & 1 deletion pulp_2to3_migration/app/plugin/docker/migrator.py
@@ -260,10 +260,23 @@ async def _pre_save(self, batch):

"""
for dc in batch:
if type(dc.content) == Tag:
if isinstance(dc.content, Tag):
related_man_id = dc.extra_data.get('tag_rel')
# find manifest by id
# We are relying on the order of the processed DC
# Manifests should have passed through ContentSaver stage already
man = Manifest.objects.filter(digest=related_man_id).first()
dc.content.tagged_manifest = man

async def _post_save(self, batch):
"""
Remove tag if it points to a tagged_manifest=null

Args:
batch (list of :class:`~pulpcore.plugin.stages.DeclarativeContent`): The batch of
:class:`~pulpcore.plugin.stages.DeclarativeContent` objects to be saved.

"""
for dc in batch:
if isinstance(dc.content, Tag) and not dc.content.tagged_manifest:
dc.content.delete()
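
As a plain-Python illustration of the combined effect of the two hooks above (Django models and the stage machinery replaced by trivial toy stand-ins): a Tag whose Manifest was skipped ends up with tagged_manifest=None in the pre-save pass and is then dropped in the post-save pass.

```python
# Toy stand-ins only; the real code works on Django models and DeclarativeContent.
saved_manifests = {'sha256:aaa': 'manifest-aaa'}  # 'sha256:bbb' was skipped as corrupt


class ToyTag:
    def __init__(self, name, related_man_id):
        self.name = name
        self.related_man_id = related_man_id
        self.tagged_manifest = None
        self.deleted = False

    def delete(self):
        self.deleted = True


batch = [ToyTag('latest', 'sha256:aaa'), ToyTag('broken', 'sha256:bbb')]

# _pre_save: look the manifest up by digest; a skipped manifest yields None
for tag in batch:
    tag.tagged_manifest = saved_manifests.get(tag.related_man_id)

# _post_save: remove tags that ended up pointing at nothing
for tag in batch:
    if not tag.tagged_manifest:
        tag.delete()

print([(t.name, t.deleted) for t in batch])  # [('latest', False), ('broken', True)]
```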