Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Verify if a manifest already exists locally when syncing a remote repo. #1133

Merged
merged 1 commit into from Nov 29, 2022

Conversation

decko
Copy link
Member

@decko decko commented Nov 10, 2022

Closes #1047

@decko decko force-pushed the check_for_existing_manifests_1047 branch from 2e618b6 to 47476d6 Compare November 10, 2022 20:08
@decko decko requested a review from ipanova November 10, 2022 20:33
@decko decko closed this Nov 10, 2022
@decko decko reopened this Nov 10, 2022
@decko decko force-pushed the check_for_existing_manifests_1047 branch 6 times, most recently from 94e8c78 to e3002da Compare November 18, 2022 21:39
@decko decko marked this pull request as ready for review November 18, 2022 21:42
CHANGES/1047.feature Outdated Show resolved Hide resolved
pulp_container/app/tasks/sync_stages.py Outdated Show resolved Hide resolved
@decko decko closed this Nov 20, 2022
@decko decko reopened this Nov 20, 2022
@decko decko requested a review from ipanova November 20, 2022 22:02
@decko decko force-pushed the check_for_existing_manifests_1047 branch from e3002da to fa44324 Compare November 21, 2022 13:18
pulp_container/app/tasks/sync_stages.py Outdated Show resolved Hide resolved
digest = dl_res.artifact_attributes["sha256"]
validate_manifest(content_data, media_type, digest)
for artifact in asyncio.as_completed(to_download_artifact):
saved_artifact, content_data, raw_data, response = await artifact
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

consider returning here just raw_data and on the line after do json.load for the content_data

pulp_container/app/tasks/sync_stages.py Show resolved Hide resolved
pulp_container/app/tasks/sync_stages.py Outdated Show resolved Hide resolved
response.artifact_attributes["file"] = response.path

saved_artifact = await sync_to_async(_save_artifact_blocking)(response.artifact_attributes)
content_data = json.loads(raw_data)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

not sure, you probably don't need to return content_data here. just raw_data should suffice

raw_data = saved_artifact.file
content_data = json.load(raw_data)
saved_artifact.file.close()
return saved_artifact, content_data, raw_data
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

how about we return just just raw_data and later on where it is needed we can json.load it. I am trying to avoid to carry both content_data and raw_data

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tried it, but there are some places where I need to use _get_content_data_blocking and _download_and_save_artifact_data. Specially in the create_listed_manifest, where there are statements under the else branch that need content_data. I will end calling json.load more times than if left where they are now. 😭

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am also in favour of removing content_data from the return parameters. You will call json.load two times instead of one. Or not? Still, the number of parameters will decrease all around the code.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it will be written in the code twice - under if and under else branch but in the end of the day called it will be once.

@decko decko force-pushed the check_for_existing_manifests_1047 branch from fa44324 to e2e8b13 Compare November 22, 2022 20:12
@decko decko force-pushed the check_for_existing_manifests_1047 branch 3 times, most recently from d5b525a to f860076 Compare November 23, 2022 15:16
@decko decko force-pushed the check_for_existing_manifests_1047 branch 3 times, most recently from d21e24c to a029df5 Compare November 24, 2022 16:18
saved_artifact = manifest.contentartifact_set.first().artifact
raw_data = saved_artifact.file.read()
content_data = json.loads(raw_data)
saved_artifact.file.close()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you can use context manager here

Copy link
Member

@ipanova ipanova left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good thank you! We can look into how optimize on the returned parameters in the follow up PR.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

As a user during sync of remote repo I have first checked wehther manifests exist locally
5 participants