New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Syncing custom/unknown repository metadata #1439
Conversation
closes #5432 https://pulp.plan.io/issues/5432 Required PR: pulp#1438 Required PR: pulp#1439
|
One minor change is missing here. |
pulp_rpm/app/tasks/synchronizing.py
Outdated
| url=urljoin(remote_url, record.location_href), | ||
| relative_path=record.location_href, | ||
| remote=self.remote, | ||
| deferred_download=self.deferred_download |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Historically we had all the repo metadata always downloaded.
I'm not sure if it's a good or bad idea to download it on_demand only.
One example (in favor of always downloading repository metadata):
- sync a repo with on_demand policy
- copy custom metadata to another repo
- in the meantime metadata changes upstream, previous metadata is no longer available to be downloaded.
To be fair, it can happen to any content, however metadata is a bit different, it's always relevant to a specific set of content in a repo, so when you copy it, you do want to preserve the "old" metadata because it will be relevant to the content you have in your copy.
I think my suggestion would be to always set deferred_download to False.
Does it make sense?
Any thoughts? @fabricio-aguiar @daviddavis
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it makes sense to me, but I have no idea how often metadata changes
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@daviddavis , @dkliban , @ipanova , what do you think about ALWAYS downloading custom repository metadata?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1 to always download custom repo metadata
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
let's download it all the time
| @@ -0,0 +1 @@ | |||
| Syncing custom/unknown repository metadata. | |||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would also make it a feature and not just misc.
Users don't care much about models and they might not fully understand what publish is, however they know very well the term "sync" and it's a feature that they can perform now sync (and publish is kinda implied :) for them)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
since publish is implied for them, I think it is better to make it a feature on the publish PR, what do you think?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure, works for me
3ccdb99
to
b61c855
Compare
closes #5432 https://pulp.plan.io/issues/5432 Required PR: pulp#1438 Required PR: pulp#1439
closes #5432 https://pulp.plan.io/issues/5432 Required PR: pulp#1438 Required PR: pulp#1439
|
|
||
| package_dupe_criteria = [ | ||
| {'model': Package, 'field_names': ['name', 'epoch', 'version', 'release', 'arch']}, | ||
| {'model': RepoMetadataFile, 'field_names': ['data_type']}, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍
pulp_rpm/app/tasks/synchronizing.py
Outdated
| package_dupe_criteria = {'model': Package, | ||
| 'field_names': ['name', 'epoch', 'version', 'release', 'arch']} | ||
|
|
||
| package_dupe_criteria = [ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@fabricio-aguiar , since it no longer contains criteria for packages only, I suggest to rename it to something more general, like dupe_criteria.
closes #5431 https://pulp.plan.io/issues/5431 Required PR: pulp#1438
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks!
closes #5432 https://pulp.plan.io/issues/5432 Required PR: pulp#1438 Required PR: pulp#1439
closes #5432 https://pulp.plan.io/issues/5432 Required PR: pulp#1438 Required PR: pulp#1439
closes #5432 https://pulp.plan.io/issues/5432 Required PR: pulp#1438 Required PR: pulp#1439
closes #5432 https://pulp.plan.io/issues/5432 Required PR: pulp#1438 Required PR: pulp#1439
closes #5431
https://pulp.plan.io/issues/5431
Required PR: #1438