New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add a package retention policy feature #1752
Conversation
|
Attached issue: https://pulp.plan.io/issues/5950 |
|
Need to sort out how to get the extension installed on user systems. |
| @@ -219,6 +223,9 @@ class Meta: | |||
| 'name', 'epoch', 'version', 'release', 'arch', 'checksum_type', 'pkgId' | |||
| ) | |||
|
|
|||
| class ReadonlyMeta: | |||
| readonly = ["evr"] | |||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is necessary because otherwise Django will attempt to insert an empty string into the field by default.
99fdb67
to
67b62a8
Compare
fa57930
to
49c737c
Compare
598f682
to
662f635
Compare
|
@goosemania @mdellweg This is feature-complete, I just need to write a functional test (already tested manually) and decide if we want to go forwards with dumping the evr_t extension into the DB separate from Katello or accomplish that in a different way. Feel free to drop comments. |
| @@ -306,16 +306,25 @@ class RpmRepositorySerializer(RepositorySerializer): | |||
| """ | |||
|
|
|||
| metadata_signing_service = serializers.HyperlinkedRelatedField( | |||
| help_text="A reference to an associated signing service.", | |||
| help_text=_("A reference to an associated signing service."), | |||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A few of these were missing some i11n
pulp_rpm/app/serializers.py
Outdated
| view_name='signing-services-detail', | ||
| queryset=AsciiArmoredDetachedSigningService.objects.all(), | ||
| many=False, | ||
| required=False, | ||
| allow_null=True | ||
| ) | ||
| package_retention_policy = serializers.IntegerField( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Completely open to renaming this
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, I'm +1 to rename but I don't have good suggestions :D
For the sake of brainstorming ideas:
latest_package_count
package_recency
recent_package_count
keep_recent_packages
retain_package_count
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1 to keep_recent_packages or (Ina's suggestion) retain_package_versions
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@ipanova @ggainey @dkliban @goosemania @daviddavis Any strong preferences between those two (or a different one)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How about retained_package_versions? keep_ and retain_ are verbs, but what you're really specifying is a noun, "how many package-versions should I keep in this repo". Does that make sense?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it could be considered either a noun or a verb from the user perspective. It's a fine suggestion, though. I don't really want to bike shed this forever, maybe we should just set up a poll and whichever one wins is the one we go with.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good idea - maybe a quick 24-hr poll, with my, ipanova's, and your suggestions? Note that I am not strongly tied to my suggestion - any of those three would be fine I think, if you picked one and just went with it I'd be fine.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great job! 🐱
| CREATE TRIGGER evr_update_trigger | ||
| BEFORE UPDATE OF epoch, version, release | ||
| ON rpm_package | ||
| FOR EACH ROW | ||
| WHEN ( | ||
| OLD.epoch IS DISTINCT FROM NEW.epoch OR | ||
| OLD.version IS DISTINCT FROM NEW.version OR | ||
| OLD.release IS DISTINCT FROM NEW.release | ||
| ) | ||
| EXECUTE PROCEDURE evr_trigger(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will it be ever used?
I'm +1 to keep just because it's a safe thing to do. I'm just wondering if there is any workflow or use case for it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think so.
| from django.db import models | ||
| from django.db.models import Window, F |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit-pick. Would it make sense to do one way or the other and not both?
from django.db import models
from django.db.models import ...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Leaving it this way is my personal preference, since we've got 3 levels of imports and not just 2.
| return self.annotate( | ||
| age=Window( | ||
| expression=RowNumber(), | ||
| partition_by=[F('name'), F('arch')], | ||
| order_by=F('evr').desc() | ||
| ) | ||
| ) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sweeeet 🥮
| if package.age > self.package_retention_policy: | ||
| old_packages.append(package.pk) | ||
|
|
||
| new_version.remove_content(Content.objects.filter(pk__in=old_packages)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I meant to ask if it's ok to remove the ones which are not a part of any repo version apart from this incomplete one. And it seem like yes, because remove_content does this https://github.com/pulp/pulpcore/blob/master/pulpcore/app/models/repository.py#L602.
Not related to this PR but related to the topic of removal.
I guess this improvement ^^ to the removal was added later than the advisory changes. I added clarifications as you asked but now it seems like the code can be simplified, please check #1757
| help_text=_("The number of verisons of each package to keep in the repository; " | ||
| "older versions will be purged. A value of '0' will disable this" | ||
| "feature and keep all versions of the package."), | ||
| min_value=0, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Alternatively, it can be PositiveIntegerField.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think for the serializer there is no such thing as PositiveIntegerField
https://www.django-rest-framework.org/api-guide/fields/#integerfield
pulp_rpm/app/models/package.py
Outdated
| Partitioning by architecture is important because there i686 and x86_64 packages | ||
| with same name and same or close verison numbers are not interchangeable. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It just crossed my mind that in Pulp3, SRPMs are also stored as Package, the only difference is in arch.
Since you are already taking care of it, maybe it make sense to throw in a comment about SRPMs specifically, it might not be obvious that they are covered.
pulp_rpm/app/serializers.py
Outdated
| view_name='signing-services-detail', | ||
| queryset=AsciiArmoredDetachedSigningService.objects.all(), | ||
| many=False, | ||
| required=False, | ||
| allow_null=True | ||
| ) | ||
| package_retention_policy = serializers.IntegerField( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, I'm +1 to rename but I don't have good suggestions :D
For the sake of brainstorming ideas:
latest_package_count
package_recency
recent_package_count
keep_recent_packages
retain_package_count
closes: #5402 https://pulp.plan.io/issues/5402 [nocoverage]
1607553
to
bf6e891
Compare
| cls.remote_api = RemotesRpmApi(cls.client) | ||
| delete_orphans(cls.cfg) | ||
|
|
||
| def test_sync_with_retention(self): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@goosemania This functional test only tests the retention policy feature with sync, do you think that is sufficient for now?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Works for me. I saw you've already put a note in the coverage doc, thanks!
| {'duck': ['0.7', '0.6'], 'kangaroo': ['0.2'], 'walrus': ['0.71']}, | ||
| versions_for_packages | ||
| ) | ||
| # TODO: Test that modular RPMs unaffected? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@goosemania Do you think this needs a test up-front or just document it for later?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm fine with having it later.
coverage.md
Outdated
| @@ -41,3 +41,6 @@ Manual Coverage | |||
| | As a user, when a module is removed, its packages are removed as well ( not referenced by other modules) | NO | | | |||
| | **Consumer cases** | | | | |||
| | As a user, I can use dnf to install all the content served by Pulp | PART | only covers rpm installation | | |||
| | **Retention** | | | | |||
| | As a user, I can have a repository option that retains the latest N packages of the same name | PART | No coverage of packages with differing arch in same repo (src, i686, x86_64), no coverage of non-sync repo modifications. | | |||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍
4ff2f30
to
72e3b8f
Compare
| ) | ||
|
|
||
|
|
||
| class RetentionPolicyTestCase(PulpTestCase): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I can rename this too but the name still seems kind of fitting when used as a noun.
| @@ -0,0 +1 @@ | |||
| Add a retention policy feature - when specified, the latest N versions of each package will be kept and older versions will be purged. | |||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I mentioned elsewhere but, "retention policy" still seems like an apt description of the feature in noun form, so I'm not sure if we want to adjust this wording as well?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This prompted me to think about how users will learn about new configuration.
What about adding to the docs what is the parameter and how to use it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
Depsolving for PackageGroups, PackageEnvirnment and PackageCategory. Required PR: pulp#1752 closes: #6316 https://pulp.plan.io/issues/6316
Depsolving for PackageGroups, PackageEnvirnment and PackageCategory. Required PR: pulp#1752 closes: #6316 https://pulp.plan.io/issues/6316 [nocoverage]
Depsolving for PackageGroups, PackageEnvirnment and PackageCategory. Required PR: pulp#1752 closes: #6316 https://pulp.plan.io/issues/6316 [nocoverage]
| self.addCleanup(self.remote_api.delete, remote.pulp_href) | ||
|
|
||
| with self.assertRaises(ApiException): | ||
| self.sync(repository=repo, remote=remote, optimize=False, mirror=True) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure if there's a way to assert the specific code (400).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
4cd02fe
to
9e45c34
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks! Looks good!
pulp_rpm/app/serializers.py
Outdated
| view_name='signing-services-detail', | ||
| queryset=AsciiArmoredDetachedSigningService.objects.all(), | ||
| many=False, | ||
| required=False, | ||
| allow_null=True | ||
| ) | ||
| retain_package_versions = serializers.IntegerField( | ||
| help_text=_("The number of verisons of each package to keep in the repository; " |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
typo in versions
pulp_rpm/app/serializers.py
Outdated
| view_name='signing-services-detail', | ||
| queryset=AsciiArmoredDetachedSigningService.objects.all(), | ||
| many=False, | ||
| required=False, | ||
| allow_null=True | ||
| ) | ||
| retain_package_versions = serializers.IntegerField( | ||
| help_text=_("The number of verisons of each package to keep in the repository; " | ||
| "older versions will be purged. A value of '0' will disable this" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
up to you if it's helpful to note that '0 is a default.
| if repository.retain_package_versions > 0 and mirror: | ||
| raise DRFValidationError("Cannot use 'retain_package_versions' with mirror-mode sync") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍
closes: #5367
https://pulp.plan.io/issues/5367