Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add a package retention policy feature #1752

Merged
merged 2 commits into from Jul 2, 2020
Merged

Conversation

dralley
Copy link
Contributor

@dralley dralley commented Jun 18, 2020

@pep8speaks
Copy link

pep8speaks commented Jun 18, 2020

Hello @dralley! Thanks for updating this PR. We checked the lines you've touched for PEP 8 issues, and found:

There are currently no PEP 8 issues detected in this Pull Request. Cheers! 🍻

Comment last updated at 2020-07-02 17:45:55 UTC

@pulpbot
Copy link
Member

pulpbot commented Jun 18, 2020

Attached issue: https://pulp.plan.io/issues/5950

@dralley
Copy link
Contributor Author

dralley commented Jun 18, 2020

Need to sort out how to get the extension installed on user systems.

@@ -219,6 +223,9 @@ class Meta:
'name', 'epoch', 'version', 'release', 'arch', 'checksum_type', 'pkgId'
)

class ReadonlyMeta:
readonly = ["evr"]
Copy link
Contributor Author

@dralley dralley Jun 18, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is necessary because otherwise Django will attempt to insert an empty string into the field by default.

@dralley dralley force-pushed the rpm-evr branch 10 times, most recently from 99fdb67 to 67b62a8 Compare June 22, 2020 17:52
@dralley dralley requested a review from a team June 24, 2020 18:39
@dralley dralley changed the title Add database support for comparing EVR Add a retention policy feature Jun 24, 2020
@dralley dralley force-pushed the rpm-evr branch 2 times, most recently from fa57930 to 49c737c Compare June 24, 2020 19:13
@dralley dralley changed the title Add a retention policy feature Add a package retention policy feature Jun 24, 2020
@dralley dralley force-pushed the rpm-evr branch 3 times, most recently from 598f682 to 662f635 Compare June 24, 2020 20:54
@dralley
Copy link
Contributor Author

dralley commented Jun 24, 2020

@goosemania @mdellweg This is feature-complete, I just need to write a functional test (already tested manually) and decide if we want to go forwards with dumping the evr_t extension into the DB separate from Katello or accomplish that in a different way.

Feel free to drop comments.

@@ -306,16 +306,25 @@ class RpmRepositorySerializer(RepositorySerializer):
"""

metadata_signing_service = serializers.HyperlinkedRelatedField(
help_text="A reference to an associated signing service.",
help_text=_("A reference to an associated signing service."),
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A few of these were missing some i11n

view_name='signing-services-detail',
queryset=AsciiArmoredDetachedSigningService.objects.all(),
many=False,
required=False,
allow_null=True
)
package_retention_policy = serializers.IntegerField(
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Completely open to renaming this

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I'm +1 to rename but I don't have good suggestions :D

For the sake of brainstorming ideas:

latest_package_count
package_recency
recent_package_count
keep_recent_packages
retain_package_count

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1 to keep_recent_packages or (Ina's suggestion) retain_package_versions

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ipanova @ggainey @dkliban @goosemania @daviddavis Any strong preferences between those two (or a different one)?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about retained_package_versions? keep_ and retain_ are verbs, but what you're really specifying is a noun, "how many package-versions should I keep in this repo". Does that make sense?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it could be considered either a noun or a verb from the user perspective. It's a fine suggestion, though. I don't really want to bike shed this forever, maybe we should just set up a poll and whichever one wins is the one we go with.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good idea - maybe a quick 24-hr poll, with my, ipanova's, and your suggestions? Note that I am not strongly tied to my suggestion - any of those three would be fine I think, if you picked one and just went with it I'd be fine.

Copy link
Member

@goosemania goosemania left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great job! 🐱

Comment on lines 152 to 161
CREATE TRIGGER evr_update_trigger
BEFORE UPDATE OF epoch, version, release
ON rpm_package
FOR EACH ROW
WHEN (
OLD.epoch IS DISTINCT FROM NEW.epoch OR
OLD.version IS DISTINCT FROM NEW.version OR
OLD.release IS DISTINCT FROM NEW.release
)
EXECUTE PROCEDURE evr_trigger();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will it be ever used?
I'm +1 to keep just because it's a safe thing to do. I'm just wondering if there is any workflow or use case for it.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think so.

Comment on lines 6 to +7
from django.db import models
from django.db.models import Window, F
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit-pick. Would it make sense to do one way or the other and not both?

from django.db import models
from django.db.models import ...

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Leaving it this way is my personal preference, since we've got 3 levels of imports and not just 2.

Comment on lines +46 to +53
return self.annotate(
age=Window(
expression=RowNumber(),
partition_by=[F('name'), F('arch')],
order_by=F('evr').desc()
)
)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sweeeet 🥮

if package.age > self.package_retention_policy:
old_packages.append(package.pk)

new_version.remove_content(Content.objects.filter(pk__in=old_packages))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I meant to ask if it's ok to remove the ones which are not a part of any repo version apart from this incomplete one. And it seem like yes, because remove_content does this https://github.com/pulp/pulpcore/blob/master/pulpcore/app/models/repository.py#L602.

Not related to this PR but related to the topic of removal.
I guess this improvement ^^ to the removal was added later than the advisory changes. I added clarifications as you asked but now it seems like the code can be simplified, please check #1757

help_text=_("The number of verisons of each package to keep in the repository; "
"older versions will be purged. A value of '0' will disable this"
"feature and keep all versions of the package."),
min_value=0,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Alternatively, it can be PositiveIntegerField.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think for the serializer there is no such thing as PositiveIntegerField

https://www.django-rest-framework.org/api-guide/fields/#integerfield

Comment on lines 43 to 44
Partitioning by architecture is important because there i686 and x86_64 packages
with same name and same or close verison numbers are not interchangeable.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It just crossed my mind that in Pulp3, SRPMs are also stored as Package, the only difference is in arch.
Since you are already taking care of it, maybe it make sense to throw in a comment about SRPMs specifically, it might not be obvious that they are covered.

view_name='signing-services-detail',
queryset=AsciiArmoredDetachedSigningService.objects.all(),
many=False,
required=False,
allow_null=True
)
package_retention_policy = serializers.IntegerField(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I'm +1 to rename but I don't have good suggestions :D

For the sake of brainstorming ideas:

latest_package_count
package_recency
recent_package_count
keep_recent_packages
retain_package_count

@dralley dralley force-pushed the rpm-evr branch 3 times, most recently from 1607553 to bf6e891 Compare July 1, 2020 03:09
cls.remote_api = RemotesRpmApi(cls.client)
delete_orphans(cls.cfg)

def test_sync_with_retention(self):
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@goosemania This functional test only tests the retention policy feature with sync, do you think that is sufficient for now?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Works for me. I saw you've already put a note in the coverage doc, thanks!

{'duck': ['0.7', '0.6'], 'kangaroo': ['0.2'], 'walrus': ['0.71']},
versions_for_packages
)
# TODO: Test that modular RPMs unaffected?
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@goosemania Do you think this needs a test up-front or just document it for later?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm fine with having it later.

coverage.md Outdated
@@ -41,3 +41,6 @@ Manual Coverage
| As a user, when a module is removed, its packages are removed as well ( not referenced by other modules) | NO | |
| **Consumer cases** | | |
| As a user, I can use dnf to install all the content served by Pulp | PART | only covers rpm installation |
| **Retention** | | |
| As a user, I can have a repository option that retains the latest N packages of the same name | PART | No coverage of packages with differing arch in same repo (src, i686, x86_64), no coverage of non-sync repo modifications. |
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

@dralley dralley force-pushed the rpm-evr branch 3 times, most recently from 4ff2f30 to 72e3b8f Compare July 1, 2020 14:39
)


class RetentionPolicyTestCase(PulpTestCase):
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can rename this too but the name still seems kind of fitting when used as a noun.

@@ -0,0 +1 @@
Add a retention policy feature - when specified, the latest N versions of each package will be kept and older versions will be purged.
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I mentioned elsewhere but, "retention policy" still seems like an apt description of the feature in noun form, so I'm not sure if we want to adjust this wording as well?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This prompted me to think about how users will learn about new configuration.
What about adding to the docs what is the parameter and how to use it?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

pavelpicka added a commit to pavelpicka/pulp_rpm that referenced this pull request Jul 1, 2020
Depsolving for PackageGroups, PackageEnvirnment and PackageCategory.

Required PR: pulp#1752

closes: #6316
https://pulp.plan.io/issues/6316
@pavelpicka pavelpicka mentioned this pull request Jul 1, 2020
pavelpicka added a commit to pavelpicka/pulp_rpm that referenced this pull request Jul 1, 2020
Depsolving for PackageGroups, PackageEnvirnment and PackageCategory.

Required PR: pulp#1752

closes: #6316
https://pulp.plan.io/issues/6316

[nocoverage]
pavelpicka added a commit to pavelpicka/pulp_rpm that referenced this pull request Jul 1, 2020
Depsolving for PackageGroups, PackageEnvirnment and PackageCategory.

Required PR: pulp#1752

closes: #6316
https://pulp.plan.io/issues/6316

[nocoverage]
self.addCleanup(self.remote_api.delete, remote.pulp_href)

with self.assertRaises(ApiException):
self.sync(repository=repo, remote=remote, optimize=False, mirror=True)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure if there's a way to assert the specific code (400).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@dralley dralley force-pushed the rpm-evr branch 2 times, most recently from 4cd02fe to 9e45c34 Compare July 2, 2020 14:16
Copy link
Member

@goosemania goosemania left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks! Looks good!

view_name='signing-services-detail',
queryset=AsciiArmoredDetachedSigningService.objects.all(),
many=False,
required=False,
allow_null=True
)
retain_package_versions = serializers.IntegerField(
help_text=_("The number of verisons of each package to keep in the repository; "
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

typo in versions

view_name='signing-services-detail',
queryset=AsciiArmoredDetachedSigningService.objects.all(),
many=False,
required=False,
allow_null=True
)
retain_package_versions = serializers.IntegerField(
help_text=_("The number of verisons of each package to keep in the repository; "
"older versions will be purged. A value of '0' will disable this"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

up to you if it's helpful to note that '0 is a default.

Comment on lines +135 to +136
if repository.retain_package_versions > 0 and mirror:
raise DRFValidationError("Cannot use 'retain_package_versions' with mirror-mode sync")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

6 participants