Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

List added/removed content according to a given RepositoryVersion #460

Merged
merged 1 commit into from
Jan 7, 2020

Conversation

fao89
Copy link
Member

@fao89 fao89 commented Dec 18, 2019

https://pulp.plan.io/issues/5706
closes #5706

Please be sure you have read our documentation on creating PRs:
https://docs.pulpproject.org/en/3.0/nightly/contributing/pull-request-walkthrough.html

@@ -0,0 +1 @@
List added/removed content according to a given RepositoryVersion
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is quite ambiguous. What about, "Added an optional parameter base_version to RepositoryVersion add() and removed() methods."

Also, this should probably be in CHANGE/plugin_api/.

@daviddavis
Copy link
Contributor

daviddavis commented Dec 18, 2019

For extra bonus points*, what about adding some unit tests?

* bonus points will be redeemed at weekly team meetings during the call for kudos

if not base_version:
return Content.objects.filter(version_memberships__version_added=self)
versions = RepositoryVersion.objects.filter(
number__in=range(base_version.number + 1, self.number + 1), repository=self.repository)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thought through this logic a bit and I think there is a problem. Consider the case where you have versions 1, 2, 3, 4 and you call added(1) on version 4. If there's a unit that was added in 2 and removed in 3, it'll be returned which is not correct.

The only way I can think of off the top of my head is to get the content for version 4 and version 1 and subtract.

@bmbouter
Copy link
Member

@gmbnomis what do you think about this?

Copy link
Contributor

@gmbnomis gmbnomis left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If I am correct, the implementation has a problem. I did not check why but I proposed a different implementation instead.

self.assertEqual(version3.removed().count(), 0)

self.assertEqual(version3.added(version0).count(), 3)
self.assertEqual(version3.removed(version0).count(), 1)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, version 0 should be empty and version 3 contains 0,2,3 right? Then, there must be no removed element at all?

If so, and the test runs, your implementation has a problem.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe I misunderstood the problem, I specifically designed this solution to show that one element was removed during all the versions.

@@ -449,19 +449,59 @@ def artifacts(self):
"""
Artifact.objects.filter(content__pk__in=self.content)

def added(self):
def added(self, base_version=None):
Copy link
Contributor

@gmbnomis gmbnomis Dec 26, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks complicated and may take a long time to run if there are many versions between the base_version and the current version. I was thinking about:


    def _relationships(self):
        return RepositoryContent.objects.filter(
            repository=self.repository, version_added__number__lte=self.number
        ).exclude(version_removed__number__lte=self.number)

    @property
    def content(self):
        return Content.objects.filter(version_memberships__in=self._relationships())


    def added(self, base_version=None):
        if not base_version:
            return Content.objects.filter(version_memberships__version_added=self)

        return Content.objects.filter(
            version_memberships__in=self._relationships()
        ).exclude(
            version_memberships__in=base_version._relationships()
        )

    def removed(self, base_version=None):
        if not base_version:
            return Content.objects.filter(version_memberships__version_removed=self)

        return Content.objects.filter(
            version_memberships__in=base_version._relationships()
        ).exclude(
            version_memberships__in=self._relationships()
        )

I hadn't the time to look into the cost of the queries in your implementation and in the one from above.

The queries above may be costly (we need a join with the repository version table to get the version numbers to do the range query on version numbers). However, I don't know how this relates to your queries and we do .content queries all the time anyway...

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

definitely your queries are better, thank you!

self.assertTrue(self.pks[0] in added_pks)
self.assertTrue(self.pks[1] not in added_pks)
self.assertTrue(self.pks[2] in added_pks)
self.assertTrue(self.pks[3] in added_pks)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would prefer to have additional comparisons in the test e.g. version 3 against all other versions.

self.assertTrue(self.pks[2] not in removed_pks)
self.assertTrue(self.pks[3] not in removed_pks)

self.assertTrue(self.pks[0] in added_pks)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Proposal: It's less explicit, but self.assertCountEqual(added_pks, compress(self.pks, [1, 0, 1, 1])) is more concise (using compress from itertools).

@@ -105,7 +105,7 @@ def new_version(self, base_version=None):
# now add any content that's in the base_version but not in version
version.add_content(base_version.content.exclude(pk__in=version.content))

if Task.current:
if Task.current():
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, now I know why I had to work around this in my unit tests. Thank you!

@@ -449,19 +449,48 @@ def artifacts(self):
"""
Artifact.objects.filter(content__pk__in=self.content)

def added(self):
def _relationships(self):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should use _relationships() in the content property as well.

"""
Returns:
QuerySet: The Content objects that were added until this version.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Docstring needs to be updated

@gmbnomis
Copy link
Contributor

gmbnomis commented Jan 1, 2020

@fabricio-aguiar @daviddavis @bmbouter I have written two different fixes for https://pulp.plan.io/issues/5707 based on this fix for 5706. I would love to get feedback on your view on the pros and cons of these implementations, see https://pulp.plan.io/issues/5707#note-8.

@fabricio-aguiar Thank you for the test rig added by this PR!

Copy link
Contributor

@gmbnomis gmbnomis left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Thank you! We might want to add documentation to added()/removed() that explains what the methods do if a parameter is/isn't given.

@@ -396,6 +396,17 @@ class Meta:
get_latest_by = 'number'
ordering = ('number',)

def _relationships(self):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is the only hangup I have with this PR. _relationships sounds generic. Would there be a better name?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What about _find_content? @gmbnomis and @daviddavis what do you think?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, it's not really Content though. It's RepositoryContent. What about _repository_content?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, it's not really Content though. It's RepositoryContent. What about _repository_content?

This sounds good to me.

@gmbnomis
Copy link
Contributor

gmbnomis commented Jan 5, 2020

Should this be cherry picked? While it isn't a bugfix, it will be necessary to fix remove_duplicates().

added_content = repository_version.added()

Copy link
Member

@bmbouter bmbouter left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks good to me! @fabricio-aguiar great job, and thanks @gmbnomis for all the great discussion and effort also!

@bmbouter
Copy link
Member

bmbouter commented Jan 7, 2020

I think we do want to cherry pick it because we're planning to fix bugs on top of it in 3.0.z like https://pulp.plan.io/issues/5707 ? @daviddavis wdyt?

@daviddavis daviddavis merged commit 420dd1f into pulp:master Jan 7, 2020
@pulpbot pulpbot mentioned this pull request Jan 7, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants