-
Notifications
You must be signed in to change notification settings - Fork 112
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
List added/removed content according to a given RepositoryVersion #460
Conversation
CHANGES/5706.feature
Outdated
@@ -0,0 +1 @@ | |||
List added/removed content according to a given RepositoryVersion |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is quite ambiguous. What about, "Added an optional parameter base_version to RepositoryVersion add() and removed() methods."
Also, this should probably be in CHANGE/plugin_api/
.
For extra bonus points*, what about adding some unit tests? * bonus points will be redeemed at weekly team meetings during the call for kudos |
pulpcore/app/models/repository.py
Outdated
if not base_version: | ||
return Content.objects.filter(version_memberships__version_added=self) | ||
versions = RepositoryVersion.objects.filter( | ||
number__in=range(base_version.number + 1, self.number + 1), repository=self.repository) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I thought through this logic a bit and I think there is a problem. Consider the case where you have versions 1, 2, 3, 4 and you call added(1) on version 4. If there's a unit that was added in 2 and removed in 3, it'll be returned which is not correct.
The only way I can think of off the top of my head is to get the content for version 4 and version 1 and subtract.
@gmbnomis what do you think about this? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If I am correct, the implementation has a problem. I did not check why but I proposed a different implementation instead.
self.assertEqual(version3.removed().count(), 0) | ||
|
||
self.assertEqual(version3.added(version0).count(), 3) | ||
self.assertEqual(version3.removed(version0).count(), 1) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm, version 0 should be empty and version 3 contains 0,2,3 right? Then, there must be no removed element at all?
If so, and the test runs, your implementation has a problem.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believe I misunderstood the problem, I specifically designed this solution to show that one element was removed during all the versions.
@@ -449,19 +449,59 @@ def artifacts(self): | |||
""" | |||
Artifact.objects.filter(content__pk__in=self.content) | |||
|
|||
def added(self): | |||
def added(self, base_version=None): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks complicated and may take a long time to run if there are many versions between the base_version
and the current version. I was thinking about:
def _relationships(self):
return RepositoryContent.objects.filter(
repository=self.repository, version_added__number__lte=self.number
).exclude(version_removed__number__lte=self.number)
@property
def content(self):
return Content.objects.filter(version_memberships__in=self._relationships())
def added(self, base_version=None):
if not base_version:
return Content.objects.filter(version_memberships__version_added=self)
return Content.objects.filter(
version_memberships__in=self._relationships()
).exclude(
version_memberships__in=base_version._relationships()
)
def removed(self, base_version=None):
if not base_version:
return Content.objects.filter(version_memberships__version_removed=self)
return Content.objects.filter(
version_memberships__in=base_version._relationships()
).exclude(
version_memberships__in=self._relationships()
)
I hadn't the time to look into the cost of the queries in your implementation and in the one from above.
The queries above may be costly (we need a join with the repository version table to get the version numbers to do the range query on version numbers). However, I don't know how this relates to your queries and we do .content queries all the time anyway...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
definitely your queries are better, thank you!
self.assertTrue(self.pks[0] in added_pks) | ||
self.assertTrue(self.pks[1] not in added_pks) | ||
self.assertTrue(self.pks[2] in added_pks) | ||
self.assertTrue(self.pks[3] in added_pks) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would prefer to have additional comparisons in the test e.g. version 3 against all other versions.
self.assertTrue(self.pks[2] not in removed_pks) | ||
self.assertTrue(self.pks[3] not in removed_pks) | ||
|
||
self.assertTrue(self.pks[0] in added_pks) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Proposal: It's less explicit, but self.assertCountEqual(added_pks, compress(self.pks, [1, 0, 1, 1]))
is more concise (using compress from itertools).
@@ -105,7 +105,7 @@ def new_version(self, base_version=None): | |||
# now add any content that's in the base_version but not in version | |||
version.add_content(base_version.content.exclude(pk__in=version.content)) | |||
|
|||
if Task.current: | |||
if Task.current(): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh, now I know why I had to work around this in my unit tests. Thank you!
pulpcore/app/models/repository.py
Outdated
@@ -449,19 +449,48 @@ def artifacts(self): | |||
""" | |||
Artifact.objects.filter(content__pk__in=self.content) | |||
|
|||
def added(self): | |||
def _relationships(self): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we should use _relationships()
in the content
property as well.
pulpcore/app/models/repository.py
Outdated
""" | ||
Returns: | ||
QuerySet: The Content objects that were added until this version. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Docstring needs to be updated
@fabricio-aguiar @daviddavis @bmbouter I have written two different fixes for https://pulp.plan.io/issues/5707 based on this fix for 5706. I would love to get feedback on your view on the pros and cons of these implementations, see https://pulp.plan.io/issues/5707#note-8. @fabricio-aguiar Thank you for the test rig added by this PR! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM. Thank you! We might want to add documentation to added()/removed()
that explains what the methods do if a parameter is/isn't given.
@@ -396,6 +396,17 @@ class Meta: | |||
get_latest_by = 'number' | |||
ordering = ('number',) | |||
|
|||
def _relationships(self): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is the only hangup I have with this PR. _relationships
sounds generic. Would there be a better name?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What about _find_content
? @gmbnomis and @daviddavis what do you think?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm, it's not really Content though. It's RepositoryContent. What about _repository_content
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm, it's not really Content though. It's RepositoryContent. What about
_repository_content
?
This sounds good to me.
Should this be cherry picked? While it isn't a bugfix, it will be necessary to fix
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks good to me! @fabricio-aguiar great job, and thanks @gmbnomis for all the great discussion and effort also!
I think we do want to cherry pick it because we're planning to fix bugs on top of it in 3.0.z like https://pulp.plan.io/issues/5707 ? @daviddavis wdyt? |
https://pulp.plan.io/issues/5706
closes #5706
Please be sure you have read our documentation on creating PRs:
https://docs.pulpproject.org/en/3.0/nightly/contributing/pull-request-walkthrough.html