Conversation
Hello @bmbouter! Thanks for updating the PR. Cheers! There are no PEP8 issues in this Pull Request. 🍻 Comment last updated on August 03, 2018 at 20:21 UTC
Force-pushed from bb1c765 to 07739d5
Codecov Report
@@ Coverage Diff @@
## master #3559 +/- ##
=========================================
- Coverage 58.2% 58.01% -0.2%
=========================================
Files 59 60 +1
Lines 2467 2477 +10
=========================================
+ Hits 1436 1437 +1
- Misses 1031 1040 +9
Continue to review full report at Codecov.
Force-pushed from 00b2ea7 to 54bc039
Force-pushed from 1a3b078 to b656b74
@@ -138,6 +138,15 @@ def natural_key(self):
    """
    return tuple(getattr(self, f) for f in self.natural_key_fields())

def natural_key_dict(self):
This should be simplified to:
return {f: getattr(self, f) for f in self.natural_key_fields()}
This is simple enough for the caller to just do. I don't think this (1-line) method adds sufficient value to be added.
The use case I think about with this is plugin writers wanting to query for a unit they have in memory using filter() or Q(). There were two places in DeclarativeVersion where I needed to do this. I wanted to state its purpose and see what you thought.
That is the use case I'd imagined as well, so my opinion hasn't changed.
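A minimal sketch of the two forms under discussion, using a plain stand-in object with hypothetical field names (`path`, `digest`) rather than a real Django model:

```python
class FileContent:
    """Stand-in for a Content model; the field names here are illustrative."""

    def __init__(self, path, digest):
        self.path = path
        self.digest = digest

    @staticmethod
    def natural_key_fields():
        return ('path', 'digest')

    def natural_key(self):
        """The existing tuple form."""
        return tuple(getattr(self, f) for f in self.natural_key_fields())

    def natural_key_dict(self):
        """The proposed dict form, e.g. Model.objects.filter(**unit.natural_key_dict())."""
        return {f: getattr(self, f) for f in self.natural_key_fields()}


unit = FileContent('a/b.txt', 'abc123')
print(unit.natural_key())       # ('a/b.txt', 'abc123')
print(unit.natural_key_dict())  # {'path': 'a/b.txt', 'digest': 'abc123'}
```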
Force-pushed from 4b9d312 to 95c9bda
the `artifact` attributes may be incomplete because not all digest information can be computed
until the Artifact is downloaded.

Attributes:
I'm fixing this in the next push by using Sphinx's :no_members:
>>> artifact_downloader(max_concurrent_downloads=42).stage  # This is the real stage

in_q data type: A `~pulpcore.plugin.stages.DeclarativeContent` with potentially files missing
This problem is all over the docs in this PR. I'm going through and fixing all of them with the next push.
Thank you!
just about to refactor https://pulp.plan.io/issues/3844 re pulp#3844
if not content:
    raise ValueError(_("DeclarativeContent must have a 'content'"))
if d_artifacts:
    self.d_artifacts = d_artifacts
Suggest:
self.d_artifacts = d_artifacts or []
would be simpler.
cool will include in next push.
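The suggested simplification, shown in a stripped-down `__init__` (the surrounding class is reduced to only the relevant lines):

```python
class DeclarativeContent:
    """Reduced sketch: only the content check and d_artifacts handling are shown."""

    def __init__(self, content=None, d_artifacts=None):
        if not content:
            raise ValueError("DeclarativeContent must have a 'content'")
        self.content = content
        # Replaces the `if d_artifacts:` branch; falls back to a fresh list.
        self.d_artifacts = d_artifacts or []


dc = DeclarativeContent(content=object())
print(dc.d_artifacts)  # []
```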
and defaults to 100.

Returns:
    A single coroutine that can be used to run, wait, or cancel the entire pipeline with.
This probably should document it returns an asyncio.Future. (I think)
At this point it's only returning a coroutine. I looked into sphinx's support for indicating something is a coroutine, but they aren't there yet unfortunately. sphinx-doc/sphinx#4777
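Either way, the caller drives the returned awaitable the same way. A sketch with a stand-in `create_pipeline` (the real function's name and signature in this PR may differ):

```python
import asyncio

results = []


async def stage_a():
    results.append('a')


async def stage_b():
    results.append('b')


def create_pipeline(stages):
    """Stand-in: returns a single coroutine that runs every stage, as discussed."""
    async def pipeline():
        await asyncio.gather(*(stage() for stage in stages))
    return pipeline()


# One coroutine runs, and can be awaited or cancelled as, the whole pipeline.
asyncio.run(create_pipeline([stage_a, stage_b]))
print(sorted(results))  # ['a', 'b']
```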
@bmbouter, Review finished. A few of the comments ended up being repetitive, sorry.
Most of the comments are not merge blockers but would improve code quality. However, a few comments regarding API and correctness need to be addressed.
Mostly fixes class names for pep8 compliance https://pulp.plan.io/issues/3844 re pulp#3844
This is a big diff, but mainly due only to indentation https://pulp.plan.io/issues/3844 re pulp#3844
@jortel This PR still needs 3 things I'll do tomorrow morning:
I'll ping you when it's ready for final review, Friday AM.
plugin/pulpcore/plugin/stages/api.py (Outdated)
await asyncio.gather(*futures)


class EndStage(BaseStage):
Big 👍 for making the stages objects! This will unlock a lot of potential.
At the risk of being pedantic :(
Why not just Stage?
In the rare cases where I see the Base prefix, it strikes me as odd. The class name is a classification of objects and should read naturally using the IsA lexicon.
For example, the natural way to describe a QueryExistingArtifacts is to say:
"A QueryExistingArtifacts IsA Stage".
not
"A QueryExistingArtifacts IsA BaseStage".
I don't think designating this as a base class through naming adds value. This isn't a blocker but, IMHO, it would be more correct in the classic object-modeling sense.
I think s/BaseStage/Stage/ is a great change. I'll make it on my next push along w/ the last remaining fixes.
from .content_unit_stages import ContentUnitSaver, QueryExistingContentUnits


class FirstStage:
Why is FirstStage still needed?
It's not, but I was too tired to remove it last night. It'll be removed in next push.
run. Default is 100.
"""

def __init__(self, max_concurrent_downloads=100, *args, **kwargs):
What are the *args, **kwargs for?
If needed, please docstring what can be passed.
It's not formally spelled out in Python, but *args and **kwargs should be accepted by all __init__() methods and passed along via super().__init__(*args, **kwargs). Otherwise the parent object can never receive its parameters through the subclass's __init__().
For places where *args and **kwargs are used, I'm going to add a docstring stating that they are accepted, with a note that unused params are passed on to Stage, to stay DRY w/ the docstrings.
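A sketch of the cooperative pattern being described (the `name` parameter and class bodies here are illustrative, not the PR's actual code):

```python
class Stage:
    """Parent class; `name` is a hypothetical parameter for illustration."""

    def __init__(self, name='stage'):
        self.name = name


class ArtifactDownloader(Stage):
    def __init__(self, max_concurrent_downloads=100, *args, **kwargs):
        # Unused args/kwargs are passed on to Stage, so the parent's
        # parameters still reach it through the subclass's __init__().
        super().__init__(*args, **kwargs)
        self.max_concurrent_downloads = max_concurrent_downloads


s = ArtifactDownloader(max_concurrent_downloads=5, name='downloader')
print(s.name, s.max_concurrent_downloads)  # downloader 5
```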
def __init__(self, max_concurrent_downloads=100, *args, **kwargs):
    self.max_concurrent_downloads = max_concurrent_downloads
    super().__init__(*args, **kwargs)
It's better practice to initialize objects in hierarchical order. Please call super().__init__() first, before setting extended attributes.
ok will do in next push
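One reason the ordering matters: if the parent's `__init__` also assigns an attribute, calling `super().__init__()` first ensures the subclass's assignment wins. An illustrative sketch (`batch_size` is a made-up attribute, not from the PR):

```python
class Stage:
    def __init__(self):
        # Parent establishes its defaults first.
        self.batch_size = 50


class ContentUnitSaver(Stage):
    def __init__(self, batch_size=200):
        super().__init__()            # hierarchical order: parent first...
        self.batch_size = batch_size  # ...so the subclass's value is not clobbered


print(ContentUnitSaver().batch_size)  # 200
```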
for unit in self.new_version.content.all():
    unit = unit.cast()
    self.unit_keys_by_type[type(unit)].add(unit.natural_key())
super().__init__(*args, **kwargs)
yup will fix in next push.
def __init__(self, new_version, *args, **kwargs):
    self.new_version = new_version
    super().__init__(*args, **kwargs)
yup will fix in next push.
    'remote': declarative_artifact.remote,
}
rel_path = content_artifact.relative_path
remote_artifact_map[rel_path] = remote_artifact_data
Still correlated by rel_path.
yup this and the storage backend both are being fixed in next push.
* switch storage backend to DefaultStorage()
* super() is now called first anytime it's used
* FirstStage is removed
* BaseStage -> Stage
* the relative path now handles duplicate rel_paths correctly
* documenting args and kwargs

I did some final hand testing, and I also ensured the docs build and look good. https://pulp.plan.io/issues/3844 re pulp#3844
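The commit message mentions handling duplicate rel_paths; one common way to make such a map tolerant of duplicates is to collect a list per key. A hedged sketch with plain dicts standing in for ContentArtifact objects, not the PR's actual code:

```python
from collections import defaultdict

# Fabricated stand-ins for ContentArtifact rows; two share a relative_path.
content_artifacts = [
    {'relative_path': 'files/a.iso', 'content': 'unit1'},
    {'relative_path': 'files/a.iso', 'content': 'unit2'},  # duplicate rel_path
    {'relative_path': 'files/b.iso', 'content': 'unit3'},
]

# Each relative_path maps to every artifact that shares it, so a duplicate
# no longer silently overwrites an earlier entry.
remote_artifact_map = defaultdict(list)
for ca in content_artifacts:
    remote_artifact_map[ca['relative_path']].append(ca['content'])

print(remote_artifact_map['files/a.iso'])  # ['unit1', 'unit2']
```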
)
declarative_artifact.artifact = new_artifact
to_download_count = to_download_count + 1
pb.done = pb.done + to_download_count
Just FYI: using += would be more concise.
    'remote': declarative_artifact.remote,
}
content_pk = content_artifact.content.pk
remote_artifact_map[content_pk] = remote_artifact_data
After switching to integer PK, content_artifact.content.pk will be None because the model has not yet been saved. Right?
This is rebased onto the integer PK already, so it uses them currently. The pk is set from L#151.
I verified I cannot use the models themselves because they are unsaved, and Django raises an unhashable error. We could use sha512 instead. The only reason I didn't is to save memory. Should I switch it to that?
Actually I'm going to switch it.
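A sketch of the digest-keyed alternative being adopted, with plain dicts standing in for the unsaved DeclarativeArtifact objects and fabricated digest values:

```python
# Unsaved model instances can't be dict keys (and their pk is None),
# so key the map by the sha512 digest, which is known before saving.
declarative_artifacts = [
    {'sha512': 'aaa111', 'url': 'http://example.com/a'},
    {'sha512': 'bbb222', 'url': 'http://example.com/b'},
]

remote_artifact_map = {}
for d_artifact in declarative_artifacts:
    remote_artifact_map[d_artifact['sha512']] = {'url': d_artifact['url']}

print(sorted(remote_artifact_map))  # ['aaa111', 'bbb222']
```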
Force-pushed from ab54599 to 579c700
Without this commit, an exception will occur when one content unit has two or more Artifacts associated with it. https://pulp.plan.io/issues/3844 re pulp#3844
Thank you @jortel, @dkliban, @dralley, and @gmbnomis for all the collaboration and help with this! 💯 I really think the plugin writers are going to get a big benefit from it. I'm merging it and https://github.com/pulp/pulp_file/pull/102/files
Required PR: pulp/pulp#3559 https://pulp.plan.io/issues/3890 re #3890


This PR pairs with pulp_file changes here: pulp/pulp_file#102
I use the test script below. It resets my Pulp3 installation, syncs pulp_file, uploads another file, associates it with the repo, and then resyncs with mirror='true'.
It's expected to make 3 versions, and Pulp should have 4 content units in it.
v1 - 3 content units matching the remote pulp_file repo
v2 - 4 content units, including the newly uploaded one
v3 - 3 content units, after the sync with mirror='true'. This verifies that mirror is working.