Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

As a user, I can whitelist packages to sync with standard python syntax #173

Merged
merged 1 commit into from
Jun 1, 2018

Conversation

werwty
Copy link
Contributor

@werwty werwty commented May 19, 2018

Copy link
Contributor

@asmacdo asmacdo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks fantastic. This will work well, but I was hoping that the specifier syntax could match a subset of the requirements.txt syntax.

@@ -121,6 +120,17 @@ class PythonRemoteViewSet(platform.RemoteViewSet):
serializer_class = python_serializers.PythonRemoteSerializer
filter_class = PythonRemoteFilter

@transaction.atomic
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we can add the transaction to the serializer code, so we can just use the CreateMixin.create.


projects = validated_data.pop('projects')

PythonRemote = python_models.PythonRemote.objects.create(**validated_data)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

s/PythonRemote/python_remote/

for package in packages:
remote_units.append(parse_metadata(metadata['info'], version, package))

specifier = specifiers.SpecifierSet(project.version)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Optional name suggestion: s/specifier/project_specifier/

remote_units.append(parse_metadata(metadata['info'], version, package))

specifier = specifiers.SpecifierSet(project.version)
if specifier.contains(version):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a little outside of the scope of _find_remote. What do you think about moving this into _find_delta?

def create(self, validated_data):
"""
Creates a PythonRemote
Overriding default create() to write the projects nested field
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  • newline

allow_blank=True
)

sha256 = serializers.CharField(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

requirements.txt accepts hashes in the form sha256:486ea46224d1bb4fb680f34f7c9ad96a8f24ec88be73ea8e5a6c65260e9cb8a7

Often, more than one hash is specified (one for sdist, one for wheel, for example). Could we call this digests and accept a list of strings in the format that is in requirements.txt? This would allow us to support md5 also.

)

sha256 = serializers.CharField(
help_text=_("hash"),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you add some more here?

specifier = specifiers.SpecifierSet(project.version)
if specifier.contains(version):
for package in packages:
if bool(not project.sha256) ^ bool(package['digests']['sha256'] == project.sha256):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

bitwise is cool!

setup.py Outdated
@@ -4,7 +4,8 @@

requirements = [
'pulpcore-plugin',
'pkginfo'
'pkginfo',
'packaging'
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Optional: it might be good to add a , so future diffs will only be 1 line

help_text=_("hash"),
required=False,
)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we add a create function, we could accept strings in the requirements.txt syntax instead a dictionary.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think a dictionary is the right way to go for this endpoint. I don't want to make requirements.txt the default syntax for creating project specifiers (since we do want Pipfile and Pipfile.lock support.

We need to discuss how to implement these different ways to pass different files; either through one endpoint for each of them, or through the CLI side.
Whichever way we do decide, I think this endpoint should be left generic

@dralley
Copy link
Contributor

dralley commented May 19, 2018

Tags por favor :)

@dralley
Copy link
Contributor

dralley commented May 22, 2018

This looks great @werwty , I'll test it tomorrow

@werwty
Copy link
Contributor Author

werwty commented May 22, 2018

@dralley hold off for a day please, I still need to refactor sync and update it based on asmacdo's other comments.

@pep8speaks
Copy link

pep8speaks commented May 25, 2018

Hello @werwty! Thanks for updating the PR.

Cheers ! There are no PEP8 issues in this Pull Request. 🍻

Comment last updated on June 01, 2018 at 18:08 Hours UTC

@asmacdo
Copy link
Contributor

asmacdo commented May 29, 2018

As much as I'd like to merge, I think we really have to wait until https://pulp.plan.io/issues/3705 is fixed.

@werwty werwty force-pushed the projspecifier branch 3 times, most recently from 5462e80 to 506dfc2 Compare May 31, 2018 19:36
help_text=_("A python project name.")
)

version_specifier = serializers.CharField(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would be good to add validation to the version specifiers, although it may be non-trivial enough to make a future task.

http://www.django-rest-framework.org/api-guide/serializers/#field-level-validation

version==1.0.0 will sync all distributions matching version
version~=1.0.0 will sync all major version 1 distributions
version==1.0.0 digests=["sha256:0000"] will only sync the distributions matching the hash
name=projectname without specifiers will sync every distribution in the project.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does digests work separately from a version specifier? If yes, could you provide an example for just digests?

Could you provide another example using >=0.9, <1.0?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

^

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@dralley Ah I thought you meant add an example to the workflow docs. I'll add those to the docstring too

@@ -22,6 +23,50 @@ class Meta:
fields = ('name',)


class DistributionDigestSerializer(serializers.ModelSerializer):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same here, we should validate that the digest type is supported and that the digest is the correct length for whatever type it is.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure about this, we really "support" any digest type. During sync time all we do is check that the type and value in the JSON API's digest field is the same.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point, never mind then.

url='https://repos.fedorapeople.org/repos/pulp/pulp/fixtures/python-pypi/' \
projects='["shelf-reader"]'
url='https://pypi.org/' \
projects:='[{"name": "django", "version_specifier":"~=2.0",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do we need backslashes on the other lines as well?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nope, it's already enclosed in a ' so bash knows to wait for it to close


class ProjectSpecifier(Model):
"""
A specifier of a python project dependency.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

s/dependency//


Example:

version==1.0.0 will sync all distributions matching version
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should use the same data structure that is passed to it, version==1.0.0 isn't actually a valid value.

Fields:

project_name (models.TextField): The name of a python project
version (modles.TextField): A filter for the versions to sync
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

version_specifier: Used to filter the versions of a project to sync

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

also typo: s/modles/models

"""
A serializer for the Distribution Digest in a Project Specifier
"""
type = serializers.CharField(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i thought we settled on this being combined with digest like "sha256:3d991651559..."

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think not combined is nicer, and we don't need to split when comparing hashes

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

cool. how about a choice field then? Re: https://github.com/pulp/pulp_python/pull/173/files#r192219408, I think we probably do want to only support certain digests. For the moment, we only use digests to filter the content that we download, but it would be good form to check that hash after we download it.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This digest is user provided and only used to filter. The hash checking logic would not belong on this but rather on the digest of the created artifact


# Remove all project specifier related by foreign key to the remote if it is not a
# partial update or if new projects list has been passed
if not partial or projects:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I expect PATCH to not delete unmentioned fields at the top level only, so IMO, a PATCH that gives a list of project specifiers should not be additive.

@@ -75,27 +78,42 @@ def _fetch_inventory(version):
return inventory


def _fetch_remote(remote):
def _fetch_specified_remotes(remote, project_specifiers):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

how about _fetch_remote_metadata

(good plan)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But this only fetches some remote metadata, maybe fetch_specified_remote_metadata? (naming is hard)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That works for me. or fetch_sepecified_metadata. Its sort of obvious that is going to come from the remote.

on_delete=models.CASCADE)


class ProjectSpecifier(Model):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is actually a "PackageSpecifier"

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We've been using the project/version/distribution names throughout. Why change this to package now?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm. DistributionSpecifier could be confused with our Distribution, and PackageSpecifier is clear enough. The point is it doesn't specify projects, even though that is how they are organized. They actually specify a set of Python Distributions.

if specifier.contains(version):
for package in packages:
# add the package if the project specifier does not have an associated digest
if digests.count() == 0:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

doesnt if digests work?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nope, since digests is a QuerySet that doesn't evaluate to false

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can just use exists()

# otherwise check each digest to see if it matches the specifier
else:
for type, digest in package['digests'].items():
digest_match = digests.filter(type=type, digest=digest).count()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we want:
if digests.filter(type=type, digest=digest).exists()

@@ -35,8 +35,10 @@ You can use any Python remote to sync content into any repository::

$ http POST $BASE_ADDR/pulp/api/v3/remotes/python/ \
name='bar' \
url='https://repos.fedorapeople.org/repos/pulp/pulp/fixtures/python-pypi/' \
projects='["shelf-reader"]'
url='https://pypi.org/' \
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

comma

Copy link
Contributor

@dralley dralley May 31, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure this is actually necessary, or would even work. The JSON spec doesn't allow a trailing comma IIRC.

edit: unless this has been fixed already, without hiding the comment since it was on the wrong line. I'm just assuming you meant trailing comma based on what it is currently.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

no i just saw the other comma and assumed this needed one as well.

@@ -141,10 +186,7 @@ class PythonPackageContentSerializer(core_serializers.ContentSerializer):
def create(self, validated_data):
"""
Creates a PythonPackageContent
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

newline + args

@dralley
Copy link
Contributor

dralley commented May 31, 2018

Make sure all the smash tests pass -- it may require modifying some of them.

Rebase if you haven't already as I fixed an issue causing some smash tests to fail earlier.

@werwty werwty force-pushed the projspecifier branch 6 times, most recently from c39f08a to d4d8da9 Compare May 31, 2018 21:58
@@ -37,5 +37,5 @@ else
popd
fi

cd pulp_python
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should be incorrect, after entering the directories above it does a popd, so it should be ending up where it started.

Is that not happening properly?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ohhh I see this is after we renamed the branch so the branch checkout errors out and it doesn't popd

Copy link
Contributor

@dralley dralley May 31, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@werwty this is why

https://travis-ci.org/pulp/pulp_python/jobs/386389531#L550

The branch 3.0-dev no longer exists, therefore the git checkout errors, therefore bash's && expression falls through early.

edit: page wasn't refreshed, you got to it first xD

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@werwty it does still need the cd pulp_python line though

Also could you edit the install docs?

@werwty werwty force-pushed the projspecifier branch 2 times, most recently from bcc8eed to 08886b4 Compare May 31, 2018 22:42
@@ -13,7 +13,7 @@ pip install -r test_requirements.txt
cd .. && git clone https://github.com/pulp/pulp.git

if [ -z $PULP_PR_NUMBER ]; then
pushd pulp && git checkout 3.0-dev && popd
pushd pulp && git checkout master && popd
Copy link
Contributor

@dralley dralley May 31, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Probably no need to have this branch anymore since it defaults to master. Swap -z with -n (I think, I had to reference the docs and haven't tested)

The source install docs need to be changed to remove the "checkout 3.0-dev" bit

@werwty werwty force-pushed the projspecifier branch 8 times, most recently from 3925700 to 07f7663 Compare June 1, 2018 17:06
Copy link
Contributor

@asmacdo asmacdo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A few nits, but no need to re-review. Great work @werwty, this feature is going to be awesome.


Example:

version_specifier==1.0.0 will sync all distributions matching version
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I still think thes examples could be misleading.

Also, s/sync/match/ throughout this docstring

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In what way are these examples misleading? (How can I clarify them?)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

version_specifier==1.0.0 is not a valid version_specifier
I think even just adding punctuation will clarify

"version_specifier: ==1.0.0 will match all distributions ..."


Fields:

project_name (models.TextField): The name of a python project
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

s/project_name/name/

from rest_framework import serializers
from rest_framework_nested.relations import NestedHyperlinkedRelatedField

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

no space here. pulpcore is a 3rd party library from the plugin's perspective.


def validate_version_specifier(self, value):
"""
Check that the Version Specifier valid
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

s/valid/is valid/

Nice!


specifier = specifiers.SpecifierSet(project.version_specifier)

# Note: SpecifierSet("").contains(version) will return true for released versions
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you add this to the help_text on the serializer?

@werwty werwty force-pushed the projspecifier branch 2 times, most recently from c4460b1 to 8452656 Compare June 1, 2018 18:06
@werwty werwty merged commit 00af7fa into pulp:3.0-dev Jun 1, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
feature request New feature request
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants