Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Create new fixtures for python-pypi #82

Merged
merged 1 commit into from
Jun 28, 2018

Conversation

werwty
Copy link
Contributor

@werwty werwty commented Jun 27, 2018

These new fixtures are not hardcoded, instead we read in the new
projects.json file that includes a list of python projects, and the
distributions to keep for those projects.

The metadata is obtained from http://pypi.org/pypi/{project}/json
The packages are obtained from the metadata distribution url.

closes #3608
https://pulp.plan.io/issues/3608

Copy link
Contributor

@Ichimonji10 Ichimonji10 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From a design perspective, I dislike that the formatter.py script has two entirely separate lines of logic, and never do the two meet. It's awkward in the same way that this function definition is awkward:

def add_or_subtract(action, num1, num2):
  if action == 'add':
    return num1 + num2
  elif action == 'subtract':
    return num1 - num2
  else:
    raise ValueError(f'action should be "add" or "subtract," but is {action}.')

Far better to implement separate add() and subtract() functions. However, this PR represents a significant improvement over the current state of affairs, so I'm happy to merge this and address that in a separate commit. Please address the nitpicks I've pointed out, and we can merge.

if args.action == 'prune_metadata':
print(prune_metadata(json.load(args.metadata_path), args.distributions))
elif args.action == 'format_url':
print(format_url(json.load(args.metadata_path), args.base_url))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

args.metadata_path is a file handle, and it's never being explicitly closed, and json.load(args.metadata_path) is called twice. We can improve robustness and DRYness with the following code:

    args = parser.parse_args()
    try:
        metadata = json.load(args.metadata_path)
    finally:
        args.metadata_path.close()

    if args.action == 'prune_metadata':
        print(prune_metadata(metadata, args.distributions))
    elif args.action == 'format_url':
        print(format_url(metadata, args.base_url))

dest="action",
help=('''
Actions that can be performed by the formatter: prune_metadata will cut unwanted
distributions from the metadta. format_url will replace distribution url links
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's a trailing whitespace character here.

# release: e.g. "1.0.1"
# distributions: e.g. [], or [{egg info}, {wheel info}]
new_metadata = copy.deepcopy(old_metadata)
for release, distributions in old_metadata['releases'].items():
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The release variable is never used. Try this:

for distributions in old_metadata['releases'].values():

# simple
# ├── index.html
# ├── Django
# |  └── index.html
Copy link
Contributor

@Ichimonji10 Ichimonji10 Jun 27, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A pipe (|) character is used here, though a box-drawing character () would be more appropriate.

# pypi
# ├── Django
# |  └── json
# |  └── index.html
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A would be more appropriate.

# This should be laid out like:
# pypi
# ├── Django
# |  └── json
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A would be more appropriate.

@Ichimonji10
Copy link
Contributor

Here's a diff that implements all the changes I've requested. It can be applied with git-apply(1). I've tested the change.

diff --git a/python-v2/gen-pypi-repo.sh b/python-v2/gen-pypi-repo.sh
index 60790ef..04090bd 100755
--- a/python-v2/gen-pypi-repo.sh
+++ b/python-v2/gen-pypi-repo.sh
@@ -62,7 +62,7 @@ for project in "${projects[@]}"; do
     #   simple
     #   ├── index.html
     #   ├── Django
-    #   |   └── index.html
+    #   │   └── index.html
     #   └── scipy
     #       └── index.html
     mkdir "${working_dir}/simple/${project}"
@@ -75,8 +75,8 @@ for project in "${projects[@]}"; do
     # This should be laid out like:
     #   pypi
     #   ├── Django
-    #   |   └── json
-    #   |       └── index.html
+    #   │   └── json
+    #   │       └── index.html
     #   └── scipy
     #       └── json
     #           └── index.html
@@ -103,7 +103,6 @@ for project in "${projects[@]}"; do
     "${assets_dir}/formatter.py" - format_url "${base_url}" \
         < "${working_dir}/pypi/${project}/json/index.html.tmp" \
         > "${working_dir}/pypi/${project}/json/index.html"
-
     rm "${working_dir}/pypi/${project}/json/index.html.tmp"
 done
 
diff --git a/python-v2/pypi-assets/formatter.py b/python-v2/pypi-assets/formatter.py
index b580983..531098f 100755
--- a/python-v2/pypi-assets/formatter.py
+++ b/python-v2/pypi-assets/formatter.py
@@ -26,7 +26,7 @@ def main():
         dest="action",
         help=('''
         Actions that can be performed by the formatter: prune_metadata will cut unwanted
-        distributions from the metadta. format_url will replace distribution url links 
+        distributions from the metadta. format_url will replace distribution url links
         with a base_url.
         ''')
     )
@@ -46,11 +46,15 @@ def main():
     )
 
     args = parser.parse_args()
+    try:
+        metadata = json.load(args.metadata_path)
+    finally:
+        args.metadata_path.close()
 
     if args.action == 'prune_metadata':
-        print(prune_metadata(json.load(args.metadata_path), args.distributions))
+        print(prune_metadata(metadata, args.distributions))
     elif args.action == 'format_url':
-        print(format_url(json.load(args.metadata_path), args.base_url))
+        print(format_url(metadata, args.base_url))
 
 
 def prune_metadata(old_metadata, available_distributions):
@@ -66,11 +70,12 @@ def prune_metadata(old_metadata, available_distributions):
             del new_metadata['releases'][release]
     return json.dumps(new_metadata, indent=4)
 
+
 def format_url(metadata, base_url):
     """
     Format the distribution urls in metadata to point to the distributions hosted on base_url
     """
-    for release, distributions in metadata['releases'].items():
+    for distributions in metadata['releases'].values():
         for distribution in distributions:
             distribution['url'] = reduce(urljoin, [base_url,
                                                    'fixtures/python-pulp/packages/',

@werwty werwty force-pushed the feature/3608-final branch 2 times, most recently from 497d5cb to a052b0b Compare June 27, 2018 20:42
Copy link
Contributor

@Ichimonji10 Ichimonji10 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's more robust to close file handles:

diff --git a/python-v2/pypi-assets/pruner.py b/python-v2/pypi-assets/pruner.py
index 6a125bd..ad66b57 100755
--- a/python-v2/pypi-assets/pruner.py
+++ b/python-v2/pypi-assets/pruner.py
@@ -17,16 +17,16 @@ def main():
             stdin.
             '''),
     )
-
     parser.add_argument(
         'distributions',
         type=json.loads,
         help='A JSON list of filenames to search for and keep in the metadata.'
     )
-
     args = parser.parse_args()
-
-    print(prune_metadata(json.load(args.metadata_path), args.distributions))
+    try:
+        print(prune_metadata(json.load(args.metadata_path), args.distributions))
+    finally:
+        args.metadata_path.close()
 
 def prune_metadata(old_metadata, available_distributions):
     """Cut unwanted values out of the JSON metadata at ``metadata_path``."""
diff --git a/python-v2/pypi-assets/urlformatter.py b/python-v2/pypi-assets/urlformatter.py
index b083456..81c2295 100755
--- a/python-v2/pypi-assets/urlformatter.py
+++ b/python-v2/pypi-assets/urlformatter.py
@@ -19,16 +19,16 @@ def main():
             stdin.
             '''),
     )
-
     parser.add_argument(
         'base_url',
         type=str,
         help='base_url at which the fixture packages are located'
     )
-
     args = parser.parse_args()
-
-    print(format_url(json.load(args.metadata_path), args.base_url))
+    try:
+        print(format_url(json.load(args.metadata_path), args.base_url))
+    finally:
+        args.metadata_path.close()
 
 def format_url(metadata, base_url):
     """

@werwty
Copy link
Contributor Author

werwty commented Jun 27, 2018

@Ichimonji10 Just pushed an update.
Thanks for all the reviews!!

Copy link
Contributor

@Ichimonji10 Ichimonji10 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If I serve the fixtures with python -m http.server and try to consume them from Pulp by triggering Pulp Smash tests, failures will occur:

$ python -m unittest pulp_smash.tests.pulp2.python.api_v2.test_sync_publish                                                                                                                           
/home/bizhang/.venvs/pulp-smash/lib/python3.6/site-packages/urllib3/connectionpool.py:857: InsecureRequestWarning: Unverified HTTPS request is being made. Adding certificate verification is strongly advised. See: https://urllib3.readthedocs.io/en/latest/advanced-usage.html#ssl-warnings
  InsecureRequestWarning)
s..
======================================================================
FAIL: test_01_first_repo (pulp_smash.tests.pulp2.python.api_v2.test_sync_publish.SyncTestCase) (comment='verify content units are present')                                                                                                 
Create, sync content into and publish a Python repository.
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/bizhang/code/pulp-smash/pulp_smash/tests/pulp2/python/api_v2/test_sync_publish.py", line 147, in test_01_first_repo                                                                                                           
    self.verify_package_types(self.cfg, repo)
  File "/home/bizhang/code/pulp-smash/pulp_smash/tests/pulp2/python/api_v2/test_sync_publish.py", line 117, in verify_package_types                                                                                                         
    self.assertEqual(unit_types, {'sdist', 'bdist_wheel'})
AssertionError: Items in the second set but not the first:
'sdist'
'bdist_wheel'

======================================================================
FAIL: test_02_second_repo (pulp_smash.tests.pulp2.python.api_v2.test_sync_publish.SyncTestCase) (comment='verify content units are present')                                                                                                
Create a second Python repository, and sync it from the first.
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/bizhang/code/pulp-smash/pulp_smash/tests/pulp2/python/api_v2/test_sync_publish.py", line 91, in test_02_second_repo                                                                                                           
    self.verify_package_types(self.cfg, repo)
  File "/home/bizhang/code/pulp-smash/pulp_smash/tests/pulp2/python/api_v2/test_sync_publish.py", line 117, in verify_package_types                                                                                                         
    self.assertEqual(unit_types, {'sdist', 'bdist_wheel'})
AssertionError: Items in the second set but not the first:
'sdist'
'bdist_wheel'

----------------------------------------------------------------------
Ran 4 tests in 24.557s

FAILED (failures=2, skipped=1)

Upon logging into the Pulp VM, I can see a traceback:

$ journalctl -p 0..3 --boot=-0
Jun 28 08:19:35 fedora-27-pulp-2-16-nightly pulp[1294]: nectar.downloaders.base:ERROR: (1294-85504) 'path'
Jun 28 08:19:35 fedora-27-pulp-2-16-nightly pulp[1294]: nectar.downloaders.base:ERROR: (1294-85504) Traceback (most recent call last):
Jun 28 08:19:35 fedora-27-pulp-2-16-nightly pulp[1294]: nectar.downloaders.base:ERROR: (1294-85504)   File "/usr/lib/python2.7/site-packages/nectar/downloaders/base.py", line 145, in _fire_event_to_listener
Jun 28 08:19:35 fedora-27-pulp-2-16-nightly pulp[1294]: nectar.downloaders.base:ERROR: (1294-85504)     event_listener_callback(*args, **kwargs)
Jun 28 08:19:35 fedora-27-pulp-2-16-nightly pulp[1294]: nectar.downloaders.base:ERROR: (1294-85504)   File "/usr/lib/python2.7/site-packages/pulp_python/plugins/importers/sync.py", line 65, in download_succeeded
Jun 28 08:19:35 fedora-27-pulp-2-16-nightly pulp[1294]: nectar.downloaders.base:ERROR: (1294-85504)     units = models.Package.objects.from_metadata(destination.read())
Jun 28 08:19:35 fedora-27-pulp-2-16-nightly pulp[1294]: nectar.downloaders.base:ERROR: (1294-85504)   File "/usr/lib/python2.7/site-packages/pulp_python/plugins/querysets.py", line 55, in from_metadata
Jun 28 08:19:35 fedora-27-pulp-2-16-nightly pulp[1294]: nectar.downloaders.base:ERROR: (1294-85504)     units.append(self._document.from_json(package, version, metadata['info']))
Jun 28 08:19:35 fedora-27-pulp-2-16-nightly pulp[1294]: nectar.downloaders.base:ERROR: (1294-85504)   File "/usr/lib/python2.7/site-packages/pulp_python/plugins/models.py", line 115, in from_json
Jun 28 08:19:35 fedora-27-pulp-2-16-nightly pulp[1294]: nectar.downloaders.base:ERROR: (1294-85504)     package_attrs['path'] = package_data['path']
Jun 28 08:19:35 fedora-27-pulp-2-16-nightly pulp[1294]: nectar.downloaders.base:ERROR: (1294-85504) KeyError: 'path'

Can you investigate this? I've configured a troubleshooting environment on pine.ichimonji10.name. Just log in and attach to the tmux session on your account.

@werwty
Copy link
Contributor Author

werwty commented Jun 28, 2018

@Ichimonji10 That failure matches a problem that was fixed recently in Pulp2: https://github.com/pulp/pulp_python/pull/145/files

Back before PyPI transitioned to warehouse, distribution metadata would always include path, nowadays older metadata would have path but newer metadata fetched from warehouse would only have url.

AFAIK this fix hasn't made it into a release yet...
Are you running a Pulp nightly build? If so I need to investigate why this change it isn't in there

@Ichimonji10
Copy link
Contributor

Ichimonji10 commented Jun 28, 2018

Are you running a Pulp nightly build? If so I need to investigate why this change it isn't in there

Yes, I think so. I just installed it this morning. Here's the relevant RPMs installed on the Pulp VM:

$ rpm -qa | grep pulp | sort
pulp-admin-client-2.16.2-1.git.69.0b232c2.git.69.0b232c2.fc27.noarch
pulp-deb-admin-extensions-1.7.0-1.fc27.noarch
pulp-deb-plugins-1.7.0-1.fc27.noarch
pulp-docker-admin-extensions-3.1.4-1.git.69.0b232c2.git.69.0b232c2.fc27.noarch
pulp-docker-plugins-3.1.4-1.git.69.0b232c2.git.69.0b232c2.fc27.noarch
pulp-ostree-admin-extensions-1.3.0-1.fc27.noarch
pulp-ostree-plugins-1.3.0-1.fc27.noarch
pulp-puppet-admin-extensions-2.16.2-1.git.69.0b232c2.git.69.0b232c2.fc27.noarch
pulp-puppet-plugins-2.16.2-1.git.69.0b232c2.git.69.0b232c2.fc27.noarch
pulp-puppet-tools-2.16.2-1.git.69.0b232c2.git.69.0b232c2.fc27.noarch
pulp-python-admin-extensions-2.0.2-1.fc27.noarch
pulp-python-plugins-2.0.2-1.fc27.noarch
pulp-rpm-admin-extensions-2.16.2-1.git.69.0b232c2.git.69.0b232c2.fc27.noarch
pulp-rpm-plugins-2.16.2-1.git.69.0b232c2.git.69.0b232c2.fc27.noarch
pulp-selinux-2.16.2-1.git.69.0b232c2.git.69.0b232c2.fc27.noarch
pulp-server-2.16.2-1.git.69.0b232c2.git.69.0b232c2.fc27.noarch
python-pulp-bindings-2.16.2-1.git.69.0b232c2.git.69.0b232c2.fc27.noarch
python-pulp-client-lib-2.16.2-1.git.69.0b232c2.git.69.0b232c2.fc27.noarch
python-pulp-common-2.16.2-1.git.69.0b232c2.git.69.0b232c2.fc27.noarch
python-pulp-deb-common-1.7.0-1.fc27.noarch
python-pulp-docker-common-3.1.4-1.git.69.0b232c2.git.69.0b232c2.fc27.noarch
python-pulp-oid_validation-2.16.2-1.git.69.0b232c2.git.69.0b232c2.fc27.noarch
python-pulp-ostree-common-1.3.0-1.fc27.noarch
python-pulp-puppet-common-2.16.2-1.git.69.0b232c2.git.69.0b232c2.fc27.noarch
python-pulp-python-common-2.0.2-1.fc27.noarch
python-pulp-repoauth-2.16.2-1.git.69.0b232c2.git.69.0b232c2.fc27.noarch
python-pulp-rpm-common-2.16.2-1.git.69.0b232c2.git.69.0b232c2.fc27.noarch
python-pulp-streamer-2.16.2-1.git.69.0b232c2.git.69.0b232c2.fc27.noarch

@werwty
Copy link
Contributor Author

werwty commented Jun 28, 2018

Will you try using the 2.17 rpm for pulp_python
https://repos.fedorapeople.org/repos/pulp/pulp/testing/automation/2.17/stage/fedora-27/x86_64/

@Ichimonji10
Copy link
Contributor

Ichimonji10 commented Jun 28, 2018

Thanks for the suggestion. I executed the following:

pushd "$(mktemp --directory)"
wget \
  https://repos.fedorapeople.org/repos/pulp/pulp/testing/automation/2.17/stage/fedora-27/x86_64/pulp-python-admin-extensions-2.1.0-0.1.alpha.201806280304gitfabd48c.fc27.noarch.rpm \
  https://repos.fedorapeople.org/repos/pulp/pulp/testing/automation/2.17/stage/fedora-27/x86_64/pulp-python-plugins-2.1.0-0.1.alpha.201806280304gitfabd48c.fc27.noarch.rpm \
  https://repos.fedorapeople.org/repos/pulp/pulp/testing/automation/2.17/stage/fedora-27/x86_64/python-pulp-python-common-2.1.0-0.1.alpha.201806280304gitfabd48c.fc27.noarch.rpm
sudo dnf -y install *
popd
sudo systemctl reboot

...and re-ran the Pulp Smash tests. This fixed the traceback, and revealed an error to be fixed:

diff --git a/python-v2/pypi-assets/simple/distribution.html.template b/python-v2/pypi-assets/simple/distribution.html.template
index adcd303..d2e0657 100644
--- a/python-v2/pypi-assets/simple/distribution.html.template
+++ b/python-v2/pypi-assets/simple/distribution.html.template
@@ -7,7 +7,7 @@
 <body>
 {% for distribution in projects[project] %}
 
-  <a href="{{base_url}}/fixtures/python-pulp/packages/{{distribution}}" rel="internal">{{distribution}}</a><br />
+  <a href="{{base_url}}/fixtures/python-pypi/packages/{{distribution}}" rel="internal">{{distribution}}</a><br />
 {% endfor %}
 </body>
 </html>
diff --git a/python-v2/pypi-assets/urlformatter.py b/python-v2/pypi-assets/urlformatter.py
index 81c2295..f029b6b 100755
--- a/python-v2/pypi-assets/urlformatter.py
+++ b/python-v2/pypi-assets/urlformatter.py
@@ -37,7 +37,7 @@ def format_url(metadata, base_url):
     for distributions in metadata['releases'].values():
         for distribution in distributions:
             distribution['url'] = reduce(urljoin, [base_url,
-                                                   'fixtures/python-pulp/packages/',
+                                                   'fixtures/python-pypi/packages/',
                                                    distribution['filename']])
     return json.dumps(metadata, indent=4)

These new fixtures are not hardcoded, instead we read in the new
projects.json file that includes a list of python projects, and the
distributions to keep for those projects.

The metadata is obtained from http://pypi.org/pypi/{project}/json
The packages are obtained from the metadata distribution url.

closes #3608
https://pulp.plan.io/issues/3608
@werwty
Copy link
Contributor Author

werwty commented Jun 28, 2018

Good catch, Update has been pushed

Copy link
Contributor

@Ichimonji10 Ichimonji10 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💯 Great job! Also see: pulp/pulp-smash#1091

@Ichimonji10 Ichimonji10 merged commit 2e7dc7d into pulp:master Jun 28, 2018
ragabala pushed a commit to ragabala/pulp-ci that referenced this pull request Jul 10, 2018
The rewritten `fixtures/python-pypi` make target that Pulp Fixtures now
has requires the following new executables:

* jinja2
* jq
* python3

See: pulp/pulp-fixtures#82
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants