Prototype supporting the same package hash json API response that pypi provides #440

matteius · 2022-08-23T00:06:00Z

Note: This was fairly slow until I added a @functools.lru_cache(maxsize=1000) to the digest_file, then it became super fast even for the initial load.
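
A minimal sketch of the caching described in the note, assuming a standalone helper (`cached_digest` is a hypothetical name, not pypiserver's actual `digest_file` signature). It memoizes the digest per `(path, algorithm)` pair, so each file is read from disk only once per process:

```python
import functools
import hashlib


@functools.lru_cache(maxsize=1000)
def cached_digest(file_path: str, hash_algo: str = "sha256") -> str:
    """Hash a file in chunks; repeated calls with the same arguments hit the cache."""
    digester = hashlib.new(hash_algo)
    with open(file_path, "rb") as f:
        # Read in 64 KiB blocks so large archives are not loaded into memory at once.
        for block in iter(lambda: f.read(2**16), b""):
            digester.update(block)
    return digester.hexdigest()
```

The cache is keyed on the arguments only, so a file replaced on disk under the same path would keep returning the old digest until the process restarts or `cached_digest.cache_clear()` is called.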

Fixes #437

I see that the project has a configurable hashing algorithm, but it's really the sha256 hash that is important to pipenv, and the md5 (project default) would not be sufficient.

Sample output from local testing of this prototype.

{
	"info": {
		"version": "2.19.1"
	},
	"releases": {
		"2.19.1": [{
			"url": "http://127.0.0.1:8080/packages/requests/requests-2.19.1-py2.py3-none-any.whl",
			"digests": {
				"sha256": "sha256:63b52e3c866428a224f97cab011de738c36aec0185aa91cfacd418b5d58911d1"
			}
		}, {
			"url": "http://127.0.0.1:8080/packages/requests/requests-2.19.1.tar.gz",
			"digests": {
				"sha256": "sha256:ec22d826a36ed72a7358ff3fe56cbd4ba69dd7a6718ffd450ff0e9df7a47ce6a"
			}
		}],
		"2.18.4": [{
			"url": "http://127.0.0.1:8080/packages/requests/requests-2.18.4-py2.py3-none-any.whl",
			"digests": {
				"sha256": "sha256:6a1b267aa90cac58ac3a765d067950e7dbbf75b1da07e895d1f594193a40a38b"
			}
		}, {
			"url": "http://127.0.0.1:8080/packages/requests/requests-2.18.4.tar.gz",
			"digests": {
				"sha256": "sha256:9c443e7324ba5b85070c4a818ade28bfabedf16ea10206da1132edaa6dda237e"
			}
		}],
		"2.14.0": [{
			"url": "http://127.0.0.1:8080/packages/requests/requests-2.14.0-py2.py3-none-any.whl",
			"digests": {
				"sha256": "sha256:a90555c0be723f5c711de36f256b21a65fc599602274fb3d5c4f83ac23aae3c5"
			}
		}, {
			"url": "http://127.0.0.1:8080/packages/requests/requests-2.14.0.tar.gz",
			"digests": {
				"sha256": "sha256:8c4f778459cb4a6bad7ceff4aa65a75697db28c21a6b41ea9a6c371df2a822c2"
			}
		}],
		"1.0.0": [{
			"url": "http://127.0.0.1:8080/packages/requests/requests-1.0.0.tar.gz",
			"digests": {
				"sha256": "sha256:f10d8fbcc02a58056ab44f79ff9b3f9fe78e410296527885250bbb36d15be8c6"
			}
		}]
	}
}
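
For illustration, a client could collect the digests from a response of this shape roughly as follows. This is only a sketch: `collect_hashes` is a hypothetical helper, the payload is abbreviated from the sample output, and how pipenv actually consumes the endpoint is not shown in this PR.

```python
import json

# Abbreviated copy of the sample output (one release only).
SAMPLE = """
{
  "info": {"version": "2.19.1"},
  "releases": {
    "1.0.0": [{
      "url": "http://127.0.0.1:8080/packages/requests/requests-1.0.0.tar.gz",
      "digests": {"sha256": "sha256:f10d8fbcc02a58056ab44f79ff9b3f9fe78e410296527885250bbb36d15be8c6"}
    }]
  }
}
"""


def collect_hashes(payload: dict, version: str) -> list:
    """Return the sha256 values listed for every file of one release."""
    files = payload.get("releases", {}).get(version, [])
    return [f["digests"]["sha256"] for f in files if "sha256" in f.get("digests", {})]


hashes = collect_hashes(json.loads(SAMPLE), "1.0.0")
```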

…i provides to pipenv.

dee-me-tree-or-love · 2022-09-02T12:51:41Z

Hey @matteius! Thanks a lot for this PR, this is really awesome! I will try to look closer into this asap, but first will get the CI issues fixed. I see the test-docker failed steps are caused by some external issue, so I'd like to get this out of the way. Besides, I'll try to give a more thorough review soon, really sorry for all the waiting and thanks for your time :)

dee-me-tree-or-love · 2022-09-06T16:49:55Z

Hey @matteius! I've fixed the CI issue #439 in #444. It should now work well if you update your branch from master :) Thanks for your patience, I will get to the reviewing soon! ✌️

matteius · 2022-09-07T08:29:32Z

Thanks @dee-me-tree-or-love -- I have updated the branch from master. Willing to discuss or amend any parts of this PR as it makes sense. We merged the changes into pipenv, so pypiserver is now our test runner's private package server, and you'll probably be hearing more from me. Thanks for your help!

dee-me-tree-or-love · 2022-09-12T08:02:54Z

Hey @matteius, thanks for your news, this is super nice to hear 😊 I'm very happy that pypiserver suits your needs! And I'm certainly looking forward to more collaborations. All ideas and input are very welcome :D 👍

I've been a bit busy last week with some other chores, but I will try to give a proper review to your PR here this week, thank you for being on the line! I'll keep you posted.

matteius · 2022-09-13T15:36:27Z

We may be able to have this behavior driven off the preferred hash that is passed in on the CLI -- I think when I first took a stab at this I had no experience with the code base, so I didn't quite pick up on the fact that it is configurable.

dee-me-tree-or-love

Hey @matteius! I'm terribly sorry for taking such a while with the review, but I got my hands on it at last! It's looking great, and I agree with your comments here about the configurable hash function. I really like how the current prototype is written, so I just left some tiny comments with some cleanups and suggestions.

Most importantly, do you think it is feasible to adjust it so that it uses the configured hash? Then we can wait for #459, so that no changes on the client side are necessary for all the new server boots :)

In any case, one last tiny request: if you could write a couple of tests for this new feature, I'd be super grateful 😊 🙏 Let me know what you think, and sorry for taking ages once again! :D Thanks for the proposal. If I can be of any further help with this, do let me know ✌️

dee-me-tree-or-love · 2023-01-06T00:07:01Z

pypiserver/_app.py

@@ -373,6 +373,8 @@ def server_static(filename):


 @app.route("/:project/json")
+@app.route("/pypi/:project/json")
+@app.route("/simple/:project/json")


@matteius, oh, right! Could you elaborate a little on why these routes were added? :) I'm not sure I fully recognize them. If they are not strictly critical for this feature, do you mind if we add the routes in a separate PR? 🤔

dee-me-tree-or-love · 2023-01-06T00:08:35Z

pypiserver/_app.py

@@ -389,12 +391,29 @@ def json_info(project):
    if not packages:
        raise HTTPError(404, f"package {project} not found")

+    package_links = defaultdict(list)


Oh cool! I really like the use of defaultdict 👍
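
For readers less familiar with the pattern, a small sketch of the grouping that `defaultdict(list)` enables here. `Pkg` is a hypothetical stand-in for pypiserver's package objects, carrying only the two attributes used in the diff:

```python
from collections import defaultdict, namedtuple

# Hypothetical stand-in for pypiserver's package objects.
Pkg = namedtuple("Pkg", ["version", "relfn_unix"])

packages = [
    Pkg("2.19.1", "requests/requests-2.19.1-py2.py3-none-any.whl"),
    Pkg("2.19.1", "requests/requests-2.19.1.tar.gz"),
    Pkg("1.0.0", "requests/requests-1.0.0.tar.gz"),
]

package_links = defaultdict(list)
for pkg in packages:
    # A missing key is created as an empty list on first access,
    # so no explicit "if version not in package_links" check is needed.
    package_links[pkg.version].append(pkg.relfn_unix)
```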

dee-me-tree-or-love · 2023-01-06T00:09:36Z

pypiserver/_app.py

+    package_links = defaultdict(list)
+    for pkg in packages:
+        package_links[pkg.version].append(pkg.relfn_unix)
+    # links = [pkg.relfn_unix for pkg in packages]


Just as a little cleanup of unused things 😊

Suggested change

-    # links = [pkg.relfn_unix for pkg in packages]

dee-me-tree-or-love · 2023-01-06T00:10:50Z

pypiserver/_app.py

-    for x in packages:
-        releases[x.version] = [
-            {"url": urljoin(req_url, "../../packages/" + x.relfn)}
+    for package in packages:


Thanks for renaming this to package, imho, I like it also a bit better this way 👍

dee-me-tree-or-love · 2023-01-06T00:12:42Z

pypiserver/_app.py

+        matching_links = []
+        for version, links in package_links.items():
+            if version == package.version:
+                matching_links += links


It's sure a nit, but what do you think about doing it this way? 😀

Suggested change

-        matching_links = []
-        for version, links in package_links.items():
-            if version == package.version:
-                matching_links += links
+        matching_links = [links for version, links in package_links.items() if version == package.version]
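
One thing worth checking before applying this suggestion: as written, the comprehension collects each matching `links` list as a single element, so the result is a list of lists, whereas the original loop with `+=` produces a flat list. A small sketch with made-up data shows the difference, along with two flat alternatives (a nested comprehension and a direct lookup):

```python
from collections import defaultdict

package_links = defaultdict(list)
package_links["1.0.0"].append("requests/requests-1.0.0.tar.gz")
package_links["2.19.1"].extend(
    [
        "requests/requests-2.19.1-py2.py3-none-any.whl",
        "requests/requests-2.19.1.tar.gz",
    ]
)
wanted = "2.19.1"  # stands in for package.version

# Original loop: flat list of links.
loop_result = []
for version, links in package_links.items():
    if version == wanted:
        loop_result += links

# Comprehension as suggested: one inner list per matching version.
nested = [links for version, links in package_links.items() if version == wanted]

# Flat equivalents of the loop.
flat = [
    link
    for version, links in package_links.items()
    if version == wanted
    for link in links
]
lookup = list(package_links.get(wanted, []))
```

Since versions are already the dictionary keys, the direct lookup is probably the simplest of the three.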

dee-me-tree-or-love · 2023-01-06T00:24:22Z

pypiserver/_app.py

+                    )
+                },
+            }
+            for link in matching_links


Hey, just a wild thought got onto me here, do you think it would be possible to maintain packages and package_links as a single data structure? 🤔

thinking of something along those lines:

packages = sorted( ... )
# ...
packages_with_links = [dict(package=pkg, version=pkg.version, link=pkg.relfn_unix) for pkg in packages]
# ... and then working with `packages_with_links` in place of `matching_links` and `packages` nestedly?

But it's just a rough fantasy at this point, it sure needs working out better. What do you think?
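
One possible way to work this idea out, purely as a sketch: skip the separate `package_links` mapping and group the file entries by version in a single pass over the packages. `Pkg`, `build_releases`, and the `base_url` parameter are hypothetical stand-ins, not names from the PR.

```python
from collections import namedtuple
from urllib.parse import urljoin

# Hypothetical stand-in for pypiserver's package objects.
Pkg = namedtuple("Pkg", ["version", "relfn_unix"])


def build_releases(packages, base_url):
    """Map each version to the list of file entries that belong to it."""
    releases = {}
    for pkg in packages:
        entry = {"url": urljoin(base_url, "packages/" + pkg.relfn_unix)}
        # setdefault creates the list for a version the first time it is seen.
        releases.setdefault(pkg.version, []).append(entry)
    return releases
```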

dee-me-tree-or-love · 2023-01-06T00:26:10Z

pypiserver/_app.py

+            {
+                "url": urljoin(req_url, "../../packages/" + link),
+                "digests": {
+                    "sha256": config.backend.digest_sha256(


Here indeed, I thought it would be awesome to use the "configured" hash mechanism instead :) I think in relation to #459 upcoming in the next major release, I hope that this will make things work out of the box. What do you think? Would this be possible to experiment with?
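
A rough sketch of what using the configured hash could look like, assuming the algorithm name arrives as a plain string (`hash_algo` and `release_entry` are hypothetical; pypiserver's real configuration object and digest helpers are not reproduced here). The digest key follows whichever algorithm is configured instead of being fixed to `sha256`:

```python
import hashlib


def release_entry(url: str, content: bytes, hash_algo: str = "sha256") -> dict:
    """Build one file entry whose digest key matches the configured algorithm."""
    digest = hashlib.new(hash_algo, content).hexdigest()
    return {"url": url, "digests": {hash_algo: digest}}
```

With this shape, a server configured for sha256 would expose the `sha256` key, while one left on md5 would expose only `md5`, which is the concern raised in the PR description.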

dee-me-tree-or-love · 2023-01-06T00:27:16Z

pypiserver/backend.py

+    def digest_sha256(
+        self, pkg: PkgFile, file_name: str = None
+    ) -> t.Optional[str]:
+        pass


Just thinking as you mention too, if we could reuse the configured hash and then rely on the changes with #459 in the future, this would probably be not necessary. Do you think I figure this out correctly? :)

matteius added 3 commits August 22, 2022 20:02

Prototype supporting the same package hash json API response that pyp…

7a8aa13

…i provides to pipenv.

Use the correct specifier for the pypi sha256 API.

7d3ad0a

Fix issue where all packages were included in each version.

377d65c

This was referenced Aug 23, 2022

Convert test runner to use pypiserver package as standalone process pypa/pipenv#5284

Merged

Pip packaging and publishing improvements in pytorch wheels for better integration with poetry pytorch/pytorch#76557

Open

Run black formatting.

e92cf22

Merge branch 'master' into support-pypi-hashes

563b8c8

Merge branch 'master' into support-pypi-hashes

f9444bf

dee-me-tree-or-love self-requested a review October 10, 2022 08:18

dee-me-tree-or-love requested changes Jan 6, 2023

dee-me-tree-or-love added the features.endpoints Focusing on the endpoints / server functionality label Apr 27, 2023
