Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Mirroring fails for packages with hashes other than sha256 #996

Closed
Agalin opened this issue Sep 14, 2023 · 7 comments
Closed

Mirroring fails for packages with hashes other than sha256 #996

Agalin opened this issue Sep 14, 2023 · 7 comments

Comments

@Agalin
Copy link

Agalin commented Sep 14, 2023

Environment

Devpi version: 6.9.1 (looking at code, 6.9.2 should also be affected)

Packages list
aiohttp              3.8.4
aiosignal            1.3.1
argon2-cffi          21.3.0
argon2-cffi-bindings 21.2.0
async-timeout        4.0.2
attrs                23.1.0
beautifulsoup4       4.12.2
bleach               6.0.0
certifi              2023.5.7
cffi                 1.15.1
Chameleon            4.0.1
charset-normalizer   3.1.0
cmarkgfm             2022.10.27
defusedxml           0.7.1
devpi-common         3.7.2
devpi-ldap           2.1.0
devpi-server         6.9.1
devpi-web            4.2.1
docutils             0.20.1
frozenlist           1.3.3
hupper               1.12
idna                 3.4
itsdangerous         2.1.2
lazy                 1.5
ldap3                2.9.1
multidict            6.0.4
packaging            21.3
passlib              1.7.4
PasteDeploy          3.0.1
pip                  23.1.2
plaster              1.1.2
plaster-pastedeploy  1.0.1
platformdirs         3.8.0
pluggy               1.2.0
py                   1.11.0
pyasn1               0.5.0
pycparser            2.21
Pygments             2.15.1
pyparsing            3.1.0
pyramid              2.0.1
pyramid-chameleon    0.3
python-dateutil      2.8.2
PyYAML               6.0
readme-renderer      40.0
repoze.lru           0.7
requests             2.31.0
ruamel.yaml          0.17.32
ruamel.yaml.clib     0.2.7
setuptools           65.5.1
six                  1.16.0
soupsieve            2.4.1
strictyaml           1.7.3
translationstring    1.4
urllib3              2.0.3
venusian             3.0.0
waitress             2.1.2
webencodings         0.5.1
WebOb                1.8.7
wheel                0.40.0
Whoosh               2.7.4
yarl                 1.9.2
zope.deprecation     5.0
zope.interface       6.0
Python version: 3.11.4

OS: Alpine 3.18.2 (Docker)

Issue

Mirroring sometimes fails for packages without sha256 checksum with the following stacktrace snippet (I currently don't have the full stacktrace):

 File "/usr/local/lib/python3.11/site-packages/devpi_server/mirror.py", line 177, in parse_index_v1_json
    url = url.replace(fragment="=".join(next(hashes.items())))
                                        ^^^^^^^^^^^^^^^^^^^^

If we check the mentioned line, we'll see the following fragment:

        if 'sha256' in hashes:
            url = url.replace(fragment=f"sha256={hashes['sha256']}")
        elif hashes:
            url = url.replace(fragment="=".join(next(hashes.items())))

Hashes are loaded from a parsed JSON object which means it's a pure dict. For packages with other hashes than SHA256 it's using the second block.

But next requires an iterator as an argument. Which means this line is raising. It probably should be:

            url = url.replace(fragment="=".join(next(iter(hashes.items()))))

(although ideally it would use the strongest hash available, not the first one)

But there is a problem:

Sometimes it fails consistently for a specific package we have which is mirrored from a Nexus repo. And at other times (like right now) I'm failing to hit this error - I cannot find what's triggering this particular function, tried:

  • Opening a simple repo.
  • Opening a full repo.
  • Using the refresh option.
  • Downloading the file.

Is it possible only the automatic background refresh (on GET request) hits this line?

@fschulze
Copy link
Contributor

To clarify, you mirror something which isn't pypi.org and it doesn't provide sha256 sums? Could you provide a raw html example (with anonymized package name if needed)?

@Agalin
Copy link
Author

Agalin commented Sep 14, 2023

It's a Nexus mirror which currently provides SHA256 sums. Problem was observed for an old package that was already synced with MD5. Not sure if Nexus is still serving only MD5 for it. I'll see if I can retrieve the Nexus response.

Although line linked should just be fixed anyway:

>>> x = {"foo": "bar"}
>>> next(x.items())
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: 'dict_items' object is not an iterator
>>> next(iter(x.items()))
('foo', 'bar')

@Agalin
Copy link
Author

Agalin commented Sep 14, 2023

Here is the Nexus response - still using md5 for this package.

<html lang="en">
<head><title>Links for avro-python3</title>
  <meta name="api-version" value="2"/>
</head>
<body><h1>Links for avro-python3</h1>
        <a href="../../packages/avro-python3/1.9.0-snapshot-20190725/avro-python3-1.9.0-snapshot-20190725.tar.gz#md5=07f58693cc10857e520084588394337d" rel="internal"       data-requires-python="&gt;=3.4" >avro-python3-1.9.0-snapshot-20190725.tar.gz</a><br/>
    </body>
</html>

fschulze added a commit to fschulze/devpi that referenced this issue Oct 20, 2023
@fschulze
Copy link
Contributor

Ah, I only now realized that this is for an application/vnd.pypi.simple.v1+json response, not an html one. There was no test for that yet. Your fix looks correct.
The reason you don't always see it is most likely because of caching. The Refresh button on the simple page should work.

@Agalin
Copy link
Author

Agalin commented Oct 20, 2023

While I'm pretty sure a bug is still there, for some reason I don't see it any more after migrating the stack to the one without replicas.

@fschulze
Copy link
Contributor

I also have to check whether ETag handling conflicts with the Refresh button.

You didn't mention replicas before.

@Agalin
Copy link
Author

Agalin commented Oct 20, 2023

I didn't mention them before because I had no reasons to believe they're related. But now - after switching to a replica-free deployment I no longer see stacktraces in logs.

fschulze added a commit to fschulze/devpi that referenced this issue Oct 22, 2023
fschulze added a commit to fschulze/devpi that referenced this issue Oct 22, 2023
fschulze added a commit to fschulze/devpi that referenced this issue Oct 22, 2023
fschulze added a commit to fschulze/devpi that referenced this issue Oct 22, 2023
fschulze added a commit to fschulze/devpi that referenced this issue Oct 23, 2023
markmcclain pushed a commit to markmcclain/devpi that referenced this issue Jan 8, 2024
markmcclain pushed a commit to aristanetworks/devpi that referenced this issue Jan 8, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants