This tarball, though, is not verified against a signature or a hash. If the MuPDF tarball is modified, whether maliciously or unintentionally, the result is a non-reproducible PyMuPDF build, or a downright unsafe one.
Describe the solution you'd like
It would be a nice improvement to take advantage of the SHA-1 hashes on the MuPDF downloads page. This way, we could ensure proper reproducibility and security against supply chain attacks.
We can improve on this further by using SHA-256 hashes (since SHA-1 is considered unsafe) or PGP signatures.
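As a rough illustration of the proposal, here is a minimal sketch of checking a downloaded tarball against a pinned SHA-256 digest before building. The names `EXPECTED_SHA256`, `sha256_of_file` and `verify_tarball` are hypothetical and not part of PyMuPDF's setup.py, and the digest shown is a placeholder rather than the hash of any real MuPDF release.

```python
import hashlib

# Hypothetical placeholder: a real build would pin the digest of the exact
# MuPDF release it expects, committed alongside the build script.
EXPECTED_SHA256 = "0" * 64


def sha256_of_file(path, chunk_size=1 << 20):
    """Return the hex SHA-256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_tarball(path, expected_sha256):
    """Raise ValueError if the file's digest differs from the pinned one."""
    actual = sha256_of_file(path)
    if actual != expected_sha256.lower():
        raise ValueError(
            f"SHA-256 mismatch for {path}: "
            f"expected {expected_sha256}, got {actual}"
        )
```

Because the expected digest would live in the PyMuPDF source tree rather than on the download site, a change to the tarball would not go unnoticed even if the site's advertised hash changed with it.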
Describe alternatives you've considered
Users can:
1. Download the MuPDF source locally.
2. Check it against the SHA-1 hash on the website.
3. Build the PyMuPDF source using the PYMUPDF_SETUP_MUPDF_TGZ envvar.
This approach has several drawbacks, though:
- Environment flags defeat the purpose of reproducibility. A stale envvar means that PyMuPDF will build against an older MuPDF source, and users will most likely not notice.
- Checking the SHA-1 hash from the browser before building a package is a weak defense against a compromised site. If the contents of the tarball can change, so can the SHA-1 advertised on the same page.
- It interrupts the common poetry lock -> poetry install (or equivalent) flow that is part of modern Python development.
Is your feature request related to a problem? Please describe.
When building PyMuPDF from source, the default behavior is to download the MuPDF source tarball from the Internet:
PyMuPDF/setup.py, line 389 in e6e1daa