pip-21.0.1 is breaking reproducible wheels #218

Closed
kushaldas opened this issue Feb 12, 2021 · 35 comments

@kushaldas
Contributor

kushaldas commented Feb 12, 2021

The --build option is now deprecated. Tracking it in the upstream issue pypa/pip#9604.

Note: we should add a monthly CI job to test reproducibility using the latest setuptools and pip.

@emkll
Contributor

emkll commented Feb 12, 2021

Thanks for opening the issue, @kushaldas. Right now, the reprotest-wheels CI job is not failing against the latest main, so we wouldn't have caught it by simply adding a monthly CI job as-is: https://app.circleci.com/pipelines/github/freedomofpress/securedrop-debian-packaging/902/workflows/09def272-fb86-4906-a63a-53a9f6de4c60/jobs/6978

@kushaldas
Contributor Author

> Thanks for opening the issue, @kushaldas. Right now, the reprotest-wheels CI job is not failing against the latest main, so we wouldn't have caught it by simply adding a monthly CI job as-is: https://app.circleci.com/pipelines/github/freedomofpress/securedrop-debian-packaging/902/workflows/09def272-fb86-4906-a63a-53a9f6de4c60/jobs/6978

Yes, that is why I suggested using the latest setuptools and pip in that monthly job.

@emkll emkll added this to Near Term - SD Workstation in SecureDrop Team Board Feb 25, 2021
@pradyunsg

$ python -m build --wheel ./simplejson-3.17.2 --no-build-isolation
...
$ shasum -a 256 ./simplejson-3.17.2/dist/simplejson-3.17.2-cp38-cp38-macosx_10_9_x86_64.whl
6bbb49faf3a94233d3e1dd5059040b939169f2485426cbbef97dfbfa0ecda1ee  ./simplejson-3.17.2/dist/simplejson-3.17.2-cp38-cp38-macosx_10_9_x86_64.whl
$ python -m build --wheel ./simplejson-3.17.2 --no-build-isolation
...
$ shasum -a 256 ./simplejson-3.17.2/dist/simplejson-3.17.2-cp38-cp38-macosx_10_9_x86_64.whl
6bbb49faf3a94233d3e1dd5059040b939169f2485426cbbef97dfbfa0ecda1ee  ./simplejson-3.17.2/dist/simplejson-3.17.2-cp38-cp38-macosx_10_9_x86_64.whl
$ python -m build --wheel ./simplejson-3.17.2
...
$ shasum -a 256 ./simplejson-3.17.2/dist/simplejson-3.17.2-cp38-cp38-macosx_10_9_x86_64.whl
6bbb49faf3a94233d3e1dd5059040b939169f2485426cbbef97dfbfa0ecda1ee  ./simplejson-3.17.2/dist/simplejson-3.17.2-cp38-cp38-macosx_10_9_x86_64.whl

@kushaldas and I were chatting about this today, and... figured out how to get reproducibility with build working. 🎉

@pradyunsg

Well, I only needed to export SOURCE_DATE_EPOCH=1596163658. :)
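
A wheel is a zip archive, and every member records a modification time, so pinning that time is what lets repeated builds hash the same. The sketch below only illustrates that idea with the standard library; it is not the code that build or wheel actually run, and `build_archive` plus the file names are hypothetical. For reference, 1596163658 is 2020-07-31 02:47:38 UTC, which matches the `20-Jul-31 02:47` timestamps in the zipinfo output later in this thread.

```python
import hashlib
import io
import os
import time
import zipfile

# ZIP timestamps cannot represent dates before 1980-01-01.
ZIP_MIN_EPOCH = 315532800


def build_archive(files, epoch):
    """Write `files` (name -> bytes) into an in-memory zip with fixed metadata."""
    date_time = time.gmtime(max(epoch, ZIP_MIN_EPOCH))[:6]
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", zipfile.ZIP_DEFLATED) as zf:
        for name in sorted(files):  # stable member order
            info = zipfile.ZipInfo(name, date_time=date_time)
            info.compress_type = zipfile.ZIP_DEFLATED
            info.external_attr = 0o100644 << 16  # regular file, rw-r--r--
            zf.writestr(info, files[name])
    return buf.getvalue()


# Fall back to the value from this thread when the variable is not exported.
epoch = int(os.environ.get("SOURCE_DATE_EPOCH", "1596163658"))
files = {"pkg/__init__.py": b"", "pkg/core.py": b"print('hi')\n"}
first = hashlib.sha256(build_archive(files, epoch)).hexdigest()
second = hashlib.sha256(build_archive(files, epoch)).hexdigest()
print(first == second)
```

Without a pinned timestamp, each run would stamp the members with the current time and the two digests would differ.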

@kushaldas
Contributor Author

kushaldas commented Mar 3, 2021

This was strange

✦ ❯ diffoscope simplejson-3.17.2/dist/simplejson-3.17.2-cp37-cp37m-linux_x86_64.whl simplejson-3.17.2-cp37-cp37m-linux_x86_64.whl 
--- simplejson-3.17.2/dist/simplejson-3.17.2-cp37-cp37m-linux_x86_64.whl
+++ simplejson-3.17.2-cp37-cp37m-linux_x86_64.whl
├── zipinfo /dev/stdin
│ @@ -36,13 +36,13 @@
│  -rw-r--r--  2.0 unx      942 b- defN 20-Jul-31 02:47 simplejson/tests/test_separators.py
│  -rw-r--r--  2.0 unx     4144 b- defN 20-Jul-31 02:47 simplejson/tests/test_speedups.py
│  -rw-r--r--  2.0 unx      740 b- defN 20-Jul-31 02:47 simplejson/tests/test_str_subclass.py
│  -rw-r--r--  2.0 unx     1124 b- defN 20-Jul-31 02:47 simplejson/tests/test_subclass.py
│  -rw-r--r--  2.0 unx     3304 b- defN 20-Jul-31 02:47 simplejson/tests/test_tool.py
│  -rw-r--r--  2.0 unx     1831 b- defN 20-Jul-31 02:47 simplejson/tests/test_tuple.py
│  -rw-r--r--  2.0 unx     7056 b- defN 20-Jul-31 02:47 simplejson/tests/test_unicode.py
│ --rw-r--r--  2.0 unx    10375 b- defN 20-Jul-31 02:47 simplejson-3.17.2.dist-info/LICENSE.txt
│ +-rw-rw-r--  2.0 unx    10375 b- defN 20-Jul-31 02:47 simplejson-3.17.2.dist-info/LICENSE.txt
│  -rw-r--r--  2.0 unx     3130 b- defN 20-Jul-31 02:47 simplejson-3.17.2.dist-info/METADATA
│  -rw-r--r--  2.0 unx      104 b- defN 20-Jul-31 02:47 simplejson-3.17.2.dist-info/WHEEL
│ --rw-r--r--  2.0 unx       11 b- defN 20-Jul-31 02:47 simplejson-3.17.2.dist-info/top_level.txt
│ +-rw-rw-r--  2.0 unx       11 b- defN 20-Jul-31 02:47 simplejson-3.17.2.dist-info/top_level.txt
│  ?rw-rw-r--  2.0 unx     4037 b- defN 20-Jul-31 02:47 simplejson-3.17.2.dist-info/RECORD
│  46 files, 372778 bytes uncompressed, 116595 bytes compressed:  68.7%

I think this is due to the user who unpacked the tarball.

@kushaldas
Contributor Author

kushaldas commented Mar 3, 2021

I am finally getting the same wheel, but I had to remember to use a similar kind of non-root user while building.

I used the following script to test. Remember to have the latest setuptools in the environment.

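# Pin the timestamp that gets written into the wheel's zip metadata.
export SOURCE_DATE_EPOCH=1596163658
# Fetch the sdist only (no prebuilt wheels), then unpack it.
python3 -m pip download --no-binary :all: simplejson==3.17.2
tar -xvf simplejson-3.17.2.tar.gz
# Build without isolation so the setuptools already in the environment is used.
python3 -m build --wheel ./simplejson-3.17.2 --no-isolation
sha256sum simplejson-3.17.2/dist/simplejson-3.17.2-cp37-cp37m-linux_x86_64.whl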
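
For a CI check, the last step could be automated by comparing the digests from two independent runs. A minimal sketch, assuming each run leaves its wheels in its own directory; the function names and the directory layout are assumptions, not something the repository provides.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Return the hex SHA-256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def compare_wheel_dirs(first_dir, second_dir):
    """Return names of wheels whose digests differ or that exist on one side only."""
    first = {p.name: sha256_of(p) for p in Path(first_dir).glob("*.whl")}
    second = {p.name: sha256_of(p) for p in Path(second_dir).glob("*.whl")}
    names = first.keys() | second.keys()
    return sorted(n for n in names if first.get(n) != second.get(n))
```

An empty list from `compare_wheel_dirs("run1", "run2")` (hypothetical paths) would mean every wheel was rebuilt bit for bit.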

@kushaldas
Contributor Author

kushaldas commented Mar 3, 2021

One remaining question is about the build dependencies of a package, but I will know more about that in the future.

For example: if package A needs packages B and C as build dependencies, how does the build tool get them? Or is it on us to provide all the build dependencies?

@kushaldas kushaldas self-assigned this Mar 3, 2021
@kushaldas
Contributor Author

kushaldas commented Mar 4, 2021

Found one build dependency related failure:

Building python-dateutil-2.7.5
ERROR Missing dependencies:
        setuptools_scm
Traceback (most recent call last):
  File "./scripts/build-sync-wheels", line 127, in <module>
    main()
  File "./scripts/build-sync-wheels", line 103, in main
    subprocess.check_call(cmd)
  File "/usr/lib/python3.7/subprocess.py", line 347, in check_call
    raise CalledProcessError(retcode, cmd)
subprocess.CalledProcessError: Command '['python3', '-m', 'build', '--wheel', '/tmp/pip-wheel-build/python-dateutil-2.7.5', '--no-isolation', '-o', '/tmp/tmpwuz2lu92']' returned non-zero exit status 1.

setup_requires=['setuptools_scm'] means the package has only one build dependency.

@kushaldas
Contributor Author

kushaldas commented Mar 4, 2021

❔ ❔ I think this is better for overall isolation: until now these dependencies were downloaded via pip and we had no control over them. I think we should add them to our wheels too and reuse them in the build virtual environment.
I can see that two projects have, in total, two build-time dependencies.

python-dateutil-2.7.5/setup.py
66:      setup_requires=['setuptools_scm'],

chardet-3.0.4/setup.py
53:      setup_requires=pytest_runner,
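
The same search can be done on the syntax tree instead of with grep, which also catches calls where the keyword is on a different line than `setup(`. This is a sketch only: `find_setup_requires` is a hypothetical helper, and the two sample sources are simplified stand-ins, not the real setup.py files.

```python
import ast


def find_setup_requires(setup_py_source):
    """Return the source text of each setup_requires=... argument in a setup.py."""
    found = []
    for node in ast.walk(ast.parse(setup_py_source)):
        if isinstance(node, ast.Call):
            for kw in node.keywords:
                if kw.arg == "setup_requires":
                    # Keep the expression as written; it may be a name, not a literal.
                    found.append(ast.get_source_segment(setup_py_source, kw.value))
    return found


dateutil_like = (
    "from setuptools import setup\n"
    "setup(name='x', setup_requires=['setuptools_scm'])\n"
)
chardet_like = (
    "from setuptools import setup\n"
    "pytest_runner = []\n"
    "setup(name='y', setup_requires=pytest_runner)\n"
)
print(find_setup_requires(dateutil_like))
print(find_setup_requires(chardet_like))
```

When the value is a variable, as in the chardet line above, the result is just the variable name, so the file still has to be read to see what it resolves to.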

@kushaldas
Contributor Author

kushaldas commented Mar 4, 2021

Final verdict from the reprotest in the first run: 2 failed, 1 passed in 121.35s (0:02:01)

reprotest is showing the classic file permission error as I noticed above:

--- /tmp/tmp6zddqn9p/control
+++ /tmp/tmp6zddqn9p/experiment-1
___ source-root
_ ___ localwheels
_ _ ___ idna-2.7-py2.py3-none-any.whl
_ _ _ ___ zipinfo /dev/stdin
_ _ _ _ @@ -3,13 +3,13 @@
_ _ _ _  -rw-r--r--  2.0 unx     3299 b- defN 11-Jun-29 20:23 idna/codec.py
_ _ _ _  -rw-r--r--  2.0 unx      232 b- defN 11-Jun-29 20:23 idna/compat.py
_ _ _ _  -rw-r--r--  2.0 unx    11858 b- defN 11-Jun-29 20:23 idna/core.py
_ _ _ _  -rw-r--r--  2.0 unx    39285 b- defN 11-Jun-29 20:23 idna/idnadata.py
_ _ _ _  -rw-r--r--  2.0 unx     1749 b- defN 11-Jun-29 20:23 idna/intranges.py
_ _ _ _  -rw-r--r--  2.0 unx       21 b- defN 11-Jun-29 20:23 idna/package_data.py
_ _ _ _  -rw-r--r--  2.0 unx   197803 b- defN 11-Jun-29 20:23 idna/uts46data.py
_ _ _ _ --rwx------  2.0 unx     3947 b- defN 11-Jun-29 20:23 idna-2.7.dist-info/LICENSE.rst
_ _ _ _ +-rw-r--r--  2.0 unx     3947 b- defN 11-Jun-29 20:23 idna-2.7.dist-info/LICENSE.rst
_ _ _ _  -rw-r--r--  2.0 unx     8866 b- defN 11-Jun-29 20:23 idna-2.7.dist-info/METADATA
_ _ _ _  -rw-r--r--  2.0 unx      110 b- defN 11-Jun-29 20:23 idna-2.7.dist-info/WHEEL
_ _ _ _ --rwx------  2.0 unx        5 b- defN 11-Jun-29 20:23 idna-2.7.dist-info/top_level.txt
_ _ _ _ +-rw-r--r--  2.0 unx        5 b- defN 11-Jun-29 20:23 idna-2.7.dist-info/top_level.txt
_ _ _ _  ?rw-rw-r--  2.0 unx      945 b- defN 11-Jun-29 20:23 idna-2.7.dist-info/RECORD
_ _ _ _  13 files, 268178 bytes uncompressed, 56674 bytes compressed:  78.9%

second error

--- /tmp/tmpfsnhm73k/control                                                                                                                                                                       
+++ /tmp/tmpfsnhm73k/experiment-1                                                                                                                                                                  
___ source-root                                                                                                                                                                                    
_ ___ localwheels                                                                                                                                                                                  
_ _ ___ Mako-1.0.7-py3-none-any.whl                                                                                                                                                                
_ _ _ ___ zipinfo /dev/stdin                                                                                                                                                                       
_ _ _ _ @@ -21,15 +21,15 @@                                                                                                                                                                        
_ _ _ _  -rw-r--r--  2.0 unx     2079 b- defN 11-Jun-29 20:23 mako/ext/babelplugin.py                                                                                                              
_ _ _ _  -rw-r--r--  2.0 unx     2365 b- defN 11-Jun-29 20:23 mako/ext/beaker_cache.py                                                                                                             
_ _ _ _  -rw-r--r--  2.0 unx     4261 b- defN 11-Jun-29 20:23 mako/ext/extract.py                                                                                                                  
_ _ _ _  -rw-r--r--  2.0 unx     1663 b- defN 11-Jun-29 20:23 mako/ext/linguaplugin.py                                                                                                             
_ _ _ _  -rw-r--r--  2.0 unx      580 b- defN 11-Jun-29 20:23 mako/ext/preprocessors.py                                                                                                            
_ _ _ _  -rw-r--r--  2.0 unx     4530 b- defN 11-Jun-29 20:23 mako/ext/pygmentplugin.py                                                                                                            
_ _ _ _  -rw-r--r--  2.0 unx     2132 b- defN 11-Jun-29 20:23 mako/ext/turbogears.py                                                                                                               
_ _ _ _ --rw-rw-r--  2.0 unx      282 b- defN 11-Jun-29 20:23 Mako-1.0.7.dist-info/AUTHORS                                                                                                         
_ _ _ _ --rw-rw-r--  2.0 unx     1217 b- defN 11-Jun-29 20:23 Mako-1.0.7.dist-info/LICENSE                                                                                                         
_ _ _ _ +-rw-r--r--  2.0 unx      282 b- defN 11-Jun-29 20:23 Mako-1.0.7.dist-info/AUTHORS                                                                                                         
_ _ _ _ +-rw-r--r--  2.0 unx     1217 b- defN 11-Jun-29 20:23 Mako-1.0.7.dist-info/LICENSE                                                                                                         
_ _ _ _  -rw-r--r--  2.0 unx     2233 b- defN 11-Jun-29 20:23 Mako-1.0.7.dist-info/METADATA                                                                                                        
_ _ _ _  -rw-r--r--  2.0 unx       92 b- defN 11-Jun-29 20:23 Mako-1.0.7.dist-info/WHEEL                                                                                                           
_ _ _ _ --rw-rw-r--  2.0 unx      586 b- defN 11-Jun-29 20:23 Mako-1.0.7.dist-info/entry_points.txt                                                                                                
_ _ _ _ --rw-rw-r--  2.0 unx        5 b- defN 11-Jun-29 20:23 Mako-1.0.7.dist-info/top_level.txt                                                                                                   
_ _ _ _ +-rw-r--r--  2.0 unx      586 b- defN 11-Jun-29 20:23 Mako-1.0.7.dist-info/entry_points.txt                                                                                                
_ _ _ _ +-rw-r--r--  2.0 unx        5 b- defN 11-Jun-29 20:23 Mako-1.0.7.dist-info/top_level.txt                                                                                                   
_ _ _ _  ?rw-rw-r--  2.0 unx     2484 b- defN 11-Jun-29 20:23 Mako-1.0.7.dist-info/RECORD                                                                                                          
_ _ _ _  33 files, 273001 bytes uncompressed, 72733 bytes compressed:  73.4%                                                                                                                       
_ _ ___ SQLAlchemy-1.3.3-cp37-cp37m-linux_x86_64.whl                                                                                                                                               
_ _ _ ___ zipinfo /dev/stdin                                                                                                                                                                       
_ _ _ _ @@ -191,13 +191,13 @@                                                                                                                                                                      
_ _ _ _  -rw-r--r--  2.0 unx     6580 b- defN 11-Jun-29 20:23 sqlalchemy/util/__init__.py                                                                                                          
_ _ _ _  -rw-r--r--  2.0 unx    29153 b- defN 11-Jun-29 20:23 sqlalchemy/util/_collections.py                                                                                                      
_ _ _ _  -rw-r--r--  2.0 unx    11264 b- defN 11-Jun-29 20:23 sqlalchemy/util/compat.py                                                                                                            
_ _ _ _  -rw-r--r--  2.0 unx     7169 b- defN 11-Jun-29 20:23 sqlalchemy/util/deprecations.py                                                                                                      
_ _ _ _  -rw-r--r--  2.0 unx    47645 b- defN 11-Jun-29 20:23 sqlalchemy/util/langhelpers.py                                                                                                       
_ _ _ _  -rw-r--r--  2.0 unx     6827 b- defN 11-Jun-29 20:23 sqlalchemy/util/queue.py                                                                                                             
_ _ _ _  -rw-r--r--  2.0 unx     2767 b- defN 11-Jun-29 20:23 sqlalchemy/util/topological.py                                                                                                       
_ _ _ _ --rw-rw-r--  2.0 unx     1229 b- defN 11-Jun-29 20:23 SQLAlchemy-1.3.3.dist-info/LICENSE                                                                                                   
_ _ _ _ +-rw-r--r--  2.0 unx     1229 b- defN 11-Jun-29 20:23 SQLAlchemy-1.3.3.dist-info/LICENSE                                                                                                   
_ _ _ _  -rw-r--r--  2.0 unx     7122 b- defN 11-Jun-29 20:23 SQLAlchemy-1.3.3.dist-info/METADATA                                                                                                  
_ _ _ _  -rw-r--r--  2.0 unx      104 b- defN 11-Jun-29 20:23 SQLAlchemy-1.3.3.dist-info/WHEEL                                                                                                     
_ _ _ _ --rw-rw-r--  2.0 unx       11 b- defN 11-Jun-29 20:23 SQLAlchemy-1.3.3.dist-info/top_level.txt                                                                                             
_ _ _ _ +-rw-r--r--  2.0 unx       11 b- defN 11-Jun-29 20:23 SQLAlchemy-1.3.3.dist-info/top_level.txt                                                                                             
_ _ _ _  ?rw-rw-r--  2.0 unx    17870 b- defN 11-Jun-29 20:23 SQLAlchemy-1.3.3.dist-info/RECORD                                                                                                    
_ _ _ _  201 files, 4616863 bytes uncompressed, 1159855 bytes compressed:  74.9%                                                                                                                   
_ _ ___ alembic-1.0.2-py2.py3-none-any.whl                                                                                                                                                         
_ _ _ ___ zipinfo /dev/stdin 
_ _ _ _ @@ -25,26 +25,26 @@                                                                                                                                                                        
_ _ _ _  -rw-r--r--  2.0 unx     4870 b- defN 11-Jun-29 20:23 alembic/operations/toimpl.py                                                                                                         
_ _ _ _  -rw-r--r--  2.0 unx        0 b- defN 11-Jun-29 20:23 alembic/runtime/__init__.py                                                                                                          
_ _ _ _  -rw-r--r--  2.0 unx    37944 b- defN 11-Jun-29 20:23 alembic/runtime/environment.py                                                                                                       
_ _ _ _  -rw-r--r--  2.0 unx    34983 b- defN 11-Jun-29 20:23 alembic/runtime/migration.py                                                                                                         
_ _ _ _  -rw-r--r--  2.0 unx       91 b- defN 11-Jun-29 20:23 alembic/script/__init__.py                                                                                                           
_ _ _ _  -rw-r--r--  2.0 unx    30424 b- defN 11-Jun-29 20:23 alembic/script/base.py                                                                                                               
_ _ _ _  -rw-r--r--  2.0 unx    32534 b- defN 11-Jun-29 20:23 alembic/script/revision.py                                                                                                           
_ _ _ _ --rw-rw-r--  2.0 unx       38 b- defN 11-Jun-29 20:23 alembic/templates/generic/README                                                                                                     
_ _ _ _ --rw-rw-r--  2.0 unx     1680 b- defN 11-Jun-29 20:23 alembic/templates/generic/alembic.ini.mako                                                                                           
_ _ _ _ --rw-rw-r--  2.0 unx     1990 b- defN 11-Jun-29 20:23 alembic/templates/generic/env.py                                                                                                     
_ _ _ _ --rw-rw-r--  2.0 unx      494 b- defN 11-Jun-29 20:23 alembic/templates/generic/script.py.mako                                                                                             
_ _ _ _ --rw-rw-r--  2.0 unx       41 b- defN 11-Jun-29 20:23 alembic/templates/multidb/README                                                                                                     
_ _ _ _ --rw-rw-r--  2.0 unx     1775 b- defN 11-Jun-29 20:23 alembic/templates/multidb/alembic.ini.mako                                                                                           
_ _ _ _ --rw-rw-r--  2.0 unx     4147 b- defN 11-Jun-29 20:23 alembic/templates/multidb/env.py                                                                                                     
_ _ _ _ --rw-rw-r--  2.0 unx      923 b- defN 11-Jun-29 20:23 alembic/templates/multidb/script.py.mako                                                                                             
_ _ _ _ --rw-rw-r--  2.0 unx       59 b- defN 11-Jun-29 20:23 alembic/templates/pylons/README                                                                                                      
_ _ _ _ --rw-rw-r--  2.0 unx     1159 b- defN 11-Jun-29 20:23 alembic/templates/pylons/alembic.ini.mako                                                                                            
_ _ _ _ --rw-rw-r--  2.0 unx     2221 b- defN 11-Jun-29 20:23 alembic/templates/pylons/env.py                                                                                                      
_ _ _ _ --rw-rw-r--  2.0 unx      494 b- defN 11-Jun-29 20:23 alembic/templates/pylons/script.py.mako                                                                                              
_ _ _ _ +-rw-r--r--  2.0 unx       38 b- defN 11-Jun-29 20:23 alembic/templates/generic/README                                                                                                     
_ _ _ _ +-rw-r--r--  2.0 unx     1680 b- defN 11-Jun-29 20:23 alembic/templates/generic/alembic.ini.mako                                                                                           
_ _ _ _ +-rw-r--r--  2.0 unx     1990 b- defN 11-Jun-29 20:23 alembic/templates/generic/env.py                                                                                                     
_ _ _ _ +-rw-r--r--  2.0 unx      494 b- defN 11-Jun-29 20:23 alembic/templates/generic/script.py.mako                                                                                             
_ _ _ _ +-rw-r--r--  2.0 unx       41 b- defN 11-Jun-29 20:23 alembic/templates/multidb/README                                                                                                     
_ _ _ _ +-rw-r--r--  2.0 unx     1775 b- defN 11-Jun-29 20:23 alembic/templates/multidb/alembic.ini.mako                                                                                           
_ _ _ _ +-rw-r--r--  2.0 unx     4147 b- defN 11-Jun-29 20:23 alembic/templates/multidb/env.py                                                                                                     
_ _ _ _ +-rw-r--r--  2.0 unx      923 b- defN 11-Jun-29 20:23 alembic/templates/multidb/script.py.mako                                                                                             
_ _ _ _ +-rw-r--r--  2.0 unx       59 b- defN 11-Jun-29 20:23 alembic/templates/pylons/README                                                                                                      
_ _ _ _ +-rw-r--r--  2.0 unx     1159 b- defN 11-Jun-29 20:23 alembic/templates/pylons/alembic.ini.mako                                                                                            
_ _ _ _ +-rw-r--r--  2.0 unx     2221 b- defN 11-Jun-29 20:23 alembic/templates/pylons/env.py                                                                                                      
_ _ _ _ +-rw-r--r--  2.0 unx      494 b- defN 11-Jun-29 20:23 alembic/templates/pylons/script.py.mako                                                                                              
_ _ _ _  -rw-r--r--  2.0 unx      253 b- defN 11-Jun-29 20:23 alembic/testing/__init__.py                                                                                                          
_ _ _ _  -rw-r--r--  2.0 unx     5931 b- defN 11-Jun-29 20:23 alembic/testing/assertions.py                                                                                                        
_ _ _ _  -rw-r--r--  2.0 unx      310 b- defN 11-Jun-29 20:23 alembic/testing/compat.py                                                                                                            
_ _ _ _  -rw-r--r--  2.0 unx     2544 b- defN 11-Jun-29 20:23 alembic/testing/config.py                                                                                                            
_ _ _ _  -rw-r--r--  2.0 unx      766 b- defN 11-Jun-29 20:23 alembic/testing/engines.py                                                                                                           
_ _ _ _  -rw-r--r--  2.0 unx     9485 b- defN 11-Jun-29 20:23 alembic/testing/env.py                                                                                                               
_ _ _ _  -rw-r--r--  2.0 unx    12825 b- defN 11-Jun-29 20:23 alembic/testing/exclusions.py

_ _ _ _ @@ -63,14 +63,14 @@
_ _ _ _  -rw-r--r--  2.0 unx      687 b- defN 11-Jun-29 20:23 alembic/util/__init__.py
_ _ _ _  -rw-r--r--  2.0 unx     6822 b- defN 11-Jun-29 20:23 alembic/util/compat.py
_ _ _ _  -rw-r--r--  2.0 unx       40 b- defN 11-Jun-29 20:23 alembic/util/exc.py
_ _ _ _  -rw-r--r--  2.0 unx     9640 b- defN 11-Jun-29 20:23 alembic/util/langhelpers.py
_ _ _ _  -rw-r--r--  2.0 unx     2442 b- defN 11-Jun-29 20:23 alembic/util/messaging.py
_ _ _ _  -rw-r--r--  2.0 unx     2677 b- defN 11-Jun-29 20:23 alembic/util/pyfiles.py
_ _ _ _  -rw-r--r--  2.0 unx     6718 b- defN 11-Jun-29 20:23 alembic/util/sqla_compat.py
_ _ _ _ --rw-rw-r--  2.0 unx     1184 b- defN 11-Jun-29 20:23 alembic-1.0.2.dist-info/LICENSE
_ _ _ _ +-rw-r--r--  2.0 unx     1184 b- defN 11-Jun-29 20:23 alembic-1.0.2.dist-info/LICENSE
_ _ _ _  -rw-r--r--  2.0 unx     6038 b- defN 11-Jun-29 20:23 alembic-1.0.2.dist-info/METADATA
_ _ _ _  -rw-r--r--  2.0 unx      110 b- defN 11-Jun-29 20:23 alembic-1.0.2.dist-info/WHEEL
_ _ _ _ --rw-rw-r--  2.0 unx       49 b- defN 11-Jun-29 20:23 alembic-1.0.2.dist-info/entry_points.txt
_ _ _ _ --rw-rw-r--  2.0 unx        8 b- defN 11-Jun-29 20:23 alembic-1.0.2.dist-info/top_level.txt
_ _ _ _ +-rw-r--r--  2.0 unx       49 b- defN 11-Jun-29 20:23 alembic-1.0.2.dist-info/entry_points.txt
_ _ _ _ +-rw-r--r--  2.0 unx        8 b- defN 11-Jun-29 20:23 alembic-1.0.2.dist-info/top_level.txt
_ _ _ _  ?rw-rw-r--  2.0 unx     6225 b- defN 11-Jun-29 20:23 alembic-1.0.2.dist-info/RECORD
_ _ _ _  74 files, 571968 bytes uncompressed, 146419 bytes compressed:  74.4%
_ _ ___ idna-2.7-py2.py3-none-any.whl
_ _ _ ___ zipinfo /dev/stdin
_ _ _ _ @@ -3,13 +3,13 @@
_ _ _ _  -rw-r--r--  2.0 unx     3299 b- defN 11-Jun-29 20:23 idna/codec.py
_ _ _ _  -rw-r--r--  2.0 unx      232 b- defN 11-Jun-29 20:23 idna/compat.py
_ _ _ _  -rw-r--r--  2.0 unx    11858 b- defN 11-Jun-29 20:23 idna/core.py
_ _ _ _  -rw-r--r--  2.0 unx    39285 b- defN 11-Jun-29 20:23 idna/idnadata.py
_ _ _ _  -rw-r--r--  2.0 unx     1749 b- defN 11-Jun-29 20:23 idna/intranges.py
_ _ _ _  -rw-r--r--  2.0 unx       21 b- defN 11-Jun-29 20:23 idna/package_data.py
_ _ _ _  -rw-r--r--  2.0 unx   197803 b- defN 11-Jun-29 20:23 idna/uts46data.py
_ _ _ _ --rwx------  2.0 unx     3947 b- defN 11-Jun-29 20:23 idna-2.7.dist-info/LICENSE.rst
_ _ _ _ +-rw-r--r--  2.0 unx     3947 b- defN 11-Jun-29 20:23 idna-2.7.dist-info/LICENSE.rst
_ _ _ _  -rw-r--r--  2.0 unx     8866 b- defN 11-Jun-29 20:23 idna-2.7.dist-info/METADATA
_ _ _ _  -rw-r--r--  2.0 unx      110 b- defN 11-Jun-29 20:23 idna-2.7.dist-info/WHEEL
_ _ _ _ --rwx------  2.0 unx        5 b- defN 11-Jun-29 20:23 idna-2.7.dist-info/top_level.txt
_ _ _ _ +-rw-r--r--  2.0 unx        5 b- defN 11-Jun-29 20:23 idna-2.7.dist-info/top_level.txt
_ _ _ _  ?rw-rw-r--  2.0 unx      945 b- defN 11-Jun-29 20:23 idna-2.7.dist-info/RECORD
_ _ _ _  13 files, 268178 bytes uncompressed, 56674 bytes compressed:  78.9%
FAILED

third error (we are missing a system-level dependency: the libyaml header yaml.h)

building '_yaml' extension                                                                                                                                                                         
creating build/temp.linux-x86_64-3.7                                                                                                                                                               
creating build/temp.linux-x86_64-3.7/ext                                                                                                                                                           
x86_64-linux-gnu-gcc -pthread -DNDEBUG -g -fwrapv -O2 -Wall -g -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -fPIC -I/home/kdas/code/securedrop-debian-packaging/.venv/include -I/usr/include/python3.7m -c ext/_yaml.c -o build/temp.linux-x86_64-3.7/ext/_yaml.o
In file included from ext/_yaml.c:596:                                                                                                                                                             
ext/_yaml.h:2:10: fatal error: yaml.h: No such file or directory                                                                                                                                   
 #include <yaml.h>                                                                                                                                                                                 
          ^~~~~~~~                                                                                                                                                                                 
compilation terminated.                                                                                                                                                                            
Error compiling module, falling back to pure Python                                                                                                                                                
installing to build/bdist.linux-x86_64/wheel                                                                                                                                                       
running install                                                          

@pradyunsg

FWIW, you arguably have more control with build, since you can use the API and have a custom isolated environment that installs from wheels that you've built with a reproducible process.

@FFY00

FFY00 commented Mar 5, 2021

> that installs from wheels that you've built with a reproducible process

Uh, nope. It doesn't currently 😕. I really want that, but currently we use pip to install into our isolated environment. I was considering maybe using pip install --download and https://github.com/pradyunsg/installer once the installer is usable.

@FFY00

FFY00 commented Mar 5, 2021

Ahhh, never mind. I just woke up and can't read straight 😅. Yeah, you can use the API to install the wheels you've built, but they won't necessarily be installed in a reproducible way. I guess that unreproducibility could leak into your build process, but I don't think it should be that much of a problem.

@conorsch
Contributor

conorsch commented Mar 5, 2021

> I think due to the user who unpacked the tarball. [...] but had to remember to use the similar kind of non-root user while building [...] reprotest is showing the classic file permission error as I noticed above

@kushaldas Are you setting umask? The scripts in this repo force umask to avoid variance like what you describe; see https://github.com/freedomofpress/securedrop-debian-packaging/blob/9fb5e7f739ec16b02d1d25ee88137c84702ce388/scripts/build-sync-wheels#L19-L20. If you have specific steps to reproduce (STR), or better yet, a branch, please share!
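
To make the umask effect concrete, here is a small sketch of how the mask changes the mode of a newly created file. `write_with_umask` is a hypothetical helper, and 0o022 and 0o002 are example values chosen for illustration, not necessarily what the linked script sets.

```python
import os
import stat
import tempfile


def write_with_umask(directory, name, data, umask=0o022):
    """Create a file under a fixed umask and return its permission string."""
    previous = os.umask(umask)  # os.umask returns the old mask
    try:
        path = os.path.join(directory, name)
        # Request rw for everyone; the umask removes bits from this request.
        fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o666)
        with os.fdopen(fd, "wb") as fh:
            fh.write(data)
    finally:
        os.umask(previous)  # restore the caller's mask
    return stat.filemode(os.stat(path).st_mode)


with tempfile.TemporaryDirectory() as tmp:
    print(write_with_umask(tmp, "a.txt", b"x", umask=0o022))
    print(write_with_umask(tmp, "b.txt", b"x", umask=0o002))
```

On a POSIX system the two calls should print -rw-r--r-- and -rw-rw-r--, the same pair of modes that differ in the diffs above, which is why forcing one umask for the whole build removes this source of variance.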

you can use the API to install the wheels you've built, but they won't necessarily be installed in a reproducible way. I guess that irreproducibility could leak into your build process, but I don't think it should be that much of a problem.

My understanding is that if the wheels themselves are reproducibly built, that's all we need to get fully reproducible Debian packages. At least, the variation in metadata within the wheels was the last bit of non-determinism we had to track down and stamp out, in #211 and #213.

@kushaldas
Contributor Author

Somehow the temporary directory is getting cleaned up. Look at the directory content at the top and then after build:

Building urllib3-1.25.10
Contents of /tmp/tmp9u5mtvp9 before the build command
['chardet-3.0.4.tar.gz',
 'arrow-0.12.1.tar.gz',
 'requests-2.22.0.tar.gz',
 'six-1.11.0.tar.gz',
 'urllib3-1.25.10.tar.gz',
 'alembic-1.0.2.tar.gz',
 'pathlib2-2.3.2.tar.gz',
 'certifi-2018.10.15.tar.gz',
 'idna-2.7.tar.gz',
 'securedrop-sdk-0.2.0.tar.gz',
 'MarkupSafe-1.1.1.tar.gz',
 'Mako-1.0.7.tar.gz',
 'python-dateutil-2.7.5.tar.gz',
 'SQLAlchemy-1.3.3.tar.gz',
 'python-editor-1.0.3.tar.gz']
------------------------------

running egg_info
writing src/urllib3.egg-info/PKG-INFO
writing dependency_links to src/urllib3.egg-info/dependency_links.txt
writing requirements to src/urllib3.egg-info/requires.txt
writing top-level names to src/urllib3.egg-info/top_level.txt
reading manifest file 'src/urllib3.egg-info/SOURCES.txt'
reading manifest template 'MANIFEST.in'
warning: no files found matching 'Makefile'
warning: no previously-included files matching '*' found under directory 'docs/_build'
writing manifest file 'src/urllib3.egg-info/SOURCES.txt'
running bdist_wheel
running build
running build_py
creating build
creating build/lib
creating build/lib/urllib3
copying src/urllib3/connection.py -> build/lib/urllib3
copying src/urllib3/_version.py -> build/lib/urllib3
copying src/urllib3/connectionpool.py -> build/lib/urllib3
copying src/urllib3/request.py -> build/lib/urllib3
copying src/urllib3/exceptions.py -> build/lib/urllib3
copying src/urllib3/filepost.py -> build/lib/urllib3
copying src/urllib3/response.py -> build/lib/urllib3
copying src/urllib3/fields.py -> build/lib/urllib3
copying src/urllib3/_collections.py -> build/lib/urllib3
copying src/urllib3/__init__.py -> build/lib/urllib3
copying src/urllib3/poolmanager.py -> build/lib/urllib3
creating build/lib/urllib3/packages
copying src/urllib3/packages/six.py -> build/lib/urllib3/packages
copying src/urllib3/packages/__init__.py -> build/lib/urllib3/packages
creating build/lib/urllib3/packages/ssl_match_hostname
copying src/urllib3/packages/ssl_match_hostname/_implementation.py -> build/lib/urllib3/packages/ssl_match_hostname
copying src/urllib3/packages/ssl_match_hostname/__init__.py -> build/lib/urllib3/packages/ssl_match_hostname
creating build/lib/urllib3/packages/backports
copying src/urllib3/packages/backports/__init__.py -> build/lib/urllib3/packages/backports
copying src/urllib3/packages/backports/makefile.py -> build/lib/urllib3/packages/backports
creating build/lib/urllib3/contrib
copying src/urllib3/contrib/socks.py -> build/lib/urllib3/contrib
copying src/urllib3/contrib/_appengine_environ.py -> build/lib/urllib3/contrib
copying src/urllib3/contrib/appengine.py -> build/lib/urllib3/contrib
copying src/urllib3/contrib/securetransport.py -> build/lib/urllib3/contrib
copying src/urllib3/contrib/pyopenssl.py -> build/lib/urllib3/contrib
copying src/urllib3/contrib/__init__.py -> build/lib/urllib3/contrib
copying src/urllib3/contrib/ntlmpool.py -> build/lib/urllib3/contrib
creating build/lib/urllib3/contrib/_securetransport
copying src/urllib3/contrib/_securetransport/low_level.py -> build/lib/urllib3/contrib/_securetransport
copying src/urllib3/contrib/_securetransport/bindings.py -> build/lib/urllib3/contrib/_securetransport
copying src/urllib3/contrib/_securetransport/__init__.py -> build/lib/urllib3/contrib/_securetransport
creating build/lib/urllib3/util
copying src/urllib3/util/connection.py -> build/lib/urllib3/util
copying src/urllib3/util/timeout.py -> build/lib/urllib3/util
copying src/urllib3/util/queue.py -> build/lib/urllib3/util
copying src/urllib3/util/request.py -> build/lib/urllib3/util
copying src/urllib3/util/ssl_.py -> build/lib/urllib3/util
copying src/urllib3/util/response.py -> build/lib/urllib3/util
copying src/urllib3/util/wait.py -> build/lib/urllib3/util
copying src/urllib3/util/retry.py -> build/lib/urllib3/util
copying src/urllib3/util/__init__.py -> build/lib/urllib3/util
copying src/urllib3/util/url.py -> build/lib/urllib3/util
installing to build/bdist.linux-x86_64/wheel
running install
running install_lib
creating build/bdist.linux-x86_64
creating build/bdist.linux-x86_64/wheel
creating build/bdist.linux-x86_64/wheel/urllib3
copying build/lib/urllib3/connection.py -> build/bdist.linux-x86_64/wheel/urllib3
copying build/lib/urllib3/_version.py -> build/bdist.linux-x86_64/wheel/urllib3
creating build/bdist.linux-x86_64/wheel/urllib3/contrib
copying build/lib/urllib3/contrib/socks.py -> build/bdist.linux-x86_64/wheel/urllib3/contrib
copying build/lib/urllib3/contrib/_appengine_environ.py -> build/bdist.linux-x86_64/wheel/urllib3/contrib
copying build/lib/urllib3/contrib/appengine.py -> build/bdist.linux-x86_64/wheel/urllib3/contrib
copying build/lib/urllib3/contrib/securetransport.py -> build/bdist.linux-x86_64/wheel/urllib3/contrib
creating build/bdist.linux-x86_64/wheel/urllib3/contrib/_securetransport
copying build/lib/urllib3/contrib/_securetransport/low_level.py -> build/bdist.linux-x86_64/wheel/urllib3/contrib/_securetransport
copying build/lib/urllib3/contrib/_securetransport/bindings.py -> build/bdist.linux-x86_64/wheel/urllib3/contrib/_securetransport
copying build/lib/urllib3/contrib/_securetransport/__init__.py -> build/bdist.linux-x86_64/wheel/urllib3/contrib/_securetransport
copying build/lib/urllib3/contrib/pyopenssl.py -> build/bdist.linux-x86_64/wheel/urllib3/contrib
copying build/lib/urllib3/contrib/__init__.py -> build/bdist.linux-x86_64/wheel/urllib3/contrib
copying build/lib/urllib3/contrib/ntlmpool.py -> build/bdist.linux-x86_64/wheel/urllib3/contrib
copying build/lib/urllib3/connectionpool.py -> build/bdist.linux-x86_64/wheel/urllib3
copying build/lib/urllib3/request.py -> build/bdist.linux-x86_64/wheel/urllib3
copying build/lib/urllib3/exceptions.py -> build/bdist.linux-x86_64/wheel/urllib3
copying build/lib/urllib3/filepost.py -> build/bdist.linux-x86_64/wheel/urllib3
copying build/lib/urllib3/response.py -> build/bdist.linux-x86_64/wheel/urllib3
copying build/lib/urllib3/fields.py -> build/bdist.linux-x86_64/wheel/urllib3
creating build/bdist.linux-x86_64/wheel/urllib3/util
copying build/lib/urllib3/util/connection.py -> build/bdist.linux-x86_64/wheel/urllib3/util
copying build/lib/urllib3/util/timeout.py -> build/bdist.linux-x86_64/wheel/urllib3/util
copying build/lib/urllib3/util/queue.py -> build/bdist.linux-x86_64/wheel/urllib3/util
copying build/lib/urllib3/util/request.py -> build/bdist.linux-x86_64/wheel/urllib3/util
copying build/lib/urllib3/util/ssl_.py -> build/bdist.linux-x86_64/wheel/urllib3/util
copying build/lib/urllib3/util/response.py -> build/bdist.linux-x86_64/wheel/urllib3/util
copying build/lib/urllib3/util/wait.py -> build/bdist.linux-x86_64/wheel/urllib3/util
copying build/lib/urllib3/util/retry.py -> build/bdist.linux-x86_64/wheel/urllib3/util
copying build/lib/urllib3/util/__init__.py -> build/bdist.linux-x86_64/wheel/urllib3/util
copying build/lib/urllib3/util/url.py -> build/bdist.linux-x86_64/wheel/urllib3/util
copying build/lib/urllib3/_collections.py -> build/bdist.linux-x86_64/wheel/urllib3
copying build/lib/urllib3/__init__.py -> build/bdist.linux-x86_64/wheel/urllib3
copying build/lib/urllib3/poolmanager.py -> build/bdist.linux-x86_64/wheel/urllib3
creating build/bdist.linux-x86_64/wheel/urllib3/packages
creating build/bdist.linux-x86_64/wheel/urllib3/packages/ssl_match_hostname
copying build/lib/urllib3/packages/ssl_match_hostname/_implementation.py -> build/bdist.linux-x86_64/wheel/urllib3/packages/ssl_match_hostname
copying build/lib/urllib3/packages/ssl_match_hostname/__init__.py -> build/bdist.linux-x86_64/wheel/urllib3/packages/ssl_match_hostname
copying build/lib/urllib3/packages/six.py -> build/bdist.linux-x86_64/wheel/urllib3/packages
creating build/bdist.linux-x86_64/wheel/urllib3/packages/backports
copying build/lib/urllib3/packages/backports/__init__.py -> build/bdist.linux-x86_64/wheel/urllib3/packages/backports
copying build/lib/urllib3/packages/backports/makefile.py -> build/bdist.linux-x86_64/wheel/urllib3/packages/backports
copying build/lib/urllib3/packages/__init__.py -> build/bdist.linux-x86_64/wheel/urllib3/packages
running install_egg_info
running egg_info
writing src/urllib3.egg-info/PKG-INFO
writing dependency_links to src/urllib3.egg-info/dependency_links.txt
writing requirements to src/urllib3.egg-info/requires.txt
writing top-level names to src/urllib3.egg-info/top_level.txt
reading manifest file 'src/urllib3.egg-info/SOURCES.txt'
reading manifest template 'MANIFEST.in'
warning: no files found matching 'Makefile'
warning: no previously-included files matching '*' found under directory 'docs/_build'
writing manifest file 'src/urllib3.egg-info/SOURCES.txt'
Copying src/urllib3.egg-info to build/bdist.linux-x86_64/wheel/urllib3-1.25.10.egg-info
running install_scripts
creating build/bdist.linux-x86_64/wheel/urllib3-1.25.10.dist-info/WHEEL
creating 'dist/urllib3-1.25.10-py2.py3-none-any.whl' and adding 'build/bdist.linux-x86_64/wheel' to it
adding 'urllib3/__init__.py'
adding 'urllib3/_collections.py'
adding 'urllib3/_version.py'
adding 'urllib3/connection.py'
adding 'urllib3/connectionpool.py'
adding 'urllib3/exceptions.py'
adding 'urllib3/fields.py'
adding 'urllib3/filepost.py'
adding 'urllib3/poolmanager.py'
adding 'urllib3/request.py'
adding 'urllib3/response.py'
adding 'urllib3/contrib/__init__.py'
adding 'urllib3/contrib/_appengine_environ.py'
adding 'urllib3/contrib/appengine.py'
adding 'urllib3/contrib/ntlmpool.py'
adding 'urllib3/contrib/pyopenssl.py'
adding 'urllib3/contrib/securetransport.py'
adding 'urllib3/contrib/socks.py'
adding 'urllib3/contrib/_securetransport/__init__.py'
adding 'urllib3/contrib/_securetransport/bindings.py'
adding 'urllib3/contrib/_securetransport/low_level.py'
adding 'urllib3/packages/__init__.py'
adding 'urllib3/packages/six.py'
adding 'urllib3/packages/backports/__init__.py'
adding 'urllib3/packages/backports/makefile.py'
adding 'urllib3/packages/ssl_match_hostname/__init__.py'
adding 'urllib3/packages/ssl_match_hostname/_implementation.py'
adding 'urllib3/util/__init__.py'
adding 'urllib3/util/connection.py'
adding 'urllib3/util/queue.py'
adding 'urllib3/util/request.py'
adding 'urllib3/util/response.py'
adding 'urllib3/util/retry.py'
adding 'urllib3/util/ssl_.py'
adding 'urllib3/util/timeout.py'
adding 'urllib3/util/url.py'
adding 'urllib3/util/wait.py'
adding 'urllib3-1.25.10.dist-info/LICENSE.txt'
adding 'urllib3-1.25.10.dist-info/METADATA'
adding 'urllib3-1.25.10.dist-info/WHEEL'
adding 'urllib3-1.25.10.dist-info/top_level.txt'
adding 'urllib3-1.25.10.dist-info/RECORD'
removing build/bdist.linux-x86_64/wheel
build command used: python3 -m build --wheel /tmp/pip-wheel-build/urllib3-1.25.10 --no-isolation -o /tmp/tmp9u5mtvp9
Contents of /tmp/tmp9u5mtvp9 after the build command
['urllib3-1.25.10-py2.py3-none-any.whl']


@kushaldas
Contributor Author

setuptools-40.8.0 was happily deleting the directory. I am now updating to the latest, 54.0.0, though I remember seeing another issue with that version.
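Independent of which setuptools version is in use, one defensive option is to never point -o at the directory that holds the source tarballs. The sketch below reuses the build invocation from the log above and writes into a fresh, empty directory. The function names and the "wheel-out-" prefix are hypothetical, and this is not necessarily how build-sync-wheels ended up handling it.

```python
import subprocess
import sys
import tempfile
from pathlib import Path
from typing import List


def build_wheel_command(source_dir: Path, output_dir: Path) -> List[str]:
    """Return the build invocation from the log, with an explicit output dir."""
    return [
        sys.executable, "-m", "build", "--wheel", str(source_dir),
        "--no-isolation", "-o", str(output_dir),
    ]


def build_into_fresh_dir(source_dir: Path) -> Path:
    """Build a wheel into a new, empty directory and return that directory."""
    # A dedicated directory means nothing else is lost if the output
    # location gets cleared during the build.
    output_dir = Path(tempfile.mkdtemp(prefix="wheel-out-"))
    subprocess.check_call(build_wheel_command(source_dir, output_dir))
    return output_dir
```

The built wheel can then be copied next to the tarballs, or wherever it is needed, once the build has finished.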

@kushaldas
Contributor Author

Next big question: reprotest changes the uid and gid to non-existent ones, and the tar -xvf source.tar.gz command is not happy about it. Example of the error:

chardet-3.0.4/tests/windows-1255-hebrew/pcplus.co.il.xml
tar: chardet-3.0.4/tests/windows-1255-hebrew/pcplus.co.il.xml: Cannot change ownership to uid 501, gid 50: Invalid argument
chardet-3.0.4/tests/windows-1255-hebrew/sharks.co.il.xml
tar: chardet-3.0.4/tests/windows-1255-hebrew/sharks.co.il.xml: Cannot change ownership to uid 501, gid 50: Invalid argument
chardet-3.0.4/tests/windows-1255-hebrew/whatsup.org.il.xml
tar: chardet-3.0.4/tests/windows-1255-hebrew/whatsup.org.il.xml: Cannot change ownership to uid 501, gid 50: Invalid argument
chardet-3.0.4/tests/windows-1256-arabic/
tar: chardet-3.0.4/tests/windows-1255-hebrew: Cannot change ownership to uid 501, gid 50: Invalid argument
chardet-3.0.4/tests/windows-1256-arabic/_chromium_windows-1256_with_no_encoding_specified.html
tar: chardet-3.0.4/tests/windows-1256-arabic/_chromium_windows-1256_with_no_encoding_specified.html: Cannot change ownership to uid 501, gid 50: Invalid argument
tar: chardet-3.0.4/tests/windows-1256-arabic: Cannot change ownership to uid 501, gid 50: Invalid argument
tar: chardet-3.0.4/tests: Cannot change ownership to uid 501, gid 50: Invalid argument
tar: chardet-3.0.4: Cannot change ownership to uid 501, gid 50: Invalid argument
tar: Exiting with failure status due to previous errors

@conorsch any tips on how to extract the files?

@conorsch
Contributor

conorsch commented Mar 5, 2021

reprotest has a user_group variation that intentionally modifies the user and group info. You can set it to known-working accounts via --variations=user_group.available+=builduser:builduser, or disable it entirely with --variations="-user_group". Make sure to review which variations we're currently running in CI; they're different for wheels versus debs.

If you're running tar directly, you can also force user/group info via --owner and --group flags during extraction. As requested above, please push a branch! I'd love to take a closer look at the problem.
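If the extraction happens from Python rather than by shelling out to tar, a similar effect can be had by overwriting the ownership recorded in each archive member before extracting. This is only a sketch under assumptions: the function name is made up, it assumes a POSIX system, and it does no path sanitising, so it should only be pointed at tarballs you already trust.

```python
import os
import tarfile


def extract_as_current_user(archive_path: str, dest_dir: str) -> None:
    """Extract a trusted tarball, ignoring the ownership recorded in it."""
    with tarfile.open(archive_path) as tar:
        members = tar.getmembers()
        for member in members:
            # Replace the recorded owner (e.g. uid 501, gid 50) with the
            # user running the build, so no foreign ids are ever applied.
            member.uid = os.getuid()
            member.gid = os.getgid()
            member.uname = ""
            member.gname = ""
        # Trusted archives only: member paths are not checked here.
        tar.extractall(dest_dir, members=members)
```

Python's tarfile only attempts to change ownership when running as root, so for an ordinary build user the rewrite mostly documents intent; it matters when the variation runs the build with elevated or unusual ids.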

@kushaldas
Contributor Author

kushaldas commented Mar 5, 2021

@conorsch try this branch: https://github.com/freedomofpress/securedrop-debian-packaging/tree/build_with_build

    python3 -m venv .venv
    source .venv/bin/activate
    python3 -m pip install -r build-requirements.txt
    pytest -vvs tests/test_reproducible_wheels.py

Finally this works somewhat.

@eloquence eloquence moved this from Near Term - SD Workstation to Maintenance period (Kanban mode) in SecureDrop Team Board Mar 22, 2021
@kushaldas
Contributor Author

kushaldas commented Mar 23, 2021

Notation: build -- for the tool, otherwise the verb.

We will need to use the build tool to create the wheels. In this branch I have already created the minimal requirements file for the tool itself.

The steps required:

bootstrapping

  • Install the latest pip.
  • I think we should add pip to the requirements file.
  • We will have to identify the different build dependencies that get pulled from PyPI while installing from source.
  • Use the command time python3 -m pip install --no-binary :all: -r requirements.txt to see which packages have build-time dependencies. We have to check the source of each one, manually add the dependencies to requirements.in, and then update the requirements.txt file.
  • First, someone has to build the wheels from the requirements.txt file in a reproducible way and put them into a separate directory (say bootstrap).
  • Sign the sha256sums and store them like we do in the localwheels directory.
  • Create a new requirements.txt file with only the hashes from our wheels for the build tool itself.
  • Make a list of GPG keys (of the maintainers who can sign and release new wheels).
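The "requirements.txt with only the hashes from our wheels" step could be sketched roughly as below. It assumes the bootstrap wheels sit in one directory and that each file name starts with "<distribution>-<version>-", as wheel file names do. The function names are hypothetical, and the output uses pip's name==version --hash=sha256:... form.

```python
import hashlib
from pathlib import Path
from typing import List


def sha256_of(path: Path) -> str:
    """Return the hex SHA-256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def hashed_requirements(wheel_dir: Path) -> List[str]:
    """Build one pinned, hashed requirement line per wheel in wheel_dir."""
    lines = []
    for wheel in sorted(wheel_dir.glob("*.whl")):
        # Wheel file names start with "<distribution>-<version>-".
        name, version = wheel.name.split("-")[:2]
        lines.append(f"{name}=={version} --hash=sha256:{sha256_of(wheel)}")
    return lines
```

Writing "\n".join(hashed_requirements(Path("bootstrap"))) to a file gives a requirements file that pip can only satisfy with exactly those wheels.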

Actual wheels rebuilding

  • python3 -m venv .venv
  • source .venv/bin/activate
  • Verify the bootstrap directory wheels against the GPG-signed list of sha256sums.
  • Install them into the virtualenv while checking the hashes.
  • Now we can build/rebuild any wheel we want using our existing scripts.
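The verification step above might look something like the following once the signature on the list has been checked separately (for example with gpg --verify; that part is not shown). The sketch assumes the list uses the usual sha256sum layout of "<hex digest>  <file name>" per line; both function names are hypothetical.

```python
import hashlib
from pathlib import Path
from typing import Dict, List


def parse_sha256sums(text: str) -> Dict[str, str]:
    """Map file name -> hex digest from sha256sum-style lines."""
    expected = {}
    for line in text.splitlines():
        if not line.strip():
            continue
        digest, name = line.split(maxsplit=1)
        # sha256sum marks binary mode with a leading "*" on the name.
        expected[name.lstrip("*").strip()] = digest.lower()
    return expected


def mismatched_wheels(wheel_dir: Path, sums_text: str) -> List[str]:
    """Return listed wheels that are missing or whose digest differs."""
    bad = []
    for name, digest in sorted(parse_sha256sums(sums_text).items()):
        wheel = wheel_dir / name
        if not wheel.is_file():
            bad.append(name)
            continue
        if hashlib.sha256(wheel.read_bytes()).hexdigest() != digest:
            bad.append(name)
    return bad
```

An empty result means every listed wheel is present and matches. Note that this does not flag extra wheels that are absent from the list; installing with hash checking covers that case.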

We should regenerate wheels for all of the Debian packages and recreate their build-requirements.txt file. This is important, as the wheels built via the build tool will not match the sha256sums of the old ones built via pip.

CI side of work

  • Create a job which will run if a new/rebuild of wheel is part of the PR.
  • Rebuild the newly submitted source tarballs, and verify that the sha256sums match the sha256sums in the PR. Also verify that the list of sha256sums is signed by one of the maintainers. This will enable any maintainer to upload new wheels, which can then be automatically verified in CI by rebuilding and matching the hash of each wheel.

@conorsch
Contributor

We should regenerate wheels for all of the Debian packages and recreate their build-requirements.txt file. This is important, as the wheels built via the build tool will not match the sha256sums of the old ones built via pip.

Interesting, I wonder what's different about the structure of the emitted wheels. Thanks for pushing a branch, I'll run through the steps you describe and compare wheels of the same package versions via diffoscope.
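As a rough stand-in for that comparison when diffoscope is not at hand, the members of two wheels can be hashed and compared with the standard library alone. This is only a sketch with made-up function names. It looks at file contents inside the archives, not at zip metadata such as timestamps, permissions, or member order, all of which also affect the hash of the wheel itself.

```python
import hashlib
import zipfile
from typing import Dict, List


def member_digests(wheel_path: str) -> Dict[str, str]:
    """Map each archive member name to the SHA-256 of its contents."""
    with zipfile.ZipFile(wheel_path) as archive:
        return {
            info.filename: hashlib.sha256(archive.read(info)).hexdigest()
            for info in archive.infolist()
        }


def differing_members(wheel_a: str, wheel_b: str) -> List[str]:
    """Return members that are missing from one wheel or differ in content."""
    a, b = member_digests(wheel_a), member_digests(wheel_b)
    return sorted(n for n in a.keys() | b.keys() if a.get(n) != b.get(n))
```

For real wheels, a differing compiled file will normally drag the RECORD file along with it, since RECORD lists the hash and size of every member.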

@conorsch
Contributor

Took a look, and the new build logic is promising! The only variation I noticed in the emitted wheels was for packages that include .so files bundling C code. Roughly, that appears to be:

  • markupsafe
  • sqlalchemy
  • idna

All other wheels had the same hash. I appended some commits required to get things working on my machine, such as including the wheel dependency explicitly in the build requirements. make reprotest shows only 1 failing test for me locally, related to securedrop-proxy, complaining about lack of Cython:

Building PyYAML-5.4.1
running egg_info
writing lib3/PyYAML.egg-info/PKG-INFO
writing dependency_links to lib3/PyYAML.egg-info/dependency_links.txt
writing top-level names to lib3/PyYAML.egg-info/top_level.txt
reading manifest file 'lib3/PyYAML.egg-info/SOURCES.txt'
reading manifest template 'MANIFEST.in'
writing manifest file 'lib3/PyYAML.egg-info/SOURCES.txt'
ERROR Missing dependencies:
	Cython
Traceback (most recent call last):
  File "./scripts/build-sync-wheels", line 134, in <module>
    main()
  File "./scripts/build-sync-wheels", line 108, in main
    subprocess.check_call(cmd)
  File "/usr/lib/python3.7/subprocess.py", line 347, in check_call
    raise CalledProcessError(retcode, cmd)
subprocess.CalledProcessError: Command '['python3', '-m', 'build', '--wheel', '/tmp/pip-wheel-build/PyYAML-5.4.1', '--no-isolation', '-o', '/tmp/tmpqq39id63']' returned non-zero exit status 1.

Once we resolve that, we're in a good place to rebuild the wheels and use the new logic.
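Since --no-isolation means build will not install build requirements for us, a small preflight check before the loop over packages could surface a missing dependency like Cython earlier. A minimal sketch follows, assuming Python 3.8+ for importlib.metadata (the Python 3.7 in the traceback would need the importlib-metadata backport). The function name and the example list of names are hypothetical.

```python
from importlib import metadata  # Python 3.8+; use the backport on 3.7
from typing import Iterable, List


def missing_distributions(names: Iterable[str]) -> List[str]:
    """Return the distributions from names that are not installed."""
    missing = []
    for name in names:
        try:
            metadata.version(name)  # raises if the distribution is absent
        except metadata.PackageNotFoundError:
            missing.append(name)
    return missing


if __name__ == "__main__":
    # Hypothetical list; the real one would come from the build requirements.
    print(missing_distributions(["Cython", "wheel", "setuptools"]))
```

This only checks presence, not versions, so it complements rather than replaces the pinned build requirements file.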

@kushaldas
Contributor Author

All other wheels had the same hash. I appended some commits required to get things working on my machine, such as including the wheel dependency explicitly in the build requirements. make reprotest shows only 1 failing test for me locally, related to securedrop-proxy, complaining about lack of Cython:

Ah, this dependency is new, I guess. I am working to add it as a dependency. Cython can also be built reproducibly.

@kushaldas
Contributor Author

I was testing with MarkupSafe, and I can see an 8-byte difference in the shared object file.

│ -markupsafe/_speedups.cpython-37m-x86_64-linux-gnu.so,sha256=U7aMolMvt-KGsTX9MHskz2_4Hh1OyX62lctrXYoTwfk,55920
│ +markupsafe/_speedups.cpython-37m-x86_64-linux-gnu.so,sha256=9TspJu-22snUPff7P4CSZRd9D60Y2OLKVh8X7jh3Kck,55912

@kushaldas
Contributor Author

After more digging today I can see the difference of 8 bytes happening at the section header. And the difference is related to the actual source directory path.

│ │ -    <15>   DW_AT_comp_dir    : (indirect string, offset: 0x1fbc): /tmp/pip-wheel-build/MarkupSafe-1.1.1
│ │ +    <15>   DW_AT_comp_dir    : (indirect string, offset: 0x57): /tmp/pip-wheel-build/markupsafe

@conorsch my suggestion is to rebuild all the wheels. Let me know what you think.

@emkll
Contributor

emkll commented Mar 29, 2021

the difference is related to the actual source directory path.

Shouldn't the built artifact be independent of the source directory path? Is there a way to override this? @kushaldas we had a similar issue with updating/matching paths and reproducibility in #231. This seems to me like a regression, as we were able to build packages reproducibly regardless of the path where the packaging repo was cloned.
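On the question of overriding the path: one commonly used option, which this thread did not adopt, is GCC's -fdebug-prefix-map=OLD=NEW flag, which rewrites the compilation directory recorded in debug info such as DW_AT_comp_dir. The sketch below only prepares an environment with that flag appended to CFLAGS. It assumes the compiler is GCC or compatible and that the setuptools build picks CFLAGS up from the environment; whether it removes the 8-byte difference here is untested. The function name is hypothetical.

```python
import os
from typing import Dict, Mapping, Optional


def env_with_prefix_map(
    source_dir: str,
    mapped_to: str = ".",
    base_env: Optional[Mapping[str, str]] = None,
) -> Dict[str, str]:
    """Return a copy of the environment with a debug prefix map in CFLAGS."""
    env = dict(os.environ if base_env is None else base_env)
    # Ask the compiler to record mapped_to instead of the real source path.
    flag = f"-fdebug-prefix-map={source_dir}={mapped_to}"
    env["CFLAGS"] = f"{env.get('CFLAGS', '')} {flag}".strip()
    return env
```

The returned mapping could be passed as env= to the subprocess.check_call that runs the build command, leaving the parent process environment untouched.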

@emkll
Contributor

emkll commented Mar 29, 2021

my suggestion is to rebuild all the wheels. Let me know what you think.

If the diff is as small as the 8 bytes in the header, and if this is the way that source paths will be handled in the future, then yes, let's rebuild those wheels.

Make a list of GPG keys (of the maintainers who can sign and release new wheels).

Agreed. Because the wheels will be reproducible (and we will have a CI check to ensure that they are), it does make sense to have developers sign with their personal keys. In the future, we can consider having CI build these wheels on request and having a developer compare the hash of each wheel with their local build, then sign and merge.

bootstrapping

About the bootstrapping, how will we ensure the initial build is correct? This sounds to me like a potential chicken-and-egg problem where we need to build build. How is that done?

Another thing to consider: if the end result of the build process is the same (e.g. same wheel or deb package hash), does it matter whether we use the PyPI wheel of build or a wheel of build that was built from source non-reproducibly? I think freezing/tracking requirements makes sense to avoid breakage, but is maintaining a separate set of reproducible and managed wheels here worth it?

@sssoleileraaa
Contributor

but is maintaining a separate set of reproducible and managed wheels here worth it?

I don't think it is. I think it might be better to just maintain the wheels that have different hashes, so that we can easily find which wheels we had to do diff reviews for in case we need to re-review.

@sssoleileraaa
Contributor

Oops I misunderstood your question. I thought you were asking if we should push another set of wheels with the same hashes for our projects just because they were built with build instead of pip. I guess it's obvious in retrospect that no one is suggesting that.

Should we maintain a set of reproducible wheels of the build package and its dependencies? I don't see why. If we specify the source hash of build in a packaging-requirements.txt file, isn't that enough? Why do we care if we can build build reproducibly again? I thought we just cared about building our projects' packages reproducibly.

@kushaldas
Contributor Author

About the bootstrapping, how will we ensure the initial build is correct? This sounds to me like a potential chicken-and-egg problem where we need to build build. How is that done?

We will treat the source tarballs for the build tool and the other build dependencies as the initial point of truth, and maintain our own binary wheels for them. If we keep pulling them down from the network every time, then we are breaking our bubble of packages every time.

That is why my initial step is called bootstrapping, where we create the wheels for the build tool itself, and also for all the other packages' build dependencies. We don't need these tools for the Debian packages, but we do need them to build our wheels.

@kushaldas
Contributor Author

I thought you were asking if we should push another set of wheels with the same hashes for our projects just because they were built with build instead of pip. I guess it's obvious in retrospect that no one is suggesting that.

We will have to rebuild the wheels, as there will be a few wheels with new sha256sums.

Should we maintain a set of reproducible wheels of the build package and its dependencies? I don't see why. If we specify the source hash of build in a packaging-requirements.txt file, isn't that enough? Why do we care if we can build build reproducibly again? I thought we just cared about building our projects' packages reproducibly.

The idea is to isolate the whole build environment. We need to make sure that we are using a valid set of packages, and maintaining a copy of all required wheel-building tools in a local cache is generally the standard way to create such an isolated build environment.

@kushaldas
Contributor Author

kushaldas commented Mar 30, 2021

Timing information

Installing from source on a Xeon server:

Successfully installed build-0.3.0 cython-0.29.22 importlib-metadata-3.7.0 packaging-20.9 pep517-0.9.1 pyparsing-2.4.7 pytest-runner-5.3.0 setuptools-54.0.0 setuptools-scm-5.0.2 toml-0.10.2 typing-extensions-3.7.4.3 wheel-0.36.2 zipp-3.4.1

real    5m31.982s
user    5m18.039s
sys     0m7.260s

From wheels

Successfully installed build-0.3.0 cython-0.29.22 importlib-metadata-3.7.0 packaging-20.9 pep517-0.9.1 pyparsing-2.4.7 pytest-runner-5.3.0 setuptools-54.0.0 setuptools-scm-5.0.2 toml-0.10.2 typing-extensions-3.7.4.3 wheel-0.36.2 zipp-3.4.1

real    0m2.193s
user    0m1.849s
sys     0m0.126s

@pradyunsg

FWIW, pip should internally be building a wheel and installing from it, when installing from an sdist -- so it's generally better to directly install from the wheel.

Especially since wheel-based installs are deterministic and don't involve executing any code.
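To make the wheel-only install concrete, here is a sketch of a pip invocation that resolves exclusively from a local wheel directory and refuses anything unhashed or unbuilt. The flags are standard pip options; the function name, the "bootstrap" directory, and the requirements file name are assumptions borrowed from the plan earlier in the thread, not the repo's actual script.

```python
import sys
from typing import List


def offline_install_command(wheel_dir: str, requirements: str) -> List[str]:
    """Return a pip command that installs only from local, hashed wheels."""
    return [
        sys.executable, "-m", "pip", "install",
        "--no-index",                # never contact PyPI
        "--find-links", wheel_dir,   # resolve only from the local wheels
        "--only-binary", ":all:",    # refuse sdists, so no build code runs
        "--require-hashes",          # every requirement must pin a hash
        "-r", requirements,
    ]


if __name__ == "__main__":
    print(" ".join(offline_install_command("bootstrap", "requirements.txt")))
```

Running the returned list with subprocess.check_call inside the freshly created virtualenv would correspond to the "install them while checking the hashes" step.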

@kushaldas
Contributor Author

We finished the first step of bootstrapping. Next we will have to regenerate all the wheels, create the build-requirements.txt file for each of the following Debian packages, and do a release:

  • securedrop-client
  • securedrop-log
  • securedrop-proxy

We should discuss the steps during our standup today. @creviera @conorsch

@conorsch
Contributor

conorsch commented Apr 7, 2021

Now that #238 is merged, we're good to close. We have a bit more follow-up, e.g. #241, tracked separately. Thanks, @kushaldas & @creviera, for getting these improvements in. And thanks also to @pradyunsg & @FFY00 for the helpful guidance!

@conorsch conorsch closed this as completed Apr 7, 2021
SecureDrop Team Board automation moved this from 1.8.1 Release Period (Kanban Mode) to Done Apr 7, 2021