Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Makes wheel builds reproducible, with tests #211

Merged
merged 7 commits into from
Jan 5, 2021
Merged

Conversation

conorsch
Copy link
Contributor

Overview

Refs #196

Makes two changes to our existing wheel build process:

  1. Setting SOURCE_DATE_EPOCH env var
  2. Setting a hardcoded dir path for all wheels builds

With those changes, the ./scripts/build-sync-wheels script builds wheels that are byte-for-byte reproducible. I also included a test-only Makefile target that was handy during development. It's not yet ready for CI, because I'm seeing the platform vary in the reprotest build environment. More input appreciated if anyone can resolve that problem, see initial report in #196 (comment)

Testing

First, it's helpful to observe the wheel checksums changing over multiple builds, to understand the problem we're trying to solve.

  1. Check out a new branch, based on main, rather than this feature branch.
  2. Run git rm localwheels/*.whl and commit.
  3. Run ./build-sync-wheels -p ../securedrop-client, updating the dirpath as necessary. Commit the results.
  4. Run git rm localwheels/*.whl, but don't commit the results.
  5. Run ./build-sync-wheels -p ../securedrop-client again, and commit the results.
  6. Run git show --name-status and confirm that many of the localwheels/*.whl files changed.

That's a convoluted workflow because the build-sync-wheels script will politely avoid overwriting wheel files that already exist. In this PR, a --clobber flag is added to the script to force overwrites, making comparison easier.

OK, now let's verify the new behavior!

  1. Check out a new branch, based on this branch. Note that the localwheels/*.whl files are identical to those on the main branch.
  2. Run ./build-sync-wheels --clobber -p ../securedrop-client, updating the dirpath as necessary. Commit the results.
  3. Run ./build-sync-wheels --clobber -p ../securedrop-client again
  4. Run git status, confirm that there are no local modifications. The wheels were fetched, but identical.

That's about it. Eventually, I'd love to drop the requirement that we store the wheels in version control, separately from the deb packages that contain them. But the work here is a step in that direction: once we have reproducible wheels, we can work on skipping artifact storage as long as the final output is deterministic.

Two changes to wheel build process:

  1. Setting SOURCE_DATE_EPOCH env var
  2. Setting a hardcoded dir path for all wheels builds

That seems to do the trick. The reprotest target in the Makefile, used
for debugging locally, has a platform discrepancy, building for i686
rather than x86_64.
kushaldas and others added 2 commits December 21, 2020 22:32
We need to build only for x86_64 arch or else the wheel names will
not match.
Using pytest with the same options as what's been confirmed locally.
Tested with pip v20.1. With newer versions, specifically >=20.3, the
repro check fails, due to lack of "--build" flag support.
@conorsch
Copy link
Contributor Author

@kushaldas Thanks for adding the platform fix! I cobbled together a pytest file to run reprotest. It works for me locally, although we're bumping up against a deprecation warning—if your pip is v20.3 or newer, the wheels are no longer reproducible. Haven't checked behavior in CI yet, please have a look both locally and in CI, and see if you can get it passing in both places.

@kushaldas
Copy link
Contributor

Now setarch is failing on the CI:

setarch: failed to set personality to x86_64: Operation not permitted

@conorsch
Copy link
Contributor Author

Drat, I did not take a look at the CI failures as I'd hoped to today, @kushaldas. Please continue to experiment and see if you can get it passing!

@kushaldas
Copy link
Contributor

I think it is some restrictions via removed capability in circleci, as I can do this on my Debian Buster system:

❯ docker run --rm -it circleci/python:3.7-buster
Unable to find image 'circleci/python:3.7-buster' locally
3.7-buster: Pulling from circleci/python
6c33745f49b4: Pull complete 
ef072fc32a84: Pull complete 
c0afb8e68e0b: Pull complete 
d599c07d28e6: Pull complete 
f2ecc74db11a: Pull complete 
26856d31ce86: Pull complete 
d4db05d8da44: Pull complete 
a63ae5575b2c: Pull complete 
a8c72204fafe: Pull complete 
bf764dfeb822: Pull complete 
e57a236639b1: Pull complete 
012c9eaaa197: Pull complete 
aa2a93f84145: Pull complete 
57bb67997b2b: Pull complete 
cbabdb8bf86a: Pull complete 
044162ca41e3: Pull complete 
2779b34e014a: Pull complete 
ca5b18703510: Pull complete 
8b7e761ad09c: Pull complete 
Digest: sha256:e80187525f02093157eab20be03978b4a5d6d04f012e27bedf464418081dafa2
Status: Downloaded newer image for circleci/python:3.7-buster
$ bash
circleci@72e368189ae7:/$ cd
circleci@72e368189ae7:~$ setarch x86_64
$   
circleci@72e368189ae7:~$ exit

@kushaldas
Copy link
Contributor

I can manually do setarch in the container:

circleci@5094c19bc362:~$ setarch x86_64
$ 
circleci@5094c19bc362:~$ ls

But, the reprotest command still fails:

(.venv) circleci@5094c19bc362:~/project$ reprotest -c "./scripts/build-sync-wheels -p https://github.com/freedomofpress/securedrop-client --clobber" --vary "+all, -time, -locales, -kernel" . "localwheels/*.whl"
WARNING:reprotest:The control build runs on 1 CPU by default, give --min-cpus to increase this.
setarch: failed to set personality to x86_64: Operation not permitted
Traceback (most recent call last):
  File "/usr/lib/python3/dist-packages/reprotest/__init__.py", line 831, in run
    return 0 if check_func(*check_args) else 1
  File "/usr/lib/python3/dist-packages/reprotest/__init__.py", line 363, in check
    local_dists = [proc.send(nv) for nv in zip(bnames, build_variations)]
  File "/usr/lib/python3/dist-packages/reprotest/__init__.py", line 363, in <listcomp>
    local_dists = [proc.send(nv) for nv in zip(bnames, build_variations)]
  File "/usr/lib/python3/dist-packages/reprotest/__init__.py", line 329, in corun_builds
    bctx.run_build(testbed, build, os.environ, artifact_pattern, testbed_build_pre, no_clean_on_error)
  File "/usr/lib/python3/dist-packages/reprotest/__init__.py", line 220, in run_build
    kind='build')
  File "/usr/lib/python3/dist-packages/reprotest/__init__.py", line 64, in check_exec2
    adtlog.AutopkgtestError)
  File "/usr/lib/python3/dist-packages/reprotest/__init__.py", line 70, in bomb
    raise _type(m)
reprotest.lib.adtlog.AutopkgtestError: "sh -ec run_build() {
    mkdir -p /tmp/reprotest.vWaoEr/build-control-aux && \
    mv /tmp/reprotest.vWaoEr/build-control/ /tmp/reprotest.vWaoEr/const_build_path && \
    SETARCH_ARCH="linux64 -R" && \
    SETARCH_OPTS="$SETARCH_OPTS -R" && \
    CPU_MAX=$(nproc) && \
    CPU_MIN=$({ echo $CPU_MAX; echo 1; } | sort -n | head -n1) && \
    CPU_NUM=$CPU_MIN && \
    umask 0022 && \
    export REPROTEST_BUILD_PATH=/tmp/reprotest.vWaoEr/const_build_path/ && \
    export REPROTEST_UMASK=$(umask) && \
    faketime @1608717489 \
    taskset -a -c $(echo $(shuf -i0-$((CPU_MAX - 1)) -n$CPU_NUM) | tr ' ' ,) \
    setarch $SETARCH_ARCH $SETARCH_OPTS \
    sh -ec 'cd "$REPROTEST_BUILD_PATH"; unset REPROTEST_BUILD_PATH; umask "$REPROTEST_UMASK"; unset REPROTEST_UMASK; ./scripts/build-sync-wheels -p https://github.com/freedomofpress/securedrop-client --clobber'
}

cleanup() {
    __c=0; \
    mv /tmp/reprotest.vWaoEr/const_build_path /tmp/reprotest.vWaoEr/build-control/ || __c=$?; \
    rm -rf /tmp/reprotest.vWaoEr/build-control-aux || __c=$?; \
    exit $__c
}

trap '( cleanup )' HUP INT QUIT ABRT TERM PIPE # FIXME doesn't quite work reliably yet

if ( run_build ); then ( cleanup ); else
    __x=$?; # save the exit code of run_build
    if ( ! false ); then
        if ( cleanup ); then :; else echo >&2 "cleanup failed with exit code $?"; fi;
    fi
    exit $__x
fi" failed with status 1
(.venv) circleci@5094c19bc362:~/project$ 

@emkll emkll force-pushed the 196-reproducible-wheels branch 3 times, most recently from fcebf80 to 86a5928 Compare December 23, 2020 15:17
setarch is not permittied in the docker container.
Because CircleCI only supports Ubuntu VM targets [1], we must add the Bionic sources to install dh-virtualenv under Focal as no installation candidate exists for that distribution in upstream repos.

[1] : https://circleci.com/docs/2.0/configuration-reference/#available-machine-images
@emkll
Copy link
Contributor

emkll commented Dec 23, 2020

@kushaldas I've appended a commit here to use the CircleCI machine executor (a VM) instead of a docker container. It appears the setarch is restricted inside the container. With these changes, it appears to reprotests are now passing in CI.

@kushaldas
Copy link
Contributor

kushaldas commented Jan 4, 2021

After @conorsch marks this as "Ready for review", we can mark it as "Approved" and merge.

Next step on the packaging front:

  • Build the related packages for a project into wheels in a fixed path.
  • Use that path from the filesystem and disable any external index while building the Debian package. (I already did a PoC on this, working).
  • Then we can retire the web part of our index.

@conorsch conorsch changed the title Attempts to build wheels reproducibly Makes wheel builds reproducible, with tests Jan 4, 2021
@conorsch conorsch marked this pull request as ready for review January 4, 2021 18:03
"securedrop-client",
"securedrop-log",
"securedrop-proxy",
]
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are there other repositories for which we build wheels, that should be added to this list?

@conorsch
Copy link
Contributor Author

conorsch commented Jan 4, 2021

Thanks, @kushaldas & @emkll. I've pushed one more commit, to enable reproducibility tests for "securedrop-log" and "securedrop-proxy" packages. Marking ready for review!

@conorsch
Copy link
Contributor Author

conorsch commented Jan 4, 2021

Use that path from the filesystem and disable any external index while building the Debian package. (I already did a PoC on this, working).

Please open an issues to track these follow-ups! I see two (2): 1) rebuilding the wheels post-merge of reproducibility added here; and 2) removing the website index, referencing only the local files, as you describe.

Copy link
Contributor

@kushaldas kushaldas left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Tested with the steps mentioned in the PR. All worked as expected. Approved.

@sssoleileraaa sssoleileraaa deleted the 196-reproducible-wheels branch September 13, 2021 23:17
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants