Dockerfile for release automation #13250
Changes from all commits
897af94
e15057b
20395a5
866d87e
5bf7639
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,41 @@ | ||
# Copyright (C) Daniel Stenberg, <daniel@haxx.se>, et al. | ||
# | ||
# SPDX-License-Identifier: curl | ||
|
||
# Self-contained build environment to match the release environment. | ||
# | ||
# Build and set the timestamp for the date corresponding to the release | ||
# | ||
# docker build --build-arg SOURCE_DATE_EPOCH=1711526400 --build-arg UID=$(id -u) --build-arg GID=$(id -g) -t curl/curl . | ||
# | ||
# Then run commands from within the build environment, for example | ||
# | ||
# docker run --rm -it -u $(id -u):$(id -g) -v $(pwd):/usr/src -w /usr/src curl/curl autoreconf -fi | ||
# docker run --rm -it -u $(id -u):$(id -g) -v $(pwd):/usr/src -w /usr/src curl/curl ./configure --without-ssl --without-libpsl | ||
# docker run --rm -it -u $(id -u):$(id -g) -v $(pwd):/usr/src -w /usr/src curl/curl make | ||
# docker run --rm -it -u $(id -u):$(id -g) -v $(pwd):/usr/src -w /usr/src curl/curl ./maketgz 8.7.1 | ||
# | ||
# or get into a shell in the build environment, for example | ||
# | ||
# docker run --rm -it -u $(id -u):$(id -g) -v $(pwd):/usr/src -w /usr/src curl/curl bash | ||
# $ autoreconf -fi | ||
# $ ./configure --without-ssl --without-libpsl | ||
# $ make | ||
# $ ./maketgz 8.7.1 | ||
|
||
# To update, get the latest digest e.g. from https://hub.docker.com/_/debian/tags | ||
FROM debian:bookworm-slim@sha256:993f5593466f84c9200e3e877ab5902dfc0e4a792f291c25c365dbe89833411f | ||
|
||
RUN apt-get update -qq && apt-get install -qq -y --no-install-recommends \ | ||
If we want to deterministically install the same package versions I think we need to drop the
I have gone all the way down this road, and realized Debian is just a non-starter for practical/maintainable reproducible builds in a container. I am working on another PR with a stagex-based Containerfile that is deterministic today (and in the future) and full source bootstrapped.
Can we be more concrete here? It would be fine if the tool versions differ as long as they still produce the same tarball. Which of the steps taken to produce the tarballs might give different results when the tool versions differ? Dropping
A different version of automake, autoconf, tar, zip, bzip2 etc could all end up causing different results in the future. I just pushed a draft PR with an example that uses a "from scratch" container and an explicit set of fixed/deterministic/multi-signed dependencies: Does not have zip support (until next week as we need to do a release cycle to package it) but otherwise should be good to go, and can be easily upgraded to build any deterministic binaries of curl if desired as well.
How about something along the lines of this?
The only package versions that matter at all here are the versions of the autotools packages (autoconf, automake, libtool...). Simply print the debian (apt) versions of this exceedingly small number of packages to a reproducible-manifest.txt inside the tarball before you tar it up. Problem solved.
@lrvick As far as rebuilding all packages from scratch goes, it appears you have reinvented the Gentoo Linux distro except NIH. Which, incidentally, also allows you to record an exact git repository hash of the Gentoo tree and guarantee all packages in your container use that and nothing else.
But all this is the mootest of moot points, because if you want to guarantee that everyone is using the exact same docker container to reproduce the tarballs, then you simply publish the container you used -- as mentioned above. This is very reproducible as you're uploading @bagder's actual (virtual) laptop used to make the release, and anyone can download @bagder's (virtual) laptop and use it to create the tarball (and check that it's the virtual laptop created in CI).
A decision has already been made here so I am merely responding to some potential misconceptions above for the sake of anyone following this thread.
That will require the user that wishes to verify to track down all those versions, and all their dependencies, and assemble them together again. Version numbers are not good enough, as those keys expire and are rotated often. You would at a minimum need to publish the entire installed apt tree in the container with /hashes/ of every .deb file so they can be retrieved from the Debian archive and directly installed in a reproduction container. This will be very arduous for a user still without any scripted help, but it would at least be a path. I have written a few tools to help with this mentioned elsewhere. It is a terrible reproduction path, but it does work.
Gentoo linux is not reproducible, quorum-signed, full-source-bootstrapped, or OCI native. It is true that we do in fact compile everything from source, but we publish our binaries and expect users to use those, so that makes us closer to traditional linux distros like Debian than Gentoo. The only other deterministic full-source-bootstrapped Linux distro that exists to my knowledge is GNU Guix. They however do not publish reproducible container images, and do not do quorum signing. They are also glibc based and optimized for end user desktop use, which comes with dramatically higher attack surface and complexity than what is required with a goal of simply securely building software. Still, Guix would be the best distro to compare us to. If you are not familiar with full-source-bootstrapping, Guix has a great writeup as they were the first distro to do it. Stagex was the second. https://guix.gnu.org/en/blog/2023/the-full-source-bootstrap-building-from-source-all-the-way-down/
Agreed. That is the reason to use containers, but the context of this is trying to minimize the chance of supply chain attacks. We are trying to solve for trust. If the binary container image is compromised by the person with the publishing API key, then anything downstream from that container image is also compromised. See the XZ attack and Ken Thompson's Reflections on Trusting Trust. You can however pull down the "musl" and "autoconf" binary stagex container images and directly use them together. You can also verify multiple people built them for you and got the same exact hash, via their PGP signatures so you don't have to build it if you trust they are not colluding. You also can -choose- to build them yourself and add your own signature if you don't trust those folks. With Debian OTOH, they build their published docker images in a non-deterministic way and do not sign them, so you have no evidence it is free of tampering. Very much like the XZ situation. Using Docker Debian unfortunately just moves the problem up one layer.
All of this is completely irrelevant, since the curl project is trying to solve reproducible curl release tarballs, not reproducible linux distros as general-purpose computing platforms.
No, again, version numbers of the autotools set of packages alone (and not any dependencies) are sufficient since they allow one to determine the correct version of autotools packages to install from ANY source, modulo debian-specific patches publicly recorded in the Debian VCS for that small handful of packages. No hashes necessary, no dependencies necessary, no access to .debs necessary. It is slightly more awkward to install a local autotools toolchain with debian patches on your choice of personally trusted computing systems than to install a prebuilt .deb, but not much harder. Also "as those keys expire and are rotated often" does not sound like a practical objection to me. If you use old stuff signed by crypto keys which are old, then naturally you backtrace the trust source for those keys, and subsequently backdate the verification routine to respect keys that claim to have expired in the past, but from the perspective of what you're reproducing, in the future.
Sure I'm familiar with it. I don't think that's the concern the curl project has. It will also be more interesting to me personally when it no longer relies on untrusted binaries such as idk, the docker daemon, guile-bootstrap, or any other potentially malicious tools used to host the computing platform that runs the full-source bootstrap.
I've seen both quite well, thanks. Assuming a non-compromised docker daemon, it really is not hard to rebuild the container image using, say, https://snapshot.debian.org/ and test that your binary container image produces the same tarball. Again, the container doesn't have to be byte-for-byte reproducible, it simply has to be some form of instructions that others can follow to yield the same... curl release tarball. No need to trust curl devs to publish a non-compromised spinoff container.
Sounds like a market opportunity for someone to build deterministic debian docker images that are signed. The underlying .deb packages are reproducible: https://tests.reproducible-builds.org/debian/reproducible.html (Feel free to run your own builders to verify that!)
If keys are revoked because they used weak, vulnerable algos or are already compromised, or they belonged to a maintainer that is no longer trusted (like the xz maintainer), then those keys could be used to sign a malicious alternative to the package you are actually requesting, with the same version number, and apt would not know the difference. This is why hash locking is an absolute necessity when using old packages where the signatures may no longer be trustworthy.
This actually is surprisingly hard. Debian snapshots are hosted by a single deployment that is most of the time crippled to dialup speeds in my experience, and the experience of virtually every user that has used projects of mine that relied on it. Also, for the above reasons about not being able to trust expired/revoked keys, you have to pull the .deb files manually without installing, hash verify them, then install from that folder as a local mirror. For these two reasons, in the many reproducible build projects I worked with based on Debian, we all ended up forced to use our own git LFS mirrors of .deb packages which was a major pain.
To be fair to this point, for your very specific use case here, you could probably get away with using generic latest debian and then build your own autotools, and then use your own autotools to build the tar. That is of course if any signed/trusted/reproducible debian base images actually existed :/
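For readers following along, the "hash locking" step mentioned above (fetch files without installing, verify them against pinned hashes, only then use them) might look roughly like the sketch below. The file names and contents are placeholders rather than real Debian packages, and `packages.sha256` is a hypothetical lock file name.

```shell
#!/bin/sh
# Sketch only: verify downloaded files against a pinned hash list before use.
# The .deb files here are placeholders created locally, not real packages.
set -eu

workdir=$(mktemp -d)
trap 'rm -rf "$workdir"' EXIT
cd "$workdir"

# Stand-ins for .deb files that were fetched but not yet installed.
printf 'autoconf payload\n' > autoconf.deb
printf 'automake payload\n' > automake.deb

# The lock file would normally be committed next to the Dockerfile.
sha256sum autoconf.deb automake.deb > packages.sha256

# Verification passes while the files match the lock file.
if sha256sum -c packages.sha256 >/dev/null 2>&1; then
    lock_status=ok
else
    lock_status=mismatch
fi
echo "before tampering: $lock_status"

# A replacement file with the same name is caught by the hash check.
printf 'malicious payload\n' > automake.deb
if sha256sum -c packages.sha256 >/dev/null 2>&1; then
    tampered_status=ok
else
    tampered_status=mismatch
fi
echo "after tampering: $tampered_status"
```

In a real setup the install step would run only when the check reports ok.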
Your logical fallacy of the day is: "Moving the goalposts". Now that we have subset the original stated problem from "it is policy to regularly expire and rotate keys" down to "the key turned out to be weak or was actively compromised", it turns out the problem is very manageable and everyone has been managing precisely this for years.
It does indeed sound very unfortunate that the Debian project lacks robust resources and using Debian services is slow. Someone should sponsor infrastructure or something... however, "reproducibility is slow because slow server" is not exactly proof that it is unreproducible.
This sounds very suspicious because it is not how the Debian repository format works. I don't know whether you've studied the Debian policy documentation for this, but a Debian repository consists of a cryptographically signed root manifest that securely (we hope) verifies sha256/512-hashed sub-manifests, which in turn record a sha256/512-hashed set of .debs. As long as you can securely verify the root manifest you have a full chain of trust. If for some reason you determine that the PGP signature isn't reliable for this purpose, you can save the sha512 or blake2 hash of the root manifest, and guarantee that unless an attack is found that permits simultaneously forging the md5, sha1, sha256, and sha512 for the same malicious file, the single file you saved hashes for has covered all .debs for that Debian release. You should probably be able to just sideload your own known-good release file and apt will just use that and recursively handle all the rest for you. And there are various caching proxies you could use to accelerate it with a local mirror.
I'll reiterate my philosophical musing about how it would be great if someone who was passionate about this and interested in working with existing communities would offer to help make this a reality. Again, the packages that go into an image are reproducible, so most of the work is already done and the rest is just "whatever is needed regardless of distro". |
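The chain of trust described in this thread (a root manifest that hashes sub-manifests, which in turn hash the .debs) can be illustrated with a heavily simplified sketch. The files named `Release`, `Packages`, and `tool.deb` below are stand-ins in plain `sha256sum` format, not the real Debian repository file formats, and `verify_chain` is a hypothetical helper.

```shell
#!/bin/sh
# Sketch only: one pinned hash of a root manifest covers everything it lists.
# Release/Packages are simplified to sha256sum format; real Debian files differ.
set -eu

workdir=$(mktemp -d)
trap 'rm -rf "$workdir"' EXIT
cd "$workdir"

printf 'tool payload\n' > tool.deb
sha256sum tool.deb > Packages      # sub-manifest records the package hash
sha256sum Packages > Release       # root manifest records the sub-manifest hash

# The only value that has to be stored out of band.
pinned_root=$(sha256sum Release | cut -d' ' -f1)

verify_chain() {
    # Check the root against the pinned hash, then walk down the chain.
    [ "$(sha256sum Release | cut -d' ' -f1)" = "$pinned_root" ] &&
        sha256sum -c Release >/dev/null 2>&1 &&
        sha256sum -c Packages >/dev/null 2>&1
}

if verify_chain; then chain_status=ok; else chain_status=broken; fi
echo "untouched tree: $chain_status"

# Swap the package and regenerate Packages to hide it: the root check still
# fails, because Release no longer matches the regenerated Packages file.
printf 'malicious payload\n' > tool.deb
sha256sum tool.deb > Packages
if verify_chain; then tampered_chain=ok; else tampered_chain=broken; fi
echo "tampered tree: $tampered_chain"
```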
||
build-essential make autoconf automake libtool git perl zip zlib1g-dev gawk && \ | ||
rm -rf /var/lib/apt/lists/* | ||
|
||
ARG UID=1000 GID=1000 | ||
|
||
RUN groupadd --gid $GID dev && \ | ||
useradd --uid $UID --gid dev --shell /bin/bash --create-home dev | ||
|
||
USER dev:dev | ||
|
||
ARG SOURCE_DATE_EPOCH | ||
ENV SOURCE_DATE_EPOCH=${SOURCE_DATE_EPOCH:-1} |
Forgive a docker newbie, but where does the data for this line come from?
You can find it when you pull the image:
But also probably using docker manifest.
alternately one might be able to use the existing curl-dev-debian image as a base (in curl ghcr)
What's the practical difference?
I got this digest by going to https://hub.docker.com/_/debian, checking the tags at https://hub.docker.com/_/debian/tags, and filtering for bookworm-slim.
I didn't know there is already a curl docker image; I read the original mail listing tools and operating system and simply went for debian bookworm. Happy to change this.
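As background on the SOURCE_DATE_EPOCH build argument used in the Dockerfile, here is a minimal sketch of how a fixed timestamp can yield byte-identical archives. It assumes GNU date and GNU tar 1.28 or later. The directory layout and the `make_tarball` helper are hypothetical, and this is not a description of what maketgz does.

```shell
#!/bin/sh
# Sketch only: a fixed SOURCE_DATE_EPOCH used to build a reproducible tar file.
# Assumes GNU date and GNU tar >= 1.28; not taken from curl's maketgz.
set -eu

SOURCE_DATE_EPOCH=1711526400   # value from the docker build example
export SOURCE_DATE_EPOCH

# Human-readable form of the timestamp (GNU date syntax).
release_date=$(date -u -d "@$SOURCE_DATE_EPOCH" +%Y-%m-%dT%H:%M:%SZ)
echo "release timestamp: $release_date"

workdir=$(mktemp -d)
trap 'rm -rf "$workdir"' EXIT
mkdir "$workdir/src"
echo "hello" > "$workdir/src/a.txt"
echo "world" > "$workdir/src/b.txt"

make_tarball() {
    # Fixed ordering, mtime, and ownership remove common sources of drift.
    tar --sort=name --mtime="@$SOURCE_DATE_EPOCH" \
        --owner=0 --group=0 --numeric-owner \
        -C "$workdir" -cf "$1" src
}

make_tarball "$workdir/first.tar"
touch "$workdir/src/a.txt"     # update the on-disk mtime between runs
make_tarball "$workdir/second.tar"

first_sum=$(sha256sum "$workdir/first.tar" | cut -d' ' -f1)
second_sum=$(sha256sum "$workdir/second.tar" | cut -d' ' -f1)
if [ "$first_sum" = "$second_sum" ]; then
    echo "tarballs are identical"
fi
```

Compression and the autotools-generated files are separate sources of variation that this sketch does not cover.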