
Dockerfile for release automation #13250


Closed
daniel-j-h wants to merge 5 commits from the adds-release-automation branch

Conversation

daniel-j-h
Contributor

Hey @bagder 👋 I've seen your post on Mastodon about how to reproduce the release tarballs.

I wanted to bounce an idea off of you:

  1. We check in a Dockerfile build environment based on a specific distribution (Debian bookworm here)
  2. We then have a GitHub Action build and create the release tarballs in this specific environment

This is far from full reproducible builds, but it could at least be a cheap way forward for reproducible release tarballs.
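For illustration, the build environment is roughly along these lines (a sketch; the exact base image and package list in the Dockerfile may differ):

# sketch only: base image and package list are illustrative
FROM debian:bookworm

RUN apt-get update \
 && apt-get install -y --no-install-recommends \
      build-essential autoconf automake libtool libtool-bin \
      make git perl zlib1g-dev \
 && rm -rf /var/lib/apt/lists/*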

Here is the GitHub Action running in my fork:

https://github.com/daniel-j-h/curl/actions/runs/8499306747/job/23279970815

What's left to do here is

  • discuss whether this could be a way forward
  • have the GitHub Action use a new signing key (stored in GitHub secrets) and sign the tarballs
  • have the GitHub Action upload release tarballs and signatures on every tag push

Refs

@github-actions github-actions bot added the CI Continuous Integration label Mar 31, 2024
@daniel-j-h daniel-j-h force-pushed the adds-release-automation branch from 5e768e5 to 4da5c97 Compare March 31, 2024 16:05
@bagder
Member

bagder commented Mar 31, 2024

Having a dockerfile that can do identical tarballs to the one we ship is a sensible idea as it makes it easier for anyone who feels like it to reproduce them. I don't think there is any particular need to do releases "in the cloud" as it does not actually add value or protection to the process. As long as the tarballs can be reproduced, they can be verified to be built from a known git repository state and then it does not matter where they were generated.

@daniel-j-h
Contributor Author

Having a dockerfile that can do identical tarballs to the one we ship is a sensible idea as it makes it easier for anyone who feels like it to reproduce them. I don't think there is any particular need to do releases "in the cloud" as it does not actually add value or protection to the process. As long as the tarballs can be reproduced, they can be verified to be built from a known git repository state and then it does not matter where they were generated.

Got it, that makes sense! I can remove the GitHub Action integration if you don't want the GitHub Action to create and upload those tarballs. The only problem I can see is that if the Dockerfile is not part of your workflow or part of some sort of automation, it's very easy to forget to keep it up to date. Do you have ideas for how we could make sure we don't forget about it?


Then I did the following experiment: I went to the release page and downloaded the latest curl-8.7.1.tar.gz, and compared it against a curl-8.7.1.tar.gz created locally from the git tag checkout using the Dockerfile, running the commands below.

docker build -t curl/curl .

docker run --rm -v (pwd):/usr/src -w /usr/src curl/curl autoreconf -fi
docker run --rm -v (pwd):/usr/src -w /usr/src curl/curl ./configure --without-ssl --without-libpsl
docker run --rm -v (pwd):/usr/src -w /usr/src curl/curl make -j8
docker run --rm -v (pwd):/usr/src -w /usr/src curl/curl ./maketgz 8.7.1
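The comparison below assumes the two tarballs were unpacked into local and remote directories, roughly like this (the paths are illustrative):

mkdir -p local remote
# locally built tarball from the maketgz step above
tar -C local -xzf curl-8.7.1.tar.gz
# tarball downloaded from the release page (download path is hypothetical)
tar -C remote -xzf ~/Downloads/curl-8.7.1.tar.gz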

Here is an overview of what a recursive diff shows me after following just these simple steps:

$ diff -qr local remote

Files local/curl-8.7.1/CHANGES and remote/curl-8.7.1/CHANGES differ
Files local/curl-8.7.1/docs/curl-config.1 and remote/curl-8.7.1/docs/curl-config.1 differ
Files local/curl-8.7.1/include/curl/curlver.h and remote/curl-8.7.1/include/curl/curlver.h differ
Files local/curl-8.7.1/include/curl/Makefile.in and remote/curl-8.7.1/include/curl/Makefile.in differ
Files local/curl-8.7.1/ltmain.sh and remote/curl-8.7.1/ltmain.sh differ
Files local/curl-8.7.1/packages/Makefile.in and remote/curl-8.7.1/packages/Makefile.in differ
Files local/curl-8.7.1/packages/vms/Makefile.in and remote/curl-8.7.1/packages/vms/Makefile.in differ
Files local/curl-8.7.1/projects/Windows/VC14/lib/libcurl.vcxproj and remote/curl-8.7.1/projects/Windows/VC14/lib/libcurl.vcxproj differ
Files local/curl-8.7.1/projects/Windows/VC14/src/curl.vcxproj and remote/curl-8.7.1/projects/Windows/VC14/src/curl.vcxproj differ
Files local/curl-8.7.1/projects/Windows/VC14.10/lib/libcurl.vcxproj and remote/curl-8.7.1/projects/Windows/VC14.10/lib/libcurl.vcxproj differ
Files local/curl-8.7.1/projects/Windows/VC14.10/src/curl.vcxproj and remote/curl-8.7.1/projects/Windows/VC14.10/src/curl.vcxproj differ
Files local/curl-8.7.1/projects/Windows/VC14.20/lib/libcurl.vcxproj and remote/curl-8.7.1/projects/Windows/VC14.20/lib/libcurl.vcxproj differ
Files local/curl-8.7.1/projects/Windows/VC14.20/src/curl.vcxproj and remote/curl-8.7.1/projects/Windows/VC14.20/src/curl.vcxproj differ
Files local/curl-8.7.1/projects/Windows/VC14.30/lib/libcurl.vcxproj and remote/curl-8.7.1/projects/Windows/VC14.30/lib/libcurl.vcxproj differ
Files local/curl-8.7.1/projects/Windows/VC14.30/src/curl.vcxproj and remote/curl-8.7.1/projects/Windows/VC14.30/src/curl.vcxproj differ
Files local/curl-8.7.1/scripts/Makefile.in and remote/curl-8.7.1/scripts/Makefile.in differ
Files local/curl-8.7.1/src/tool_hugehelp.c and remote/curl-8.7.1/src/tool_hugehelp.c differ
Files local/curl-8.7.1/tests/data/Makefile.in and remote/curl-8.7.1/tests/data/Makefile.in differ
Files local/curl-8.7.1/tests/http/Makefile.in and remote/curl-8.7.1/tests/http/Makefile.in differ
Files local/curl-8.7.1/tests/Makefile.in and remote/curl-8.7.1/tests/Makefile.in differ

What I can see in there:

  • The CHANGES file differs: the one from the local build is missing individual changes; my hunch is that git needs to be installed for this file to be assembled.
  • There are date stamps in a few files, e.g. in the curl-config.1 file and in the curlver.h file (the maketgz script adds date stamps here). I guess we can ignore this.
  • Some automake arguments differ in the Makefile.in files (--gnu vs --foreign).
  • The src/tool_hugehelp.c file in the GitHub release seems to have an additional second part containing a compressed version that gets run through gzip; this part is not in my local tarball.

and this is just from looking through the first five or so differences.

I don't think I can get to the bottom of all of those differences myself, and I certainly cannot judge why they differ. I believe some could come from slightly different tools or from how we invoke them. I would appreciate your help here.

@bagder
Member

bagder commented Mar 31, 2024

  1. Right, we need a CI job to make sure that the Dockerfile remains functional. Maybe we should make use of it for the distcheck.yml job?
  2. We need git installed to generate CHANGES in the tarball
  3. You should set the env variable SOURCE_DATE_EPOCH to a fixed time (epoch) before you start the tests so that both builds use the exact same timestamp
  4. We need zlib1g-dev installed so that configure detects it and hugehelp.c gets the compressed version bundled
  5. The MS project files should not differ...

@bagder
Member

bagder commented Mar 31, 2024

The autotools-related differences are, I assume, because your image does not use the exact same set of tools that I used when I made the release.

@vszakats
Member

vszakats commented Mar 31, 2024

  1. You should set the env variable SOURCE_DATE_EPOCH to a fixed time (epoch) before you start the tests so that both builds use the exact same timestamp

Possibly also TZ=UTC?

Speaking of 8.7.1, also LC_ALL=C, to generate the date in English. This will not be necessary in upcoming versions after today's afdd112.
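Taken together, a sketch of how these variables could be set before creating the tarball (the epoch value is just an example, corresponding to 2024-03-27 00:00 UTC):

export SOURCE_DATE_EPOCH=1711497600   # fixed timestamp for reproducible datestamps
export TZ=UTC                         # avoid local time zone differences
export LC_ALL=C                       # render dates in English
./maketgz 8.7.1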

@daniel-j-h
Contributor Author

Thanks folks! I made some progress here and found two date-related issues

  1. The maketgz script does not respect the SOURCE_DATE_EPOCH env var but always uses today; fixing in 817f968
  2. Our date computation is off somewhere, resulting in the day before the release ending up in a man page

Looking at the GitHub release for 8.7.1, it was made on 2024-03-27 (around 10am):

https://github.com/curl/curl/releases/tag/curl-8_7_1

In the release tarballs I can find a datestamp from 2024-03-26

remote/curl-8.7.1/docs/curl-config.1
2:.TH curl-config 1 "March 26 2024" curl-config

and other datestamps seem to be correctly set to 2024-03-27

remote/curl-8.7.1/include/curl/curlver.h
73:#define LIBCURL_TIMESTAMP "2024-03-27"

My hunch is that

  1. either our date-stamping logic for stamping the man page is off somewhere, or
  2. the release was perhaps not fully re-created from a clean state on 2024-03-27, and some of it got date-stamped the day before

This is in no way critical; I just wanted to note it here since it makes automated comparison of the diffs tricky.
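As an aside, one common way to pin SOURCE_DATE_EPOCH is to derive it from the release tag itself; a sketch using the curl-8_7_1 tag:

# use the committer date of the tagged release commit as the timestamp
export SOURCE_DATE_EPOCH=$(git log -1 --pretty=%ct curl-8_7_1)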


Then I looked into the differences in the *.in files, e.g.

djh@rf /t/curlpkgs> diff {local,remote}/curl-8.7.1/include/curl/Makefile.in
462c462
<       echo ' cd $(top_srcdir) && $(AUTOMAKE) --gnu include/curl/Makefile'; \
---
>       echo ' cd $(top_srcdir) && $(AUTOMAKE) --foreign include/curl/Makefile'; \
464c464
<         $(AUTOMAKE) --gnu include/curl/Makefile
---
>         $(AUTOMAKE) --foreign include/curl/Makefile

I checked the Debian bookworm Docker image; here are the versions installed:

  • autoconf 2.71
  • automake 1.16.5
  • libtoolize 2.4.7
  • make 4.3
  • perl 5.36.0
  • git 2.39.2

Comparing to the mail https://curl.se/mail/lib-2024-03/0062.html

For the most recent curl release, my toolset that I believe might affect the results include:

  • autoconf (GNU Autoconf) 2.71
  • automake (GNU automake) 1.16.5
  • libtoolize (GNU libtool) 2.4.7
  • GNU Make 4.3
  • perl v5.38.2
  • git version 2.43.0

The versions seem to match except for the perl and git binaries, but those should not make a difference here.

What I can see is diffs like

<        version:        $progname $scriptversion Debian-2.4.7-5
---
>        version:        $progname $scriptversion Debian-2.4.7-7

so I believe even the slight difference in the libtool packaging revisions makes a difference, even though both are version 2.4.7 (the full package revisions can be checked as sketched below):

  • 2.4.7-5 (Debian bookworm), vs
  • 2.4.7-7 (release tarball)
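As a sketch, the full Debian package revisions (including the -N suffix) can be listed with dpkg-query; the exact package names may vary per image:

dpkg-query -W -f='${Package} ${Version}\n' autoconf automake libtool libtool-bin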

I don't know why the Microsoft files are changing and I don't have the experience to look into this.


Summary: there are still some slight differences between the release tarballs and what I can re-create now with this Docker build environment; some we can explain (see above) and some are still unexplained (Microsoft project files, datestamp mismatch). The autotools-generated code seems to differ even though the tools share the same major/minor/patch versions, making automated comparisons tricky.
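One sketch for making the automated comparison less noisy is to tell diff to ignore the lines carrying the known datestamps (the regexes are illustrative):

# -I skips hunks whose changed lines all match the given regex
diff -r \
  -I 'LIBCURL_TIMESTAMP' \
  -I '^\.TH curl-config' \
  local/curl-8.7.1 remote/curl-8.7.1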

@vszakats
Member

vszakats commented Apr 2, 2024

I don't know why the Microsoft files are changing and I don't have the experience to look into this.

Can you tell which files these are and how they are changing? Is it their line endings perhaps?

@daniel-j-h
Contributor Author

The Microsoft-related files I can see in my diff -qr output are:

Files local/curl-8.7.1/projects/Windows/VC14/lib/libcurl.vcxproj and remote/curl-8.7.1/projects/Windows/VC14/lib/libcurl.vcxproj differ
Files local/curl-8.7.1/projects/Windows/VC14/src/curl.vcxproj and remote/curl-8.7.1/projects/Windows/VC14/src/curl.vcxproj differ
Files local/curl-8.7.1/projects/Windows/VC14.10/lib/libcurl.vcxproj and remote/curl-8.7.1/projects/Windows/VC14.10/lib/libcurl.vcxproj differ
Files local/curl-8.7.1/projects/Windows/VC14.10/src/curl.vcxproj and remote/curl-8.7.1/projects/Windows/VC14.10/src/curl.vcxproj differ
Files local/curl-8.7.1/projects/Windows/VC14.20/lib/libcurl.vcxproj and remote/curl-8.7.1/projects/Windows/VC14.20/lib/libcurl.vcxproj differ
Files local/curl-8.7.1/projects/Windows/VC14.20/src/curl.vcxproj and remote/curl-8.7.1/projects/Windows/VC14.20/src/curl.vcxproj differ
Files local/curl-8.7.1/projects/Windows/VC14.30/lib/libcurl.vcxproj and remote/curl-8.7.1/projects/Windows/VC14.30/lib/libcurl.vcxproj differ
Files local/curl-8.7.1/projects/Windows/VC14.30/src/curl.vcxproj and remote/curl-8.7.1/projects/Windows/VC14.30/src/curl.vcxproj differ

Here is curl-8.7.1/projects/Windows/VC14/lib/libcurl.vcxproj for my local tarball and release, respectively. Compare

I don't know how these files are generated, whether the entries can appear in a different order, or whether the build command differs.

@vszakats
Member

vszakats commented Apr 2, 2024

It seems indeed that ItemGroup list items are not alpha-sorted.

There is generator logic in ./Makefile.in (and also in projects/generate.bat when run natively on Windows).

@bagder
Member

bagder commented Apr 3, 2024

maketgz invokes make vc-ide, and in Makefile.am there is a complicated awk script that generates the project files using the variables (lists of file names really) set in the Makefile.

This script sorts the lists. I have not figured out why it would create output not sorted...

@bagder
Member

bagder commented Apr 3, 2024

remote/curl-8.7.1/docs/curl-config.1

This file is included in the tarball by mistake. The date in this file is the date when it was generated, the day before the release. I will make a PR to remove it from the dist; it should get generated in the build.

bagder added a commit that referenced this pull request Apr 3, 2024
The markdown file is already there and the .1 file gets generated in the
build.

Ref: #13250
bagder added a commit that referenced this pull request Apr 3, 2024
The markdown file is already there and the .1 file gets generated in the
build.

Ref: #13250
Closes #13268
@bagder
Member

bagder commented Apr 4, 2024

@daniel-j-h maybe you can submit the maketgz fix separately, to allow us to gradually and step by step fix the nits you have identified

@daniel-j-h
Contributor Author

Got it! Can do: #13280

@bagder
Member

bagder commented Apr 4, 2024

Ah, the image needs gawk installed as well! For make vc-ide to work.

daniel-j-h added a commit to daniel-j-h/curl that referenced this pull request Apr 5, 2024
The SOURCE_DATE_EPOCH env var is needed to date-stamp releases
properly with the release date, when re-creating official releases.

Ref: curl#13250
@daniel-j-h daniel-j-h force-pushed the adds-release-automation branch from 41cb10a to 090a150 Compare April 5, 2024 06:48
@daniel-j-h
Contributor Author

daniel-j-h commented Apr 5, 2024

Ah, the image needs gawk installed as well! For make vc-ide to work.

Good catch!! I just added it to the Docker environment and re-built the release tarballs.

$ gawk --version
GNU Awk 5.2.1, API 3.2, PMA Avon 8-g1, (GNU MPFR 4.2.0, GNU MP 6.2.1)

It looks like there are still unsorted lists in there, though.

For example, compare the files in this list:

  1. https://gist.github.com/daniel-j-h/936a722c2c2c8b3afa3a546e4fca44ac#file-local-libcurl-vcxproj-L2389 (this branch)
  2. https://gist.github.com/daniel-j-h/936a722c2c2c8b3afa3a546e4fca44ac#file-remote-libcurl-vcxproj-L2389 (release tarball)

Even after re-building everything with gawk in the Docker environment, these differences still exist in the Microsoft files:

Files local/curl-8.7.1/projects/Windows/VC14/lib/libcurl.vcxproj and remote/curl-8.7.1/projects/Windows/VC14/lib/libcurl.vcxproj differ
Files local/curl-8.7.1/projects/Windows/VC14/src/curl.vcxproj and remote/curl-8.7.1/projects/Windows/VC14/src/curl.vcxproj differ
Files local/curl-8.7.1/projects/Windows/VC14.10/lib/libcurl.vcxproj and remote/curl-8.7.1/projects/Windows/VC14.10/lib/libcurl.vcxproj differ
Files local/curl-8.7.1/projects/Windows/VC14.10/src/curl.vcxproj and remote/curl-8.7.1/projects/Windows/VC14.10/src/curl.vcxproj differ
Files local/curl-8.7.1/projects/Windows/VC14.20/lib/libcurl.vcxproj and remote/curl-8.7.1/projects/Windows/VC14.20/lib/libcurl.vcxproj differ
Files local/curl-8.7.1/projects/Windows/VC14.20/src/curl.vcxproj and remote/curl-8.7.1/projects/Windows/VC14.20/src/curl.vcxproj differ
Files local/curl-8.7.1/projects/Windows/VC14.30/lib/libcurl.vcxproj and remote/curl-8.7.1/projects/Windows/VC14.30/lib/libcurl.vcxproj differ
Files local/curl-8.7.1/projects/Windows/VC14.30/src/curl.vcxproj and remote/curl-8.7.1/projects/Windows/VC14.30/src/curl.vcxproj differ

On a positive note: if we figure out the Microsoft file differences, then the only remaining differences are due to the slight mismatch in tools outlined in #13250 (comment). I'm wondering: are you on Debian bookworm and using the packages from apt, or did you install specific packages locally from source?

I can see three ways forward: either we track your specific setup in this Dockerfile, or we standardize on tools from a distribution such as Debian bookworm, or maybe you are willing to do releases in the self-contained Docker environment.

@bagder
Member

bagder commented Apr 5, 2024

if we figure out the Microsoft file differences

We should figure that out and make them stable.

are you on debian bookworm and using the packages from apt

I'm on Debian unstable/sid and I use packages from apt, but...

maybe you are willing to do releases in the self-contained docker environment

I am prepared to use this dockerfile when releasing tarballs going forward to ease the process for people who want to reproduce them. I can probably also make the daily snapshot builds use the same thing.

bagder added a commit that referenced this pull request Apr 5, 2024
This target generates the MSVC project files. This change removes the
extra sorting and instead makes the script use the order of the files as
listed in the variables - which are mostly sorted anyway.

This is an attempt to make the project file generation more easily
reproducible.

Ref: #13250
@bagder
Member

bagder commented Apr 5, 2024

#13294 is an attempt to simplify the project file generation, in the hope that the sorting was a reason for the diff. The sorting was unnecessary in any case.

bagder added a commit that referenced this pull request Apr 5, 2024
This target generates the MSVC project files. This change removes the
extra sorting and instead makes the script use the order of the files as
listed in the variables - which are mostly sorted anyway.

This is an attempt to make the project file generation more easily
reproducible.

Ref: #13250
Closes #13294
@daniel-j-h
Contributor Author

Sorry for the slow responses; I've been out sick these days and hope to get better by the weekend. Some quick responses below.

When I build a release with this image all the files it makes get owned by root. Very inconvenient. Is there an easy fix to make them owned by my regular user?
Ah yes, I need to pass -u $(id -u):$(id -u) to docker run.

There are two sides to this

  1. When you docker run you can pass your host user id and group id to the container

  2. In the Dockerfile we can create a user and a group and switch to them for all subsequent commands

     RUN groupadd --gid 1000 dev \
      && useradd --uid 1000 --gid dev --shell /bin/bash --create-home dev
    
     USER dev:dev
    

https://docs.docker.com/reference/dockerfile/#user

The second step is optional and is only needed because some programs expect an actual user and group (and e.g. a home dir) to exist when they see the uid and gid that you pass in. And if you want to map your host user and group onto the container's user and group, it's best to add build ARGs we can change when we build and run it end to end, because user id 1000 might not exist on your host.
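A sketch of how the ARG-based variant could look inside the Dockerfile (after the FROM line; names and defaults are illustrative):

# build-time arguments so the host uid/gid can be mapped in;
# override with --build-arg UID=... --build-arg GID=...
ARG UID=1000
ARG GID=1000

RUN groupadd --gid "${GID}" dev \
 && useradd --uid "${UID}" --gid dev --shell /bin/bash --create-home dev

USER dev:dev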

@nurupo

nurupo commented Apr 10, 2024

This assumes that the docker image used is not compromised itself, either via the distro or via the docker image maintainers. If possible, it would be nice to be able to generate the tarball using different distros dockerized by different groups of people, e.g. Debian and Alpine, or perhaps Fedora, and compare them. Not sure if a tarball can be made reproducible like that though.

@bagder
Member

bagder commented Apr 11, 2024

The tool I'm most worried about is automake though

autoconf, automake and libtool are most likely the tools whose versions are most relevant. The others most probably do not matter at all.

I also don't want a "trust competition". Reproducible builds are primarily interesting and important in the short term: to allow users to easily verify the last few releases (say within the last year or so).

An obvious advantage of using Debian as a base for this is familiarity: trusting the brand, its processes, and the fact that they also patch packages for functionality. Going with a separate environment means leaning on someone else's process, possibly a less familiar one.

@bagder
Member

bagder commented Apr 11, 2024

and also: we have removed the need for awk in maketgz (since #13311) so gawk can be removed again 😄

@lrvick

lrvick commented Apr 11, 2024

Reproducible builds are primarily interesting and important in the short term: to allow users to easily verify the last few releases (say within the last year or so).

A breaking change resulting from building from the latest Debian packages could come at any time. Maybe in 5 years, maybe in a week. Also, given that Debian does not sign its container images or produce reproducible digests, we have no easy way to know if the Debian in the container is actually the Debian we expect it to be.

My arguments here would be far stronger if we were talking about generating deterministic binaries. That said, binaries (and the archives that generate them) can stick around and see some level of use for decades. There are (sadly) many mission critical embedded devices being used today with very old versions of curl on them and we can expect versions installed today will likewise be used many years in the future.

When it comes to security auditing and forensics it becomes relevant to recreate the exact supply chains of binaries created years ago, that resulted in recent harm, to rule out or identify supply chain attacks or obscure vulnerabilities that only show up depending on the tools used to build them.

My efforts here and elsewhere are about trying to get widely used projects to trend towards releasing using methods that are always predictably deterministic and long-term-auditable. That of course does not have to be with Stagex, though with my bias hat firmly in place, I can say it does make it much simpler to audit and maintain IMO.

If using Debian is a hard requirement, I would suggest using scripts like https://github.com/reproducible-containers/repro-sources-list.sh or the setup I wrote here in https://git.distrust.co/public/toolchain to produce a set of locked hashes of a given snapshot of debian dependencies, then download and install these exact snapshots as I describe above. This path is complex, slow, difficult to maintain, very disk heavy, and definitely a hack to force Debian to do something it was not meant to do, but it does -work- and is my go-to when projects mandate Debian out of familiarity or other reasons.

We did have this debian-locking approach to determinism audited by Cure53 as we used it in AirgapOS, though we are now moving that project to stagex to reduce maintenance burden: https://git.distrust.co/public/airgap/src/branch/main/audits/cure53-2020.pdf

@bagder
Member

bagder commented Apr 11, 2024

My arguments here would be far stronger if we were talking about generating deterministic binaries

Those who ship binaries might appreciate reproducible source tarballs, but they need to take care of the binary producing part themselves.

always predictably deterministic and long-term-auditable

Sure, that's ideal. But I realize we may need to set priorities and make decisions. Making it dead easy to verify source packages twenty years later is not something I think is worth spending many brain cells on. Diminishing returns and all that.

The people who are stuck on ancient versions are also probably the ones least likely to actually want to verify these things as they are clearly not very concerned...

If using Debian is a hard requirement

I don't believe it is. I think we are still in a process where we assess our options and their pros and cons.

@vszakats
Member

Considering the (continuous) effort necessary to keep the pre-built autotools bits deterministic (and then verify them), I'd risk saying that a more efficient alternative is to offer a source tarball that doesn't contain pre-generated autotools files. Those would require the builder to run autoreconf -fi (is there any downside to that?), or use CMake, which works out of the box.

A trivial way to do this is to GPG sign the tarballs created automatically by GitHub:
https://github.com/curl/curl/archive/refs/tags/curl-8_7_1.tar.gz
https://github.com/curl/curl/archive/refs/tags/curl-8_7_1.zip

This has the downside of missing pre-generated manual files, and building those locally requires Perl.

Fixing this needs a somewhat more involved solution, but much less so than the dance with reproducible OS environments, IMO.
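For illustration, signing and verifying such an auto-generated tarball could look roughly like this (the file names are just examples):

# maintainer: create a detached, ASCII-armored signature
gpg --armor --detach-sign curl-8_7_1.tar.gz

# downloader: verify the archive against the published signature and public key
gpg --verify curl-8_7_1.tar.gz.asc curl-8_7_1.tar.gz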

@bagder
Member

bagder commented Apr 12, 2024

I agree that we could remove the need for a lot of this by not generating anything at all for releases. I just don't think it would benefit our users.

  1. We have users that build curl from configure, especially on older systems, because they then don't need to install additional build software (like cmake or autotools). Forcing users to install the autotools to use configure will push workloads to users and there will be follow-up squeaks and problems.

  2. People have many times over the years expressed their desire to avoid having to install perl to build curl and we have worked on preserving this ability. Only very recently even.

  3. The GitHub tarballs are not reproducible, are they?

Sacrificing build convenience for easier tarball verification would go against what we know plenty of people want, in favor of something virtually nobody has asked for.

@vszakats
Member

vszakats commented Apr 12, 2024

  1. The GitHub tarballs are not reproducible, are they?

They are (for now). Last year this broke for a few days due to an update they did. The fallout was large enough to have it reverted. They then guaranteed it for 1 year, meaning: not guaranteed, but they still are reproducible.

https://github.blog/2023-02-21-update-on-the-future-stability-of-source-code-archives-and-hashes/

The next best thing is cloning a specific Git hash. That needs Git of course, and relies on SHA-1 hashes, which are known to allow collisions. It is also easy to mess up by using the unencrypted git:// protocol. Is it possible to double-check a Git clone against a known state?
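One sketch of such a double-check, assuming the curl-8_7_1 tag and a known-good commit hash published out of band:

# clone over https and resolve the tag to the commit it points at
git clone https://github.com/curl/curl.git
git -C curl rev-parse 'curl-8_7_1^{commit}'
# compare the printed hash with the known-good one; if the tag is GPG-signed,
# 'git -C curl tag -v curl-8_7_1' additionally verifies the signature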

Sacrificing build convenience for easier tarball verification would go against what we know plenty want in favor of something virtually nobody has asked for.

I understand.

Adding that it's probably expected that cmake or autotools are already installed on many systems, just like (GNU) make is, or are as easy to install as the latter. But yeah, it might take time until some of these become as ubiquitous. (Maybe worth a survey question?)

@eli-schwartz
Contributor

Adding that it's probably expected that cmake or autotools are already installed on many systems, just like (GNU) make is, or are as easy to install as the latter. But yeah, it might take time until some of these become as ubiquitous. (Maybe worth a survey question?)

The people who appreciate using configure on older systems because it doesn't require installing anything may be people on operating systems that don't have a C++ compiler, so no cmake. They may have a non-GNU make, or a GNU make from two decades ago. They may not have autotools at all, or have autotools from two decades ago that can't be used to regenerate curl's configure script -- a major stated advantage of autotools is that you can make a dist tarball on a bleeding-edge distro, but still have that work on very old systems without modern autotools, as long as they have a POSIX shell, make and general utilities.

Continuing to offer autotools but requiring users of autotools to generate the configure script themselves seems like a very mixed message.

@vszakats
Member

They may not have autotools at all, or have autotools from two decades ago that can't be used to regenerate the curl configure.ac

Out of curiosity, is there any other autotools compatibility requirement for configure.ac, besides being v2.59 (from 2003-11-06) or newer?

AC_PREREQ(2.59)

It seems a little arbitrary to take for granted both a compatible make tool and C compiler, but not a compatible autotools. Though I admit I have no data or even anecdotal info about this. I'd have expected autotools to have lived together with make and C compilers way back. CMake is much newer of course, and the minimum required 3.7 (2018-12-03) can be a limitation indeed. (Even without this limitation, CMake only seems to go back about a decade from now as 'ubiquitous'.)

Either way, I understand that pre-packaging autotools stuff is something expected, period.

Also, those worried about the state of the pre-packaged stuff (and who have a compatible autotools) can always run autoreconf.

@eli-schwartz
Contributor

It seems a little arbitrary to take for granted both a compatible make tool and C compiler, but not a compatible autotools. Though I admit to have no data or even anecdotal info about this. I'd have expected autotools to have lived together with make and C compilers way back.

Both make and C are much older than autotools -- and are required by POSIX. However, I'll admit I didn't check what the minimum autotools requirement for curl is before commenting. I wonder if it's still tested against that. 🤔

xc-am-iface.m4 does offer support for "versions of automake older than 1.14" though it's not clear what the actual minimum is (nor, indeed, whether that is tested either).

@dfandrich
Contributor

dfandrich commented Apr 14, 2024 via email

@vszakats
Member

Thanks for the minimums info.

Dates for these:

  • GNU Libtool 1.4.2 (2001-09-11)
  • GNU Autoconf 2.59 (2003-11-06)
  • GNU Automake 1.7 (2002-09-25)
  • GNU M4 1.4 (2007-09-21)

I wonder what might be the oldest GNU C (or other brand) compiler able to compile curl. 3.0.0 to 3.3.0 were released in the above timeframe. 2.95 is from 1999-07-29.

It'd be an interesting experiment to test against the autotools minimums (the linux-old CI job already revealed two CMake build issues), if there exists any re-usable online infrastructure for that.

@bagder
Member

bagder commented Apr 15, 2024

I wonder what might be the oldest GNU C (or other brand) compiler able to compile curl.

A C89 compliant compiler should still be able to build curl even if the compiler is from the 90s. We stick to C89 partly because of that.

And if it does not due to some mistake somewhere, it can't be very important since nobody has reported it...

@bagder
Member

bagder commented Apr 15, 2024

It seems a little arbitrary to take for granted both a compatible make tool and C compiler, but not a compatible autotools.

It is not arbitrary. It is a design choice and how autotools has always worked. Users everywhere have always been able to install software without having autotools themselves. You install autotools only when you want to develop software, while if you just want to build and install software, autotools may not be installed (or up to date).

Thus, suddenly asking that people should install and use autotools is wrong and will cause a lot of friction.

Update: configure itself is probably also more portable and functional than the autotools themselves. So you can install curl using configure on systems where you might not be able to easily use autotools.

@daniel-j-h daniel-j-h force-pushed the adds-release-automation branch from 090a150 to 5bf7639 Compare April 15, 2024 19:20
@daniel-j-h
Contributor Author

Hey @bagder I just made some changes here

  1. rebased on master to get the latest maketgz changes
  2. added a dedicated user so that the container doesn't run as root and we can have the same user permissions on the host and in the container
  3. added build arguments for the timestamp and also user and group id

To build:

docker build \
  --build-arg SOURCE_DATE_EPOCH=1711526400 \
  --build-arg UID=$(id -u) \
  --build-arg GID=$(id -g) \
  -t curl/curl .

To run:

docker run --rm -it -u $(id -u):$(id -g) \
  -v (pwd):/home/dev/curl -w /home/dev/curl \
  curl/curl bash

This gets you into a container based on the image you built in the previous step. Because we created a dedicated user, we work in its home directory and map the host's user and group ids to the container's user.

Like I said in https://github.com/curl/curl/pull/13250/files#discussion_r1559389357 it's not perfect, but I'd still love your feedback and to hear whether it's worth getting this in. Wanna give it a final look?

@bagder
Member

bagder commented Apr 16, 2024

I'm using the following script to build a test release using this Dockerfile and it works fine:

version="${1:-}"

if [ -z "$version" ]; then
  echo "Specify a version number!"
  exit 1
fi

user="$(id -u):$(id -g)"

make distclean
docker build \
       --build-arg SOURCE_DATE_EPOCH=$(date -u +%s) \
       --build-arg UID=$(id -u) \
       --build-arg GID=$(id -g) \
       -t curl/curl .

run="run --rm -it -u $(id -u):$(id -g) -v $(pwd):/usr/src -w /usr/src curl/curl"

docker $run autoreconf -fi
docker $run ./configure --without-ssl --without-libpsl
docker $run make -sj8
docker $run ./maketgz $version

@bagder bagder closed this in 41c03b4 Apr 16, 2024
@bagder
Member

bagder commented Apr 16, 2024

Thanks!

@bagder
Member

bagder commented Apr 16, 2024

The tarball generator script using this Dockerfile is in #13388
