reproducible builds proposal: make gem define SOURCE_DATE_EPOCH itself #2290
Comments
I just want to know if this is something that would be wanted/accepted by rubygems, then I can propose a pull request for this as well. Awaiting feedback |
Just to underline that it should only set |
@lamby thanks for making sure this happens that way, but that's literally what I wrote in the 3rd block 🐱 |
@anthraxx I saw exactly that, hence my "underline" :) |
Rubygems already does that, see https://github.com/rubygems/rubygems/blob/master/lib/rubygems/package.rb#L163 |
Unfortunately it doesn't, so please reopen. There are multiple places that do this check. gem doesn't expose that env var; instead, each place sets the current time on its own, one by one. The values can differ if the build takes longer than a second, which makes it unreproducible.
|
The point is that it does not export that env var as one uniform "now", but just assigns members in various places, each independently of the others. They are only guaranteed to all be the same if all are assigned from the env var, which only works for gem itself if gem exports that env var very early.
CC @hsbt @duckinator @segiddins
|
Sorry for the late response, @anthraxx. I only just now saw this issue. My understanding of the issue is that, every place RubyGems checks for
... which means if you don't explicitly define it, then the times won't be consistent throughout the entire process. This means that you can only get reproducible builds via RubyGems by setting Whereas if we check it in that way once, and set the environment variable ourselves if it is not already set, then we're defaulting to reproducible builds. Is that correct, @anthraxx? |
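A minimal sketch of the difference being discussed, assuming nothing about the real RubyGems internals: `epoch_per_call` and `pinned_epoch` are hypothetical helper names, and a plain Hash stands in for `ENV` so the example has no side effects.

```ruby
# Hypothetical helpers for illustration; this is not the actual RubyGems code.

# Per-call fallback: when the variable is unset, every caller gets its own "now",
# so two call sites can disagree if the build crosses a second boundary.
def epoch_per_call(env = ENV)
  value = env["SOURCE_DATE_EPOCH"]
  (value.nil? || value.empty?) ? Time.now.to_i : value.to_i
end

# Pin once: write the fallback back into the environment, so every later read
# (in this process) sees the same timestamp.
def pinned_epoch(env = ENV)
  value = env["SOURCE_DATE_EPOCH"]
  env["SOURCE_DATE_EPOCH"] = Time.now.to_i.to_s if value.nil? || value.empty?
  env["SOURCE_DATE_EPOCH"].to_i
end

env = {}                    # stand-in for ENV
first  = pinned_epoch(env)  # sets the value because it was missing
second = pinned_epoch(env)  # reads the pinned value, however much later this runs
puts first == second
```

An explicitly provided value always wins in both helpers; only the unset case behaves differently.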
Yes, the first set of commits i did just ensured everything works fine if we define This is about making every gem package produced through To achieve this, as you summarized correctly, gem needs to define |
@anthraxx 👍 okay. I'm definitely in favor of adding that. |
@anthraxx I did some tests and it will be probably enough to pass |
Going to proceed creating a patch, would still prefer setting the var as there may be different things influencing it. Tar is just the easier and most obvious, gemspec also has dates and may invoke processes that respect S_D_E to generate docs or anything else.
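To illustrate why setting the variable covers more than the tar timestamps, here is a small sketch showing that a child process inherits it. The child is just another Ruby interpreter standing in for a doc generator or similar tool; the one-liner it runs is made up for the example.

```ruby
require "rbconfig"

# Pin the timestamp in this process if the caller did not provide one
# (a hypothetical stand-in for what gem would do at startup).
ENV["SOURCE_DATE_EPOCH"] ||= Time.now.to_i.to_s

# Any process spawned afterwards inherits the environment, so a tool that
# respects SOURCE_DATE_EPOCH sees the same value as the parent.
child_value = IO.popen([RbConfig.ruby, "-e", "print ENV['SOURCE_DATE_EPOCH']"], &:read)

puts child_value == ENV["SOURCE_DATE_EPOCH"]
```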
|
Maybe if |
Attacks like this could be discovered more easily if this issue was resolved. |
If I'm understanding everything correctly (about both reproducible builds and that CVE) then, if we can get both
then you could re-build the gem locally using the same And, if that checksum were to be different, it'd signify the code has been modified between the expected source and released version. Is that correct? Assuming that is correct: if the SOURCE_DATE_EPOCH is provided in a machine-accessible way (e.g. via the rubygems.org API or in the .gem file or something), I think we could possibly even partially-automate this process. E.g., have a tool that takes a gem name ("bootstrap-sass"), gem version ("3.2.0.3"), makes the required network requests to get other information, and see if it all matches up as expected. |
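The comparison step of such a tool could look roughly like this. It is only a sketch: `same_artifact?` is a hypothetical helper, and fetching the released gem and its SOURCE_DATE_EPOCH (from an API or elsewhere) is left out because no such interface is specified here.

```ruby
require "digest"

# Hypothetical helper: true when two build artifacts are byte-for-byte identical,
# judged by their SHA-256 checksums.
def same_artifact?(path_a, path_b)
  Digest::SHA256.file(path_a).hexdigest == Digest::SHA256.file(path_b).hexdigest
end

# Usage (file names are placeholders):
#   same_artifact?("released.gem", "rebuilt.gem")
```

A mismatch only says the two files differ; it does not say which input (source, timestamp, or tooling) caused the difference.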
Yes. Some testing might be needed to figure out if more env variables should be included. It might also be an idea to make this information available in a buildinfo file, but an API for this is sufficient.
I know rust has been working towards something like this, and on the distribution side we also have been working on tools to recreate packages. |
I'm looking into this again tonight. I'm hoping to have at least a rough version of a PR for this done in the next few hours. 🙂 @Foxboron reading your link about buildinfo files, it looks like ArchLinux includes the |
Fixes rubygems#2290. 1. `Gem::Specification.date` returns SOURCE_DATE_EPOCH when defined, 2. this commit makes RubyGems set it _persistently_ when not provided. This combination means that you can build a gem, check the build time, and use that value to generate a new build -- and then verify they're the same.
I think I've got it working, including tests! #2882 |
Embedding the buildinfo into our package means that in order to reproduce the package you must treat the full list of build VM packages as "build inputs". You cannot reproduce a package completely without making sure the versions of all dependencies and other system software are identical. On the other hand, versions of packages are things that might be expected to influence output anyway (different versions of gcc will surely emit different ELF binaries!) The benefit of explicitly including it is that we can guarantee the buildinfo is always available -- it is attached by the same tool that is required to create a valid package, and you do not need to keep track of two different files (one being the release artifact/package, the other being the buildinfo file). There are pros and cons to both sides. Debian has chosen to store the buildinfo for .deb packages separately, with the rationale that this makes it easier to, say, change the version of sed or gawk and still get the same .deb file. OTOH just yesterday it turns out the rubygem "gpgme" no longer builds from source when gawk 5 is installed due to issues with its bundled libgpg-error (see https://bugs.archlinux.org/task/63654 for details) so even the simple, non-obvious tools can have surprising ramifications... Arch Linux is okay with requiring all known environment modifiers that have been declared to be significant, to be part of the input in the context of distribution packages in order to reproduce things. :) |
2882: Set SOURCE_DATE_EPOCH env var if not provided. r=djberg96 a=duckinator

# Description:
Set SOURCE_DATE_EPOCH env var if not provided. Fixes #2290.
1. `Gem::Specification.date` returns SOURCE_DATE_EPOCH when defined,
2. this commit makes RubyGems set it _persistently_ when not provided.

This combination means that you can build a gem, check the build time, and use that value to generate a new build -- and then verify they're the same.

# Tasks:
- [x] Describe the problem / feature
- [x] Write tests
- [x] Write code to solve the problem
- [ ] Get code review from coworkers / friends

I will abide by the [code of conduct](https://github.com/rubygems/rubygems/blob/master/CODE_OF_CONDUCT.md).

Co-authored-by: Ellen Marie Dash <the@smallest.dog>
I would like to suggest making gem itself a potential SOURCE_DATE_EPOCH declarer, instead of "only" producing reproducible artifacts whenever the outside world defines the SOURCE_DATE_EPOCH environment variable.
While the above works perfectly for all distros, since they and their build tools and pipelines define SOURCE_DATE_EPOCH themselves, it would be awesome if the gem tool/script could define SOURCE_DATE_EPOCH itself.
This proposal would allow every gem acquired from the rubygems repository that was built purely with `gem`, rather than with a distro or other packaging-related tool that defines SOURCE_DATE_EPOCH, to be independently reproduced. This would need to check SOURCE_DATE_EPOCH in the gem command line tool and, if it is not yet defined, define it as the current UTC timestamp.
For example, this is how it is done in Arch Linux's `makepkg`: https://git.archlinux.org/pacman.git/tree/scripts/makepkg.sh.in#n93
This issue is related to:
gem
Related to pull requests:
#2289 #2278
Spec:
https://reproducible-builds.org/specs/source-date-epoch/
Buy-in:
https://reproducible-builds.org/docs/buy-in/
I will abide by the code of conduct.