SHA256 checksum changed for v0.26.0 #4343
Comments
I don't know. We don't create these archives ourselves, they're automatically generated by GitHub. Presumably the results of these are cached, but I don't think that there's a guarantee there. That means that they're not stable. The tar archive itself should be reproducible, but gzip does not have any guarantees of determinism. It uses a timestamp in the header, for example, and of course compression levels will cause wildly different outputs. The tag for 0.26.0 is correct, it is (and has always been) If you want to continue downloading the archive, I would encourage you to take the shasum of the But as to why this changed suddenly, I don't know. I suspect it was an otherwise harmless cache invalidation. But I don't like that this happened. (It would be ideal if GitHub's codeload would allow for repeatable builds here, and strip the timestamp out.) As for how to make sure this doesn't happen again, I'm not sure yet. I think that we can upload our own artifacts when we do a release. But I'm not sure that disables the automatic /cc @carlosmn in case he has more insight into codeload and friends |
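The point about gzip can be illustrated with a small sketch. It assumes nothing about how GitHub's codeload actually invokes gzip; it only shows that the same payload compressed with a different header timestamp yields different bytes (and so a different SHA256), while the decompressed content stays identical. The helper name `gzip_bytes` and the sample payload are made up for the illustration.

```python
import gzip
import hashlib
import io


def gzip_bytes(payload: bytes, mtime: int, level: int = 9) -> bytes:
    """Compress payload with an explicit gzip header timestamp and level."""
    buf = io.BytesIO()
    with gzip.GzipFile(fileobj=buf, mode="wb", compresslevel=level, mtime=mtime) as gz:
        gz.write(payload)
    return buf.getvalue()


# Stand-in for an unchanging tar stream (hypothetical data).
payload = b"identical tar stream\n" * 1000

a = gzip_bytes(payload, mtime=0)
b = gzip_bytes(payload, mtime=1)  # only the header timestamp differs
c = gzip_bytes(payload, mtime=0)  # same inputs as `a`

same_hash_ab = hashlib.sha256(a).hexdigest() == hashlib.sha256(b).hexdigest()
same_hash_ac = hashlib.sha256(a).hexdigest() == hashlib.sha256(c).hexdigest()
same_content = gzip.decompress(a) == gzip.decompress(b)

print("a vs b checksum equal:", same_hash_ab)
print("a vs c checksum equal:", same_hash_ac)
print("a vs b content equal: ", same_content)
```

Pinning the timestamp makes the output repeatable within one environment, but a different compression level or a different gzip implementation could still change the bytes.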
Tarballs generated by git have never been guaranteed to have the same checksum as any other tarball. git sometimes brings in fixes for path handling e.g. with unicode or to make it compatible with versions of If your system depends on tarballs which are generated by git on the fly, it has always been prone to arbitrary checksum mismatches. This bites each project which does this sooner or later. It's bitten the Linux kernel and Homebrew in the past. The only way to get consistent checksums is to generate the archive once and upload it somewhere. This is where e.g. GitHub's releases come in, which let you upload your own artifacts. I believe the Homebrew project uploads its source archives to a third party. Distributions such as Debian also upload tarballs to their own servers rather than rely on the project itself hosting them. As to why this has happened now, GitHub recently completed an OS upgrade on its fileservers, but the most likely cause here is a bugfix to git so the generated tarballs are compatible with more versions of But even if you store this later checksum, it might change depending on GitHub OS or package upgrade schedule, which means that the checksum might change depending on where in the world you are and which machine happens to serve your request. |
I encountered a new hash for the v0.24.1 release tarball. I compared the contents of a newly-downloaded tarball with the contents of a version I had cached, and found that the release appears to have been updated to include the contents of Diff: https://gist.github.com/jboning/cef81b704895e6a224845e0075deabb7 |
Retracted. My cache must not have included the |
I validated that the contents are the same using the tarball in my bazel cache. Apparently github changed the way they generate release tarballs: libgit2/libgit2#4343 (comment)
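A check like the one described (same contents, different archive bytes) might look roughly like the sketch below. It is not the procedure the commenter used. It compares only the contents of regular files and ignores tar metadata such as timestamps and permissions. `make_tarball` is a demo helper that builds tiny in-memory archives, and the file names are hypothetical.

```python
import hashlib
import io
import tarfile


def content_digests(tar_gz_bytes: bytes) -> dict:
    """Map each regular file's path to the SHA-256 of its contents."""
    digests = {}
    with tarfile.open(fileobj=io.BytesIO(tar_gz_bytes), mode="r:gz") as tar:
        for member in tar:
            if member.isfile():
                data = tar.extractfile(member).read()
                digests[member.name] = hashlib.sha256(data).hexdigest()
    return digests


def make_tarball(files: dict, mtime: int) -> bytes:
    """Build a small .tar.gz in memory (demo helper only)."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w:gz") as tar:
        for name, data in files.items():
            info = tarfile.TarInfo(name)
            info.size = len(data)
            info.mtime = mtime  # differing metadata changes the archive bytes
            tar.addfile(info, io.BytesIO(data))
    return buf.getvalue()


files = {"demo-1.0/README": b"hello\n", "demo-1.0/src/main.c": b"int main(void){return 0;}\n"}
old = make_tarball(files, mtime=1)
new = make_tarball(files, mtime=2)

archives_identical = old == new
contents_identical = content_digests(old) == content_digests(new)

print("archive bytes identical:", archives_identical)
print("file contents identical:", contents_identical)
```

This confirms that a changed checksum is benign, but it does not make the checksum itself trustworthy again; the pinned value still has to be updated or replaced with a stable artifact.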
I take it that although libgit2 emits release tags, the team does not upload their own tar.xz, but instead relies on the ones automatically created by git? Maybe something can be done there? Thank you for the swift response. |
Yes, but again, I'm not sure that disables the automatic .tar.gz and .zip creation. Worse than having one that has unreliable signatures is two with different signatures (one unreliable). |
There was a change on GitHub side: libgit2/libgit2#4343 (comment) Let us update the SHA256 for now, and keep working. Fixes mozilla/DeepSpeech#827
* libgit2/libgit2#4343 Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
@carlosmn No that is not the case. This is a very breaking change for us (Homebrew): Homebrew/homebrew-core#18044 and we already have a bunch of support issues coming in as a result. |
Right. It's completely critical that we be able to serve up checksums that are reliable from the time we release software and archive it until, well, forever. As an end user, I simply don't care what Git does here. Ultimately this is a breaking change for GitHub. You're leaking the abstractions now. I guess we're going to have to create our own archives now when we release to avoid this. But:
|
I'm not sure that's entirely true. We've gotten the opposite-direction bug reports, too: why does
Is there a reason that uploading a tarball to the releases page isn't a good solution for the project? That is an extra step you have to take, but it's free. I absolutely agree that the extra step is a pain (for your project, and especially for projects like Homebrew which have to rely on the upstream projects to take the step). In an ideal world GitHub would automate that step away by caching the first tarball generated for any tag and keeping it forever, but in the meantime I think it's still possible to get what you want. |
I still haven't gotten an answer: if I upload a |
No it will not. |
Those are completely different things. The on-the-fly generation isn't about tags, it's the HTTP, better-cacheable version of |
Not trying to be an asshole: I literally don't know what anything you just said means. My point is: if I make a release, upload a IOW, why would I bother uploading anything to GH for a release? |
They end up with different paths. Github's auto-generated release artifacts are under |
Here are some examples from hubble: Auto-generated from Github: https://github.com/hubblestack/hubble/archive/v2.2.1.tar.gz Our manually-generated package: https://github.com/hubblestack/hubble/releases/download/v2.2.1/hubblestack-2.2.1-1.el7.x86_64.rpm Note the path differences. They're from the same release You definitely don't need to host yourself, but if Github won't guarantee that their release artifacts will checksum the same forever, you'll need to just download their artifact once, and then upload it as your own artifact. |
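The path difference in those two URLs can be checked mechanically, for instance by a packaging tool that wants to warn when a recipe pins an auto-generated archive. The sketch below is based only on the two URL shapes quoted above; `artifact_kind` is a hypothetical helper, and GitHub may serve other URL forms that it does not recognise.

```python
from urllib.parse import urlparse


def artifact_kind(url: str) -> str:
    """Classify a GitHub download URL by its path layout (heuristic)."""
    parts = urlparse(url).path.strip("/").split("/")
    # parts[0] is the owner, parts[1] the repository, the rest distinguishes the kind.
    if len(parts) >= 4 and parts[2] == "archive":
        return "auto-generated"
    if len(parts) >= 5 and parts[2:4] == ["releases", "download"]:
        return "uploaded"
    return "unknown"


auto_url = "https://github.com/hubblestack/hubble/archive/v2.2.1.tar.gz"
uploaded_url = (
    "https://github.com/hubblestack/hubble/releases/download/"
    "v2.2.1/hubblestack-2.2.1-1.el7.x86_64.rpm"
)

print(artifact_kind(auto_url))
print(artifact_kind(uploaded_url))
```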
I didn't mean whether it's identifiable by the URL. I meant whether it's identifiable to the end user looking at the page. It's not clear to me why I would choose one |
If by "end users" you mean people like me who package your software, then I'm fine with this proposed workaround. I see no issue with that, especially if you expressly specify in the release notes to pick the pre-packed .tar.gz instead of the generic "Source code (tar.gz)" to get the consistent checksums. You could even spell out the SHA256 yourself as part of the release notes. |
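On the packager's side, checking a download against a SHA256 published in release notes could be sketched as follows. The file name and the archive bytes are stand-ins written to a temporary directory; nothing here downloads from GitHub, and `sha256_of_file` and `matches_pinned` are illustrative names, not part of any existing tool.

```python
import hashlib
import tempfile
from pathlib import Path


def sha256_of_file(path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large archives need not fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_pinned(path, pinned_hex: str) -> bool:
    """Compare a file's SHA-256 with a pinned hex digest."""
    return sha256_of_file(path) == pinned_hex.strip().lower()


with tempfile.TemporaryDirectory() as tmp:
    archive = Path(tmp) / "example-1.0.tar.gz"  # hypothetical file name
    archive.write_bytes(b"stand-in archive bytes")
    pinned = sha256_of_file(archive)  # the value a maintainer would publish
    ok_before = matches_pinned(archive, pinned)

    # Simulate the archive being regenerated with different bytes.
    archive.write_bytes(b"regenerated archive bytes")
    ok_after = matches_pinned(archive, pinned)

print("matches before regeneration:", ok_before)
print("matches after regeneration: ", ok_after)
```

The check is only as stable as the artifact it is pinned to, which is why the thread keeps coming back to uploading the archive once instead of relying on on-the-fly generation.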
If GitHub will provide hash sums for one |
How about having a project option to disable the auto-generated tarballs, to avoid confusing/misleading end-users? The last time I checked there wasn't any way to do that. I know this has previously caused support problems for some projects, e.g. Tahoe-LAFS. |
The checksum also changed for 0.25.1. How about moving away from the proprietary github platform? |
No. |
The proximate cause of the checksum changes is that Git was upgraded on the GitHub backend. Whether GitHub itself is "proprietary" is not exactly … relevant. |
@ilovezfs the proximate cause of the checksum changes is that Github regenerates archives on the fly instead of pegging them to the tag/commit. Why they would do this I don't understand (it's wasting resources and causing issues like this one). I didn't find the GitHub project where to report that issue; oh, wait, they don't have one! |
Indeed. I'm going to leave this open until GitHub supports one of:
Obviously, though, no, we are not going to move off GitHub, one of the two biggest supporters of the project. That someone would be so rude as to suggest such a thing truly boggles my mind. |
Hopefully this. |
As a downstream packager of libgit2 and many other programs, I can say that people in my position can tell the difference between the snapshots that are automatically generated by GitHub per-tag, and the release archives (tar or zip) that each project's maintainer uploads and distributes via the release page. When you first visit GitHub as a beginner, it can be confusing to know which one to download, that's true. But you quickly learn that the generically named "Source code" download is something that the project's maintainers can't disable and don't recommend to be used. In many cases, the "real" release archive includes things beyond what's checked in to revision control, such as generated build scripts (autotools etc), extra documentation, test data, and so on. So, my project has a policy to always prefer the "real" release over the auto-generated snapshot, for the reasons listed above, as well as the bit-reproducibility problem that motivated this bug report. |
I'm afraid of sounding like a smart-arse, nevertheless:
This already is possible using Travis-CI whenever a new tag is pushed, as documented here, scroll down to "Uploading Multiple Files". So the If you prefer the archives to be uploaded by a developer as part of the release process, a tool like ghrelease can use the Github release API. This even allows you to sign the archives.
And IMHO there is no need for removing the automatically generated ones (and AFAIK this is not possible). What matters for a project is having some reproducible source; it does not matter if other, non-deterministic sources exist as well. You may want to have a look at how PyInstaller handles this: We simply attach the archives and the pgp-signatures to the release (example).
IMHO there is no need for this as long as deterministic archives are available. |
Off-topic:
Well, Gitlab has an integrated CI/CD, which has some advantages over the separate GitHub–Travis-CI solution. |
So when I said "damn that's rude" you decided to double down on it? |
Like I said, I am keeping this open since it is an actual issue that hasn't yet been resolved, but obviously there's no need for further discussion on it, so it's now been locked. |
Reproduction steps
I am packaging libgit2 as part of a build system.
I am downloading the v0.26.0 release tarball from here: https://github.com/libgit2/libgit2/archive/v0.26.0.tar.gz.
Expected behavior
Until today, the sha256sum for the tarball was 4ac70a2bbdf7a304ad2a9fb2c53ad3c8694be0dbec4f1fce0f3cd0cda14fb3b9.
I would have expected it not to change.
Actual behavior
Since today, the sha256sum for the tarball is 6a62393e0ceb37d02fe0d5707713f504e7acac9006ef33da1e88960bd78b6eac.
What is happening?
Version of libgit2 (release number or SHA1)
https://github.com/libgit2/libgit2/archive/v0.26.0.tar.gz