Checksum failed for nightly channel #524

Closed
dswd opened this issue Jun 10, 2016 · 38 comments

Comments

@dswd

dswd commented Jun 10, 2016

I just moved from multirust to rustup and experience the following problem when installing nightly:

$> rustup update nightly
info: syncing channel updates for 'nightly-x86_64-unknown-linux-gnu'
error: checksum failed, expected: 'afb2094f716fb7167accccdbc9e213ad5e707e7b56b0175aa664108cf994eaa0', calculated: 'b60f814a446940366499a350769dab93bce25e91bb30825eebb3f74820fd8b74'

Stable and beta install just fine. I also removed all remnants of multirust from my system (removed ~/.multirust, cleared $PATH, opened a fresh console) and tried again, but that did not help.

To clarify, I do not have any version of nightly currently installed (it seems to work for people with nightly already installed):

$> rustup show
installed toolchains
--------------------

stable-x86_64-unknown-linux-gnu (default)
beta-x86_64-unknown-linux-gnu

active toolchain
----------------

stable-x86_64-unknown-linux-gnu (default)
rustc 1.9.0 (e4e8b6668 2016-05-18)

I am using the current version of rustup:

$> rustup -V
rustup 0.1.12 (c6e430a 2016-05-12)
@alexcrichton
Member

Someone was also reporting this yesterday where it was a checksum failure in the manifests. Definitely seems fishy, as especially the toolchains should never have bad sha256 sums, right @brson?

As this was happening (the failure), the nightly-dist-packaging-mac, nightly-dist-packaging-win-gnu-32, and nightly-dist-packaging-win-gnu-64 builders were all running. I don't think that should affect this, though, because the new TOML manifest wouldn't have been uploaded so none of the new artifacts should have been used.

I also checked and we had no cloudfront invalidations in flight when this happened.

@dswd
Author

dswd commented Jun 10, 2016

Just a quick update: The expected checksum changed but it still does not match:

$> rustup install nightly
info: syncing channel updates for 'nightly-x86_64-unknown-linux-gnu'
error: checksum failed, expected: '82d81d2254bcb2c088a0adcc06964b530fe34813930ec634119e43c19c43db64', calculated: 'b60f814a446940366499a350769dab93bce25e91bb30825eebb3f74820fd8b74'

@dswd
Author

dswd commented Jun 10, 2016

Ok I finally found the problem! I found a folder called cargo-install.we9Ct4dun7pK in /tmp. Once I removed it, nightly installed just fine.
I will leave this issue open. Please close it if you don't consider this behavior a bug.

@alexcrichton
Member

Whoa! Sounds like we may not be invalidating a cache somewhere perhaps, definitely seems like a bug though.

@kamalmarhubi
Contributor

kamalmarhubi commented Jun 10, 2016

> Someone was also reporting this yesterday where it was a checksum failure in the manifests.

Someone was me, and the checksums in the report today are the same checksums I was seeing yesterday, which seems bad.

@kamalmarhubi
Contributor

Oh cool, I still have the files in /tmp:

/tmp$ cat channel-rust-nightly.toml.sha256 
afb2094f716fb7167accccdbc9e213ad5e707e7b56b0175aa664108cf994eaa0  channel-rust-nightly.toml
/tmp$ sha256sum channel-rust-nightly.toml
b60f814a446940366499a350769dab93bce25e91bb30825eebb3f74820fd8b74  channel-rust-nightly.toml
/tmp$ ls -l channel-rust-nightly.toml*
-rw-r--r-- 1 kamal kamal 78430 Jun  9 15:13 channel-rust-nightly.toml
-rw-r--r-- 1 kamal kamal    92 Jun  8 14:57 channel-rust-nightly.toml.sha256

@alexcrichton
Member

Right! If you remove the old tmp files does an update work for you @kamalmarhubi?

@brson
Contributor

brson commented Jun 11, 2016

> Definitely seems fishy, as especially the toolchains should never have bad sha256 sums, right @brson?

@alexcrichton I believe right now it's possible to get checksum drift on the manifests. The self-update will just warn, but if the manifests don't agree with their .sha256 file it is still an error.

@brson
Contributor

brson commented Jun 11, 2016

I'm surprised @kamalmarhubi has manifests sitting directly in their /tmp folder. rustup should be using TempDir to create subfolders.

I wouldn't expect the contents of /tmp would impact these checksums.

@kamalmarhubi
Contributor

@brson I should have mentioned: those were files I downloaded directly from the dist site while investigating the issue. Basically, I was verifying that the mismatch I saw on Thursday was not from weird local caching.

Things work now, independently of those files being in /tmp. :-)

@kamalmarhubi
Contributor

kamalmarhubi commented Jun 11, 2016

@brson

> I believe right now it's possible to get checksum drift on the manifests.

Is this something that can be fixed? The drift I saw lasted for at least tens of minutes, and up to many hours depending on when OP saw those same checksums. This seems a bit long.

I'm not sure what the serving infrastructure is behind cloudfront. Is it immediately backed by S3? Trying to figure out what's making this hard, and what could be done to change it.

@alexcrichton
Member

Yeah I'm curious how checksum drift is possible in the manifests. I thought it was only in the small window where an invalidation is in flight (or we're in the middle of an upload), but when this happened I confirmed and neither of those was happening.

@brson
Contributor

brson commented Jun 15, 2016

Let's switch both the self-update and manifest checksum checks to use the HTTP e-tags. The download_and_check function will need to change - instead of accepting a hash and downloading the paired .sha256 file it will accept an etag and check it against the HTTP etag header.

To facilitate the upgrade we'll need to maintain both code paths. We'll change the metadata format to store etags, like we currently do with update hashes. If there's an update hash (but no stored etag) for a particular artifact then use the old code path, otherwise use the etag path.

The way we store etags will need to be slightly different than update hashes. Right now we don't store update hashes for the self-updates - we just calculate them from the running bin. So we'll need to store them in a format that supports etags for self updates or channels. Maybe it makes sense to make it a hash table of URLs to etags. I don't think this will require a metadata version bump.
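
A minimal sketch of the fallback rule and the "hash table of URLs to etags" idea from the comment above. Every name here (`UpdateMetadata`, `CheckPath`, keying the legacy hashes by URL) is an assumption made for illustration and is not taken from rustup's code or its real metadata format.

```rust
use std::collections::HashMap;

/// Which verification path to take for an artifact (hypothetical names).
#[derive(Debug, PartialEq)]
enum CheckPath {
    /// Legacy path: download the paired `.sha256` file and compare hashes.
    Sha256File,
    /// Proposed path: compare the stored etag against the HTTP `ETag` header.
    Etag,
}

/// Hypothetical metadata store; not rustup's real on-disk format.
#[derive(Default)]
struct UpdateMetadata {
    /// Legacy update hashes, keyed by artifact URL (an assumption).
    update_hashes: HashMap<String, String>,
    /// Stored etags, keyed by artifact URL, usable for self-updates and channels.
    etags: HashMap<String, String>,
}

impl UpdateMetadata {
    /// Old path only when an update hash exists and no etag is stored yet.
    fn check_path(&self, url: &str) -> CheckPath {
        if self.update_hashes.contains_key(url) && !self.etags.contains_key(url) {
            CheckPath::Sha256File
        } else {
            CheckPath::Etag
        }
    }

    /// True when the server's etag differs from the stored one (or none is stored).
    fn needs_download(&self, url: &str, server_etag: &str) -> bool {
        self.etags.get(url).map(String::as_str) != Some(server_etag)
    }

    /// Remember the etag seen for a successful download.
    fn record_etag(&mut self, url: &str, server_etag: &str) {
        self.etags.insert(url.to_owned(), server_etag.to_owned());
    }
}
```

Once an etag has been recorded for a URL, `check_path` stops returning the legacy path for it, which is one way the two code paths could coexist during the upgrade.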

@gyscos

gyscos commented Jun 24, 2016

It happens here, but only on some computers:

$ rustup update nightly
info: syncing channel updates for 'nightly-x86_64-unknown-linux-gnu'
error: checksum failed, expected: '7270d7499edc955910c9bcc8198584fc8cb6f96914097d30f2138a2fd6957cc1', calculated: 'b22674bc7474c33c30497ac040c6adbceac13a79b55f1fd147323c3a80bdd75f'

While others are fine.
Is there some cache somewhere that I'd need to clean? Or should I just try to re-install rustup entirely?
Note: ~/.multirust/tmp is empty. ~/.multirust/update-hashes only contains stable-x86_64-unknown-linux-gnu.

EDIT: I guess it was a temporary server-side issue? Anyway, I didn't change much, but it works again.

@brson
Contributor

brson commented Jun 28, 2016

@gyscos It's a temporary issue with our servers.

@Limeth

Limeth commented Jul 3, 2016

Also having this issue:
error: checksum failed, expected: 'fd50af837dff039e64c251cb078cc9c7fa311c5e64ed3a373708fee382f0d498', calculated: 'ab23368d08a1fb3c51fc458087076c81ffbaf454a5f8907477c3f8aea0a1d08b'

@Limeth

Limeth commented Jul 3, 2016

Removing /tmp/cargo-install.* fixes the problem. Thanks, @dswd!

@dcuddeback

Just ran into this on FreeBSD as well:

$ rustup toolchain install nightly
info: syncing channel updates for 'nightly-x86_64-unknown-freebsd'
error: checksum failed, expected: 'ab23368d08a1fb3c51fc458087076c81ffbaf454a5f8907477c3f8aea0a1d08b', calculated: '8392b780a68c5e3a95d17e6c64120087b084de47f7fb0e2c4caf3a700959fabd'

@typedrat

typedrat commented Jul 3, 2016

I've got it on OS X as well. There's nothing cargo-y in /tmp.

@AtheMathmo
Contributor

Another +1 -

info: syncing channel updates for 'nightly-x86_64-unknown-linux-gnu'
error: checksum failed, expected: 'ab23368d08a1fb3c51fc458087076c81ffbaf454a5f8907477c3f8aea0a1d08b', calculated: '8392b780a68c5e3a95d17e6c64120087b084de47f7fb0e2c4caf3a700959fabd'

@typedrat

typedrat commented Jul 4, 2016

Works fine now, at least.

@brson
Contributor

brson commented Jul 5, 2016

Another temporary remediation we could make here: like with self-updates, when we see that the manifest checksum doesn't match, we can print a warning that the update is not yet available, try again later. It sucks, but at least it would be less alarming, and people would know what to do about it.

@AtheMathmo
Contributor

@brson - if this is no more involved than modifying this line I'm happy to make the change.

@brson
Contributor

brson commented Jul 5, 2016

@AtheMathmo it's only slightly more complicated than that. That error is emitted in several places and only one is causing this issue. [In dl_v2_manifest](https://github.com/rust-lang-nursery/rustup.rs/blob/66af8c148566fb0abc15506558aa209c1d82a13f/src/rustup-dist/src/dist.rs#L449) you could check that the error is `ChecksumFailed`, and if so emit an info diagnostic like [this one](https://github.com/rust-lang-nursery/rustup.rs/blob/66af8c148566fb0abc15506558aa209c1d82a13f/src/rustup-cli/self_update.rs#L1126) (which interestingly seems to mention the wrong bug number!), then return `Ok(None)` to indicate there is no available update. Emit the diagnostic using `download.notify_handler`.
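
The suggestion above can be sketched roughly as follows. This is not rustup's code: `DistError`, `Notification::ManifestChecksumFailedHack`, and `handle_manifest_download` are hypothetical stand-ins, and the real error and notification types have many more variants. The sketch only shows the shape of the change, which maps a checksum mismatch on the manifest to a notification plus `Ok(None)` and passes every other error through.

```rust
/// Hypothetical error type; stands in for the real error enum.
#[derive(Debug, PartialEq)]
enum DistError {
    ChecksumFailed { expected: String, calculated: String },
    Other(String),
}

/// Hypothetical notification; stands in for a new `Notification` variant.
#[derive(Debug, PartialEq)]
enum Notification {
    ManifestChecksumFailedHack,
}

/// Turns a manifest checksum mismatch into "no update available".
/// `result` is whatever the manifest download step produced.
fn handle_manifest_download<M>(
    result: Result<M, DistError>,
    notify_handler: &mut dyn FnMut(Notification),
) -> Result<Option<M>, DistError> {
    match result {
        Ok(manifest) => Ok(Some(manifest)),
        Err(DistError::ChecksumFailed { .. }) => {
            // Tell the user the update is not available yet instead of failing hard.
            notify_handler(Notification::ManifestChecksumFailedHack);
            Ok(None)
        }
        // Any other failure is still a real error.
        Err(other) => Err(other),
    }
}
```

Keeping the special case at the manifest call site, and not where the error is created, leaves the other places that emit the same error untouched.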

@AtheMathmo
Contributor

I'm happy to make this change in a few hours. Thanks for the info!

@AtheMathmo
Contributor

@brson - could you give a little more information about emitting the info diagnostic? Should I add a new value to the Notification enum - ChecksumFailed?

@brson
Contributor

brson commented Jul 6, 2016

@AtheMathmo yes, that enum will need a new variant. I'd name it something more indicative of its hacky nature, like ManifestChecksumFailedHackOmgThisSucks.

@AtheMathmo
Contributor

@brson - looks good, thanks! :D

brson closed this as completed in #562 Jul 8, 2016
brson reopened this Jul 8, 2016
@brson
Contributor

brson commented Jul 8, 2016

@AtheMathmo made the quick fix. Thanks! Leaving this open for the full solution.

@alexcrichton
Member

@brson something fishy seems to be happening right now? On AppVeyor I'm getting this error currently, but on a build scheduled half an hour earlier the checksum error didn't happen.

There's no way we can just chalk this all up to cloudfront, right?

@brson
Contributor

brson commented Jul 31, 2016

@alexcrichton yes, I agree that observations don't support the hypothesis that this is just due to drift in cloudfront invalidations.

brson removed the help wanted label Jul 31, 2016
@tshepang
Member

tshepang commented Aug 4, 2016

Am getting this:

info: syncing channel updates for 'nightly-x86_64-unknown-linux-gnu'
warning: update not yet available, sorry! try again later
error: checksum failed, expected: 75b220d4bdf9c4d670d4787e98de8444a7641a14cc82c898db2a36138248bb4', calculated: '8f10396e1feee2e8f69f6d1406ce5750cc0b3291924b0e11b0cac75fb71bbc70'
rustup: command failed: /tmp/tmp.q9yatfTVnq/rustup-init

@tshepang
Member

tshepang commented Aug 4, 2016

Fixed itself

@brson
Contributor

brson commented Sep 14, 2016

In the discussion of signature validation I mentioned in passing my current preferred solution.

Basically, we add another layer of indirection: a "current.toml" file that acts like a symlink. It contains two or three keys: one is the name of the current archive directory, the other is a checksum of the previous field, the possible third is a signature of that same field. We do this for both rustup and the rust manifests. I do plan to get around to this soon.

I prefer this to attempting to wrangle cloudfront/s3 into doing something approximating symlinks because it can be done without depending on specifics of the rust distribution infrastructure.
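
A rough sketch of the "current.toml" indirection described above, under stated assumptions: the struct, its field names, and `manifest_url` are hypothetical, no file layout is implied, and the digest function is passed in so the sketch needs no crypto crate (a real client would presumably use SHA-256).

```rust
/// Hypothetical in-memory form of the proposed "current.toml" pointer file.
struct CurrentPointer {
    /// Name of the current archive directory.
    archive_dir: String,
    /// Checksum of `archive_dir`, as text.
    checksum: String,
    /// Possible third key: a signature of `archive_dir` (not checked in this sketch).
    signature: Option<String>,
}

impl CurrentPointer {
    /// Resolves the URL of `file` inside the current archive directory,
    /// but only if the pointer's own checksum matches its directory name.
    fn manifest_url(
        &self,
        base: &str,
        file: &str,
        digest: impl Fn(&str) -> String,
    ) -> Option<String> {
        if digest(&self.archive_dir) != self.checksum {
            return None;
        }
        Some(format!(
            "{}/{}/{}",
            base.trim_end_matches('/'),
            self.archive_dir,
            file
        ))
    }
}
```

In this shape the directory name and its checksum arrive in a single download, so a client never has to pair two separately fetched files.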

@Diggsey
Contributor

Diggsey commented Sep 14, 2016

@brson Will that actually fix the problem though? What if "current.toml" is updated, but rustup can't access the files it points to because it doesn't see them yet? Maybe that can't happen, but unless we know exactly what's causing the current failures it seems a bit premature to switch to this method with no guarantee it will help.

@brson
Contributor

brson commented Oct 11, 2016

Well, I think it is likely to fix the problem. Even though some of the windows of time over which the mismatches persist seem to be longer than we'd expect given what we know about the CDN, it does seem that the problem is due to there being two files that are overwritten that must be in agreement.

brson added a commit to brson/rustup.rs that referenced this issue Oct 14, 2016
Temporary hack to avoid issues with consistent checksumming on the
CDN.

Issue rust-lang#524
brson added a commit to brson/rustup.rs that referenced this issue Oct 14, 2016
This changes the update process in the following ways: the current
version is read from the server at /rustup/stable-release.toml; if the
version is different from the running version then rustup downloads
the new release from the archives at /rustup/archive/$version/.

Fixes rust-lang#524
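
The decision described in this commit message could look something like the sketch below. The function name and the `server_root` parameter are assumptions for illustration; only the `/rustup/archive/$version/` path shape comes from the message itself, and fetching or parsing `/rustup/stable-release.toml` is left out.

```rust
/// Returns the archive URL to download the new release from,
/// or `None` when the running version already matches the released one.
/// `released_version` is assumed to have been read from stable-release.toml.
fn self_update_archive_url(
    server_root: &str,
    running_version: &str,
    released_version: &str,
) -> Option<String> {
    if running_version == released_version {
        None
    } else {
        Some(format!(
            "{}/rustup/archive/{}/",
            server_root.trim_end_matches('/'),
            released_version
        ))
    }
}
```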
brson added a commit to brson/rustup.rs that referenced this issue Oct 21, 2016
This changes the update process in the following ways: the current
version is read from the server at /rustup/stable-release.toml; if the
version is different from the running version then rustup downloads
the new release from the archives at /rustup/archive/$version/.

Fixes rust-lang#524
@Seeker14491
Contributor

I've been unable to update past the 2016-11-17 nightly due to this bug. The problem doesn't appear to be transient; I tried updating a few days ago and got the same error. I did update my Xubuntu install from 14.xx -> 16.10 at around the same time this issue started occurring. Perhaps that has something to do with this issue.

[rust]$ rustup --version
rustup 0.6.5 (88ef618 2016-11-04)
[rust]$ rustup show
Default host: x86_64-unknown-linux-gnu

installed targets for active toolchain
--------------------------------------

x86_64-pc-windows-gnu
x86_64-unknown-linux-gnu

active toolchain
----------------

nightly-x86_64-unknown-linux-gnu (default)
rustc 1.15.0-nightly (ba872f270 2016-11-17)

[rust]$ rustup update
info: syncing channel updates for 'nightly-x86_64-unknown-linux-gnu'
info: update not yet available, sorry! try again later
error: checksum failed, expected: 'afac5aec3145de8a551e930980045d4e7867c83aa5909246087e451bf0038aed', calculated: 'd411ebefe2ceb819fc78423e0cc6425b05d84b5c370ad9aeb19bffea9949dd53'
info: checking for self-updates

  nightly-x86_64-unknown-linux-gnu update failed - rustc 1.15.0-nightly (ba872f270 2016-11-17)

nodakai pushed a commit to nodakai/rustup.rs that referenced this issue Apr 23, 2017
Temporary hack to avoid issues with consistent checksumming on the
CDN.

Issue rust-lang#524
nodakai pushed a commit to nodakai/rustup.rs that referenced this issue Apr 23, 2017
This changes the update process in the following ways: the current
version is read from the server at /rustup/stable-release.toml; if the
version is different from the running version then rustup downloads
the new release from the archives at /rustup/archive/$version/.

Fixes rust-lang#524
@may1393

may1393 commented Jan 6, 2021

@
