Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Error: The digest of the reviewed and freshly downloaded crate were different #224

Closed
icefoxen opened this issue Aug 20, 2019 · 13 comments

Comments

@icefoxen
Copy link

commented Aug 20, 2019

On some crates I keep getting the following sort of error when I try to run cargo crev review on them:

Error: The digest of the reviewed and freshly downloaded crate were different; i_bS1vb-271a-02WkBOf7D-yGi-t3fsYJ3kco8FKrNY != v_iKtjR6uB7QFaZ0_sOyYIAcDHidjucAjRGvjXOXqq0; /home/icefox/.cargo/registry/src/github.com-1ecc6299db9ec823/randomize-3.0.0-rc.3 != /home/icefox/.cargo/registry/src/github.com-1ecc6299db9ec823/randomize-3.0.0-rc.3.crev.reviewed

This occurs reliably on randomize and axiom, which are totally unrelated. Not sure what is going on here, since I can review other crates successfully. I've tried nuking the registry folder but the error still occurs.

Backtrace:

stack backtrace:
   0: failure::backtrace::internal::InternalBacktrace::new::hcb3625a47088b6f7 (0x564b51d18dd8)
   1: <failure::backtrace::Backtrace as core::default::Default>::default::hcff7db86cea2a064 (0x564b51d18f90)
   2: cargo_crev::shared::check_package_clean_state::h0942369bdb5012dd (0x564b513edc38)
   3: cargo_crev::review::create_review_proof::h64adffb3bc722a09 (0x564b51363b56)
   4: cargo_crev::shared::handle_goto_mode_command::hc59aa21ade0c759e (0x564b513f4f9f)
   5: cargo_crev::run_command::hfe66a2e803387847 (0x564b51347a6a)
   6: cargo_crev::main::h318057d9c4548283 (0x564b5134cf53)
   7: std::rt::lang_start::{{closure}}::h98c65638f7c344f3 (0x564b513cabf3)
   8: std::rt::lang_start_internal::{{closure}}::hf367a5a6ae52b0ce (0x564b51d65103)
             at src/libstd/rt.rs:49
      std::panicking::try::do_call::h9d779bc5e27e3284
             at src/libstd/panicking.rs:296
   9: __rust_maybe_catch_panic (0x564b51d6effa)
             at src/libpanic_unwind/lib.rs:80
  10: std::panicking::try::h4af22d9cb0afa351 (0x564b51d65ccd)
             at src/libstd/panicking.rs:275
      std::panic::catch_unwind::hbb68a9ada87bfb97
             at src/libstd/panic.rs:394
      std::rt::lang_start_internal::hba9e09d74440ce4d
             at src/libstd/rt.rs:48
  11: main (0x564b5134d062)
  12: __libc_start_main (0x7f8033e4609b)
  13: _start (0x564b51337a6a)
  14: <unknown> (0x0)
  • Which version are you using: 0.8.3
  • How did you install crev: cargo-install cargo-crev
  • What OS/platform are you running on: Linux, Debian 10 Stable
@dpc

This comment has been minimized.

Copy link
Collaborator

commented Aug 20, 2019

My initial guess: this crates might self-modifing (so they add source files in ./src via build.rs) or just our logic of which files to ignore is somehow wrong.

@icefoxen

This comment has been minimized.

Copy link
Author

commented Aug 21, 2019

It's definitely happening with crates that don't self-modify. Example: autocfg.

$ cargo crev goto autocfg
$ cargo crev review
Error: The digest of the reviewed and freshly downloaded crate were different; 
4lc9vVY0Lry7N__5WNBeHUCCfrz3OXbeaGpfr3xXnFk != 
okLeq3B1BjroNk0Ijjg8NnODmj_vJg0hW2K_T8OrM2M; /home/icefox/.cargo/registry/src/github.com-
1ecc6299db9ec823/autocfg-0.1.6 != /home/icefox/.cargo/registry/src/github.com-
1ecc6299db9ec823/autocfg-0.1.6.crev.reviewed

$ find . -type f -exec sha512sum "{}" + > ~/sum1
$ cd ../autocfg-0.1.6.crev.reviewed
$ find . -type f -exec sha512sum "{}" + > ~/sum2
$ diff ~/sum1 ~/sum2
# No output
$ sha512sum -c ~/sum1
./.cargo-ok: OK
./.gitignore: OK
./.travis.yml: OK
./Cargo.toml.orig: OK
./Cargo.toml: OK
./LICENSE-APACHE: OK
./LICENSE-MIT: OK
./README.md: OK
./examples/integers.rs: OK
./examples/paths.rs: OK
./examples/traits.rs: OK
./examples/versions.rs: OK
./src/error.rs: OK
./src/lib.rs: OK
./src/tests.rs: OK
./src/version.rs: OK
./.cargo_vcs_info.json: OK
./Cargo.lock: OK

I've looked at the crev code where it does this check and I don't see what's happening either, though. Is it possible for walkdir to traverse dirs with the same contents in different orders, maybe? 🤔

@dpc

This comment has been minimized.

Copy link
Collaborator

commented Aug 21, 2019

Is it possible for walkdir to traverse dirs with the same contents in different orders, maybe? 

Sound plausible. If you're on it, try adding dbg!() somewhere and compare...

@icefoxen

This comment has been minimized.

Copy link
Author

commented Aug 22, 2019

The .cargo/registry/src/github.com-1ecc6299db9ec823/autocfg-0.1.6.crev.reviewed version includes Cargo.lock in the hash and the non-reviewed version doesn't. Looks like the crate includes its Cargo.lock in the .crate file, whether deliberately or not. Adding Cargo.lock to cargo_crev::shared::cargo_min_ignore_list() makes this error go away, but I'm not sure whether that is the best solution since sometimes a crate's Cargo.lock is important.

@CAD97

This comment has been minimized.

Copy link

commented Aug 22, 2019

Cargo.lock is ignored for libraries, so ignoring the lockfile is probably fine.

@dpc

This comment has been minimized.

Copy link
Collaborator

commented Aug 22, 2019

Oh. I see what is happening. There are two ignore-lists - one (full) for crate source that you reviewed, that often generates additional files, and one (min) for a clean copy of crate where cargo should only have added .cargo-ok marker.

https://github.com/dpc/crev/blob/562f5acbb53fcdbc64ccc3b7e3a33a3b9b422cad/cargo-crev/src/unsorted_mess.rs#L106

For some reason I though that ignoring Cargo.lock in full list was a good idea, because it might have got generated during the review. I've missed the fact that some libraries might actually provide it.

I guess the solution is to remove Cargo.lock from the full list, or... possibbly better... attempt calculating checksum with or without it and be happy if any of the two matches. This way if the user actually did cargo build as a part of the review, the newly generated or modified Cargo.toml will not upset digest checker.

The root of the problem is how easy it is to introduce modifications into source directory, just by reviewing it.

@dpc

This comment has been minimized.

Copy link
Collaborator

commented Aug 22, 2019

Also, maybe we should not fail anything and just display a warning, since this check is nor very reliable. Maybe we shouldn't even bother with that check at all...

@icefoxen

This comment has been minimized.

Copy link
Author

commented Aug 22, 2019

Seems like the goal is to ensure that the hash we provide with the proof is the same as what the crate actually contains. So could we just hash the .crate file instead, or extract a temporary copy into a known location and hash everything that contains?

@BurntSushi

This comment has been minimized.

Copy link

commented Aug 22, 2019

Is it possible for walkdir to traverse dirs with the same contents in different orders, maybe?

Indeed. Under default settings, walkdir yields directory entries in the order provided by the underlying system APIs. If you need consistent ordering, you can use sort_by to sort by file name. It will need to buffer only entries in each individual directory, and not the whole tree.

@dpc

This comment has been minimized.

Copy link
Collaborator

commented Aug 22, 2019

@BurntSushi Right not the paths are sorted manually using BTreeMap https://github.com/dpc/crev/blob/master/recursive-digest/src/lib.rs#L109 . Not the best thing for memory consumption, but easy and works.

@icefoxen

Seems like the goal is to ensure that the hash we provide with the proof is the same as what the crate actually contains.

Indeed. It's the last check to confirm that what has been reviewed is actually the same as a fresh copy.

So could we just hash the .crate file instead,

I don't know what .crate is 🤣 . But according to the spec, crev is supposed to make a recursive digest of content of the crate. That's the only way to ensure the integrity of the actual code, and allow verification, even by generic tools that don't trust and/or understand cargo metadata etc.

or extract a temporary copy into a known location and hash everything that contains?

That's what is being done actually. crev moves the reviewed source code $x to $x.reviewed. Then it makes cargo download a fresh copy of $x. The reason is - some editors leave some *.tmp files and other garbage, and we don't want to hash it.

And then, as a final precaution, we calculate and compare the digest of $x with $x.reviewed. And this is when stuff goes wrong, because in $x.reviewed we ingore Cargo.lock under assumption that it has been manually crated by the reviewer doing cargo build. We also ignore target/ and .cargo-ok which is some weird marker that cargo creates. Unforuntely if the crate already had Cargo.lock this logic is wrong.

I think this whole check is a bit pointless and it just creates problems. When creating the review, we could just hash the fresh copy, and not bother with double checking, under the assumption that cargo will not really lie to us and provide a different copy then the one that reviewer just checked. Right now it's just being paranoid for no good reason. Also, if the text editor left any garbage, then the user is left with this "unclean" error which doesn't help anyone.

@dpc

This comment has been minimized.

Copy link
Collaborator

commented Aug 22, 2019

Alternatively, it could be made into warning for the user that wants to double check what was the difference. In the future we could display a list of changes between two dirs or something.

@icefoxen

This comment has been minimized.

Copy link
Author

commented Aug 23, 2019

I don't know what .crate is

I just meant the autocfg-0.1.6.crate tarball file that cargo downloads from crates.io. Pretty sure it stashes those somewhere, but I don't recall where.

I'll think about this. It would be nice to keep some kind of check so that there's some kind of promise that what you review is the same as what is actually in crates.io.

@dpc dpc closed this in ec5e082 Aug 23, 2019
@dpc

This comment has been minimized.

Copy link
Collaborator

commented Aug 23, 2019

I've implement a fix, and turned it into a warning.

dpc added a commit that referenced this issue Oct 12, 2019
This gives better results in practice, while still not introducing the
problem from #224
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
4 participants
You can’t perform that action at this time.