
Extend scope #4

Open
davidak opened this issue Jul 20, 2019 · 10 comments

@davidak
Collaborator

davidak commented Jul 20, 2019

This project is amazing! It shows how far along we are in the goal of making NixOS reproducible.

At least for nixos-unstable's iso_minimal job for x86_64-linux.

It would be nice to have this metric for the stable releases. We might get some headlines for NixOS and reproducibility.

It would also be nice to extend the scope of this "experiment" and integrate it into NixOS. I'm not sure how this works because there is no documentation (#5), but you probably compare hashes? So Hydra should calculate these hashes and save them in the metadata of all builds. Then we need some distributed computing infrastructure where the community can run builds on their machines and submit their hashes. A central instance can then calculate, for each package with enough data, how reproducible it is. We can use that data on our website and in the package search.

I know BOINC for distributed computing tasks, but we might find a less complex solution.

I would like to just execute one command to build packages from a job set or channel I like and submit the hashes.

@grahamc
Owner

grahamc commented Jul 20, 2019

The current architecture is very simple:

Given an input expression:

let instruction = BuildRequest::V1(BuildRequestV1 {
    nixpkgs_revision: env::args().nth(1).unwrap(),
    nixpkgs_sha256sum: env::args().nth(2).unwrap(),
    result_url: "bogus".into(),
    subsets: vec![(
        Subset::NixOSReleaseCombined,
        Some(vec![vec![
            "nixos".into(),
            "iso_minimal".into(),
            "x86_64-linux".into(),
        ]]),
    )]
    .into_iter()
    .collect(),
});

We evaluate the derivation:

r13y.com/src/bin/check.rs

Lines 87 to 109 in f144539

let eval = Command::new("nix-instantiate")
    // .arg("--pure-eval") // See evaluate.nix for why this isn't passed yet
    .arg("-E")
    .arg(include_str!("./evaluate.nix"))
    .arg("--add-root")
    .arg(&drv)
    .arg("--indirect")
    .args(&[
        "--argstr",
        "revision",
        &job.nixpkgs_revision,
        "--argstr",
        "sha256",
        &job.nixpkgs_sha256sum,
        "--argstr",
        "subfile",
        &format!("{}", path.display()),
        "--argstr",
        "attrsJSON",
        &serde_json::to_string(&attrs).unwrap(),
    ])
    .output()
    .expect("failed to execute process");

and then query all of the dependencies of that derivation:

r13y.com/src/bin/check.rs

Lines 118 to 126 in f144539

let query_requisites = Command::new("nix-store")
    .arg("--query")
    .arg("--requisites")
    .arg(&drv)
    .output()
    .expect("failed to execute process");
for line in query_requisites.stderr.lines() {
    info!("stderr: {:?}", line);
}

Each dependency gets added to a queue for verification:

r13y.com/src/bin/check.rs

Lines 127 to 132 in f144539

for line in query_requisites.stdout.lines() {
    debug!("stdout: {:?}", &line);
    if let Ok(line) = line {
        if line.ends_with(".drv") {
            if !skip_list.contains(&line) {
                to_build.insert(line.into());

Then, for each dependency we build it twice. First with a fairly standard set of build options, where it will almost certainly be fetched from the binary cache:

r13y.com/src/bin/check.rs

Lines 192 to 199 in f144539

let first_build = Command::new("nix-store")
    .arg("--add-root")
    .arg(&gc_root_a)
    .arg("--indirect")
    .arg("--realise")
    .arg(&drv)
    .arg("--cores")
    .arg(format!("{}", maximum_cores_per_job))

Then we build it a second time with --check, forcing the local machine to build it again:

r13y.com/src/bin/check.rs

Lines 223 to 233 in f144539

debug!(
    "(thread-{}) Performing --check build: {:#?}",
    thread_id, drv
);
let second_build = Command::new("nix-store")
    .arg("--realise")
    .arg(&drv)
    .arg("--cores")
    .arg(format!("{}", maximum_cores_per_job))
    .arg("--check")
    .arg("--keep-failed")

Nix will then exit 0 if the output matches the original build bit-for-bit:

if second_build.success() {

Otherwise the build is not reproducible, and we do some fancier things:

r13y.com/src/bin/check.rs

Lines 260 to 276 in f144539

// For each output, look for a .check directory.
// If we find one, we want to:
//
// 1. add it to the store right away -- .check directories
// aren't actually store paths and cannot be saved from
// being garbage collected
//
// 2. create a GC root for what we just added to the store
// see: https://github.com/NixOS/nix/issues/2676
//
// 3. create a NAR for the .check store path
//
// 4. create a NAR for the output store path
//
// 5. hash the two NARs
//
// 6. return a build result with the two hashes

Notably, we take the original build output and the non-matching build output, export each as a NAR, and copy them to a content-addressed store (CAS):

r13y.com/src/bin/check.rs

Lines 295 to 305 in f144539

let (path_stream, mut path_wait) =
    store.export_nar(&path).unwrap();
let (checked_stream, mut checked_wait) =
    store.export_nar(&checked).unwrap();
hashes.insert(
    output.to_string(),
    (
        cas.from_read(path_stream).unwrap().into(),
        cas.from_read(checked_stream).unwrap().into(),
    ),

The result of this process is a JSON document containing a list of builds: for each build, either an object saying the path was reproducible, or an object which includes the derivation and the hashes of the two NARs.


A separate process, report.rs, starts in a similar way. Given an expression:

let instruction = BuildRequest::V1(BuildRequestV1 {
    nixpkgs_revision: env::args().nth(1).unwrap(),
    nixpkgs_sha256sum: env::args().nth(2).unwrap(),
    result_url: "bogus".into(),
    subsets: vec![(
        Subset::NixOSReleaseCombined,
        Some(vec![vec![
            "nixos".into(),
            "iso_minimal".into(),
            "x86_64-linux".into(),
        ]]),
    )]
    .into_iter()
    .collect(),
});

it evaluates it and finds the list of derivations and dependencies (todo: refactor :)):

r13y.com/src/bin/report.rs

Lines 64 to 113 in f144539

info!("Evaluating {:?} {:#?}", &subset, &attrs);
let eval = Command::new("nix-instantiate")
    // .arg("--pure-eval") // See evaluate.nix for why this isn't passed yet
    .arg("-E")
    .arg(include_str!("./evaluate.nix"))
    .arg("--add-root")
    .arg(&drv)
    .arg("--indirect")
    .args(&[
        "--argstr",
        "revision",
        &job.nixpkgs_revision,
        "--argstr",
        "sha256",
        &job.nixpkgs_sha256sum,
        "--argstr",
        "subfile",
        &format!("{}", path.display()),
        "--argstr",
        "attrsJSON",
        &serde_json::to_string(&attrs).unwrap(),
    ])
    .output()
    .expect("failed to execute process");
for line in eval.stderr.lines() {
    info!("stderr: {:?}", line)
}
for line in eval.stdout.lines() {
    debug!("stdout: {:?}", line)
}
let query_requisites = Command::new("nix-store")
    .arg("--query")
    .arg("--requisites")
    .arg(&drv)
    .output()
    .expect("failed to execute process");
for line in query_requisites.stderr.lines() {
    info!("stderr: {:?}", line);
}
for line in query_requisites.stdout.lines() {
    debug!("stdout: {:?}", &line);
    if let Ok(line) = line {
        if line.ends_with(".drv") {
            to_build.insert(line.into());
        }
    }
}
}

and create a report of all the paths:

r13y.com/src/bin/report.rs

Lines 136 to 144 in f144539

BuildStatus::Reproducible => {
    reproducible += 1;
}
BuildStatus::FirstFailed => {
    first_failed.push(response.drv);
}
BuildStatus::SecondFailed => {
    unchecked += 1;
}

Given an unreproducible path, it takes the two NARs from the CAS, extracts them to the Nix store and runs Diffoscope on them:

r13y.com/src/bin/report.rs

Lines 147 to 169 in f144539

unreproducible_list.push(format!("<li><code>{}</code><ul>", response.drv));
for (output, (hash_a, hash_b)) in hashes.iter() {
    if let Some(output_path) = parsed_drv.outputs().get(output) {
        let dest_name = format!("{}-{}.html", hash_a, hash_b);
        let dest = diff_dir.join(&dest_name);
        if dest.exists() {
            // ok
        } else {
            println!(
                "Diffing {}'s {}: {} vs {}",
                response.drv, output, hash_a, hash_b
            );
            let cas_a = read_cas.str_to_id(hash_a).unwrap();
            let cas_b = read_cas.str_to_id(hash_b).unwrap();
            let savedto = diffoscope
                .nars(
                    &output_path.file_name().unwrap().to_string_lossy(),
                    &cas_a.as_path_buf(),
                    &cas_b.as_path_buf(),
                )

From there, a simple HTML report is generated:

r13y.com/src/bin/report.rs

Lines 194 to 204 in f144539

html.write_all(
    format!(
        include_str!("./template.html"),
        reproduced = reproducible,
        unchecked = unchecked,
        total = total,
        percent = format!("{:.*}%", 2, 100.0 * (reproducible as f64 / total as f64)),
        revision = job.nixpkgs_revision,
        now = Utc::now().to_string(),
        unreproduced_list = unreproducible_list.join("\n")
    )


The combination of this instruction, the JSON document, and the CAS store is explicitly designed around being able to distribute the work to many builders, to make it easy to grow this project later. I would love help with this. You can see the data types I thought about, which are designed with this in mind, here: https://github.com/grahamc/r13y.com/blob/f144539ae80108bd8d7bf243d67011ca63198dce/src/messages.rs

One thing these data types assume is that every builder would randomize the list of derivations and try to build all of them, the idea being that having many builders try the same thing makes us more confident about the reproducibility. Now I wonder if we would want something a bit different, to allow greater coverage. My current thinking is that the central server would publish the same instruction, but also publish statistics about how many times each derivation has been built. That way builders can prioritize low-count derivations first.

Does this help?

@grahamc
Owner

grahamc commented Jul 20, 2019

One challenge which might come up by expanding scope is knowing how to visualize the list of brokenness. We shouldn't try to solve that until it is actually a problem, though. Just thinking about it, as it was a tough problem for

@grahamc
Owner

grahamc commented Jul 20, 2019

Some more of what I was thinking. I have no strong opinions on how to implement this, and would love any help anyone wants to provide.

CAS

I use a CAS store for the NARs because I expect builders would upload the NARs to something like S3, and it would be good for them not to all re-upload the one they fetched from the cache. Even better, it avoids uploading duplicate NARs when the unreproducibility comes from something like the current date.

You can see some existing (currently useless) code around this. For example, the "report_url".

Diffoscope

Diffoscope can take many gigabytes of RAM, especially when comparing ISOs and mksquashfs outputs. Ideally, the final architecture would run the diffoscope process on one system, upload the result to the cache, and have the website link to that cache.

People running builds should not be expected to actually run the diffoscope step.

@davidak
Collaborator Author

davidak commented Jul 20, 2019

Great ideas.

I don't understand all the details of the Rust code, but I got a good picture of the project. A readme should not go into such detail; such information is probably better kept as comments in the code itself.

Most visitors or users are probably not interested in implementation details.

Nix will then exit 0 if the output matched bit-for-bit the original build:

So the comparison is done by Nix. For a visitor not familiar with Nix or NixOS, it would be good to note here how it's done, and maybe link to the Nix manual.

I think a SHA-256 hash of the build result path?

we build it twice

It would be good to just get a hash from the Hydra build and compare against that. When the hash is not identical, we can still fetch the path and compare with diffoscope.

One challenge which might come up by expanding scope is knowing how to visualize the list of brokenness.

That alone is a great task for someone who is a specialist in the field of data visualisation.

But that's also a task any distro active in the reproducibility challenge faces, so we can cooperate there.

@grahamc
Owner

grahamc commented Jul 20, 2019

so the comparison is done by Nix. for a visitor not familiar with Nix or Nixos, it would be good to note here how it's done and maybe link to Nix manual

Nix does this by creating a NAR for the build, and comparing the hashes of the NARs. Essentially the same as hashing the result path.

We just "nix-build" it twice; we don't actually perform the first build, as Hydra has (presumably) already done it. The first build substitutes the build from the cache, as --check requires the build to have been done before.

@davidak
Collaborator Author

davidak commented Jul 20, 2019

Nix does this by creating a NAR for the build, and comparing the hashes of the NARs.

We just "nix-build" it twice, we don't actually perform the first build, as Hydra has (presumably) done the first.

So I think Hydra should create hashes for every package. Then we just have to get the hash and don't need to download the whole package when it's already reproducible.

Do you think that's a good idea?

We might need to change the Hydra build jobs to create the hash and save it somewhere,
and change Nix to use that hash for the reproducibility check. If that check fails, get the package...

@grahamc
Owner

grahamc commented Jul 20, 2019

We already can get the hash of the NAR:

$ curl https://cache.nixos.org/$(readlink $(which bash) | cut -d/ -f4 | cut -d'-' -f1).narinfo
StorePath: /nix/store/93h01q6yg13xdrabvqbddzbk11w6a928-bash-interactive-4.4-p23
URL: nar/037ypxfkl3ggfjlvfwxhxsynk31y7wibyd35d94qqzja7mpkk1w6.nar.xz
Compression: xz
FileHash: sha256:037ypxfkl3ggfjlvfwxhxsynk31y7wibyd35d94qqzja7mpkk1w6
FileSize: 927440
NarHash: sha256:0cpr1xwqslpmjdgpg8n9fvy2icsdzr4bp0hg2f9r47fyzsm36qqp
NarSize: 5650960
References: 681354n3k44r8z90m35hm8945vsp95h1-glibc-2.27 93h01q6yg13xdrabvqbddzbk11w6a928-bash-interactive-4.4-p23 adc71v5apk4dzcxg7cjqgszjg1a6pd0z-ncurses-6.1-20190112 cinw572b38aln37glr0zb8lxwrgaffl4-bash-4.4-p23 q626bqzjsnzsqpxwd79l1501did3qy4k-readline-7.0p5
Deriver: 74r7m998kk1b5b9618yr1wy1rvrdvbga-bash-interactive-4.4-p23.drv
Sig: cache.nixos.org-1:CyY1jYISWaLV6BJML++MXP6FNUOkMSBCIFr7qZBMPWf28C74cbJGPnb1dFdye9cdb6S40I0SzHGJb3z8WpH1CA==

but I think it is not too much to ask to download the pre-built NAR anyway; avoiding that would, I think, make this much more complex.

@davidak
Collaborator Author

davidak commented Jul 20, 2019

but I think it is not too much to ask to download the pre-built nar anyway, as it would make this much more complex I think.

It would be more sustainable, as we wouldn't waste resources, and we'd also get faster results.

So that's a topic for a feature request for Nix. I'll create one...

@davidak
Collaborator Author

davidak commented Jul 20, 2019

@grahamc do you plan to extend the project in the near future, so others can contribute builds, or does that have a low priority?

@grahamc
Owner

grahamc commented Jul 20, 2019

I'm not sure. I haven't done substantial work on this project in a while now. If someone else were to contribute some code, that would surely help :)
