
Extend scope #4

Open
davidak opened this issue Jul 20, 2019 · 10 comments

@davidak
Collaborator

davidak commented Jul 20, 2019

This project is amazing! It shows how far along we are in the goal of making NixOS reproducible.

At least for nixos-unstable's iso_minimal job for x86_64-linux.

It would be nice to have this metric for the stable releases. We might get some headlines for NixOS and reproducibility.

It would also be nice to extend the scope of this "experiment" and integrate it into NixOS. I'm not sure how this works because there is no documentation (#5), but you probably compare hashes? So Hydra should calculate these hashes and save them in the metadata of all builds. Then we need some distributed computing infrastructure where the community can run builds on their machines and submit their hashes. A central instance can then calculate, for each package with enough data, how reproducible it is. We can use that data on our website and in the package search.

I know BOINC for distributed computing tasks, but we might find a less complex solution.

I would like to just execute one command to build packages from a job set or channel I like and submit the hashes.

@grahamc
Owner

grahamc commented Jul 20, 2019

The current architecture is very simple:

Given an input expression:

let instruction = BuildRequest::V1(BuildRequestV1 {
    nixpkgs_revision: env::args().nth(1).unwrap(),
    nixpkgs_sha256sum: env::args().nth(2).unwrap(),
    result_url: "bogus".into(),
    subsets: vec![(
        Subset::NixOSReleaseCombined,
        Some(vec![vec![
            "nixos".into(),
            "iso_minimal".into(),
            "x86_64-linux".into(),
        ]]),
    )]
    .into_iter()
    .collect(),
});

We evaluate the derivation:

r13y.com/src/bin/check.rs

Lines 87 to 109 in f144539

let eval = Command::new("nix-instantiate")
    // .arg("--pure-eval") // See evaluate.nix for why this isn't passed yet
    .arg("-E")
    .arg(include_str!("./evaluate.nix"))
    .arg("--add-root")
    .arg(&drv)
    .arg("--indirect")
    .args(&[
        "--argstr",
        "revision",
        &job.nixpkgs_revision,
        "--argstr",
        "sha256",
        &job.nixpkgs_sha256sum,
        "--argstr",
        "subfile",
        &format!("{}", path.display()),
        "--argstr",
        "attrsJSON",
        &serde_json::to_string(&attrs).unwrap(),
    ])
    .output()
    .expect("failed to execute process");

and then query all of the dependencies of that derivation:

r13y.com/src/bin/check.rs

Lines 118 to 126 in f144539

let query_requisites = Command::new("nix-store")
    .arg("--query")
    .arg("--requisites")
    .arg(&drv)
    .output()
    .expect("failed to execute process");
for line in query_requisites.stderr.lines() {
    info!("stderr: {:?}", line);
}

Each dependency gets added to a queue for verification:

r13y.com/src/bin/check.rs

Lines 127 to 132 in f144539

for line in query_requisites.stdout.lines() {
    debug!("stdout: {:?}", &line);
    if let Ok(line) = line {
        if line.ends_with(".drv") {
            if !skip_list.contains(&line) {
                to_build.insert(line.into());

Then, for each dependency we build it twice. First with a fairly standard set of build options, where it will almost certainly be fetched from the binary cache:

r13y.com/src/bin/check.rs

Lines 192 to 199 in f144539

let first_build = Command::new("nix-store")
    .arg("--add-root")
    .arg(&gc_root_a)
    .arg("--indirect")
    .arg("--realise")
    .arg(&drv)
    .arg("--cores")
    .arg(format!("{}", maximum_cores_per_job))

Then we build it a second time with --check, forcing the local machine to build it again:

r13y.com/src/bin/check.rs

Lines 223 to 233 in f144539

debug!(
    "(thread-{}) Performing --check build: {:#?}",
    thread_id, drv
);
let second_build = Command::new("nix-store")
    .arg("--realise")
    .arg(&drv)
    .arg("--cores")
    .arg(format!("{}", maximum_cores_per_job))
    .arg("--check")
    .arg("--keep-failed")

Nix will then exit 0 if the output matches the original build bit-for-bit:

if second_build.success() {

Otherwise the build is not reproducible, and we do some fancier things:

r13y.com/src/bin/check.rs

Lines 260 to 276 in f144539

// For each output, look for a .check directory.
// If we find one, we want to:
//
// 1. add it to the store right away -- .check directories
// aren't actually store paths and cannot be saved from
// being garbage collected
//
// 2. create a GC root for what we just added to the store
// see: https://github.com/NixOS/nix/issues/2676
//
// 3. create a NAR for the .check store path
//
// 4. create a NAR for the output store path
//
// 5. hash the two NARs
//
// 6. return a build result with the two hashes

Notably, we take the original build output and the non-matching build output, export each as a NAR, and copy them to a content-addressed store (CAS):

r13y.com/src/bin/check.rs

Lines 295 to 305 in f144539

let (path_stream, mut path_wait) =
    store.export_nar(&path).unwrap();
let (checked_stream, mut checked_wait) =
    store.export_nar(&checked).unwrap();
hashes.insert(
    output.to_string(),
    (
        cas.from_read(path_stream).unwrap().into(),
        cas.from_read(checked_stream).unwrap().into(),
    ),

The result of this process is a JSON document containing a list of builds: for each build, either an object saying the path was reproducible, or an object which includes the derivation and the hashes of the two NARs.


A separate process, report.rs, starts in a similar way. Given an expression:

let instruction = BuildRequest::V1(BuildRequestV1 {
    nixpkgs_revision: env::args().nth(1).unwrap(),
    nixpkgs_sha256sum: env::args().nth(2).unwrap(),
    result_url: "bogus".into(),
    subsets: vec![(
        Subset::NixOSReleaseCombined,
        Some(vec![vec![
            "nixos".into(),
            "iso_minimal".into(),
            "x86_64-linux".into(),
        ]]),
    )]
    .into_iter()
    .collect(),
});

it evaluates it and finds the list of derivations and dependencies (todo: refactor :)):

r13y.com/src/bin/report.rs

Lines 64 to 113 in f144539

info!("Evaluating {:?} {:#?}", &subset, &attrs);
let eval = Command::new("nix-instantiate")
    // .arg("--pure-eval") // See evaluate.nix for why this isn't passed yet
    .arg("-E")
    .arg(include_str!("./evaluate.nix"))
    .arg("--add-root")
    .arg(&drv)
    .arg("--indirect")
    .args(&[
        "--argstr",
        "revision",
        &job.nixpkgs_revision,
        "--argstr",
        "sha256",
        &job.nixpkgs_sha256sum,
        "--argstr",
        "subfile",
        &format!("{}", path.display()),
        "--argstr",
        "attrsJSON",
        &serde_json::to_string(&attrs).unwrap(),
    ])
    .output()
    .expect("failed to execute process");
for line in eval.stderr.lines() {
    info!("stderr: {:?}", line)
}
for line in eval.stdout.lines() {
    debug!("stdout: {:?}", line)
}
let query_requisites = Command::new("nix-store")
    .arg("--query")
    .arg("--requisites")
    .arg(&drv)
    .output()
    .expect("failed to execute process");
for line in query_requisites.stderr.lines() {
    info!("stderr: {:?}", line);
}
for line in query_requisites.stdout.lines() {
    debug!("stdout: {:?}", &line);
    if let Ok(line) = line {
        if line.ends_with(".drv") {
            to_build.insert(line.into());
        }
    }
}
}

and create a report of all the paths:

r13y.com/src/bin/report.rs

Lines 136 to 144 in f144539

BuildStatus::Reproducible => {
    reproducible += 1;
}
BuildStatus::FirstFailed => {
    first_failed.push(response.drv);
}
BuildStatus::SecondFailed => {
    unchecked += 1;
}

Given an unreproducible path, it takes the two NARs from the CAS, extracts them to the Nix store and runs Diffoscope on them:

r13y.com/src/bin/report.rs

Lines 147 to 169 in f144539

unreproducible_list.push(format!("<li><code>{}</code><ul>", response.drv));
for (output, (hash_a, hash_b)) in hashes.iter() {
    if let Some(output_path) = parsed_drv.outputs().get(output) {
        let dest_name = format!("{}-{}.html", hash_a, hash_b);
        let dest = diff_dir.join(&dest_name);
        if dest.exists() {
            // ok
        } else {
            println!(
                "Diffing {}'s {}: {} vs {}",
                response.drv, output, hash_a, hash_b
            );
            let cas_a = read_cas.str_to_id(hash_a).unwrap();
            let cas_b = read_cas.str_to_id(hash_b).unwrap();
            let savedto = diffoscope
                .nars(
                    &output_path.file_name().unwrap().to_string_lossy(),
                    &cas_a.as_path_buf(),
                    &cas_b.as_path_buf(),
                )

From there, a simple HTML report is generated:

r13y.com/src/bin/report.rs

Lines 194 to 204 in f144539

html.write_all(
    format!(
        include_str!("./template.html"),
        reproduced = reproducible,
        unchecked = unchecked,
        total = total,
        percent = format!("{:.*}%", 2, 100.0 * (reproducible as f64 / total as f64)),
        revision = job.nixpkgs_revision,
        now = Utc::now().to_string(),
        unreproduced_list = unreproducible_list.join("\n")
    )


The combination of this instruction, the JSON document, and the CAS store is explicitly designed around being able to distribute the work to many builders, to make it easy to grow this project later. I would love help with this. You can see the data types I thought about, which are designed with this in mind, here: https://github.com/grahamc/r13y.com/blob/f144539ae80108bd8d7bf243d67011ca63198dce/src/messages.rs

One thing these data types assume is that every builder would randomize the list of derivations and try to build all of them, the idea being that having many builders try the same thing makes us more confident about the reproducibility. Now I wonder if we would want something a bit different, to allow greater coverage. My current thinking is that the central server would publish the same instruction, but also publish statistics about how many times each derivation has been built. That way builders can prioritize low-count derivations first.

Does this help?

@grahamc
Owner

grahamc commented Jul 20, 2019

One challenge which might come up by expanding scope is knowing how to visualize the list of brokenness. We shouldn't try to solve that until it is actually a problem, though. Just thinking about it, as it was a tough problem for

@grahamc
Owner

grahamc commented Jul 20, 2019

Some more of what I was thinking. I have no strong opinions on how to implement this, and would love any help anyone wants to provide.

CAS

I use a CAS store for the NARs because I expect builders would upload the NARs to something like S3, and it would be good for them not to all re-upload the one they fetched from the cache. Even better, it avoids uploading duplicate NARs when the unreproducibility comes from something like the current date.

You can see some existing (currently useless) code around this. For example, the "report_url".

Diffoscope

Diffoscope can take many gigabytes of RAM, especially when comparing ISOs and mksquashfs outputs. Ideally, the final architecture would run the diffoscope process on one system, upload the result to the cache, and have the website link to that cache.

People running builds should not be expected to actually run the diffoscope step.

@davidak
Collaborator Author

davidak commented Jul 20, 2019

Great ideas.

I don't understand all the details of the Rust code, but I got a good picture of the project. A readme should not go into such detail; such information is probably better kept as comments in the code itself.

Most visitors or users are probably not interested in implementation details.

Nix will then exit 0 if the output matched bit-for-bit the original build:

So the comparison is done by Nix. For a visitor not familiar with Nix or NixOS, it would be good to note here how it's done, and maybe link to the Nix manual.

I think a SHA-256 hash of the build result path?

we build it twice

It would be good to just get a hash from the Hydra build and compare against that. When the hash is not identical, we can still fetch the path and compare with diffoscope.

One challenge which might come up by expanding scope is knowing how to visualize the list of brokenness.

That alone is a great task for someone who is a specialist in the field of data visualisation.

But that's also a task any distro active in the reproducibility challenge faces, so we can cooperate there.

@grahamc
Owner

grahamc commented Jul 20, 2019

so the comparison is done by Nix. for a visitor not familiar with Nix or Nixos, it would be good to note here how it's done and maybe link to Nix manual

Nix does this by creating a NAR for the build, and comparing the hashes of the NARs. Essentially the same as hashing the result path.

We just "nix-build" it twice; we don't actually perform the first build, as Hydra has (presumably) already done it. The first build substitutes the build from the cache, as --check requires the build to have been done before.

@davidak
Collaborator Author

davidak commented Jul 20, 2019

Nix does this by creating a NAR for the build, and comparing the hashes of the NARs.

We just "nix-build" it twice, we don't actually perform the first build, as Hydra has (presumably) done the first.

So I think Hydra should create hashes for every package. Then we just have to get the hash and don't need to download the whole package when it's already reproducible.

Do you think that's a good idea?

We might need to change the Hydra build jobs to create the hash and save it somewhere,
and change Nix to use that hash for the reproducibility check. If that check fails, get the package...

@grahamc
Owner

grahamc commented Jul 20, 2019

We already can get the hash of the NAR:

$ curl https://cache.nixos.org/$(readlink $(which bash) | cut -d/ -f4 | cut -d'-' -f1).narinfo
StorePath: /nix/store/93h01q6yg13xdrabvqbddzbk11w6a928-bash-interactive-4.4-p23
URL: nar/037ypxfkl3ggfjlvfwxhxsynk31y7wibyd35d94qqzja7mpkk1w6.nar.xz
Compression: xz
FileHash: sha256:037ypxfkl3ggfjlvfwxhxsynk31y7wibyd35d94qqzja7mpkk1w6
FileSize: 927440
NarHash: sha256:0cpr1xwqslpmjdgpg8n9fvy2icsdzr4bp0hg2f9r47fyzsm36qqp
NarSize: 5650960
References: 681354n3k44r8z90m35hm8945vsp95h1-glibc-2.27 93h01q6yg13xdrabvqbddzbk11w6a928-bash-interactive-4.4-p23 adc71v5apk4dzcxg7cjqgszjg1a6pd0z-ncurses-6.1-20190112 cinw572b38aln37glr0zb8lxwrgaffl4-bash-4.4-p23 q626bqzjsnzsqpxwd79l1501did3qy4k-readline-7.0p5
Deriver: 74r7m998kk1b5b9618yr1wy1rvrdvbga-bash-interactive-4.4-p23.drv
Sig: cache.nixos.org-1:CyY1jYISWaLV6BJML++MXP6FNUOkMSBCIFr7qZBMPWf28C74cbJGPnb1dFdye9cdb6S40I0SzHGJb3z8WpH1CA==

but I think it is not too much to ask to download the pre-built NAR anyway; avoiding that would, I think, make this much more complex.

@davidak
Collaborator Author

davidak commented Jul 20, 2019

but I think it is not too much to ask to download the pre-built nar anyway, as it would make this much more complex I think.

It would be more sustainable, as we wouldn't waste resources, and we'd also get faster results.

So that's a topic for a feature request for Nix. I'll create one...

@davidak
Collaborator Author

davidak commented Jul 20, 2019

@grahamc do you plan to extend the project in the near future, so others can contribute builds, or does that have a low priority?

@grahamc
Owner

grahamc commented Jul 20, 2019

I'm not sure. I haven't done substantial work on this project in a while now. If someone else were to contribute some code, that would surely help :)
