ideas/2024: propose an analytics project (time budgeted builds)#16

Merged
Janik-Haag merged 2 commits into NixOS:main from SomeoneSerge:proposal/time-budget
Feb 22, 2024
Conversation

@SomeoneSerge
Contributor

@SomeoneSerge SomeoneSerge commented Feb 21, 2024

I'd like to CC @Mic92, @RaitoBezarius, @GuillaumeDesforges, @GaetanLepage, @ConnorBaker, and @samuela for comments and as potential mentors (e.g. I've never looked into the implementations of nix-index, nix-eval-jobs, nix-fast-build, etc., so I may lack some of the expertise required for the project to succeed in time).

I didn't write this up, but I think one prerequisite of a clean solution is solving the problem of identifying derivations across different nixpkgs instantiations (different revisions, different config arguments, etc.), which by design "lack identity". What we can easily match is e.g. nixpkgs' attribute paths. However, derivations overridden or defined in e.g. let-in expressions will contribute non-trivially to the total cost, and we need to be able to identify these too.

@SomeoneSerge SomeoneSerge force-pushed the proposal/time-budget branch 4 times, most recently from 629b77a to 6a54c5c Compare February 21, 2024 06:58
Comment thread ideas/2024.md Outdated
Comment thread ideas/2024.md Outdated
Comment on lines +192 to +217
### Nixpkgs analytics: [`nixpkgs-review`](https://github.com/Mic92/nixpkgs-review) with a time-budget

This project is a small step in analyzing and understanding the data generated
by Nixpkgs' fifteen years of modeling (bootstrapping, building and testing)
"much of the world of open source software". This data includes:

- [Hydra](https://hydra.nixos.org/job/nixpkgs/trunk/python311Packages.torch.aarch64-linux)'s evaluation errors, output paths, [build times](https://hydra.nixos.org/build/250198091#tabs-details), and build logs.
- The `.narinfo` files stored by https://cache.nixos.org, which together describe runtime dependency graphs between packages built by Hydra.
- Logs for the builds and `passthru` tests run by [Ofborg](https://logs.ofborg.org/).
- The [Nixpkgs](https://github.com/NixOS/nixpkgs) Git repository, where each
revision includes Nix code that can be evaluated or built, as well as
human-readable comments left by maintainers possibly providing insights into
the evolution of coding patterns and conventions used by Nixpkgs, as well as
into the details of upstream projects.
- The NixOS/Nixpkgs GitHub repository, which features conversations in GitHub issues and code reviews.
- IRC, Discourse, and Matrix logs.
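
As an illustration of how the `.narinfo` data in the list above describes runtime dependencies, here is a minimal sketch that parses the `Key: value` lines of a single `.narinfo` file and lists its dependency edges. The `NarInfo` class, the helper names, and the sample contents (including the store hashes) are hypothetical and exist only for this example; fetching the file from the cache is left out.

```python
from dataclasses import dataclass, field


@dataclass
class NarInfo:
    store_path: str
    nar_size: int
    references: list[str] = field(default_factory=list)


def parse_narinfo(text: str) -> NarInfo:
    """Parse the `Key: value` lines of a .narinfo file."""
    fields: dict[str, str] = {}
    for line in text.splitlines():
        key, sep, value = line.partition(": ")
        if sep:
            fields[key] = value
    return NarInfo(
        store_path=fields["StorePath"],
        nar_size=int(fields.get("NarSize", "0")),
        # References are space-separated store path base names.
        references=fields.get("References", "").split(),
    )


def runtime_edges(info: NarInfo) -> list[tuple[str, str]]:
    """Return (package, dependency) pairs, skipping self-references."""
    name = info.store_path.removeprefix("/nix/store/")
    return [(name, ref) for ref in info.references if ref != name]


# Made-up sample; real files carry more fields (hashes, signatures, ...).
SAMPLE = """\
StorePath: /nix/store/aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa-hello-2.12
URL: nar/0000.nar.xz
Compression: xz
NarSize: 226552
References: bbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbb-glibc-2.38 aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa-hello-2.12
"""

if __name__ == "__main__":
    for package, dependency in runtime_edges(parse_narinfo(SAMPLE)):
        print(f"{package} -> {dependency}")
```

Repeating this over many store paths would yield the runtime dependency graph mentioned above; build-time dependencies are not recorded in these files.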

Some of this data can be explored and visualized in Grafana hosted at https://monitoring.nixos.org/grafana.
This data allows tools like [nix-index](https://github.com/nix-community/nix-index/) (`nix-locate`, `comma`, [envfs](https://github.com/Mic92/envfs), etc.) to exist.
Nonetheless, we currently lack tools to use this data to (conveniently) answer
sometimes very simple queries like "how long has a given package been taking to
build on average". To contrast, our ambition could have been to reason about questions such as "how likely is the next build to succeed?", "how long is it likely to take until termination?", "given a Nixpkgs revision and a PR, what attributes is it likely to break?", "given Nixpkgs git history and a failing attribute, which commit is likely to have introduced the breakage?", "why was a given change introduced?". Some of these questions have brute-force solutions: termination times can be obtained by executing the builds, the offending commit may be found by bisection. The accumulated data offers us an opportunity to consult a model prior to performing the expensive computation.

In this project we'll attempt a modest task: write a version of [`nixpkgs-review`](https://github.com/Mic92/nixpkgs-review) or [`nix-fast-build`](https://github.com/Mic92/nix-fast-build) that can be given a time-budget to follow. The program would use historical data to estimate each package's contribution to the time complexity of the full review. The program would discard packages that "do not fit into the budget" and report their build status as "uncertain". When the observed build times deviate from the estimates, the program would dynamically adjust and schedule fewer or more builds as appropriate. The program would require a calibration procedure to attune to specific builders. Some sort of a simple linear model should suffice for the initial implementation.

Skills: the project might require understanding the codebases of [`nix-eval-jobs`](https://github.com/nix-community/nix-eval-jobs) and [`nix-fast-build`](https://github.com/Mic92/nix-fast-build) in order to learn how to control and schedule Nix evaluation and builds; experience working with data and some basic numerical intuition would be helpful.
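
To make the budgeting idea concrete, here is a rough sketch of one possible selection step, not a design the proposal commits to. It assumes per-attribute build-time estimates are already available, treats builds as sequential, and ignores the dependency graph. The `Plan` class, the `plan_builds` function, the `calibration` factor, and all numbers are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Plan:
    scheduled: list[str]
    uncertain: list[str]
    estimated_total: float


def plan_builds(
    estimates: dict[str, float], budget: float, calibration: float = 1.0
) -> Plan:
    """Greedily pick builds that fit into `budget` seconds.

    `estimates` maps attribute paths to historical build times in seconds;
    `calibration` scales them to the local builder (assumed linear model).
    """
    scheduled: list[str] = []
    uncertain: list[str] = []
    total = 0.0
    # Cheapest first, so that the status of as many packages as possible is learned.
    for attr, seconds in sorted(estimates.items(), key=lambda kv: kv[1]):
        cost = calibration * seconds
        if total + cost <= budget:
            scheduled.append(attr)
            total += cost
        else:
            # Does not fit: the build status is reported as "uncertain".
            uncertain.append(attr)
    return Plan(scheduled, uncertain, total)


if __name__ == "__main__":
    # Made-up estimates in seconds.
    estimates = {"hello": 30.0, "python3": 600.0, "opencv": 1800.0, "torch": 7200.0}
    plan = plan_builds(estimates, budget=3600.0)
    print("build:", plan.scheduled)
    print("uncertain:", plan.uncertain)
```

A real scheduler would also have to account for parallel builders, shared dependencies, and cache hits, and would re-run the selection whenever observed build times drift away from the estimates.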
Member

This is a bit too long I guess. The guide suggests 2-5 sentences per idea.

@samuela
Member

samuela commented Feb 21, 2024

I was once hired as an intern to do a project like this at one of the FAANGs. My takeaway at the time was that we ought to have just built better static analysis/build infrastructure instead of trying to ML it.

That's not to say that this project is not worth pursuing... Experiments are worthwhile. And even if we got just a dashboard showing plots of build times (in CPU hours) for each package that would be a worthwhile success IMHO.

@RaitoBezarius
Member

RaitoBezarius commented Feb 21, 2024

Hmm, where is the machine learning component in the proposal here? Or do we consider basic statistical analysis to be a machine learning algorithm :P ?

@RaitoBezarius
Member

Either way, I think it's important to separate the:

  • analysis
  • prediction
  • scheduling

parts of the project. Even building something that can collect the data and ship it somewhere else is already great and can be reused by other people to do other parts of this idea.

Comment thread ideas/2024.md
Contributor Author

Sorry I didn't follow up on the reviews yesterday.

  • Threw out most of the generic blathering (there's some left in the opening sentence). @Janik-Haag is this short enough now? I could delete more
  • Updated the complexity rating to "hard" (350h)
  • I suppose we shouldn't try to outline every detail in the proposal, but I included @RaitoBezarius's decomposition in the description, because it actually structures the proposal well and maybe makes the "Skills" redundant.

Comment thread ideas/2024.md
Comment thread ideas/2024.md
The problem decomposes into at least three tasks:
- Obtaining the data from Hydra and preparing it for redistribution and downstream usage.
- Working out a statistical model for the build times, including online update rules.
- Writing the evaluation and build scheduler that can consult such a model.
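
One simple reading of "a statistical model with online update rules" is a single per-builder scaling factor fitted by least squares through the origin, so that `observed ≈ factor * historical`. The sketch below is only an assumption about what such a model could look like; the `BuilderCalibration` class and its `prior_weight` parameter are hypothetical names, not part of any existing tool.

```python
class BuilderCalibration:
    """Online least-squares fit of `observed = factor * historical`."""

    def __init__(self, prior_weight: float = 1.0) -> None:
        # The prior acts like one pseudo-observation with factor 1.0,
        # which also avoids dividing by zero before any build finishes.
        self._sum_ho = prior_weight
        self._sum_hh = prior_weight

    @property
    def factor(self) -> float:
        return self._sum_ho / self._sum_hh

    def observe(self, historical: float, observed: float) -> None:
        """Update the fit after a build finishes (times in seconds)."""
        self._sum_ho += historical * observed
        self._sum_hh += historical * historical

    def predict(self, historical: float) -> float:
        """Estimate the local build time from the historical one."""
        return self.factor * historical


if __name__ == "__main__":
    calibration = BuilderCalibration()
    # Made-up observations: this builder is about twice as slow as Hydra.
    calibration.observe(historical=100.0, observed=200.0)
    calibration.observe(historical=50.0, observed=100.0)
    print(f"factor: {calibration.factor:.3f}")
    print(f"predicted time for a 600 s build: {calibration.predict(600.0):.0f} s")
```

Only two running sums are kept, so the update is constant-time per finished build; a scheduler could re-plan with the new factor after each observation.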
Contributor Author

Maybe I should have kept the bit about "marking status as uncertain" and "scheduling fewer or more builds" 🤔

Member

@Janik-Haag Janik-Haag left a comment

I like it, and think we can merge it.

@Janik-Haag Janik-Haag merged commit a1170b6 into NixOS:main Feb 22, 2024
@nixos-discourse

This pull request has been mentioned on NixOS Discourse. There might be relevant details there:

https://discourse.nixos.org/t/how-should-we-handle-software-created-with-llms/76061/62

Comment thread ideas/2024.md
Nixpkgs and its infrastructure feature fifteen years of history of the open source software: in the form of build and test logs, dependency graphs, and conversations. Compared to the opportunities offered by this data, we'll attempt a modest task: write a version of [`nixpkgs-review`](https://github.com/Mic92/nixpkgs-review) or [`nix-fast-build`](https://github.com/Mic92/nix-fast-build) that can follow a fixed time-budget.

The problem decomposes into at least three tasks:
- Obtaining the data from Hydra and preparing it for redistribution and downstream usage.
Member

One thing that would be cool IMHO: add a field to search.nixos.org or similar that shows the expected CPU-minutes to build each derivation. It could even link to a dashboard with plots showing statistics from Hydra over time, etc.

Comment thread ideas/2024.md

### Nixpkgs analytics: [`nixpkgs-review`](https://github.com/Mic92/nixpkgs-review) with a time-budget

Nixpkgs and its infrastructure feature fifteen years of history of the open source software: in the form of build and test logs, dependency graphs, and conversations. Compared to the opportunities offered by this data, we'll attempt a modest task: write a version of [`nixpkgs-review`](https://github.com/Mic92/nixpkgs-review) or [`nix-fast-build`](https://github.com/Mic92/nix-fast-build) that can follow a fixed time-budget.
Member

I agree that there is interesting data here, but I'm not sure that I understand the motivation: what is the value proposition of a time-bounded nixpkgs-review? E.g., supposing that this project is completed successfully, what problem in the Nix community would be solved after all is said and done?

5 participants