ideas/2024: propose an analytics project (time budgeted builds)#16
Janik-Haag merged 2 commits into NixOS:main
629b77a to 6a54c5c
| ### Nixpkgs analytics: [`nixpkgs-review`](https://github.com/Mic92/nixpkgs-review) with a time-budget | ||
|
|
||
| This project is a small step in analyzing and understanding the data generated | ||
| by Nixpkgs' fifteen years of modeling (bootstrapping, building, and testing) | ||
| "much of the world of open source software". This data includes: | ||
|
|
||
| - [Hydra](https://hydra.nixos.org/job/nixpkgs/trunk/python311Packages.torch.aarch64-linux)'s evaluation errors, output paths, [build times](https://hydra.nixos.org/build/250198091#tabs-details), and build logs. | ||
| - The `.narinfo` files stored by https://cache.nixos.org, which together describe runtime dependency graphs between packages built by Hydra. | ||
| - Logs for the builds and `passthru` tests run by [Ofborg](https://logs.ofborg.org/). | ||
| - The [Nixpkgs](https://github.com/NixOS/nixpkgs) Git repository, where each | ||
| revision includes Nix code that can be evaluated or built, as well as | ||
| human-readable comments left by maintainers possibly providing insights into | ||
| the evolution of coding patterns and conventions used by Nixpkgs, as well as | ||
| into the details of upstream projects. | ||
| - The NixOS/Nixpkgs GitHub repository, which features conversations in GitHub issues and code reviews. | ||
| - IRC, Discourse, and Matrix logs. | ||
|
|
||
| Some of this data can be explored and visualized in Grafana hosted at https://monitoring.nixos.org/grafana. | ||
| This data allows tools like [nix-index](https://github.com/nix-community/nix-index/) (`nix-locate`, `comma`, [envfs](https://github.com/Mic92/envfs), etc.) to exist. | ||
| Nonetheless, we currently lack tools that use this data to conveniently answer | ||
| even very simple queries such as "how long has a given package been taking to | ||
| build on average". By contrast, a more ambitious project could reason about questions such as "how likely is the next build to succeed?", "how long is it likely to take until termination?", "given a Nixpkgs revision and a PR, what attributes is it likely to break?", "given Nixpkgs git history and a failing attribute, which commit is likely to have introduced the breakage?", "why was a given change introduced?". Some of these questions have brute-force solutions: termination times can be obtained by executing the builds, and the offending commit may be found by bisection. The accumulated data offers us an opportunity to consult a model before performing the expensive computation. | ||
|
|
||
| In this project we'll attempt a modest task: write a version of [`nixpkgs-review`](https://github.com/Mic92/nixpkgs-review) or [`nix-fast-build`](https://github.com/Mic92/nix-fast-build) that can be given a time-budget to follow. The program would use historical data to estimate each package's contribution to the total build time of the full review. The program would discard packages that "do not fit into the budget" and report their build status as "uncertain". When the observed build times deviate from the estimates, the program would dynamically adjust and schedule fewer or more builds as appropriate. The program would require a calibration procedure to attune to specific builders. A simple linear model should suffice for the initial implementation. | ||
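The budgeting step described above can be illustrated with a minimal sketch. It is not taken from `nixpkgs-review` or `nix-fast-build`: the names `BudgetPlan` and `plan_builds` are hypothetical, the per-attribute estimates are invented numbers, and the sketch assumes builds run one at a time and ignores the dependency graph and shared dependencies.

```python
from dataclasses import dataclass, field


@dataclass
class BudgetPlan:
    """Result of fitting estimated build times into a time budget (hypothetical type)."""
    scheduled: list[str] = field(default_factory=list)
    uncertain: list[str] = field(default_factory=list)
    estimated_total: float = 0.0


def plan_builds(estimates: dict[str, float], budget_seconds: float) -> BudgetPlan:
    """Greedily schedule the cheapest builds first; the rest are marked "uncertain".

    `estimates` maps an attribute path to an estimated build time in seconds.
    Ties are broken by attribute name so the result is deterministic.
    """
    plan = BudgetPlan()
    for attr, seconds in sorted(estimates.items(), key=lambda kv: (kv[1], kv[0])):
        if plan.estimated_total + seconds <= budget_seconds:
            plan.scheduled.append(attr)
            plan.estimated_total += seconds
        else:
            # Does not fit into the budget: report its status as "uncertain".
            plan.uncertain.append(attr)
    return plan


# Made-up estimates, for illustration only.
estimates = {
    "hello": 30.0,
    "ripgrep": 240.0,
    "python311Packages.torch": 7200.0,
}
plan = plan_builds(estimates, budget_seconds=600.0)
print(plan.scheduled)   # attributes that fit into the budget
print(plan.uncertain)   # attributes reported as "uncertain"
```

Calling `plan_builds` again after each finished build, with the remaining budget and the not-yet-built attributes, would give a crude form of the dynamic adjustment mentioned in the proposal. Cheapest-first is only one possible policy; a real scheduler might instead prioritise attributes by how informative their result is.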
|
|
||
| Skills: the project might require understanding the codebases of [`nix-eval-jobs`](https://github.com/nix-community/nix-eval-jobs) and [`nix-fast-build`](https://github.com/Mic92/nix-fast-build) in order to learn how to control and schedule Nix evaluation and builds; experience working with data and basic numerical sympathy would be helpful. |
This is a bit too long I guess. The guide suggests 2-5 sentences per idea.
|
I was once hired as an intern to do a project like this at one of the FAANGs. My takeaway at the time was that we ought to have just built better static analysis/build infrastructure instead of trying to ML it. That's not to say that this project is not worth pursuing... Experiments are worthwhile. And even if we got just a dashboard showing plots of build times (in CPU hours) for each package, that would be a worthwhile success IMHO. |
|
Hmm, where is the machine learning component in the proposal here? Or do we consider basic statistical analysis to be a machine learning algorithm :P ? |
|
Either way, I think it's important to separate the:
parts of the project. Even building something that can collect the data and ship it somewhere else is already great and can be reused by other people to do other parts of this idea. |
6a54c5c to 4d62e31
Sorry I didn't follow up on the reviews yesterday.
- Threw out most of the generic blathering (there's some left in the opening sentence). @Janik-Haag is this short enough now? I could delete more
- Updated the complexity rating to "hard" (350h)
- I suppose we shouldn't try to outline every detail in the proposal, but I included @RaitoBezarius's decomposition in the description, because it actually structures the proposal well and maybe makes the "Skills" redundant.
| The problem decomposes into at least three tasks: | ||
| - Obtaining the data from Hydra and preparing it for redistribution and downstream usage. | ||
| - Working out a statistical model for the build times, including online update rules. | ||
| - Writing the evaluation and build scheduler that can consult such a model. |
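As one possible reading of "a statistical model for the build times, including online update rules", the sketch below fits a single per-builder scale factor by least squares through the origin, updated one observation at a time. It assumes that local build times are roughly proportional to reference times (for example, times recorded by Hydra); the class `BuilderCalibration` and its methods are hypothetical names, and the numbers are invented.

```python
from dataclasses import dataclass


@dataclass
class BuilderCalibration:
    """Online least-squares fit of: observed_seconds ~ scale * reference_seconds."""
    sum_ref_obs: float = 0.0   # running sum of reference * observed
    sum_ref_sq: float = 0.0    # running sum of reference ** 2
    prior_scale: float = 1.0   # used until the first observation arrives

    @property
    def scale(self) -> float:
        if self.sum_ref_sq == 0.0:
            return self.prior_scale
        # Closed-form least-squares slope for a line through the origin.
        return self.sum_ref_obs / self.sum_ref_sq

    def observe(self, reference_seconds: float, observed_seconds: float) -> None:
        """Update the running sums with one finished build."""
        self.sum_ref_obs += reference_seconds * observed_seconds
        self.sum_ref_sq += reference_seconds ** 2

    def predict(self, reference_seconds: float) -> float:
        """Estimate the local build time from a reference build time."""
        return self.scale * reference_seconds


calibration = BuilderCalibration()
before = calibration.predict(100.0)   # no data yet: falls back to the prior scale
calibration.observe(reference_seconds=100.0, observed_seconds=200.0)
calibration.observe(reference_seconds=50.0, observed_seconds=100.0)
after = calibration.predict(30.0)     # this builder looks about twice as slow
print(before, calibration.scale, after)
```

Only two running sums are stored, so the update is cheap enough to run after every build. A scheduler could rescale its remaining estimates with `predict` and then schedule fewer or more builds accordingly.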
Maybe I should have kept the bit about "marking status as uncertain" and "scheduling fewer or more builds" 🤔
Janik-Haag left a comment
I like it, and think we can merge it.
|
This pull request has been mentioned on NixOS Discourse. There might be relevant details there: https://discourse.nixos.org/t/how-should-we-handle-software-created-with-llms/76061/62 |
| Nixpkgs and its infrastructure hold fifteen years of open source software history, in the form of build and test logs, dependency graphs, and conversations. Compared to the opportunities offered by this data, we'll attempt a modest task: write a version of [`nixpkgs-review`](https://github.com/Mic92/nixpkgs-review) or [`nix-fast-build`](https://github.com/Mic92/nix-fast-build) that can follow a fixed time-budget. | ||
|
|
||
| The problem decomposes into at least three tasks: | ||
| - Obtaining the data from Hydra and preparing it for redistribution and downstream usage. |
One thing that would be cool IMHO: add a field to search.nixos.org or similar that shows the expected CPU-minutes to build each derivation. It could even link to a dashboard with plots showing statistics from Hydra over time, etc.
|
|
||
| ### Nixpkgs analytics: [`nixpkgs-review`](https://github.com/Mic92/nixpkgs-review) with a time-budget | ||
|
|
||
| Nixpkgs and its infrastructure hold fifteen years of open source software history, in the form of build and test logs, dependency graphs, and conversations. Compared to the opportunities offered by this data, we'll attempt a modest task: write a version of [`nixpkgs-review`](https://github.com/Mic92/nixpkgs-review) or [`nix-fast-build`](https://github.com/Mic92/nix-fast-build) that can follow a fixed time-budget. |
I agree that there is interesting data here, but I'm not sure that I understand the motivation: what is the value proposition of a time-bounded nixpkgs-review? E.g., supposing that this project is completed successfully, what problem in the Nix community would be solved after all is said and done?
I'd like to CC @Mic92, @RaitoBezarius, @GuillaumeDesforges, @GaetanLepage, @ConnorBaker, and @samuela for comments and as potential "potential mentors" (e.g. I've never looked into the implementations of nix-index, nix-eval-jobs, nix-fast-build, etc., so I may lack some of the expertise required for the project to succeed in time).
I didn't write this up, but I think one of the prerequisites of a clean solution is solving the problem of identifying derivations from different nixpkgs instantiations (different revisions, different `config` arguments, etc.), which by design "lack identity". What we can easily match is e.g. nixpkgs' attribute paths. However, derivations overridden/defined in e.g. let-in expressions will have non-trivial contributions to the total cost and we need to be able to identify these