Sound incrementality when changing the build plan #668
Comments
What I'm proposing should sound a lot like Nix. There's recently been a bunch of work on https://github.com/haskell-nix/hnix-store; perhaps that might assist in implementing this in Shake eventually. CC @shlevy
I tried to find the list of operations the Nix store provides, but couldn't easily. What operations are you suggesting? I think the plan is to move to CI using a cached build for Shake, even in its current state. That said, I do agree that a model where things build only based on command lines might be appealing. I'm currently researching that actively.
The unit of caching and sandboxing for Nix is a derivation. Shake can just fire these off instead of shelling out directly.

    derivation {
      name = "asdf";
      builder = "/bin/sh";
      args = [ "-c" "echo foo >> $out" ];
      system = "x86_64-linux";
    }

is a very simple derivation that doesn't use nixpkgs. The ".. ${path/to/file} .." will copy a file to the store at a name depending on its hash, and substitute that path into the string in place of the anti-quotation. This is enough to get data into the store for Nix derivations to do meaningful work on Shake's behalf. [Nix understands derivations that depend on derivations, but it would be non-trivial to compile Shake into that at this time.] You can start
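As a rough illustration of "stick this shell-out in a derivation", the sketch below renders a shell command as the text of a derivation expression like the one above. The helper names `nix_string` and `render_derivation` are hypothetical, nothing here is part of Shake or hnix-store, and it only produces the expression text; handing it to Nix is left out.

```python
def nix_string(s: str) -> str:
    """Quote s as a Nix double-quoted string literal.

    Escapes backslashes, double quotes, and "${" (which would otherwise
    start an anti-quotation). A bare "$out" is left alone so the shell
    still sees it.
    """
    escaped = (
        s.replace("\\", "\\\\")
        .replace('"', '\\"')
        .replace("${", "\\${")
    )
    return f'"{escaped}"'


def render_derivation(name: str, command: str, system: str = "x86_64-linux") -> str:
    """Render a minimal derivation that runs `command` via /bin/sh -c.

    Assumption: the command writes its result to $out, as in the
    example above. No nixpkgs, no extra inputs.
    """
    args = " ".join(nix_string(a) for a in ["-c", command])
    return (
        "derivation {\n"
        f"  name = {nix_string(name)};\n"
        '  builder = "/bin/sh";\n'
        f"  args = [ {args} ];\n"
        f"  system = {nix_string(system)};\n"
        "}\n"
    )


if __name__ == "__main__":
    print(render_derivation("asdf", "echo foo >> $out"))
```

A build rule that currently shells out could, in principle, pass the same command string through something like this instead, at the cost of the command then running inside the sandbox.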
I'm confused what you mean? GHC already uses Hadrian for certain CI jobs. Once the old build system is gone it would just use Hadrian.
Glad to hear that! Besides sandboxing, the thing that Nix does that makes this a bit conceptually easier is that derivations don't really get to pick the path they produce. It's instead predetermined based on content hashing. Downstream derivations need to move the outputs "into position" if something needs to have a certain name/location. This also makes "monadic" build rules easier in that it's impossible (up to hash collisions) to produce the same path two different ways.
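To make the "derivations don't pick their own path" point concrete, here is a toy model in which the output path is a pure function of the build recipe. This is only a sketch: `output_path` and the `/toy/store` prefix are made up for illustration, and the hashing scheme is not Nix's actual store path algorithm.

```python
import hashlib
import json


def output_path(name, builder, args, input_hashes, store="/toy/store"):
    """Derive an output path from the recipe instead of letting the step choose it.

    The recipe (name, builder, args, hashes of inputs) is serialised
    canonically and hashed; the digest becomes part of the path.
    """
    recipe = json.dumps(
        {
            "name": name,
            "builder": builder,
            "args": list(args),
            "inputs": sorted(input_hashes),  # input order should not matter
        },
        sort_keys=True,
    )
    digest = hashlib.sha256(recipe.encode("utf-8")).hexdigest()[:32]
    return f"{store}/{digest}-{name}"


if __name__ == "__main__":
    a = output_path("asdf", "/bin/sh", ["-c", "echo foo >> $out"], [])
    b = output_path("asdf", "/bin/sh", ["-c", "echo bar >> $out"], [])
    print(a)
    print(b)  # a different recipe lands at a different path
```

Because the path follows from the recipe, two different recipes cannot claim the same path (up to hash collisions), which is the property that makes dynamically discovered rules easier to reason about.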
So, for now hnix-store is only intended to be an interface to the C++ nix store. So it's not going to be a path to things like disabling busybox /bin/sh in builds. That being said, I think in principle it could be an interesting backend to shake, if we're OK making the GHC and installed libraries we use Nix store paths. As for the interface, basically the unit of work (and of caching, incrementality, etc.) in the Nix store is a single execve call producing (at least) one output path (which can be a directory or just a file).
I do have thoughts (nearly but not quite plans) about a next generation of Nix that enables domain-specific notions of the unit of cacheable/incremental computation (among other things), but that's not something we'll have soon.
I am sceptical that GHC would want to go as far as making GHC and installed libraries use Nix paths. Sounds like hnix-store might be one to watch, but not necessarily something relevant right now?
Yeah, that seems reasonable to me. Shake is already on my list of tools to adapt once I have something cooler working 😉
I disagree. I think the final built GHC would be the same whether it was built with shake or a "shake + nix mash-up", and the shake + nix mashup should be an optional goodie for anything using shake, in principle. Certain rules may break due to sandboxing, but that is to be expected. There certainly are features that, realistically speaking, are blocked on @shlevy's cooler stuff, but I don't think a basic "please stick this shell out in a derivation" should be among them. And since I'm currently fighting GHC's CI times, I'm very motivated to grab even fruits just barely within reach :).
Hmm... I don't see it, but I'm happy to help if you think you can make it work 😉
Currently, incremental builds are dangerous when the build plan changes; the solution is some way to opt into sandboxing spawned process steps so we can get a grip on all inputs and know nothing is leaking through. That way, no matter what the overall build plan is, at least those process spawns can be cached. Rules where Shake directly creates the output file cannot be cached this way, but most of those are probably cheaper anyway.
I'm interested in this mainly because I'd like the Hadrian-based GHC CI to build stage 1 much, much faster, opening the door to testing stage 1 and not just stage 2. CI is currently always a fresh build because Shake's incrementality isn't robust enough for arbitrary PRs.
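The idea of caching individual process spawns regardless of the surrounding build plan can be sketched as a cache keyed only on the command line and the contents of the declared inputs. This is a minimal in-memory model, not Shake's API: `ProcessCache` and the `execute` callback are hypothetical, and it assumes sandboxing already guarantees that the declared inputs are the only inputs.

```python
import hashlib
import json


class ProcessCache:
    """Cache process results by (argv, input contents), ignoring the build plan."""

    def __init__(self):
        self._results = {}

    @staticmethod
    def key(argv, inputs):
        """Hash the command line together with a digest of every input file.

        `inputs` maps a path to that file's bytes.
        """
        payload = json.dumps(
            {
                "argv": list(argv),
                "inputs": {
                    path: hashlib.sha256(data).hexdigest()
                    for path, data in inputs.items()
                },
            },
            sort_keys=True,
        )
        return hashlib.sha256(payload.encode("utf-8")).hexdigest()

    def run(self, argv, inputs, execute):
        """Return the cached output, calling `execute` only on a cache miss."""
        k = self.key(argv, inputs)
        if k not in self._results:
            self._results[k] = execute(argv, inputs)
        return self._results[k]


if __name__ == "__main__":
    calls = []

    def fake_compile(argv, inputs):
        # Stand-in for a sandboxed process spawn.
        calls.append(argv)
        return b"object code for " + inputs["A.hs"]

    cache = ProcessCache()
    cache.run(["ghc", "-c", "A.hs"], {"A.hs": b"main = pure ()"}, fake_compile)
    cache.run(["ghc", "-c", "A.hs"], {"A.hs": b"main = pure ()"}, fake_compile)
    print(len(calls))  # the second call is served from the cache
```

The key never mentions which rule asked for the command, so a PR that rearranges the build plan still hits the cache for every spawn whose command line and inputs are unchanged.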