RFC: Nix backend #9089

fgaz · 2023-07-06T08:24:19Z

Motivation

The current Nix-style store implementation works well but has a number of limitations, such as

No garbage collection ([nix-local-build] Garbage collecting the store #3333)
Sharing/deploying build products can be difficult (Support relocatable packages #462, RFC: Remote cache support #5582, ...)
Only the version (not the specific instance) of system dependencies is considered in the hash (ironically, this can be a problem when using cabal-install in NixOS)
Path length problems (too many to list)

Nix can solve many of those problems, but Nix as is requires total buy-in to be useful, which means using the Nix expression language and command line, and a cabal user may not be interested in those.

cabal-install 3 is uniquely positioned to transparently take advantage of the store part of Nix without having to sacrifice its own solver, cli, or ability to perform incremental builds on local packages.

Proposal

I propose to:

Modify cabal-install so that it can build remote (i.e. saved in the store) packages through multiple backends
Add a Nix backend that produces Nix derivations directly (without passing through the Nix language) and delegates the build to Nix

If Nix RFC 134 is implemented, this will be based on a stable interface independent from Nix-the-language.

Challenges

The main challenge is designing the interface of cabal-install backends. While the similarity of Nix and the Nix-style store helps, there are a few obstacles to consider, for example:

The current implementation expects the store to have a single mutable package db.
The current implementation can interleave remote and local builds; Nix will only do remote (non-incremental) builds.

The interface has to be flexible enough to accommodate these differences.
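To make the two obstacles above concrete, here is a minimal sketch of what a backend interface could look like. Every name in it (Backend, RemoteUnit, StorePath, dryRunBackend) is hypothetical and does not exist in cabal-install; it only illustrates handing the remote part of a plan to a backend as one batch and getting output paths back, instead of assuming a single mutable package db.

```haskell
import qualified Data.Map.Strict as Map

-- All names below are hypothetical; none of them exist in cabal-install.
newtype UnitId = UnitId String deriving (Eq, Ord, Show)

-- Where a built unit ends up. The caller registers these paths itself,
-- so the backend does not need to expose one shared mutable package db.
newtype StorePath = StorePath FilePath deriving (Eq, Show)

-- A store ("remote") package together with its already-planned dependencies.
data RemoteUnit = RemoteUnit
  { unitId   :: UnitId
  , unitDeps :: [UnitId]
  } deriving (Eq, Show)

-- A backend receives the remote part of the plan as a whole batch.
-- A Nix backend could hand the batch to Nix in one go, while the
-- current store backend stays free to schedule units one by one.
newtype Backend m = Backend
  { buildRemote :: [RemoteUnit] -> m (Map.Map UnitId StorePath) }

-- A stand-in backend that builds nothing and only computes output paths.
dryRunBackend :: Applicative m => FilePath -> Backend m
dryRunBackend root = Backend $ \units ->
  pure (Map.fromList
    [ (unitId u, StorePath (root ++ "/" ++ name))
    | u <- units
    , let UnitId name = unitId u
    ])
```

Keeping the monad abstract is a deliberate choice in this sketch: the existing builder and a Nix builder would need different effects, and the interface should not commit to either.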

Future work

If this turns out to work well and Nix RFC 92 is implemented, cabal-install can be used to generate build plans directly in nixpkgs without recursive Nix, closing the loop and demonstrating how system package managers and language-specific package managers can integrate even when dependency trees become complex.

Prior work

#3882
haskell.nix

michaelpj · 2023-07-06T09:21:45Z

The current implementation can interleave remote and local builds

I'm not sure I quite understood this. Do you mean temporally interleave such builds, or something like "build A->B->C where B is local and A and C are remote"?

A major issue to me seems to be: how would we specify the system dependencies? We're going to need GHC, various C libraries and so on. Probably we want to get these from nixpkgs somehow, but the user will likely have an opinion on which nixpkgs, and it's not trivial to map from the system library names we use to nixpkgs derivations (see e.g. https://github.com/input-output-hk/haskell.nix/blob/master/lib/system-nixpkgs-map.nix in haskell.nix).

So maybe the user is going to need to tell us some of this stuff. The easiest way for them to do that would probably be for them to write some nix code that we can import to get things... but then we can't have a store-only implementation.

An alternative model would be to do something more like haskell.nix does and generate nix source files from the plan. This is pretty non-trivial, and I think doing it properly would amount to pulling a lot of haskell.nix into Cabal, which seems undesirable.

Here's a sketch of an alternative plan. We can think of the process of building a local package a bit like this:

The Solver produces a Plan
The Plan is used to build the dependencies, ultimately producing a Package DB with all the packages installed in it
The Package DB along with other stuff (?) is used to produce the Environment that the local package is built in

Instead of making it possible to replace the "build a remote package" part, we could make it possible to replace the "turn a Plan into a Package DB" part. Then we can out-source that to, say, haskell.nix, which partly struggles today because there is not a good fully-detailed representation of the Plan for it to consume.

The advantage of this is that it sidesteps the problems above. The Plan contains Cabal's view on what to build. It doesn't have to say how you get a particular GHC or how you get a system library with a particular name, that's the job of the tool that consumes the Plan.

So the UI for this might look a bit like providing an environment-builder field in cabal.project that points to a nix file (?) that can be used to build the environment that we want for a local build.
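A rough sketch of that split, with the "turn a Plan into a Package DB" step as the only replaceable part. The types and names (Plan, PackageDb, Environment, EnvironmentBuilder, prepareLocalBuild) are made up for illustration and are not Cabal's real data structures.

```haskell
-- Hypothetical, heavily simplified types; not Cabal's real data structures.
newtype Plan = Plan [String] deriving (Eq, Show)            -- units to build
newtype PackageDb = PackageDb [String] deriving (Eq, Show)  -- installed units

data Environment = Environment
  { envPackageDb :: PackageDb
  , envTools     :: [String]   -- the "other stuff", e.g. compiler and tools
  } deriving (Eq, Show)

-- The replaceable step: anything that can turn a Plan into a Package DB.
-- The built-in store, haskell.nix, or another tool could each provide one.
type EnvironmentBuilder m = Plan -> m PackageDb

-- cabal-install keeps the solver and the local build; it only delegates
-- the construction of the package db.
prepareLocalBuild
  :: Functor m => EnvironmentBuilder m -> [String] -> Plan -> m Environment
prepareLocalBuild builder tools plan =
  fmap (\db -> Environment db tools) (builder plan)
```

How the builder obtains GHC or system libraries is invisible at this level, which is the point of the proposal: the Plan says what to build, and the consumer decides how.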

michaelpj · 2023-07-06T09:35:26Z

cc @angerman

fgaz · 2023-07-06T10:07:24Z

Do you mean temporally interleave such builds, or something like "build A->B->C where B is local and A and C are remote"?

Yes temporally. If a package in the dependency tree is local, then all its reverse dependencies (A in your example) will be local as well.

A major issue to me seems to be: how would we specify the system dependencies?

There is a simple solution to this problem: require the system dependencies to be already present in the store. This way cabal only controls Haskell dependencies, while system dependencies can be controlled externally. For example, when checking for the compiler, cabal can error out if GHC isn't in the store already, while if it is cabal can add the store path of its deriver to inputDrvs.

Of course the only way of providing those dependencies right now is through the Nix language, but it isn't a requirement, and more importantly it can be done externally. Nix the language can provide the environment, cabal can build inside that environment using the nix store.

This is the opposite of what haskell.nix does. It's more similar to an ordinary mkDerivation that takes its inputs explicitly, but we do that for system dependencies only.

If system dependencies are stable there is also the sandbox-paths escape hatch of course...

I'll add this to Challenges
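As a sketch of the "error out or add to inputDrvs" behaviour, assuming cabal is somehow given a mapping from system dependency names to the store paths of their derivers. The names (Provided, resolveSystemDeps) are hypothetical, and how that mapping is obtained is deliberately left open here.

```haskell
import qualified Data.Map.Strict as Map

-- Hypothetical names, for illustration only.
type DepName = String
type DrvPath = FilePath

-- System dependencies provided externally (for example by a Nix shell):
-- dependency name -> store path of the derivation that produces it.
type Provided = Map.Map DepName DrvPath

-- Either report every missing system dependency, or return the deriver
-- paths that would be added to the inputDrvs of the generated derivation.
resolveSystemDeps :: Provided -> [DepName] -> Either [DepName] [DrvPath]
resolveSystemDeps provided wanted =
  case [ d | d <- wanted, Map.notMember d provided ] of
    []      -> Right [ provided Map.! d | d <- wanted ]
    missing -> Left missing
```

Collecting all missing names before failing, rather than stopping at the first one, would let cabal print a single complete error message.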

angerman · 2023-07-06T10:33:18Z

This is pretty non-trivial, and I think doing it properly would amount to pulling a lot of haskell.nix into Cabal, which seems undesirable.

I'm not sure. Anything that allows us to throw away (large?) parts of haskell.nix is welcome.

Nix the language can provide the environment, cabal can build inside that environment using the nix store.

That's what we currently do with our devx repo. E.g.

nix develop github:input-output-hk/devx#ghc8107-minimal

and then using cabal-install as usual.

Works very well for interactive development.

It does of course only work with the pre-set dependencies, and has no way of discovering dependencies needed by projects in any dynamic form. Having something that would do the same, but be able to declaratively figure out which dependencies the build plan needs would be nice. In any case a mapping from commonly referred to libraries to the names that nix calls them would be needed, but that can equally well live in nixpkgs itself.

There is a bit of a chicken and egg issue if you try to use cabal's planning for this. To compute the plan, you may need some pkgconfig dependencies (and some package might just auto reconfigure if the pkg-config pkg isn't available; whether that's the intention of the actor or not). Or the build plan might fail because some build dependencies can't be found (I think something with postgresql is quite infamous for doing this 😬).

Of course if we require people to have all of these in store (and environment) prior to running any cabal-install command, that will certainly work well!

haskell.nix doesn't really try to solve any of those issues. It fundamentally only tries to solve this question:

I have a Haskell project that builds on my machine. How do I get a nix expression for it that I can pin down for reproducibility, and leverage nixpkgs cross compilation capabilities to cross compile to other platforms?

andreabedini · 2023-07-06T11:03:55Z

I have played with these ideas a lot @fgaz so if you want to start hacking some PoC together I am in.

FWIW nix already supports importing derivation graphs (nix derivation add was added recently). Once planning is done there is no interleaving and one can turn each component into its own derivation. This is what haskell.nix does (although not as well as I wish).

Others have already commented on the issue of the surrounding build environment.

Re: multiple backends. This is the part I wish we could start with.
Not so much because I have any backends in mind (Ninja?) but because it would force us to refactor the existing codebase.

As you must be aware:

The solver does not understand components, and has an excessively simplified view of the installed packages (it ignores flags because they are not saved anywhere)
Per-component build is hacked on top of per-package planning (the "elaboration" part)
You can print out the plan in all its details but it mixes environment specific information (like the build directories) with the plan itself (which would be a pure function of the available packages and few solver parameters).
This "plan as a pure function" is currently not trivial because we do IO everywhere (thanks to the rebuild monad). This also has to be refactored.

I don't want to sound discouraging; the opposite! I encourage you, and anybody interested, to give it a go.

In the end cabal-install is a library now so you can start your own cabal2nix solution. I assure you it will be warmly welcomed by all the other ones 😂

💜

fgaz · 2023-07-06T14:44:16Z

Thanks for the feedback!

FWIW nix already supports importing derivation graphs (nix derivation add was added recently).

I know :)

Once planning is done there is no interleaving and one can turn each component into its own derivation. This is what haskell.nix does (although not as well as I wish).

What I meant to say is this:
Not everything can be built by Nix. Local packages will have to be built outside of Nix if we want to keep incremental compilation. The current builder freely interleaves local and nonlocal builds since it controls the whole build plan, but if we give the nonlocal part to Nix we potentially have to wait for the entire nonlocal build to finish before building any local package. This isn't that much of a problem because often that's already what happens in typical builds, but it's still something to take into account while designing the interface, as we want to permit interleaved builds for the current backend.
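To illustrate the split, here is a small sketch that partitions a plan into the part Nix could build and the part that stays local, using the rule stated earlier in the thread that every reverse dependency of a local package is local too. The names (DepGraph, localUnits, splitPlan) are hypothetical and the graph representation is far simpler than cabal-install's real plan.

```haskell
import Data.List (partition)
import qualified Data.Map.Strict as Map
import qualified Data.Set as Set

-- Hypothetical, simplified plan: unit name -> direct dependencies.
type Unit = String
type DepGraph = Map.Map Unit [Unit]

-- A unit is local if it is a project package, or if it depends
-- (directly or transitively) on a local unit. Iterate to a fixed point.
localUnits :: DepGraph -> Set.Set Unit -> Set.Set Unit
localUnits graph = go
  where
    go acc =
      let dependents = Set.fromList
            [ u | (u, deps) <- Map.toList graph, any (`Set.member` acc) deps ]
          acc' = Set.union acc dependents
      in if acc' == acc then acc else go acc'

-- Split the plan into (remote, local). With a Nix backend the whole
-- remote list would be built first, then the local units incrementally.
splitPlan :: DepGraph -> Set.Set Unit -> ([Unit], [Unit])
splitPlan graph projectUnits = (remote, local)
  where
    locals          = localUnits graph projectUnits
    (local, remote) = partition (`Set.member` locals) (Map.keys graph)
```

For the "A->B->C where B is local" example from above, this puts A and B on the local side and leaves only C for the remote build.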

Re: multiple backends. This is the part I wish we could start with.
Not so much because I have any backends in mind (Ninja?) but because it would force us to refactor the existing codebase.

Yes! The separation between planning and building is there, but it gets complex fast.

The solver does not understand components, and has an excessively simplified view of the installed packages (it ignores flags because they are not saved anywhere)

Per-component build is hacked on top of per-package planning (the "elaboration" part)

This should be fine: Nix derivations can be "package"-level, like the current cabal store paths, and everything can be source-based since we let Nix do the caching.

You can print out the plan in all its details but it mixes environment specific information (like the build directories) with the plan itself (which would be a pure function of the available packages and few solver parameters).

This "plan as a pure function" is currently not trivial because we do IO everywhere (thanks to the rebuild monad). This also has to be refactored.

Environment-specific information can be a problem and we'll have to use it carefully.

IO is fine as long as the build plan doesn't depend on the store, and as far as I can tell it doesn't until the plan improvement phase. Ideally we'd skip plan improvement as it will be done by Nix.

Not so much because I have any backends in mind (Ninja?)
[...]
In the end cabal-install is a library now so you can start your own cabal2nix solution.

If this is somehow made pluggable (backpack maybe?) other backends may actually be possible...

Ericson2314 · 2023-07-06T16:57:25Z

There is a bit of a chicken and egg issue if you try to use cabal's planning for this. To compute the plan, you may need some pkgconfig dependencies (and some package might just auto reconfigure if the pkg-config pkg isn't available; whether that's the intention of the actor or not). Or the build plan might fail because some build dependencies can't be found (I think something with postgresql is quite infamous for doing this 😬).

I think a useful sub-goal is to make sure Cabal/cabal-install can always "be told" rather than "autodetect".

An interesting thing to compare is @mpickering's multi-repl work, where setup had to be told to just trust that these packages will eventually exist. That is exactly the same principle at play: de-interleaving planning and building requires taking some information "on faith" because the build that would "make it true" hasn't happened yet.

As a bonus, the environment detection logic can be repurposed as after-the-fact testing logic. E.g., we can just "assume" we'll have some C library eventually and have cabal set up accordingly, but once it is built, we should run that check to ensure the build did what we want.
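A tiny sketch of that "told first, checked later" idea: planning trusts a list of claims, and the same detection function that would normally run up front is applied afterwards to find claims that turned out to be false. Claim and verifyClaims are hypothetical names, and detection is reduced to a name-to-version lookup for brevity.

```haskell
-- Hypothetical names; detection is reduced to "name -> version found".
data Claim = Claim
  { claimName    :: String
  , claimVersion :: String
  } deriving (Eq, Show)

-- During planning the claims are taken on faith. Once the build has
-- happened, run the usual detection and return every claim it contradicts
-- (dependency missing, or present with a different version).
verifyClaims :: (String -> Maybe String) -> [Claim] -> [Claim]
verifyClaims detect claims =
  [ c | c <- claims, detect (claimName c) /= Just (claimVersion c) ]
```

An empty result means the build delivered what planning assumed; anything else can be reported as a build failure rather than a planning failure.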

As a final note, all this means that the right way to get stuff from the outside world (e.g. Nixpkgs) is not the build results but the derivations themselves. This allows maximal eager planning and lazy building.

Ericson2314 · 2023-07-06T17:01:08Z

Also, I hope this code can co-exist with haskell.nix in a very healthy manner. The core work of separating planning and building within Cabal/cabal-install is good for both. The details of whether we want to control everything and output our own derivations (this, an internal project), or know more about Nixpkgs / integration and output nix language expressions (haskell.nix, an external project) should not be too hard to abstract over for all the code that doesn't care.

michaelpj · 2023-07-08T13:12:53Z

There is a simple solution to this problem: require the system dependencies to be already present in the store. This way cabal only controls Haskell dependencies, while system dependencies can be controlled externally. For example, when checking for the compiler, cabal can error out if GHC isn't in the store already, while if it is cabal can add the store path of its deriver to inputDrvs.

I'm not sure how this will work. What does it mean to say that "a system dependency is already present in the store"? To find something in the store you need to know the derivation that built it, which is the same thing as being able to build it! You have to say specifically what GHC derivation you want. There isn't a way to just say "get me a GHC from the store"!

That's why I was saying we might need the user to tell us which specific GHC derivation they want. Otherwise there really isn't a way to find it, I think.

Also, I hope this code can co-exist with haskell.nix in a very healthy manner. The core work of separating planning and building within Cabal/cabal-install is good for both.

Yeah, I was really mostly bringing up haskell.nix because I think it does grapple with many of these problems today, and is an interesting and useful point of contrast for a lot of them. I'm pretty sure any increase in modularity here would be good for haskell.nix! Everyone loves deleting code :D

andreabedini · 2023-07-10T07:37:10Z

Another relevant ticket is #6885

nomeata · 2023-07-11T21:06:52Z

I am watching a GitHub action build dozens of dependencies that I am sure have been built thousands of times before by someone. So I'll fill the time waiting by saying that a nix backend would be great… :-)

andreabedini · 2023-07-11T23:44:03Z

@nomeata there are a few caching solutions for Haskell and cabal on GitHub Actions, are you using any of those? I am usually ok with just caching the store.

That said, I had a look at implementing a "remote cache" for cabal in the style of bazel. This would be also very similar to how GHA cache works. It requires some rework of the store configuration (it's not just a path anymore) but it's doable if there are any takers.
🤔 I should move this to a different ticket ... (Edit: #9137)

geekosaur · 2023-07-11T23:45:23Z

I was considering mentioning that that sounded less like a Nix backend than a Cachix backend.

TravisWhitaker · 2023-08-02T00:22:32Z

@angerman @michaelpj @Ericson2314 If the upshot here is that implementing this would allow big chunks of haskell.nix functionality to move into cabal itself, I'd be very interested in lending a hand (I may be able to have Anduril fund this).

fgaz added cabal-install: nix integration type: RFC Requests for Comment labels Jul 6, 2023

fgaz self-assigned this Jul 6, 2023

Ericson2314 mentioned this issue Jul 17, 2023

Why is haskell.nix so complex (OR, how can we reduce it)? input-output-hk/haskell.nix#1855

Closed

fgaz mentioned this issue Sep 29, 2023

feature request: Nix Integration support for flake.nix #9046

Open
