
Generate Hackage/LTS nix expressions on the fly rather than have them in the repo #16130

Closed
obadz opened this issue Jun 10, 2016 · 24 comments

@obadz
Contributor

obadz commented Jun 10, 2016

@peti brought up an idea earlier on the IRC channel: if hackage-packages.nix is automatically generated, why does it need to be in the Nix repo? Could we instead generate it as a Nix derivation that would contain all versions of the packages on Hackage?

I hacked something together that's very crude, incomplete, and that needs a lot of work before it becomes even remotely serviceable but it illustrates the concept: https://gist.github.com/obadz/347fcf3ef0a86a9dbccbb7b04a80793b

You can inspect the generated nix expressions with:

$ nix-build ./autoHackage.nix -A autoHackageSrc

and you can see that at least one package builds by trying:

$ nix-build ./autoHackage.nix -A test-abstract-deque

Maybe we can move hackage-packages.nix out of nixpkgs. We can probably do something similar for LTS releases (also @peti's idea).

Notes:

  1. There's a guard in there so that only packages whose names start with "ab" are generated. That's just to get rapid feedback while iterating.
  2. If you generate the full Hackage package set, it's about 60 MB of Nix expressions uncompressed.
  3. I used cabal2nix, but I suspect there are more appropriate tools that I'm just not familiar with.
  4. If something like this became haskellPackages, we'd probably need to keep enough packages in "static Nix expressions" to build cabal2nix and any other required tooling written in Haskell.
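Roughly, the idea is to build a derivation whose output is a tree of generated Nix expressions, then import that output ("import from derivation"). A minimal sketch, under the assumption that cabal2nix can run inside the builder; the real gist is more involved, and all names here are illustrative:

```nix
# Hypothetical sketch, not the actual gist. Note: cabal://... makes
# cabal2nix fetch from Hackage, which a sandboxed builder would forbid;
# a real implementation would work from a fetched Hackage index instead.
{ pkgs ? import <nixpkgs> {} }:

rec {
  # A derivation whose output is a directory of generated .nix files.
  autoHackageSrc = pkgs.runCommand "auto-hackage-src"
    { nativeBuildInputs = [ pkgs.cabal2nix ]; } ''
      mkdir -p $out
      cabal2nix cabal://abstract-deque-0.3 > $out/abstract-deque.nix
    '';

  # Importing a path inside that output forces Nix to build the
  # derivation at evaluation time (import-from-derivation).
  test-abstract-deque =
    pkgs.haskellPackages.callPackage "${autoHackageSrc}/abstract-deque.nix" {};
}
```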

cc @acowley @ttuegel @ocharles

@copumpkin
Member

The main thing I'd want to test is whether your autoHackageSrc gets a "binary" cache download. @shlevy did some work on that back in NixOS/nix#52 but it's since rotted.

In principle I'm super duper in favor, and this is the thing that I've been wanting to see for over a year.

A few thoughts:

  1. Does nix-env (the name-based invocation, not the attribute-path one) force this to be generated/downloaded at evaluation time?
  2. To minimize the size of evaluation-time downloads, is cabal2nix tuned to generate as little code as possible for each package? It seems like ultimately there should be a fairly simple "data model" for Haskell packages that would fit into a giant attrset generated by something like cabal2nix, that then gets consumed by support code that isn't dynamically generated. Basically I'd aim to minimize the amount of repetitive code that is autogenerated in a system like this, by factoring out all repetition to library code.
  3. Nix might want to get more explicit support for staging to make the UX more pleasant here: make it clear that before we can proceed with evaluation, we need to download X, then make it clear that we're now resuming evaluation after X has been downloaded, and then display the usual list of drvs and outputs.
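Point 2 might amount to something like the following: the autogenerated artifact is pure data, and one shared, hand-written function turns each record into a derivation (the record shape and the function name are hypothetical, not an existing nixpkgs interface):

```nix
# Hypothetical autogenerated payload: one small attrset per
# package/version, with no repeated builder boilerplate.
{
  abstract-deque = {
    version = "0.3";
    sha256  = "...";  # elided
    deps    = [ "array" "containers" "random" "time" ];
  };
  # ... thousands more entries ...
}
```

A single library function in nixpkgs (say, a hypothetical `mkHaskellPackage`) would then map over this attrset, keeping the autogenerated portion close to the theoretical minimum.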

cc @edolstra since this is the (kind of) thing I've been bugging him about for a while now, especially around NixOS/nix#52. I'm pretty sure that something resembling this will have to be in a future nixpkgs if we want to keep growing past a certain point.

@peti
Member

peti commented Jun 12, 2016

Implemented in 2862d27.

@peti peti closed this as completed Jun 12, 2016
@ttuegel
Member

ttuegel commented Jun 17, 2016

In my own experiments adapting this method to our Qt and KDE packages, I have discovered a possible fatal flaw. Hydra evaluates Nixpkgs in restricted mode, which prevents fetching during evaluation and prevents evaluating Nix expressions from the store. In other words, Hydra can never evaluate a package with an expression generated this way.

@peti
Member

peti commented Jun 17, 2016

Duh. 😞

@obadz
Contributor Author

obadz commented Jun 17, 2016

Oh no! That is a fatal flaw indeed :-(

@FRidh
Member

FRidh commented Jun 22, 2016

So that means Hydra cannot build packages generated with #16005 either.

@copumpkin
Member

It seems like the notion of restricted mode is overly strict. I do think this is the right way to do things, but we'll need some improvements in Nix itself before it's fully usable. Ping @edolstra and @shlevy who both know a lot about it.

@garbas
Member

garbas commented Jun 30, 2016

Would that mean that changing this line would "fix" this issue? I guess it is there for a reason, but maybe it is time to reconsider it (@edolstra).

@shlevy
Member

shlevy commented Jun 30, 2016

I don't think this needs improvements in nix exactly. The question is, do we want hydra to be downloading from arbitrary places on the web or not? If so, we should allow downloads during restricted mode. If not, not.

@shlevy
Member

shlevy commented Jun 30, 2016

@garbas restricted mode also restricts arbitrary filesystem access, which we definitely want on hydra (don't want someone to be able to upload a nixexpr that dumps /etc/passwd on some hydra box for example)

@shlevy
Member

shlevy commented Jun 30, 2016

Alternatively some mechanism to specify a blessed list of downloads, possibly transformed into hydra build inputs. But then that list would need to be protected more than just general nixpkgs access, or else we might as well just enable downloads in restricted mode again.

@FRidh
Member

FRidh commented Jun 30, 2016

The Nix manual states

Nix has a new option restrict-eval that allows limiting what paths the Nix evaluator has access to. By passing --option restrict-eval true to Nix, the evaluator will throw an exception if an attempt is made to access any file outside of the Nix search path. This is primarily intended for Hydra to ensure that a Hydra jobset only refers to its declared inputs (and is therefore reproducible).

If I am correct, making an exception for just the Nix store would be sufficient.

@shlevy
Member

shlevy commented Jun 30, 2016

No, it's not about accessing store paths in this case, it's about downloads (which aren't documented as part of restricted mode it seems)

@shlevy
Member

shlevy commented Jun 30, 2016

@obadz
Contributor Author

obadz commented Jun 30, 2016

Aren't builds sandboxed from accessing local paths anyway?

@shlevy
Member

shlevy commented Jun 30, 2016

Sure, but without restricted mode evals aren't, and evals copy paths to the store. So I could do echo ${/etc/passwd} > $out/look-here-is-hydras-etc-passwd in my build command just fine without restricted mode.
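Concretely, the leak works because a path literal interpolated into a string is copied into the world-readable store during evaluation, before any build sandbox applies. A minimal illustration of the expression @shlevy describes:

```nix
# Without restrict-eval, merely *evaluating* this expression copies
# /etc/passwd into the Nix store; the sandbox only constrains the build,
# which here just publishes the already-copied file as its output.
derivation {
  name = "leak";
  system = builtins.currentSystem;
  builder = "/bin/sh";
  args = [ "-c" "cp ${/etc/passwd} $out" ];
}
```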

@obadz
Contributor Author

obadz commented Jun 30, 2016

Sounds like we need a "network-only" mode

@shlevy
Member

shlevy commented Jun 30, 2016

Well, restricted mode exists basically only for hydra, so if we want to allow evals to do arbitrary networking we should just remove networking from being gated by restricted mode.

@copumpkin
Member

I guess it really depends on what the original motivation for restricting networking was. I obviously would prefer it to be allowed to enable exactly this sort of use case, but who knows what I'm missing!

@edolstra
Member

edolstra commented Jul 1, 2016

Import-from-derivation is not forbidden on Hydra. E.g. the Debian and RPM functions use it. However, it's a bad idea because it can cause a significant amount of building at evaluation time. (E.g. if there is a stdenv change and Hydra evaluates a call to rpmClosureGenerator, then hydra-evaluator may spend a few hours building stdenv locally.)

Import-from-derivation is however forbidden in read-only mode, so it would break nix-env -qa.

There is an orthogonal issue of whether a call to the builtin fetchTarball function should be allowed in restricted mode. Currently it isn't, but it was my intent to allow it when fetchTarball has a hash argument (i.e. fetchTarball { url = http://bla; sha256 = "..." } would always be allowed, but fetchTarball http://bla wouldn't be).
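In practice the proposed distinction would look like this (the hash-checked form was not yet implemented when this was written; the URL is illustrative):

```nix
# Would be allowed in restricted mode under the proposal: the sha256
# pins the result, so evaluation stays reproducible.
import (fetchTarball {
  url = "https://example.org/auto-hackage.tar.gz";
  sha256 = "...";  # elided
})

# Would remain forbidden: nothing pins what this evaluates to.
# import (fetchTarball "https://example.org/auto-hackage.tar.gz")
```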

@copumpkin
Member

Here's a transcript from a follow-up convo I had with @edolstra on IRC:

[08:57:43]  <copumpkin> niksnut: thanks. It doesn't seem fundamental that builds during evaluation happen locally though, right?
[08:59:38]  <niksnut>   copumpkin: no, actually they may be distributed, but either way is bad
[09:00:20]  <copumpkin> niksnut: to me the ideal in that situation is that the evaluator plans this sort of thing ahead of time (at least with my limited understanding of the matter)
[09:01:01]  <copumpkin> niksnut: i.e., it doesn't just blindly go evaluate rpmClosureGenerator, but evaluates a bunch of things and understands that rpmClosureGenerator will be evaluatable after stdenv is built, so it pauses and does other stuff until stdenv is built
[09:01:22]  <copumpkin> or does that sound ridiculous?
[09:01:48]  <copumpkin> if you just nix-build -A rpmClosureGenerator you get a stdenv build during evaluation, of course
[09:01:55]  <copumpkin> but in isolation that doesn't seem like a bad outcome
[09:02:02]  <niksnut>   I don't see an easy way to accomplish that
[09:02:18]  <niksnut>   except by getting rid of the separate evaluator altogether
[09:02:26]  <copumpkin> in what sense?
[09:02:36]  <niksnut>   moving hydra-evaluator into hydra-queue-runner
[09:02:43]  <copumpkin> oh
[09:02:50]  <copumpkin> is that bad for other reasons?
[09:02:56]  <copumpkin> (I'm clueless if you couldn't tell)
[09:03:24]  <niksnut>   it's a lot of work, and would require a major change to the hydra schema
[09:03:37]  <copumpkin> ah
[09:03:46]  <niksnut>   for example, it would require having build steps that are not part of a build
[09:04:45]  <copumpkin> niksnut: do you at least buy the motivation for this change, if not the practicalities? the way I see it, this would allow us to start partitioning nixpkgs, stop growing the repo uncontrollably, and actually doing autogenerated package ecosystems properly (i.e., they'd still be locked down for a given commit, but we wouldn't have to have all the data ahead of time)
[09:05:35]  <niksnut>   copumpkin: yes
[09:05:58]  <niksnut>   note that implementing hash checking in fetchTarball would be trivial to do
[09:06:01]  <copumpkin> niksnut: my fear with fetchtarball is that it's a small bandaid. We might still need arbitrary tooling to autogenerate the nix expressions for the next stage of evaluation, if not to download the sources
[09:06:26]  <copumpkin> niksnut: this is sort of speaking to my earlier rambling about "stages of evaluation/building"
[09:06:48]  <copumpkin> basically stratification of the evaluation process into distinct steps, possibly with builds interleaved between them
[09:09:51]  <copumpkin> niksnut: or do you think adding the hash check to fetchTarball (and the exception to restricted mode) is enough to kickstart this sort of thing?
[09:14:28]  <niksnut>   copumpkin: either way, there is the downside that it makes nixpkgs no longer self-contained
[09:14:56]  <niksnut>   for example, it might become impossible to evaluate a nixpkgs version in the future if some of the referenced external files disappear
[09:15:13]  <copumpkin> niksnut: as long as the external references are hash-locked, that doesn't seem that bad? it's already impossible to do anything with an evaluated nixpkgs in future if the referenced files disappear
[09:15:45]  <copumpkin> I see some appeal to keeping it self-contained, but to me at least the cons are starting to outweigh the pros
[09:16:51]  <gchristensen>  it also makes it much much more difficult to reason about the repository
[09:17:18]  <gchristensen>  because now your diff is -SHA +SHA which could be the difference of a couple lines, or a mass rebuilding of the entire python infrastructure
[09:17:47]  <copumpkin> sure
[09:17:53]  <copumpkin> but do you think the current thing is sustainable?
[09:18:00]  <copumpkin> like I'd love to add a real java ecosystem
[09:18:10]  <copumpkin> but there's no chance in hell that git would survive a haskellPackages-like approach to that
[09:18:25]  <copumpkin> in fact, we basically can't afford another haskellPackages
[09:18:34]  <gchristensen>  yes I agree with you, copumpkin
[09:18:44]  <copumpkin> even though it's sort of the ideal scenario for an individual package ecosystem
[09:19:22]  <gchristensen>  copumpkin: actually, why can't it survive? there are many stories of monster monster git repositories
[09:20:07]  <gchristensen>  copumpkin: ps: I'd love to hear more about your java ecosystem ... I've been trying to package a gradle program and it has been horribly frustrating
[09:20:08]  <copumpkin> gchristensen: autogenerated code in repositories has lots of downsides, and nobody likes monster git repositories. We also increase the channel size and hinder federated ecosystem builds
[09:20:24]  <copumpkin> gchristensen: i.e., peti has to chew people out periodically for touching his autogenerated code
[09:20:33]  <copumpkin> it makes diffs unusable, etc.
[09:20:39]  <copumpkin> all the usual reasons people hate autogenerated code in VCS
[09:20:48]  <gchristensen>  copumpkin: fair enough
[09:20:57]  <FRidh> niksnut, copumpkin: I think it's therefore important that we discuss now exactly what we do and do not keep in external repo's, and that we agree we keep the external repo's in the NixOS org. That should prevent us from 'losing' any.
[09:21:04]  <copumpkin> Nix also just has such a nice way to manage build artifacts, except when it wants to evaluate its own build artifacts :P
[09:21:12]  <copumpkin> FRidh: sure
[09:21:28]  <gchristensen>  copumpkin: ok I agree with you again

@edolstra
Member

edolstra commented Jul 1, 2016

Regarding import-from-derivation, we could allow substitution (but not building) in read-only mode. So if nix-env -qa encounters import autoHackageSrc, and the output of autoHackageSrc exists in the binary cache, it would be downloaded. Otherwise it would throw an exception (which nix-env should handle in some graceful way). That should cover most users.

@peti
Member

peti commented Jul 12, 2016

@edolstra, I like this solution:

Regarding import-from-derivation, we could allow substitution in read-only mode. So if nix-env -qa encounters import autoHackageSrc, and the output of autoHackageSrc exists in the binary cache, it would be downloaded.

hydra.nixos.org would build "auto-haskell.nix" just like any other derivation, and once it has built that derivation, the rest of Nixpkgs can import it (on hydra.nixos.org), too.

The only complication is that we don't have one big fat "auto-haskell.nix" file; instead we have several hundred small "cabal2nix-foo-x.y.nix" files, where "foo" is some Haskell package and "x.y" is some version number. To get all those files realized, we would have to assign a proper attribute to each of those expressions so that packagePlatforms discovers and builds it. This is not nearly as nice as the case where users can instantiate Haskell build expressions on the fly by writing callHackage "foo" "1.0" {} wherever they please. Unfortunately, Hydra would not find those derivations and consequently would never build them.
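Sketching that tension: Hydra only builds what it can discover as attributes, so every generated file needs an explicit binding, while the pleasant ad-hoc form never shows up in the jobset (attribute names here are illustrative; the callHackage call follows peti's example above):

```nix
# Discoverable by Hydra: one attribute per generated expression.
{
  cabal2nix-foo-1_0 = haskellPackages.callPackage ./cabal2nix-foo-1.0.nix {};
  # ... one binding per generated file ...
}

# Convenient for users, but invisible to Hydra's enumeration when
# written ad hoc inside some other expression:
#   haskellPackages.callHackage "foo" "1.0" {}
```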

@peti
Member

peti commented Jul 19, 2016

I'm closing this issue since we can generate Hackage expressions on the fly, which is the subject of this ticket. Further discussion of the "how to build on Hydra" issue should take place at NixOS/nix#954, IMHO.

@peti peti closed this as completed Jul 19, 2016
peti added a commit to peti/nixpkgs that referenced this issue Jul 21, 2016
This is the first step towards dropping Stackage support. We keep LTS 6.x
around because I don't want to downgrade our default compiler to GHC 7.x,
but once LTS 7.x comes out we'll switch our main package set to that and
drop Nightly.

More details are at:

  http://permalink.gmane.org/gmane.linux.distributions.nixos/20505

Closes NixOS#14897.

Also relevant:

 - NixOS#16130
 - commercialhaskell/stack#2259
8 participants