stdenv/generic/setup.sh: enable parallel installs for parallel builds #217568
If there is consensus that this is a useful change, I think it would be worthwhile to have a Hydra run before merge to While building
More failures from hydra run:
If a package build is doing anything other than installing files in the installPhase, that smells to me like it's building some previously unbuilt target that should have been built in the buildPhase but wasn't. Your example might be building the docs as a dependency of the install. The correct fix in that case would be to explicitly build all targets that are to be installed rather than accelerating builds in the installPhase. Now of course there might still be a benefit to installing in parallel (I would expect there to be one), but that should be evaluated independently of the coincidental fix to the potential issue above.
I will not stop you from fixing it. But I do not think such fixes belong to
It really wouldn't be much of a fix; you'd merely declare the default + additional targets in
I generally agree but I fear doing so would mask the actual issue that should be worked around or at least reported. Therefore, I'd advocate for this feature being handled the same as Another unknown is whether this affects r13y (certain parallel builds have done so in the past) which is also why I'd prefer this being off by default.
That's great. My point is that this PR's merit should be measured by cases like that rather than (potentially, I haven't checked) broken build scripts.
If we enable parallel installs by default you would not need to do even that. Which sounds like a nice decrease in maintenance burden.
I don't think we have any signal today that we will lose by enabling install parallelism.
Why is already used
Why is it relevant here? If parallel builds break reproducibility then we fix the reproducibility issue one way or another. Why mask it? Note that it is not as drastic as breaking a package build. I would expect packages failing to build to be a more common issue than introduced output non-determinism.
I don't see a big difference in those. Both can do more or do less in |
The actual issue is already masked. There is no To unmask the issue would require a PR that causes
This is a totally valid concern. I think the proposal to do a few Hydra runs with this PR before merging it is a very good idea.
That'd mean building things in the install phase and that just seems wrong to me.
The signal is time and parallelism. When refactoring ffmpeg, I mistakenly stopped building the general all target and only noticed because it was taking forever building ffmpeg on only one core despite I wasn't building anything in the buildPhase there. I wouldn't have noticed that if the installPhase built the package in parallel.
Ah, didn't notice it was guarded behind
Not a big difference in what? Time? I'm mostly concerned about weighing this PR's merit by the time improvements in regular builds rather than broken ones, as the broken ones should be fixed rather than worked around.
That sounds great.
Sounds a bit abstract. What query should I write against local logs or hydra to catch If I were to build interesting the problematic cases in An example of a more actionable signal:
That at least allows you to construct a query to find suspicious cases, like: "a lot (>10s) of time in install", "install time is a lot longer than build time" and similar. I don't think making the install phase parallel by default makes something like that infeasible. And I don't think we have an equivalent of that today as a signal.
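As a rough sketch of such a query: the function below assumes a hypothetical plain-text input with one line per package in the form `<package> <build_seconds> <install_seconds>`. Neither that format nor the name `suspicious_installs` comes from Nix or Hydra; they are placeholders for whatever timing data is actually available. It flags packages whose install takes longer than a threshold (10s by default) or longer than their build.

```shell
# Hypothetical helper, not part of nixpkgs or Hydra.
# Reads lines of the assumed form "<package> <build_seconds> <install_seconds>"
# on stdin and prints the names of packages with a suspicious install phase.
suspicious_installs() {
    awk -v min_install="${1:-10}" '
        # Flag installs that are slow in absolute terms or slower than the build.
        $3 > min_install || $3 > $2 { print $1 }
    '
}

# Example invocation (timings.txt is an assumed file name):
#   suspicious_installs 10 < timings.txt
```

A stricter variant could require the install time to exceed the build time by some factor rather than merely being larger.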
How did you detect that? Did you look at the timing reported in logs? Or happened to notice it interactively? I agree this specific use case will look different. I disagree that it outweighs the benefit of faster installs for everyone. I would say that it's a feature to get a faster install in this case: if the end result is the same then it's a cosmetic issue. If the result is different, you can find out about it faster as the full build is still fast.
Not a big difference in the type of the problem: both do something substantial in
I disagree that |
Yes, if we don't always parallel-build due to race conditions making output less deterministic than it could be, and the main case where parallel install helps is building during the install phase, same considerations probably apply.
Maybe it should just default to Notably, |
Rebased against today's merge base of |
Without the change parallel install fails as:

    $ install flags: -j16 ...
    ...
    collect2: error: ld returned 1 exit status
    libtool: error: error: relink 'libsvn_ra_serf-1.la' with the above command before installing it
    make: *** [build-outputs.mk:1316: install-serf-lib] Error 1
    make: *** Waiting for unfinished jobs....
    /nix/store/1qasgqvab0xh2jcy00x9b1zh39dw7m8f-bin

Without the change parallel install fails as:

    $ install flags: -j16 ...
    ...
    install: target '...-ocaml-4.14.0/lib/ocaml/threads': No such file or directory
    make[1]: *** [Makefile:140: installopt] Error 1

Without the change parallel installs fail as:

    install flags: -j2
    ...
    ln: failed to create symbolic link '...-eresi-0.83-a3-phoenix//bin/elfsh': No such file or directory
    make: *** [Makefile:108: install64] Error 1

Without the change parallel installs fail as:

    install flags: -j16
    install -d -m 0755 ...-s9fes-20181205/share/s9fes
    sed -e "s|^#! /usr/local|#! ...-s9fes-20181205|" <prog/s9help.scm >...-s9fes-20181205/bin/s9help
    ...-bash-5.2-p15/bin/bash: line 1: ...-s9fes-20181205/bin/s9help: No such file or directory
    make: *** [Makefile:157: install-util] Error 1
    make: *** Waiting for unfinished jobs....

Without the change parallel installs fail as:

    install flags: -j1
    ...
    install -m644 src/doc/*.md ...-vpnc-unstable-2021-11-04/share/doc/vpnc
    install: target '...-vpnc-unstable-2021-11-04/share/doc/vpnc': No such file or directory

Without the change parallel installs fail as:

    ...-coreutils-9.1/bin/install: cannot stat 'asy-keywords.el': No such file or directory
    make: *** [Makefile:272: install-asy] Error 1

Without the change parallel installs fail as:

    cp: cannot stat '...-gretl-2022c/share/gretl/data/plotbars': Not a directory
    make[1]: *** [Makefile:73: install_datafiles] Error 1

Without the change parallel installs fail as:

    lrelease error: Parse error at src/translations/qsynth_ru.ts:1503:33: Premature end of document.
    make: *** [Makefile:107: src/translations/qsynth_ru.qm] Error 1

Without the change parallel installs fail as:

    ...-binutils-2.40/bin/ld: cannot find ./.libs/libircd.so: No such file or directory
    collect2: error: ld returned 1 exit status
    make[4]: *** [Makefile:634: solanum] Error 1
Hydra run is complete \o/: https://hydra.nixos.org/eval/1791763?compare=trunk&full=1#tabs-now-fail
It uncovered only 12 packages that fail to install (all have a workaround now). Does not sound too bad.
A bit of fallout in |
From
I fixed more failures that seemed to be caused by this. I suppose the best reference generally is git grep 'enableParallelInstalling = false;' or git log -S enableParallelInstalling
The primary motivating example is openssl:
Before the change the full package build took 1m54s. After the change the full package build takes 59s.
About a 2x speedup.
The difference is visible because openssl builds hundreds of manpages, spawning a perl process per manual in the install phase. Such a workload is very easy to parallelize.

Another example would be an autotools+libtool based build system where the install step requires relinking. The more binaries there are to relink, the more there is to gain from doing it in parallel.

The change enables parallel installs by default only for builds that already have parallel builds enabled. There is a high chance those build systems already handle parallelism well, but some packages will fail.