NixOS closure size reduction #7117

Closed
edolstra opened this Issue Apr 1, 2015 · 32 comments



@edolstra
Member
edolstra commented Apr 1, 2015

NixOS currently takes up a lot of disk space, which is especially bad for container and cloud deployments. For instance, a basic LAPP (Linux+Apache+PostgreSQL+PHP) configuration takes > 1500 MB.

Reducing this mostly requires finishing the multiple outputs work to ensure that build-time dependencies (like GCC) don't appear in runtime closures: https://github.com/vcunat/nixpkgs/compare/v/modular
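
One quick way to check whether a build-time dependency still leaks into a runtime closure is to filter the output of `nix-store -qR` by package name. A minimal sketch, assuming a `./result` symlink from an earlier build; the helper name `closure_contains` is made up for illustration, and the name argument is treated as a regular expression:

```shell
# closure_contains NAME: read store paths on stdin (one per line) and print
# those whose package name starts with NAME. Exit status is non-zero if
# nothing matches. (Hypothetical helper, not part of Nix.)
closure_contains() {
    # Store paths look like /nix/store/<hash>-<name>; the hash contains no
    # hyphen, so NAME must follow the first hyphen directly.
    grep -E "^/nix/store/[a-z0-9]+-$1(-|\$)"
}

# Typical use (requires Nix):
#   nix-store -qR ./result | closure_contains gcc
```

An empty result suggests the closure no longer references that package; any printed path is a candidate for further splitting.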

Progress can be tracked in the closure size charts on Hydra:

http://hydra.nixos.org/job/nixos/trunk-combined/nixos.closures.tinyContainer.x86_64-linux#tabs-charts
http://hydra.nixos.org/job/nixos/trunk-combined/nixos.closures.smallContainer.x86_64-linux#tabs-charts
http://hydra.nixos.org/job/nixos/trunk-combined/nixos.closures.ec2.x86_64-linux#tabs-charts
http://hydra.nixos.org/job/nixos/trunk-combined/nixos.closures.lapp.x86_64-linux#tabs-charts
http://hydra.nixos.org/job/nixos/trunk-combined/nixos.closures.xfce.x86_64-linux#tabs-charts

@edolstra edolstra added this to the 15.05 milestone Apr 1, 2015
@edolstra
Member
edolstra commented Apr 1, 2015

@vcunat Any objection to moving the v/modular branch to the main repo (maybe renamed to closure-size or something like that)?

@vcunat
Member
vcunat commented Apr 4, 2015

Yes, good name, no objection. Link to some description of the work: https://gist.github.com/vcunat/6139ee17ae1dec684fd3. I believe I should get to significantly moving it again in April, but it's difficult to predict how my free time turns out.

Some commits in the branch aren't very nice – they just served as a snapshot of current state while experimenting (some even have very non-descriptive names like WIP). IIRC I didn't originally mean to merge to master without cleaning the history, but if it doesn't matter... Looking now, most commits don't seem so bad, only there are often repeated changes of the same things, which makes the history a bit more difficult to read, but perhaps it's better to read diffs against merge-point to master instead.

@domenkozar
Member

Glibc depends on linux-headers at runtime because of some hidden files: https://gist.github.com/ce540a72775ac56802d3
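
Nix finds runtime dependencies by scanning a build output for the hash parts of other store paths, so a stray file that mentions a path is enough to pull it into the closure. A rough sketch for locating such files, assuming you already know the hash of the unwanted dependency; `files_referencing` is a hypothetical helper name:

```shell
# files_referencing DIR HASH: list the files under DIR (hidden ones included)
# that contain the literal string HASH, e.g. the hash part of the store path
# of the unwanted dependency. (Hypothetical helper, not part of Nix.)
files_referencing() {
    # -r recurses, -l prints only file names, -F treats HASH as a fixed string.
    grep -rlF -- "$2" "$1"
}

# Typical use (requires Nix; both arguments are placeholders):
#   files_referencing /nix/store/<hash>-glibc-<version> <hash-of-linux-headers>
```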

@domenkozar
Member

Also, there are two openssl packages in the closure. I think one comes from curl, where openssl.crossDrv is used.

@vcunat
Member
vcunat commented Apr 11, 2015

The linux headers dependency was solved years ago in Eelco's branch IIRC (although the problem could've re-appeared since then).

@vcunat
Member
vcunat commented Apr 18, 2015

Pushed that branch as closure-size, and deleted its ancestor multiple-outputs.

@edolstra edolstra added a commit that referenced this issue Apr 19, 2015
@edolstra edolstra Don't include ntfs-3g by default
Issue #7117.
57b0576
@edolstra edolstra added a commit that referenced this issue Apr 19, 2015
@edolstra edolstra Include cifs-utils only when needed
Issue #7117.
2b6d011
@vcunat
Member
vcunat commented Apr 20, 2015

Branch status, TL;DR: I can build fairly complex things, such as firefox or qt4. For now, I'm avoiding committing/pushing new splits of more packages except for those that need fixing.

I would like to focus on stabilizing it soon and getting it to master fast, as the benefits already seem quite significant, and we can iterate afterwards. Soon I want to fix up whatever is still needed to rebuild my full system; afterwards it might be good to create a Hydra jobset to find more build breakages. I expect most problems to manifest at build time, the notable exception being ${pkg}/path strings that are used at runtime (often paths to executables, or paths in wrappers).

@edolstra edolstra added a commit that referenced this issue Apr 20, 2015
@edolstra edolstra minimal.nix: Get rid of most Glibc locales
This cuts ~100 MB from the system closure.

Issue #7117.
650492c
@edolstra edolstra added a commit that referenced this issue Apr 20, 2015
@edolstra edolstra Remove sysvtools from the system path
All programs in sysvtools (except killall5) are also provided by
util-linux or procps.

Issue #7117.
d69b205
@domenkozar
Member

@vcunat awesome, really looking forward to this. I think we should first merge it to staging and stabilize it there.

@vcunat
Member
vcunat commented Apr 23, 2015

Ugh, mariadb on 14.12 has an output size of 464 MB. Was it always so? Maybe it would be worth backporting some commits there, as on staging I can "only" see 126+66 MB size (EDIT: for the $lib output, without changes from my branch).

@domenkozar
Member

because @wkennington split /lib

@vcunat
Member
vcunat commented Apr 23, 2015

But 126+66 << 460.

@edolstra
Member

See #7114.

@vcunat
Member
vcunat commented Apr 23, 2015

Maybe removing 264 MB of mysql-test would be enough for 14.12. Shall I commit that? /cc @wkennington.

@wkennington
Contributor

You should be able to mostly copy the "# Remove superfluous files" section.

@vcunat vcunat added a commit that referenced this issue Apr 23, 2015
@vcunat vcunat mariadb: remove ~250MB of superfluous files
Picked lines from master, discussion:
#7117 (comment)

The output is still ~190 MB, but it's much better.
On master there's a splitting solution anyway.
cf46c88
@vcunat
Member
vcunat commented Apr 23, 2015

Pushed cf46c88. (I'm sorry I didn't notice that issue which is more on-topic.)

@ip1981 ip1981 added a commit to zalora/microgram that referenced this issue Apr 24, 2015
@ip1981 ip1981 Deflate MariaDB 0977cb9
@wmertens
Contributor

How about pruning out the linux modules, with a wrapper build? The kernel is 130MB, with 87MB being drivers. That would help especially well with the ec2 build where video, gpu, sound, scsi, media, net/wireless and (probably) staging drivers are never used.

The firmwares are also 90+MB and completely unnecessary on ec2.

Note that recently the ec2 closure size jumped from 800MB to 1.2GB.

@wmertens
Contributor

@edolstra ^

Also, any command lines to quickly get a tree of closures ordered by size would be great, or anything else helping with debugging closure size.

@edolstra
Member

Where do you see that 1.2 GB? I don't see it on http://hydra.nixos.org/job/nixos/trunk-combined/nixos.closures.ec2.x86_64-linux#tabs-charts.

Regarding kernel modules, we could either build a kernel with a much smaller configuration, or use makeModulesClosure to include only required modules in the system closure.

@edolstra
Member

@wmertens I use

du -scl $(nix-store -qR ./result) | sort -n
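
The raw output is in 1 KiB blocks (assuming GNU du defaults) and ends with a "total" line, which makes it a bit hard to scan. A small sketch that post-processes it into a top-N list in MiB; `largest_paths` is a hypothetical helper name:

```shell
# largest_paths N: read `du -s` style lines ("<KiB><TAB><path>") on stdin and
# print the N largest entries as "<MiB> <path>", largest first.
# (Hypothetical helper; assumes paths contain no whitespace, which holds
# for store paths.)
largest_paths() {
    grep -v '[[:space:]]total$' |   # drop the summary line added by `du -c`
        sort -rn |                  # largest first
        head -n "$1" |
        awk '{ printf "%.1f %s\n", $1 / 1024, $2 }'
}

# Typical use (requires Nix):
#   du -scl $(nix-store -qR ./result) | largest_paths 10
```
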
@wmertens
Contributor

@edolstra Oops, I was playing with virtualbox, which gave me a 1.2GB install; I saw that the closure size jumped on the graph and somehow interpreted that as 1.2GB instead of 850MB. Need more coffee 😅

@wmertens
Contributor

1% savings observation: both bash and bashInteractive are installed on my system, each 6MB. What is the advantage of scripts using the non-interactive version?

@edolstra
Member

It's probably not easy to get rid of all references to the non-interactive version because that's what stdenv uses. So any build that stores ${stdenv.shell} somewhere will get the non-interactive version.

BTW, it might load slightly faster due to the absence of readline/ncurses.

@wmertens
Contributor

We could simply make the interactive one the default... I doubt anyone will notice those extra ld linking steps, plus two bashes mean two times the memory pressure...

@wmertens
Contributor

3% observation: .a static libraries are taking up 26MB (biggest is spidermonkey via polkit). Would it be feasible to automatically move .a files into a dev output?
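
A number like this can be reproduced roughly by summing the sizes of all `.a` files under the store paths of a closure. A minimal sketch; `static_lib_bytes` is a hypothetical helper name, and hard-linked files are counted once per path:

```shell
# static_lib_bytes PATH...: print the total size in bytes of all *.a files
# found under the given paths. (Hypothetical helper, not part of Nix.)
static_lib_bytes() {
    # Concatenate every matching file and count the bytes; with no matches
    # cat is never run and the result is 0.
    find "$@" -type f -name '*.a' -exec cat {} + | wc -c | tr -d ' '
}

# Typical use (requires Nix):
#   static_lib_bytes $(nix-store -qR /nix/var/nix/profiles/system)
```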

@wmertens
Contributor

Handy command: du -scl $(nix-store -qR /nix/var/nix/profiles/system) | sort -n | tail -20 | head -19 | while read size path; do echo $size $path; nix-store -q --referrers $path | sed 's:/nix/store/: :'; done

@vcunat vcunat modified the milestone: 16.03, 15.09 Sep 29, 2015
@domenkozar
Member

Closure graphs look ok, bumping milestone for closure-size branch

@domenkozar domenkozar modified the milestone: 16.09, 16.03 Feb 29, 2016
@domenkozar
Member

I'm really eager to see numbers once staging is merged. pythonFull went from 312M to 141M \o/

@danbst
Contributor
danbst commented Jun 3, 2016

The lapp graph isn't much better than 2 years ago; closure-size had no effect here. Does anybody know why?

@dezgeg
Contributor
dezgeg commented Jun 3, 2016

I think when I last checked, php references quite a bit of dev headers still. Maybe it could be split.

@vcunat
Member
vcunat commented Jun 4, 2016

Also note that graphs of multiple-output jobs probably don't show what people expect AFAIK.

@zimbatm
Contributor
zimbatm commented Jun 8, 2016

Now that the closure-size branch has been merged, I think we can close this issue. Or are there any other specific actionable items left?

@vcunat
Member
vcunat commented Jun 8, 2016 edited

Let me close it. The lapp and xfce closures have decreased by about a third, though the containers not by much. The infrastructure is there, so if anyone finds more unnecessary files or dependencies, it should help them split those away.

BTW, the next general splitting improvement would likely be share/locale. It often takes a significant part of core packages and most people will only want a tiny fraction of it (one or two languages), but it would need us to introduce a way for gettext to find the files (probably an env var with path list). I'm unlikely to do such large projects completely for free, during this year at least.

@vcunat vcunat closed this Jun 8, 2016