NixOS closure size reduction #7117

Closed
edolstra opened this Issue Apr 1, 2015 · 32 comments

@edolstra
Member

edolstra commented Apr 1, 2015

NixOS currently takes up a lot of disk space, which is especially bad for container and cloud deployments. For instance, a basic LAPP (Linux+Apache+PostgreSQL+PHP) configuration takes > 1500 MB.

Reducing this mostly requires finishing the multiple outputs work to ensure that build-time dependencies (like GCC) don't appear in runtime closures: https://github.com/vcunat/nixpkgs/compare/v/modular

Progress can be tracked in the closure size charts on Hydra:

http://hydra.nixos.org/job/nixos/trunk-combined/nixos.closures.tinyContainer.x86_64-linux#tabs-charts
http://hydra.nixos.org/job/nixos/trunk-combined/nixos.closures.smallContainer.x86_64-linux#tabs-charts
http://hydra.nixos.org/job/nixos/trunk-combined/nixos.closures.ec2.x86_64-linux#tabs-charts
http://hydra.nixos.org/job/nixos/trunk-combined/nixos.closures.lapp.x86_64-linux#tabs-charts
http://hydra.nixos.org/job/nixos/trunk-combined/nixos.closures.xfce.x86_64-linux#tabs-charts

@edolstra edolstra added this to the 15.05 milestone Apr 1, 2015

Member

edolstra commented Apr 1, 2015

@vcunat Any objection to moving the v/modular branch to the main repo (maybe renamed to closure-size or something like that)?

Member

vcunat commented Apr 4, 2015

Yes, good name, no objection. Link to some description of the work: https://gist.github.com/vcunat/6139ee17ae1dec684fd3. I believe I should get to moving it forward significantly again in April, but it's difficult to predict how my free time turns out.

Some commits in the branch aren't very nice – they just served as a snapshot of current state while experimenting (some even have very non-descriptive names like WIP). IIRC I didn't originally mean to merge to master without cleaning the history, but if it doesn't matter... Looking now, most commits don't seem so bad, only there are often repeated changes of the same things, which makes the history a bit more difficult to read, but perhaps it's better to read diffs against merge-point to master instead.

Member

domenkozar commented Apr 10, 2015

Glibc depends on linux-headers at runtime because of some hidden files: https://gist.github.com/ce540a72775ac56802d3

Member

domenkozar commented Apr 10, 2015

Also, there are two openssl packages in the closure. I think one comes from curl, where openssl.crossDrv is used.

Member

vcunat commented Apr 11, 2015

The linux headers dependency was solved years ago in Eelco's branch IIRC (although the problem could've re-appeared since then).

Member

vcunat commented Apr 18, 2015

Pushed that branch as closure-size, and deleted its ancestor multiple-outputs.

edolstra added a commit that referenced this issue Apr 19, 2015

edolstra added a commit that referenced this issue Apr 19, 2015

Member

vcunat commented Apr 20, 2015

Branch status, TL;DR: I can build fairly complex things, such as firefox or qt4. For now, I'm avoiding committing/pushing new splits of more packages except for those that need fixing.

I would like to focus on stabilizing it soon and getting it to master fast, as the benefits seem quite significant already, and we can iterate afterwards. Soon I want to fix up whatever is still needed to rebuild my full system; afterwards it might be good to create a Hydra jobset to find more build breakages. I expect most problems to manifest at build time, notable exceptions being ${pkg}/path strings used at runtime (often paths to executables or in wrappers).

edolstra added a commit that referenced this issue Apr 20, 2015

minimal.nix: Get rid of most Glibc locales
This cuts ~100 MB from the system closure.

Issue #7117.

edolstra added a commit that referenced this issue Apr 20, 2015

Remove sysvtools from the system path
All programs in sysvtools (except killall5) are also provided by
util-linux or procps.

Issue #7117.

Member

domenkozar commented Apr 20, 2015

@vcunat awesome, really looking forward to this. I think we should first merge it to staging and stabilize it there.

Member

vcunat commented Apr 23, 2015

Ugh, mariadb on 14.12 has an output size of 464 MB. Was it always so? Maybe it would be worth backporting some commits there, as on staging I can "only" see 126+66 MB (EDIT: for the $lib output, without changes from my branch).

Member

domenkozar commented Apr 23, 2015

because @wkennington split /lib

Member

vcunat commented Apr 23, 2015

But 126+66 << 460.

Member

edolstra commented Apr 23, 2015

See #7114.

Member

vcunat commented Apr 23, 2015

Maybe removing 264 MB of mysql-test would be enough for 14.12. Shall I commit that? /cc @wkennington.

Contributor

wkennington commented Apr 23, 2015

You should be able to mostly copy the # Remove superfluous files section

vcunat added a commit that referenced this issue Apr 23, 2015

mariadb: remove ~250MB of superfluous files
Picked lines from master, discussion:
#7117 (comment)

The output is still ~190 MB, but it's much better.
On master there's a splitting solution anyway.

Member

vcunat commented Apr 23, 2015

Pushed cf46c88. (I'm sorry I didn't notice that issue which is more on-topic.)

ip1981 added a commit to zalora/microgram that referenced this issue Apr 24, 2015

Contributor

wmertens commented May 13, 2015

How about pruning out the linux modules, with a wrapper build? The kernel is 130MB, with 87MB being drivers. That would help especially well with the ec2 build where video, gpu, sound, scsi, media, net/wireless and (probably) staging drivers are never used.

The firmware is also 90+ MB and completely unnecessary on ec2.

Note that recently the ec2 closure size jumped from 800MB to 1.2GB.

Contributor

wmertens commented May 13, 2015

@edolstra ^

Also, any command lines to quickly get a tree of closures ordered by size would be great, or anything else helping with debugging closure size.

Member

edolstra commented May 13, 2015

Where do you see that 1.2 GB? I don't see it on http://hydra.nixos.org/job/nixos/trunk-combined/nixos.closures.ec2.x86_64-linux#tabs-charts.

Regarding kernel modules, we could either build a kernel with a much smaller configuration, or use makeModulesClosure to include only required modules in the system closure.

Member

edolstra commented May 13, 2015

@wmertens I use

du -scl $(nix-store -qR ./result) | sort -n
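
A possible variation on the command above, as a sketch rather than anything posted in this thread: the hypothetical helper `largest_paths` reads store paths on stdin and prints only the N largest, largest first. It uses plain `du -sk` for portability; the `-l` and `-c` flags of the original command are dropped, so hard-linked files shared between paths may be counted only once and no grand total is printed.

```shell
# largest_paths [N]: read paths from stdin (one per line) and print the
# N largest as "<KiB><TAB><path>", largest first. N defaults to 20.
# NOTE: the function name is made up for this example.
largest_paths() {
    n="${1:-20}"
    # -s: one total per argument; -k: report sizes in KiB.
    xargs du -sk | sort -rn | head -n "$n"
}

# Intended use with Nix (assumes ./result is a build result symlink):
#   nix-store -qR ./result | largest_paths 10
```

Reading paths from stdin keeps the helper independent of how the closure is listed, so the same function works for `./result` or a system profile.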

Contributor

wmertens commented May 13, 2015

@edolstra oops I was playing with virtualbox which gave me a 1.2GB install and I saw that the closure size jumped on the graph and somehow I interpreted that as 1.2GB instead of 850MB. Need more coffee 😅

Contributor

wmertens commented May 13, 2015

1% savings observation: both bash and bashInteractive are installed on my system, each 6MB. What is the advantage of scripts using the non-interactive version?

Member

edolstra commented May 13, 2015

It's probably not easy to get rid of all references to the non-interactive version because that's what stdenv uses. So any build that stores ${stdenv.shell} somewhere will get the non-interactive version.

BTW, it might load slightly faster due to the absence of readline/ncurses.

Contributor

wmertens commented May 13, 2015

We could simply make the interactive one the default... I doubt anyone will notice those extra ld linking steps, plus two bashes mean two times the memory pressure...

Contributor

wmertens commented May 13, 2015

3% observation: .a static libraries are taking up 26MB (biggest is spidermonkey via polkit). Would it be feasible to automatically move .a files into a dev output?
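
One way to reproduce this kind of measurement, as a rough sketch: the hypothetical helper `static_lib_kib` below sums the on-disk size of all `.a` files under the given directories. The function name and the usage line are assumptions for illustration and are not part of nixpkgs.

```shell
# static_lib_kib DIR...: print the total size in KiB of all *.a files
# found under the given directories (0 if there are none).
# NOTE: the function name is made up for this example.
static_lib_kib() {
    # du -k prints "<KiB><TAB><file>" per file; awk sums the first column.
    find "$@" -type f -name '*.a' -exec du -k {} + |
        awk '{ s += $1 } END { print s + 0 }'
}

# Intended use with Nix (assumes a NixOS system profile exists):
#   static_lib_kib $(nix-store -qR /nix/var/nix/profiles/system)
```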

Contributor

wmertens commented May 13, 2015

Handy command:

du -scl $(nix-store -qR /nix/var/nix/profiles/system) | sort -n | tail -20 | head -19 |
  while read size path; do
    echo $size $path
    nix-store -q --referrers $path | sed 's:/nix/store/: :'
  done

Member

domenkozar commented Feb 29, 2016

Closure graphs look ok, bumping milestone for closure-size branch

@domenkozar domenkozar modified the milestones: 16.09, 16.03 Feb 29, 2016

Member

domenkozar commented Apr 11, 2016

I'm really eager to see numbers once staging is merged. pythonFull went from 312M to 141M \o/

Contributor

danbst commented Jun 3, 2016

The lapp graph isn't much better than 2 years ago. closure-size had no effect here. Does anybody know why?

Contributor

dezgeg commented Jun 3, 2016

I think when I last checked, php references quite a bit of dev headers still. Maybe it could be split.

Member

vcunat commented Jun 4, 2016

Also note that graphs of multiple-output jobs probably don't show what people expect AFAIK.

Member

zimbatm commented Jun 8, 2016

Now that the closure-size branch has been merged, I think we can close this issue. Or are there any other specific actionable items left?

Member

vcunat commented Jun 8, 2016

Let me close it. The lapp and xfce closures have decreased by about a third, though the containers not by much. The infrastructure is there, so if anyone finds some more unnecessary files or dependencies, it should help her to split them away.

BTW, the next general splitting improvement would likely be share/locale. It often takes a significant part of core packages and most people will only want a tiny fraction of it (one or two languages), but it would need us to introduce a way for gettext to find the files (probably an env var with path list). I'm unlikely to do such large projects completely for free, during this year at least.

@vcunat vcunat closed this Jun 8, 2016
