fpm and directories containing hard links yields huge packages #365

Closed
pforai opened this Issue Feb 20, 2013 · 14 comments

5 participants
@pforai

pforai commented Feb 20, 2013

I've tried this with a recent build of git. The main culprit is that git is a multi-call binary, and all the individual git-* commands are hard links pointing to the same file. As a result, an RPM for something like git comes out at roughly 240MB, and the extracted size is something like 650MB, since all the hard links are dereferenced and copied into the RPM as separate files.

Dunno if the issue here lies within FPM or the underlying rpmbuild command.

@jordansissel

Owner

jordansissel commented Feb 20, 2013

I had a look at Fedora's rpm spec for git. I didn't see anything special for handling hard links, so I'm assuming the fault is with fpm.

@jordansissel

Owner

jordansissel commented Feb 20, 2013

% ls -i
29008271 a  29008271 b  29008271 c  29008271 d
% fpm -s dir -t rpm -n fizzle --prefix /tmp/example -C $PWD .
Created rpm {:path=>"fizzle-1.0-1.x86_64.rpm"}
% sudo rpm -i fizzle-1.0-1.x86_64.rpm 
% ls -i /tmp/example
28956562 a  28956563 b  28956564 c  28956565 d

Confirmed with an easy test.
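
The same effect can be reproduced without fpm or rpm. The following is a minimal sketch, assuming a copy step that handles each file name independently (whether fpm copies exactly this way is an assumption here); `inodes_in` and `demo_naive_copy` are hypothetical helper names.

```ruby
require "fileutils"
require "tmpdir"

# Map each entry name in `dir` to its inode number (like `ls -i`).
def inodes_in(dir)
  Dir.children(dir).sort.map do |name|
    [name, File.stat(File.join(dir, name)).ino]
  end.to_h
end

# Create one file plus three hard links to it, then copy the directory
# one name at a time, the way a naive copy step would.
def demo_naive_copy
  Dir.mktmpdir do |tmp|
    src = File.join(tmp, "src")
    dst = File.join(tmp, "dst")
    FileUtils.mkdir_p([src, dst])

    File.write(File.join(src, "a"), "payload")
    %w[b c d].each { |n| File.link(File.join(src, "a"), File.join(src, n)) }

    # Each name becomes an independent file in the destination.
    Dir.children(src).each do |n|
      FileUtils.cp(File.join(src, n), File.join(dst, n))
    end

    [inodes_in(src), inodes_in(dst)]
  end
end

before, after = demo_naive_copy
puts "distinct inodes in source: #{before.values.uniq.size}"
puts "distinct inodes in copy:   #{after.values.uniq.size}"
```

The source directory has a single inode behind all four names, while the copy ends up with one inode per name, which matches the `ls -i` output above.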

@jordansissel

Owner

jordansissel commented Feb 20, 2013

Wondering if fpm's dir input stuff should keep track of the inode numbers of the original files being copied, and hardlink when a duplicate inode reference is found?

@pforai

pforai commented Feb 20, 2013

Not sure if that is the most portable way, but it probably is.

@jordansissel

Owner

jordansissel commented Feb 20, 2013

I think there are two options:

  • at copy time, track inode/dev values and hardlink instead of copying when a duplicate reference is found.
  • just before outputting a package, scan the staging directory and look for files with duplicate contents and replace them with hardlinks
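
The first option could be sketched roughly as follows. This is an illustration only, not fpm's code: `copy_preserving_hardlinks` is a hypothetical name, and the sketch assumes the destination lives on a single file system and ignores symlinks and special files.

```ruby
require "fileutils"
require "find"
require "pathname"

# Copy the directories and regular files under `src` into `dst`.
# When a source file has already been copied under another name
# (same device and inode), create a hard link instead of a second copy.
# Returns the list of destination paths that were created as hard links.
def copy_preserving_hardlinks(src, dst)
  seen = {}    # [dev, ino] of a source file => first destination path
  linked = []
  src_root = Pathname.new(src)

  Find.find(src) do |path|
    rel = Pathname.new(path).relative_path_from(src_root).to_s
    target = File.join(dst, rel)
    stat = File.lstat(path)

    if stat.directory?
      FileUtils.mkdir_p(target)
    elsif stat.file?
      key = [stat.dev, stat.ino]
      if seen.key?(key)
        File.link(seen[key], target)   # duplicate reference: link, don't copy
        linked << target
      else
        FileUtils.cp(path, target)
        seen[key] = target
      end
    end
    # Symlinks and special files are skipped in this sketch.
  end

  linked
end
```

Keying on the `[dev, ino]` pair rather than the inode alone matters because inode numbers are only unique within one file system.
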
@pforai

pforai commented Feb 20, 2013

The first option would work on Linux and standard Linux file systems, and on Mac OS X as well, for hard links; the latter would probably catch symlinks more easily.

@jordansissel

Owner

jordansissel commented Feb 20, 2013

nod. I think the first solution is the correct one, too.
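
For contrast, the second option might look something like the sketch below. `link_duplicate_contents` is a hypothetical name, and the approach shown (SHA-256 over file contents) is just one assumed way to detect duplicates. Note that it also links files that merely happen to have identical contents and were never hard links in the source, and it ignores differences in mode or ownership, which is one reason to prefer the first option.

```ruby
require "digest"
require "find"

# Scan `staging` and replace regular files whose contents duplicate an
# earlier file with a hard link to that earlier file.
# Returns the paths that were replaced.
def link_duplicate_contents(staging)
  first_seen = {}   # [size, sha256] => first path with those contents
  replaced = []

  Find.find(staging) do |path|
    stat = File.lstat(path)
    next unless stat.file?

    key = [stat.size, Digest::SHA256.file(path).hexdigest]
    original = first_seen[key]

    if original.nil?
      first_seen[key] = path
    elsif !File.identical?(original, path)
      # Same contents but a different file: swap in a hard link.
      File.unlink(path)
      File.link(original, path)
      replaced << path
    end
  end

  replaced
end
```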

@v-yarotsky

Contributor

v-yarotsky commented Sep 24, 2013

any update on this issue?

@jordansissel

Owner

jordansissel commented Sep 24, 2013

Nothing yet.

@ketan

Contributor

ketan commented Oct 15, 2013

We just hit this same issue while packaging git. A lot of the binaries here are hard links to the actual 'git' binary. I ran a quick stat and saw that the stat.nlink could probably be used to generate a cp -al instead of cp in the rpm spec file template.

Make sense, or are there any other gotchas I'm missing?

irb(main):004:0> File.stat('jailed-root/opt/local/git/1.8.4.1/libexec/git-core/git-send-email')
=> #<File::Stat dev=0xfd00, ino=41558, mode=0100755, nlink=1, uid=900, gid=999, rdev=0x0, size=43956, blksize=4096, blocks=88, atime=2013-10-15 07:42:08 +0000, mtime=2013-10-15 07:18:25 +0000, ctime=2013-10-15 07:42:29 +0000>
irb(main):005:0> File.stat('jailed-root/opt/local/git/1.8.4.1/libexec/git-core/git')
=> #<File::Stat dev=0xfd00, ino=41569, mode=0100755, nlink=113, uid=900, gid=999, rdev=0x0, size=6425791, blksize=4096, blocks=12552, atime=2013-10-15 07:42:05 +0000, mtime=2013-10-15 07:18:25 +0000, ctime=2013-10-15 07:42:29 +0000>
irb(main):006:0> File.stat('jailed-root/opt/local/git/1.8.4.1/libexec/git-core/git-reset')
=> #<File::Stat dev=0xfd00, ino=41569, mode=0100755, nlink=113, uid=900, gid=999, rdev=0x0, size=6425791, blksize=4096, blocks=12552, atime=2013-10-15 07:42:05 +0000, mtime=2013-10-15 07:18:25 +0000, ctime=2013-10-15 07:42:29 +0000>
irb(main):007:0> File.stat('jailed-root/opt/local/git/1.8.4.1/libexec/git-core/git-reset')
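
One possible gotcha with relying on `nlink` alone: it counts every link to the inode, including any that live outside the tree being packaged, so it only says that a file has other names somewhere. A rough sketch of using `nlink` as a cheap filter and then grouping by device and inode to find the names that actually belong together (`hardlink_groups` is a hypothetical helper name):

```ruby
require "find"

# Return a hash of [dev, ino] => paths for regular files under `root`
# that share an inode with at least one other file under `root`.
def hardlink_groups(root)
  groups = Hash.new { |hash, key| hash[key] = [] }

  Find.find(root) do |path|
    stat = File.lstat(path)
    # nlink == 1 means no other name exists, so the file can be skipped.
    next unless stat.file? && stat.nlink > 1

    groups[[stat.dev, stat.ino]] << path
  end

  # A file whose other links are all outside `root` ends up alone in its
  # group; drop those, since there is nothing to link to inside the tree.
  groups.select { |_key, paths| paths.size > 1 }
end
```

With the output above, `git` and `git-reset` (both `ino=41569`) would land in one group, while `git-send-email` (`nlink=1`) would be skipped.
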
@holybit

holybit commented Oct 22, 2013

Pretty much same story. Built git with html and man docs and the RPM that resulted from using FPM is 280M and extracts out to 702M installed. Oink oink!

Glad I found this thread as I was a bit baffled as to what was happening. Any fix planned or in progress for fpm?

@holybit

holybit commented Oct 22, 2013

Is there any kind of workaround I can use in the interim to fix this issue?

@ketan

Contributor

ketan commented Oct 22, 2013

> Is there any kind of workaround I can use in the interim to fix this issue?

If you're looking for something specifically to build rpms for git: https://github.com/snap-ci-packages/git-build. We've temporarily rolled a spec file while this issue is sorted out.

@holybit

holybit commented Oct 28, 2013

@ketan thanks, the GitHub git-build project worked with a few mods for our needs. 👍
