cbeams/lfs-test

Git LFS sandbox

This repository demonstrates the benefits of Git LFS in situations where checking in multiple versions of large binary files is unavoidable. If you clone this repository right now, you'll notice that it is very lightweight, about 200K in size, even though its history contains versions of binary files much larger than that. This is because the binary objects are stored outside the repository using Git LFS.

Here's how this repository was created per the instructions at https://git-lfs.github.com:

brew install git-lfs
git lfs install
git lfs track '*.pdf'
git add .gitattributes
git add README.md
git add sample.pdf
git commit -m"Add 180K Git LFS-tracked PDF"
git push
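
For reference, `git lfs track '*.pdf'` records the pattern in `.gitattributes`, which is why that file is added in the steps above. The sketch below writes the line that command is expected to produce into a scratch file and lists the LFS-tracked patterns found in it. The helper name `lfs_patterns` and the scratch directory are illustrative and not part of this repository.

```shell
# List the path patterns in a .gitattributes file that are routed through Git LFS.
# lfs_patterns is a hypothetical helper name used only for this sketch.
lfs_patterns() {
  # Print the pattern (first field) of every line that carries filter=lfs.
  awk '{ for (i = 2; i <= NF; i++) if ($i == "filter=lfs") { print $1; break } }' "$1"
}

# Scratch copy of the line that `git lfs track '*.pdf'` adds to .gitattributes.
scratch=$(mktemp -d)
printf '%s\n' '*.pdf filter=lfs diff=lfs merge=lfs -text' > "$scratch/.gitattributes"

patterns=$(lfs_patterns "$scratch/.gitattributes")
echo "$patterns"
rm -rf "$scratch"
```

Running the same helper against the real `.gitattributes` in a clone should list `*.pdf`, assuming no other patterns have been tracked since.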

Notice the Uploading LFS objects line in the output of git push. The 180K sample.pdf file is being uploaded to GitHub's LFS server here:

$ git push
Uploading LFS objects: 100% (1/1), 184 KB | 0 B/s, done.
[...]
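
What Git itself stores for an LFS-tracked file is a small text pointer rather than the PDF. As a rough sketch, the following builds a pointer in the Git LFS v1 pointer format for a scratch file. `lfs_pointer` and the scratch file are illustrative names, and the sketch assumes `sha256sum` is available, falling back to `shasum -a 256` (the usual substitute on macOS).

```shell
# Build a Git LFS v1 style pointer for a file.
# lfs_pointer is a hypothetical helper name used only for this sketch.
lfs_pointer() {
  # Hash the file contents; fall back to shasum where sha256sum is missing.
  if command -v sha256sum >/dev/null 2>&1; then
    oid=$(sha256sum < "$1" | cut -d' ' -f1)
  else
    oid=$(shasum -a 256 < "$1" | cut -d' ' -f1)
  fi
  # Size in bytes; tr strips the padding some wc implementations add.
  size=$(wc -c < "$1" | tr -d ' ')
  printf 'version https://git-lfs.github.com/spec/v1\noid sha256:%s\nsize %s\n' "$oid" "$size"
}

# Scratch file standing in for sample.pdf.
scratch_file=$(mktemp)
printf 'hello' > "$scratch_file"

pointer=$(lfs_pointer "$scratch_file")
echo "$pointer"
rm -f "$scratch_file"
```

The pointer is three short lines regardless of how large the tracked file is, which is why pushing a 14M PDF adds almost nothing to the Git history itself.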

Now, let's replace the 180K sample.pdf file with a larger 14M version:

$ du -shx sample.pdf
 180K   sample.pdf
$ cp /some/large.pdf sample.pdf
$ du -shx sample.pdf
 14M    sample.pdf
$ git add sample.pdf
$ git commit -m"Replace 180K sample.pdf with 14M version"
$ git push
Uploading LFS objects: 100% (1/1), 14 MB | 0 B/s, done.
[..]

Finally, let's restore the old 180K sample.pdf:

$ du -shx sample.pdf
 14M    sample.pdf
$ cp /original/small.pdf sample.pdf
$ du -shx sample.pdf
 180K   sample.pdf
$ git add sample.pdf
$ git commit -m"Restore original 180K sample.pdf"  # illustrative message
$ git push
Uploading LFS objects: 100% (1/1), 184 KB | 0 B/s, done.
[..]

Now let's create a fresh clone of the repository in a different directory:

$ time git clone cbeams/lfs-test
Cloning into 'lfs-test'...
remote: Enumerating objects: 20, done.
remote: Counting objects: 100% (20/20), done.
remote: Compressing objects: 100% (15/15), done.
remote: Total 20 (delta 4), reused 19 (delta 3), pack-reused 0
Receiving objects: 100% (20/20), 6.52 KiB | 3.26 MiB/s, done.
Resolving deltas: 100% (4/4), done.

real    0m6.437s

Notice that the repository is very small and the clone is fast, because it includes only the latest, smaller version of the PDF. If we had not used Git LFS to track this file, the repository would contain the complete history of the PDF, ballooning its size to more than 14 megabytes and making cloning unnecessarily cumbersome and time-consuming. Instead, the sizes look like this:

$ du -shx lfs-test
504K    lfs-test

$ du -shx lfs-test/.git
316K    lfs-test/.git

$ du -shx lfs-test/.git/lfs/
196K    lfs-test/.git/lfs/
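
To check how many LFS objects a clone has actually downloaded, one option is to count the files in the local LFS cache. This is a minimal sketch, assuming the usual layout in which downloaded objects live under `.git/lfs/objects`. The helper name `lfs_cache_count` and the stand-in directory tree are illustrative; the demo builds a fake tree so that it runs without a real clone.

```shell
# Count the LFS objects cached locally for a repository.
# lfs_cache_count is a hypothetical helper name used only for this sketch.
lfs_cache_count() {
  # Each downloaded object is one regular file somewhere below lfs/objects.
  find "$1/.git/lfs/objects" -type f | wc -l | tr -d ' '
}

# Stand-in for a fresh clone that has fetched a single LFS object.
fake_repo=$(mktemp -d)
mkdir -p "$fake_repo/.git/lfs/objects/ab/cd"
printf 'stand-in for a PDF' > "$fake_repo/.git/lfs/objects/ab/cd/abcd-placeholder-oid"

count=$(lfs_cache_count "$fake_repo")
echo "$count"
rm -rf "$fake_repo"
```

Against a fresh clone of this repository, the count would be expected to reflect only the version of `sample.pdf` at the checked-out commit.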

Of course, a user who wants to get back to the larger version of the PDF can do so, and incurs the cost of downloading it only at that time. Note that the checkout takes 18 seconds, because it has to download the file from the LFS server:

$ time git checkout f768e45
[..]
HEAD is now at f768e45 Replace 180K sample.pdf with 14M version

real    0m18.084s

At this point, our local .git directory holds both versions of the file (the downloaded objects are cached under .git/lfs) and is much larger because of it:

$ du -shx .git
14M    .git

If we check out master once again, we'll see that the size of the Git database remains the same.

$ git checkout master
$ du -shx .git
14M    .git

In this way, Git LFS allows us to track arbitrarily large binary files without blowing up the size of the repository to unwieldy levels. Developers can opt in to downloading older versions as necessary, but initial clone times are kept as fast as possible, and local Git history operations such as git log -S remain fast because they do not have to process the full history of those large binary objects.
