Migrate large binary files to git lfs? #29
Comments
We always used git when developing Tilt Brush; there are a few large files in there, but it doesn't seem to me that there's enough to warrant LFS. The large files in there are unlikely to change much (they're mainly some .psd files for the backgrounds and some .dlls etc). I feel like lfs would add some extra complexity for no gain. |
I thought of it when I saw the ffmpeg.exe commit. It does add a bit of complexity, especially the first time you use it, but it's pretty straightforward once you get used to it. Though arguably, that's the exact definition of Stockholm Syndrome! |
My understanding is that its main benefit is that you don't end up storing multiple copies of large files in the .git repository. However, if the large files are unlikely to change, it doesn't seem to me that you really gain anything. (I have never used git-lfs) |
The clones are faster as well. Between .exe, .psd, .dll, and .png, it looks like we have 353MB worth. It's not a huge deal, but if we do want to switch, we'll need to rewrite the history. |
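For anyone wanting to reproduce a figure like the 353MB above, here is a minimal sketch that sums the bytes of matching files in a working tree. The helper name `total_binary_bytes` is hypothetical, and this is not necessarily how the number in the comment was obtained; it also measures checked-out file sizes, not the compressed size inside `.git`.

```shell
# Hypothetical helper: total size in bytes of .exe/.psd/.dll/.png files
# under a directory (default: current directory), skipping the .git folder.
total_binary_bytes() {
    dir="${1:-.}"
    find "$dir" -path '*/.git' -prune -o -type f \
        \( -iname '*.exe' -o -iname '*.psd' -o -iname '*.dll' -o -iname '*.png' \) \
        -exec cat {} + | wc -c | tr -d ' '
}
```

Running `total_binary_bytes .` from the repository root prints a single byte count; divide by 1048576 for MiB.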
(If you want to see what it'd look like after a conversion, I pushed a version to https://github.com/mikeage/open-brush-lfs for comparison. To create it, I did the following:
# Track large/binary files
git lfs track '*.psd'
git lfs track '*.exe'
git lfs track '*.dll'
git lfs track '*.png'
git lfs track '*.prefab'
git add .gitattributes
# Readd the ones we have
find . \( -iname '*dll' -or -iname '*exe' -or -iname '*psd' -or -iname '*png' -or -iname '*.prefab' \) -print0 | xargs -0 git add
git commit -m "Switch to git-lfs for png, exe, dll, psd, and .prefab"
# Rewrite history so that they've always been in git-lfs
git lfs migrate import --everything --include="*.psd,*.exe,*.dll,*.png,*.prefab"
git push origin HEAD -f
) |
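One way to sanity-check a conversion like the one above: each `git lfs track` call writes a line such as `*.psd filter=lfs diff=lfs merge=lfs -text` into `.gitattributes`. The sketch below lists the patterns routed through the LFS filter. The helper name `lfs_tracked_patterns` is hypothetical; running `git lfs track` with no arguments, or `git lfs ls-files`, gives similar information when git-lfs is installed.

```shell
# Hypothetical helper: print the patterns in a .gitattributes file
# (default: ./.gitattributes) that carry the filter=lfs attribute.
lfs_tracked_patterns() {
    attrs="${1:-.gitattributes}"
    awk '!/^#/ {
        # The first field is the pattern; the rest are attributes.
        for (i = 2; i <= NF; i++) {
            if ($i == "filter=lfs") { print $1; break }
        }
    }' "$attrs"
}
```

After the commands above, this would be expected to list the five tracked patterns, one per line.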
If we are to do this, I don't think we should be tracking |
Gotcha. I did it because on the first push without them, git complained about a large file (Assets/Prefabs/Intro/powered_by_tiltbrush_full.prefab) and I didn't even think that 66MB could be a yaml! Let me try updating the repo without them, and I'll compare. (It looks like the download speed for LFS blobs is faster, but I don't really know why.) P.S. One other argument against doing this is that the quota for LFS files is apparently shared by forks (!). I did not know that (I use LFS at work with a hosted GitHub Enterprise version, so I haven't really thought about costs and limitations). |
Updated. Having reviewed the docs, perhaps it's not really as critical if they're never modified; even the large files aren't really getting that close to the 100MB limit for github. There are only 3 files over 30MB, and only 1 over 50 (and it's the yaml above). So while I think it's a good discussion, maybe it's not worth it. Force pushing is always a challenge for a popular project. |
Does GitHub do LFS hosting? (I haven't tried it) Those files are compressed when stored, and if they don't change often then it's probably not worth the complication of moving elsewhere? There are many projects with 400MB+ repos, but you can use Another option is to move large/specific files into a "tools/resources" (or PC?) sub project which you only pull in when you need it. |
GitHub does. See https://docs.github.com/en/github/managing-large-files. Above 100MB, you must use it, and above 50MB, you get a warning when you push. |
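To see how close a working tree is to those limits, a minimal sketch along these lines lists files above a given size. The helper name `files_larger_than` is hypothetical, and reading the 50MB and 100MB thresholds as MiB is an assumption in the usage below.

```shell
# Hypothetical helper: print files under a directory whose size exceeds
# a byte threshold, skipping the .git folder.
# Usage: files_larger_than <dir> <min_bytes>
files_larger_than() {
    dir="$1"
    min_bytes="$2"
    # "-size +Nc" matches files strictly larger than N bytes.
    find "$dir" -path '*/.git' -prune -o -type f -size +"${min_bytes}"c -print
}
```

For example, `files_larger_than . $((50 * 1024 * 1024))` lists candidates for the push warning, and `files_larger_than . $((100 * 1024 * 1024))` lists files that would be rejected outright.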
Seems it does, but I believe pulling the files out and moving them to LFS would mean restarting the repo history, making everyone's clones invalid.
@mikeage Thanks. I didn't know that. Haven't encountered it yet, but good to know. |
Correct. It'd ruin the history. A rebase would be straightforward, but annoying, and PRs would be a mess if not done. That's why if it's done, it should be done either during a long dry spell, or ASAP. The longer this thread goes, and the more I think about it, the more I feel like the cost is too high, which is a bit ironic since it's only been about a week. But still. |
I suggest closing this as it sounds like we've decided to do nothing at the moment. Fewer open issues is always a good look. |
Agreed, closing but good to have this noted in case we decide to revisit in the future! |
I’m a bit conflicted about this one, but in general, git is not the best place to store large binary files. Git-lfs is the best way to integrate them (and is supported by GitHub), but an effective migration would involve rewriting history which would affect forks. On the other hand, it greatly shrinks the size of the repo and makes many operations faster. I see three options here:
Any thoughts from others who’ve worked on large git repos?