This repository has been archived by the owner on Oct 2, 2019. It is now read-only.

Consider cleaning up the .git folder to reduce the large repo size #439

Closed
monfresh opened this issue Jul 2, 2015 · 17 comments

Comments

@monfresh

monfresh commented Jul 2, 2015

Hi. I just cloned this repo and it ended up being 83MB. The biggest file is a 77MB pack in .git/objects/pack.

To see the 10 biggest files, run this from the root directory:

git verify-pack -v .git/objects/pack/pack-7b03cc896f31b2441f3a791ef760bd28495697e6.idx \
| sort -k 3 -n \
| tail -10

To see what each file is, run this:

git rev-list --objects --all | grep [first few chars of the sha1 from previous output]

Most of the files are .png, and the last one in the list is a .mov, which I would guess takes up most of the space. There are also .csv and .pdf files. The next step would be to clean up the Git history by removing all of those unnecessary files.

One option is to use the bfg-repo-cleaner tool, which worked great for me on other repos I've tried it on.

Alternatively, you could do it manually following this git article, as outlined below:

git filter-branch --index-filter 'git rm --cached --ignore-unmatch *.mov' -- --all
rm -Rf .git/refs/original
rm -Rf .git/logs/
git gc --aggressive --prune=now

Then repeat with other types of files.
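As a rough sketch, the repeats can be folded into a single pass, since `git rm` accepts several pathspecs at once. `purge_patterns` is a hypothetical helper name, `--prune-empty` and `--force` are additions not in the commands above, and `FILTER_BRANCH_SQUELCH_WARNING` only silences the startup warning on newer Git versions. It assumes the patterns contain no single quotes and that you are working in a throwaway clone.

```shell
# Hypothetical helper: strip several file patterns from all history in one
# filter-branch pass. Assumption: run from the root of a disposable clone.
purge_patterns() {
    # Quote each pattern so the shell that filter-branch spawns passes the
    # glob to git unexpanded.
    pp_specs=""
    for pp_pattern in "$@"; do
        pp_specs="$pp_specs '$pp_pattern'"
    done
    # --force overwrites the refs/original backup left by an earlier run;
    # --prune-empty drops commits that become empty after the removal.
    FILTER_BRANCH_SQUELCH_WARNING=1 git filter-branch --force \
        --index-filter "git rm --cached --quiet --ignore-unmatch -- $pp_specs" \
        --prune-empty --tag-name-filter cat -- --all
}
```

Usage would look like `purge_patterns '*.mov' '*.png' '*.pdf' '*.csv'`, followed by the same cleanup and `git gc` steps as above.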

Then verify:

git count-objects -v

Your size-pack should be a lot smaller now.
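To compare before and after without reading the full output, a small sketch like the following pulls out just the `size-pack` value, which `git count-objects -v` reports in KiB. `pack_size_kib` is a hypothetical helper name, not something Git provides.

```shell
# Hypothetical helper: print only the size-pack value (KiB) for the
# repository in the current directory.
pack_size_kib() {
    git count-objects -v | awk '$1 == "size-pack:" { print $2 }'
}
```

For example, store `before=$(pack_size_kib)` ahead of the cleanup, run the steps above, then print `echo "size-pack: $before KiB -> $(pack_size_kib) KiB"`.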

@afeld
Contributor

afeld commented Jul 2, 2015

Nice, thanks for the tips! My inclination is that it's not bad enough to warrant rebasing all of the history, but will wait on a 👍 👎 from someone else on the team.

@jessieay
Contributor

Agree w @afeld -- cool to have the info but I am not too concerned about this at the moment. Closing.

@SupriyaKalghatgi

@monfresh It didn't help me.

@NiklasOM

@monfresh: Thanks, it worked perfectly for me.

@racekiller

Hello, I have a huge amount of data under the objects folder, but it is in the subfolders with numbered names (from 1 to 90). The pack folder is about 50 MB, but the rest add up to about 600 MB. Can I just delete those folders? Note that this is a forked repository. Thanks.

@AraHaan

AraHaan commented Jan 14, 2018

What about cleaning up the leftovers from deleted files so that it looks like they never existed? I mean the case where those files were the root cause of the large size and were later deleted to try to reduce clone time, but you don't want to lose all history.

I have one such repo; try to see how long it takes you to clone this one: https://github.com/DecoraterBot-devs/DecoraterBot

@azzamsa

azzamsa commented Aug 5, 2018

@monfresh thanks a lot, it works great for me.

But won't this rewrite the entire history and mess up other forks?

@bjakubiak

And this one worked for me:
git filter-branch --index-filter "git rm --cached --ignore-unmatch *.mov" --tag-name-filter cat -- --all

@ahuigo

ahuigo commented Oct 8, 2018

If you want to remove all previous commits and slim down your repo:
Warning: this operation will make you lose all previous commits.

## This script cleans up the repo; pass 'all' to also discard all previous commits
if [[ "$1" = 'all' ]];then
    echo "Clean all git commit"
    git checkout --orphan latest_branch
    git add -A
    git commit -am "Delete all previous commit"
    git branch -D master
    git branch -m master
fi

echo "Cleanup refs and logs"
rm -Rf .git/refs/original
rm -Rf .git/logs/

echo "Cleanup unnecessary files"
git gc --aggressive --prune=now

echo "Prune all unreachable objects"
git prune --expire now

#git push -f origin master

https://github.com/ahuigo/a/blob/master/tool/gitclean.sh

@souryadey

I followed the steps @monfresh suggested and recovered a lot of space on my local machine. But how can I integrate the changes with the remote on github.com? When I try to push new commits, it says the remote contains work that I don't have locally. So I did git pull, but now my local repository is back to its original size from before the cleanup.

@AraHaan

AraHaan commented Oct 13, 2018

In that case, simply doing a git push --force could solve that, @souryadey.

@souryadey

Ya I did git push origin --force and it worked!
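For anyone following along, here is a minimal sketch of that step. After a history rewrite the local and remote branches have diverged, so `git pull` merges the old history (and its large objects) back in, while a forced push replaces the remote branch instead. `publish_rewrite` is a hypothetical helper name; the remote and branch names are whatever your repository uses.

```shell
# Hypothetical helper: force-push a rewritten branch, then confirm that the
# remote branch now points at the same commit as the local one.
# $1: remote name (e.g. origin), $2: branch name
publish_rewrite() {
    pr_remote="$1"
    pr_branch="$2"
    # Overwrite the remote branch with the rewritten local history.
    git push --force "$pr_remote" "$pr_branch" || return 1
    pr_local_sha=$(git rev-parse "refs/heads/$pr_branch")
    # ls-remote prints "<sha><TAB><ref>"; keep only the sha.
    pr_remote_sha=$(git ls-remote "$pr_remote" "refs/heads/$pr_branch" | cut -f 1)
    [ "$pr_local_sha" = "$pr_remote_sha" ]
}
```

Something like `publish_rewrite origin master` would match the command used above. Anyone else with a clone will need to re-clone or reset to the new history rather than pull.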

@FL-ASGS

FL-ASGS commented Feb 1, 2019

I used the alternative way outlined by @monfresh to get rid of 30G of my packs. Thanks, @monfresh.
The first method didn't work out, most probably because I didn't apply it properly.

@scottschreckengaust

A "one-liner" bash command for the top 10 files:

git rev-list --objects --all | grep -f <(git verify-pack -v .git/objects/pack/*.idx| sort -k 3 -n | cut -f 1 -d " " | tail -10)
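A variant that avoids process substitution and does not depend on the pack index, so it also covers objects that have not been packed yet, is sketched below. It swaps `git verify-pack` for `git cat-file --batch-check`, which is a different technique from the one-liner above. `largest_blobs` is a hypothetical helper name, and the sizes are uncompressed object sizes in bytes.

```shell
# Hypothetical helper: list the N largest blobs reachable from any ref,
# one per line as "<size-in-bytes> <sha> <path>". N defaults to 10.
largest_blobs() {
    lb_count="${1:-10}"
    # rev-list emits "<sha> <path>"; cat-file echoes the path via %(rest).
    git rev-list --objects --all |
        git cat-file --batch-check='%(objecttype) %(objectsize) %(objectname) %(rest)' |
        awk '$1 == "blob"' |
        sort -k 2 -n |
        tail -n "$lb_count" |
        cut -d ' ' -f 2-
}
```

Running `largest_blobs 10` from the repository root prints the biggest files last, in the same order as the one-liner.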

@KenyMylankca

How can I clean a .git folder of commits older than a specific date (for example, older than one month)?

@acrane1

acrane1 commented Aug 2, 2019

I tried this and was able to reduce my size-pack from 243456 to 1937. I then did a git push --force, and it uploads, but when I do a git clone on a different machine I still get the old size-pack. I can't get the reduction to show up on GitHub.

@afeld
Contributor

afeld commented Aug 2, 2019

This issue is for the C2 application specifically - I suggest taking the broader conversation elsewhere, such as this StackOverflow question. Thanks!

@18F 18F locked as resolved and limited conversation to collaborators Aug 2, 2019