Remove old fat files from the repo #6807

Closed
yorikvanhavre opened this issue Apr 28, 2022 · 10 comments
Assignees: yorikvanhavre
Labels: Core (Issue or PR touches core sections (App, Gui, Base) of FreeCAD), :octocat:, Packaging/building (Related to building, compiling or packaging FreeCAD)

Comments

@yorikvanhavre
Member

yorikvanhavre commented Apr 28, 2022

Remove the leftover fat Qt Help files from the repo history. They account for almost 50% of the repo size. The only problem: rewriting the history will force everybody who has a local copy of the git repo to force-pull it manually.

Idea: Do that while switching the main branch from "master" to "main" (which is becoming the norm almost everywhere anyway), so that people are forced to update.

Forums discussion: https://forum.freecadweb.org/viewtopic.php?f=4&t=68303

@yorikvanhavre yorikvanhavre self-assigned this Apr 28, 2022
@luzpaz luzpaz added the Core, Packaging/building, and :octocat: labels Apr 28, 2022
@donovaly donovaly removed the :octocat: label Jun 24, 2022
@chennes chennes removed the For 0.21 label Jun 24, 2022
@luzpaz luzpaz added the :octocat: label Sep 5, 2022
@yorikvanhavre
Member Author

Ok it appears the best moment to do this is ... now. This is scary :)
The best way to do it seems to be using git-filter-repo.
From this page and this one it would basically boil down to this:
git filter-repo --invert-paths --path 'src/Doc/*.qch' --path 'src/Doc/*.qch.*' --use-base-name
but there are scripts mentioned on the forum too

@yorikvanhavre
Member Author

What I am not sure about is the branches. Will they be handled as well?

@chennes
Member

chennes commented Sep 26, 2023

This operates locally until you push, right? So you can just do it and see.
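
Since the rewrite stays local until pushed, one way to answer the branch question is to record every ref before and after the run and compare the two lists. A minimal sketch using plain git; the function name `snapshot_refs` and the file names in the usage comment are illustrative and not from this thread.

```shell
# Print "<refname> <commit id>" for every ref in the current repository
# (local branches, remote-tracking branches and tags).
# git for-each-ref sorts its output by ref name by default.
snapshot_refs() {
    git for-each-ref --format='%(refname) %(objectname)'
}

# Usage (hypothetical file names), run inside the throwaway clone:
#   snapshot_refs > ../refs-before.txt
#   ... run the history rewrite ...
#   snapshot_refs > ../refs-after.txt
#   diff ../refs-before.txt ../refs-after.txt
```

Refs that appear in the diff were touched by the rewrite; refs that do not appear kept their original commit.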

@yorikvanhavre
Member Author

Ok what I did:

git clone git@github.com:FreeCAD/FreeCAD.git FreeCAD-shrink
# 1.88G downloaded
du -sh FreeCAD-shrink
# 2.3G
cd FreeCAD-shrink
git filter-repo --invert-paths --path 'src/Doc/*.qch' --path 'src/Doc/*.qch.*' --use-base-name
du -sh .
# 2.3G
git filter-repo --invert-paths --path '*.qch' --path '*.qch.*' --use-base-name
du -sh .
# 2.3G

So this command appears ineffective, or I didn't use it correctly... Trying the scripts mentioned in the forum post now

@yorikvanhavre
Member Author

yorikvanhavre commented Sep 27, 2023

OK, the cleaning script displays a warning saying "DO NOT USE THIS! USE git-filter-repo!" 😆

Got it. The --path option of git-filter-repo does not expand wildcards. Need to run it with each file...

git filter-repo --invert-paths --path 'src/Doc/freecad.qch'
git filter-repo --invert-paths --path 'src/Doc/freecad.qch.part00'
git filter-repo --invert-paths --path 'src/Doc/freecad.qch.part01'
git filter-repo --invert-paths --path 'src/Doc/freecad.qch.part02'
git filter-repo --invert-paths --path 'src/Doc/freecad.qch.part03'
git filter-repo --invert-paths --path 'src/Doc/freecad.qch.part04'
du -sh .
# 1.4G

Which probably can become:

git filter-repo --invert-paths --path 'src/Doc/freecad.qch' --path 'src/Doc/freecad.qch.part00' --path 'src/Doc/freecad.qch.part01' --path 'src/Doc/freecad.qch.part02' --path 'src/Doc/freecad.qch.part03' --path 'src/Doc/freecad.qch.part04'
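
The six paths can also be collected in a loop so that the file list lives in one place. The sketch below only builds and prints the combined command, it does not rewrite anything; the helper name `qch_filter_cmd` is made up for illustration. git-filter-repo also documents a `--path-glob` option for glob patterns, which might avoid listing the parts by hand, but that was not tried in this thread.

```shell
# Build the single git-filter-repo invocation for freecad.qch and its parts.
# The command is printed rather than executed so it can be reviewed first.
qch_filter_cmd() {
    cmd="git filter-repo --invert-paths --path 'src/Doc/freecad.qch'"
    for n in 00 01 02 03 04; do
        cmd="$cmd --path 'src/Doc/freecad.qch.part$n'"
    done
    printf '%s\n' "$cmd"
}

# Untested alternative (assumption, relies on the documented --path-glob option):
#   git filter-repo --invert-paths --path-glob 'src/Doc/*.qch*'

qch_filter_cmd
```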

Using the fatfiles script from the forum shows that all these files have vanished.
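
The forum script is not reproduced in this thread. As a rough stand-in, the largest blobs anywhere in the history can be listed with plain git plumbing, which gives a similar check that the .qch files are gone. The function name `largest_blobs` is illustrative.

```shell
# List the N largest blobs reachable from any ref as "<size in bytes> <path>",
# largest first. N defaults to 10.
largest_blobs() {
    git rev-list --objects --all |
        git cat-file --batch-check='%(objecttype) %(objectsize) %(rest)' |
        sed -n 's/^blob //p' |   # keep blobs only, drop the type column
        sort -rn |               # sort by size, descending
        head -n "${1:-10}"
}

# Usage, inside the repository:
#   largest_blobs 20
```

Because it walks all refs, a file deleted in a later commit still shows up as long as some reachable commit contains it.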

Remaining steps needed, from github docs:

  • git push origin --force --all
  • git push origin --force --tags

Remaining problems:

  • How to make sure nobody pushes while we're doing this locally
  • How to advertise that anybody who has cloned the repo needs to rebase or re-clone
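
For the second point, the announcement could include the commands a contributor would run on an existing clone. A minimal sketch, assuming the remote is named `origin` and the branch name is passed as an argument (the function name `resync_branch` is made up). It discards local commits and uncommitted changes on that branch, so unpushed work would have to be saved first, for example as patches.

```shell
# Reset a local branch to the rewritten upstream history.
# WARNING: local commits and uncommitted changes on that branch are lost.
resync_branch() {
    branch="${1:-master}"
    # --force lets already-fetched tags be replaced by their rewritten versions
    git fetch -q --force --tags origin &&
        git checkout -q "$branch" &&
        git reset -q --hard "origin/$branch"
}

# Usage, inside an existing clone:
#   resync_branch master
```

Re-cloning avoids the reset entirely and also drops the old objects right away, which an existing clone only does after its own garbage collection.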

@chennes
Member

chennes commented Sep 27, 2023

Actually, github will probably reject that push, we don't allow pushes direct to master, and we don't allow force-pushes on master. So you will probably have to change those GitHub settings first, tell the Maintainers not to do any merges while you do it, then post everywhere you can think of about what's going on :).

@yorikvanhavre
Member Author

Let's discuss this later today 😅

@chennes
Member

chennes commented Sep 27, 2023

For posterity: we discussed this today and came to the conclusion that the 1 GB saved in a full repo clone was not worth the cost of invalidating all existing hashes, which are used in many places (e.g. GitHub permalinks).

@chennes chennes closed this as completed Sep 27, 2023
@luzpaz luzpaz closed this as not planned Sep 28, 2023
@kkremitzki
Member

It's too bad this could not be reconsidered. It results in an almost 50% reduction in repo size, and the development branch rename would be an ideal time to do it.

Re: hash invalidation, one major point to offset that is that the new hashes seem to be deterministic, so it's not like one person would be publishing an entirely unverifiable new history--everybody with an existing copy could run the same commands and verify that the new release tags were produced from the original commit history. There shouldn't be any loss in fidelity, and I think most of those would be largely unused, anyway.
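
The determinism follows from how git derives commit IDs: the ID is a hash over the tree, the parent IDs, the author and committer identity, the timestamps, and the message. A rewrite that changes only the tree and keeps everything else should therefore give the same new IDs no matter who runs it. A small sketch with plain git (not git-filter-repo itself); the helper name, identity, and timestamp are made up for the demonstration.

```shell
# Create a one-commit repo in directory $1 containing the files named in the
# remaining arguments, using a fixed identity and timestamp, and print the
# resulting commit id.
make_commit() {
    dir="$1"; shift
    git init -q "$dir" &&
    (
        cd "$dir" &&
        for f in "$@"; do printf '%s\n' "$f" > "$f"; done &&
        git add . &&
        GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com \
        GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com \
        GIT_AUTHOR_DATE='1695816000 +0000' \
        GIT_COMMITTER_DATE='1695816000 +0000' \
        git commit -q -m 'demo commit' &&
        git rev-parse HEAD
    )
}

work=$(mktemp -d)
with_qch=$(make_commit "$work/original" README freecad.qch)
stripped_a=$(make_commit "$work/rewrite-a" README)
stripped_b=$(make_commit "$work/rewrite-b" README)

# Dropping a file changes the id; two independent "rewrites" agree with each other.
[ "$with_qch" != "$stripped_a" ] && echo "id changed after removing the file"
[ "$stripped_a" = "$stripped_b" ] && echo "independent rewrites produce the same id"
```

The same reasoning is why every existing hash is invalidated: any commit whose tree once contained the removed files gets a new ID, and so does every descendant of it.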

@yorikvanhavre
Member Author

In any case, the master -> main switch was very transparent and innocuous... So if we revisit this one, there is not much reason to tie it to the branch rename.

The main problem was not so much all the hash links on the net (I guess there aren't so many, and it's not of paramount importance if they don't work anymore), but rather the 80-something pull requests in the queue that would all need to be redone, including some very large ones...
