[Discussion] Prune old packages from the repos #2174
Comments
Yes, useful. I can probably make a shell script for that. I like the idea of having complete history for reproducibility's sake, but we definitely shouldn't put that weight on all the mirrors. |
(I'd happily look into it, but only with Python.. the stdlib should be enough) |
Some stats for the msys (not mingw) repos: current size is 46.2 GB. Savings in % when removing everything not used and older than:
|
mingw repo stats are similar (size 475.6 GB):
So for two years that would be 265GB left. |
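As a quick sanity check on that figure, the implied saving can be worked out from the two sizes quoted above (475.6 GB total, 265 GB left). This is only arithmetic on the numbers in this thread, and it comes to about 44%:

```python
# Sizes quoted in the comments above, in GB.
total_gb = 475.6
remaining_gb = 265.0

# Fraction of the mingw repo that a two-year cutoff would free up.
savings_pct = (1 - remaining_gb / total_gb) * 100
print(f"{savings_pct:.1f}% of the mingw repo would be removed")
```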
Somewhat related: I've looked at the download stats on SourceForge, since everything goes there now, and we have about 750 GB of traffic per day. Note that this is only one day, and due to pacman timeouts fewer downloads might have happened than normal. |
They probably come mostly from CI. Maybe after the cleanup we should ask CI vendors to cache/mirror the repo like they do for Linux packages? |
So due to server complications, the repos have been reduced to around 130 gigs. I plan to sync that with SF.net soon. |
Perhaps it's worth checking with the Arch Linux developers to see if they have insights on this, and if they have tools that can be used. IIRC they're very quick in taking down outdated packages from the primary mirror, but they also have an archive repo for those old packages. |
Size ∝ 💰 |
Sorry I came here a little late.
Dumb questions from a hobbyist (who used to build stuff on XP, even after Cygwin/MSYS2 deprecated it):
|
I've asked on IRC and got these suggestions:
they don't really allow keeping old packages and don't deal with source packages |
Not officially. I assume some mirrors are gonna keep the older packages, but that's their discretion.
I don't think so. The package databases included in the installers(*) aren't particularly vetted, tested or anything like that. It makes sense to keep a frozen (or nearly-frozen) set of packages that still keep some compatibility promise we broke since (such as running on XP or even earlier Windows versions), but it'd take extra work and resources which I don't think we can spare. *) Honestly, I don't even see a reason to ship the databases with the installers. People are supposed to sync them right away anyway.
That would help make the rolling-ness of our releases more obvious, but as opposed to packages, I know old installers are used regularly, e.g. in vcpkg. |
In case you didn't know... github repositories have size limits and therefore should avoid binary files, but github releases only limit the individual file size so they are a very good solution for binary packages (probably intended):
I propose creating releases in a new repository and putting binary packages there. Now for a more personal request... please create an archive repo and put your old packages there (in releases), including whatever you decide to prune from now on. There is no cost and the process can be automated (after you iron out the details). Not supporting XP didn't affect me much since 32-bit stuff was still there. Sometimes I look at old stuff and consider contributing to the project if it is still alive or adapting the project for something else. You have https://github.com/msys2/MINGW-packages-dev which can be used as a base for recipes of unsupported packages, so someone could adapt Portage or Homebrew or similar (does Arch linux have something?) and allow users to build what they need (which you would not support, only provide the infrastructure for the recipes). After that the next logical step is to allow users to contribute their own recipes. Personally, I only need the old packages available and a way to know which "old version" of msys2 they belonged to. |
The closest I can think of is the AUR; some people use helper tools such as yaourt and yay to work with it. |
We now prune to 1.75 years, using this script: https://github.com/msys2/msys2-devtools/blob/main/msys2-repo-prune See also the FAQ: https://www.msys2.org/docs/faq/#how-long-are-old-packages-kept-on-repomsys2org |
The repo size is growing and many of the old packages just take up space and are likely never used. We could remove all packages and source packages that are older than say 2 or 3 years and are not actively in the pacman repo.
Technically this could be a Python script that parses the repos, generates a list of files that are actively used, gets all files that have a too old mtime, and suggests removing those with a too old mtime but not in the active list.
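A minimal sketch of that idea, using only the stdlib. It assumes the pacman sync database is a tar archive that `tarfile` can open, with one `desc` entry per package containing a `%FILENAME%` field. The function names `active_filenames` and `prune_candidates` are made up for illustration, and `.sig` files are assumed to sit next to their packages. It only lists candidates and never deletes anything, and it is not the script that was eventually adopted.

```python
import os
import tarfile
import time


def active_filenames(db_path):
    """Return the package file names listed in a pacman sync database."""
    names = set()
    # "r:*" lets tarfile detect the compression it supports on its own.
    with tarfile.open(db_path, "r:*") as tar:
        for member in tar:
            if not member.isfile() or os.path.basename(member.name) != "desc":
                continue
            lines = tar.extractfile(member).read().decode("utf-8").splitlines()
            # A "%FILENAME%" header is followed by the value on the next line.
            for i, line in enumerate(lines[:-1]):
                if line == "%FILENAME%":
                    names.add(lines[i + 1].strip())
    return names


def prune_candidates(repo_dir, active, max_age_days, now=None):
    """List files older than max_age_days that no database refers to."""
    now = time.time() if now is None else now
    cutoff = now - max_age_days * 86400
    candidates = []
    for entry in os.scandir(repo_dir):
        if not entry.is_file():
            continue
        name = entry.name
        # Treat a detached signature like the package it belongs to.
        base = name[:-4] if name.endswith(".sig") else name
        if base in active:
            continue
        if entry.stat().st_mtime < cutoff:
            candidates.append(name)
    return sorted(candidates)
```

Calling `prune_candidates(repo_dir, active_filenames(db_path), 730)` would then print-ready list everything older than roughly two years that the database no longer references. With several databases per directory, the sets returned by `active_filenames` would need to be unioned first.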
@elieux Does this sound useful to you? Any other ideas how to achieve something similar?