No template fit, forgive my use of the empty issue template.
In the typescript-go project, I and others on the team keep accidentally running ourselves out of disk space (or, running CI runners out of disk space) due to how large GOCACHE seems to be for our codebase.
If you just do:
$ export GOCACHE="$(pwd)/gocache"
$ git clone --depth=1 https://github.com/microsoft/typescript-go.git
$ cd typescript-go
$ go test ./... > /dev/null
Then the cache is 3.2GB:
$ du -sh $GOCACHE
3.2G /home/jabaile/work/gocache-example/gocache
The top 10 largest files in the cache are:
185M 27/27480acf832d161dc0faad35f12db3d7bac9e21dfcc22a8b891c393fc3c39684-d
61M 14/143367227694be40742d296d489b5de454f68b9ddeccd99c1351e93d22da7877-d
58M dc/dcbc45767bc9ae462401888cae7f79f4e5e62aceee572a3da131fa4574cde732-d
50M e6/e6d125f1833dca10b62878d07b24e8ad1e9a57c86299ac892642f4cad3d2414d-d
50M 96/962cdf6c83e6a527324d1bb74a8a9c0bb814fc61527be153dd47277221e3ed53-d
47M ec/ecbd61c1400299ccfaa9dc901bfd4b491727b04352ce38dc74ab78b2c6f802a3-d
47M eb/eb7eeb23cb39432385ead152ad5a74d46f8ba42e2b39fd7c37318d8e6a44943d-d
47M e8/e8eac35ad09dc74b9e5ef05ac9c5fe1984059ff4fabe19188a0d6078ea211e09-d
46M 60/608d4671f8cdaec8703838eb21369a2a191ded0611a4c080f06ec62b1f6dc4d6-d
45M 3b/3b628f4a9c11f4884f33bd416585ab50ba7e213427e5d609e3d584bac3629d00-d
Out of curiosity, I ran these through zstd to see how compressible they are, and indeed they compress well (original size, compressed size, file):
185M 15M 27/27480acf832d161dc0faad35f12db3d7bac9e21dfcc22a8b891c393fc3c39684-d
61M 7.8M 14/143367227694be40742d296d489b5de454f68b9ddeccd99c1351e93d22da7877-d
58M 5.4M dc/dcbc45767bc9ae462401888cae7f79f4e5e62aceee572a3da131fa4574cde732-d
50M 5.2M e6/e6d125f1833dca10b62878d07b24e8ad1e9a57c86299ac892642f4cad3d2414d-d
50M 5.2M 96/962cdf6c83e6a527324d1bb74a8a9c0bb814fc61527be153dd47277221e3ed53-d
47M 4.3M ec/ecbd61c1400299ccfaa9dc901bfd4b491727b04352ce38dc74ab78b2c6f802a3-d
47M 4.3M eb/eb7eeb23cb39432385ead152ad5a74d46f8ba42e2b39fd7c37318d8e6a44943d-d
47M 4.3M e8/e8eac35ad09dc74b9e5ef05ac9c5fe1984059ff4fabe19188a0d6078ea211e09-d
46M 4.2M 60/608d4671f8cdaec8703838eb21369a2a191ded0611a4c080f06ec62b1f6dc4d6-d
45M 3.8M 3b/3b628f4a9c11f4884f33bd416585ab50ba7e213427e5d609e3d584bac3629d00-d
These files often become roughly 10x smaller. In fact, if I .tar.zst the entire GOCACHE, the 3.2GB shrinks to just 276MB! (Looking at my GHA logs, we have an 11GB GOCACHE that compresses to about 1GB, so the savings seem to hold at larger sizes.)
Although this doesn't by itself fix the problem of "GOCACHE big and not cleaned often", a cache that is 10x smaller should take roughly 10x longer to fill the disk.
I tried looking into making a GOCACHEPROG implementation to do this, but I could not figure out a way to make it work, given that the GOCACHEPROG protocol expects Get to have placed the original file on disk. So, my expectation is that the Go toolchain has to be the one to compress the data itself.
There's no support for compressing with zstd quite yet (#62513). I tried using gzip instead, and it was roughly 20x slower to compress than zstd, which doesn't bode well for fast compilation (though I did not change the compression options).
Related:
- cmd/go: clean GOCACHE based on disk usage #29561 (and a comment on that issue)
- cmd/compile: long symbol names for instantiated generics => large object files (though not executables) #50438
- cmd/compile: very long names for simple generic usage #71535
- cmd/compile: objects emit generic instantiations for imported packages #56718