cmd/go: compress GOCACHE #76337

@jakebailey

No template fit, so please forgive my use of the empty issue template.

In the typescript-go project, others on the team and I keep accidentally running out of disk space (on our own machines and on CI runners) because of how large GOCACHE gets for our codebase.

If you just do:

$ export GOCACHE="$(pwd)/gocache"
$ git clone --depth=1 https://github.com/microsoft/typescript-go.git
$ cd typescript-go
$ go test ./... > /dev/null

Then the cache is 3.2GB:

$ du -sh $GOCACHE
3.2G    /home/jabaile/work/gocache-example/gocache

The top 10 largest files in the cache are:

185M    27/27480acf832d161dc0faad35f12db3d7bac9e21dfcc22a8b891c393fc3c39684-d
61M     14/143367227694be40742d296d489b5de454f68b9ddeccd99c1351e93d22da7877-d
58M     dc/dcbc45767bc9ae462401888cae7f79f4e5e62aceee572a3da131fa4574cde732-d
50M     e6/e6d125f1833dca10b62878d07b24e8ad1e9a57c86299ac892642f4cad3d2414d-d
50M     96/962cdf6c83e6a527324d1bb74a8a9c0bb814fc61527be153dd47277221e3ed53-d
47M     ec/ecbd61c1400299ccfaa9dc901bfd4b491727b04352ce38dc74ab78b2c6f802a3-d
47M     eb/eb7eeb23cb39432385ead152ad5a74d46f8ba42e2b39fd7c37318d8e6a44943d-d
47M     e8/e8eac35ad09dc74b9e5ef05ac9c5fe1984059ff4fabe19188a0d6078ea211e09-d
46M     60/608d4671f8cdaec8703838eb21369a2a191ded0611a4c080f06ec62b1f6dc4d6-d
45M     3b/3b628f4a9c11f4884f33bd416585ab50ba7e213427e5d609e3d584bac3629d00-d

Out of curiosity, I ran these through zstd to see how compressible they are, and indeed (original size, compressed size, file):

185M    15M   27/27480acf832d161dc0faad35f12db3d7bac9e21dfcc22a8b891c393fc3c39684-d
61M     7.8M  14/143367227694be40742d296d489b5de454f68b9ddeccd99c1351e93d22da7877-d
58M     5.4M  dc/dcbc45767bc9ae462401888cae7f79f4e5e62aceee572a3da131fa4574cde732-d
50M     5.2M  e6/e6d125f1833dca10b62878d07b24e8ad1e9a57c86299ac892642f4cad3d2414d-d
50M     5.2M  96/962cdf6c83e6a527324d1bb74a8a9c0bb814fc61527be153dd47277221e3ed53-d
47M     4.3M  ec/ecbd61c1400299ccfaa9dc901bfd4b491727b04352ce38dc74ab78b2c6f802a3-d
47M     4.3M  eb/eb7eeb23cb39432385ead152ad5a74d46f8ba42e2b39fd7c37318d8e6a44943d-d
47M     4.3M  e8/e8eac35ad09dc74b9e5ef05ac9c5fe1984059ff4fabe19188a0d6078ea211e09-d
46M     4.2M  60/608d4671f8cdaec8703838eb21369a2a191ded0611a4c080f06ec62b1f6dc4d6-d
45M     3.8M  3b/3b628f4a9c11f4884f33bd416585ab50ba7e213427e5d609e3d584bac3629d00-d

These files often become roughly 10x smaller. In fact, if I .tar.zst the entire GOCACHE, the 3.2GB shrinks to just 276MB! (Looking at my GHA logs, we have an 11GB GOCACHE that compresses to about 1GB, so the savings seem to hold at larger sizes.)

Although this doesn't by itself fix the problem of "GOCACHE is big and not cleaned often", making the cache 10x smaller should mean it takes roughly 10x longer to fill the disk.

I looked into writing a GOCACHEPROG implementation to do this, but I could not figure out a way to make it work, given that the GOCACHEPROG protocol expects Get to have placed the original file on disk. So my expectation is that the Go toolchain itself has to be the one to compress the data.

The Go standard library has no support for zstd compression yet (#62513). I tried gzip instead, and it was roughly 20x slower to compress than zstd, which doesn't bode well for fast compilation (though I did not change the compression options).

Labels

FeatureRequest: Issues asking for a new feature that does not need a proposal.
GoCommand: cmd/go
NeedsDecision: Feedback is required from experts, contributors, and/or the community before a change can be made.
ToolProposal: Issues describing a requested change to a Go tool or command-line program.