cmd/go: clean GOCACHE based on disk usage #29561

Open
4a6f656c opened this issue Jan 4, 2019 · 13 comments

Comments

@4a6f656c
Contributor

@4a6f656c 4a6f656c commented Jan 4, 2019

GOCACHE appears to lack a disk size limit. This is a problem in a space-constrained environment and/or when running go on a disk that is nearing capacity. For example, on the openbsd/arm builder (which runs on a USB stick), the ~/.cache/go-build directory grows past several GB in a very short time, which then leads to various failures (git clone or go builds). The only option I currently appear to have is to run go clean -cache regularly to keep the cache at a reasonable size. It seems preferable to have a configurable upper bound and/or free-disk-space checks (e.g. evict, then write) that prevent cache writes from failing when the disk is full, partly due to a large GOCACHE:

[gopher@cubox ~ 102]$ du -csh ~/.cache/go-build
2.2G    /home/gopher/.cache/go-build
2.2G    total
[gopher@cubox ~ 103]$ df -h /home
Filesystem     Size    Used   Avail Capacity  Mounted on
/dev/sd1m      9.0G    8.5G   -6.0K   100%    /home
[gopher@cubox ~ 104]$ go build -o /tmp/main /tmp/main.go

/home: write failed, file system is full

/home: write failed, file system is full

/home: write failed, file system is full

/home: write failed, file system is full

/home: write failed, file system is full

/home: write failed, file system is full

/home: write failed, file system is full

/home: write failed, file system is full

/home: write failed, file system is full
[gopher@cubox ~ 105]$ du -csh ~/.cache/go-build
2.2G    /home/gopher/.cache/go-build
2.2G    total
[gopher@cubox ~ 106]$ df -h /home
Filesystem     Size    Used   Avail Capacity  Mounted on
/dev/sd1m      9.0G    8.5G   -6.0K   100%    /home
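
To make the "configurable upper bound" idea concrete, here is a minimal sketch of size-based eviction, assuming files are evicted oldest-first by modification time. It is not how cmd/go works today: trimToLimit, the entry type, the demo helper and the byte limit are all hypothetical names for illustration, and the demo runs against a throwaway temp directory rather than a real GOCACHE.

```go
package main

import (
	"fmt"
	"io/fs"
	"os"
	"path/filepath"
	"sort"
	"time"
)

// entry describes one regular file found under the cache directory.
type entry struct {
	path  string
	size  int64
	mtime time.Time
}

// trimToLimit removes the least recently modified files under dir until the
// remaining files total at most limit bytes. It returns the bytes freed.
// The limit is a hypothetical knob, not an existing go tool setting.
func trimToLimit(dir string, limit int64) (int64, error) {
	var entries []entry
	var total int64
	err := filepath.WalkDir(dir, func(p string, d fs.DirEntry, err error) error {
		if err != nil {
			return err
		}
		if !d.Type().IsRegular() {
			return nil // skip directories, symlinks, etc.
		}
		info, err := d.Info()
		if err != nil {
			return err
		}
		entries = append(entries, entry{path: p, size: info.Size(), mtime: info.ModTime()})
		total += info.Size()
		return nil
	})
	if err != nil {
		return 0, err
	}
	// Sort oldest first so the stalest entries are evicted first.
	sort.Slice(entries, func(i, j int) bool {
		return entries[i].mtime.Before(entries[j].mtime)
	})
	var freed int64
	for _, e := range entries {
		if total-freed <= limit {
			break
		}
		if err := os.Remove(e.path); err != nil {
			return freed, err
		}
		freed += e.size
	}
	return freed, nil
}

// demo builds a temp directory with three 100-byte files of different ages
// and trims it to 150 bytes.
func demo() (int64, error) {
	dir, err := os.MkdirTemp("", "cache-demo-*")
	if err != nil {
		return 0, err
	}
	defer os.RemoveAll(dir)
	now := time.Now()
	for i := 0; i < 3; i++ {
		p := filepath.Join(dir, fmt.Sprintf("entry%d", i))
		if err := os.WriteFile(p, make([]byte, 100), 0o644); err != nil {
			return 0, err
		}
		mt := now.Add(-time.Duration(i) * time.Hour) // entry0 newest, entry2 oldest
		if err := os.Chtimes(p, mt, mt); err != nil {
			return 0, err
		}
	}
	return trimToLimit(dir, 150)
}

func main() {
	freed, err := demo()
	if err != nil {
		fmt.Fprintln(os.Stderr, "trim failed:", err)
		os.Exit(1)
	}
	fmt.Printf("freed %d bytes\n", freed)
}
```

An "evict, then write" policy would call something like trimToLimit before adding a new entry. A real implementation would also have to cope with concurrent go processes sharing the same cache, which this sketch ignores.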
@jayconrod

Contributor

@jayconrod jayconrod commented Jan 4, 2019

I think this would be a good thing to have, especially since the cache will be required starting in Go 1.12. I don't think there's a fixed maximum cache size that would work for everyone, but maybe we could at least make eviction more aggressive when the disk is low on space.

@mvdan

Member

@mvdan mvdan commented Jan 4, 2019

We're no longer allowing GOCACHE=off starting in 1.12, so it might be a bit tricky to allow an arbitrary size limit: one could simply replace GOCACHE=off with, for example, GOCACHELIMIT=0.

I seem to remember that @rsc designed the cache to automatically evict based on modification time. Seems sound to me to also automatically evict if the disk is full enough that it might realistically hit errors if it builds a few large packages (say, if there's less than 500MB left).

@bitfield

@bitfield bitfield commented Jan 5, 2019

This sounds like a bit of a can of worms. How do you check for remaining disk space in a cross-platform way? How much free space should trigger a cache eviction? How do you guarantee that, having evicted some cache, the build process won't still run out of disk anyway?

If you're building on a small disk, then wouldn't running go clean -cache before each build be a better solution than trying to build disk space management into the build tool itself?

@mvdan

Member

@mvdan mvdan commented Jan 5, 2019

I personally don't know how easy it would be to control "does the disk have enough space left" in a sane and portable way. I was simply pointing out that since the current eviction algorithm takes into account timestamps, perhaps it should also use some data from the filesystem or disk. If we can make that work, we could avoid adding more knobs to the go tool.

> If you're building on a small disk, then wouldn't running go clean -cache before each build be a better solution than trying to build disk space management into the build tool itself?

That might not be a great solution - what if you're building a very large project like Kubernetes? It might produce many hundreds of megabytes of build cache, so it's not unreasonable to think that it could on its own be enough to fill up some filesystems.

@bitfield

@bitfield bitfield commented Jan 5, 2019

> That might not be a great solution - what if you're building a very large project like Kubernetes? It might produce many hundreds of megabytes of build cache, so it's not unreasonable to think that it could on its own be enough to fill up some filesystems.

Indeed. So this proposal would, at best, kick the problem slightly further down the road.

@bcmills bcmills modified the milestones: Go1.13, Unplanned Jan 15, 2019
@bcmills bcmills changed the title from "cmd/go/internal/cache: GOCACHE appears to lack disk size limit" to "cmd/go: clean GOCACHE based on disk usage" Jan 15, 2019
@josharian josharian modified the milestones: Unplanned, Go1.14 May 9, 2019
@rsc rsc modified the milestones: Go1.14, Backlog Oct 9, 2019
@josharian

Contributor

@josharian josharian commented Jan 29, 2020

I just filled up my hard drive again, and am waiting for go clean -cache to delete an untold number of tiny files, which history suggests will take hours.

Since GOCACHE=off is off, how about a cmd/go flag asking it not to cache the results of this particular build/test/compilation? I think that most people who are hitting this are doing something unusual and know it and just need some way to prevent the damage.

@ianlancetaylor

Contributor

@ianlancetaylor ianlancetaylor commented Jan 29, 2020

I'm not going to claim that this is the best possible solution, but I just run go clean -cache regularly. In other words 1) I know that I am doing something unusual; 2) I have a way to prevent the damage.

@mvdan

Member

@mvdan mvdan commented Jan 29, 2020

Those of you who do "unusual things", what sizes do you encounter that are a problem? I can't remember the last time I cleaned GOCACHE, and it's currently sitting at 1.4GiB. Given that my SSD is 140GiB and I only use ~40% of it, that seems fine.

@ALTree

Member

@ALTree ALTree commented Jan 29, 2020

I got a 200GB gocache folder once, after 12hrs of compiler fuzzing.

@josharian

Contributor

@josharian josharian commented Jan 29, 2020

@mvdan this morning it clocked in at a little over 250GB. In the past I’ve hit 400GB. I’d probably have hit that last night except my script died because I ran out of disk space.

@ianlancetaylor I see what you did there. :) To expand on why that isn’t a suitable solution for me:

In normal use, everything is fine. This only happens to me when I start an overnight computation, like compiling 40 different toolchain commits for every platform.

I’d have to run go clean -cache in the middle of the night. I could start a script to do that, but that is an easy thing to forget. I could set up a cron job to clear it every hour, but that would slow down my non-Go-toolchain work, which is a higher priority. Or I could have the script doing the work clean the cache, except that this is a script I’m making publicly available (compilecmp), which means I’d be clearing other people’s caches, which seems unfriendly.

I guess I could have my script create a temp dir, set GOCACHE to it, and clear it regularly. I’ll try that.

@josharian

Contributor

@josharian josharian commented Jan 29, 2020

Another reason to want to be able to disable the cache in these circumstances is to avoid the wear and tear on my SSD of writing and then immediately deleting 100s of GB. I could set up a RAM disk, except that that is fiddly and platform-specific, and I’m trying to maintain a tool to be used by not-just-me.

@mvdan

Member

@mvdan mvdan commented Jan 29, 2020

Your workflows are definitely heavier than mine :) Is there a way to expose this "no GOCACHE" mode only to advanced users, so that we don't encourage the broader community to turn off the cache in general? Perhaps hide it behind an undocumented flag?

@josharian

Contributor

@josharian josharian commented Jan 29, 2020

One idea: -a already asks cmd/go to ignore the cache; maybe it could also not write the new entries?

I'm not sure I fully understand why there's so much concern about people disabling the cache. We've forced it on everyone for long enough that we should be past the FUD. And if people really want to waste resources, that's their business. And for the folks who have a genuine need to disable the cache, they can.
