Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

cmd/go: lockedfile.File leaked in case of write error #50858

Closed
bcmills opened this issue Jan 27, 2022 · 6 comments
Closed

cmd/go: lockedfile.File leaked in case of write error #50858

bcmills opened this issue Jan 27, 2022 · 6 comments
Labels
FrozenDueToAge help wanted NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one.
Milestone

Comments

@bcmills
Copy link
Member

bcmills commented Jan 27, 2022

From https://build.golang.org/log/3cbe74efe432908017a65a831f586044aaa2fa9e:

-> unzip /boot/workdir/gopath/pkg/mod/cache/download/google.golang.org/protobuf/@v/v1.27.1.zip: write /boot/workdir/gopath/pkg/mod/google.golang.org/protobuf@v1.27.1/internal/testprotos/benchmarks/datasets/google_message3/benchmark_message3_6.pb.go: disk is full
panic: lockedfile.File /boot/workdir/gopath/pkg/mod/cache/download/google.golang.org/genproto/@v/v0.0.0-20211021150943-2b146023228c.ziphash became unreachable without a call to Close

goroutine 5 [running]:
cmd/go/internal/lockedfile.OpenFile.func1(0x11898250)
	/boot/workdir/go/src/cmd/go/internal/lockedfile/lockedfile.go:54 +0xa4

The frame printed here is supposed to be the caller of OpenFile, not OpenFile itself.

@bcmills bcmills added the NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. label Jan 27, 2022
@bcmills bcmills added this to the Backlog milestone Jan 27, 2022
@aschade92
Copy link
Contributor

aschade92 commented Feb 3, 2022

Hello! I'd like to work on this issue.

To investigate this bug, I started in the modfetch package. Eventually I traced the problem to a method called hashZip which is called when go modules fetches dependencies, as is implied from the stack trace.

My understanding of the hashZip function is that it checks the hash of a downloaded file versus the expected hash from go.mod. If that passes validation, then the zip hash file gets saved using lockedfile.Create(). However, hf.Close() is not executed in a defer statement and it instead happens as the last operation before the function returns:

func hashZip(mod module.Version, zipfile, ziphashfile string) error {
	hash, err := dirhash.HashZip(zipfile, dirhash.DefaultHash)
	if err != nil {
		return err
	}
	if err := checkModSum(mod, hash); err != nil {
		return err
	}
	hf, err := lockedfile.Create(ziphashfile)
	if err != nil {
		return err
	}
	if err := hf.Truncate(int64(len(hash))); err != nil {
		return err
	}
	if _, err := hf.WriteAt([]byte(hash), 0); err != nil {
		return err
	}
	if err := hf.Close(); err != nil {
		return err
	}

	return nil
}

if calls to hf.Truncate or hf.WriteAt return an error, possibly due to there being no disk space, then hf.Close() would not get invoked. As a result the finalizer in lockedfile.OpenFile gets executed when the file is garbage collected, which is where panic is invoked. This potentially would explain the misleading stack trace that you're seeing, instead of seeing the error bubbled up properly.

My suggested fix here is to use defer so that the file is closed regardless of whether an error occurs. Am I on the right path here?

@bcmills
Copy link
Member Author

bcmills commented Feb 3, 2022

@aschade92, thanks for the analysis, and I think you've got the right fix in mind for hashZip.

(For this specific issue I had thought that we were explicitly recording the call site in lockedfile.OpenFile, but it appears that we are doing no such thing. As such, I'll retitle the issue for the specific fix.)

@bcmills bcmills changed the title cmd/go/internal/lockedfile: Open records wrong stack frame cmd/go: lockedfile.File leaked in case of write error Feb 3, 2022
@aschade92
Copy link
Contributor

@bcmills do you have any suggestions about how to test the fix here? the fix itself is trivial but it is difficult to replicate in a test.

What I would like to do is somehow get lockedfile.Create to return a non nil error while having either WriteAt or Truncate return an error. Then we could assert that the file was properly closed. I've tried creating temporary files with different permissions using the os package, but any permission errors will get returned when trying to create a locked file and not when writing to it. This leads to the file leak not happening in the test while using this strategy.

In lieu of other ideas, what I would do is define a method variable that tests can override so we can mock the result of lockedfile.Create. Then it would be easy to assert that Close() was called on the file after Truncate or WriteAt returns an error. However, I consider it to be an anti pattern to write functional code whose sole purpose is to facilitate a test, so let me know if you are aware of a better approach here.

@bcmills
Copy link
Member Author

bcmills commented Feb 7, 2022

Unfortunately I'm not sure how we could recreate the error in a test. We would need the test to successfully create and lock a file, but then fail to write to it — and that means either a full disk or a sudden disk failure.

It would indeed be possible to reproduce the failure using a mock, but given that it is straight-line code I think in this case that sounds like more trouble that it is worth. (We can code-review the error paths carefully and call it a day.)

aschade92 added a commit to aschade92/go that referenced this issue Feb 7, 2022
The go modules download command has a method called hashZip which
checks the hash of zip file versus an expected value, and then writes
it out to a file. If the write fails though, we do not close the file
leading to it being leaked. This could happen if the user runs out of
disk space, causing the underlying OS write command to fail. Ultimately
this led to a panic in lockfile.OpenFile which was invoked from a finalizer
garbage collecting the leaked file. The result was a stack trace that didn't
show the call stack from where the write failed.

Fixes issue golang#50858
aschade92 added a commit to aschade92/go that referenced this issue Feb 7, 2022
The go modules download command has a method called hashZip which checks the
hash of a zipped directory versus an expected value, and then writes it out
to a file. In the event that the write operation is not successful, we do
not close the file, leading to it being leaked. This could happen if the
user runs out of disk space, causing the underlying OS write command to
return an error. Ultimately, this led to a panic in lockfile.OpenFile which
was invoked from a finalizer garbage collecting the leaked file. The result
was a stack trace that didn't show the call stack from where the write
operation actually failed.

Fixes issue golang#50858
@gopherbot
Copy link

Change https://go.dev/cl/383915 mentions this issue: cmd/go: lockedfile.File leaked in case of write error

aschade92 added a commit to aschade92/go that referenced this issue Feb 10, 2022
The go modules download command has a method called hashZip which checks the
hash of a zipped directory versus an expected value, and then writes it out
to a file. In the event that the write operation is not successful, we do
not close the file, leading to it being leaked. This could happen if the
user runs out of disk space, causing the underlying OS write command to
return an error. Ultimately, this led to a panic in lockfile.OpenFile which
was invoked from a finalizer garbage collecting the leaked file. The result
was a stack trace that didn't show the call stack from where the write
operation actually failed.

Fixes issue golang#50858
aschade92 added a commit to aschade92/go that referenced this issue Feb 10, 2022
…write errors

The go modules download command has a method called hashZip which checks the
hash of a zipped directory versus an expected value, and then writes it out
to a file. In the event that the write operation is not successful, we do
not close the file, leading to it being leaked. This could happen if the
user runs out of disk space, causing the underlying OS write command to
return an error. Ultimately, this led to a panic in lockfile.OpenFile which
was invoked from a finalizer garbage collecting the leaked file. The result
was a stack trace that didn't show the call stack from where the write
operation actually failed.

Fixes golang#50858
aschade92 added a commit to aschade92/go that referenced this issue Feb 11, 2022
…write errors

The go modules download command has a method called hashZip which checks the
hash of a zipped directory versus an expected value, and then writes it out
to a file. In the event that the write operation is not successful, we do
not close the file, leading to it being leaked. This could happen if the
user runs out of disk space, causing the underlying OS write command to
return an error. Ultimately, this led to a panic in lockfile.OpenFile which
was invoked from a finalizer garbage collecting the leaked file. The result
was a stack trace that didn't show the call stack from where the write
operation actually failed.

Fixes issue golang#50858
aschade92 added a commit to aschade92/go that referenced this issue Feb 11, 2022
…write errors

The go modules download command has a method called hashZip which checks the
hash of a zipped directory versus an expected value, and then writes it out
to a file. In the event that the write operation is not successful, we do
not close the file, leading to it being leaked. This could happen if the
user runs out of disk space, causing the underlying OS write command to
return an error. Ultimately, this led to a panic in lockfile.OpenFile which
was invoked from a finalizer garbage collecting the leaked file. The result
was a stack trace that didn't show the call stack from where the write
operation actually failed.

Fixes golang#50858
@tamanna2702
Copy link

Hi, I am on version 1.17.3 and I'm getting the same error on go mod download. Is there a way to bypass this error ?

`go/pkg/mod/cache/download/rsc.io/quote/v3/@v/v3.1.0.ziphash became unreachable without a call to Close

goroutine 5 [running]:
cmd/go/internal/lockedfile.OpenFile.func1(0x178)
/usr/local/go/src/cmd/go/internal/lockedfile/lockedfile.go:54 +0x74
The command '/bin/sh -c go mod download -x' returned a non-zero code: 2`

@golang golang locked and limited conversation to collaborators Feb 21, 2023
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
FrozenDueToAge help wanted NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one.
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants