os: Remove can result in unreclaimed space due to interaction between syscall.Unlink and filesystem #16452
Comments
Isn't it a tmpfs bug? POSIX does allow unlinking a directory if the system
allows it, but not reclaiming the storage still seems like a bug.
Making os.Remove always try Rmdir first would make it issue one unnecessary
syscall for nearly all its uses.
We might be able to fix RemoveAll to use Rmdir directly on directories
because it already knows the type.
|
Yeah, sounds like a Solaris tmpfs kernel bug. Please provide a sample program to demonstrate it. |
This is a tmpfs bug. But checking whether something is a directory and calling rmdir on it instead of unlink seems like good practice. There are a lot of filesystems out there, and the POSIX spec clearly says that unlinking a directory is not necessarily supported everywhere. My team is already working to address this in Illumos, but it will likely remain a problem in Solaris and in earlier versions that are still in active use but not really maintained, like OpenSolaris and OpenIndiana. As for a sample program, I am assuming you would like a playground link? Thanks for considering this! |
Glad we helped find a bug! :) From memory, this seems like about the 9th kernel bug Go has accidentally brought to light in various operating systems.
Or a gist. Whatever you'd like. Something I can click and read online, even if it doesn't run. |
Bugs exposed in indirect ways like these are always awesome, and certainly appreciated. :) Here's that link: https://play.golang.org/p/u8fVnADfsN Program is naive, but should communicate intent. Thanks again! |
Solaris does not support using unlink() on directories on ZFS or tmpfs. Solaris 10 and newer (at least) fail with EPERM if you attempt this on either a ZFS or a tmpfs filesystem, so the tmpfs bug appears to be specific to Illumos. Using the sample program, we can see this if we truss it on any version of Solaris 10+:
On Solaris, Go should always use rmdir() instead, not only to avoid the unnecessary overhead of trying and failing repeatedly, but also to avoid the other undesirable semantics if it is allowed:
Note the "no cleanup" bit above. In short, Go should not generally be using unlink() to remove directories, at least on Solaris, since it's unlikely to ever work as expected. OpenSolaris-based derivatives may behave differently of course. |
The source explains why Go does what it does:

	func Remove(name string) error {
		// System call interface forces us to know
		// whether name is a file or directory.
		// Try both: it is cheaper on average than
		// doing a Stat plus the right one.
		e := syscall.Unlink(name)
		if e == nil {
			return nil
		}
		e1 := syscall.Rmdir(name)
		if e1 == nil {
			return nil
		}
		// Both failed: figure out which error to return.
		// OS X and Linux differ on whether unlink(dir)
		// returns EISDIR, so can't use that. However,
		// both agree that rmdir(file) returns ENOTDIR,
		// so we can use that to decide which error is real.
		// Rmdir might also return ENOTDIR if given a bad
		// file path, like /etc/passwd/foo, but in that case,
		// both errors will be ENOTDIR, so it's okay to
		// use the error from unlink.
		if e1 != syscall.ENOTDIR {
			e = e1
		}
		return &PathError{"remove", name, e}
	}

As long as Solaris returns an error on Unlink of a directory and doesn't, like, catch on fire or something, I think we're good here. |
Please answer these questions before submitting your issue. Thanks!
Go version (go version): 1.6.3
OS and architecture (go env): Solaris variant (Illumos-based BrickstorOS)
This issue has been reproduced through a number of tests, all done on a tmpfs filesystem. As far as we know, tmpfs as implemented on Solaris and its variants does not reclaim the space associated with the children of a directory when the directory is removed through an unlink syscall without calling rmdir. Based on the code in os.Remove, we first unlink and check for errors. If syscall.Unlink does not return an error, we assume we are done and return. Otherwise we try the next case, which is syscall.Rmdir. In our case, because Unlink does not return an error, we never get to the Rmdir call, so the directory is apparently removed, but the capacity that was associated with its children is not reclaimed until after the filesystem is unmounted.
If possible, provide a recipe for reproducing the error.
The recipe is quite straightforward. This is reproducible every time with the os.RemoveAll function, which uses os.Remove, where the problem seems to be.
Expected the directory to be removed and the space associated with all files under that directory to be freed.
Space is not freed; the directory and its contents only appear to be removed.