-
Notifications
You must be signed in to change notification settings - Fork 17.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
os: MkdirAll returns "file exists" error when actual error is "intput/output error" #8283
Labels
Milestone
Comments
I can't follow the diagnosis. Can you get a trace of the failing call under strace -f? It looks like MkdirAll checked, could not confirm that /mtpt/subdir was an existing directory (perhaps due to error, but that's okay), called itself recursively to work on /mtpt, and then that call returned an error. That error seems to be 'file exists'. So it seems to have nothing to do with /mtpt/subdir and only with /mtpt. Is /mtpt a directory? Does stat of /mtpt work? It seems like Stat("/mtpt") is returning an error but Mkdir("/mtpt") is returning EEXIST. Can you explain how that happens? |
I can't get the strace output at this moment, but perhaps I can better explain. src/pkg/os/path.go starting at line 21 is making the assumption that if Stat() returns an error, the error indicates the file already exists. It never checks to see what the error is, only that the error is not null. So if Stat() is returning some other error, like "intput/output error" in my case, instead of returning the error to the caller it proceeds with business as usual. On line 29 we have the commented assumption "// Doesn't already exist; make sure parent does." This assumption is wrong. The code never checked that the directory doesn't exist, only whether or not Stat() on it generates an error. |
I still believe the code is correct. func MkdirAll(path string, perm FileMode) error { // If path exists, stop with success or error. dir, err := Stat(path) if err == nil { if dir.IsDir() { return nil } return &PathError{"mkdir", path, syscall.ENOTDIR} } // Doesn't already exist; make sure parent does. i := len(path) for i > 0 && IsPathSeparator(path[i-1]) { // Skip trailing path separator. i-- } j := i for j > 0 && !IsPathSeparator(path[j-1]) { // Scan backward over element. j-- } if j > 1 { // Create parent err = MkdirAll(path[0:j-1], perm) if err != nil { return err } } // Now parent exists, try to create. err = Mkdir(path, perm) if err != nil { // Handle arguments like "foo/." by // double-checking that directory doesn't exist. dir, err1 := Lstat(path) if err1 == nil && dir.IsDir() { return nil } return err } return nil } The first bit checks to see if Stat can *succeed*. If so, I think you must agree that path 'exists'. In that case, if it's a directory we're done and if not we return ENOTDIR. This is a fast path. Otherwise, we might as well try creating it. The comment says 'doesn't already exist' and you are correct that that's not strictly true, but it's the right way to think about what follows. It doesn't matter here that Stat might have failed with an I/O error. We figure out what the parent of path must be and call Mkdir recursively on that. If that fails, we stop. If it succeeds, now we know the parent directory definitely does exist, and we call Mkdir on the original path. Whatever Mkdir returns (ignoring the Lstat fixup), we return. In particular, the only way you get EEXIST from this function is if the inner MkdirAll (inside if j > 1) returns it or if Mkdir itself returns it. Since there has to be a base case in the recursion, the EEXIST *must* be coming from the call to the Mkdir system call. You are right that if Stat returns I/O error then we proceed with 'business as usual' but that is *exactly* what we should do. We want to see the error that Mkdir gives us, not the error that Stat gives us. I will recomment the function but I really think everything here is fine. |
CL https://golang.org/cl/151460043 mentions this issue. |
This issue was closed by revision 1eea5ca. Status changed to Fixed. |
Changing the comment does not fix the issue. Your statement "It doesn't matter here that Stat might have failed with an I/O error." is wrong. The user does care what the error is. The caller deserves to know the actually reason the call failed. We have these two errors, occurring in this order: stat /failed-file-system-mount-point/subdirectory-to-create: input/output error mkdir /failed-file-system-mount-point: file exists The important one, the first one, the one the user wants to know about, the one I want to know about, the reason the call is actually failing--is being discarded. Yes, an error is being returned, but it is the wrong error to return. It would be more accurate to mark this issue "not going to fix" than to changing comments and claim it's fixed. As a general design principal, a routine should return the error that caused the routine to fail. Do you agree? Yes, technically, "file exists" is the error that caused the logic to fail. But the IO Error is what caused the routine to fail and claiming otherwise is disingenuous. Further, recursion is beautiful computer science and all, but is it really appropriate here? Looks to me like it's recursion for the sake of recursion. We're not descending a tree here. A filesystem may be a tree, but a file path is a sequence. |
What caused this routine to fail is that Mkdir failed. The Stat is irrelevant. The only way I can make sense of what you are saying is if Stat("/failed-file-system-mount-point") also fails, presumably with EIO. It is only after that that the code would try Mkdir("/failed-file-system-mount-point") which I assume is failing with EEXIST. At that point the code will call Lstat("/failed-file-system-mount-point"), which I guess must *also* be failing with EIO. You seem to be suggesting that it should instead try Mkdir("/failed-file-system-mount-point/subdirectory-to-create") which I guess would fail with the EIO you want to see. It's hard for me to get worked about a case where Stat(filename) fails with EIO but Mkdir(filename) fails with EEXIST. Why doesn't the Mkdir fail with EIO also? |
You are completely misinterpreting what I am suggesting. "What caused this routine to fail is that Mkdir failed. The Stat is irrelevant." <- This is the bug. The stat isn't irrelevant! The code is not distinguishing between the error it assumes stat will return (file doesn't exist) and all the other possible errors stat could return (but it wouldn't know, it never checks!) Mark it as "not going to fix" or reassign to somebody who cares about returning the correct error because it matters. I ran into the bug while writing disk/filesystem stress testing code. Getting the correct error matters to me as that is the point of the software and is a normal occurrence since I'm testing over a petabyte a week. In any case, I long ago stopped using this function in my code. |
I am saying is that a function called MkdirAll should return an error from Mkdir. You are saying that it should return an error from Stat. I don't agree. We can discuss which specific Mkdir call should produce an error that should be returned, but it should definitely be an error from Mkdir. |
That is neither obvious nor correct. MkdirAll should return whatever error caused it to fail. There should be no "earlier error." As soon as MkdirAll encounters an error it's not expecting, that is the reason it failed and that is the error it should return. It should have never even made it to the Mkdir call. It is not "special pleading" to expect a function that fails to return the reason it failed. |
I don't know how I could demonstrate that I understand what you are saying. I am open to suggestions. You keep saying that the function should return the reason that it failed, and you also say that it should return the error from Stat, but the error from Stat is not the reason that it failed. It failed because Mkdir failed. The error from Stat was only used to determine which Mkdir call to make. Changing the status to Unfortunate. Status changed to Unfortunate. |
wheatman
pushed a commit
to wheatman/go-akaros
that referenced
this issue
Jun 25, 2018
The internal comments are not completely precise about what is going on, and they are causing confusion. Fixes golang#8283. LGTM=r R=r CC=golang-codereviews https://golang.org/cl/151460043
wheatman
pushed a commit
to wheatman/go-akaros
that referenced
this issue
Jun 26, 2018
The internal comments are not completely precise about what is going on, and they are causing confusion. Fixes golang#8283. LGTM=r R=r CC=golang-codereviews https://golang.org/cl/151460043
wheatman
pushed a commit
to wheatman/go-akaros
that referenced
this issue
Jul 9, 2018
The internal comments are not completely precise about what is going on, and they are causing confusion. Fixes golang#8283. LGTM=r R=r CC=golang-codereviews https://golang.org/cl/151460043
wheatman
pushed a commit
to wheatman/go-akaros
that referenced
this issue
Jul 30, 2018
The internal comments are not completely precise about what is going on, and they are causing confusion. Fixes golang#8283. LGTM=r R=r CC=golang-codereviews https://golang.org/cl/151460043
This issue was closed.
Sign up for free
to subscribe to this conversation on GitHub.
Already have an account?
Sign in.
by Zachary.Drew:
The text was updated successfully, but these errors were encountered: