os: Readdir swallows partial results if it fails to Lstat any file in the listing #27416
Comments
Just spent 2+ hours debugging a problem caused by this bug. It should not happen! In our project we had to reimplement ReadDir() ourselves to work around it, but there was one stray place where ioutil.ReadDir() was used instead of our fixed version. We should not need to reinvent parts of the standard library; please consider fixing this for good. It is extremely frustrating to spend time chasing bugs in the standard library, resorting to very careful analysis of strace output to notice that getdents64 returns 82 entries but there are only 52 newfstatat() calls! This is a high-severity bug and it should get more attention.
I don't know what the right thing to do is here. What should happen if there is more than one error calling
Just hit the same bug again on macOS. It turns out there is a standard file on macOS that cannot normally be stat'ed, meaning the entire directory cannot be listed:
I suspect it is therefore impossible to list /var/db/ on macOS.
Workaround upstream go library bug needs to be applied to darwin too and probably freebsd as well. Ref: golang/go#27416
Now that errors.Join is a thing, maybe it can be used to handle multiple errors?
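To make the errors.Join idea concrete, here is a minimal sketch, assuming Go 1.20 or later (where errors.Join exists). statAll is a hypothetical helper, not anything in the os package: it keeps every FileInfo it can get and folds the per-file Lstat failures into a single error value that callers can still inspect with errors.Is.

```go
package main

import (
	"errors"
	"fmt"
	"io/fs"
	"os"
)

// statAll is a hypothetical helper, not part of the standard library.
// It calls os.Lstat on every path, keeps the results that succeed,
// and joins all failures into a single error.
func statAll(paths []string) ([]os.FileInfo, error) {
	var infos []os.FileInfo
	var errs []error
	for _, p := range paths {
		fi, err := os.Lstat(p)
		if err != nil {
			errs = append(errs, err) // remember the failure, keep going
			continue
		}
		infos = append(infos, fi)
	}
	// errors.Join returns nil when there is nothing to join.
	return infos, errors.Join(errs...)
}

func main() {
	infos, err := statAll([]string{".", "no-such-entry-27416-a", "no-such-entry-27416-b"})
	fmt.Println("stat'ed:", len(infos))
	// errors.Is looks through the joined error at each wrapped failure.
	fmt.Println("some entry missing:", errors.Is(err, fs.ErrNotExist))
}
```

The caller gets partial results and a non-nil error at the same time, so it can decide for itself whether one bad entry is fatal.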
I don't think it's going to make sense for a function like That said, as of Go 1.16 we now have Looking at the code that does call
Please answer these questions before submitting your issue. Thanks!
What version of Go are you using (go version)?
go version go1.10.1 linux/amd64
Does this issue reproduce with the latest release?
yes
What operating system and processor architecture are you using (go env)?
GOARCH="amd64"
GOHOSTOS="linux"
What did you do?
Test program here
https://play.golang.org/p/Xgvoi1GQGUv
It is hard to reproduce, but when using FUSE the mount point sometimes becomes disconnected. In this case an lstat() of it fails with "transport endpoint is not connected" (Google for that to see that it is a common thing). The real issue is that when a directory contains a disconnected mount point like this, Go's readdir variants (os.Open + Readdir and ioutil.ReadDir()) break out of the loop as soon as they cannot Lstat one of the files in the directory. This returns no results at all in the case of ioutil.ReadDir(), or a seemingly random subset of the results in the case of os.Open + Readdir.
This is very surprising and leads to weird program failures because an external event (unclean mount point) suddenly causes the entire directory listing to fail in Go programs. The directory is fine for e.g. ls - you can see how /bin/ls shows it:
The issue is in os/dir_unix.go exiting out of the loop in case of an lstat error.
A more robust implementation is possible by copying the code out of dir_unix.go and modifying it (https://play.golang.org/p/UFvrro7cqqu), so maybe this is a valid workaround. But IMHO almost every user of ioutil.ReadDir() expects to get some results back. It is unexpected to have the entire directory listing fail just because someone has put a FUSE mount inside it.
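For illustration, a minimal sketch of that kind of workaround follows. It is not the code behind the playground link; readDirSkipBad and demo are hypothetical names, and the sketch assumes Go 1.16 or later for os.MkdirTemp and os.WriteFile. It reads the names with Readdirnames, calls Lstat on each one, and skips the entries that fail instead of aborting. The lstat function is passed in as a parameter only so that a disconnected mount point can be simulated without FUSE.

```go
package main

import (
	"errors"
	"fmt"
	"os"
	"path/filepath"
	"sort"
)

// readDirSkipBad is a hypothetical helper. It lists dir, calls lstat on
// every name, and reports entries that cannot be stat'ed in skipped
// instead of giving up on the whole listing.
func readDirSkipBad(dir string, lstat func(string) (os.FileInfo, error)) (infos []os.FileInfo, skipped []string, err error) {
	f, err := os.Open(dir)
	if err != nil {
		return nil, nil, err
	}
	defer f.Close()

	// Readdirnames only reads names; it does not stat anything.
	// It may return some names together with an error, so use what we got.
	names, err := f.Readdirnames(-1)
	sort.Strings(names)
	for _, name := range names {
		fi, lerr := lstat(filepath.Join(dir, name))
		if lerr != nil {
			skipped = append(skipped, name)
			continue
		}
		infos = append(infos, fi)
	}
	// err here relates to reading the directory itself.
	return infos, skipped, err
}

// demo creates three files and pretends that "bad" is a disconnected mount.
func demo() (listed, skipped int, err error) {
	dir, err := os.MkdirTemp("", "readdir-demo")
	if err != nil {
		return 0, 0, err
	}
	defer os.RemoveAll(dir)
	for _, name := range []string{"a", "bad", "c"} {
		if err := os.WriteFile(filepath.Join(dir, name), nil, 0o644); err != nil {
			return 0, 0, err
		}
	}
	// Simulated failure; a real disconnected FUSE mount would fail inside os.Lstat.
	lstat := func(p string) (os.FileInfo, error) {
		if filepath.Base(p) == "bad" {
			return nil, &os.PathError{Op: "lstat", Path: p, Err: errors.New("transport endpoint is not connected")}
		}
		return os.Lstat(p)
	}
	infos, bad, err := readDirSkipBad(dir, lstat)
	return len(infos), len(bad), err
}

func main() {
	listed, skipped, err := demo()
	fmt.Println(listed, skipped, err)
}
```

In real use the second argument would simply be os.Lstat. The returned error then only ever describes the directory itself, which matches the semantics argued for in this report.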
Semantically to me at least, when I call ReadDir on a directory (say "/"), then any error that I get back should relate to the directory itself. I was surprised that I was unable to ReadDir("/") when a subdir of "/" was actually un-stat'able. The error does not seem to relate to the thing I was calling the function on.
What did you expect to see?
I expected partial results with all the results that could be stat'ed. Maybe add an empty FileInfo for the bad mount point or omit it.
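As a rough sketch of the "empty FileInfo for the bad entry" variant: on Go 1.16 or later (newer than the go1.10.1 in this report), os.ReadDir returns directory entries whose Info method can be called per entry, so one failing entry does not have to poison the rest. listWithInfo and demoCount below are hypothetical names, and whether os.ReadDir avoids stat'ing each entry up front depends on the platform and file system, so treat this as an assumption to verify rather than a guaranteed fix.

```go
package main

import (
	"fmt"
	"io/fs"
	"os"
	"path/filepath"
)

// listWithInfo is a hypothetical helper. It returns one name and one
// info slot per directory entry; the slot is nil when Info fails.
func listWithInfo(dir string) (names []string, infos []fs.FileInfo, err error) {
	// os.ReadDir returns the entries it could read along with any error.
	entries, err := os.ReadDir(dir)
	for _, e := range entries {
		fi, ierr := e.Info()
		if ierr != nil {
			fi = nil // keep the name, leave the info empty
		}
		names = append(names, e.Name())
		infos = append(infos, fi)
	}
	return names, infos, err
}

// demoCount lists a temporary directory holding two files.
func demoCount() (int, error) {
	dir, err := os.MkdirTemp("", "readdir-info-demo")
	if err != nil {
		return 0, err
	}
	defer os.RemoveAll(dir)
	for _, name := range []string{"a", "b"} {
		if err := os.WriteFile(filepath.Join(dir, name), nil, 0o644); err != nil {
			return 0, err
		}
	}
	names, infos, err := listWithInfo(dir)
	if err != nil {
		return 0, err
	}
	if len(names) != len(infos) {
		return 0, fmt.Errorf("names and infos differ in length")
	}
	return len(names), nil
}

func main() {
	n, err := demoCount()
	fmt.Println(n, err)
}
```

Callers that only need names never trigger the failing stat, and callers that need metadata can tell exactly which entries were unreadable.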
What did you see instead?
ReadDir() returns an error and no results - weird since I can totally do an "ls -l /" and this works fine.