It is hard to reproduce, but when using FUSE a mount point sometimes becomes disconnected. In this case an lstat() of it fails with "transport endpoint is not connected" (search for that message to see that it is a common thing). The real issue is that when a directory contains a disconnected mount point like this, Go's readdir variants (os.Open + Readdir, and ioutil.ReadDir()) break out of the loop as soon as they cannot Lstat one of the files in the directory. This returns no results at all in the case of ioutil.ReadDir(), or randomly fewer results in the case of os.Open + Readdir.
This is very surprising and leads to weird program failures because an external event (unclean mount point) suddenly causes the entire directory listing to fail in Go programs. The directory is fine for e.g. ls - you can see how /bin/ls shows it:
$ ls -l /
ls: cannot access '/mnt': Transport endpoint is not connected
drwxr-xr-x 2 root root 12288 Aug 24 07:35 bin
drwxr-xr-x 3 root root 4096 Aug 30 06:38 boot
drwxr-xr-x 2 root root 4096 May 3 15:48 lib64
drwx------ 2 root root 16384 Sep 26 2017 lost+found
drwxr-xr-x 5 root root 4096 Feb 9 2018 media
d????????? ? ? ? ? ? mnt
drwxr-xr-x 4 root root 4096 Jun 28 12:41 opt
dr-xr-xr-x 362 root root 0 Jun 27 23:02 proc
-rw------- 1 root root 2147483648 Dec 17 2017 swapfile
dr-xr-xr-x 13 root root 0 Sep 1 01:28 sys
drwxrwxrwt 89 root root 102400 Sep 1 01:51 tmp
drwxr-xr-x 14 root root 4096 Dec 29 2017 usr
The issue is in os/dir_unix.go exiting out of the loop in case of an lstat error.
A more robust implementation is possible by copying the code out of dir_unix.go and modifying it (https://play.golang.org/p/UFvrro7cqqu), so maybe this is a valid workaround. But IMHO almost every user of ioutil.ReadDir() expects to get some results back. It is unexpected for the entire directory listing to fail because someone has put a FUSE mount inside it.
Semantically to me at least, when I call ReadDir on a directory (say "/"), then any error that I get back should relate to the directory itself. I was surprised that I was unable to ReadDir("/") when a subdir of "/" was actually un-stat'able. The error does not seem to relate to the thing I was calling the function on.
What did you expect to see?
I expected partial results with all the results that could be stat'ed. Maybe add an empty FileInfo for the bad mount point or omit it.
What did you see instead?
ReadDir() returns an error and no results - weird since I can totally do an "ls -l /" and this works fine.
Title changed to "os.Readdir swallows partial results if it fails to Lstat any file in the listing." on Aug 31, 2018.
But there was an odd place where ioutil.ReadDir() was used instead of our fixed version.
We should not need to reinvent parts of the standard library - please consider fixing this for good. It is extremely frustrating to spend time chasing bugs in standard libraries - resorting to very careful analysis of strace output to notice that getdents64 returns 82 entries but we only have 52 newfstatat() calls!
This is a high-severity bug; it should get more attention.