Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Directory becomes temporarily inaccessible after it has been moved #92

Closed
tkuosmanen opened this issue Feb 5, 2020 · 5 comments · Fixed by #93
Closed

Directory becomes temporarily inaccessible after it has been moved #92

tkuosmanen opened this issue Feb 5, 2020 · 5 comments · Fixed by #93
Assignees

Comments

@tkuosmanen
Copy link

When moving directories inside a sandbox, it takes a few seconds for the directories to become accessible. Directories are listed, but ls cannot stat() the directory entries.

For reproducing the issue, Bash auto completion has to be used when typing the directory names. By copy-pasting the commands as is, the directory that gets moved is immediately accessible.

Terminal 1:

$ mkdir -p ./out ./new_sandbox
$ sandboxfs --version
sandboxfs 0.1.1
$ sandboxfs --mapping=ro:/:/ --mapping=rw:/out:$PWD/out ./new_sandbox
...

Terminal 2:

$ cd new_sandbox/out/
$ mkdir target to_be_moved
$ mv to_be_moved/ target/  # <- Use TAB here for auto completion
$ while sleep 1; do date && ls -l target; done
Wed Feb  5 13:26:07 EET 2020
ls: cannot access 'target/to_be_moved': No such file or directory
total 0
d????????? ? ? ? ?            ? to_be_moved
Wed Feb  5 13:26:08 EET 2020
ls: cannot access 'target/to_be_moved': No such file or directory
total 0
d????????? ? ? ? ?            ? to_be_moved

...

Wed Feb  5 13:26:52 EET 2020
total 0
drwxrwxr-x 2 tkuosmanen tkuosmanen 2 Feb  5 13:25 to_be_moved

In the example above, it took roughly 45 seconds for the directory to become accessible.

This was run on Ubuntu 16.04 workstation with 4.4.0-133-generic kernel and ext4 root file system. Sandboxfs is pre-built release 0.1.1 from sandboxfs-0.1.1-20191024-linux-x86_64.tgz.

The issue has been reproduced also on Ubuntu 18.04 workstation with 4.15.0-72-generic kernel.

@jmmv
Copy link
Contributor

jmmv commented Feb 5, 2020

Huh! Funny that you report this issue just today.

I've spent all day troubleshooting another problem I encountered and found a pretty serious bug in readdir (where we recreate nodes for directories every time). I suspect the fix to that, which I'm testing right now, will also resolve this.

@jmmv jmmv self-assigned this Feb 5, 2020
jmmv added a commit to jmmv/sandboxfs that referenced this issue Feb 6, 2020
The readdir implementation would generate new nodes for every directory
it encountered, even if they had seen before, returning new inodes each
time.  Other than this having a performance impact, this is just
incorrect and I'm not sure how this bug has survived for so long.

In fact, the only reason I noticed this is because I tried to build
GNU tar on top of sandboxfs and its configure script would fail when
run as root.  Essentially, the test was doing a rename of a directory
inside a directory, and that somehow caused the pwd to become invalid
when performing a readdir on the parent directory.

Should fix bazelbuild#92.
@jmmv
Copy link
Contributor

jmmv commented Feb 6, 2020

Alright. The problem doesn't reproduce on macOS as you describe it because of, I believe, differences in OSXFUSE. But I could reproduce it on Linux; thanks for the report.

This is, indeed, the same problem I found earlier today, which manifested in a different manner but also involved a rename along with reading the directory contents (what you triggered with tab completion). The problem was that because we don't cache directory nodes, readdir was incorrectly regenerating new nodes for subdirectories every time (but was doing the correct thing for files).

I've just pushed a fix to Travis for testing and will merge when tests pass.

jmmv added a commit to jmmv/sandboxfs that referenced this issue Feb 6, 2020
The readdir implementation would generate new nodes for every directory
it encountered, even if they had seen before, returning new inodes each
time.  Other than this having a performance impact, this is just
incorrect and I'm not sure how this bug has survived for so long.

In fact, the only reason I noticed this is because I tried to build
GNU tar on top of sandboxfs and its configure script would fail when
run as root.  Essentially, the test was doing a rename of a directory
inside a directory, and that somehow caused the pwd to become invalid
when performing a readdir on the parent directory.  The issue is easier
to reproduce on Linux though.

Fixes bazelbuild#92.
@jmmv jmmv closed this as completed in #93 Feb 6, 2020
jmmv added a commit that referenced this issue Feb 6, 2020
The readdir implementation would generate new nodes for every directory
it encountered, even if they had seen before, returning new inodes each
time.  Other than this having a performance impact, this is just
incorrect and I'm not sure how this bug has survived for so long.

In fact, the only reason I noticed this is because I tried to build
GNU tar on top of sandboxfs and its configure script would fail when
run as root.  Essentially, the test was doing a rename of a directory
inside a directory, and that somehow caused the pwd to become invalid
when performing a readdir on the parent directory.  The issue is easier
to reproduce on Linux though.

Fixes #92.
@tkuosmanen
Copy link
Author

Thanks for your extremely quick response to the issue!

I built the latest top of tree (855834e) and it fixes the issue described in my initial posting.

However, there seems to be a new issue with files located in directories that get moved. Below is an error case that I can reliably reproduce with my own build of 855834e:

Terminal 1:

mkdir -p ./out ./new_sandbox
sandboxfs --mapping=ro:/:/ --mapping=rw:/out:$PWD/out ./new_sandbox
...

Terminal 2:

$ cd new_sandbox/out/
$ mkdir target to_be_moved
$ echo dummy > to_be_moved/dummy.txt
$ mv to_be_moved/ target/
$ echo "contents of target:" && ls -l target && echo "contents of target/to_be_moved:" && ls -l target/to_be_moved
contents of target:
total 0
drwxrwxr-x 2 tkuosmanen tkuosmanen 2 Feb  7 13:11 to_be_moved
contents of target/to_be_moved:
ls: cannot access 'target/to_be_moved/dummy.txt': No such file or directory
total 0
-????????? ? ? ? ?            ? dummy.txt

There's no need to use TAB auto-completion to trigger the issue. In addition, target/to_be_moved/dummy.txt is stuck in a bad state. Waiting for a minute or two does not help in this case.

@jmmv
Copy link
Contributor

jmmv commented Feb 7, 2020

Would you mind filing a separate bug for this problem please? This issue is now closed and referenced in the commit/changelog already so it'd be confusing to talk about its number for a different problem.

I think I know what's happening and it's not the same root cause. (During a directory rename, we update the inode cache for the directory itself... but not for any of its contents.) Wow, I'm surprised these obvious problems haven't shown up until now, as I've been (ab)using sandboxfs quite a bit with Bazel builds or with pkg_comp builds. But I think OSXFUSE and Linux's FUSE behave quite differently in these cases, and I have done very little testing on the latter.

@tkuosmanen
Copy link
Author

Sure thing. Filed #94.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants