Can trigger hangs in loopback FS with find and SIGINT on Sierra #314

Open
akalin-keybase opened this Issue Sep 21, 2016 · 8 comments

Repro steps:

  1. Clone https://github.com/osxfuse/filesystems
  2. Go to filesystems/filesystems-c/loopback and build it by running make
  3. mkdir /tmp/loopback
  4. ./loopback /tmp/loopback -omodules=threadid:subdir,subdir=/ -oallow_other,native_xattr,volname=LoopbackFS
  5. Do find /tmp/loopback, then hit Ctrl-C.

On OS X El Capitan, this works fine. However, on Sierra, this causes the find to hang, causes any subsequent find to hang at around the same spot, and causes any attempted unmount (via diskutil unmount /tmp/loopback, with or without force) to hang. It also causes shutdown/restart to hang.

FYI, this is possibly the minimal repro of a problem we've been seeing in KBFS, so hopefully fixing this problem will fix our KBFS problem, but it's worth fixing anyway.
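
For anyone repeating the repro, a small watchdog along these lines may help report a hang instead of blocking the terminal once the mount is wedged. This is only a sketch: the helper name `run_with_deadline` and the exit code 124 are assumptions chosen for illustration, not part of osxfuse or the loopback example.

```shell
#!/bin/sh
# Hypothetical helper (not part of osxfuse): run a command in the
# background and report if it has not finished after a deadline.
run_with_deadline() {
    deadline=$1   # seconds to wait before reporting a possible hang
    shift
    "$@" &
    pid=$!
    waited=0
    # Poll once per second until the command exits or the deadline passes.
    while kill -0 "$pid" 2>/dev/null; do
        if [ "$waited" -ge "$deadline" ]; then
            echo "possible hang: '$*' still running after ${deadline}s" >&2
            return 124
        fi
        sleep 1
        waited=$((waited + 1))
    done
    # The command finished in time; propagate its exit status.
    wait "$pid"
}
```

After triggering the hang with Ctrl-C as in step 5, something like `run_with_deadline 10 find /tmp/loopback >/dev/null` or `run_with_deadline 10 diskutil unmount /tmp/loopback` would then print a warning rather than wait forever. The helper only reports; it does not try to kill the stuck process.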

@bfleischer bfleischer self-assigned this Sep 21, 2016

bfleischer Sep 21, 2016

Member

Thanks for the repro steps. I'm able to reproduce the hang.

@bfleischer bfleischer added the bug label Sep 21, 2016

bfleischer Sep 22, 2016

Member

The hangs are caused by changes Apple made to the Sierra kernel, combined with FUSE using a kernel-private struct. The bad news is that Apple has not yet released the kernel sources for Sierra. With the sources, this would be an easy fix. I'm currently working on a workaround.

gabriel Sep 22, 2016

I see a kernel debug kit for 16A304A which I think is Sierra beta 7, but I don't know if that is helpful at all.

bfleischer Sep 23, 2016

Member

Are you able to reproduce the issue with file system operations other than readdir?

akalin-keybase Sep 23, 2016

I haven't tried. The symptom in kbfs and in the bazil hello filesystem (hellofs) is that when the executable (which doesn't autodaemonize) is running, Ctrl-C'ing it causes a hang similar to the find hang in this repro.

However, the kbfs / hellofs executable itself doesn't seem to hang, and processes requests fine -- it seems like the kernel itself hangs when the executable exits.

I don't know if the above is helpful -- should I try reproing with other file system operations?

@maxtaco maxtaco referenced this issue in keybase/kbfs Sep 26, 2016

Closed

macOS Support #379

bfleischer added a commit to osxfuse/kext that referenced this issue Sep 26, 2016

Work around macOS 10.12 interrupt handling
This fixes interrupt handling for the macOS 10.12 production kernel.
The issue still persists on debug and development kernels.

This addresses osxfuse/osxfuse#314

bfleischer Sep 26, 2016

Member

I've looked into this for a couple of days now but, at this point, all I can offer is a workaround that should work for most users.

Apple seems to have changed the way the kernel handles interrupts. So far I have not found a way to get interrupt handling to work on Sierra using only public APIs. This is one part of the problem.

The other part is that Apple has made changes to the (private) thread_t data structure in the kernel. The offset of the field that FUSE needs to access has changed. Using lldb I was able to fix this, but the problem with this approach is that the offset differs between the production, debug, and development kernels. The current workaround works for the production kernel but not for the other two.

The new 3.5.2 release contains the workaround. Please let me know if it works for you.
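
Since the workaround only covers the production kernel, it may be useful to check which kernel variant a machine is running before testing. The sketch below assumes that the `uname -v` string ends in a build tag such as `RELEASE_X86_64`, `DEVELOPMENT_X86_64`, or `DEBUG_X86_64`; the function name `kernel_variant` is hypothetical and the mapping should be checked against your own system.

```shell
#!/bin/sh
# Hypothetical helper: classify a `uname -v` style string by build tag.
# Assumption: the tag after the last "/" starts with RELEASE, DEVELOPMENT
# or DEBUG, e.g. "...; root:xnu-3789.1.32~3/RELEASE_X86_64".
kernel_variant() {
    # Use the given string, or ask the running kernel if none is given.
    version_string=${1:-$(uname -v)}
    case "$version_string" in
        */RELEASE_*)     echo production ;;
        */DEVELOPMENT_*) echo development ;;
        */DEBUG_*)       echo debug ;;
        *)               echo unknown ;;
    esac
}
```

Calling `kernel_variant` with no argument inspects the running kernel; anything other than `production` would suggest the 3.5.2 workaround may not apply.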

akalin-keybase Sep 26, 2016

It looks like 3.5.2 fixes both this minimal repro and our KBFS hangs. Thanks a lot! I'll let you know if I run into anything else, but as far as I can tell, KBFS now behaves normally. 🍰

akalin-keybase Sep 28, 2016

@bfleischer Can you elaborate a bit more on Apple changing the way the kernel handles interrupts? Are you saying that interrupt handling shouldn't work with OSXFUSE on Sierra? From my testing, Ctrl-C etc. seem to work fine.
