IOS Simulator on ZFS homedir deadlocks #306

Closed
brendonhumphrey opened this issue May 8, 2015 · 5 comments

@brendonhumphrey
Contributor

Running a default iOS application in the iOS Simulator with the project stored on ZFS results in a ZFS deadlock.

The suspect thread is Thread 0x6936; its stack dump is consistent across a number of samples.

Spindump follows:

Date/Time: 2015-05-08 14:20:56 +1000
OS Version: 10.10.3 (Build 14D136)
Architecture: x86_64
Report Version: 21

Duration: 9.99s
Steps: 1000 (10ms sampling interval)

Hardware model: MacPro3,1
Active cpus: 8

Fan speed: 499 rpm


Heavy format: stacks are sorted by count

Use -i and -timeline to re-report with chronological sorting

Process: accountsd [455]
Path: /System/Library/Frameworks/Accounts.framework/Versions/A/Support/accountsd
Architecture: x86_64
Parent: launchd [1]
UID: 504
Sudden Term: Clean (allows idle exit)
Task size: 4235 pages

Thread 0xfc0 DispatchQueue 1 1000 samples (1-1000) priority 4
1000 start + 1 (libdyld.dylib + 13769) [0x7fff83c295c9]
1000 ??? (accountsd + 2741) [0x10211cab5]
1000 CFRunLoopRunSpecific + 296 (CoreFoundation + 465880) [0x7fff87e81bd8]
1000 __CFRunLoopRun + 1371 (CoreFoundation + 467835) [0x7fff87e8237b]
1000 __CFRunLoopServiceMachPort + 212 (CoreFoundation + 470708) [0x7fff87e82eb4]
1000 mach_msg_trap + 10 (libsystem_kernel.dylib + 70878) [0x7fff83c144de]
*1000 ipc_mqueue_receive_continue + 0 (kernel + 1148928) [0xffffff8000318800]

Thread 0xfc5 DispatchQueue 2 1000 samples (1-1000) priority 4
1000 _dispatch_mgr_thread + 52 (libdispatch.dylib + 19050) [0x7fff8cadca6a]
1000 kevent64 + 10 (libsystem_kernel.dylib + 94770) [0x7fff83c1a232]
*1000 ??? (kernel + 5989984) [0xffffff80007b6660]

Binary Images:
0x10211c000 - 0x10211cfff accountsd (504.10) <04A110E1-7397-35CF-B77A-093318CEF5C8> /System/Library/Frameworks/Accounts.framework/Versions/A/Support/accountsd
0x7fff83c03000 - 0x7fff83c20fff libsystem_kernel.dylib (2782.20.48) /usr/lib/system/libsystem_kernel.dylib
0x7fff83c26000 - 0x7fff83c29fff libdyld.dylib (353.2.1) <9EACCA38-291D-38CC-811F-7E9D1451E2D3> /usr/lib/system/libdyld.dylib
0x7fff87e10000 - 0x7fff881a8fff com.apple.CoreFoundation 6.9 (1153.18) <5C0892B8-9691-341F-9279-CA3A74D59AA0> /System/Library/Frameworks/CoreFoundation.framework/Versions/A/CoreFoundation
0x7fff8cad8000 - 0x7fff8cb02fff libdispatch.dylib (442.1.4) <502CF32B-669B-3709-8862-08188225E4F0> /usr/lib/system/libdispatch.dylib
*0xffffff8000200000 - 0xffffff80009fffff kernel (2782.20.48) <4B3A11F4-77AA-3D27-A22D-81A1BC5B504D> /System/Library/Kernels/kernel

Process: accountsd [700] (zombie)
Architecture: x86_64
Parent: launchd [1]
UID: 504
Sudden Term: Dirty (allows idle exit)
Task size: 1 pages
Note: Suspended for 1000 samples
Note: Terminated (zombie) for 1000 samples

Thread 0x6936 1000 samples (1-1000) priority 46
*1000 hndl_unix_scall64 + 22 (kernel + 2315126) [0xffffff8000435376]
*1000 unix_syscall64 + 662 (kernel + 6601350) [0xffffff800084ba86]
*1000 open_dprotected_np + 421 (kernel + 3469253) [0xffffff800054efc5]
*1000 open1 + 552 (kernel + 3466680) [0xffffff800054e5b8]
*1000 vn_open_auth + 690 (kernel + 3549602) [0xffffff80005629a2]
*1000 vn_create + 492 (kernel + 3423244) [0xffffff8000543c0c]
*1000 VNOP_CREATE + 95 (kernel + 3600479) [0xffffff800056f05f]
*1000 zfs_vnop_create + 104 (zfs_vnops_osx.c:533,10 in zfs + 614600) [0xffffff7f80e710c8]
*1000 zfs_create + 1179 (zfs_vnops.c:1796,3 in zfs + 589371) [0xffffff7f80e6ae3b]
*1000 zfs_log_create + 267 (zfs_log.c:281,8 in zfs + 546827) [0xffffff7f80e6080b]
*1000 zil_itx_create + 41 (zil.c:1194,8 in zfs + 656713) [0xffffff7f80e7b549]
*1000 zfs_kmem_alloc + 266 (spl-kmem.c:2366,15 in spl + 9962) [0xffffff7f80da66ea]
*1000 vmem_alloc + 1197 (spl-vmem.c:1298,11 in spl + 63213) [0xffffff7f80db36ed]
*1000 vmem_xalloc + 787 (spl-vmem.c:1183,3 in spl + 57123) [0xffffff7f80db1f23]
*1000 spl_cv_wait + 52 (spl-condvar.c:68,12 in spl + 5956) [0xffffff7f80da5744]
*1000 msleep + 98 (kernel + 6142226) [0xffffff80007db912]
*1000 ??? (kernel + 6143321) [0xffffff80007dbd59]
*1000 lck_mtx_sleep + 134 (kernel + 1276342) [0xffffff80003379b6]
*1000 thread_block_reason + 175 (kernel + 1319343) [0xffffff80003421af]
*1000 ??? (kernel + 1329236) [0xffffff8000344854]
*1000 machine_switch_context + 367 (kernel + 2176511) [0xffffff80004135ff]

@brendonhumphrey
Contributor Author

It looks like all that's required to trigger this is to launch the iOS Simulator (/Applications/Xcode/Contents/Developer/Applications/iOS.....) while in a ZFS home directory.

@brendonhumphrey
Contributor Author

Instrumentation of the zfs_vnop_create call and its descendants reveals a collision between va_dataprotect_class and AT_XVATTR, which causes us to attempt to use xvap->xva_mapsize, which is uninitialised. This results in a memory allocation request that is impossible to meet, and we deadlock.

In zfs_vnop_create(), adding ...

VATTR_CLEAR_ACTIVE(vap, va_dataprotect_class);

... before the call ...

error = zfs_create(ap->a_dvp, cnp->cn_nameptr, vap, excl, mode,

is a proof-of-concept workaround and allows the iOS Simulator to run.

@lundman
Contributor

lundman commented May 11, 2015

@brendonhumphrey
Contributor Author

Tested, works great.

@rottegift
Contributor

@brendonhumphrey With this fix applied, I get panics that I have never seen without it.

They are easy for me to trigger, and easy to get rid of (simply revert the above fix).

I don't understand them, though, since ZFS is not in the stack:

Anonymous UUID:       BF0E83A0-DDA2-E1D8-2BA8-758F8C501222

Mon May 11 20:29:57 2015
panic(cpu 0 caller 0xffffff80238dcc1d): Kernel trap at 0xffffff80239fd1c4, type 14=page fault, registers:
CR0: 0x0000000080010033, CR2: 0x0000000000000058, CR3: 0x00000001ba8fc0b2, CR4: 0x00000000001606e0
RAX: 0x000000000000000b, RBX: 0xffffff804b1da5a0, RCX: 0x0000000000000000, RDX: 0xffffff804b075930
RSP: 0xffffff82394ab720, RBP: 0xffffff82394ab790, RSI: 0xffffff82394ab930, RDI: 0xffffff82394ab740
R8:  0xffffff804b1da5a0, R9:  0x0000000000000000, R10: 0xffffff8049f24750, R11: 0x0000000000000246
R12: 0xffffff804b075930, R13: 0xffffff804b075930, R14: 0xffffff804b1da5a0, R15: 0xffffff82394ab930
RFL: 0x0000000000010246, RIP: 0xffffff80239fd1c4, CS:  0x0000000000000008, SS:  0x0000000000000010
Fault CR2: 0x0000000000000058, Error code: 0x0000000000000000, Fault CPU: 0x0

Backtrace (CPU 0), Frame : Return Address
0xffffff82394ab3b0 : 0xffffff8023823139 mach_kernel : _panic + 0xc9
0xffffff82394ab430 : 0xffffff80238dcc1d mach_kernel : _kernel_trap + 0x8ed
0xffffff82394ab600 : 0xffffff80238f4486 mach_kernel : _return_from_trap + 0xe6
0xffffff82394ab620 : 0xffffff80239fd1c4 mach_kernel : _vnode_getattr + 0x74
0xffffff82394ab790 : 0xffffff80239e0094 mach_kernel : _vnode_authorize_init + 0x364
0xffffff82394abb10 : 0xffffff8023bb6f6c mach_kernel : _kauth_authorize_action + 0x4c
0xffffff82394abb70 : 0xffffff80239e0e7b mach_kernel : _vnode_authorize + 0x5b
0xffffff82394abbc0 : 0xffffff80239f66c0 mach_kernel : _vn_stat + 0x40
0xffffff82394abc00 : 0xffffff80239f248a mach_kernel : _vfs_purge + 0xa7a
0xffffff82394abd80 : 0xffffff80239ea70c mach_kernel : _lstat64 + 0x9c
0xffffff82394abf50 : 0xffffff8023c41a23 mach_kernel : _unix_syscall64 + 0x1f3
0xffffff82394abfb0 : 0xffffff80238f4c86 mach_kernel : _hndl_unix_scall64 + 0x16

BSD process name corresponding to current thread: talagent
