Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Crash in OSSpinLockSynchronization #21

Closed
heiberg opened this issue Dec 3, 2015 · 8 comments
Closed

Crash in OSSpinLockSynchronization #21

heiberg opened this issue Dec 3, 2015 · 8 comments

Comments

@heiberg
Copy link
Contributor

heiberg commented Dec 3, 2015

This is an unreproducible one-off crash, so I just wanted to leave it here for future reference. It occurred in the wild for a user and was reported via Fabric.io.

From the mangled symbols it looks like it might be the UnSafeMutableContainer<OSSpinLock> property self.lock inside OSSpinLockSynchronization which is causing this issue.

The application was calling .onSuccess(.UserInteractive) on a Future from the main thread.
Device: non-jailbroken iPhone running iOS 9
FutureKit version: 1.2.0

This is the top of the call stack. The rest is application code + system libraries.

EXC_BAD_ACCESS KERN_INVALID_ADDRESS 0x50000000139032db
Thread : Crashed: com.apple.main-thread
0  FutureKit                      0x1009c6d40 _TFC9FutureKit25OSSpinLockSynchronization12synchronizedurfS0_FFT_q_q_ + 80
1  FutureKit                      0x1009c6e58 _TFC9FutureKit25OSSpinLockSynchronization13lockAndModifyurfS0_FT13waitUntilDoneSb11modifyBlockFT_q_4thenGSqFq_T___T_ + 128
2  FutureKit                      0x1009c7110 _TFC9FutureKit25OSSpinLockSynchronization18lockAndModifyAsyncurfS0_FTFT_q_4thenFq_T__T_ + 144
3  FutureKit                      0x1009c7b98 _TTWC9FutureKit25OSSpinLockSynchronizationS_23SynchronizationProtocolS_FS1_18lockAndModifyAsyncu__Rq_S1__fq_FTFT_qd__4thenFqd__T__T_ + 88
4  FutureKit                      0x10098511c _TFC9FutureKit6Future10onCompleteu__rfGS0_q__FTOS_8Executor5blockFzT6resultGOS_12FutureResultq___GOS_10Completionqd____GS0_qd___ + 568
5  FutureKit                      0x100985f24 _TFC9FutureKit6Future9onSuccessu__rfGS0_q__FTOS_8Executor5blockFzq_GOS_10Completionqd____GS0_qd___ + 200
6  FutureKit                      0x100987400 _TFC9FutureKit6Future9onSuccessu__rfGS0_q__FTOS_8Executor5blockFzq_GS0_qd____GS0_qd___ + 200
@mishagray
Copy link
Contributor

So it's running in the mainThread and trying to add a block to the completion block of some other Future.
What sort of Future is being called with .onSuccess(.UserInteractive) ?
Could it have already been completed, or possibly being completed at this time?

@mishagray
Copy link
Contributor

More importantly, is there any chance that the Future that you are calling onSuccess was somehow being deconstructed? Typically it's tricky to create a Future and deallocate it before it completes, but it's not impossible.

@heiberg
Copy link
Contributor Author

heiberg commented Dec 4, 2015

The future is either grabbed from a cache of previously created futures or created to satisfy the request, and performs its work on a background queue. So it might already be completed. Or it might be in the process of being completed from another thread. It's definitely not being deallocated, though.

@heiberg
Copy link
Contributor Author

heiberg commented Dec 12, 2015

@mishagray: This keeps happening at a low but steady rate in the wild for us. But we now have a reproduction method for it.

To reproduce we

  • create a promise on the main thread
  • complete it on a background thread
  • take the future of the promise and onSuccess to the main thread
  • do it again until we crash

In other words, it seems that every future has a small but non-zero risk of triggering this error.

Here is a tiny project which reproduces the crash:

https://github.com/heiberg/FutureThrasher

(run pod install to also get FutureKit in there)

It uses promises and futures to repeatedly increment a counter and show the value in a UILabel:

import UIKit
import FutureKit

class ViewController: UIViewController {

    @IBOutlet weak var textLabel: UILabel!

    override func viewDidAppear(animated: Bool) {
        super.viewDidAppear(animated)
        self.goBananas()
    }

    private var bananaCounter = 0

    private func goBananas() {
        let promise = Promise<NSDate>()
        Executor.UserInitiated.execute {
            promise.completeWithSuccess(NSDate())
        }
        promise.future.onSuccess(.Main) { _ in
            self.textLabel.text = "\(self.bananaCounter++)"
            self.goBananas()
        }
    }
}

Running this on my iPhone 6s usually reproduces the crash within 10 seconds or so.

The call stack usually ends in _os_lock_corruption_abort and looks like this:

screen shot 2015-12-11 at 22 34 43

@heiberg
Copy link
Contributor Author

heiberg commented Dec 12, 2015

Further info: Switching the synchronization strategy from .OSSpinLock to .NSLock results in seemingly the same failure rate, but perhaps a more usable error message:

*** NSForwarding: warning: object 0x13fd38110 of class 'FutureKit.UnsafeSynchronization' does not implement methodSignatureForSelector: -- trouble ahead
Unrecognized selector -[FutureKit.UnsafeSynchronization lock]

Somehow the method synchronizedWithLock ends up calling lock() on an object of type UnsafeSynchronization instead of an NSLock.

screen shot 2015-12-11 at 22 50 58

@mishagray
Copy link
Contributor

ok...
I have an idea what the bug is... It's not OSSpinLock vs NSLock, it's probably the way that UnsafeSynchronization is working...

There is a moment after the promise completes that we are 'swapping' from the SpinLock to the UnsafeLock. This is supposed to be 'ok' because the promise has effectively become 'immutable' and doesn't need a real lock anymore.

But I think the bug is that the swap operation itself is not properly guarded operation. I thought that just using a MemoryGuard around the lock pointer would do it... but I either didn't do it right, or it doesn't work the way I think it does.

Which opens up a sort of chicken and egg problem when using the Mutable/Immutable locking strategy.

I'm gonna take out the UnsafeSynch and just continue using the SpinLock even after the promise completes. If that's the problem that should solve it. And the optimization is kind of minor. It would be a bigger optimization if we were using NSLock, but OSSpinLocks are already kinda super fast.

Thanks!

@heiberg
Copy link
Contributor Author

heiberg commented Dec 12, 2015

That sounds like a good and safer alternative. Thanks for looking into this!

I'll keep an eye out for the commit and re-run the test.

@mishagray
Copy link
Contributor

okay! Looks fixed. Deployed a new version 1.2.1,
(#22)

Your 'FutureThrasher' seems to run forever on my devices now. (No more Thrashing..)

Thanks!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants