Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Crash on BNCServerInterface.m line 158 -[BNCServerInterface genericHTTPRequest:retryNumber:log:callback:retryHandler:] #548

Closed
almostintuitive opened this issue Jan 14, 2017 · 29 comments

Comments

@almostintuitive
Copy link
Contributor

Hi,

We've seen 55 of these crashes coming from Branch in the last 7 days (10k sessions per day).

branch-crash-1

@almostintuitive
Copy link
Contributor Author

@E-B-Smith
Copy link
Contributor

Thanks for the heads up. I'm looking into this now.

Which version of the SDK are you using?

@almostintuitive
Copy link
Contributor Author

almostintuitive commented Jan 14, 2017 via email

@ahmednawar
Copy link
Contributor

@itchingpixels I am unable to reproduce any of the crashes you've reported. Could you please send us your Branch-related code so that we can take a look? Feel free to send it to integrations@branch.io and reference the github issue in the email. Also we have a Swift Testbed that you can take a look at as a reference implementation. Please let us know.

@almostintuitive
Copy link
Contributor Author

Thank you. The testbed project looks exactly how we have Branch integrated.
Is there anything I can do? We're also seeing another crash that can be connected (affecting a lot more users - 2-3% of the sessions). It appeared at the same time as the others.
http://crashes.to/s/117dea8947c

We're investigating the latter - I'm not sure if it's connected.

We'd love to sort these out as soon as possible.

We started seeing these crashes after upgrading first to 0.12.19 (may be a good clue)

@ahmednawar
Copy link
Contributor

We had changes in this version related to preferenceHelper, however, we don't have any reports of crashes lately.

  1. What was the version prior to 0.12.19 that you were running?
  2. Is it possible to share the code with us?

@almostintuitive
Copy link
Contributor Author

  1. Actually, I may got myself confused regarding when this crash appeared. Let me investigate it a bit more closely, I'll get back to you tomorrow!
  2. Happy to share it (isn't really touching any of the business logic):
    branch-related-code.txt

@almostintuitive
Copy link
Contributor Author

branch-crash-2

@almostintuitive
Copy link
Contributor Author

almostintuitive commented Jan 17, 2017

by looking at the stack trace (block_copy, block_object_assign, etc), I'd point to the unconventional (static?) declaration of NSURLSessionCompletionHandler at the top of BNCServerInterface.m.
is there any reason why it isn't simply a local variable?

@ahmednawar
Copy link
Contributor

@itchingpixels Unfortunately, I am still unable to reproduce.

  1. Are you initializing Branch in the main thread or the in an operations queue?
  2. Would it be possible to create a quick sample project with Branch integrated and see if you can reproduce? If so, could you please send us the sample project if you manage to reproduce?

Sorry for the headache :(

@almostintuitive
Copy link
Contributor Author

Hi!

I had a close look at the stack trace of the screenshot above, and realised that [Branch initalizeSession] is called when I call Branch.getInstance().loadRewards...
But that should not happen, since Branch should have been initalized by then. But I just realised that that's probably not the case in this instance (loadRewards() is called before initSession()).

I guess that's the cause of the crash that only we can reproduce.
Is this something you're aware of? I know that it sounds weird that we have this setup, it was mainly because of some singelton initalization magic, which I'll get rid of right now. - but it may also affect others.

Do you think calling loadRewards and then calling initSession() could lead to crashes? I'm asking because there's a binary waiting for review, which I'm happy to reject and re-upload, if you say so.

Thank you for your time! And sorry for this.

Mark

@ahmednawar
Copy link
Contributor

Actually this should still work. If you call a branch method before initializing first, we will initialize under the hood and the call to init afterwards will be a no-op. Perhaps try to not call loadRewards before and see if it doesn't crash in your local env? Maybe there are some conditions where our approach fails.

@almostintuitive
Copy link
Contributor Author

almostintuitive commented Jan 18, 2017

Hi!

Thanks for the info.
Looking through the code, I have a hypothesis that may have some holes in, but I'll describe it anyway:

  1. After Branch is initalized (which requires an internet connection, so it's inherently asynchronous, as far as I can see), isInitialized is set to true. That means you can have two initSessions in progress at the same time, because the isInitalized boolean is used to return early when Branch has been already initalized (ended), not when the initalization process has been started.
  2. We're calling both loadRewards and initSession within 0.1-1 sec (both will get executed at the start of the app). This could easily trigger point 1. (I can confirm that the crashes we're seeing are all hapenning at startup)
  3. One of the blocks that are causing BAD_ACCESS are declared as a static (NSURLSessionCompletionHandler) variable, not as a local one.
    As far as I understand, in this specific case, that block (that has been statically declared at the top of the file) will not be copied but referenced. The right mechanism would be the copy the block instead of referencing it, because that static variable can re-assigned by another initSession call that's going in parallel. Re-assignment could lead to an early de-allocation the block that has been passed to the dataTask.
    http://stackoverflow.com/questions/7404653/arc-getting-exc-bad-access-from-inside-block-used-in-delegate-method
    Making it into a local variable would solve the problem, because by passing a locally defined block to the dataTask, its reference count should be incremented by one. This does not apply to static variables as far as I understand the inner workings of ARC.
  4. The fact that I can only reproduce the crash very rarely (3-4 out of 100 app starts) would make me assume that this is some kind of race condition, hence I wrote point 1-3

Recommendations:

  1. There should be no need for NSURLSessionCompletionHandler & its pair to be a static variable. I would create a typedef for that block, so it can be easily referenced later easily. Then make it into a local variable, that can be passed to the dataTask the same way as it exist now.
  2. Create another boolean (isInitalizationInProgress), that would prevent parallel initalization requests - this may have consequences to other parts though.

Let me know what you think. I'll submit a PR for 1.

@almostintuitive
Copy link
Contributor Author

almostintuitive commented Jan 18, 2017

At the same time, I modified our code and we'll be calling initSession first. But we're still calling them one after the other, potentially within miliseconds.

@almostintuitive
Copy link
Contributor Author

almostintuitive commented Jan 18, 2017

PR submitted for point 1.

@ahmednawar
Copy link
Contributor

Thanks a lot for the analysis and the PR! We will review and potentially add to our next release.

Were you able to test the PR on your app and see if it mitigated any of the crashes?

@almostintuitive
Copy link
Contributor Author

almostintuitive commented Jan 18, 2017

I'll definitely do that, but unfortunately our next release is 7 days away. I think I'll be able to confirm if this fixes the problem.
If you guys could have a look at the feasibility of point 2 (recommendation section), that would be great, too.

Will report back as soon as I know something! This is mission critical for us at the moment, since we also see another crash that's really similar to this, which could be coming from the same source (although because it's happening inside an NSOperationQueue (also BAD_ACCESS), all the additional stack trace info is lost). That's impacting 4% of the sessions.

@E-B-Smith
Copy link
Contributor

Thank you for all your work on this!

I've merged your fix plus some other fixes into our beta branch. You can read about our beta program and the beta branch here: https://branch.app.link/Beta-Info

I can't test the SDK fully at the moment, but the small amount of testing I've done seems good.

I think you will be able to pull the code from the beta branch for your upcoming release.

@E-B-Smith
Copy link
Contributor

See comment for issue #548 .

@almostintuitive
Copy link
Contributor Author

That's great news, thank you!
We'll be using the beta version in our next release (going out on the 25th of Jan)!

Will report back about the results:)

@ahmednawar
Copy link
Contributor

@itchingpixels i am going to close the crash issues for now since Ed released the beta with the fix you've suggested + some more. If you are still noticing crashes, feel free to re-open.

Thanks for all your help!

@mohitongit
Copy link

Glad that I found this thread. I recently released an update to my iOS app and updated Branch to 0.12.24 (was 0.12.20). Since then, I have been experiencing these CF Network crashes. I am not experienced and couldn't find much info from crashlytics reports. The crashes are similar to the second url (http://crashes.to/s/117dea8947c) posted by @itchingpixels

@E-B-Smith
Copy link
Contributor

Did you try the Branch beta release that fixes this? It's an immediate fix for this problem.

Here's info about our beta program: https://branch.app.link/Beta-Info

Otherwise, I'm going to finish the QA work on the beta branch and release to master early this week.

@mohitongit
Copy link

Yes, I have updated the app and it is waiting for review. I was unable to reproduce the crash at my end. I'll let you know how it goes.

@mohitongit
Copy link

Hi! I had updated to 0.12.25 beta as per the docs. 147 users updated and just now received a new crash:

Crashed: com.apple.root.user-initiated-qos
0  libobjc.A.dylib                0x1904c40a0 objc_retain + 16
1  Branch                         0x100892764 -[BNCServerInterface updateDeviceInfoToMutableDictionary:] + 68
2  Branch                         0x100892c44 -[BNCServerInterface updateDeviceInfoToParams:] + 120
3  Branch                         0x100890408 -[BNCServerInterface postRequest:url:retryNumber:key:log:callback:] + 128
4  Branch                         0x1008902c4 -[BNCServerInterface postRequest:url:key:callback:] + 112
5  Branch                         0x1008a9c6c -[BranchOpenRequest makeRequest:key:callback:] + 1652
6  Branch                         0x1008a003c __30-[Branch processNextQueueItem]_block_invoke.631 + 104
7  libdispatch.dylib              0x1908fe1fc _dispatch_call_block_and_release + 24
8  libdispatch.dylib              0x1908fe1bc _dispatch_client_callout + 16
9  libdispatch.dylib              0x19090e518 _dispatch_root_queue_drain + 1032
10 libdispatch.dylib              0x19090e0ac _dispatch_worker_thread3 + 124
11 libsystem_pthread.dylib        0x190b072a0 _pthread_wqthread + 1288
12 libsystem_pthread.dylib        0x190b06d8c start_wqthread + 4

@E-B-Smith
Copy link
Contributor

I've since both master and beta branches to 0.12.26 which will solve this exact problem.

@almostintuitive
Copy link
Contributor Author

We haven't seen crashes from Branch since upgrading to the beta. Sample size over 50k devices.

@mohitongit
Copy link

Ok @E-B-Smith, will update the app to 0.12.26 maybe next week, only this one crash reported so far for 0.12.25 beta.

@E-B-Smith
Copy link
Contributor

Excellent! Thanks for reporting back. I really appreciate the feedback.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants