Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

-[SRProxyConnect _dequeueInput] crash #421

Closed
thorfei opened this issue Jun 23, 2016 · 16 comments
Closed

-[SRProxyConnect _dequeueInput] crash #421

thorfei opened this issue Jun 23, 2016 · 16 comments
Assignees

Comments

@thorfei
Copy link

thorfei commented Jun 23, 2016

the master branch, pod "SocketRocket", :git => 'https://github.com/facebook/SocketRocket.git', :commit => '0af058c'

2016-06-23 5 52 10

@nlutsenko
Copy link
Contributor

@CodeFei, can you share the full stack trace for all the threads by any chance?
Would be helpful to see if any other thread is trying to access the same thing.

@nlutsenko
Copy link
Contributor

inputQueue is ok to be not thread-safe, since it's accessed from a context of a single thread aka SRRunLoopThread, that's why I am wondering on the state of other threads...

@thorfei
Copy link
Author

thorfei commented Jun 24, 2016

here is the full stack, and I have checked other thread state doesn't find useful info.

stack.txt

@michaelkirk
Copy link
Contributor

michaelkirk commented Jun 27, 2016

Seeing a similar crash.

Crash report for all threads: https://gist.github.com/michaelkirk/8ceccf89d270c2df37fdc9d28a2d1e40

It wasn't in our top crashes until updating from 7373546 to b0aabde.

This could be because it's a new crash, or because we've reduced the noise of our other top crashes in the meanwhile (substantially because of #156 - thanks!).

Note, we're running a modified version to support our pinning policy - so actually the exact upgrade was from FredericJacobs@a062cc7 to signalapp@587ad29.

update: still seeing this as of 8096fef.

@erikprice
Copy link
Contributor

erikprice commented Jul 21, 2016

We have a related crash – it's in the same method, but a different error message:

Fatal Exception: NSInternalInconsistencyException

-[__NSCFArray removeObjectAtIndex:]: mutating method sent to immutable object

Fatal Exception: NSInternalInconsistencyException
0  CoreFoundation                 0x2098b91b __exceptionPreprocess
1  libobjc.A.dylib                0x20126e17 objc_exception_throw
2  CoreFoundation                 0x2098b861 -[NSException initWithCoder:]
3  CoreFoundation                 0x208e206f -[__NSCFArray removeObjectAtIndex:]
4  MyApp                          0x5f779b -[SRProxyConnect _dequeueInput] (SRProxyConnect.m:414)
5  MyApp                          0x5f7705 -[SRProxyConnect _processInputStream] (SRProxyConnect.m:404)
6  MyApp                          0x5f74c1 -[SRProxyConnect stream:handleEvent:] (SRProxyConnect.m:367)
7  CoreFoundation                 0x208f947b _signalEventSync
8  CoreFoundation                 0x20903b17 _cfstream_solo_signalEventSync
9  CoreFoundation                 0x208f9109 _CFStreamSignalEvent
10 CFNetwork                      0x20e6f09d SocketStream::dispatchSignalFromSocketCallbackUnlocked(SocketStreamSignalHolder*)
11 CFNetwork                      0x20e6ed79 SocketStream::socketCallback(__CFSocket*, unsigned long, __CFData const*, void const*)
12 CFNetwork                      0x20e6ec87 SocketStream::_SocketCallBack_stream(__CFSocket*, unsigned long, __CFData const*, void const*, void*)
13 CoreFoundation                 0x20951aef __CFSocketPerformV0
14 CoreFoundation                 0x2094ddff __CFRUNLOOP_IS_CALLING_OUT_TO_A_SOURCE0_PERFORM_FUNCTION__
15 CoreFoundation                 0x2094d9ed __CFRunLoopDoSources0
16 CoreFoundation                 0x2094bd5b __CFRunLoopRun
17 CoreFoundation                 0x2089b229 CFRunLoopRunSpecific
18 CoreFoundation                 0x2089b015 CFRunLoopRunInMode
19 Foundation                     0x210e4045 -[NSRunLoop(NSRunLoop) runMode:beforeDate:]
20 MyApp                          0x5f802f -[SRRunLoopThread main] (SRRunLoopThread.m:76)
21 Foundation                     0x211b54a5 __NSThread__start__
22 libsystem_pthread.dylib        0x206bf85b _pthread_body
23 libsystem_pthread.dylib        0x206bf7cf _pthread_start
24 libsystem_pthread.dylib        0x206bd724 thread_start

We're using the HEAD of this branch.

@nlutsenko
Copy link
Contributor

Huh, I actually am not sure how that is even possible, taking into account that _inputQueue is assigned only once at initialization time and it's NSMutableArray at all times.

Memory corruption? :(

@erikprice
Copy link
Contributor

erikprice commented Aug 1, 2016

Does this method need synchronization, since it sets the streams' delegates (in -_initializeStreams) and then schedules one of those streams on run loop that runs on a different thread?

What I mean is, should the body of this method be handed over to be executed on the SRRunLoopThread? (Thread isolation-based synchronization, not necessarily lock-based synchronization.)

@nlutsenko
Copy link
Contributor

Quite possibly, but taking into account that it's a single open procedure, I don't think this actually is the bottleneck. What is probably happening is the ping back from the NSStream, when SRProxyConnect was already deallocated, which is a weird fruit by itself, since we make sure that we remove the delegates/remove the streams from runloop on dealloc.

@erikprice, would you have an http/socks proxy handy that I could use to debug this?

@erikprice
Copy link
Contributor

I'm taking another look and trying to understand how we might experience EXC_BAD_ACCESS in -[SRProxyConnect _dequeueInput]. This leads me to a different question:

What are the circumstances under which -[SRProxyConnect stream:handleEvent:] is passed NSStreamEventHasBytesAvailable? In some quick tests, I'm seeing this method only ever get passed NSStreamEventOpenCompleted.

would you have an http/socks proxy handy that I could use to debug this?

I have to confess I'm not sure exactly what you're asking for here, can you elaborate?

@nlutsenko
Copy link
Contributor

Most of the logic in SRProxyConnect is not invoked, unless there is a configured http/socks proxy for the current connection. After rigorous testing, I can't get the code to crash at all, so I am thinking that this might be reproducible only when using proxy.

Would love to know if you can encounter this on a general connection as well (aka no proxy).

@michaelmorton
Copy link

Erik and I are trying to figure out this crash in our app. We haven’t yet set up a proxy to try to drive the case where the code handles NSStreamEventHasBytesAvailable, but while we’re working on that, we’d like your take on the following theory:

The comment in dealloc mentions that an SRProxyConnect instance can get deallocated “before the socket open finishes”. Consider the following sequence:

(1) _dequeueInput: => _proxyProcessHTTPResponseWithData: => _proxyHTTPHeadersDidFinish => _didConnect
(2) _didConnect does _completion(nil, inputStream, outputStream)
(3) Hypothesis: the completion block causes the SRProxyConnect instance to dealloc
(4) The stack described in (1) unwinds and the next code executed is
[_inputQueue removeObjectAtIndex:0];
(5) In the common case, that line executes without any problem. Although self no longer points to a valid object, nothing in that line tries to message self and the reference points to _inputQueue, yielding the nil which ARC stored during dealloc.
(6) But… in rare cases, perhaps especially when memory runs low, something gets alloc’d at the same location before that line executes. The _dequeueInput: code picks up some bogus pointer for _inputQueue and has a bad hair day, either messaging bogus data or (by coincidence) a non-mutable array.

@nlutsenko
Copy link
Contributor

Yup, seems logical.
Let me move the code around a tiny bit here, so removal happens before we process the data...
Going to attach a Pull Request to this issue in a brief moment.

@michaelkirk
Copy link
Contributor

Has anyone tried #455 in production?

Anything interesting happen to crash rates in dequeuInput?

@erikprice
Copy link
Contributor

We're using that fix in production, and we haven't seen any crashes in -[SRProxyConnect _dequeueInput] since doing so.

@nlutsenko
Copy link
Contributor

Whee! Thanks @erikprice for confirming. I am going to close this issue in this case.
Please reopen or open a new one if you are still running into any crashes.

@nlutsenko nlutsenko self-assigned this Sep 9, 2016
michaelkirk added a commit to signalapp/Signal-iOS that referenced this issue Sep 30, 2016
See:
facebookincubator/SocketRocket#421

Which is one of our more popular crashes, and happening more frequently
with iOS10.

// FREEBIE
michaelkirk added a commit to signalapp/Signal-iOS that referenced this issue Sep 30, 2016
See:
facebookincubator/SocketRocket#421

Which is one of our more popular crashes, and happening more frequently
with iOS10.

// FREEBIE
@michaelkirk
Copy link
Contributor

Previously we were seeing this crash several times every day. Since releasing this update 10 days ago, we haven't seen it once. 👍

Thank you @erikprice, @michaelmorton, and @nlutsenko!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

5 participants