Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Deadlock #95

Open
dsparkplug opened this issue Mar 22, 2018 · 10 comments
Open

Deadlock #95

dsparkplug opened this issue Mar 22, 2018 · 10 comments
Labels

Comments

@dsparkplug
Copy link
Contributor

I've found an circular locking issue with the IStanzaEventListener methods in the XmppStream class .

Basically the IStanzaEventListener.StanzaReceived which acquires a [MethodImpl(MethodImplOptions.Synchronized)] lock can call IStanzaEventListener.BytesWritten which also needs to acquire a [MethodImpl(MethodImplOptions.Synchronized)] lock and therefore deadlocks.

The chain of calls is as follows:

XmppStream.StanzaReceived acquires a MethodImpl(MethodImplOptions.Synchronized) lock
--> calls XmppStream.OnElement
if (State is not ServerFeaturesState.Instance or SASLState.Instance or StartTLSState.Instance or CompressionState.Instance or SASLAuthedState.Instance)
--> invokes XmppStream.OnProtocol event
In IQTracker OnProtocol event is set to OnIQ
--> calls IQTracker.Call which calls the 'cb' callback which is set to JabberClient.GotSession in JabberClient.GotResource
--> sets JabberClient.IsAuthenticated
--> calls JabberClient.Presence
--> calls JabberClient.Write
--> calls SocketStanzaStream.JabberClient.Write
--> calls AsyncSocket.Write
--> calls Stream.BeginWrite with callback WroteData
--> WroteData calls SocketStanzaStream.OnWrite
--> calls XmppStream.BytesWritten which waits for a MethodImpl(MethodImplOptions.Synchronized) lock

Any ideas how to go about fixing this? The way this code uses callbacks is so convoluted its hard to know where to start without breaking something else.

The chain of events and subsequent deadlock is reproducible when sending a large number of XMPP messages.

@ForNeVeR
Copy link
Member

Alright, that's a fair question.

Do you have any repro scenario? It would be easier to check (or even to write a unit test) when we have a MCVE.

I still didn't understood what thread exactly deadlock here, so I'm not sure how to proceed.

I have an idea to make more things async / threadsafe by default in our code, but first I need to better understand this particular case.

I agree that the code base isn't in the best state now. Hopefully I'll finish some big things in my life in a couple of months and will be able to come back to this library.

@KotM
Copy link

KotM commented May 21, 2018

In our project we have encountered with similar issues on simultaneous XmppStream.Write() attempts. At the least application waits a response from XmppStream.Write() from several threads. It seems to be a deadlock.

Currently I'll try to update our code by adding simply lock clause around this command, hope it will help.

@KotM
Copy link

KotM commented May 23, 2018

Unfortunately, locking doesn't help. Currently we're captured XmppStream.Write() hanging again, with lock state. Find below a stack trace (this thread was stopped on XmppStream.Write(), another one on lock state):

 	[In a sleep, wait, or join]	
 	JabberNet.dll!JabberNet.jabber.connection.XmppStream.JabberNet.jabber.connection.IStanzaEventListener.BytesWritten(byte[] buf = {byte[1437]}, int offset = 0, int count = 1437) + 0x3f bytes	
 	JabberNet.dll!JabberNet.bedrock.net.AsyncSocket.WroteData(System.IAsyncResult ar) + 0xbf bytes	
 	System.dll!System.Net.LazyAsyncResult.Complete(System.IntPtr userToken) + 0x52 bytes	
 	System.dll!System.Net.ContextAwareResult.CaptureOrComplete(ref System.Threading.ExecutionContext cachedContext, bool returnContext) + 0xfb bytes	
 	System.dll!System.Net.ContextAwareResult.FinishPostingAsyncOp(ref System.Net.CallbackClosure closure = {System.Net.CallbackClosure}) + 0x77 bytes	
 	System.dll!System.Net.Sockets.Socket.BeginSend(byte[] buffer, int offset, int size, System.Net.Sockets.SocketFlags socketFlags, out System.Net.Sockets.SocketError errorCode, System.AsyncCallback callback, object state) + 0x11c bytes	
 	System.dll!System.Net.Sockets.Socket.BeginSend(byte[] buffer, int offset, int size, System.Net.Sockets.SocketFlags socketFlags, System.AsyncCallback callback, object state) + 0x41 bytes	
 	System.dll!System.Net.Sockets.NetworkStream.BeginWrite(byte[] buffer, int offset, int size, System.AsyncCallback callback, object state) + 0xa4 bytes	
 	JabberNet.dll!JabberNet.bedrock.net.AsyncSocket.Write(byte[] buf, int offset, int len) + 0x162 bytes	
 	JabberNet.dll!JabberNet.jabber.connection.SocketStanzaStream.Write(string str) + 0x92 bytes	
	
[our code].Utilities.Jabber.PracticeAreaChannel.RespondToQuery(JabberNet.jabber.protocol.client.IQ response = Cannot evaluate expression because the current thread is in a sleep, wait, or join) Line 914 + 0x22 bytes	C#
 	
[our stack here, removed by security reasons]

 	System.dll!System.Net.LazyAsyncResult.Complete(System.IntPtr userToken) + 0x52 bytes	
 	mscorlib.dll!System.Threading.ExecutionContext.RunInternal(System.Threading.ExecutionContext executionContext, System.Threading.ContextCallback callback, object state, bool preserveSyncCtx) + 0x15e bytes	
 	mscorlib.dll!System.Threading.ExecutionContext.Run(System.Threading.ExecutionContext executionContext, System.Threading.ContextCallback callback, object state, bool preserveSyncCtx) + 0x17 bytes	
 	mscorlib.dll!System.Threading.ExecutionContext.Run(System.Threading.ExecutionContext executionContext, System.Threading.ContextCallback callback, object state) + 0x52 bytes	
 	System.dll!System.Net.ContextAwareResult.Complete(System.IntPtr userToken) + 0x98 bytes	
 	System.dll!System.Net.LazyAsyncResult.ProtectedInvokeCallback(object result, System.IntPtr userToken) + 0xcc bytes	
 	System.dll!System.Net.HttpWebRequest.ProcessResponse() + 0x13c bytes	
 	System.dll!System.Net.HttpWebRequest.SetResponse(System.Net.CoreResponseData coreResponseData) + 0x1b2 bytes	
 	System.dll!System.Net.ConnectionReturnResult.SetResponses(System.Net.ConnectionReturnResult returnResult = {System.Net.ConnectionReturnResult}) + 0xbb bytes	
 	System.dll!System.Net.Connection.ReadComplete(int bytesRead, System.Net.WebExceptionStatus errorStatus) + 0x450 bytes	
 	System.dll!System.Net.Connection.PostReceive() + 0x109 bytes	
 	System.dll!System.Net.Connection.ReadComplete(int bytesRead, System.Net.WebExceptionStatus errorStatus) + 0x2d1 bytes	
 	System.dll!System.Net.LazyAsyncResult.Complete(System.IntPtr userToken) + 0x52 bytes	
 	System.dll!System.Net.LazyAsyncResult.ProtectedInvokeCallback(object result, System.IntPtr userToken) + 0xcc bytes	
 	System.dll!System.Net.Security._SslStream.ProcessFrameBody(int readBytes, byte[] buffer, int offset, int count, System.Net.AsyncProtocolRequest asyncRequest) + 0x14b bytes	
 	System.dll!System.Net.Security._SslStream.ReadFrameCallback(System.Net.AsyncProtocolRequest asyncRequest = {System.Net.AsyncProtocolRequest}) + 0xab bytes	
 	System.dll!System.Net.AsyncProtocolRequest.CompleteRequest(int result) + 0x3d bytes	
 	System.dll!System.Net.FixedSizeReader.CheckCompletionBeforeNextRead(int bytes) + 0x29 bytes	
 	System.dll!System.Net.FixedSizeReader.ReadCallback(System.IAsyncResult transportResult) + 0x78 bytes	
 	System.dll!System.Net.LazyAsyncResult.Complete(System.IntPtr userToken) + 0x52 bytes	
 	mscorlib.dll!System.Threading.ExecutionContext.RunInternal(System.Threading.ExecutionContext executionContext, System.Threading.ContextCallback callback, object state, bool preserveSyncCtx) + 0x15e bytes	
 	mscorlib.dll!System.Threading.ExecutionContext.Run(System.Threading.ExecutionContext executionContext, System.Threading.ContextCallback callback, object state, bool preserveSyncCtx) + 0x17 bytes	
 	mscorlib.dll!System.Threading.ExecutionContext.Run(System.Threading.ExecutionContext executionContext, System.Threading.ContextCallback callback, object state) + 0x52 bytes	
 	System.dll!System.Net.ContextAwareResult.Complete(System.IntPtr userToken) + 0x98 bytes	
 	System.dll!System.Net.LazyAsyncResult.ProtectedInvokeCallback(object result, System.IntPtr userToken) + 0xcc bytes	
 	System.dll!System.Net.Sockets.BaseOverlappedAsyncResult.CompletionPortCallback(uint errorCode, uint numBytes, System.Threading.NativeOverlapped* nativeOverlapped) + 0x13a bytes	
 	mscorlib.dll!System.Threading._IOCompletionCallback.PerformIOCompletionCallback(uint errorCode, uint numBytes, System.Threading.NativeOverlapped* pOVERLAP) + 0x7d bytes	
 	[Native to Managed Transition]

@ForNeVeR
Copy link
Member

Alright, this is one thread. Could you please show stack for other thread (the one holding the lock)?

@KotM
Copy link

KotM commented May 23, 2018

I was not aware that second stack trace being useful taking into account that it was stopped on our lock, but I forgot about preceding code... So, please find it below, too:

	[In a sleep, wait, or join]	

 	[our code].Utilities.Jabber.PracticeAreaChannel.RespondToQuery(JabberNet.jabber.protocol.client.IQ response = Cannot evaluate expression because the current thread is in a sleep, wait, or join) Line 912 + 0x19 bytes	C#

 	[our stack here, removed by security reasons]

 	JabberNet.dll!JabberNet.jabber.client.JabberClient.FireOnIQ(object sender, JabberNet.jabber.protocol.client.IQ iq = Cannot evaluate expression because the current thread is in a sleep, wait, or join) + 0x2f bytes	
 	JabberNet.dll!JabberNet.jabber.connection.XmppStream.JabberNet.jabber.connection.IStanzaEventListener.StanzaReceived(System.Xml.XmlElement elem) + 0x48 bytes	
 	JabberNet.dll!JabberNet.jabber.protocol.ElementStream.FireOnElement(System.Xml.XmlElement elem) + 0x2b bytes	
 	JabberNet.dll!JabberNet.jabber.protocol.AsynchElementStream.EndTag(byte[] buf, int offset, JabberNet.xpnet.ContentToken ct, JabberNet.xpnet.TOK tok) + 0x134 bytes	
 	JabberNet.dll!JabberNet.jabber.protocol.AsynchElementStream.Push(byte[] buf = {byte[4096]}, int offset = 0, int length = 1275) + 0x19c bytes	
 	JabberNet.dll!JabberNet.jabber.connection.SocketStanzaStream.JabberNet.bedrock.net.ISocketEventListener.OnRead(JabberNet.bedrock.net.BaseSocket sock = {JabberNet.bedrock.net.AsyncSocket}, byte[] buf, int offset, int length) + 0xbc bytes	
 	JabberNet.dll!JabberNet.bedrock.net.AsyncSocket.GotData(System.IAsyncResult ar) + 0x9e bytes	
 	System.dll!System.Net.LazyAsyncResult.Complete(System.IntPtr userToken) + 0x52 bytes	
 	mscorlib.dll!System.Threading.ExecutionContext.RunInternal(System.Threading.ExecutionContext executionContext, System.Threading.ContextCallback callback, object state, bool preserveSyncCtx) + 0x15e bytes	
 	mscorlib.dll!System.Threading.ExecutionContext.Run(System.Threading.ExecutionContext executionContext, System.Threading.ContextCallback callback, object state, bool preserveSyncCtx) + 0x17 bytes	
 	mscorlib.dll!System.Threading.ExecutionContext.Run(System.Threading.ExecutionContext executionContext, System.Threading.ContextCallback callback, object state) + 0x52 bytes	
 	System.dll!System.Net.ContextAwareResult.Complete(System.IntPtr userToken) + 0x98 bytes	
 	System.dll!System.Net.LazyAsyncResult.ProtectedInvokeCallback(object result, System.IntPtr userToken) + 0xcc bytes	
 	System.dll!System.Net.Sockets.BaseOverlappedAsyncResult.CompletionPortCallback(uint errorCode, uint numBytes, System.Threading.NativeOverlapped* nativeOverlapped) + 0x13a bytes	
 	mscorlib.dll!System.Threading._IOCompletionCallback.PerformIOCompletionCallback(uint errorCode, uint numBytes, System.Threading.NativeOverlapped* pOVERLAP) + 0x7d bytes	
 	[Native to Managed Transition]	

@dsparkplug
Copy link
Contributor Author

dsparkplug commented May 24, 2018

Sorry for the delay - other priorities etc.

Here's a MCVE which sends 10000 test messages. The deadlock will generally occur after a few thousand messages with several threads stuck at AsyncSocket.Write and one stuck at XmppStream.BytesWritten.

Note that this is only appears to happen with AutoPresence = true . Also, I wouldn't normally create a new client and authenticate for each message as in this example, but the authentication seems to involved in causing the deadlock.

XMPP server is OpenFire 4.2.3


using JabberNet.bedrock;
using JabberNet.jabber;
using JabberNet.jabber.client;
using JabberNet.jabber.connection;
using JabberNet.jabber.protocol;
using System;
using System.Threading;
using System.Xml;

namespace JabberTest
{
	public class Program
	{

		public static void Main(string[] args)
		{

			string server = "servername";
			string xmppID = "sparky";
			int port = 5222;
			string password = "#######";

			XmppSender xmppSender = new XmppSender();
			int sent = 0;
			for (var i = 0; i < 10000; i++)
			{

				if (!xmppSender.SendMessage(server, xmppID, port, password))
				{
					Console.WriteLine($"Failed - sent {sent} messages");
					return;
				}
				sent++;
			}
			Console.WriteLine($"Success - sent {sent} messages"); ;
		}

	}
	public class XmppSender
	{
		private ManualResetEvent _authenticatedEvent;

		public XmppSender()
		{
			_authenticatedEvent = new ManualResetEvent(false);

		}

		private void client_OnAuthenticate(object sender)
		{
			Console.WriteLine("Authenticated");
			_authenticatedEvent.Set();
		}

		private void client_OnConnect(object sender, StanzaStream stream)
		{
			Console.WriteLine("Connected");
		}
		private void client_OnDisconnect(object sender)
		{
			Console.WriteLine("Disconnected");
			_authenticatedEvent.Reset();
		}

		private void client_OnError(object sender, Exception ex)
		{
			Console.WriteLine($"The following error occurred while publishing an XMPP notification:\n{ex}");
			_authenticatedEvent.Set();
		}

		private void client_OnStreamError(object sender, XmlElement rp)
		{
			string info = rp == null ? "No further details" : rp.OuterXml;
			Console.WriteLine($"The following stream error occurred while publishing an XMPP notification:\n{info}");
			_authenticatedEvent.Set();
		}


		private void client_OnAuthError(object sender, XmlElement elem)
		{
			string info = elem == null ? "No further details" : elem.OuterXml;
			Console.WriteLine($"The following authentication error occurred while publishing an XMPP notification:\n{info}");
			_authenticatedEvent.Set();
		}

		public bool SendMessage(string server, string xmppID, int port, string password)
		{


			var jId = JID.Escape(xmppID, server, "Test");

			JabberClient client = new JabberClient();

			client.AutoReconnect = 3f;
			client.User = jId.User;
			client.Server = jId.Server;
			client.NetworkHost = jId.Server;
			client.Port = port;
			client.Resource = jId.Resource;
			client.Password = password;

			client.AutoPresence = true; //note that a deadlock can occur if this is set to true
			client.AutoRoster = false;
			client.SSL = false;
			client.AutoReconnect = 3;
			client.PlaintextAuth = true;
			client.AutoStartCompression = true;
			client.AutoStartTLS = false;
			client.AutoLogin = true;

			client.OnError += new ExceptionHandler(client_OnError);
			client.OnStreamError += new ProtocolHandler(client_OnStreamError);
			client.OnAuthenticate += new ObjectHandler(client_OnAuthenticate);
			client.OnDisconnect += new ObjectHandler(client_OnDisconnect);
			client.OnAuthError += new ProtocolHandler(client_OnAuthError);
			client.OnConnect += new StanzaStreamHandler(client_OnConnect);

			try
			{
				if (!client.IsAuthenticated)
				{
					_authenticatedEvent.Reset();

					if (!_authenticatedEvent.WaitOne(0))
						client.Connect();

					if (!_authenticatedEvent.WaitOne(10000))
						return false;
				}
				if (!client.IsAuthenticated)
					return false;

				jId.Resource = client.Resource;

				Console.WriteLine($"Sending message"); ;
				client.Message(jId.BareJID, "test message");
			}
			finally
			{
				client.Close();

				client.OnError -= client_OnError;
				client.OnStreamError -= client_OnStreamError;
				client.OnAuthenticate -= client_OnAuthenticate;
				client.OnConnect -= client_OnConnect;
				client.OnDisconnect -= client_OnDisconnect;
				client.OnAuthError -= client_OnAuthError;

			}

			return true;
		}
	}
}

@KotM
Copy link

KotM commented Jun 11, 2018

Any progress here?

And how dangerous to use AutoPresence = false?

@ForNeVeR
Copy link
Member

ForNeVeR commented Jun 14, 2018

Sorry, I was preoccupied with other great deeds, so wasn't able to participate in this case. Honestly I think that you, guys, have better opportunity to fix the issue if you have a repro case.

Although, I will now do my best to research the case myself and fix it. I have some spare time and I'm on it right now.

@ForNeVeR
Copy link
Member

@KotM, AutoPresence = false is pretty safe (at least if you don't need this functionality actually), and it seems like it reduces the amount of deadlocks in the code.

I was able to reproduce the issue, although, sometimes it happens even with AutoPresence = false. Looks like to terminate the issue completely we'll need pretty heavy rearchitecture of the JabberClient part of the library :(

@ForNeVeR
Copy link
Member

For now, I'm thinking about replacing the JabberClient internals with some actor system such as Akka.NET that should solve the problems (partially, at least). Event dispatching should be asynchronous and without manual locking; that way we'll probably defeat the deadlocks.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Development

No branches or pull requests

3 participants