Agent crashes when disconnected from network and cannot recover #93

@virizar

Description

Hi @PatrickRitchie

We are currently experimenting with disconnection and reconnection recovery of the agent. After a network interruption the agent appears to crash: it fails to reconnect to our MQTT brokers, and the HTTP server also stops working.

We are using the v6.5 agent.

The test setup is as follows:

  • We have an agent running on one PC (Ubuntu) and a gateway (broker) running on another, both on the same network
  • We connect the agent to the broker and check that data is flowing
  • We disconnect the Ethernet cable from the PC that is running the agent, wait a few seconds, and reconnect it

After this, the agent does not reconnect to the broker and the HTTP server is also down.

Looking at the agent logs, there seems to be a set of unhandled exceptions propagating all the way up to the agent and/or the entity server. I presume this breaks execution. However, the process is still listed as running when we check the process manager in Linux.

2025-01-08 10:07:05,934: 2025-01-08 10:07:05.9341|DEBUG|modules.shdr-adapter|ID = adapter_shdr_56697b5624 : PING sent to : localhost on Port 7878
2025-01-08 10:07:05,936: 2025-01-08 10:07:05.9364|DEBUG|modules.shdr-adapter|ID = adapter_shdr_56697b5624 : PONG Received from : localhost on Port 7878 : Heartbeat = 10000ms
2025-01-08 10:07:15,616: Unhandled exception. Unhandled exception. Unhandled exception. Unhandled exception. Unhandled exception. Unhandled exception. Unhandled exception. Unhandled exception. Unhandled exception. Unhandled exception. MQTTnet.Client.MqttClientDisconnectedException: The MQTT client is disconnected.
2025-01-08 10:07:15,616: at MQTTnet.PacketDispatcher.MqttPacketAwaitable`1.WaitOneAsync(CancellationToken cancellationToken)
2025-01-08 10:07:15,617: at System.Threading.ThreadPoolWorkQueue.Dispatch()
2025-01-08 10:07:15,617: at System.Threading.PortableThreadPool.WorkerThread.WorkerThreadStart()
2025-01-08 10:07:15,618: --- End of stack trace from previous location ---
2025-01-08 10:07:15,619: at MTConnect.Clients.MTConnectMqttEntityServer.PublishObservation(IMqttClient mqttClient, IObservation observation) in /root/agent/mtconnect_dotnet_agent/libraries/MTConnect.NET-MQTT/MTConnectMqttEntityServer.cs:line 99
2025-01-08 10:07:15,620: at MQTTnet.PacketDispatcher.MqttPacketAwaitable`1.WaitOneAsync(CancellationToken cancellationToken)
2025-01-08 10:07:15,620: at MQTTnet.Client.MqttClient.Request[TResponsePacket](MqttPacket requestPacket, CancellationToken cancellationToken)
2025-01-08 10:07:15,620: at MQTTnet.Client.MqttClient.PublishAtLeastOnce(MqttPublishPacket publishPacket, CancellationToken cancellationToken)
2025-01-08 10:07:15,621: at MTConnect.Clients.MTConnectMqttEntityServer.PublishObservation(IMqttClient mqttClient, IObservation observation) in /root/agent/mtconnect_dotnet_agent/libraries/MTConnect.NET-MQTT/MTConnectMqttEntityServer.cs:line 99
2025-01-08 10:07:15,621: at System.Threading.Tasks.Task.<>c.<ThrowAsync>b__128_1(Object state)
2025-01-08 10:07:15,622: at System.Threading.PortableThreadPool.WorkerThread.WorkerThreadStart()
2025-01-08 10:07:15,622: --- End of stack trace from previous location ---
2025-01-08 10:07:15,623: MQTTnet.Client.MqttClientDisconnectedException: The MQTT client is disconnected.
2025-01-08 10:07:15,623: at MQTTnet.PacketDispatcher.MqttPacketAwaitable`1.WaitOneAsync(CancellationToken cancellationToken)
2025-01-08 10:07:15,623: at MQTTnet.Client.MqttClient.PublishAtLeastOnce(MqttPublishPacket publishPacket, CancellationToken cancellationToken)
2025-01-08 10:07:15,624: at MQTTnet.Client.MqttClient.Request[TResponsePacket](MqttPacket requestPacket, CancellationToken cancellationToken)
2025-01-08 10:07:15,624: at MQTTnet.Client.MqttClient.Request[TResponsePacket](MqttPacket requestPacket, CancellationToken cancellationToken)
2025-01-08 10:07:15,625: at MTConnect.Clients.MTConnectMqttEntityServer.PublishObservation(IMqttClient mqttClient, IObservation observation) in /root/agent/mtconnect_dotnet_agent/libraries/MTConnect.NET-MQTT/MTConnectMqttEntityServer.cs:line 99
2025-01-08 10:07:15,625: at MTConnect.Module.AgentObservationAdded(Object sender, IObservation observation) in /root/agent/mtconnect_dotnet_agent/agent/Modules/MTConnect.NET-AgentModule-MqttRelay/Module.cs:line 565
2025-01-08 10:07:15,626: at MQTTnet.Client.MqttClient.PublishAtLeastOnce(MqttPublishPacket publishPacket, CancellationToken cancellationToken)
2025-01-08 10:07:15,626: at System.Threading.Tasks.Task.<>c.<ThrowAsync>b__128_1(Object state)
2025-01-08 10:07:15,626: at System.Threading.ThreadPoolWorkQueue.Dispatch()
2025-01-08 10:07:15,627: at System.Threading.PortableThreadPool.WorkerThread.WorkerThreadStart()
2025-01-08 10:07:15,627: --- End of stack trace from previous location ---
2025-01-08 10:07:15,627: at MTConnect.Clients.MTConnectMqttEntityServer.PublishObservation(IMqttClient mqttClient, IObservation observation) in /root/agent/mtconnect_dotnet_agent/libraries/MTConnect.NET-MQTT/MTConnectMqttEntityServer.cs:line 99
2025-01-08 10:07:15,628: MQTTnet.Client.MqttClientDisconnectedException: The MQTT client is disconnected.
2025-01-08 10:07:15,628: at MQTTnet.PacketDispatcher.MqttPacketAwaitable`1.WaitOneAsync(CancellationToken cancellationToken)
2025-01-08 10:07:15,628: at System.Threading.ThreadPoolWorkQueue.Dispatch()
2025-01-08 10:07:15,629: at System.Threading.PortableThreadPool.WorkerThread.WorkerThreadStart()
2025-01-08 10:07:15,629: --- End of stack trace from previous location ---
2025-01-08 10:07:15,630: at MTConnect.Clients.MTConnectMqttEntityServer.PublishObservation(IMqttClient mqttClient, IObservation observation) in /root/agent/mtconnect_dotnet_agent/libraries/MTConnect.NET-MQTT/MTConnectMqttEntityServer.cs:line 99
2025-01-08 10:07:15,630: at MQTTnet.PacketDispatcher.MqttPacketAwaitable`1.WaitOneAsync(CancellationToken cancellationToken)
2025-01-08 10:07:15,631: at MQTTnet.Client.MqttClient.Request[TResponsePacket](MqttPacket requestPacket, CancellationToken cancellationToken)
2025-01-08 10:07:15,631: at MQTTnet.Client.MqttClient.PublishAtLeastOnce(MqttPublishPacket publishPacket, CancellationToken cancellationToken)
2025-01-08 10:07:15,632: at MTConnect.Clients.MTConnectMqttEntityServer.PublishObservation(IMqttClient mqttClient, IObservation observation) in /root/agent/mtconnect_dotnet_agent/libraries/MTConnect.NET-MQTT/MTConnectMqttEntityServer.cs:line 99
2025-01-08 10:07:15,632: at System.Threading.Tasks.Task.<>c.<ThrowAsync>b__128_1(Object state)
2025-01-08 10:07:15,632: at System.Threading.PortableThreadPool.WorkerThread.WorkerThreadStart()
2025-01-08 10:07:15,633: --- End of stack trace from previous location ---
2025-01-08 10:07:15,633: at MQTTnet.PacketDispatcher.MqttPacketAwaitable`1.WaitOneAsync(CancellationToken cancellationToken)
2025-01-08 10:07:15,633: at MQTTnet.Client.MqttClient.Request[TResponsePacket](MqttPacket requestPacket, CancellationToken cancellationToken)
2025-01-08 10:07:15,634: at MQTTnet.Client.MqttClient.Request[TResponsePacket](MqttPacket requestPacket, CancellationToken cancellationToken)
2025-01-08 10:07:15,634: at System.Threading.Tasks.Task.<>c.<ThrowAsync>b__128_1(Object state)
2025-01-08 10:07:15,634: at System.Threading.ThreadPoolWorkQueue.Dispatch()
2025-01-08 10:07:15,635: at System.Threading.PortableThreadPool.WorkerThread.WorkerThreadStart()
2025-01-08 10:07:15,635: --- End of stack trace from previous location ---
2025-01-08 10:07:15,635: MQTTnet.Client.MqttClientDisconnectedException: The MQTT client is disconnected.
2025-01-08 10:07:15,636: at MQTTnet.PacketDispatcher.MqttPacketAwaitable`1.WaitOneAsync(CancellationToken cancellationToken)
2025-01-08 10:07:15,636: at System.Threading.ThreadPoolWorkQueue.Dispatch()
2025-01-08 10:07:15,636: at System.Threading.PortableThreadPool.WorkerThread.WorkerThreadStart()
2025-01-08 10:07:15,637: --- End of stack trace from previous location ---
2025-01-08 10:07:15,637: at MTConnect.Clients.MTConnectMqttEntityServer.PublishObservation(IMqttClient mqttClient, IObservation observation) in /root/agent/mtconnect_dotnet_agent/libraries/MTConnect.NET-MQTT/MTConnectMqttEntityServer.cs:line 99
2025-01-08 10:07:15,637: at MQTTnet.PacketDispatcher.MqttPacketAwaitable`1.WaitOneAsync(CancellationToken cancellationToken)
2025-01-08 10:07:15,638: at MQTTnet.Client.MqttClient.Request[TResponsePacket](MqttPacket requestPacket, CancellationToken cancellationToken)
2025-01-08 10:07:15,638: at MQTTnet.Client.MqttClient.PublishAtLeastOnce(MqttPublishPacket publishPacket, CancellationToken cancellationToken)
2025-01-08 10:07:15,638: at MTConnect.Clients.MTConnectMqttEntityServer.PublishObservation(IMqttClient mqttClient, IObservation observation) in /root/agent/mtconnect_dotnet_agent/libraries/MTConnect.NET-MQTT/MTConnectMqttEntityServer.cs:line 99
2025-01-08 10:07:15,638: at System.Threading.Tasks.Task.<>c.<ThrowAsync>b__128_1(Object state)
2025-01-08 10:07:15,639: at System.Threading.PortableThreadPool.WorkerThread.WorkerThreadStart()
2025-01-08 10:07:15,639: --- End of stack trace from previous location ---
2025-01-08 10:07:15,639: MQTTnet.Client.MqttClientDisconnectedException: The MQTT client is disconnected.
2025-01-08 10:07:15,640: at MQTTnet.PacketDispatcher.MqttPacketAwaitable`1.WaitOneAsync(CancellationToken cancellationToken)
2025-01-08 10:07:15,640: at MQTTnet.Client.MqttClient.PublishAtLeastOnce(MqttPublishPacket publishPacket, CancellationToken cancellationToken)
2025-01-08 10:07:15,640: at MQTTnet.Client.MqttClient.Request[TResponsePacket](MqttPacket requestPacket, CancellationToken cancellationToken)
2025-01-08 10:07:15,640: at MQTTnet.Client.MqttClient.Request[TResponsePacket](MqttPacket requestPacket, CancellationToken cancellationToken)
2025-01-08 10:07:15,641: at MTConnect.Clients.MTConnectMqttEntityServer.PublishObservation(IMqttClient mqttClient, IObservation observation) in /root/agent/mtconnect_dotnet_agent/libraries/MTConnect.NET-MQTT/MTConnectMqttEntityServer.cs:line 99
2025-01-08 10:07:15,641: at MTConnect.Module.AgentObservationAdded(Object sender, IObservation observation) in /root/agent/mtconnect_dotnet_agent/agent/Modules/MTConnect.NET-AgentModule-MqttRelay/Module.cs:line 565
2025-01-08 10:07:15,641: at MQTTnet.Client.MqttClient.PublishAtLeastOnce(MqttPublishPacket publishPacket, CancellationToken cancellationToken)
2025-01-08 10:07:15,641: at System.Threading.Tasks.Task.<>c.<ThrowAsync>b__128_1(Object state)
2025-01-08 10:07:15,642: at System.Threading.ThreadPoolWorkQueue.Dispatch()
2025-01-08 10:07:15,642: at System.Threading.PortableThreadPool.WorkerThread.WorkerThreadStart()
2025-01-08 10:07:15,642: --- End of stack trace from previous location ---
2025-01-08 10:07:15,643: at MTConnect.Clients.MTConnectMqttEntityServer.PublishObservation(IMqttClient mqttClient, IObservation observation) in /root/agent/mtconnect_dotnet_agent/libraries/MTConnect.NET-MQTT/MTConnectMqttEntityServer.cs:line 99
2025-01-08 10:07:15,643: at MTConnect.Module.AgentObservationAdded(Object sender, IObservation observation) in /root/agent/mtconnect_dotnet_agent/agent/Modules/MTConnect.NET-AgentModule-MqttRelay/Module.cs:line 565
2025-01-08 10:07:15,643: at MTConnect.Module.AgentObservationAdded(Object sender, IObservation observation) in /root/agent/mtconnect_dotnet_agent/agent/Modules/MTConnect.NET-AgentModule-MqttRelay/Module.cs:line 565
2025-01-08 10:07:15,643: at System.Threading.Tasks.Task.<>c.<ThrowAsync>b__128_1(Object state)
2025-01-08 10:07:15,644: --- End of stack trace from previous location ---
2025-01-08 10:07:15,644: at System.Threading.ThreadPoolWorkQueue.Dispatch()
2025-01-08 10:07:15,644: at System.Threading.PortableThreadPool.WorkerThread.WorkerThreadStart()
2025-01-08 10:07:15,645: at System.Threading.Tasks.Task.<>c.<ThrowAsync>b__128_1(Object state)
2025-01-08 10:07:15,645: at System.Threading.ThreadPoolWorkQueue.Dispatch()
2025-01-08 10:07:15,645: at System.Threading.PortableThreadPool.WorkerThread.WorkerThreadStart()
2025-01-08 10:07:15,645: MQTTnet.Client.MqttClientDisconnectedException: The MQTT client is disconnected.
2025-01-08 10:07:15,646: at MQTTnet.PacketDispatcher.MqttPacketAwaitable`1.WaitOneAsync(CancellationToken cancellationToken)
2025-01-08 10:07:15,646: at MQTTnet.Client.MqttClient.PublishAtLeastOnce(MqttPublishPacket publishPacket, CancellationToken cancellationToken)
2025-01-08 10:07:15,646: at MQTTnet.Client.MqttClient.Request[TResponsePacket](MqttPacket requestPacket, CancellationToken cancellationToken)
2025-01-08 10:07:15,647: at MQTTnet.Client.MqttClient.Request[TResponsePacket](MqttPacket requestPacket, CancellationToken cancellationToken)
2025-01-08 10:07:15,647: at MTConnect.Clients.MTConnectMqttEntityServer.PublishObservation(IMqttClient mqttClient, IObservation observation) in /root/agent/mtconnect_dotnet_agent/libraries/MTConnect.NET-MQTT/MTConnectMqttEntityServer.cs:line 99
2025-01-08 10:07:15,647: at MTConnect.Module.AgentObservationAdded(Object sender, IObservation observation) in /root/agent/mtconnect_dotnet_agent/agent/Modules/MTConnect.NET-AgentModule-MqttRelay/Module.cs:line 565
2025-01-08 10:07:15,647: at MQTTnet.Client.MqttClient.PublishAtLeastOnce(MqttPublishPacket publishPacket, CancellationToken cancellationToken)
2025-01-08 10:07:15,648: at System.Threading.Tasks.Task.<>c.<ThrowAsync>b__128_1(Object state)
2025-01-08 10:07:15,648: at System.Threading.ThreadPoolWorkQueue.Dispatch()
2025-01-08 10:07:15,648: at System.Threading.PortableThreadPool.WorkerThread.WorkerThreadStart()
2025-01-08 10:07:15,649: --- End of stack trace from previous location ---
2025-01-08 10:07:15,649: at MTConnect.Clients.MTConnectMqttEntityServer.PublishObservation(IMqttClient mqttClient, IObservation observation) in /root/agent/mtconnect_dotnet_agent/libraries/MTConnect.NET-MQTT/MTConnectMqttEntityServer.cs:line 99
2025-01-08 10:07:15,649: at MTConnect.Module.AgentObservationAdded(Object sender, IObservation observation) in /root/agent/mtconnect_dotnet_agent/agent/Modules/MTConnect.NET-AgentModule-MqttRelay/Module.cs:line 565
2025-01-08 10:07:15,649: at MTConnect.Module.AgentObservationAdded(Object sender, IObservation observation) in /root/agent/mtconnect_dotnet_agent/agent/Modules/MTConnect.NET-AgentModule-MqttRelay/Module.cs:line 565
2025-01-08 10:07:15,650: at System.Threading.Tasks.Task.<>c.<ThrowAsync>b__128_1(Object state)
2025-01-08 10:07:15,650: --- End of stack trace from previous location ---
2025-01-08 10:07:15,650: at System.Threading.ThreadPoolWorkQueue.Dispatch()
2025-01-08 10:07:15,650: at System.Threading.PortableThreadPool.WorkerThread.WorkerThreadStart()
2025-01-08 10:07:15,651: at System.Threading.Tasks.Task.<>c.<ThrowAsync>b__128_1(Object state)
2025-01-08 10:07:15,651: at System.Threading.ThreadPoolWorkQueue.Dispatch()
2025-01-08 10:07:15,651: at System.Threading.PortableThreadPool.WorkerThread.WorkerThreadStart()

Many (or all) of the failing functions do seem to check whether the MQTT client is connected before calling methods on the instance, but that does not appear to be enough. Is there a place where a try-catch block could prevent this from happening?
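To make the question concrete, here is a minimal sketch of the kind of guard we have in mind. It is not the agent's code: `IPublisher`, `DroppingPublisher` and `Relay` are hypothetical names, and `InvalidOperationException` only stands in for `MqttClientDisconnectedException`. It assumes the connection can drop after the connected check has passed, while the publish is still awaiting its acknowledgement, and that the observation handler is an `async void` event handler (the `Task.ThrowAsync` frames in the trace may point that way, but we have not confirmed it).

```csharp
using System;
using System.Threading.Tasks;

// Hypothetical stand-in for the MQTT client; this is NOT the MQTTnet API.
public interface IPublisher
{
    bool IsConnected { get; }
    Task PublishAsync(string topic, string payload);
}

// Simulates a link that drops after the IsConnected check has already passed.
public sealed class DroppingPublisher : IPublisher
{
    // Stale answer: the cable was unplugged a moment ago.
    public bool IsConnected => true;

    public async Task PublishAsync(string topic, string payload)
    {
        // The failure surfaces asynchronously, while waiting for the acknowledgement.
        await Task.Yield();
        throw new InvalidOperationException("The client is disconnected.");
    }
}

public sealed class Relay
{
    private readonly IPublisher _publisher;

    // Number of observations that could not be published.
    public int Dropped { get; private set; }

    public Relay(IPublisher publisher) => _publisher = publisher;

    // Returns false instead of throwing, so a publish failure never reaches the caller.
    public async Task<bool> TryPublishAsync(string topic, string payload)
    {
        // The check is only a fast path; it cannot rule out a drop during the await below.
        if (!_publisher.IsConnected)
        {
            Dropped++;
            return false;
        }

        try
        {
            await _publisher.PublishAsync(topic, payload);
            return true;
        }
        catch (Exception ex)
        {
            Dropped++;
            Console.WriteLine($"Publish failed, observation dropped: {ex.Message}");
            return false;
        }
    }

    // Event-handler shape (async void): an exception escaping here would be unhandled,
    // so all failures are absorbed inside TryPublishAsync.
    public async void OnObservationAdded(object sender, string observation)
    {
        await TryPublishAsync("observations", observation);
    }
}

public static class Program
{
    public static async Task Main()
    {
        var relay = new Relay(new DroppingPublisher());
        bool sent = await relay.TryPublishAsync("observations", "x=1");
        Console.WriteLine($"sent={sent}, dropped={relay.Dropped}");
    }
}
```

In this sketch the connected check alone would not help, because `IsConnected` still returns true when the publish starts; only the try-catch around the awaited call keeps the exception from leaving the handler. Whether dropping, buffering or retrying the observation is the right reaction is a separate design choice.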
