Module fails to restart because of transient network error #1982
Comments
Did you check that DNS was working at the moment your module failed to connect? Did you test DNS inside the failed module? Do the other modules instantiate the module client the same way? Would you share the full logs of iotedged, edgeAgent and edgeHub? @varunpuranik do you have any suggestions?
I've been able to ssh into the Raspberry Pi that acts as an edge device. Since {
"dns": ["1.1.1.1"]
} I've also noticed that the For now, the problem hasn't reoccurred yet, but now I also do not know which change might have fixed the problem.
The 127.0.0.1 entry should definitely always map to your device name. I think configuring the public DNS server 1.1.1.1 is required as well, since the iotedge check result for your device reports that no DNS is configured. Please keep monitoring and, when the problem happens again, provide the logs of the IoT Edge daemon, edgeAgent, edgeHub and your custom module, along with the iotedge check result. This issue will be marked stale after 30 days of inactivity, and we will close it once it is stale. Thanks.
Should 127.0.0.1 map to the device name, or 127.0.1.1? (127.0.0.1 refers to The logs for edgeAgent and edgeHub should be in the issue, but it seems something was wrong with the formatting. I'll fix it (done). I can't find any more logs from my custom module, since it has now been running for a week and the issue has not reoccurred, so it seems to be fixed for now. How can I get the logs for the edge daemon?
I think the change to the /etc/hosts file fixed your issue, because the connection that failed was the one from the module to edgeHub. The change to daemon.json that adds the DNS server fixes issues from edgeHub to the cloud. Because your module is running on the host network, it needs to be able to resolve the GatewayHostname. You can check which value it has in the Env for IOTEDGE_GATEWAYHOSTNAME using:
To get daemon logs:
We will evaluate whether iotedge check should validate /etc/hosts and /etc/hostname.
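The resolution requirement described above can also be tested with a small script run in the same network context as the module. This is only a sketch under assumptions: the function name resolve_gateway_hostname is made up for illustration, and the script checks only that the value of IOTEDGE_GATEWAYHOSTNAME resolves to an address, not that edgeHub is actually reachable.

```python
import os
import socket


def resolve_gateway_hostname(env=os.environ, resolver=socket.getaddrinfo):
    """Return the addresses IOTEDGE_GATEWAYHOSTNAME resolves to, or [] on failure.

    `env` and `resolver` are injectable so the check can be exercised
    without touching the real environment or DNS (hypothetical helper).
    """
    hostname = env.get("IOTEDGE_GATEWAYHOSTNAME")
    if not hostname:
        return []
    try:
        results = resolver(hostname, None)
    except socket.gaierror:
        # Name resolution failed (no DNS answer and no /etc/hosts entry).
        return []
    # Each result is (family, type, proto, canonname, sockaddr);
    # sockaddr[0] is the resolved address.
    return sorted({info[4][0] for info in results})
```

Calling resolve_gateway_hostname() with no arguments uses the real environment and resolver; an empty list means the variable is unset or the name does not resolve, which would point at /etc/hosts or DNS rather than at the module code.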
I will close this issue; please reopen if needed.
Expected Behavior
When an IoT Edge module disconnects it should be able to reconnect.
Current Behavior
I have an IoT Edge solution that runs on a Raspberry Pi and consists of 4 IoT Edge modules.
Upon deployment of the solution to the Edge device (Pi), everything worked fine. After a couple of hours, one of the modules was restarted by the IoT Edge runtime and failed to start again. All the other modules keep working.
When I look in the logs of the failing module, I can see that an exception is being thrown with the following stacktrace:
This exception occurs when the OpenAsync method on the ModuleClient is called to open the connection to IoT Hub.
The module uses MQTT as its communication protocol; this is how the connection is established:
My other modules that are running on the same device (and that keep working), are also using MQTT.
I'm using Microsoft.Azure.Devices.Client 1.20.3 in all my Edge modules (both the one that's failing and the ones that are still working).
I also tried to restart the module manually via iotedge restart <modulename>, but this didn't help. I assume that the problem will be solved (temporarily?) when I reboot the device, but this is not something that can be done when the device is 'in the field'.
The module that is failing has the following 'createOptions':
(However, this module is also running on another device, and this issue doesn't exist on that device.)
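For transient failures while opening the connection, one common mitigation is to retry with exponential backoff instead of failing on the first error. The sketch below is written in Python rather than against the .NET SDK used in this issue, and everything in it is an assumption for illustration: connect_with_backoff is a hypothetical helper, connect stands in for whatever call opens the connection, and OSError stands in for the exception type the real client raises. It would not help if the hostname never resolves.

```python
import random
import time


def connect_with_backoff(connect, max_attempts=5, base_delay=1.0, max_delay=60.0,
                         sleep=time.sleep, retry_on=(OSError,)):
    """Call connect() until it succeeds, backing off exponentially between tries."""
    for attempt in range(1, max_attempts + 1):
        try:
            return connect()
        except retry_on:
            if attempt == max_attempts:
                # Out of attempts: re-raise so the caller (or runtime) sees the failure.
                raise
            # 1x, 2x, 4x, ... the base delay, capped at max_delay.
            delay = min(max_delay, base_delay * 2 ** (attempt - 1))
            # Random jitter so several modules do not retry in lockstep.
            sleep(delay * random.uniform(0.5, 1.0))
```

The sleep function is a parameter only so the helper can be tested without waiting; in a module it would be left at its default.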
Context (Environment)
Output of iotedge check
Device Information
Runtime Versions
iotedge version: iotedge 1.0.8 (208b220)
docker version: 3.0.7
Logs
edge-agent logs
edge-hub logs