Troubleshooting Guide Devices

Pierre Cauchois edited this page Feb 5, 2018 · 1 revision

Troubleshooting Guide - Devices

Having a problem connecting to IoT Hub? Are some operations failing and you're not sure why? Here are a few things you can try before you file an issue.

Cannot connect to your Azure IoT hub.

For some reason, your device doesn't seem to be able to connect to your Azure IoT hub. A few things to verify:

  1. Are your credentials correct?
    1. If you're using x509 certificates, double-check that the thumbprint in the registry matches the thumbprint of the certificate you're trying to use.
    2. If you're using a connection string with a shared access key, make sure the key matches the device's key or the key of a policy with the DeviceConnect capability.
    3. If you're using a shared access signature, make sure the expiry is correct and that you're using the right shared access key to sign it.
  2. Verify in your device registry (using the Azure portal) that your device is enabled.
  3. Can you get through the firewall?
    1. The easiest thing to try is to run the iothub-diagnostics tool and see if it manages to connect to your Azure IoT hub with your device's credentials. It will try all supported protocols, including websockets, and report back.
    2. If you cannot run iothub-diagnostics, you can run through the same steps manually:
      1. Ping a known website to verify that name resolution and outbound traffic work.
      2. Change the transport used to instantiate the client (Amqp, AmqpWs, Mqtt, MqttWs, and Http) and check whether any of them connects.
  4. Try running the default samples.
    1. If the samples can connect, look for differences between how you instantiate the client and how the samples do. It might be a simple typo.
    2. If the samples cannot connect and neither can iothub-diagnostics, it's likely an issue with the credentials or your network.
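
The credential checks above can be partly automated before looking at the network. The following is a minimal sketch, not part of the SDK: the helper names (parseConnectionString, missingDeviceFields, sasSecondsRemaining) are hypothetical, and it assumes the usual device connection string layout (HostName, DeviceId, and either SharedAccessKey or x509=true) and a shared access signature whose expiry is carried in the se field as seconds since the Unix epoch.

```javascript
// Sketch: sanity-check device credentials locally (helper names are hypothetical).

// Split "Key=Value;Key=Value" on the first '=' only, because base64 keys can end in '='.
function parseConnectionString(connectionString) {
  const fields = {};
  for (const part of connectionString.split(';')) {
    const idx = part.indexOf('=');
    if (idx === -1) continue; // skip empty or malformed segments
    fields[part.slice(0, idx)] = part.slice(idx + 1);
  }
  return fields;
}

// Return the names of the fields missing from a device connection string.
// Assumption: a device needs HostName, DeviceId, and a key or x509=true.
function missingDeviceFields(connectionString) {
  const fields = parseConnectionString(connectionString);
  const missing = ['HostName', 'DeviceId'].filter((name) => !fields[name]);
  if (!fields.SharedAccessKey && fields.x509 !== 'true') {
    missing.push('SharedAccessKey');
  }
  return missing;
}

// Seconds until a shared access signature expires (negative if already expired).
function sasSecondsRemaining(sasToken, nowMs = Date.now()) {
  const query = sasToken.replace(/^SharedAccessSignature\s+/, '');
  const expiry = Number(new URLSearchParams(query).get('se'));
  if (!Number.isFinite(expiry) || expiry <= 0) {
    throw new Error('No valid "se" (expiry) field found in the token');
  }
  return expiry - Math.floor(nowMs / 1000);
}
```

A negative result from sasSecondsRemaining means the token has already expired. Neither helper can tell you whether the key itself is the right one; that still has to be compared against the device registry.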

Not detecting disconnections

The hard thing about disconnections is that they often seem random and if the SDK is not firing an error, there's no way to know what's going on. Or is there?

  1. Could the retry logic be just delaying things?
    • By default the retry logic will go on for 4 minutes. Have you waited that long?
    • If you don't want to wait, try disabling the retry logic by calling client.setRetryPolicy(new NoRetry());
  2. Need detailed logs? The SDK uses the debug library for logging.
    • Set the DEBUG environment variable and re-run your application. A few good values for the DEBUG environment variable to get you started:
      • azure* will log SDK activity but not the underlying transport library
      • amqp10* will log the low-level AMQP library activity
      • * will log everything
    • The debug library logs to stderr by default and can be quite verbose, especially if DEBUG is set to *.
    • If you're saving those logs in order to post them in an issue, be careful to scrub them of confidential information!
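
Scrubbing saved logs by hand is error-prone, so a small filter can help as a first pass. The sketch below is an illustration only: scrubLog and SECRET_PATTERNS are hypothetical names, and the two patterns are assumptions about where secrets commonly show up (the SharedAccessKey field of a connection string and the sig field of a shared access signature). Extend them for whatever else your logs contain, and still read the result before posting it.

```javascript
// Sketch: redact likely secrets from captured debug output before sharing it.
// The patterns are assumptions, not an exhaustive list.
const SECRET_PATTERNS = [
  /(SharedAccessKey=)[^;\s"']+/g, // key inside a connection string
  /(\bsig=)[^&\s"']+/g,           // signature inside a shared access signature
];

// Replace each matched secret with a placeholder, keeping the field name.
function scrubLog(text) {
  return SECRET_PATTERNS.reduce(
    (scrubbed, pattern) => scrubbed.replace(pattern, '$1<redacted>'),
    text
  );
}
```

Since the logs go to stderr, one way to use this is to redirect stderr to a file while reproducing the problem, then pass the file contents through scrubLog before attaching them to an issue.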

Failing to send some messages

That's another tricky one. It looks like some messages are being sent, but not all of them. What gives? The first question to ask is: how do you know some messages aren't being sent?

  1. If it's because the callback is called with an error, the error object might give you more clues than just a message. Pay attention to the type of the error itself:
    • If it's a custom SDK type it should be pretty explicit, but if it's not enough, look at the properties of the error and try to see if there's a protocol-specific error in there.
    • If it's a generic Error it means the SDK failed to translate that error. Please file an issue and give us as many details as possible including the values of the error properties and the error stack.
  2. If it's because you're not seeing the messages in your cloud application:
    • On the device side, check the arguments passed to the callback of the send operation.
    • Use iothub-explorer with the monitor-events subcommand to check if the messages show up on the Event Hubs-compatible endpoint of your IoT hub. If they do, at least you know that the device is acting properly. If they don't, you know it's unlikely to be a service issue and can track down device-side issues.
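
To make the error inspection described above repeatable, the error passed to the send callback can be summarized in one place. This is a minimal sketch with a hypothetical helper name (describeError). It makes no assumption about which properties a given transport attaches to the error; it simply lists whatever is there.

```javascript
// Sketch: summarize an error received in a send callback (helper name is hypothetical).
function describeError(err) {
  if (err === null || typeof err !== 'object') {
    return { typeName: typeof err, isGenericError: false, message: String(err), properties: {}, stack: undefined };
  }
  const typeName = err.constructor ? err.constructor.name : 'Object';
  const properties = {};
  // Extra own properties may carry a protocol-specific error; names vary by transport.
  for (const key of Object.getOwnPropertyNames(err)) {
    if (key !== 'stack' && key !== 'message') properties[key] = err[key];
  }
  return {
    typeName,                            // custom SDK type, or plain "Error"
    isGenericError: typeName === 'Error', // true means the error was not translated
    message: err.message,
    properties,
    stack: err.stack,
  };
}
```

Inside the callback you pass to the send operation, something like `if (err) console.error(describeError(err));` gives you the error type, its extra properties, and the stack in one object, which is also the information worth including if you end up filing an issue.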