Diagnostics

krahabb edited this page Mar 24, 2024 · 3 revisions

Diagnostics are fundamental to getting insight into what is happening under the hood. There are two main scenarios in which user-collected diagnostics are useful, or sometimes needed, to resolve issues:

  • Devices that misbehave or don't work at all.
  • Unsupported device types or features (i.e. partially supported devices).

When posting an issue, attaching a device diagnostic might speed up the resolution process, even though, depending on the issue, it sometimes isn't needed at all. That's why there's no strict policy on posting issues, and it's not mandatory to attach traces when opening one.

In meross_lan there are 3 main diagnostic tools that can be used to get these traces:

Debug log

This is the most obvious and, in some ways, the most powerful approach, since it is a system-wide setting and can help diagnose subtle bugs or behaviours. Nevertheless, some caveats make this feature more of a development debugging tool (which, in reality, it is). Also, debug logging is a 'passive' feature of the code: when you activate it, meross_lan doesn't do anything special to help gather diagnostic insights on the device, unlike the other 2 diagnostics mechanisms. Still, this is the standard diagnostic tool for any software and it works for some scenarios, so it's nice to have.

Recent HA core versions let you enable/disable debug logging for an integration from the UI. This works as it should, but it can be cumbersome if you have a lot of devices in meross_lan, since the setting affects the whole integration and the output can be very verbose.

In meross_lan v5.0.0 and later you can configure the logging level and other diagnostic features per device (or per profile, meaning you can log the behavior of the MQTT connections). On the integration page click CONFIGURE -> Diagnostics and you'll get a form with different configuration options. Here you can:

  • 'Create diagnostic entities': this is an option for exposing (newer) 'unknown' device features as sensors, so that you can read the device state in HA without meross_lan needing to understand what it is. These sensors are named according to their 'namespace' in the device API and have no meaningful semantics in HA: they are just plain values as reported by the device. Once enabled, they're polled at a rather long interval (about 5 minutes).
  • 'Logging level': here you can control the verbosity of the log for the device/profile. It might be helpful to set this to 'DEBUG' in order to get detailed information about what the code is doing. There's also the 'VERBOSE' level, which additionally logs the raw message payloads (very verbose indeed).
  • 'Obfuscate sensitive data in logs': as expected, this will hide/mask sensitive payload data when dumped. The list of sensitive fields is based on current knowledge of the data semantics and on 'personal beliefs' about what is sensitive and what's not. In general I consider 'sensitive' everything that might identify the device or that might be used to access it (uuid and other 'ids', tokens, mac addresses, server names, ...).
  • 'Start diagnostics trace': when checked, the device will enter a special tracing mode for the amount of time configured in the next form field (see the 'Debug tracing' section later). When in tracing mode, meross_lan, besides the usual logging, will dump logs and data payloads to a dedicated file.
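
As a rough illustration of what key-based obfuscation means, the sketch below masks the values of a fixed set of field names anywhere in a nested payload. The key names, the mask string and the `obfuscate` function are assumptions made for this example: they are not taken from the meross_lan code, which maintains its own list.

```python
from typing import Any

# Hypothetical list of sensitive field names, for illustration only.
SENSITIVE_KEYS = {"uuid", "macAddress", "token", "userId", "server"}
# Hypothetical placeholder written in place of a sensitive value.
MASK = "###"


def obfuscate(payload: Any) -> Any:
    """Return a copy of payload with the values of sensitive keys masked."""
    if isinstance(payload, dict):
        return {
            key: MASK if key in SENSITIVE_KEYS else obfuscate(value)
            for key, value in payload.items()
        }
    if isinstance(payload, list):
        return [obfuscate(item) for item in payload]
    # Scalars (str, int, bool, None, ...) are returned unchanged.
    return payload
```

Because masking of this kind is driven by known field names, an identifier stored under a name that isn't in the set passes through untouched, so it's worth reviewing logs and traces before sharing them.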

Debug tracing

This is a custom feature of meross_lan, a sort of 'debug log' on steroids without the caveats (edit/restart/verbosity) of the system debug log. It puts the device under inspection into a special tracing mode and traces every single message request/response, together with the log messages (the same ones you would see in the debug log) emitted when something relevant happens.

This feature also starts querying the device for every message it claims to support. This is crucial for developing new features/devices: if we don't know what the device is capable of and how it communicates these features, there's nothing to base an implementation on. Of course, this message querying is not guaranteed to work, since the message structure is almost totally unknown most of the time and meross_lan relies on heuristics to query the device. That's why you could experience device disconnections while tracing: Meross devices usually don't tolerate incorrect payloads well.

To enable 'Debug tracing', open the integration configuration panel (hit CONFIGURE on the integration panel), choose Diagnostics in the menu that appears, and check 'Start diagnostics trace' on the form. meross_lan will then collect the trace in the background for a limited amount of time: this was hardcoded to 10 minutes in previous versions, while it is now (since 2.5.3) user configurable. There's also a hardcoded 256 kB limit on the size of the trace file.

The trace file is saved under custom_components/meross_lan/traces in your HA configuration directory and contains a 'tab separated values' list of all the message exchanges and logs of the device under inspection. While saving to the file, meross_lan takes care of obfuscating some known sensitive values like keys or ids (this can be configured in the same diagnostics panel). It could happen that a previously unknown message type (remember this feature queries all of the messages actually supported by the device) is not correctly obfuscated, since meross_lan doesn't know the meaning of the message itself.

This feature is easy to activate (no restart, no configuration edit, no system log pollution) and stops itself automatically. It only requires you to access the 'traces' folder to retrieve the file once it has finished (there's no notification: just wait for the timeout to expire). The Download diagnostics feature works almost the same, but it only collects and reports the full device state at the time it's collected and doesn't let you trace the behavior of the device over a time span.
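
If you want to look at a trace before posting it, a small script can help. The following is a minimal sketch that finds the most recently written file in the traces folder and splits it into rows on tab characters. The function names are made up for this example, and nothing is assumed about the file name or the meaning of each column, so check those against an actual trace.

```python
import csv
from pathlib import Path
from typing import List, Optional


def latest_trace(config_dir: str) -> Optional[Path]:
    """Return the most recently modified file in the traces folder, if any."""
    traces = Path(config_dir) / "custom_components" / "meross_lan" / "traces"
    if not traces.is_dir():
        return None
    files = [entry for entry in traces.iterdir() if entry.is_file()]
    return max(files, key=lambda entry: entry.stat().st_mtime, default=None)


def read_trace(path: Path) -> List[List[str]]:
    """Split a tab-separated trace file into rows of string fields."""
    with path.open(newline="", encoding="utf-8") as handle:
        # QUOTE_NONE keeps double quotes inside payloads as literal characters.
        return list(csv.reader(handle, delimiter="\t", quoting=csv.QUOTE_NONE))
```

Calling `latest_trace("/config")` (the path is only an example of an HA configuration directory) and passing the result to `read_trace` gives you one list of fields per traced line, which is enough to skim for values that should have been obfuscated.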

Download diagnostics

Starting from HA core 2022.2.x, a diagnostics feature is available to ease diagnostics collection and sharing. You'll find it on a supported device panel and/or on a supported integration configuration panel (there you'll have to open the dot menu to access the Download diagnostics feature; meross_lan 2.5.3 required). In meross_lan both work the same, since an integration is usually linked to a single device and vice versa (only smart hubs actually support multiple devices linked to a single integration, but that doesn't matter for meross_lan diagnostics).

The 'Download diagnostics' feature in meross_lan is very similar to 'Debug tracing' (see above). Besides reporting configuration and version information, the code will query the device and report its full state (both for known messages and unknown ones). In versions before 5.0.0 it worked exactly the same as 'Debug tracing' (i.e. collecting data for the amount of time configured in the device diagnostics settings), just with a different output format, but this mechanism didn't work well when you accessed your HA instance through some kind of proxy (like Nabu Casa). Now (v5.0.0 and later) it works as follows:

  • If the device is not forcibly configured for MQTT, it does a 'fast query' of the device state/features and then finishes, thus avoiding failures over proxied access.
  • If the device is configured for MQTT only, it works similarly to 'Debug tracing' and so collects data for the amount of time configured there.

At the end of the process (almost instant for HTTP devices in v5.0.0) you'll get the download from your browser.

This feature now (v5.0.0) is less powerful than 'Debug tracing' since, for devices on HTTP, it doesn't collect 'sessions', so you can't log what happens when issuing commands or inspect transitions in device behavior. It is nevertheless useful to quickly gather a detailed device 'snapshot'.
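
The downloaded diagnostics file is a JSON document, so it can be checked with a few lines of Python before sharing. This is only a sketch: `summarize_diagnostics` is a name invented for this example, and it deliberately assumes nothing about which sections the file contains beyond a JSON object at the top level.

```python
import json
from pathlib import Path
from typing import Dict


def summarize_diagnostics(path: str) -> Dict[str, str]:
    """Map each top-level key of a diagnostics JSON file to its value's type name."""
    with Path(path).open(encoding="utf-8") as handle:
        document = json.load(handle)
    if not isinstance(document, dict):
        raise ValueError("expected a JSON object at the top level")
    return {key: type(value).__name__ for key, value in document.items()}
```

Printing the result gives a quick overview of the sections in the snapshot, which you can then open one by one to look for data you'd rather not publish.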