Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Bug]: BLE device does not work after moving to another gateway #194

Closed
1 of 3 tasks
Lurker00 opened this issue Apr 8, 2024 · 35 comments
Closed
1 of 3 tasks

[Bug]: BLE device does not work after moving to another gateway #194

Lurker00 opened this issue Apr 8, 2024 · 35 comments
Labels
bug Something isn't working

Comments

@Lurker00
Copy link

Lurker00 commented Apr 8, 2024

LocalTuya Version

3.2.5.2b1

Home Assistant Version

2024.4.1

Environment

  • Does the device work using the Home Assistant Tuya Cloud component?
  • Is this device connected to another local integration, including Home Assistant and any other tools?
  • The devices are within the same HA subnet, and they get discovered automatically when I add them

What happened?

Unlike Zigbee devices, a BLE device can be detached from a gateway and then attached to another gateway, keeping the same device id. I have two BLE capable gateways. When I moved a BLE sensor to another gateway, LocalTuya automaticaly updated its Local Key, but not the IP address of the new gateway. If I try to "Rreconfigure existing device", it allows me to change the IP address, but does not save and does not use it. If I move the device back, it updates the Local Key and the device becomes accessible.

The current (old gateway) IP address is 192.168.0.33. The correct (new gateway) is 192.168.0.32.

If it is not possible to detect the new IP address automatically (at least it can be guessed from the Local Key and the list of gateways), please allow to manually edit it.

The device is battery powered T&H sensor with LCD display, if it matters.

Steps to reproduce.

Add a BLE device to a gateway, then detach it and attach to another gateway. Then the connection is lost.

Relevant log output

No response

Diagnostics information.

localtuya-944c1616039a1d1ec6b9745179bff74e-ВС_ Градусник-61d84145af21bdd68c392c89a0e8c43d (1).json

@Lurker00 Lurker00 added the bug Something isn't working label Apr 8, 2024
@xZetsubou
Copy link
Owner

The integration does give you the ability to modify the IP however after you change it, it will instantly return to old IP.

Why it revert back to old IP?
Because the gateway id config in sub-device still not updated to new one so it uses the old gateway by default localtuya will revert it back because he doesn't know that the gateway changed.

I think the best way to fix this is by while it updating the localkey it updates the gateway id to new one as well.

@Lurker00
Copy link
Author

Lurker00 commented Apr 8, 2024

I think the best way to fix this is by while it updating the localkey it updates the gateway id to new one as well.

Do you need something from my side for that fix?

xZetsubou added a commit that referenced this issue Apr 8, 2024
* Groups set dp values is now set values together instead of set dp on it's own.
* Sub-devices updated localkey also update the node_id and gateway id.
@xZetsubou
Copy link
Owner

xZetsubou commented Apr 8, 2024

I did made changes however I haven't tested this :) let me know if this actually do the job!

edit: changes made on master.

note: that if sub device infos updated you will need to reload integration manually

@Lurker00
Copy link
Author

Lurker00 commented Apr 9, 2024

I have to wait for a beta or a new release, because HACS always reverts to the top of the list of versions when I try to Redownload, and I'm not experienced enough for manual install for the master.

@xZetsubou
Copy link
Owner

Released beta2 you can test.

@Lurker00
Copy link
Author

Lurker00 commented Apr 9, 2024

No luck! I've moved the sensor back to the original gateway while it was diabled in LocalTuya. Then I enabled it. Then I updated to beta2 and restarted HA. TRhen I moved the sensor to the other gateway. Reloaded LocalTuya, even restarted HA again, but the sensor again has the new local key, but the old IP address.

Just in case, tried Reconfigure to change the IP address to no avail.

@xZetsubou
Copy link
Owner

Can you download the device diagnostics "after it changed" so in-order to sub_device to change there "ip" the gateway_id in config should also updated with localkey as well. I want to check if it updated or not also an info logs should shows Updated informations for sub-device after it updates

@Lurker00
Copy link
Author

Lurker00 commented Apr 9, 2024

No, gateway_id is not updated! Only the local_key is changed when I move the device between gateways.

info logs should shows Updated informations for sub-device

I've enabled logging for LocalTuya and for the sensor, restarted HA, moved the sensor to another gateway. There is no such a record in the home-assistant.log, though the local key has been changed. The only related records are:

2024-04-09 22:03:49.606 DEBUG (MainThread) [custom_components.localtuya.common] [bfc...9me] Trying to connect to ВС: Градусник...
2024-04-09 22:03:49.606 DEBUG (MainThread) [custom_components.localtuya.common] [bfc...9me] Retrieving initial state
2024-04-09 22:03:49.805 DEBUG (MainThread) [custom_components.localtuya.select] [bfc...9me] Entity Temperature Unit (DP 9) - Not restoring as restore on reconnect is disabled for this entity and the entity has an initial status or it is not a passive entity
2024-04-09 22:03:49.888 WARNING (MainThread) [custom_components.localtuya.common] [bfc...9me] Connect to ВС: Градусник failed: Sub device is not connected
2024-04-09 22:03:49.888 DEBUG (MainThread) [custom_components.localtuya.common] [bfc...9me] Disconnected: waiting for discovery broadcast
2024-04-09 22:03:57.142 DEBUG (MainThread) [custom_components.localtuya.common] [bfc...9me] Trying to connect to ВС: Градусник...
2024-04-09 22:03:57.143 DEBUG (MainThread) [custom_components.localtuya.common] [bfc...9me] Retrieving initial state
2024-04-09 22:03:57.271 DEBUG (MainThread) [custom_components.localtuya.common] [bfc...9me] Disconnected: waiting for discovery broadcast


@xZetsubou
Copy link
Owner

xZetsubou commented Apr 9, 2024

This will trigger once after the localkey changed. so if the localkey is the same it won't update other stuff.

now inside of device diagnostics file does the gateway updated to the new one or not.

@xZetsubou
Copy link
Owner

I made a little change you will need to re-download beta2 from HACS.

After you re-download revert the sub-device to the old "localkey". well If the localkey change everytime you switch device between gateways it should run normally no.

@Lurker00
Copy link
Author

Lurker00 commented Apr 9, 2024

This will trigger once after the localkey changed

I'm not familiar with Python. Found and tried to understand get_gateway_by_deviceid and it looked like good.

Yes, the cloud_data contains the new local_key, but the "magic" does not happen, and the gateway is not updated.

I made a little change you will need to re-download beta2 from HACS

OK, will do it tomorrow.

@Lurker00
Copy link
Author

Redownloaded beta2, and it does not work.

I've inserted debug prints into update_local_key, and found out that it is never called. Bearing in mind that logs are full of

2024-04-09 22:05:41.348 DEBUG (MainThread) [custom_components.localtuya.common] [bfc...9me] Trying to connect to ВС: Градусник...
2024-04-09 22:05:41.348 DEBUG (MainThread) [custom_components.localtuya.common] [bfc...9me] Retrieving initial state
2024-04-09 22:05:41.390 DEBUG (MainThread) [custom_components.localtuya.common] [bfc...9me] Disconnected: waiting for discovery broadcast
2024-04-09 22:05:46.349 DEBUG (MainThread) [custom_components.localtuya.common] [bfc...9me] Trying to connect to ВС: Градусник...
2024-04-09 22:05:46.349 DEBUG (MainThread) [custom_components.localtuya.common] [bfc...9me] Retrieving initial state
2024-04-09 22:05:46.411 DEBUG (MainThread) [custom_components.localtuya.common] [bfc...9me] Disconnected: waiting for discovery broadcast

I've inserted (common.py)
await self.update_local_key()
right after the line
self.debug("Retrieving initial state")
restarted HA, and now I see in the log (some messages are mine):

2024-04-10 11:23:55.665 DEBUG (MainThread) [custom_components.localtuya.common] [bfc...9me] Retrieving initial state
2024-04-10 11:23:55.714 DEBUG (MainThread) [custom_components.localtuya.select] [bfc...9me] Entity Temperature Unit (DP 9) - Not restoring as restore on reconnect is disabled for this entity and the entity has an initial status or it is not a passive entity
2024-04-10 11:24:03.944 INFO (MainThread) [custom_components.localtuya.common] [bfc...9me] Updating local key for sub-device bfce16bai5lvo9me.
2024-04-10 11:24:03.944 INFO (MainThread) [custom_components.localtuya.common] [bfc...9me] Updating informations for sub-device bfce16bai5lvo9me.
2024-04-10 11:24:03.944 INFO (MainThread) [custom_components.localtuya.common] [bfc...9me] bfce16bai5lvo9me new gateway ID: bf2ffe45f7abbae9b3m38k
2024-04-10 11:24:03.945 INFO (MainThread) [custom_components.localtuya.common] [bfc...9me] Updated bfce16bai5lvo9me IP: 192.168.0.32
2024-04-10 11:24:03.945 INFO (MainThread) [custom_components.localtuya.common] [bfc...9me] Updated informations for sub-device bfce16bai5lvo9me.
2024-04-10 11:24:03.945 INFO (MainThread) [custom_components.localtuya.common] [bfc...9me] local_key updated for device bfce16bai5lvo9me.

It means, your code never calls update_local_key, but if it did, it would work correctly.

But the place I've put the call to was wrong: HA became so slow on its start, that it can't even make another graceful restart I initiated after removing that line of code. I had to restart the VM with HA.

After such a restart my sensor again had wrong gateway info :( OK, I've removed that bad line, and added it to another place, to be:

            except Exception as e:
                if not (self._fake_gateway and "Not found" in str(e)):
                    e = "Sub device is not connected" if self.is_subdevice else e
                    self.warning(f"Connect to {host} failed: {e}")
                    await self.abort_connect()
                    await self.update_local_key()

and voila - after HA restart, the sensor has updated local key and gateway. In some minutes I found

2024-04-10 11:51:29.762 DEBUG (MainThread) [custom_components.localtuya.common] [bfc...9me] Success: connected to ВС: Градусник

and I expect its data updated quite soon!

But I'm not quite sure if that call now in a good place for all the cases! It's up to you to decide.

@Lurker00
Copy link
Author

and I expect its data updated quite soon!

Yes: HA started to show the data form the sensor!

@Lurker00
Copy link
Author

I believe the best and the only place to call something like update_local_key() is where an update from the cloud just has been received. This also would prevent cases like #195. "Something like" - because it should work with the newly received data.

@xZetsubou
Copy link
Owner

xZetsubou commented Apr 10, 2024

After such a restart my sensor again had wrong gateway info :( OK, I've removed that bad line, and added it to another place, to be:

It because Home Assistant save the states every 15-minutes and when it restarts probably using the service
service: homeassistant.save_persistent_states, if you made hard restart it didn't had the time to actually store the states.


When you mentioned that "only the localkey automatically updated"

I just assumed it was called it because I don't know what actually happened

does this work good for you it's good place because this only called if the device ID actually doesn't exist anymore any this doesn't happen unless the user somehow added the device with different device ID.

            except Exception as e:
                if not (self._fake_gateway and "Not found" in str(e)):
                    e = "Sub device is not connected" if self.is_subdevice else e
                    self.warning(f"Connect to {host} failed: {e}")
                    await self.abort_connect()
                    await self.update_local_key()

@Lurker00
Copy link
Author

The change I made almost works. Consider this log:

Traceback (most recent call last):
  File "/config/custom_components/localtuya/common.py", line 315, in _make_connection
    status = await self._interface.detect_available_dps(cid=self._node_id)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/config/custom_components/localtuya/core/pytuya/__init__.py", line 1101, in detect_available_dps
    data = await self.status(cid=cid)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/config/custom_components/localtuya/core/pytuya/__init__.py", line 1023, in status
    status: dict = await self.exchange(command=DP_QUERY, nodeID=cid)
                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/config/custom_components/localtuya/core/pytuya/__init__.py", line 996, in exchange
    self.transport.write(enc_payload)
    ^^^^^^^^^^^^^^^^^^^^
AttributeError: 'NoneType' object has no attribute 'write'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/config/custom_components/localtuya/common.py", line 333, in _make_connection
    await self.update_local_key()
  File "/config/custom_components/localtuya/common.py", line 444, in update_local_key
    new_data[CONF_DEVICES][dev_id][CONF_GATEWAY_ID] = new_gw.id
                                                      ^^^^^^^^^
AttributeError: 'NoneType' object has no attribute 'id'

The line 333 (/config/custom_components/localtuya/common.py) in this log is exactly the line I've added to call update_local_key(). LocalTuya raised an exception because the device could not be connected. But update_local_key() raised the exception as well, because the gateway for the device was not found. This happened near the HA startup, during LocalTuya initialization.

Note, that all other calls to update_local_key() are made in the exception handlers as well. So, my suggestions are:

  • Accept my additional call to update_local_key().
  • Fix update_local_key() to ensure it does not raise exceptions by itself.

I have no idea why get_gateway_by_deviceid() returned None in this place, but it happened! Moreover, I have no idea why the local key was different in this case, because, before HA restart, the sensor was working already on the new gateway, and LocalTuya "knew" about it. Well, as Forrest Gump said, "sh*t happens", and the code should handle even weird cases.

@xZetsubou
Copy link
Owner

xZetsubou commented Apr 15, 2024

How much time it takes for the sub-device to move from GW 1 to GW 2, and how much time it takes LocalTuya will update them?
I'm trying to modify update_localkey function however not sure how much the best time to cached old devices data e.g. localkey gateway id etc....

edit: Does the Node ID changes when the GW changes as well?

@Lurker00
Copy link
Author

I'll do more tests later today.

Meanwhile, "move from GW 1 to GW 2" is a two steps action in Smart Life:

  1. Detach the device from GW1. Then it is available from the application directly, but the device can't send data to the cloud.
  2. Attach the device to GW2. Then it is visible to the cloud, instantly.

Both steps are peformed via Internet connecton to the cloud, no direct BLE connection from the app to the device is required. The time between these actions is undefined. I'll check statuses and the log in between, and after step 2, and will report you.

@Lurker00
Copy link
Author

I believe the current state of this part of the code is good to release.

however not sure how much the best time to cached old devices data e.g. localkey gateway id etc...

I think you should keep it while the device_id is still in the list of devices from the cloud.

I've made the following tests:

Test 1

  1. Detached the device from GW1.
  2. Reloaded LocalTuya after some minutes. Here LocalTuya didn't get any readings from the device.
  3. Attached the device to GW2. In a couple of minutes, LocalTuya updated its devices list from the cloud and found the device on GW2.

Test 2

  1. Detached the device from GW2.
  2. Immediately attached it to GW1.
  3. Waited for a hour for LocalTuya to find the device on GW1, to no avail (expected).
  4. Reloaded LocalTuya. It founds the device on GW1 almost immediately.

So,

  • if LocalTuya has connected to the device and got some readings, it has no knowledge that the device was moved to another GW or even detached or turned off. Reload is required.
  • if LocalTuya failed to connect to the device, then it tries to connect again and again, and finally finds the device on the new GW. No reload is required.

When a device is detached from its GW, its local key is changed to a very different value (no a gateway has such a local key), and node id becomes null. When the device is attached to a new gateway, its local key is changed to the gateway's local key, and its node it is restored to the previous value (at least what I see).

So, if LocalTuya parses the device list every time it is updated from the cloud, it may find out that node id became null, but it is unlikely, if the user attached the device to the new gateway fast enough. It may check for local key change and force disconnect from this device, to hope that the following connect attempt would find the device on another gateway, but I think it's a waste of efforts.

Right now I think it is good enough to request a user to reload TuyaLocal after moving a device to another gateway.

I attach a set of diagnostics files:

Please note, that detached device is the case when get_gateway_by_deviceid returns None and may cause that exception. Yes, I see you've already re-written both update_localkey and its use 👍

@xZetsubou
Copy link
Owner

xZetsubou commented Apr 16, 2024

There's an issue I have noticed before I didn't fixed it yet, so the device is actually moved to new GW and localkey probably updated the information of the sub-device, it seems the entities goes shutdown this is probably due to delay on shutdown entities I made before.

So what I can think of is that when you reload localtuya you actually enabling the entities not actually re-link the device to the new GW

returns None and may cause #194 (comment)

This should be handled after the refactor commit.

edit: Another thing I just found is that update_localkey won't works if the sub device isn't the fake gateway, and I'm not sure if there's a scenarios can lead to this point. not even sure if this may cause an issue in you cases as well.

xZetsubou added a commit that referenced this issue Apr 17, 2024
* Sub-device now will check the localkey of the gateway if it doesn't match it will raise an expectation.
@xZetsubou
Copy link
Owner

@Lurker00 Do you have a gateway with protocol version: 3.3?

@Lurker00
Copy link
Author

No, all my 3 gateways use protocol 3.4. But I have two WiFi devices with 3.3: a fan and a plug with power monitor, and I experienced no problems with them.

xZetsubou added a commit that referenced this issue May 3, 2024
@xZetsubou
Copy link
Owner

Is the past a few days I was testing a command and use it as a heartbeat for sub-devices this will let us know if the sub-device disconnected from the gateway and update the sub-device information without need to reload localtuya, I don't know another gateway or 3.3 gateway.

Can you test it with your sub-device that change the gateway then see if it will re-connect to it.

@Lurker00
Copy link
Author

Lurker00 commented May 3, 2024

Yes, I can. Will you prepare beta7 for that with all the changes made since beta6?

@xZetsubou
Copy link
Owner

Yes, I can. Will you prepare beta7 for that with all the changes made since beta6?

Released beta7

@Lurker00
Copy link
Author

Lurker00 commented May 3, 2024

First of all, there are new, misleading, messages:

2024-05-03 21:46:24.886 WARNING (MainThread) [custom_components.localtuya.coordinator] [] Failed to use К: T&H Сенсор as gateway.
2024-05-03 21:47:25.604 WARNING (MainThread) [custom_components.localtuya.coordinator] [bf5...daj] Failed to use КЛ1: LED BULB B5K as gateway.

Why LocalTuya even tries to use a BLE device "as a gateway"?

Then. I've moved "К: T&H Сенсор" to another gateway. It was detected almost instantly:

2024-05-03 21:46:19.700 WARNING (MainThread) [custom_components.localtuya.coordinator] [bf8...m91] Connect to К: T&H Сенсор failed: Sub device is not connected
2024-05-03 21:46:21.133 INFO (MainThread) [custom_components.localtuya.coordinator] [bf8...m91] Updated informations for sub-device К: T&H Сенсор.
2024-05-03 21:46:21.134 INFO (MainThread) [custom_components.localtuya.coordinator] [bf8...m91] local_key updated for device К: T&H Сенсор.
2024-05-03 21:46:27.435 INFO (MainThread) [custom_components.localtuya.coordinator] [bf8...m91] Updated К: T&H Сенсор gateway ID to: bf2ffe45f7abbae9b3m38k
2024-05-03 21:46:27.509 INFO (MainThread) [custom_components.localtuya.coordinator] [bf8...m91] Updated К: T&H Сенсор IP to: 192.168.0.32
2024-05-03 21:46:27.510 INFO (MainThread) [custom_components.localtuya.coordinator] [bf8...m91] Updated informations for sub-device К: T&H Сенсор.
2024-05-03 21:46:27.511 INFO (MainThread) [custom_components.localtuya.coordinator] [bf8...m91] local_key updated for device К: T&H Сенсор.

and then it works and updates its measurements. But... (later)

@Lurker00
Copy link
Author

Lurker00 commented May 3, 2024

... an old problem happened that I can't investigate to report: sometimes LocalTuya dispatch messages to a wrong device. You saw it in my logs before, now this is another example (messages from a scene switch and a bulb go to the T&H sensor):

2024-05-03 21:46:58.954 ERROR (MainThread) [homeassistant] Error doing job: Exception in callback Entity.async_write_ha_state()

Traceback (most recent call last):
  File "/usr/src/homeassistant/homeassistant/components/sensor/__init__.py", line 668, in state
    numerical_value = float(value)  # type:ignore[arg-type]
                      ^^^^^^^^^^^^
ValueError: could not convert string to float: 'single_click'

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/usr/local/lib/python3.12/asyncio/events.py", line 88, in _run
    self._context.run(self._callback, *self._args)
  File "/usr/src/homeassistant/homeassistant/helpers/entity.py", line 998, in async_write_ha_state
    self._async_write_ha_state()
  File "/usr/src/homeassistant/homeassistant/helpers/entity.py", line 1119, in _async_write_ha_state
    state, attr, capabilities, shadowed_attr = self.__async_calculate_state()
                                               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/src/homeassistant/homeassistant/helpers/entity.py", line 1056, in __async_calculate_state
    state = self._stringify_state(available)
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/src/homeassistant/homeassistant/helpers/entity.py", line 1004, in _stringify_state
    if (state := self.state) is None:
                 ^^^^^^^^^^
  File "/usr/src/homeassistant/homeassistant/components/sensor/__init__.py", line 670, in state
    raise ValueError(
ValueError: Sensor sensor.k_t_h_temperature_2 has device class 'temperature', state class 'measurement' unit '°C' and suggested precision 'None' thus indicating it has a numeric value; however, it has the non-numeric value: 'single_click' (<class 'str'>)
2024-05-03 21:46:59.045 ERROR (MainThread) [homeassistant] Error doing job: Exception in callback Entity.async_write_ha_state()
Traceback (most recent call last):
  File "/usr/src/homeassistant/homeassistant/components/sensor/__init__.py", line 668, in state
    numerical_value = float(value)  # type:ignore[arg-type]
                      ^^^^^^^^^^^^
ValueError: could not convert string to float: 'white'

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/usr/local/lib/python3.12/asyncio/events.py", line 88, in _run
    self._context.run(self._callback, *self._args)
  File "/usr/src/homeassistant/homeassistant/helpers/entity.py", line 998, in async_write_ha_state
    self._async_write_ha_state()
  File "/usr/src/homeassistant/homeassistant/helpers/entity.py", line 1119, in _async_write_ha_state
    state, attr, capabilities, shadowed_attr = self.__async_calculate_state()
                                               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/src/homeassistant/homeassistant/helpers/entity.py", line 1056, in __async_calculate_state
    state = self._stringify_state(available)
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/src/homeassistant/homeassistant/helpers/entity.py", line 1004, in _stringify_state
    if (state := self.state) is None:
                 ^^^^^^^^^^
  File "/usr/src/homeassistant/homeassistant/components/sensor/__init__.py", line 670, in state
    raise ValueError(
ValueError: Sensor sensor.vs_t_h_humidity_2 has device class 'humidity', state class 'measurement' unit '%' and suggested precision 'None' thus indicating it has a numeric value; however, it has the non-numeric value: 'white' (<class 'str'>)

It happened many times. Full log attached. But it is not related to the main test.

home-assistant_2024-05-03T18-53-09.235Z.log

@xZetsubou
Copy link
Owner

Why LocalTuya even tries to use a BLE device "as a gateway"?

It doesn't exactly uses the BLE device as gateway, it uses the BLE Device instance as gateway and it's completely fine since any sub-device as long it's connect to the gateway IP it can be used as gateway, as for why it uses BLE device instance it probably because it's the first sub-device you added into localtuya.
However I may need to adjust this so that if it fails to uses sub-device as gateway then try another sub-device.

and then it works and updates its measurements. But... (later)

means?

... an old problem happened that I can't investigate to report: sometimes LocalTuya dispatch messages to a wrong device. You saw it in my logs before, now this is another example (messages from a scene switch and a bulb go to the T&H sensor):

Can you post the "entity diagnostics", I want to also see the other devices configs that sending related to this one :)

@Lurker00
Copy link
Author

Lurker00 commented May 3, 2024

means?

Means, the changes to detect a device detached from one gateway and attached to another gateway seems to work as expected.

Can you post the "entity diagnostics"

Here my whole devices list:
config_entry-localtuya-944c1616039a1d1ec6b9745179bff74e.json

It seems when LocalTuya is waiting for a particular seqno, it may receive a message from another device with the same seqno, but there is no a check for expected cid. Sometimes, on HA restart, I see wrong states of some switches: a turned off switch is shown as turned on, and another one is shown as off but should be on. But sometimes it happens between different types of datapoints, which leads to such exceptions.

@xZetsubou
Copy link
Owner

xZetsubou commented May 4, 2024

I know that there is an issue on waiting seqno for device 3.4 and 3.5 since the waited seqno may timeout due to seqno may increased but it won't effect "status" command at all,

edit: I'll dig more into this.

@xZetsubou
Copy link
Owner

xZetsubou commented May 4, 2024

Just to ensure that nothing goes wrong in pytuya.py module change lines 839 to 858

From this

if "dps" not in decoded_message:
return
if dps_payload := decoded_message.get("dps"):
if cid := decoded_message.get("cid"):
self.dps_cache.setdefault(cid, {})
self.dps_cache[cid].update(dps_payload)
else:
self.dps_cache.setdefault("parent", {})
self.dps_cache["parent"].update(dps_payload)
listener = self.listener and self.listener()
if listener is not None:
if cid:
listener = listener._sub_devices.get(cid, listener)
device = self.dps_cache.get(cid, {})
else:
device = self.dps_cache.get("parent", {})
listener.status_updated(device)

To this

            if "dps" not in decoded_message:
                return

            self.dps_cache.setdefault("parent", {})
            if dps_payload := decoded_message.get("dps"):
                if cid := decoded_message.get("cid"):
                    self.dps_cache.setdefault(cid, {})
                    self.dps_cache[cid].update(dps_payload)
                else:
                    self.dps_cache["parent"].update(dps_payload)

            listener = self.listener and self.listener()
            if listener is not None:
                if cid and cid in listener._sub_devices:
                    listener = listener._sub_devices.get(cid, listener)
                    device_status = self.dps_cache.get(cid, {})
                else:
                    device_status = self.dps_cache.get("parent", {})

                listener.status_updated(device_status)

To ensure that it only updates that sub-device that added on HA only.

@Lurker00
Copy link
Author

Lurker00 commented May 4, 2024

Done. After restart I still see exceptions due to wrong dispatch. I attach the full log. I also forced logging of deciphered data for better understanding who is the real source of the messages.

I still believe that _dispatch() should check the cid before calling _release_listener() in this context:

        if msg.seqno in self.listeners:
            self.debug("Dispatching sequence number %d", msg.seqno)
            self._release_listener(msg.seqno, msg)

home-assistant_2024-05-04T09-37-45.958Z.log

@xZetsubou
Copy link
Owner

I think this should fix the issue. In line 1058 can you modify status function

From this

async def status(self, cid=None):
"""Return device status."""
status: dict = await self.exchange(command=DP_QUERY, nodeID=cid, delay=False)
if status:
if cid and "dps" in status:
self.dps_cache.update({cid: status["dps"]})
elif "dps" in status:
self.dps_cache.update({"parent": status["dps"]})
return self.dps_cache.get(cid, {}) if cid else self.dps_cache.get("parent", {})

To this

    async def status(self, cid=None):
        """Return device status."""
        status: dict = await self.exchange(command=DP_QUERY, nodeID=cid, delay=False)

        self.dps_cache.setdefault("parent", {})
        if status and "dps" in status:
            if "cid" in status:
                self.dps_cache.update({status["cid"]: status["dps"]})
            else:
                self.dps_cache["parent"].update(status["dps"])

        return self.dps_cache.get(cid, {}) if cid else self.dps_cache.get("parent", {})

@Lurker00
Copy link
Author

Lurker00 commented May 5, 2024

I think this should fix the issue

The restart with this change was without exceptions. Let's see how it will be running.

Then I moved BLE "К: T&H Сенсор" to another gateway, and it was detected on the new place quickly, without exceptions.

@xZetsubou xZetsubou added the master/next-release Fixed in master branch, Will be ready in the next release label May 5, 2024
Copy link

This issue was closed because it was resolved on the release:

@github-actions github-actions bot added stale and removed master/next-release Fixed in master branch, Will be ready in the next release stale labels May 26, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

No branches or pull requests

2 participants