Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

knx tunneling connection loss - time/date no longer exposed - reconnect should not trigger automation #69328

Closed
dotsoltecti opened this issue Apr 5, 2022 · 23 comments · Fixed by XKNX/xknx#925, #70342 or #71144
Assignees
Milestone

Comments

@dotsoltecti
Copy link

The problem

HA seems to drop within 1-2 days after a restart the tunneling connection to my KNX IP gateway (Weizierl KNX IP BAOS 772). This leads to 2 problems:

  1. HA is exposing the time/date on the KNX bus. After reconnecting, this no longer seems to be working (my GIRA wall switches indicate they are not receiving time-updates). A restart of HA is needed to re-expose the time to the KNX bus
  2. I have a few Wifi wall socket switches (Sonoff S20 running Tasmota) that are switched on/off with a HA-automation when a certain KNX-scenario is activated. For example: an automation turns off certain sockets when a KNX 'night'-scenario has been activated. The KNX-connection loss seems to trigger this automation, resulting in a quick switch on and off, although the automation only contains an action to switch off the socket.

What version of Home Assistant Core has the issue?

core-2022.3.7

What was the last working version of Home Assistant Core?

No response

What type of installation are you running?

Home Assistant Container

Integration causing the issue

knx

Link to integration documentation on our website

https://www.home-assistant.io/integrations/knx/

Diagnostics information

config_entry-knx-28c60dd0515f3e0a436bbaf149f3a85b.json (1).txt

Example YAML snippet

No response

Anything in the logs that might be useful for us?

I am including the log file of the connection loss that happened this morning at 6:23:

2022-04-05 06:22:22 DEBUG (MainThread) [xknx.knx] Sending to 192.168.1.49:3671 at 1649132542.6317146:
<KNXIPFrame <KNXIPHeader HeaderLength="6" ProtocolVersion="16" KNXIPServiceType="CONNECTIONSTATE_REQUEST" Reserve="0" TotalLength="16" />
body="<ConnectionStateRequest CommunicationChannelID="72", control_endpoint="192.168.1.12:56975/udp" />" />
2022-04-05 06:22:22 DEBUG (MainThread) [xknx.raw_socket] Received from ('192.168.1.49', 3671): 0610020800084800
2022-04-05 06:22:22 DEBUG (MainThread) [xknx.knx] Received from 192.168.1.49:3671 at 1649132542.6371832:
<KNXIPFrame <KNXIPHeader HeaderLength="6" ProtocolVersion="16" KNXIPServiceType="CONNECTIONSTATE_RESPONSE" Reserve="0" TotalLength="8" />
body="<ConnectionStateResponse CommunicationChannelID="72" status_code="ErrorCode.E_NO_ERROR" />" />
2022-04-05 06:23:18 DEBUG (MainThread) [xknx.raw_socket] Received from ('192.168.1.49', 3671): 0610042000160448c1002900bce0110f130202008019
2022-04-05 06:23:18 DEBUG (MainThread) [xknx.knx] Received from 192.168.1.49:3671 at 1649132598.3190699:
<KNXIPFrame <KNXIPHeader HeaderLength="6" ProtocolVersion="16" KNXIPServiceType="TUNNELLING_REQUEST" Reserve="0" TotalLength="22" />
body="<TunnellingRequest communication_channel_id="72" sequence_counter="193" cemi="<CEMIFrame SourceAddress="IndividualAddress("1.1.15")" DestinationAddress="GroupAddress("2/3/2")" Flags="1011110011100000" code="L_DATA_IND" payload="<GroupValueWrite value="<DPTArray value="[0x19]" />" />" />" />" />
2022-04-05 06:23:18 DEBUG (MainThread) [xknx.knx] Sending to 192.168.1.49:3671 at 1649132598.3197277:
<KNXIPFrame <KNXIPHeader HeaderLength="6" ProtocolVersion="16" KNXIPServiceType="TUNNELLING_ACK" Reserve="0" TotalLength="10" />
body="<TunnellingAck communication_channel_id="72" sequence_counter="193" status_code="ErrorCode.E_NO_ERROR" />" />
2022-04-05 06:23:18 DEBUG (MainThread) [xknx.telegram] <Telegram direction="Incoming" source_address="1.1.15" destination_address="2/3/2" payload="<GroupValueWrite value="<DPTArray value="[0x19]" />" />" />
2022-04-05 06:23:18 DEBUG (MainThread) [xknx.raw_socket] Received from ('192.168.1.49', 3671): 0610042000160448c1002900bce0110f130202008019
2022-04-05 06:23:18 DEBUG (MainThread) [xknx.knx] Received from 192.168.1.49:3671 at 1649132598.3371336:
<KNXIPFrame <KNXIPHeader HeaderLength="6" ProtocolVersion="16" KNXIPServiceType="TUNNELLING_REQUEST" Reserve="0" TotalLength="22" />
body="<TunnellingRequest communication_channel_id="72" sequence_counter="193" cemi="<CEMIFrame SourceAddress="IndividualAddress("1.1.15")" DestinationAddress="GroupAddress("2/3/2")" Flags="1011110011100000" code="L_DATA_IND" payload="<GroupValueWrite value="<DPTArray value="[0x19]" />" />" />" />" />
2022-04-05 06:23:18 DEBUG (MainThread) [xknx.knx] Sending to 192.168.1.49:3671 at 1649132598.3378189:
<KNXIPFrame <KNXIPHeader HeaderLength="6" ProtocolVersion="16" KNXIPServiceType="TUNNELLING_ACK" Reserve="0" TotalLength="10" />
body="<TunnellingAck communication_channel_id="72" sequence_counter="193" status_code="ErrorCode.E_NO_ERROR" />" />
2022-04-05 06:23:18 DEBUG (MainThread) [xknx.telegram] <Telegram direction="Incoming" source_address="1.1.15" destination_address="2/3/2" payload="<GroupValueWrite value="<DPTArray value="[0x19]" />" />" />
2022-04-05 06:23:18 DEBUG (MainThread) [xknx.raw_socket] Received from ('192.168.1.49', 3671): 06100209001048480801c0a801310e57
2022-04-05 06:23:18 DEBUG (MainThread) [xknx.knx] Received from 192.168.1.49:3671 at 1649132598.352643:
<KNXIPFrame <KNXIPHeader HeaderLength="6" ProtocolVersion="16" KNXIPServiceType="DISCONNECT_REQUEST" Reserve="0" TotalLength="16" />
body="<DisconnectRequest CommunicationChannelID="72" control_endpoint="192.168.1.49:3671/udp" />" />
2022-04-05 06:23:18 WARNING (MainThread) [xknx.log] Received DisconnectRequest from tunnelling sever.
2022-04-05 06:23:18 DEBUG (MainThread) [xknx.knx] Sending to 192.168.1.49:3671 at 1649132598.3535402:
<KNXIPFrame <KNXIPHeader HeaderLength="6" ProtocolVersion="16" KNXIPServiceType="DISCONNECT_RESPONSE" Reserve="0" TotalLength="8" />
body="<DisconnectResponse CommunicationChannelID="72" status_code="ErrorCode.E_NO_ERROR" />" />
2022-04-05 06:23:18 DEBUG (MainThread) [xknx.log] Stopping task datetime.broadcast_139959802878896 because of connection loss.
2022-04-05 06:23:18 DEBUG (MainThread) [xknx.log] Stopping task datetime.broadcast_139959803160992 because of connection loss.
2022-04-05 06:23:18 DEBUG (MainThread) [xknx.log] Stopping task datetime.broadcast_139959809465744 because of connection loss.
2022-04-05 06:23:18 DEBUG (MainThread) [xknx.log] Stopping task datetime.broadcast_139959801146480 because of connection loss.
2022-04-05 06:23:18 DEBUG (MainThread) [xknx.log] Stopping task datetime.broadcast_139959830857184 because of connection loss.
2022-04-05 06:23:18 DEBUG (MainThread) [xknx.log] Stopping task datetime.broadcast_139959793463360 because of connection loss.
2022-04-05 06:23:18 DEBUG (MainThread) [xknx.state_updater] StateUpdater stopping
2022-04-05 06:23:18 DEBUG (MainThread) [xknx.log] Closing transport.
2022-04-05 06:23:19 WARNING (MainThread) [homeassistant.components.automation.ventilatie] Ventilatie: Already running
2022-04-05 06:23:21 DEBUG (MainThread) [xknx.knx] Sending to 192.168.1.49:3671 at 1649132601.6370552:
<KNXIPFrame <KNXIPHeader HeaderLength="6" ProtocolVersion="16" KNXIPServiceType="CONNECT_REQUEST" Reserve="0" TotalLength="26" />
body="<ConnectRequest control_endpoint="192.168.1.12:33686/udp" data_endpoint="192.168.1.12:33686/udp" request_type="ConnectRequestType.TUNNEL_CONNECTION" />" />
2022-04-05 06:23:21 DEBUG (MainThread) [xknx.raw_socket] Received from ('192.168.1.49', 3671): 06100206001449000801c0a801310e57040411fa
2022-04-05 06:23:21 DEBUG (MainThread) [xknx.knx] Received from 192.168.1.49:3671 at 1649132601.6427915:
<KNXIPFrame <KNXIPHeader HeaderLength="6" ProtocolVersion="16" KNXIPServiceType="CONNECT_RESPONSE" Reserve="0" TotalLength="20" />
body="<ConnectResponse communication_channel="73" status_code="ErrorCode.E_NO_ERROR" data_endpoint="192.168.1.49:3671/udp" request_type="ConnectRequestType.TUNNEL_CONNECTION" identifier="4602" />" />
2022-04-05 06:23:21 DEBUG (MainThread) [xknx.log] Tunnel established communication_channel=73, address=1.1.250
2022-04-05 06:23:21 DEBUG (MainThread) [xknx.log] Restarting task datetime.broadcast_139959802878896 as the connection to the bus was reestablished.
2022-04-05 06:23:21 DEBUG (MainThread) [xknx.log] Restarting task datetime.broadcast_139959803160992 as the connection to the bus was reestablished.
2022-04-05 06:23:21 DEBUG (MainThread) [xknx.log] Restarting task datetime.broadcast_139959809465744 as the connection to the bus was reestablished.
2022-04-05 06:23:21 DEBUG (MainThread) [xknx.log] Restarting task datetime.broadcast_139959801146480 as the connection to the bus was reestablished.
2022-04-05 06:23:21 DEBUG (MainThread) [xknx.log] Restarting task datetime.broadcast_139959830857184 as the connection to the bus was reestablished.
2022-04-05 06:23:21 DEBUG (MainThread) [xknx.log] Restarting task datetime.broadcast_139959793463360 as the connection to the bus was reestablished.
2022-04-05 06:23:21 DEBUG (MainThread) [xknx.state_updater] StateUpdater initializing values
2022-04-05 06:23:21 DEBUG (MainThread) [xknx.state_updater] StateUpdater reading 0/0/35 for Ketelcontact - State
2022-04-05 06:23:21 DEBUG (MainThread) [xknx.state_updater] StateUpdater reading 2/6/0 for Temperatuur Voordeur - Value
2022-04-05 06:23:21 DEBUG (MainThread) [xknx.telegram] <Telegram direction="Outgoing" source_address="1.1.250" destination_address="0/0/35" payload="<GroupValueRead />" />
2022-04-05 06:23:21 DEBUG (MainThread) [xknx.knx] Sending to 192.168.1.49:3671 at 1649132601.8780334:
<KNXIPFrame <KNXIPHeader HeaderLength="6" ProtocolVersion="16" KNXIPServiceType="TUNNELLING_REQUEST" Reserve="0" TotalLength="21" />
body="<TunnellingRequest communication_channel_id="73" sequence_counter="0" cemi="<CEMIFrame SourceAddress="IndividualAddress("1.1.250")" DestinationAddress="GroupAddress("0/0/35")" Flags="1011110011100000" code="L_DATA_REQ" payload="<GroupValueRead />" />" />" />
2022-04-05 06:23:21 DEBUG (MainThread) [xknx.raw_socket] Received from ('192.168.1.49', 3671): 06100421000a04490000
2022-04-05 06:23:21 DEBUG (MainThread) [xknx.knx] Received from 192.168.1.49:3671 at 1649132601.908757:
<KNXIPFrame <KNXIPHeader HeaderLength="6" ProtocolVersion="16" KNXIPServiceType="TUNNELLING_ACK" Reserve="0" TotalLength="10" />
body="<TunnellingAck communication_channel_id="73" sequence_counter="0" status_code="ErrorCode.E_NO_ERROR" />" />
2022-04-05 06:23:21 DEBUG (MainThread) [xknx.raw_socket] Received from ('192.168.1.49', 3671): 061004200015044900002e00bce011fa0023010000
2022-04-05 06:23:21 DEBUG (MainThread) [xknx.knx] Received from 192.168.1.49:3671 at 1649132601.9369955:
<KNXIPFrame <KNXIPHeader HeaderLength="6" ProtocolVersion="16" KNXIPServiceType="TUNNELLING_REQUEST" Reserve="0" TotalLength="21" />
body="<TunnellingRequest communication_channel_id="73" sequence_counter="0" cemi="<CEMIFrame SourceAddress="IndividualAddress("1.1.250")" DestinationAddress="GroupAddress("0/0/35")" Flags="1011110011100000" code="L_DATA_CON" payload="<GroupValueRead />" />" />" />

Additional information

I have tried to work around the second problem by adding a delay to the automation (it should only trigger when the KNX-'night' scenario has been active for 2 seconds) but the problem persists. The wall sockets are still being switched on by the automation, although it only contains a "switch off" action.

@probot-home-assistant
Copy link

knx documentation
knx source
(message by IssueLinks)

@probot-home-assistant
Copy link

Hey there @Julius2342, @farmio, @marvin-w, mind taking a look at this issue as it has been labeled with an integration (knx) you are listed as a code owner for? Thanks!
(message by CodeOwnersMention)

@farmio
Copy link
Contributor

farmio commented Apr 5, 2022

Hi 👋!

In 2022.4 we will have done countermeasures for your problem 0: the connection losses - this may or may not be fixed then.

At 1) this is odd, it should resume to send - 1 hour after the connection was established again. According to your log it did restart 6 times - are you exposing 6 date/time GAs?

At 2) Wehn the bus is disconnected your entities will have the state "unavailable", when it is connected again they regain their previous state. 2 state_changed events are fired - this is HA standard behaviour. Your automations should either account for this or just use knx_event instead of an entity state change as trigger for scenes. See also home-assistant/architecture#668

@dotsoltecti
Copy link
Author

Hi @farmio! Thanks for the quick assist. Glad to hear that 'problem 0' was already on the radar and is currently being worked on.

  1. I am exposing date/time on 5 different GA's. Oddly enough I had a duplicate entry (one GA was duplicate), so I just corrected that. Might this have been the problem (exposed time on the same GA twice)? Any reason why this has become a problem in the last months?
  2. To make my automations account for the 'unavailable' state, I presume it suffices to include a 'from' state (currently empty) in my automation triggers (for example state changes from 'off' to 'on'). If not I will look further into it (thanks for pointing me in the right direction)

@farmio
Copy link
Contributor

farmio commented Apr 6, 2022

I doubt this was the reason. If your disconnects were ~ once an hour it would just never trigger, but since it was every couple of days I don't know. Someone else has reported the same issue already, but I could not reproduce it on my system 🤷

Try 2022.4 - it just came out a couple of hours ago. Hope KNX connection is not affected by the actual problem anymore... and who knows, maybe the actual problem got fixed also 😃

@dotsoltecti
Copy link
Author

Last weekend I switched to 2022.4. The daily disconnects have vanished, but the exposing of time/date on the KNX bus still fails. Logs show however that it should be working/running (task gets resumed after the connection is reestablished). I will now start a group monitor in ETS (might give some insight).

@farmio
Copy link
Contributor

farmio commented Apr 13, 2022

Sure, but keep in mind the ETS GroupAddressMonitor doesn't show disconnects and reconnects which resets the timers for time sending).
You would probably see the state updater requesting fresh data for all entities though.

Maybe also look at xknx.telegram logs if HA even tries to send to your configured GAs.

@dotsoltecti
Copy link
Author

ETS GroupAddressMonitor confirmed that right now, the time is not being exposed (nothing seems to be written to the relevant GA's 2/4/0-2/4/4).

The xknx logs showed me that (see log.txt):

  1. After HA is started, the time is correctly exposed on the bus.
  2. Soon after starting, the connection gets dropped and reestablished 2 times. Since then, the connection has been stable (no reconnects). The 'expose'-tasks seems to get resumed.
  3. After those 2 reconnects, time is no longer being exposed by HA to the KNX-bus

So, it seems to me that the reconnect is somehow messing with the expose-feature? (Maybe something with the 'timer' for time sending you mentioned)

@farmio
Copy link
Contributor

farmio commented Apr 13, 2022

We'll look into it.
The more interesting question is: Why is your connection dropped 2 times after each HA restart. Any info in the logs that's not in the original post?

@dotsoltecti
Copy link
Author

dotsoltecti commented Apr 13, 2022

The more interesting question is: Why is your connection dropped 2 times after each HA restart. Any info in the logs that's not in the original post?

In the log I attached to my previous post, I don't see any indication why the connection was dropped 2 times. The only thing I can think of is that normally HA gets rebooted when it gets an upgraded. Since watchtower is handling these upgrades, these happen for a lot of containers a once, resulting in a temporary high load of my server. Maybe the KNX IP Gateway is shutting down the connection due to a slow response or something?

I just manually restarted HA but the connection seems to remain stable (no reconnects).

@dotsoltecti
Copy link
Author

The more interesting question is: Why is your connection dropped 2 times after each HA restart. Any info in the logs that's not in the original post?

In the log I attached to my previous post, I don't see any indication why the connection was dropped 2 times. The only thing I can think of is that normally HA gets rebooted when it gets an upgraded. Since watchtower is handling these upgrades, these happen for a lot of containers a once, resulting in a temporary high load of my server. Maybe the KNX IP Gateway is shutting down the connection due to a slow response or something?

I just manually restarted HA but the connection seems to remain stable (no reconnects).

After almost a day with a stable tunnel connection (no reconnects), HA keeps exposing time to the KNX bus on an hourly basis. I think this confirms something goes wrong after HA has reconnected. Please let me know if there is anything further I can do to help.

@smallFire1602
Copy link

I encounter the same problem now since somewhere in December - until now a restart of Home assistant helped but until 2022.04 event restart helps updating my time / date.

@farmio
Copy link
Contributor

farmio commented Apr 18, 2022

@smallFire1602 Hi 👋! Are you on 2022.4.5 already? We fixed a bug that might affect this, but we are not entirely sure as we can't reproduce it yet.

Could you please elaborate on

but until 2022.04 event restart helps updating my time / date.

What exactly does and does not work?

@smallFire1602
Copy link

@farmio I am on 2022.4.5-

Until 2022.04.0 I did get an updated time / date on my displays with every restart of home assistant. Update every hour was not working as told somewhere in December 21 or even a little bit longer...
With 2022.04.0 even the restart does not update the time. The problem is only time and date. I expose a temperature value too and this is working like a charm.

@be2881
Copy link

be2881 commented Apr 19, 2022

I'm having the same problem. I discovered that time and date were not exposed anymore to the KNX bus. Also the temperature is exposed, and that's working fine. I'm already running 2022.4.5

@marvin-w
Copy link
Contributor

I'm having the same problem. I discovered that time and date were not exposed anymore to the KNX bus. Also the temperature is exposed, and that's working fine. I'm already running 2022.4.5

#70342 should fix this. I was able to reproduce the issue locally and the fix works at least for me. @farmio Didn't have the bug locally so let's see what happens. I hope this will be part of 2022.4.7. If so, it would be nice if someone could test it again once that's released (will take some more days at least).

@be2881
Copy link

be2881 commented Apr 21, 2022

Shure, as soon as 2022.4.7 is released. I'll test it and post the restults here.

@be2881
Copy link

be2881 commented Apr 27, 2022

I've installed 2022.4.7 and can confirm this bug is solved. Time & date are now again exposed to the KNX bus. Thanks everybody for solving this issue.

@smallFire1602
Copy link

For me it worked sadly one time only after a restart.

@dotsoltecti
Copy link
Author

For me, my initial problem remains after installing 2022.4.7. A reconnection with the KNX-bus still breaks things (tested and confirmed by physically pulling the cable for a few minutes). The time is still no longer exposed after a reconnection to the KNX-bus (as initially reported). Should I open a new issue, or is there a way to reopen this issue?

@marvin-w marvin-w reopened this Apr 30, 2022
@marvin-w
Copy link
Contributor

I reopened it. I will test it again when I have some time left.

@farmio
Copy link
Contributor

farmio commented Apr 30, 2022

So we identified the issue and are working on a fix. It will need a new release of xknx - hopefully done until 2022.5 is released (xknx 0.21.0 used in coming beta 5 still has the issue).
XKNX/xknx#936

@marvin-w marvin-w mentioned this issue May 1, 2022
22 tasks
@marvin-w marvin-w modified the milestones: 2022.4.7, 2022.5.0 May 1, 2022
@smallFire1602
Copy link

Works now like a charm! Thanks!

@github-actions github-actions bot locked and limited conversation to collaborators Jun 11, 2022
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.