Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Opentherm Gateway component not working #17263

Closed
Wummeke opened this issue Oct 8, 2018 · 81 comments · Fixed by #19618 or #19715
Closed

Opentherm Gateway component not working #17263

Wummeke opened this issue Oct 8, 2018 · 81 comments · Fixed by #19618 or #19715

Comments

@Wummeke
Copy link

Wummeke commented Oct 8, 2018

Home Assistant release with the issue:
0.79.3

Last working Home Assistant release (if known):
None

Operating environment (Hass.io/Docker/Windows/etc.):
I am running Home Assistant from Docker

Component/platform:
https://www.home-assistant.io/components/climate.opentherm_gw/

Description of problem:
Component does not work basicly. I sometimes have to reboot Home Assistant a few times to get the component going:

2018-10-08 22:23:24 WARNING (MainThread) [homeassistant.components.climate] Setup of platform opentherm_gw is taking over 10 seconds.
2018-10-08 22:23:30 ERROR (MainThread) [homeassistant.core] Error doing job: Task exception was never retrieved
Traceback (most recent call last):
  File "/usr/src/app/homeassistant/helpers/entity_platform.py", line 345, in _async_add_entity
    await entity.async_added_to_hass()
  File "/usr/src/app/homeassistant/components/climate/opentherm_gw.py", line 66, in async_added_to_hass
    await self.gateway.connect(self.hass.loop, self._device)
  File "/usr/local/lib/python3.6/site-packages/pyotgw/pyotgw.py", line 55, in connect
    await self.get_reports()
  File "/usr/local/lib/python3.6/site-packages/pyotgw/pyotgw.py", line 170, in get_reports
    OTGW_DEFAULT_TIMEOUT, loop=self.loop)
  File "/usr/local/lib/python3.6/asyncio/tasks.py", line 362, in wait_for
    raise futures.TimeoutError()
concurrent.futures._base.TimeoutError

and When it finally works, it stops working after a few minutes. When I try to change the temperature an error is added to the log:

2018-10-08 22:13:51 ERROR (MainThread) [homeassistant.core] Error doing job: Task exception was never retrieved
Traceback (most recent call last):
  File "/usr/src/app/homeassistant/helpers/service.py", line 226, in _handle_service_platform_call
    await func(entity, data)
  File "/usr/src/app/homeassistant/components/climate/__init__.py", line 679, in async_service_temperature_set
    await entity.async_set_temperature(**kwargs)
  File "/usr/src/app/homeassistant/components/climate/opentherm_gw.py", line 174, in async_set_temperature
    temp)
  File "/usr/local/lib/python3.6/site-packages/pyotgw/pyotgw.py", line 91, in set_target_temp
    self._protocol.issue_cmd(cmd, value), timeout, loop=self.loop))
  File "/usr/local/lib/python3.6/asyncio/tasks.py", line 362, in wait_for
    raise futures.TimeoutError()
concurrent.futures._base.TimeoutError

My openthermgateway is connected to a nodeMCU with ESP Easy running ser2net.

Problem-relevant configuration.yaml entries and (fill out even if it seems unimportant):

climate:
  - platform: opentherm_gw
    device: socket://192.168.0.6:6638
    name: Thermostat
    precision: 0.5

Additional information:
Here a log of a 'successful' connection after reboot and then almost imediately a crash of the component when I try to set a temperature:

2018-10-08 22:29:12 INFO (MainThread) [homeassistant.components.climate] Setting up climate.opentherm_gw
2018-10-08 22:29:17 DEBUG (MainThread) [homeassistant.components.climate.opentherm_gw] Connected to Thermostat on socket://192.168.0.6:6638
2018-10-08 22:29:17 DEBUG (MainThread) [homeassistant.components.climate.opentherm_gw] Received report: {'control_setpoint': 10.0, 'otgw_about': 'OpenTherm Gateway 4.2.5', 'otgw_build': '17:59 20-10-2015', 'otgw_clockmhz': '4 MHz', 'otgw_gpio_a': 0, 'otgw_gpio_b': 0, 'otgw_gpio_a_state': 0, 'otgw_gpio_b_state': 0, 'otgw_led_a': 'F', 'otgw_led_b': 'X', 'otgw_led_c': 'O', 'otgw_led_d': 'T', 'otgw_led_e': 'P', 'otgw_led_f': 'C', 'otgw_mode': 'G', 'otgw_setpoint_ovrd_mode': 'N', 'otgw_smart_pwr': 'Low power', 'otgw_thermostat_detect': 'D', 'otgw_setback_temp': '16.00', 'otgw_ignore_transitions': 1, 'otgw_ovrd_high_byte': 1, 'otgw_vref': 3, 'otgw_dhw_ovrd': 'A', 'master_ch_enabled': 0, 'master_dhw_enabled': 1, 'master_cooling_enabled': 0, 'master_otc_enabled': 0, 'master_ch2_enabled': 0, 'slave_fault_indication': 0, 'slave_ch_active': 0, 'slave_dhw_active': 0, 'slave_flame_on': 0, 'slave_cooling_active': 0, 'slave_ch2_active': 0, 'slave_diagnostic_indication': 0, 'remote_transfer_dhw': 0, 'remote_transfer_max_ch': 0, 'remote_rw_dhw': 0, 'remote_rw_max_ch': 0, 'slave_max_relative_modulation': 0.0, 'slave_max_capacity': 0, 'slave_min_mod_level': 0, 'room_setpoint': 0.0, 'relative_mod_level': 0.0, 'ch_water_pressure': 0.0, 'room_temp': 0.0, 'ch_water_temp': 0.0, 'dhw_temp': 0.0, 'outside_temp': 0.0, 'return_water_temp': 0.0, 'slave_dhw_max_setp': 0, 'slave_dhw_min_setp': 0, 'slave_ch_max_setp': 0, 'slave_ch_min_setp': 0, 'dhw_setpoint': 0.0, 'max_ch_setpoint': 90.0, 'ch_burner_starts': 0, 'ch_pump_starts': 0, 'dhw_pump_starts': 0, 'dhw_burner_starts': 0, 'ch_burner_hours': 0, 'ch_pump_hours': 0, 'dhw_pump_hours': 0, 'dhw_burner_hours': 0, 'date': datetime.datetime(2018, 10, 8, 22, 29, 17, 661521)}
2018-10-08 22:30:01 ERROR (MainThread) [homeassistant.core] Error doing job: Task exception was never retrieved
Traceback (most recent call last):
  File "/usr/src/app/homeassistant/helpers/service.py", line 226, in _handle_service_platform_call
    await func(entity, data)
  File "/usr/src/app/homeassistant/components/climate/__init__.py", line 679, in async_service_temperature_set
    await entity.async_set_temperature(**kwargs)
  File "/usr/src/app/homeassistant/components/climate/opentherm_gw.py", line 174, in async_set_temperature
    temp)
  File "/usr/local/lib/python3.6/site-packages/pyotgw/pyotgw.py", line 91, in set_target_temp
    self._protocol.issue_cmd(cmd, value), timeout, loop=self.loop))
  File "/usr/local/lib/python3.6/asyncio/tasks.py", line 362, in wait_for
    raise futures.TimeoutError()
concurrent.futures._base.TimeoutError

@MartinHjelmare
Copy link
Member

CC @mvn23

@mvn23
Copy link
Contributor

mvn23 commented Oct 8, 2018

This looks like a connection issue to me. A few things you can try to confirm/troubleshoot this:

  • Connect to the host and port using telnet from inside the docker container if possible and from the host running docker: telnet 192.168.0.6 6638 Edit: make sure Home Assistant is not running.
  • Connect the gateway via USB/serial to the host running docker and try to access it with screen or a similar program (baud rate 9600). The thermostat and boiler don't need to be connected to the gateway.

In the first scenario, you should see a few lines of 9 characters each appearing roughly every second. If not, try typing PS=0 and hit enter. If the lines start with an E there is an issue with the gateway device or config and you may want to try the troubleshooting from the OpenTherm Gateway website. If the lines don't have 9 characters each or appear garbled, there may be an issue with the ser2net config. Please configure ser2net to comply with the serial port specifications from the OpenTherm Gateway website. This is what I use in ser2net.conf for local testing on port 8802:

127.0.0.1,8802:raw:0:/dev/otgw:9600 NONE 1STOPBIT 8DATABITS XONXOFF LOCAL -RTSCTS

In the second scenario, if the thermostat and boiler are not connected, you should see the line E00000000 appear every second IIRC. If not, try the PS=0 or PS=1 commands and post the output.

@Wummeke
Copy link
Author

Wummeke commented Oct 9, 2018

Seems like I have an issue in the combination of Home Assistant, my NodeMCU/OTGW and my Google Home mini. I just accidentally discovered that when I have my Google Home mini unplugged and reboot Home Assistant, the connection between HA and the NodeMCU/OTGW stays OK, until I plug my GHM back in. Not sure why this is happening and I am a bit puzzled, because OTmonitor (which also runs from a Docker container on the same host as Home Assistant) doesn't have this issue, it stays connected with the NodeMCU/OTGW (although it is now turned off, because I first that that was blocking HA from connecting to the NodeMCU/OTGW).

@Wummeke
Copy link
Author

Wummeke commented Oct 9, 2018

I had IP reservations on the DHCP server for all the mentioned devices. I have removed the DHCP address reservation for the Google Home Mini on my router and it seems that it solved the issue (for now).

But I have another question: is it possible that the OTGW component reconnects itself after it got disconnected? Or let it at least attempt to reconnect? Now I have to restart HA completely to achieve that.

@mvn23
Copy link
Contributor

mvn23 commented Oct 9, 2018

Automatic reconnection may be added in a future release, but it is quite low on my to-do list. In it's normal config, the gateway is a serial device that may be connected through a serial-to-USB interface, so any disconnect or intermittent connection would be considered an abnormality. Using ser2net is a way to make the setup more versatile, but as you noticed it is definitely not perfect.
For a more stable connection I can only recommend to use a wired network connection rather than wifi. In my production setup I have the gateway connected to a DreamPlug which is forwarding the serial connection over gigabit wired ethernet to my Home Assistant instance. This has been running stable for weeks now.

@Wummeke
Copy link
Author

Wummeke commented Oct 11, 2018

@mvn23 Hopefully you can move it up on your to-do-list a little someday. I don't have the possibility to connect the OTGW wired to my Home Assistant instance unfortunately. Thanks for the time and effort you've put in this component. Too bad I can't use at this moment.

@infernix
Copy link
Contributor

I'll chime in here - I have an OTGW with an ESP8266 using a software serial build I made and have otmonitor talking to it for more than a year. Reasoning for that setup is in the repo description.

I had to build a wrapper script around otmonitor because I suspect otmonitor isn't queueing commands and sometimes HASS will send a command to otmonitor while it's already sending a command to the OTGW and that freezes/breaks otmonitor. So I've also scripted a restart of otmonitor in case it isn't replying to commands that are being sent.

As for the usage in HASS, I don't have a physical thermostat so my OTGW is in "standalone" mode. I send CH=1 if my HASS rules determine that there's a need to heat the house/rooms, and CH=0 when heat demand has been met. I also set the setpoint CS, modulation MM and occasionally hot water setpoint SW. I've also set up a binary sensor that sends periodic PR=A requests to determine whether if the OTGW connection is still alive.

I'd gladly dump otmonitor, but since the OTGW is really a critical component, controlling it has to be reliable and fault-tolerant. Having the heater on for too long has really bad effects :)

@mvn23
Copy link
Contributor

mvn23 commented Oct 14, 2018

@WumsWatchFace this may be related, Google Home devices seem to be affected as well. However, firmware should have been updated automatically already.

@infernix your setup will not be officially supported as you're looking to replace your thermostat rather than control it. However, when I start working on the implementation of services I will take it into consideration while deciding which commands to implement.

@thefoxmangit
Copy link

thefoxmangit commented Oct 17, 2018

I am also running my OTGW with a NodeMCU and am trying to connect it to Home Assistant. From my windows machine, OTmonitor works perfectly. I can ping the NodeMCU from the Hass.io docker instance. However, the component does not work. I have no Google home appliances running.

In the log the following is mentioned:

2018-10-16 22:17:04 WARNING (MainThread) [homeassistant.components.climate] Setup of platform opentherm_gw is taking over 10 seconds.
2018-10-16 22:17:06 ERROR (MainThread) [homeassistant.core] Error doing job: Task exception was never retrieved
Traceback (most recent call last):
  File "/usr/local/lib/python3.6/site-packages/homeassistant/helpers/entity_platform.py", line 352, in _async_add_entity
    await entity.async_added_to_hass()
  File "/usr/local/lib/python3.6/site-packages/homeassistant/components/climate/opentherm_gw.py", line 66, in async_added_to_hass
    await self.gateway.connect(self.hass.loop, self._device)
  File "/usr/local/lib/python3.6/site-packages/pyotgw/pyotgw.py", line 55, in connect
    await self.get_reports()
  File "/usr/local/lib/python3.6/site-packages/pyotgw/pyotgw.py", line 170, in get_reports
    OTGW_DEFAULT_TIMEOUT, loop=self.loop)
  File "/usr/local/lib/python3.6/asyncio/tasks.py", line 362, in wait_for
    raise futures.TimeoutError()
concurrent.futures._base.TimeoutError

@mvn23
Copy link
Contributor

mvn23 commented Oct 17, 2018

@fmslottje I have set up a docker container with hass.io just now for testing and I can't reproduce your issue so far.
It did give me the opportunity to look around in the hass.io environment a bit. Try to connect to your gateway from the command line within the docker container using nc <host> <port>. If that produces output, then please post a part (~20-30 lines) of the output you get. If you don't get any output, even after resetting the gateway, then it is definitely a connection issue (firewall/nat/routing/etc.)

@magin1
Copy link

magin1 commented Oct 18, 2018

@mvn23 Hi, little bit off-topic to the problem but i am looking for a tutorial to what to buy and how to set it up. i want to control my heating with Home Assistant. Could not find it anywhere else and you seem very knowledgeable about this:)

@mvn23
Copy link
Contributor

mvn23 commented Oct 18, 2018

@magin1 please direct your question through the Proper Support Channels™

@thefoxmangit
Copy link

Thanks @mvn23, I have installed the SSH addon and were able to nc, part of the output is as follows (truncated):

B407B0910 A701A0000 T80190000 BC01930B3 T10010A00 BD0010A00 T001B0000 R00740000 BC074FFFF AF01B0000 T000F0000 R80750000 B4075FFFF AF00F0000 T00120000 R80760000 B4076FFFF AF0120000 T80000200 B40000200 T80190000 BC01930B3 T10010A00 BD0010A00 T00110000 BC0110000 T001B0000 R00770000 BC077FFFF AF01B0000 T80190000 BC01930B3 T10010A00 BD0010A00 T80000200 B40000200

@mvn23
Copy link
Contributor

mvn23 commented Oct 18, 2018

Is that the actual format or does it print each message on its own line?
You can use code blocks to preserve formatting by putting a line with three backticks before and after a block of text.

@thefoxmangit
Copy link

sorry, each message on its own line, like this:

T90394B00
B50394B00
T80190000
B4019389E
T10010A00
BD0010A00
T100E0000
BD00E0000
T00090000
R00770000
BC077FFFF
AC0090000
T80640000
R00780000
BC0785268
A40640000
T80000200
B40000200
T80190000
B40193880
T10010A00
BD0010A00
T00110000
BC0110000
T10181407
BD0181407
T80190000
B40193880
T10010A00
BD0010A00
T90100E00
B50100E00
T9014B417
R80790000
B407958C8
A5014B417
T801C0000
B401C37B3
T80000200
B40000200
T80190000
B40193880

@mvn23
Copy link
Contributor

mvn23 commented Oct 18, 2018

Ok, the connection seems to be fine, so the component 'should' work... If I write something up for you to put in custom_components and try with that, could you then provide me a debug log?

@thefoxmangit
Copy link

Sure, many thanks for following this up, greatly appreciated!

@mvn23
Copy link
Contributor

mvn23 commented Oct 21, 2018

@fmslottje please put the contents of this repo in your custom_components folder and post a debug log to Hastebin.
What this does is override the opentherm_gw climate platform in the Home Assistant release with a custom version which loads a more verbose version of the library that connects to the gateway. It is made for Home Assistant v0.80.x, but should work on earlier versions as well.

@silvanverschuur
Copy link

silvanverschuur commented Oct 27, 2018

I've got a similar issue. The opentherm_gw component cannot connect to my OTGW using a TCP socket. I get the following error:

2018-10-27 09:38:51 ERROR (MainThread) [homeassistant.core] Error doing job: Task exception was never retrieved Traceback (most recent call last): File "/srv/homeassistant/lib/python3.6/site-packages/homeassistant/components/opentherm_gw.py", line 75, in connect_and_subscribe await gateway.connect(hass.loop, device_path) File "/srv/homeassistant/lib/python3.6/site-packages/pyotgw/pyotgw.py", line 52, in connect await self.get_reports() File "/srv/homeassistant/lib/python3.6/site-packages/pyotgw/pyotgw.py", line 165, in get_reports ret = await self._wait_for_cmd(cmd, value) File "/srv/homeassistant/lib/python3.6/site-packages/pyotgw/pyotgw.py", line 690, in _wait_for_cmd loop=self.loop) File "/usr/lib/python3.6/asyncio/tasks.py", line 362, in wait_for raise futures.TimeoutError() concurrent.futures._base.TimeoutError

To me this looks like a connection issue. The OTGW uses an ESP module loaded with ESP link. This is working for more than a year now. I'm using a python tool that connects to the ESP and publishes values to a MQTT topic. So I know the OTGW should be accessible using python (socket).

I would love to use the opentherm_gw component to get rid of third party tools. Any help would be appreciated.

@thefoxmangit
Copy link

@mvn23 I have some mixed success now, sometimes it works, sometimes it doesn't, so it's kind of hard to put a finger on the problem.

I found this in the log at a moment in which the climate component didn't load:

2018-10-21 16:11:00 WARNING (MainThread) [homeassistant.loader] You are using a custom component for climate.opentherm_gw which has not been tested by Home Assistant. This component might cause stability problems, be sure to disable it if you do experience issues with Home Assistant.
2018-10-21 16:11:10 WARNING (MainThread) [homeassistant.components.climate] Setup of platform opentherm_gw is taking over 10 seconds.
2018-10-21 16:11:12 ERROR (MainThread) [custom_components.climate.pyotgw.pyotgw] Timed out waiting for command: PR, value: I. Are you connecting to the OpenTherm Gateway?
2018-10-21 16:11:22 ERROR (MainThread) [custom_components.climate.pyotgw.pyotgw] Timed out waiting for command: PR, value: L. Are you connecting to the OpenTherm Gateway?
2018-10-21 16:11:23 ERROR (MainThread) [homeassistant.core] Error doing job: Task exception was never retrieved
Traceback (most recent call last):
  File "/usr/local/lib/python3.6/site-packages/homeassistant/helpers/entity_platform.py", line 352, in _async_add_entity
    await entity.async_added_to_hass()
  File "/config/custom_components/climate/opentherm_gw.py", line 66, in async_added_to_hass
    await self.gateway.connect(self.hass.loop, self._device)
  File "/config/custom_components/climate/pyotgw/pyotgw.py", line 57, in connect
    await self.get_reports()
  File "/config/custom_components/climate/pyotgw/pyotgw.py", line 181, in get_reports
    OTGW_GPIO_A_STATE: int(reports[OTGW_REPORT_GPIO_STATES][0]),
KeyError: 'I'
2018-10-21 16:11:33 WARNING (SyncWorker_16) [netdisco.ssdp] Error fetching description at 192.168.1.102
2018-10-21 16:12:24 ERROR (MainThread) [homeassistant.core] Timer got out of sync. Resetting

btw, my OTGW is on 192.168.1.109 so the error on 192.168.1.102 seems unrelated

@mvn23
Copy link
Contributor

mvn23 commented Oct 27, 2018

Please post a full log with log level set to debug. Run hass --debug to enable debug logging globally or put the following in configuration.yaml to debug only the climate component:

logger:
  logs:
    homeassistant.components.climate: debug

This will be a long log, so please use hastebin or similar.

@silvanverschuur looks like a connection issue indeed. So far I am unable to reproduce any of these issues so it's hard to trace the real problem. Debug logs are welcome though ;)

@silvanverschuur
Copy link

silvanverschuur commented Oct 27, 2018

@mvn23 I've enabled DEBUG logging for the component. This is the log:

2018-10-27 13:22:21 DEBUG (MainThread) [homeassistant.components.climate.opentherm_gw] Added device Thermostat
2018-10-27 13:22:31 ERROR (MainThread) [homeassistant.core] Error doing job: Task exception was never retrieved
Traceback (most recent call last):
  File "/srv/homeassistant/lib/python3.6/site-packages/homeassistant/components/opentherm_gw.py", line 75, in connect_and_subscribe
    await gateway.connect(hass.loop, device_path)
  File "/srv/homeassistant/lib/python3.6/site-packages/pyotgw/pyotgw.py", line 52, in connect
    await self.get_reports()
  File "/srv/homeassistant/lib/python3.6/site-packages/pyotgw/pyotgw.py", line 165, in get_reports
    ret = await self._wait_for_cmd(cmd, value)
  File "/srv/homeassistant/lib/python3.6/site-packages/pyotgw/pyotgw.py", line 690, in _wait_for_cmd
    loop=self.loop)
  File "/usr/lib/python3.6/asyncio/tasks.py", line 362, in wait_for
    raise futures.TimeoutError()
concurrent.futures._base.TimeoutError

After this stacktrace nothing is logged anymore. I've also started hass with default logging set to debug but the results were the same (besides a lot of debug information from other components).

So it looks to me that the component completely stops when the OTGW connection cannot be established.

@mvn23
Copy link
Contributor

mvn23 commented Oct 27, 2018

Sorry, I should have specified. Debug logs with the custom_component from here.

@silvanverschuur
Copy link

silvanverschuur commented Oct 27, 2018

@mvn23 The debug log with the custom component adds one stack trace to the log:

2018-10-27 13:42:08 ERROR (MainThread) [homeassistant.components.climate] Error while setting up platform opentherm_gw
Traceback (most recent call last):
  File "/srv/homeassistant/lib/python3.6/site-packages/homeassistant/helpers/entity_platform.py", line 128, in _async_setup_platform
    SLOW_SETUP_MAX_WAIT, loop=hass.loop)
  File "/usr/lib/python3.6/asyncio/tasks.py", line 358, in wait_for
    return fut.result()
  File "/home/homeassistant/.homeassistant/custom_components/climate/opentherm_gw.py", line 40, in async_setup_platform
    gateway = OpenThermGateway(config)
  File "/home/homeassistant/.homeassistant/custom_components/climate/opentherm_gw.py", line 52, in __init__
    self._device = config[CONF_DEVICE]
KeyError: 'device'

2018-10-27 13:42:18 ERROR (MainThread) [homeassistant.core] Error doing job: Task exception was never retrieved
Traceback (most recent call last):
  File "/srv/homeassistant/lib/python3.6/site-packages/homeassistant/components/opentherm_gw.py", line 75, in connect_and_subscribe
    await gateway.connect(hass.loop, device_path)
  File "/srv/homeassistant/lib/python3.6/site-packages/pyotgw/pyotgw.py", line 52, in connect
    await self.get_reports()
  File "/srv/homeassistant/lib/python3.6/site-packages/pyotgw/pyotgw.py", line 165, in get_reports
    ret = await self._wait_for_cmd(cmd, value)
  File "/srv/homeassistant/lib/python3.6/site-packages/pyotgw/pyotgw.py", line 690, in _wait_for_cmd
    loop=self.loop)
  File "/usr/lib/python3.6/asyncio/tasks.py", line 362, in wait_for
    raise futures.TimeoutError()

@mvn23
Copy link
Contributor

mvn23 commented Oct 27, 2018

I assume you're running 0.81 here? The custom_component was made for 0.80. I will make one for 0.81 later on.
Also, please use code formatting for log dumps by putting the text between lines with three backticks, or even better, use hastebin.

@silvanverschuur
Copy link

Yes, I'm using 0.81. I'll wait for the 0.81 component.

@remkokruize
Copy link

remkokruize commented Oct 29, 2018

I have the same issue

2018-10-29 10:40:20 ERROR (MainThread) [homeassistant.core] Error doing job: Task exception was never retrieved
Traceback (most recent call last):
  File "/usr/src/app/homeassistant/components/opentherm_gw.py", line 75, in connect_and_subscribe
    await gateway.connect(hass.loop, device_path)
  File "/usr/local/lib/python3.6/site-packages/pyotgw/pyotgw.py", line 52, in connect
    await self.get_reports()
  File "/usr/local/lib/python3.6/site-packages/pyotgw/pyotgw.py", line 165, in get_reports
    ret = await self._wait_for_cmd(cmd, value)
  File "/usr/local/lib/python3.6/site-packages/pyotgw/pyotgw.py", line 690, in _wait_for_cmd
    loop=self.loop)
  File "/usr/local/lib/python3.6/asyncio/tasks.py", line 362, in wait_for
    raise futures.TimeoutError()
concurrent.futures._base.TimeoutError

Using this configuration:

logger:
  default: debug
  logs:
    homeassistant.components.opentherm_gw: info
    homeassistant.components.climate: info

opentherm_gw:
  device: socket://192.168.0.159:6638
  climate:
    name: Thermostat
    precision: 0.5
    floor_temperature: False
  monitored_variables:
    - room_setpoint
    - room_temp
    - room_setpoint_ovrd
    - return_water_temp
    - otgw_about

The Opentherm gateway is connected using a nodemcu running ESPEasy which uses ser2net.
It does work correctly with the otmonitor.
I'm running Home Assistant 0.81.1 using Docker on a Synology DS218+

@2xdehelft
Copy link

Yeah exact the same problem here... Same errors, hassio 0.81, OTMonitor.exe does work on ESPEasy with Ser2net....

@thefoxmangit
Copy link

@mvn23 i just run the debug log with custom component on 0.80 (refrained from updating for now).
Hastebin log: https://hastebin.com/huqayebawu.coffeescript

I have also added another wifi access point to my setup (practically next to the OTGW) to make 100% sure that the wifi signal is ok, and at -45db RSSI that should be absolutely fine.

Many thanks for looking into this, much appreciated!

@mvn23
Copy link
Contributor

mvn23 commented Oct 29, 2018

Thanks for the debug log. It is giving me some more insight into what's going on. At a certain point, the gateway stops responding to commands for ~50 seconds, but it keeps on sending reports during this time. Recent versions of the library skip over these unanswered commands with just a log entry, but never recover the missing values. Without a test setup in my dev environment, it will be nearly impossible to track down the exact issue here as there's a whole chain of tools at play (network, serial, etc). One thing I can do about this is add a retry for the failed commands - possibly with a delay - to obtain the missing values, but that feels rather hacky to me.

I'll order a NodeMCU soon to play around a bit, see what I can find.

@mvn23
Copy link
Contributor

mvn23 commented Dec 5, 2018

It looks as if I time my command between two "data-bursts" the response comes back.

This is what I am suspecting as well, that transmitting and receiving at the same time causes problems.
I took some time to implement retry/reconnect logic in the library to work around this. New code can be found at the retry_reconnect branch of pyotgw, so feel free to test. If it's a success I will push an update soon.

@hjbout
Copy link

hjbout commented Dec 5, 2018

so feel free to test. If it's a success I will push an update soon.

I definitely want to but I’m not sure if I have the opportunity coming days/weekend. If I do, I’ll let you know.
Anyway, great that you took the effort to tackle this issue. Appreciate it very much!

@gdschut
Copy link

gdschut commented Dec 6, 2018

What is the way to test? Copy the pyotgw.py, protocol.py and vars.py from github over the existing ones in site-packages/pyotgw folder, and restart HA?

@mvn23
Copy link
Contributor

mvn23 commented Dec 6, 2018

That's what I do when testing for longer periods.

@gdschut
Copy link

gdschut commented Dec 7, 2018

Seems like it works. Got this in my logfile:

Dec 07 02:13:24 pi hass[31520]: 2018-12-07 02:13:24 ERROR (MainThread) [pyotgw.pyotgw] Timed out waiting for command: PR, value: T. Are you connecting to the OpenTherm Gateway?
Dec 07 02:13:31 pi hass[31520]: 2018-12-07 02:13:31 ERROR (MainThread) [pyotgw.pyotgw] Timed out waiting for command: PR, value: W. Are you connecting to the OpenTherm Gateway?
Dec 07 02:13:37 pi hass[31520]: 2018-12-07 02:13:37 ERROR (MainThread) [pyotgw.pyotgw] Timed out waiting for command: PR, value: I. Are you connecting to the OpenTherm Gateway?
Dec 07 02:13:40 pi hass[31520]: 2018-12-07 02:13:40 ERROR (MainThread) [pyotgw.pyotgw] Timed out waiting for command: PR, value: O. Are you connecting to the OpenTherm Gateway?
Dec 07 02:13:40 pi hass[31520]: 2018-12-07 02:13:40 ERROR (MainThread) [pyotgw.pyotgw] Timed out waiting for command: PR, value: A. Are you connecting to the OpenTherm Gateway?
Dec 07 02:13:43 pi hass[31520]: 2018-12-07 02:13:43 ERROR (MainThread) [pyotgw.pyotgw] Timed out waiting for command: PR, value: S. Are you connecting to the OpenTherm Gateway?
Dec 07 02:13:45 pi hass[31520]: 2018-12-07 02:13:45 ERROR (MainThread) [pyotgw.pyotgw] Timed out waiting for command: PR, value: L. Are you connecting to the OpenTherm Gateway?
Dec 07 02:13:46 pi hass[31520]: 2018-12-07 02:13:46 ERROR (MainThread) [pyotgw.pyotgw] Timed out waiting for command: PR, value: W. Are you connecting to the OpenTherm Gateway?
Dec 07 02:13:48 pi hass[31520]: 2018-12-07 02:13:48 ERROR (MainThread) [pyotgw.pyotgw] Timed out waiting for command: PR, value: P. Are you connecting to the OpenTherm Gateway?
Dec 07 02:13:49 pi hass[31520]: 2018-12-07 02:13:49 ERROR (MainThread) [pyotgw.pyotgw] Timed out waiting for command: PR, value: I. Are you connecting to the OpenTherm Gateway?
Dec 07 02:13:54 pi hass[31520]: 2018-12-07 02:13:54 ERROR (MainThread) [pyotgw.pyotgw] Timed out waiting for command: PR, value: R. Are you connecting to the OpenTherm Gateway?
Dec 07 02:13:54 pi hass[31520]: 2018-12-07 02:13:54 ERROR (MainThread) [pyotgw.pyotgw] Timed out waiting for command: PR, value: O. Are you connecting to the OpenTherm Gateway?
Dec 07 02:13:56 pi hass[31520]: 2018-12-07 02:13:55 ERROR (MainThread) [homeassistant.core] Error doing job: Task exception was never retrieved
Dec 07 02:13:56 pi hass[31520]: Traceback (most recent call last):
Dec 07 02:13:56 pi hass[31520]:   File "/usr/lib/python3.5/asyncio/tasks.py", line 239, in _step
Dec 07 02:13:56 pi hass[31520]:     result = coro.send(None)
Dec 07 02:13:56 pi hass[31520]:   File "/home/pi/opt/homeassistant/lib/python3.5/site-packages/pyotgw/pyotgw.py", line 98, in connect
Dec 07 02:13:56 pi hass[31520]:     await self.get_reports()
Dec 07 02:13:56 pi hass[31520]:   File "/home/pi/opt/homeassistant/lib/python3.5/site-packages/pyotgw/pyotgw.py", line 270, in get_reports
Dec 07 02:13:56 pi hass[31520]:     status.update({OTGW_SB_TEMP: float(sb_temp)})
Dec 07 02:13:56 pi hass[31520]: ValueError: could not convert string to float: 'OpenTherm Gateway 4.2.5'

But OTGW is still connected!

@mvn23
Copy link
Contributor

mvn23 commented Dec 7, 2018

That's a very strange error to say the least. It may indicate an unexpected reset of the gateway.
I added some debug logging to the retry_reconnect branch. Could you update and enable debug logging for the library?

logger:
  default: warning
  logs:
    pyotgw.pyotgw: debug
    pyotgw.protocol: debug

@andriej
Copy link
Contributor

andriej commented Dec 7, 2018

I'm testing retry_reconnect too. First observation - it was able to 'reconnect' itself to proper data, when normal restarts of HASS didn't help. Will see how it goes further.

I also really appreciate your work. Would like to have it stable as controller. :-)

@gdschut
Copy link

gdschut commented Dec 8, 2018

Thanks for the update! Now it makes it hard to recognize when there was a connectionproblem, due to the huge amount of debug logs and the system reconnects itself...
How can I find if and when the problem has occured?

@mvn23
Copy link
Contributor

mvn23 commented Dec 8, 2018

It will throw one or more WARNING messages in the log if a command has to be resent due to an erroneous response, or the usual TimeOut if it gets no response at all.
If it detects inactivity on the connection, it will throw DEBUG messages like Reconnecting to serial device on <path>. I just realized maybe it should throw a WARNING there as well...
It will throw an ERROR if it fails to connect.

@andriej
Copy link
Contributor

andriej commented Dec 8, 2018

What showed in log (no debug tho).
It looks like it managed to reconnect, as it seems to still work.

ERROR:
Could not connect to serial device on socket://x.x.x.x:xxxx. Will keep trying. Reported error was: Could not open port socket://x.x.x.x:xxx: timed out

ERROR:
Error doing job: Task was destroyed but it is pending!

@hjbout
Copy link

hjbout commented Dec 19, 2018

Ok, finally have some time to try this, but... Don't know how to copy the 3 files to hassio. Tried to replace the files in /usr/local/lib/python3.6/site-packages/pyotgw while in the container. But after a restart I belief they have been replaced with the original ones. If I do a find from outside the container I see 3 occurences of the files. Twice with a path like /var/lib/docker/overlay2/3d06......8ab9518/diff/usr/local/lib/python3.6/site-packages/pyotgw/ and once with /var/lib/docker/overlay2/3d06......8ab9518/merged/usr/local/lib/python3.6/site-packages/pyotgw. Do I replace any of those files? Anyone care to learn me something ;)

@andriej
Copy link
Contributor

andriej commented Dec 19, 2018

I'm testing it for few days already. Is it possible that while it has connection/reconnection issues, the component makes whole async overloaded with task and other components may also have problems? I think it's causing some kind of hiccups of performance or other sensors refreshing issues which I didn't have before (like darksky or synologydsm on local network)

@mvn23
Copy link
Contributor

mvn23 commented Dec 21, 2018

Is it possible that while it has connection/reconnection issues, the component makes whole async overloaded with task and other components may also have problems?

I'll look into this soon.

@gdschut
Copy link

gdschut commented Dec 22, 2018

I am running with the debug version for a while now:

  • Did not need to restart the OTGW anymore! (From user perspective, the problem is fixed!)
  • Some watchdog actions in journalctl:
pi@pi:~ $ journalctl -b --no-pager | grep -i watchdog
Dec 19 17:17:28 pi hass[14998]: 2018-12-19 17:17:28 DEBUG (MainThread) [pyotgw.protocol] Watchdog triggered!
Dec 19 20:34:40 pi hass[16118]: 2018-12-19 20:34:40 DEBUG (MainThread) [pyotgw.protocol] Watchdog triggered!
Dec 20 14:45:08 pi hass[16118]: 2018-12-20 14:45:08 DEBUG (MainThread) [pyotgw.protocol] Watchdog triggered!
Dec 21 05:19:57 pi hass[25580]: 2018-12-21 05:19:57 DEBUG (MainThread) [pyotgw.protocol] Watchdog triggered!
Dec 21 12:09:06 pi hass[29188]: 2018-12-21 12:09:06 DEBUG (MainThread) [pyotgw.protocol] Watchdog triggered!

Some are relating to times when I was restarting HASS, but others did not.

So what other information do you like to get?

@mvn23
Copy link
Contributor

mvn23 commented Dec 22, 2018

Any warnings, errors and all output from pyotgw would be great. That will be a lot of data, so please use hastebin or similar.

@gdschut
Copy link

gdschut commented Dec 22, 2018

I do not like to share my complete log on the internet, but no problem to send it you. Are you the same mvn23 at Tweakers? I can send you a pm over there.

@mvn23
Copy link
Contributor

mvn23 commented Dec 23, 2018

Are you the same mvn23 at Tweakers?

The very same.

@mvn23 mvn23 mentioned this issue Dec 28, 2018
7 tasks
@ghost ghost added the in progress label Dec 28, 2018
@ghost ghost removed the in progress label Dec 28, 2018
@mvn23
Copy link
Contributor

mvn23 commented Dec 29, 2018

May be reopened due to a bug in the updated lib. Working on a patch.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.