Alarm Decoder connections via socket or serial stop responding randomly #11157

Closed
mririgoyen opened this issue Dec 15, 2017 · 32 comments · Fixed by #11168

Comments

@mririgoyen
Contributor

mririgoyen commented Dec 15, 2017

Home Assistant release (hass --version): 0.59.2

Python release (python3 --version): 3.6

Component/platform: Alarm Decoder

Description of problem:
You can see the history of this issue here: https://community.home-assistant.io/t/alarm-decoder-stops-working-after-a-couple-days/18277

Basically, Alarm Decoder configured with a serial connection works for short periods of time. Eventually, Home Assistant can no longer communicate with the device. Restarting Home Assistant fixes the issue. While Home Assistant is unable to communicate with the device, a terminal session on the serial device shows that it is still responding, and the alarm can be controlled via terminal commands.

After several people suggested using ser2sock, an application released by the company that makes Alarm Decoder, I created a Hass.io add-on that runs ser2sock to verify that this is not a serial device issue. With ser2sock running and Home Assistant configured to communicate with the device via socket, the connection again works only for short periods of time. When the connection is no longer active within Home Assistant, hitting the socket directly still works and the alarm can be controlled. Restarting Home Assistant reestablishes the connection.

Expected:
The connection, whether serial or socket, stays up. If an error occurs, Home Assistant detects that it can no longer communicate with the serial or socket device and reconnects to it. Because this is an alarm system, it is imperative that communication with it is always available in case of emergency.

Problem-relevant configuration.yaml entries and steps to reproduce:
Serial configuration:

alarmdecoder:
  device:
    type: serial
    path: /dev/ttyAMA0
  panel_display: On
  zones: !include alarm_zones.yaml

Socket configuration:

alarmdecoder:
  device:
    type: socket
    host: local-ser2sock
    port: 8100
  panel_display: On
  zones: !include alarm_zones.yaml

Traceback (if applicable):
Serial error:

17-05-22 01:20:41 ERROR (MainThread) [homeassistant.core] Error doing job: Task exception was never retrieved
Traceback (most recent call last):
  File "/home/homeassistant/.homeassistant/deps/alarmdecoder/devices.py", line 803, in write
    self._device.write(data)
  File "/home/homeassistant/.homeassistant/deps/serial/serialposix.py", line 490, in write
    if not self._isOpen: raise portNotOpenError
serial.serialutil.SerialException: Attempting to use a port that is not open

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/lib/python3.4/asyncio/tasks.py", line 237, in _step
    result = next(coro)
  File "/srv/homeassistant/homeassistant_venv/lib/python3.4/site-packages/homeassistant/core.py", line 1015, in _event_to_service_call
    yield from service_handler.func(service_call)
  File "/srv/homeassistant/homeassistant_venv/lib/python3.4/site-packages/homeassistant/components/alarm_control_panel/__init__.py", line 109, in async_alarm_service_handler
    yield from getattr(alarm, method)(code)
  File "/usr/lib/python3.4/asyncio/coroutines.py", line 141, in coro
    res = func(*args, **kw)
  File "/srv/homeassistant/homeassistant_venv/lib/python3.4/site-packages/homeassistant/components/alarm_control_panel/alarmdecoder.py", line 119, in async_alarm_arm_home
    self.hass.data[DATA_AD].send("{!s}3".format(code))
  File "/home/homeassistant/.homeassistant/deps/alarmdecoder/decoder.py", line 260, in send
    self._device.write(data)
  File "/home/homeassistant/.homeassistant/deps/alarmdecoder/devices.py", line 809, in write
    raise CommError('Error writing to device.', err)
alarmdecoder.util.CommError: ('Error writing to device.', SerialException('Attempting to use a port that is not open',))

Socket error:

Error doing job: Task exception was never retrieved
Traceback (most recent call last):
  File "/usr/lib/python3.6/site-packages/alarmdecoder/devices.py", line 1124, in write
    data_sent = self._device.send(data)
OSError: [Errno 9] Bad file descriptor

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/lib/python3.6/asyncio/tasks.py", line 179, in _step
    result = coro.send(None)
  File "/usr/lib/python3.6/site-packages/homeassistant/core.py", line 1031, in _event_to_service_call
    yield from service_handler.func(service_call)
  File "/usr/lib/python3.6/site-packages/homeassistant/components/alarm_control_panel/__init__.py", line 142, in async_alarm_service_handler
    yield from getattr(alarm, method)(code)
  File "/usr/lib/python3.6/asyncio/coroutines.py", line 210, in coro
    res = func(*args, **kw)
  File "/usr/lib/python3.6/site-packages/homeassistant/components/alarm_control_panel/alarmdecoder.py", line 116, in async_alarm_arm_home
    self.hass.data[DATA_AD].send("{!s}3".format(code))
  File "/usr/lib/python3.6/site-packages/alarmdecoder/decoder.py", line 261, in send
    self._device.write(data)
  File "/usr/lib/python3.6/site-packages/alarmdecoder/devices.py", line 1132, in write
    raise CommError('Error writing to device.', err)
alarmdecoder.util.CommError: ('Error writing to device.', OSError(9, 'Bad file descriptor'))

@MartinHjelmare
Member

The alarmdecoder component and platforms appear to be implemented incorrectly: they use the Home Assistant asyncio API, which the alarmdecoder library doesn't support. That probably causes a lot of issues.
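
As a generic illustration of the mismatch described here (not the change made in any PR on this issue): a blocking call made directly inside a coroutine runs on the event loop thread. A common asyncio pattern is to hand such a call to an executor instead. In this sketch, `blocking_send` is a hypothetical stand-in for a blocking library write and is not part of the alarmdecoder API.

```python
import asyncio
import threading


def blocking_send(data):
    """Hypothetical stand-in for a blocking device write.

    Returns the name of the thread it ran on so the caller can see
    where the work actually happened.
    """
    return threading.current_thread().name, data


async def send_from_coroutine(data):
    loop = asyncio.get_running_loop()
    # run_in_executor runs blocking_send on a worker thread from the
    # default executor, so the event loop thread is not blocked.
    return await loop.run_in_executor(None, blocking_send, data)


def demo():
    return asyncio.run(send_from_coroutine("12343"))
```

Calling `demo()` returns the worker thread's name together with the data, which shows that the blocking call did not run on the main (event loop) thread.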

@PhracturedBlue
Contributor

I added a PR to convert the component to synchronous. I didn't convert the platforms since I don't think it should be necessary(?). I cannot verify that this specific issue is fixed by the PR though since I have never experienced this issue of dropped connections.

@mririgoyen
Contributor Author

@PhracturedBlue I will incorporate your changes into a custom component this evening so I can see if the problem still happens. It'll likely take a couple of days to determine if the problem is fixed. In the past, I have seen the connection drop after as little as 12 hours and as long as 3 days.

@PhracturedBlue
Contributor

Thanks. I have done the same to try to ensure there are no regressions.

@mririgoyen
Contributor Author

I'm not sure where my last comment went. This issue should not be closed. The PR linked here has not yet been confirmed to resolve this issue and it is still speculation that the reported issue was indeed caused by async behavior.

@pvizeli Please reopen until I can confirm this issue is resolved.

@PhracturedBlue
Contributor

PhracturedBlue commented Dec 17, 2017

By the way, do you use home-assistant to turn your alarm on/off? If so, this patch may not fully fix the issue. Martin made the point that using an async connection when sending info to the alarm is incorrect, and my patch needs to be fixed for that. My 2nd attempt didn't go so well, so I don't have a satisfactory patch ready. While the 2nd iteration was accepted, testing with the 1st iteration of the patch should be sufficient to verify that fixing the async calls is enough, so please continue your current testing.

@mririgoyen
Contributor Author

@PhracturedBlue You were correct in your assumption. With the changes from your 1st iteration in place, I am unable to toggle my alarm on or off this morning. After restarting Home Assistant, I regained control. Looks like it is not 100% fixed yet.

@PhracturedBlue
Contributor

PhracturedBlue commented Dec 17, 2017

Pull the fix from the dev branch (or wait for 0.60) and try it again. The 2nd half of the fix is in place there now.

@mririgoyen
Contributor Author

0.60.0 is installed and running. I'll check back in in a few days to report if the problem seems to have been resolved with the changes.

@PhracturedBlue
Contributor

Have you had any issues with 0.60?
I have seen some issues with things not updating for a while and then starting to work again (without restarting hass). But I don't know that it is the same issue.

@mririgoyen
Contributor Author

So far, so good. Admittedly, I have restarted Home Assistant a few times while working on #11271, but I'm done with that work now and probably won't be touching anything over the holiday. The next few days will be the ultimate test!

@mririgoyen
Contributor Author

mririgoyen commented Dec 26, 2017

Here to report, one of my automations just failed to turn the alarm on. Went to the logs...

alarmdecoder.util.CommError: ('Error writing to device.', SerialException('Attempting to use a port that is not open',))

:(

Not being super familiar with Python, is there any way to incorporate a retry mechanism so that it forces a reconnect a couple times before failing?
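
A minimal sketch of the kind of retry wrapper being asked about, assuming the caller can supply its own `send` and `reconnect` callables. `CommError` here is a local stand-in for `alarmdecoder.util.CommError`, and `send_with_retry`, `FlakyDevice`, and `reconnect` are hypothetical names used only for illustration; none of this is the library's or Home Assistant's actual code.

```python
import time


class CommError(Exception):
    """Local stand-in for alarmdecoder.util.CommError (assumption)."""


def send_with_retry(send, reconnect, data, attempts=3, delay=0.0):
    """Call send(data); on CommError, reconnect and try again.

    Re-raises the last CommError once all attempts are used up.
    """
    last_err = None
    for attempt in range(attempts):
        try:
            return send(data)
        except CommError as err:
            last_err = err
            if attempt < attempts - 1:
                time.sleep(delay)
                reconnect()
    raise last_err


class FlakyDevice:
    """Fake device that rejects writes until reconnect() is called."""

    def __init__(self):
        self.is_open = False
        self.sent = []
        self.reconnects = 0

    def reconnect(self):
        self.reconnects += 1
        self.is_open = True

    def send(self, data):
        if not self.is_open:
            raise CommError('Error writing to device.')
        self.sent.append(data)
        return len(data)
```

With `dev = FlakyDevice()`, calling `send_with_retry(dev.send, dev.reconnect, "12343")` fails on the first write, reconnects once, and then succeeds on the second attempt.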

@billimek

billimek commented Dec 30, 2017

I've been experiencing and following the same issue and want to chime in so @goyney doesn't feel alone in this plight.

Unlike @goyney, I've been running the ser2sock as a separate containerized deployment.

Whenever the ser2sock container restarts, HASS will lose connection with the alarmdecoder device and throw the alarmdecoder.util.CommError: ('Error writing to device.', OSError(9, 'Bad file descriptor')) error as described above. The only remedy is to restart home assistant.

This seems to be in line with what @goyney observes when restarting the hassio-addon-ser2sock add-on, which raises the question of enabling some sort of retry mechanism in Home Assistant.


In addition to the socket restart issue, I've also experienced situations where the connection between Home Assistant and the alarmdecoder device appears to get 'lost' after some period of time, even when ser2sock is not restarted. I'm hopeful that #11168 fixes this latter issue and will report back now that I've reduced the chance of ser2sock restarts by removing it from watchtower and upgraded HASS to v0.60.
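
For context on the error text itself: on POSIX systems, `OSError: [Errno 9] Bad file descriptor` is what Python raises when code writes to a socket object that has already been closed locally. The sketch below reproduces that with a plain `socket.socketpair()`. Whether this is exactly what happens inside the alarmdecoder library after a ser2sock restart is an assumption, not something confirmed in this thread; `send_on_closed_socket` is a hypothetical helper name.

```python
import errno
import socket


def send_on_closed_socket():
    """Return the errno raised by sending on a locally closed socket."""
    a, b = socket.socketpair()
    try:
        a.close()  # simulate a connection object that was already torn down
        try:
            a.send(b"12343")
        except OSError as err:
            return err.errno
        return None
    finally:
        b.close()
```

On Linux and macOS the returned value equals `errno.EBADF`. Once a socket object is in this state it cannot recover on its own, so a new connection has to be opened.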

@MartinHjelmare
Member

The reconnect logic, if needed, should be implemented in the alarmdecoder library.

@mririgoyen
Contributor Author

That's a bit out of my wheelhouse. Unfortunately, I don't have the time in the short term to dive into it myself because I'm getting married in a week. Anyone wanna volunteer for that?

@PhracturedBlue
Contributor

I have addressed the reconnect issue in home-assistant here:
#11383

I think this could be more difficult to fix in alarmdecoder as it has no timer based mechanism to deal with retries, but who knows. I expect my pull request will be blocked based on Martin's comments, but I am providing it anyway to help anyone who actually wants a solution sooner rather than later.
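
To make "timer based mechanism" concrete, here is a minimal sketch of a reconnect loop built on `threading.Timer`. It is not the code from #11383: `ReconnectingLink`, `open_fn`, and `retry_seconds` are hypothetical names, and the sketch assumes that a failed open raises `OSError`.

```python
import threading


class ReconnectingLink:
    """Hypothetical wrapper that retries open_fn on a timer until it works."""

    def __init__(self, open_fn, retry_seconds=5.0):
        self._open_fn = open_fn
        self._retry_seconds = retry_seconds
        self._timer = None
        self.connected = threading.Event()
        self.attempts = 0

    def start(self):
        self._try_open()

    def on_connection_lost(self):
        # Called by the owner when a read or write fails.
        self.connected.clear()
        self._schedule()

    def _schedule(self):
        self._timer = threading.Timer(self._retry_seconds, self._try_open)
        self._timer.daemon = True
        self._timer.start()

    def _try_open(self):
        self.attempts += 1
        try:
            self._open_fn()
        except OSError:
            self._schedule()  # still down: try again later
        else:
            self.connected.set()

    def stop(self):
        if self._timer is not None:
            self._timer.cancel()
```

A caller would pass a function that opens the device as `open_fn`, call `start()`, and call `on_connection_lost()` from its error handler. The fixed retry interval keeps the sketch short; a real implementation might prefer backoff.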

@billimek

billimek commented Dec 30, 2017

Does this change to the alarmdecoder library from 2017-11-05 (not yet published to PyPI) possibly fix this issue?

If so, then once that change to the AlarmDecoder library is published and a new release is cut, we'll need to update https://github.com/home-assistant/home-assistant/blob/master/requirements_all.txt#L84 appropriately.

@PhracturedBlue
Contributor

No. That probably helps with a clean reconnect, but it does not actually cause a reconnect to occur.

@MartinHjelmare
Member

But it sounds as if #11383 depends on nutechsoftware/alarmdecoder#17. Am I right?

@PhracturedBlue
Contributor

I can't test all configurations, but at least when using ser2sock, nutechsoftware/alarmdecoder#17 is not required.

@PhracturedBlue
Contributor

FYI, I did make a request to add alarmdecoder 0.13.1 to pypi so we can get that update:
nutechsoftware/alarmdecoder#18

@PhracturedBlue
Contributor

The AD maintainer created a new release (1.13.2) with the fix from nutechsoftware/alarmdecoder#17.
I have updated my pull request to use this version. While it doesn't change any behavior in my setup, my understanding is that it should be more robust depending on configuration.

@mririgoyen
Contributor Author

@PhracturedBlue Will the bump to 1.13.2 make it into the 0.61.0 release?

@PhracturedBlue
Contributor

I do not have any control over that. The pull request was approved by Martin, but balloob and fabaff seem to make the final decisions about what goes in.

@MartinHjelmare
Member

Merged #11383 now.

@dnschneid

I haven't dug into it much, but just chiming in that I also have this issue with the direct serial connection, and 0.60.1 doesn't fix it.

@MartinHjelmare
Member

The PR linked above will be included in 0.61, i.e. it is not in the 0.60.1 release.

@dnschneid

Whoops, I had assumed that was only for the socket interface.
The async fixes were the ones I was testing. Perhaps I'll apply #11383 manually and see.

@dnschneid

It's still working, 6 days in (0.60.1 with #11383 applied). Yay.

@balloobbot

There hasn't been any activity on this issue recently. Due to the high number of incoming GitHub notifications, we have to clean some of the old issues, as many of them have already been resolved with the latest updates.

Please make sure to update to the latest Home Assistant version and check if that solves the issue. Let us know if that works for you by adding a comment 👍

@dnschneid

The reconnect patch has been working great for me. If we're content with that workaround, I would vote for this to be marked as fixed.

@mririgoyen
Contributor Author

This problem has been resolved.

@home-assistant home-assistant locked and limited conversation to collaborators Jul 26, 2018
6 participants