Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Poll UPS [user@localhost] failed - Data stale #15

Closed
JANogueira opened this issue Jan 6, 2020 · 11 comments
Closed

Poll UPS [user@localhost] failed - Data stale #15

JANogueira opened this issue Jan 6, 2020 · 11 comments

Comments

@JANogueira
Copy link

JANogueira commented Jan 6, 2020

Problem/Motivation

After upgrading today to the latest version ( Add-on version: 4213c06), Eaton UPS USB started to get "Data stale" errors. Previous version was working perfectly.

Running hassio on Raspberry pi 4 with 4GBs of RAM (All details on the module log below).

Expected behavior

Work as usual

Actual behavior

Poll UPS [user@localhost] failed - Data stale

Steps to reproduce

Eaton 5E connected through USB.

Configuration:

{
  "users": [
    {
      "username": "admin",
      "password": "passwd_hidden",
      "instcmds": [
        "all"
      ],
      "actions": []
    }
  ],
  "devices": [
    {
      "name": "eaton5e",
      "driver": "usbhid-ups",
      "port": "auto",
      "config": [
        "pollfreq = 5"
      ]
    }
  ],
  "mode": "netserver",
  "shutdown_hassio": "false",
  "i_like_to_be_pwned": "true",
  "list_usb_devices": "false"
}

If you change list_usb_devices, it shows properly.
hassio log:

[s6-init] making user provided files available at /var/run/s6/etc...exited 0.
[s6-init] ensuring user provided files have correct perms...exited 0.
[fix-attrs.d] applying ownership & permissions fixes...
[fix-attrs.d] nut: applying... 
[fix-attrs.d] nut: exited 0.
[fix-attrs.d] done.
[cont-init.d] executing container initialization scripts...
[cont-init.d] 00-banner.sh: executing... 
-----------------------------------------------------------
 Hass.io Add-on: Network UPS Tools
 Manage battery backup (UPS) devices
-----------------------------------------------------------
 Add-on version: 4213c06
 You are running the latest version of this add-on.
 System: HassOS 3.7  (aarch64 / raspberrypi4-64)
 Home Assistant version: 0.103.5
 Supervisor version: 193
-----------------------------------------------------------
 Please, share the above information when looking for help
 or support in, e.g., GitHub, forums or the Discord chat.
-----------------------------------------------------------
[cont-init.d] 00-banner.sh: exited 0.
[cont-init.d] 01-log-level.sh: executing... 
[cont-init.d] 01-log-level.sh: exited 0.
[cont-init.d] nut.sh: executing... 
[22:24:44] INFO: Setting mode to netserver...
[22:24:44] INFO: Connected USB devices:
Bus 002 Device 001: ID 1d6b:0003 Linux Foundation 3.0 root hub
Bus 001 Device 003: ID 0463:ffff MGE UPS Systems UPS
Bus 001 Device 002: ID 2109:3431 VIA Labs, Inc. Hub
Bus 001 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub
[22:24:44] INFO: Generating /etc/nut/upsd.users...
[22:24:44] INFO: Configuring user: admin
[22:24:46] INFO: Configuring Device named eaton5e...
[22:24:46] INFO: Starting the UPS drivers...
Network UPS Tools - UPS driver controller 3.8.0-3259-g81f05ada5a
Network UPS Tools - Generic HID driver 0.41 (3.8.0-3259-g81f05ada5a)
Using subdriver: MGE HID 1.39
USB communication driver 0.33
[cont-init.d] nut.sh: exited 0.
[cont-init.d] nutclient.sh: executing... 
[cont-init.d] nutclient.sh: exited 0.
[cont-init.d] done.
[services.d] starting services
[services.d] done.
[22:24:52] INFO: Starting the UPS information server...
Network UPS Tools upsd 3.8.0-3259-g81f05ada5a
   0.000000	fopen /var/run/nut/upsd.pid: No such file or directory
   0.000696	listening on 0.0.0.0 port 3493
   0.001871	Connected to UPS [eaton5e]: usbhid-ups-eaton5e
[22:24:52] INFO: Starting the UPS monitor and shutdown controller...
Network UPS Tools upsmon 3.8.0-3259-g81f05ada5a
   0.000000	fopen /var/run/upsmon.pid: No such file or directory
   0.001396	Using power down flag file /etc/killpower
   0.002645	UPS: eaton5e@localhost (master) (power value 1)
   0.003780	debug level is '1'
   0.025697	Trying to connect to UPS [eaton5e@localhost]
   0.437283	User upsmonmaster@127.0.0.1 logged into UPS [eaton5e]
   0.044911	Logged into UPS eaton5e@localhost
 212.656081	Data for UPS [eaton5e] is stale - check driver
 215.075504	Poll UPS [eaton5e@localhost] failed - Data stale
 220.076508	Poll UPS [eaton5e@localhost] failed - Data stale
 225.077193	Poll UPS [eaton5e@localhost] failed - Data stale
 230.077656	Poll UPS [eaton5e@localhost] failed - Data stale
 235.078529	Poll UPS [eaton5e@localhost] failed - Data stale
 240.079161	Poll UPS [eaton5e@localhost] failed - Data stale
 245.079796	Poll UPS [eaton5e@localhost] failed - Data stale
 250.080840	Poll UPS [eaton5e@localhost] failed - Data stale
 255.081822	Poll UPS [eaton5e@localhost] failed - Data stale
 260.082700	Poll UPS [eaton5e@localhost] failed - Data stale
 265.083785	Poll UPS [eaton5e@localhost] failed - Data stale
 270.084475	Poll UPS [eaton5e@localhost] failed - Data stale
 275.085014	Poll UPS [eaton5e@localhost] failed - Data stale
 280.085596	Poll UPS [eaton5e@localhost] failed - Data stale
 285.086211	Poll UPS [eaton5e@localhost] failed - Data stale

System log showing all OK with the module startup:

20-01-06 22:24:07 INFO (MainThread) [hassio.addons] Found 10 installed add-ons
20-01-06 22:24:07 INFO (SyncWorker_8) [hassio.docker.interface] Attach to hassioaddons/nut-aarch64 with version 4213c06
20-01-06 22:24:07 INFO (MainThread) [hassio.updater] Fetch update data from https://version.home-assistant.io/stable.json
20-01-06 22:24:07 INFO (MainThread) [hassio.snapshots] Found 3 snapshot files
20-01-06 22:24:07 INFO (MainThread) [hassio.discovery] Load 0 messages
20-01-06 22:24:07 INFO (MainThread) [hassio.ingress] Load 12 ingress session
20-01-06 22:24:07 INFO (MainThread) [hassio.secrets] Load Home Assistant secrets: 19
20-01-06 22:24:07 INFO (MainThread) [__main__] Run Hass.io
20-01-06 22:24:07 INFO (MainThread) [hassio.api] Start API on 172.30.32.2
20-01-06 22:24:07 INFO (MainThread) [hassio.addons] Phase 'initialize' start 0 add-ons
20-01-06 22:24:07 INFO (MainThread) [hassio.addons] Phase 'system' start 1 add-ons
20-01-06 22:24:07 INFO (SyncWorker_11) [hassio.docker.interface] Clean addon_40817795_nut application
20-01-06 22:24:09 INFO (SyncWorker_11) [hassio.docker.addon] Start Docker add-on hassioaddons/nut-aarch64 with version 4213c06
20-01-06 22:24:42 INFO (MainThread) [hassio.api.security] /host/info access from 40817795_nut
@addons-assistant
Copy link

👋 Thanks for opening your first issue here! If you're reporting a 🐛 bug, please make sure you include steps to reproduce it. Also, logs, error messages and information about your hardware might be useful.

@sinclairpaul
Copy link
Member

Have a look at the issue at #14

The commits yesterday were just for doc updates.

@JANogueira
Copy link
Author

OK, I am checking. I'll provide feedback in a couple of minutes.

@JANogueira
Copy link
Author

JANogueira commented Jan 6, 2020

Yep! solved the problem. Thanks.
Added upsd_maxage of 60 seconds. Seems that made the trick.

  "devices": [
    {
      "name": "eaton5e",
      "driver": "usbhid-ups",
      "port": "auto",
      "config": [
        "pollfreq = 5"
      ]
    }
  ],
  "mode": "netserver",
  "shutdown_hassio": "false",
  "i_like_to_be_pwned": "true",
  "list_usb_devices": "false",
  "upsd_maxage": "60"

@JANogueira JANogueira reopened this Jan 7, 2020
@JANogueira
Copy link
Author

JANogueira commented Jan 7, 2020

Good morning,

After overnight monitoring, found that a lot of new "Data stale" errors occured and it seems that there is really a problem while detecting the UPS.

Here is the latest log after module restart:

[s6-init] making user provided files available at /var/run/s6/etc...exited 0.
[s6-init] ensuring user provided files have correct perms...exited 0.
[fix-attrs.d] applying ownership & permissions fixes...
[fix-attrs.d] nut: applying... 
[fix-attrs.d] nut: exited 0.
[fix-attrs.d] done.
[cont-init.d] executing container initialization scripts...
[cont-init.d] 00-banner.sh: executing... 
-----------------------------------------------------------
 Hass.io Add-on: Network UPS Tools
 Manage battery backup (UPS) devices
-----------------------------------------------------------
 Add-on version: 4213c06
 You are running the latest version of this add-on.
 System: HassOS 3.7  (aarch64 / raspberrypi4-64)
 Home Assistant version: 0.103.6
 Supervisor version: 193
-----------------------------------------------------------
 Please, share the above information when looking for help
 or support in, e.g., GitHub, forums or the Discord chat.
-----------------------------------------------------------
[cont-init.d] 00-banner.sh: exited 0.
[cont-init.d] 01-log-level.sh: executing... 
[cont-init.d] 01-log-level.sh: exited 0.
[cont-init.d] nut.sh: executing... 
[08:33:57] INFO: Setting mode to netserver...
[08:33:57] INFO: Connected USB devices:
Bus 002 Device 001: ID 1d6b:0003 Linux Foundation 3.0 root hub
Bus 001 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub
[08:33:58] INFO: Generating /etc/nut/upsd.users...
[08:33:58] INFO: Configuring user: admin
[08:34:00] INFO: Configuring Device named eaton5e...
[08:34:00] INFO: Starting the UPS drivers...
Network UPS Tools - UPS driver controller 3.8.0-3259-g81f05ada5a
Network UPS Tools - Generic HID driver 0.41 (3.8.0-3259-g81f05ada5a)
USB communication driver 0.33
No matching HID UPS found
Driver failed to start (exit status=1)
[cont-init.d] nut.sh: exited 1.
[cont-finish.d] executing container finish scripts...
[cont-finish.d] 99-message.sh: executing... 
-----------------------------------------------------------
                Oops! Something went wrong.

 We are so sorry, but something went terribly wrong when
 starting or running this add-on.
 
 Be sure to check the log above, line by line, for hints.
-----------------------------------------------------------
[cont-finish.d] 99-message.sh: exited 0.
[cont-finish.d] done.
[s6-finish] waiting for services.
[s6-finish] sending all processes the TERM signal.

If seems that the USB UPS is not even detected and the system does not start.
If I reboot the system, all is good (restarting hassio is not enough).

When going to the hassio SSH console, the USB devices are not listed:


  _    _                 _
 | |  | |               (_)
 | |__| | __ _ ___ ___   _  ___
 |  __  |/ _` / __/ __| | |/ _ \
 | |  | | (_| \__ \__ \_| | (_) |
 |_|  |_|\__,_|___/___(_)_|\___/



Our Cli:
$ hassio help
➜  ~ lsusb
Bus 001 Device 001: ID 1d6b:0002
Bus 002 Device 001: ID 1d6b:0003
➜  ~ 

hassio dmesg:

[  303.329705] xhci_hcd 0000:01:00.0: xHCI host not responding to stop endpoint command.                                                                                                                                                 
[  303.345777] xhci_hcd 0000:01:00.0: Host halt failed, -110                                                                                                                                                                             
[  303.345791] xhci_hcd 0000:01:00.0: xHCI host controller not responding, assume dead                                                                                                                                                   
[  303.345841] xhci_hcd 0000:01:00.0: HC died; cleaning up                                                                                                                                                                               
[  303.345984] usb 1-1: USB disconnect, device number 2                                                                                                                                                                                  
[  303.346009] usb 1-1.3: USB disconnect, device number 3                                                                                                                                                                                

This possibly has nothing to do with your add-on. Most likely to be a problem with hassio, right?

Thanks

@JANogueira
Copy link
Author

JANogueira commented Jan 7, 2020

So, if I do not enable the addon, the USB devices remain connected and I don't get any xHCI error that leads to the USB loss.
Also noticed that the pollfreq is related with this problem. So continuous driver queries lead to xHCI bus crash. If I increase this number to 15 seconds, the problem is mitigated, but creates a delay on the UPS status update. If I remove it (defaults to 30 seconds), then all is good, but UPS status update is much less frequent and subverts the objective of having the poll_freq parameter.

@sinclairpaul, not sure if Network UPS Tools is the "guilty" or if hassio host OS should be more solid about frequent driver requests. What is your opinion?

Thanks in advance.

@sinclairpaul
Copy link
Member

sinclairpaul commented Jan 7, 2020

In all honesty I don't know, from looking around there is also a possibility that it could be hardware related (specifically around the length of the USB cable, I also see some people have had success with placing a hub in the path).

Unfortunately I cannot validate on HassOS easily, I can tell you I do not see the issues personally, although it is a different OS and UPS. I also know Dale runs multiple UPS' without these issues as well.

I would also look the section on polling in the driver specific doc at https://github.com/networkupstools/nut/blob/master/docs/man/usbhid-ups.txt

I'm not sure pushing the driver polling so low is recommended.

GitHub
The Network UPS Tools repository. Contribute to networkupstools/nut development by creating an account on GitHub.

@JANogueira
Copy link
Author

Thanks for your feedback. I reverted back poll interval to 30 (just removed it), and the upsd_maxage was kept configured as 60 seconds, but upsd_maxage here has no effect because it keeps crashing the xHCI bus, leading to USB connectivity loss.

If I unplug and plug back the UPS, it will not show up on the USB devices. So only a Host OS reboot will restore USB functionality again. This seems to be something related with hass.io host OS problem at kernel level possibly.

But this is odd... it worked just fine for 2 days and then this...
Tried now to uninstall the add-on and install it back again, but same thing happened. I will check with other cable just in case.

Anyway, I already opened a ticket on hassio github. I am closing this one as it seems to be not related with Network UPS Tool addon.
Thanks for your tips.
Cheers

@JANogueira
Copy link
Author

JANogueira commented Jan 7, 2020

Some notes.

The problem you mentioned:
networkupstools/nut#515

The ticket on Hassio:
home-assistant/operating-system#526

EDIT:
changed USB cable by a brand new. Same behavior.
Added USB hub in between, same behavior.

@JANogueira
Copy link
Author

JANogueira commented Jan 15, 2020

Just to share an update on this topic.
rpi4b 0137ad eeprom solved the issue. USB devices are all stable and a small increase on the sdcard I/O is seen as well.

More information here:
https://www.raspberrypi.org/forums/viewtopic.php?t=260879

@addons-assistant
Copy link

This thread has been automatically locked because it has not had recent activity. Please open a new issue for related bugs and link to relevant comments in this thread.

@addons-assistant addons-assistant bot locked as resolved and limited conversation to collaborators Feb 14, 2020
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants