Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Backend shutdown after bad data from linux command input read #682

Closed
BobDDole opened this issue Aug 30, 2019 · 14 comments
Closed

Backend shutdown after bad data from linux command input read #682

BobDDole opened this issue Aug 30, 2019 · 14 comments

Comments

@BobDDole
Copy link

Mycodo Issue Report:

  • Specific Mycodo Version:
    7.6.3

Problem Description

Please list:
Backend Daemon stops randomly due to "bad reads"

AND/OR

curl -s 'http://10.10.0.88/cm?cmnd=%20Status%2011' | jq -r '.StatusSTS' | jq -r '.POWER2'| grep -c ON

  • specific setup details that are involved
    Parsing data from a tasmota sonoff module to get it's power status

Errors

  • List any errors you encountered.
    Backend just stops after getting no data from tasmota module or getting corrupt data from it

  • Copy and pasting crash logs, or link to any specific
    code lines you've isolated (please use GitHub permalinks for this)

2019-08-30 15:45:07,288 - ERROR - mycodo.inputs.linux_command_c419a78e - The command returned a non-numerical value. Ensure only one numerical value is returned by the command. Value returned: 'b'''
2019-08-30 15:45:07,292 - ERROR - mycodo.inputs.linux_command_f3048a36 - InputModule raised an exception when taking a reading: 'NoneType' object has no attribute 'isalpha'

Steps to Reproduce the issue:

How can this issue be reproduced?

  1. Create a linux/bash command input and use the command string or strings listed above
  2. Since you don't have a device on the network on that address nothing more needed just wait for the above error or errors and Daemon shut down.
    (I'm polling same sonoff/tasmota device but two power states from it)

Additional Notes

Is there anything that should be added to make it easier
to address this issue?
Seems like 7.6.3 wants to error and shut down on corrupt values from a linux/bash command call whereas the older 7.4 would ignore errors and continue on it's way.
This is completely unusable asis since it kills the server on any "bad/missing" data.

@kizniche
Copy link
Owner

Have you tried putting your commands in a bash script and executing the script with Mycodo?

@kizniche
Copy link
Owner

kizniche commented Aug 30, 2019

Backend Daemon stops randomly due to "bad reads"

The two exceptions you pasted shouldn't stop the daemon. In fact, those exceptions are handled by the Input module and are merely warnings. Do you have log lines demonstrating a deamon malfunction?

@kizniche
Copy link
Owner

kizniche commented Aug 30, 2019

I've added more concise exception-handling, but I'm still skeptical this is causing a daemon crash, or that if the daemon is crashing, this is the cause. Also in the commit moments ago, I changed the log line from error to debug, so you'll need to set Log Level Debug in the Input settings to see them. That should reduce the clutter in the daemon log.

@BobDDole
Copy link
Author

BobDDole commented Aug 30, 2019

Not sure what is happening but with those two linux commands deactivated I have had no daemon crashes.
Here is a cut from my log:

2019-08-30 15:44:28,925 - INFO - mycodo.daemon - Mycodo daemon v7.6.3 starting
2019-08-30 15:44:29,668 - INFO - mycodo.daemon - Anonymous statistics enabled
2019-08-30 15:44:29,711 - INFO - mycodo.daemon - Starting rpyc server
2019-08-30 15:44:30,698 - INFO - mycodo.controller_output - Output controller activated in 272.4 ms
2019-08-30 15:44:31,200 - INFO - mycodo.daemon - Output controller fully started
2019-08-30 15:44:31,201 - INFO - mycodo.daemon - All activated Conditional controllers started
2019-08-30 15:44:31,202 - INFO - mycodo.daemon - All activated Trigger controllers started
2019-08-30 15:44:32,274 - INFO - mycodo.controller_input_d9d5c81e - Activated in 890.1 ms
2019-08-30 15:44:33,629 - INFO - mycodo.controller_input_1b96b61f - Activated in 1168.2 ms
2019-08-30 15:44:34,524 - INFO - mycodo.controller_input_f3048a36 - Activated in 706.7 ms
2019-08-30 15:44:35,650 - INFO - mycodo.controller_input_c419a78e - Activated in 704.3 ms
2019-08-30 15:44:35,652 - INFO - mycodo.daemon - All activated Input controllers started
2019-08-30 15:44:35,653 - INFO - mycodo.daemon - All activated Math controllers started
2019-08-30 15:44:35,664 - INFO - mycodo.daemon - All activated PID controllers started
2019-08-30 15:44:35,666 - INFO - mycodo.daemon - All activated LCD controllers started
2019-08-30 15:44:36,168 - INFO - mycodo.daemon - Mycodo daemon started in 7.242 seconds
2019-08-30 15:44:36,173 - INFO - mycodo.daemon - 48.62 MB RAM in use
2019-08-30 15:45:07,288 - ERROR - mycodo.inputs.linux_command_c419a78e - The command returned a non-numerical value. Ensure only one numerical value is returned by the command. Value returned: 'b'''
2019-08-30 15:45:07,292 - ERROR - mycodo.inputs.linux_command_f3048a36 - InputModule raised an exception when taking a reading: 'NoneType' object has no attribute 'isalpha'
2019-08-30 15:45:07,294 - INFO - mycodo.daemon - All LCD controllers stopped
2019-08-30 15:45:07,295 - INFO - mycodo.daemon - All PID controllers stopped
2019-08-30 15:45:07,295 - INFO - mycodo.daemon - All Math controllers stopped
2019-08-30 15:45:07,342 - INFO - mycodo.controller_input_1b96b61f - Deactivated in 48.0 ms
2019-08-30 15:45:07,379 - INFO - mycodo.controller_input_d9d5c81e - Deactivated in 85.3 ms
2019-08-30 15:45:07,391 - INFO - mycodo.controller_input_c419a78e - Deactivated in 97.4 ms
2019-08-30 15:45:07,394 - INFO - mycodo.controller_input_f3048a36 - Deactivated in 101.1 ms
2019-08-30 15:45:07,396 - INFO - mycodo.daemon - All Input controllers stopped
2019-08-30 15:45:07,396 - INFO - mycodo.daemon - All Trigger controllers stopped
2019-08-30 15:45:07,397 - INFO - mycodo.daemon - All Conditional controllers stopped

It doesn't seem to crash, just shut down.

linux command config

This is how I have the command configured. If it helps any.

@kizniche
Copy link
Owner

Have you tried moving the command into a bash script and making the command the execution of the script?

Try making the edits from this commit, restart the daemon, and see if it fixes the issue: 2340fef

@BobDDole
Copy link
Author

BobDDole commented Aug 30, 2019

Just tried putting it into a bash script and executing via script call.
Same result.

2019-08-30 18:49:15,149 - ERROR - mycodo.inputs.linux_command_f3048a36 - The command returned a non-numerical value. Ensure only one numerical value is returned by the command. Value returned: 'b'''
2019-08-30 18:50:48,087 - ERROR - mycodo.inputs.linux_command_f3048a36 - InputModule raised an exception when taking a reading: 'NoneType' object has no attribute 'isalpha'
2019-08-30 18:50:48,088 - INFO - mycodo.daemon - All LCD controllers stopped
2019-08-30 18:50:48,092 - INFO - mycodo.daemon - All PID controllers stopped
2019-08-30 18:50:48,093 - INFO - mycodo.daemon - All Math controllers stopped
2019-08-30 18:50:48,108 - INFO - mycodo.controller_input_1b96b61f - Deactivated in 20.0 ms
2019-08-30 18:50:48,115 - INFO - mycodo.controller_input_d9d5c81e - Deactivated in 26.4 ms
2019-08-30 18:50:48,191 - INFO - mycodo.controller_input_f3048a36 - Deactivated in 103.0 ms
2019-08-30 18:50:48,193 - INFO - mycodo.daemon - All Input controllers stopped
2019-08-30 18:50:48,194 - INFO - mycodo.daemon - All Trigger controllers stopped
2019-08-30 18:50:48,194 - INFO - mycodo.daemon - All Conditional controllers stopped

@kizniche
Copy link
Owner

Have you tried both recommendations?

@BobDDole
Copy link
Author

Yes, I made the commit changes and all I get is a 503 Bad Gateway error. I can't get the web interface up until I reverse the changes.

@kizniche
Copy link
Owner

kizniche commented Sep 3, 2019

An error would indicate you made a typo in the manual edit of the files. Make sure you use only spaces (no tabs).

@BobDDole
Copy link
Author

BobDDole commented Sep 7, 2019

Yes it was the "tabs" that did me in. Wasn't aware python saw a difference between tabs and spaces.

I did the commits and now have no 503 error anymore but I have the similar shutdown issues but at this point it happens only randomly instead of every time also only seems to happen when "Error:"
gets returned as "None" instead of "0" or "1" if that helps any.

here's my latest log:

2019-09-07 21:56:58,066 - DEBUG - mycodo.inputs.linux_command_727d9bd8 - Command being executed: curl -s 'http://10.10.0.53/cm?cmnd=%20Status%2011' | jq -r '.StatusSTS' | jq -r '.POWER1'| grep -c ON
2019-09-07 21:56:58,229 - DEBUG - mycodo.inputs.linux_command_727d9bd8 - Command returned: b'1\n', Status: None, Error: 0
2019-09-07 21:56:58,256 - DEBUG - mycodo.controller_input_727d9bd8 - Adding measurements to InfluxDB with ID 727d9bd8-ebcb-48b0-953f-09791b43ac0d: {0: {'measurement': 'gpio_state', 'unit': 'bool', 'value': 1.0, 'timestamp_utc': None}}
2019-09-07 21:57:13,098 - DEBUG - mycodo.inputs.linux_command_727d9bd8 - Command being executed: curl -s 'http://10.10.0.53/cm?cmnd=%20Status%2011' | jq -r '.StatusSTS' | jq -r '.POWER1'| grep -c ON
2019-09-07 21:57:13,242 - DEBUG - mycodo.inputs.linux_command_727d9bd8 - Command returned: b'1\n', Status: None, Error: 0
2019-09-07 21:57:13,266 - DEBUG - mycodo.controller_input_727d9bd8 - Adding measurements to InfluxDB with ID 727d9bd8-ebcb-48b0-953f-09791b43ac0d: {0: {'measurement': 'gpio_state', 'unit': 'bool', 'value': 1.0, 'timestamp_utc': None}}
2019-09-07 21:57:28,102 - DEBUG - mycodo.inputs.linux_command_727d9bd8 - Command being executed: curl -s 'http://10.10.0.53/cm?cmnd=%20Status%2011' | jq -r '.StatusSTS' | jq -r '.POWER1'| grep -c ON
2019-09-07 21:57:28,241 - DEBUG - mycodo.inputs.linux_command_727d9bd8 - Command returned: b'1\n', Status: None, Error: 0
2019-09-07 21:57:28,265 - DEBUG - mycodo.controller_input_727d9bd8 - Adding measurements to InfluxDB with ID 727d9bd8-ebcb-48b0-953f-09791b43ac0d: {0: {'measurement': 'gpio_state', 'unit': 'bool', 'value': 1.0, 'timestamp_utc': None}}
2019-09-07 21:57:43,103 - DEBUG - mycodo.inputs.linux_command_727d9bd8 - Command being executed: curl -s 'http://10.10.0.53/cm?cmnd=%20Status%2011' | jq -r '.StatusSTS' | jq -r '.POWER1'| grep -c ON
2019-09-07 21:58:43,147 - DEBUG - mycodo.inputs.linux_command_727d9bd8 - Command returned: None, Status: None, Error: None
2019-09-07 21:58:43,149 - DEBUG - mycodo.inputs.linux_command_727d9bd8 - The command returned a non-numerical value. Ensure only one numerical value is returned by the command. Value returned: 'None'
2019-09-07 21:58:43,150 - INFO - mycodo.daemon - All LCD controllers stopped
2019-09-07 21:58:43,151 - INFO - mycodo.daemon - All PID controllers stopped
2019-09-07 21:58:43,151 - INFO - mycodo.daemon - All Math controllers stopped
2019-09-07 21:58:43,236 - INFO - mycodo.controller_input_4e679a49 - Deactivated in 86.0 ms
2019-09-07 21:58:43,247 - INFO - mycodo.controller_input_95510879 - Deactivated in 97.4 ms
2019-09-07 21:58:43,251 - INFO - mycodo.controller_input_727d9bd8 - Deactivated in 100.7 ms
2019-09-07 21:58:43,251 - INFO - mycodo.daemon - All Input controllers stopped
2019-09-07 21:58:43,252 - INFO - mycodo.daemon - All Trigger controllers stopped
2019-09-07 21:58:43,252 - INFO - mycodo.daemon - All Conditional controllers stopped
2019-09-07 21:58:43,274 - INFO - mycodo.controller_output - Output controller deactivated in 20.6 ms

@BobDDole
Copy link
Author

Ok so using the commits you made and modifying my curl call some seems to have resolved the problem:
curl -s -m 10 'http://10.10.0.53/cm?cmnd=%20Status%2011' | jq -r '.StatusSTS' | jq -r '.POWER1'| grep -c ON
My routine is ran every 15 seconds but the default timeout for curl is 30 seconds so I added the -m 10 to make it 10 second timeout thinking possibly the curl commands were overlapping each other and causing the 'NONE' value to return.
Seems this might be the case since making this change I have not had a single backend shutdown.

Thanks for the help, Appreciate it!

@kizniche
Copy link
Owner

Good to hear! It was likely the timeout because all I did was prevent the exception from stopping the whole Input. It seems there's something else with curl that caused the whole daemon to stop.

@BobDDole
Copy link
Author

Seems so, going to convert these to a single python input now that you've implemented it (thanks for that as well) Now off to learn enough python to do that. LOL

@kizniche
Copy link
Owner

You'll probably want to check out the Python module "urllib"

kizniche added a commit that referenced this issue Jun 14, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants