os-nut causes OPNsense web GUI to lock up under certain configurations #3304

Closed
3 tasks done
davemsh opened this issue Feb 12, 2023 · 7 comments
Labels
help wanted Contributor missing

Comments

@davemsh

davemsh commented Feb 12, 2023

Important notices
Before you add a new report, we ask you kindly to acknowledge the following:

Describe the bug
In the os-nut plugin, if the localhost 127.0.0.1 listen address is not accessible, attempting to access the Services->Nut->Diagnostics tab causes the entire OPNsense web GUI to lock up and become completely unresponsive. Network traffic passing through OPNsense appears unaffected by this.

I accidentally caused this to happen in two different ways. In both cases, the web GUI never recovered until the system was rebooted.

To Reproduce
Steps to reproduce the behavior:

  1. I installed os-nut 1.8.1_1 on OPNsense 22.7.11_1.
  2. I used os-nut's default settings, aside from setting the UPS name and user passwords, and I set the driver to USBHID.
  3. As mentioned above, I had this happen twice at different points:
     A) I enabled the NUT service, but I had not rebooted OPNsense. Rebooting is apparently required for 127.0.0.1 to begin listening (speculation).
     B) After recovering from Case A, I removed the pre-configured 127.0.0.1 entry from the Listen Address field, not realizing it was mandatory.
  4. In both A and B cases, after saving the configuration, visiting the Diagnostics tab never finishes loading, and no other attempts to view other pages in the web GUI respond either, even in a new tab. In both cases, rebooting OPNsense is required to regain access to the web GUI.
  5. Once 127.0.0.1 is accessible, the Diagnostics tab works as expected, displaying my UPS parameters.

Expected behavior
The OPNsense web GUI should not become unresponsive after the above steps. In addition, removing 127.0.0.1 from the listen addresses should trigger a warning that doing so will prevent the Diagnostics tab from functioning.
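
The warning requested above could be driven by a check along these lines. This is only a minimal sketch, not the plugin's actual validation code: the function name `loopback_warning` and the message text are hypothetical, and treating the IPv4 wildcard 0.0.0.0 as covering 127.0.0.1 is an assumption.

```python
import ipaddress

LOOPBACK = ipaddress.ip_address("127.0.0.1")


def loopback_warning(listen_addresses):
    """Return a warning string if 127.0.0.1 is not covered, else None.

    Hypothetical helper: entries that are not valid IP addresses are ignored,
    and 0.0.0.0 is assumed to include the IPv4 loopback address.
    """
    for raw in listen_addresses:
        try:
            addr = ipaddress.ip_address(raw.strip())
        except ValueError:
            continue  # not an IP address; cannot cover 127.0.0.1
        if addr == LOOPBACK or (addr.version == 4 and addr.is_unspecified):
            return None
    return ("127.0.0.1 is not among the listen addresses; "
            "the Diagnostics tab will not be able to query the UPS.")


if __name__ == "__main__":
    print(loopback_warning(["192.168.1.1"]))
```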

Screenshots
No relevant screenshots.

Relevant log files
These log messages get repeated once or twice per minute until the system is rebooted. It is unclear if repeats are due to automatic polling or from my attempts to regain access to the web GUI via page reloads.

2023-02-11T18:56:02-05:00 Error configd.py [566f587d-6053-4321-a896-c7e96db6572b] Script action failed with Command '/usr/local/bin/upsc 'ups@127.0.0.1'' returned non-zero exit status 1. at Traceback (most recent call last): File "/usr/local/opnsense/service/modules/processhandler.py", line 482, in execute subprocess.check_call(script_command, env=self.config_environment, shell=True, File "/usr/local/lib/python3.9/subprocess.py", line 373, in check_call raise CalledProcessError(retcode, cmd) subprocess.CalledProcessError: Command '/usr/local/bin/upsc 'ups@127.0.0.1'' returned non-zero exit status 1.

Additional context
N/A

Environment
OPNsense 22.7.11_1-amd64 (OpenSSL)
os-nut 1.8.1_1

@fichtner
Member

fichtner commented Feb 12, 2023

As far as I know, nut fails to do a proper timeout. Strategies to mitigate it are expensive in terms of code, and their general effectiveness is probably still not enough.

@mimugmail
Member

This is a known problem of nut itself; there's not much we can do here.

@davemsh
Author

davemsh commented Feb 12, 2023

Can the web GUI do this request asynchronously, so that at least it doesn't cause the whole interface to hang?

Perhaps the localhost IP should not be removable from the NUT plugin's configuration? I'm not sure what specific use-cases preventing its removal would break though, so perhaps that's not feasible.

@fichtner
Member

All “wrong” input will cause this. This isn’t a validation issue around 127.0.0.1 or similar. Async works, but not without a lot of extra work. PRs are welcome, but as I said, the benefit is too small to be worthwhile for the cases where this is misconfigured. Perhaps the read timeout used by configd/configctl could be reduced for this case only.
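
To illustrate the reduced-timeout idea in general terms, here is a minimal sketch that bounds how long a caller waits for a command. It is not configd's code: `run_with_timeout` and the 5-second default are hypothetical, and only the `upsc` command line is taken from the log above.

```python
import subprocess
import sys


def run_with_timeout(argv, timeout_s=5.0):
    """Run argv and return (ok, text); give up after roughly timeout_s seconds.

    subprocess.run() kills the child process when the timeout expires,
    so the caller is not left waiting on a command that never returns.
    """
    try:
        proc = subprocess.run(argv, capture_output=True, text=True,
                              timeout=timeout_s)
    except subprocess.TimeoutExpired:
        return False, f"timed out after {timeout_s} s"
    except FileNotFoundError:
        return False, f"command not found: {argv[0]}"
    if proc.returncode != 0:
        return False, proc.stderr.strip() or f"exit status {proc.returncode}"
    return True, proc.stdout


if __name__ == "__main__":
    # Command line as it appears in the configd log; the timeout is an assumption.
    ok, text = run_with_timeout(["/usr/local/bin/upsc", "ups@127.0.0.1"])
    print("OK" if ok else "FAILED", text)
```

With a bound like this, a misconfigured listen address would surface as an error message after a few seconds instead of blocking the request.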

@codiflow

Interesting bug report, thanks for digging deeper into this @davemsh

I had the issue too and for me the cause was the missing 127.0.0.1 as NUT listening address. Adding it was a simple solution here.

The issue only exists if you click the "Diagnostics" tab. Using the CLI for querying the UPS works without issues (because you can always stop the command using CTRL+C). So before using this tab one could verify if the config is working by testing the command /usr/local/bin/upsc 'ups@127.0.0.1' via SSH. If the output looks valid you can also use the "Diagnostics" part.
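
A similar pre-check can be scripted without invoking upsc at all, by testing whether anything accepts TCP connections on the loopback address. This is a rough sketch under assumptions: it presumes upsd uses NUT's default port 3493, the helper name `upsd_reachable` is hypothetical, and a successful connection only shows that something is listening, not that the UPS data is valid.

```python
import socket

NUT_PORT = 3493  # NUT's default upsd port; adjust if configured differently


def upsd_reachable(host="127.0.0.1", port=NUT_PORT, timeout_s=2.0):
    """Return True if a TCP connection to host:port succeeds within timeout_s."""
    try:
        with socket.create_connection((host, port), timeout=timeout_s):
            return True
    except OSError:  # covers refused connections and timeouts
        return False


if __name__ == "__main__":
    if upsd_reachable():
        print("upsd answers on 127.0.0.1; the Diagnostics tab should be safe to open")
    else:
        print("nothing listening on 127.0.0.1:3493; check the listen addresses first")
```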

One simple solution (I could do that) would be to adapt the description of the listening address. Currently it says "Set the addresses this service listen on.". We could add a warning here that removing 127.0.0.1 causes the GUI to hang if someone clicks on Diagnostics. Would maybe prevent people from removing it.

I also want to add that the issue resolves automatically after around 20 minutes. So NO reboot is required – just waiting is sufficient.

If you are speaking German you can get more details on the issue in my blogpost.

@davemsh
Author

davemsh commented May 14, 2023

Thanks for the feedback, guess I just didn't wait long enough before rebooting.

Agreed, updating the description to provide a warning could be a decent change if the overall issue is difficult to prevent.

@OPNsense-bot

This issue has been automatically timed-out (after 180 days of inactivity).

For more information about the policies for this repository,
please read https://github.com/opnsense/plugins/blob/master/CONTRIBUTING.md for further details.

If someone wants to step up and work on this issue,
just let us know, so we can reopen the issue and assign an owner to it.

@OPNsense-bot OPNsense-bot closed this as not planned Won't fix, can't repro, duplicate, stale Aug 11, 2023
@OPNsense-bot OPNsense-bot added the help wanted Contributor missing label Aug 11, 2023
Development

No branches or pull requests

5 participants