[Services] Websocket failure to update service status #2681

Closed
FroggyFlox opened this issue Sep 29, 2023 · 3 comments · Fixed by #2682

@FroggyFlox
Member

On the "System" > "Services" page, we have a websocket that fetches the status of each service and sends it to our front end to update the toggle buttons. This is currently failing, so a toggled button remains "misaligned" (see screenshot below) until a manual page refresh:
image

By using Firefox's dev tools, we can see that the websocket is correctly created and removed upon visiting/leaving the "Services" page, but it does not seem to send/return any data on the services' statuses, hence the failure to refresh the toggle button state.

It turns out this failure results from an Exception thrown when checking the status of the "replication" service. This exception is normal (when the replication service is OFF), but we shouldn't fail the checking of service status as a result.
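
To illustrate the failure mode described above, here is a minimal sketch of how one raising status check can prevent a whole status payload from being built. It is not Rockstor's actual code: `check_status`, `collect_fragile`, `collect_tolerant`, and the service names are hypothetical, and `CommandException` is only a stand-in for the real class.

```python
class CommandException(Exception):
    """Stand-in for Rockstor's system.exceptions.CommandException."""


def check_status(name, running):
    """Hypothetical probe: raises when the service is stopped, like a throwing run_command()."""
    if not running[name]:
        raise CommandException("{0}: rc = 3".format(name))
    return 0


def collect_fragile(running):
    """One stopped service aborts the whole collection, so no payload is produced."""
    return {name: check_status(name, running) == 0 for name in running}


def collect_tolerant(running):
    """A stopped service is recorded as off and the remaining services are still reported."""
    statuses = {}
    for name in running:
        try:
            statuses[name] = check_status(name, running) == 0
        except CommandException:
            statuses[name] = False
    return statuses


# Hypothetical service states: replication is OFF, which is a normal situation.
services = {"nfs": True, "replication": False, "smb": True}
print(collect_tolerant(services))
```

With the fragile variant, the stopped replication service means nothing at all is returned for the other services, which matches the symptom of a websocket that opens but never delivers data.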

@FroggyFlox
Member Author

As a confirmation, after passing throw=False to the underlying run_command() call that checks the status of the replication service, we no longer see this exception, and the "get_services" websocket returns a correct response full of data at the expected interval. The toggle button is also updated and re-aligned at that time.

@FroggyFlox FroggyFlox self-assigned this Sep 29, 2023
@FroggyFlox FroggyFlox added the bug label Sep 29, 2023
@FroggyFlox
Member Author

For details, the exception we see in the logs when visiting the Services page is:

[30/Sep/2023 11:19:11] ERROR [smart_manager.views.base_service:73] Exception while querying status of service(replication): Error running a command. cmd = /opt/rockstor/.venv/bin/supervisorctl status replication. rc = 3. stdout = ['replication                      STOPPED   Not started', '']. stderr = ['']
[30/Sep/2023 11:19:11] ERROR [smart_manager.views.base_service:74] Error running a command. cmd = /opt/rockstor/.venv/bin/supervisorctl status replication. rc = 3. stdout = ['replication                      STOPPED   Not started', '']. stderr = ['']
Traceback (most recent call last):
  File "/opt/rockstor/src/rockstor/smart_manager/views/base_service.py", line 64, in _get_status
    o, e, rc = service_status(service.name, config)
  File "/opt/rockstor/src/rockstor/system/services.py", line 192, in service_status
    return superctl(service_name, "status")
  File "/opt/rockstor/src/rockstor/system/services.py", line 142, in superctl
    out, err, rc = run_command([SUPERCTL_BIN, switch, service])
  File "/opt/rockstor/src/rockstor/system/osi.py", line 251, in run_command
    raise CommandException(cmd, out, err, rc)
system.exceptions.CommandException: Error running a command. cmd = /opt/rockstor/.venv/bin/supervisorctl status replication. rc = 3. stdout = ['replication                      STOPPED   Not started', '']. stderr = ['']

As we can see, the command itself does not fail: it returns proper output with an rc of 3. Based on our run_command() logic, we raise a CommandException whenever rc != 0:

    if rc != 0:
        if log:
            e_msg = (
                "non-zero code({0}) returned by command: {1}. output: "
                "{2} error: {3}".format(rc, cmd, out, err)
            )
            logger.error(e_msg)
        if throw:
            raise CommandException(cmd, out, err, rc)

Here, we should thus set throw=False.
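
As a rough, self-contained sketch of the behaviour discussed above, the following simplified `run_command()` keeps only the `throw` switch. It is an approximation, not the real function in `system/osi.py`: the use of `subprocess.run`, the `fake_status` command (a hypothetical stand-in for `supervisorctl status replication` on a stopped service), and the output parsing are all assumptions made for illustration.

```python
import subprocess
import sys


class CommandException(Exception):
    """Stand-in for system.exceptions.CommandException."""

    def __init__(self, cmd, out, err, rc):
        super().__init__(
            "Error running a command. cmd = {0}. rc = {1}.".format(" ".join(cmd), rc)
        )
        self.cmd, self.out, self.err, self.rc = cmd, out, err, rc


def run_command(cmd, throw=True):
    """Simplified sketch: return (out, err, rc); raise on rc != 0 only when throw is True."""
    proc = subprocess.run(cmd, capture_output=True, text=True)
    out = proc.stdout.split("\n")
    err = proc.stderr.split("\n")
    rc = proc.returncode
    if rc != 0 and throw:
        raise CommandException(cmd, out, err, rc)
    return out, err, rc


# Hypothetical stand-in for a status command that prints a state and exits with 3.
fake_status = [
    sys.executable,
    "-c",
    "import sys; print('replication STOPPED Not started'); sys.exit(3)",
]

# With throw=False the caller receives the output and return code and can interpret them.
out, err, rc = run_command(fake_status, throw=False)
state = out[0].split()[1] if out and out[0] else "UNKNOWN"
print(rc, state)
```

The point of the design is that a non-zero return code carries information here (the service is stopped), so the caller should be able to inspect it instead of having it converted into an exception.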

FroggyFlox added a commit to FroggyFlox/rockstor-core that referenced this issue Sep 30, 2023
…r#2681

We currently throw an Exception when checking the status of a
supervisor-controlled service such as replication. This leads to an
exception being thrown even when expected (service is off), which can
itself interrupt other parts such as our get_services websocket.

This commit adds a new arg to the superctl() function to allow for the
possibility to not throw an exception. Note that run_command's default
for throw is True, so respect this here.
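
The change described in this commit message could look roughly like the sketch below. The actual diff is not shown in this thread, so the argument name `throw` on `superctl()`, the simplified function bodies, and the stubbed `run_command()` (which always pretends the service is stopped) are assumptions for illustration only.

```python
SUPERCTL_BIN = "/opt/rockstor/.venv/bin/supervisorctl"  # path as it appears in the log above


class CommandException(Exception):
    """Stand-in for system.exceptions.CommandException."""


def run_command(cmd, throw=True):
    """Stub standing in for the real run_command(): always reports a stopped service."""
    out = ["replication                      STOPPED   Not started", ""]
    err = [""]
    rc = 3
    if rc != 0 and throw:
        raise CommandException(cmd, out, err, rc)
    return out, err, rc


def superctl(service, switch, throw=True):
    """Pass throw through; the default matches run_command's so existing callers are unchanged."""
    return run_command([SUPERCTL_BIN, switch, service], throw=throw)


def service_status(service_name):
    """Status checks opt out of exceptions: a stopped service is an expected answer."""
    return superctl(service_name, "status", throw=False)


out, err, rc = service_status("replication")
print(rc, out[0].split()[1])
```

Keeping the default at True means only the status check changes behaviour, while other `superctl()` callers still get an exception on a failed command.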
phillxnet added a commit that referenced this issue Oct 2, 2023
…date-service-status

Don't throw exception when getting supervisord service status #2681
@phillxnet
Member

Closing as:
Fixed by #2682
by @FroggyFlox
