Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

selinux: Fix error reporting #18670

Merged
merged 1 commit into from
Apr 20, 2023
Merged

Conversation

martinpitt
Copy link
Member

@martinpitt martinpitt commented Apr 19, 2023

Exception objects are not directly printable, so trying to render it in
an Alert or with console.warn() just results in an empty string. That
makes debugging really hard, as there is no corresponding entry in the
journal either.

Use .toString() on the exception to render them properly. Also show
the actual exception text in the Alert in the UI. Still keep the
console.warn()s, so that we can do naughty matching on them.

While at it, eliminate the unnecessary Promise wrappers, and use chaining.


I'm trying to get to the bottom of this common flake. (another instance)

The screenshot just says "Unable run fix:". I didn't yet reproduce this locally, but when I run the test in a loop, it sometimes fails to delete the alert. The UI alert just says that (without a reason), and the console says:

warning: Unable to delete alert with id 3030af65-8f20-4e46-8d88-2fd542cfae6e
warning:

So as the first step, let the page tell us what's actually wrong!

@martinpitt martinpitt added flake unstable test no-test For doc/workflow changes, or experiments which don't need a full CI run, labels Apr 19, 2023
@martinpitt
Copy link
Member Author

martinpitt commented Apr 19, 2023

I locally got another "failed to delete". It says "message recipient disconnected from message bus without replying", in other words, setroubleshootd crashed. The journal just says:

Apr 19 08:28:48 centos-8-stream-127-0-0-2-2201 setroubleshoot[14254]: AnalyzeThread.run(): Set alarm timeout to 10
Apr 19 08:28:58 centos-8-stream-127-0-0-2-2201 systemd[1]: setroubleshootd.service: Succeeded.

so it smells like setroubleshoot has an idle timer which still fires when there's a pending D-Bus call? I mostly see this on RHEL 8 (setroubleshoot-plugins-3.3.14-1.el8 and setroubleshoot-server-3.3.26-5.el8).

Same symptom in this failure, there the timing information fits:

Apr 18 18:22:30 centos-8-stream-127-0-0-2-2201 setroubleshoot[2962]: AnalyzeThread.run(): Set alarm timeout to 10
Apr 18 18:22:39 centos-8-stream-127-0-0-2-2201 dbus-daemon[779]: [system] Activating service name='org.fedoraproject.SetroubleshootFixit' requested by ':1.37' (uid=0 pid=1454 comm="/usr/bin/python3 /usr/local/bin/cockpit-bridge --p" label="unconfined_u:unconfined_r:unconfined_t:s0-s0:c0.c1023") (using servicehelper)
Apr 18 18:22:40 centos-8-stream-127-0-0-2-2201 dbus-daemon[779]: [system] Successfully activated service 'org.fedoraproject.SetroubleshootFixit'
Apr 18 18:22:40 centos-8-stream-127-0-0-2-2201 org.fedoraproject.SetroubleshootFixit[3015]: ERROR:dbus.proxies:Introspect error on :1.56:/org/fedoraproject/Setroubleshootd: dbus.exceptions.DBusException: org.freedesktop.DBus.Error.NoReply: Message recipient disconnected from message bus without replying
Apr 18 18:22:41 centos-8-stream-127-0-0-2-2201 org.fedoraproject.SetroubleshootFixit[3015]: could not attach to desktop process
Apr 18 18:22:41 centos-8-stream-127-0-0-2-2201 systemd[1]: setroubleshootd.service: Succeeded.

The weather report shows that this also happens on Fedora 38.

image

I reported this at https://bugzilla.redhat.com/show_bug.cgi?id=2112573 and added a known issue at cockpit-project/bots#4672 . I naughtied this in cockpit-project/bots#4673

In the meantime, let's fix the error reporting anyway. But let's keep the console.warn()'s, so that we can do naughty matching on it.

@martinpitt martinpitt removed the no-test For doc/workflow changes, or experiments which don't need a full CI run, label Apr 19, 2023
Exception objects are not directly printable, so trying to render it in
an Alert or with `console.warn()` just results in an empty string. That
makes debugging really hard, as there is no corresponding entry in the
journal either.

Use `.toString()` on the exception to render them properly. Also show
the actual exception text in the Alert in the UI. Still keep the
console.warn()s, so that we can do naughty matching on them.

While at it, eliminate the unnecessary Promise wrappers, and use chaining.
@martinpitt martinpitt marked this pull request as ready for review April 19, 2023 10:00
@martinpitt
Copy link
Member Author

When this fails locally, I now get:

> warning: Failed to delete alert: {"problem":null,"name":"org.freedesktop.DBus.Error.NoReply","message":"Message recipient disconnected from message bus without replying"}

and the error is shown in the UI:
TestSelinux-testTroubleshootAlerts-centos-8-stream-127 0 0 2-2201-FAIL

Comment on lines +133 to +135
.catch(ex => {
console.warn("Unable to get alert for id", localId, ":", JSON.stringify(ex));
return Promise.reject(new Error(`Unable to get alert for id ${localId}: ${ex.toString()}`));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These 3 added lines are not executed by any test. Details

Comment on lines +143 to +145
.catch(ex => {
console.warn("Unable to run fix:", JSON.stringify(ex));
return Promise.reject(new Error(cockpit.format(_("Unable to run fix: $0"), ex.toString())));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These 3 added lines are not executed by any test. Details

const deleteAlert = localId => client.proxy.call("delete_alert", [localId])
.then(success => {
if (!success)
return Promise.reject(new Error(cockpit.format(_("Failed to delete alert: $0"), localId)));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This added line is not executed by any test. Details

Comment on lines +156 to +158
.catch(ex => {
console.warn("Unable to delete alert with id", localId, ":", JSON.stringify(ex));
return Promise.reject(new Error(cockpit.format(_("Failed to delete alert: $0"), ex.toString())));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These 3 added lines are not executed by any test. Details

Copy link
Member

@mvollmer mvollmer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks!

@mvollmer mvollmer merged commit db99329 into cockpit-project:main Apr 20, 2023
29 checks passed
@martinpitt martinpitt deleted the selinux-errors branch April 20, 2023 07:39
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
flake unstable test
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants