Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add exception handling around fapd handle is_active() call #560

Merged
merged 15 commits into from Aug 17, 2022

Conversation

tparchambault
Copy link
Contributor

No description provided.

@tparchambault
Copy link
Contributor Author

After merging master into this branch, on a clean FC34 VM (installed emacs, fapolicyd, the generated fapolicy-analyzer rpm), w/fapd running in --permissive after adding a number of files to the ancillary trust dbase and attempting a deploy, the following was observed:

 DEBUG:root:SessionManager::__autosave_edit_session()
DEBUG:root: Session autosave is disabled/bypassed
DEBUG:root:Deploying...
DEBUG:root:fapd_dbase::fapd_dbase_snapshot()
 Creating '/root/.local/share/fapolicy-analyzer/' 
DEBUG:root:Xdg absolute path: /root/.local/share/fapolicy-analyzer/Fapolicyd_Backup
Fapolicyd backup to: /root/.local/share/fapolicy-analyzer/Fapolicyd_Backup_20220727_145633_497756.tgz
DEBUG:root:fapd_dbase_cleanup_snapshots()
DEBUG:root:Search Pattern: /root/.local/share/fapolicy-analyzer/Fapolicyd_Backup_*.tgz
DEBUG:root:Glob search results: ['/root/.local/share/fapolicy-analyzer/Fapolicyd_Backup_20220727_145633_497756.tgz']
DEBUG:root:_status(ServiceStatus.FALSE updated
DEBUG:root:monitor_daemon:Dispatch update request
DEBUG:root:on_update_daemon_status(ServiceStatus.FALSE)
ERROR:root:An error occurred trying to deploy the changes. Please try again.: DbusFailure(D-Bus error: Transaction for fapolicyd.service/start is destructive (fapolicyd.service has 'stop' job queued, but 'start' is included in transaction). (org.freedesktop.systemd1.TransactionIsDestructive))
DEBUG:root:disposing of subscription for class DeployChangesetsOp
DEBUG:root:_update_fapd_status(ServiceStatus.FALSE)

@tparchambault
Copy link
Contributor Author

With the monitoring thread not running,

DEBUG:root:Starting the fapd monitoring thread
DEBUG:root:fapd monitoring thread commented out

we still observe the full assortment of errors wrt restarting the daemon after stopping it.
Here, at deployment:

DEBUG:root:Deploying...
DEBUG:root:fapd_dbase::fapd_dbase_snapshot()
DEBUG:root:Xdg absolute path: /root/.local/share/fapolicy-analyzer/Fapolicyd_Backup
Fapolicyd backup to: /root/.local/share/fapolicy-analyzer/Fapolicyd_Backup_20220727_184752_502467.tgz
DEBUG:root:fapd_dbase_cleanup_snapshots()
DEBUG:root:Search Pattern: /root/.local/share/fapolicy-analyzer/Fapolicyd_Backup_*.tgz
DEBUG:root:Glob search results: ['/root/.local/share/fapolicy-analyzer/Fapolicyd_Backup_20220727_175559_584055.tgz', '/root/.local/share/fapolicy-analyzer/Fapolicyd_Backup_20220727_182913_127940.tgz', '/root/.local/share/fapolicy-analyzer/Fapolicyd_Backup_20220727_183347_403512.tgz', '/root/.local/share/fapolicy-analyzer/Fapolicyd_Backup_20220727_184752_502467.tgz']
DEBUG:root:Deleting: /root/.local/share/fapolicy-analyzer/Fapolicyd_Backup_20220727_175559_584055.tgz
ERROR:root:An error occurred trying to deploy the changes. Please try again.: DbusFailure(D-Bus error: Transaction for fapolicyd.service/start is destructive (fapolicyd.service has 'stop' job queued, but 'start' is included in transaction). (org.freedesktop.systemd1.TransactionIsDestructive))
DEBUG:root:disposing of subscription for class DeployChangesetsOp

Also occurs on rollback, but the DBus error message does not always get displayed in the logging.debug() messages. They will show up in the journal though.

Jul 27 19:11:45 fedora systemd[1]: Stopping File Access Policy Daemon...
Jul 27 19:11:46 fedora systemd[1]: Requested transaction contradicts existing jobs: Transaction for fapolicyd.service/start is destructive (fapolicyd.service has 'stop' job queued, but 'start' is included in transaction).
Jul 27 19:11:48 fedora systemd[1]: fapolicyd.service: Deactivated successfully.
Jul 27 19:11:48 fedora systemd[1]: Stopped File Access Policy Daemon.

The issue is always the daemon start operation however there is no service start entry in the journal so possibly the fapolicyd start request is not being delivered to systemd or it's delivered in a format that is inconsistent/nonsensical from systemd's perspective based on the transaction contradicts... log msg.

@tparchambault
Copy link
Contributor Author

From the perspective of systemd, bringing down fapolicyd takes ~3 secs and in that 3 sec window, systemd is getting a request to restart fapolicyd:

Jul 27 20:30:03 fedora systemd[1]: Stopping File Access Policy Daemon...
Jul 27 20:30:03 fedora fapolicyd[1698]: shutting down...
Jul 27 20:30:04 fedora systemd[1]: Requested transaction contradicts existing jobs: Transaction for fapolicyd.service/start is destructive (fapolicyd.service has 'stop' job queued, but 'start' is included in transaction).
Jul 27 20:30:06 fedora systemd[1]: fapolicyd.service: Deactivated successfully.
Jul 27 20:30:06 fedora audit[1]: SERVICE_STOP pid=1 uid=0 auid=4294967295 ses=4294967295 subj=system_u:system_r:init_t:s0 msg='unit=fapolicyd comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=success'
Jul 27 20:30:06 fedora systemd[1]: Stopped File Access Policy Daemon.
Jul 27 20:30:06 fedora systemd[1]: fapolicyd.service: Consumed 1.306s CPU time.

@tparchambault
Copy link
Contributor Author

Installed and tested over FC34 and Rhel 8.6, both ancillary trust changeset deployments and rollbacks. Tested with fapd in online and permissive modes. Good to go.

@tparchambault tparchambault marked this pull request as ready for review August 1, 2022 18:26
@tparchambault tparchambault requested a review from jw3 August 1, 2022 18:27
Copy link
Member

@jw3 jw3 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Using context blocks on the locks would be nice.

I can also make this fail by

  1. Start analyzer
  2. Stop daemon
  3. Remove the fapolicyd.service file
  4. Start daemon from System menu
  5. Observe the GUI is locked

The stack trace from the above steps

Traceback (most recent call last):
  File "/usr/lib/python3.6/site-packages/fapolicy_analyzer/ui/main_window.py", line 417, in on_fapdStartMenu_activate
    self._fapd_mgr.start()
  File "/usr/lib/python3.6/site-packages/fapolicy_analyzer/ui/fapd_manager.py", line 103, in start
    pid = self._start(instance)
  File "/usr/lib/python3.6/site-packages/fapolicy_analyzer/ui/fapd_manager.py", line 138, in _start
    self._fapd_ref.start()
RuntimeError: DbusFailure(D-Bus error: Failed to get file context on /usr/lib/systemd/system/fapolicyd.service. (org.freedesktop.DBus.Error.AccessDenied))

@@ -209,15 +210,16 @@ def _status(self):
logging.debug("fapd is currently DISABLED")
return ServiceStatus.UNKNOWN
elif self.mode == FapdMode.ONLINE:
try:
if self._fapd_lock.acquire(blocking=False):
if self._fapd_lock.acquire(blocking=False):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you use a with block here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I looked into that approach originally. The fly/ointment thing is the blocking=False argument, i.e. if the lock is not available then don't wait and return False. Fapd control operations (start/stop) are given priority over the never-ending periodic status requests.

Still digging around. I would still prefer to have locking somewhere given the status requests and stop/start requests originate on differing threads. I don't care if it's under the hood in rust-land or not.

self._fapd_prior_online_state = updated_online_state
self._fapd_cur_online_state = updated_online_state
logging.debug("capture_online_state(): Capturing current online status")
if self._fapd_lock.acquire():
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same here, with would be ideal on these locked blocks.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, with context management is Ok here. Will do.

self._fapd_status = ServiceStatus(self._fapd_ref.is_active())
else:
self._fapd_status = ServiceStatus.UNKNOWN
try:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

... and again.

@jw3
Copy link
Member

jw3 commented Aug 2, 2022

@tparchambault There are a few other spots that lock release needs to be ensured, looks like five spots total. I think that using with at all of these is appropriate, I just can't mark them in the review because they are aren't part of your changes.

Letting the lock be managed may eliminate the need for try-catch in some of these scenarios. I didn't review, just saying it's possible.

@tparchambault
Copy link
Contributor Author

I'll try and research this more... I didn't go with the Context Manager approach because I did not see clear documentation when try/exception blocks were involved. We may be able to get rid of one or the other.

@tparchambault
Copy link
Contributor Author

Added context manager blocks to manage Lock acquire()/release() ops. Non-blocking Lock.acquire() left as originally coded.

@tparchambault tparchambault requested a review from jw3 August 16, 2022 15:47
@jw3
Copy link
Member

jw3 commented Aug 16, 2022

Looks improved. I can't make it fail by moving the unit file around while the app is up.

Only issue I see is that caught exceptions do not send any notification to the UI. Is it possible to use a toaster message on these failures?

Copy link
Member

@jw3 jw3 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The toasters would be nice, but not blocking.

@tparchambault
Copy link
Contributor Author

tparchambault commented Aug 16, 2022

Only issue I see is that caught exceptions do not send any notification to the UI. Is it possible to use a toaster message on these failures?

Sure I'll look at that tomorrow AM. SD did a nice job with the toaster messaging, so it's pretty easy to add them. He said; Probably no easy way to get a handle that's in scope... In any event I'll look at it tomorrow. I think it will be relatively easy.

@jw3
Copy link
Member

jw3 commented Aug 17, 2022

Let's add the toaster for the start/stop controls in this PR.

For the status checks let's have a separate issue that instead of a toaster we get a third icon on the status display. For this PR however lets just log the failure to stderr.

@jw3 jw3 merged commit 35f169c into master Aug 17, 2022
@jw3 jw3 deleted the fa544_ui_unresponsive_rollback branch August 17, 2022 22:28
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants