Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

systemd sd_notify support #569

Conversation

frasertweedale
Copy link
Contributor

8896c2986 (Fraser Tweedale, 9 minutes ago)
   startup notification: complete systemd notifier

   Complete the implementation of SystemdStartupNotifier.  We use JNA to bind
   to libsystemd.  The dependency on 'jna' package only occurs when %{with
   sdnotify}.

   The systemd unit template file is left alone, retaining Type=simple. In
   order to enable systemd startup notification, you can override the Type in
   the "drop-in" directory. For example, if the instance name is 'pki-tomcat',
   write to the file:

     /etc/systemd/system/pki-tomcatd@pki-tomcat.service.d/notify.conf

   the content:

     [Service]
    Type=notify

   See systemd.unit(5) for more details.

   Fixes: https://pagure.io/dogtagpki/issue/1233

d27de20ca (Fraser Tweedale, 5 hours ago)
   startup notification: make the systemd class optional

   Add the `sdnotify` RPM bcond to make the SystemdStartupNotifier class
   optional.  When enabled, it is supplied in a separate JAR
   (part of the pki-server package), and the webapp symlink is added in the
   pki-ca package.  It is enabled by default.

   Note that on Fedora and presumably RHEL also, libsystemd.so is always
   present.  sd_booted(3) can be used to determine whether pid 1 is systemd or
   not, so having a systemd notifier implementation present doesn't imply that
   systemd must be used.

   Nevertheless, it was requested to make this component optional.  So here we
   are.

   Part of: https://pagure.io/dogtagpki/issue/1233

bee6ff9f8 (Fraser Tweedale, 5 hours ago)
   startup notification: add pki-systemd jar

   Implement SystemdStartupNotifier, which does not actually notify systemd
   yet (this will be implemented in a subsequent commit). Ship this class in
   its own jar.  The inclusion of this jar in the pki-server package will be
   made conditional on an RPM macro in the next commit.

   Part of: https://pagure.io/dogtagpki/issue/1233

59787e9c0 (Fraser Tweedale, 11 hours ago)
   startup notification: initialise from CS.cfg

   Initialise StartupNotifier instances configured in CS.cfg.  The 
   configuration scheme is:

     startupNotifiers.list=systemd,foo
    startupNotifiers.systemd.class=package.and.ClassName
    startupNotifiers.foo.class=com.netscape.cmscore.apps.FooNotifier
    startupNotifiers.foo.paramA=valueA
    startupNotifiers.foo.paramB=valueB

   `startupNotifiers.list' gives a list of substore names, one for each 
   StartupNotifier instance.  The 'class' parameter of each substore specifies
   the Java class name.  The config substore is passed to the 
   StartupNotifier.init() method.

   Part of: https://pagure.io/dogtagpki/issue/1233

7b77eaf3f (Fraser Tweedale, 12 hours ago)
   startup notification: add StartupNotifier interface

   Add the StartupNotifier interface.  Update CMSEngine to invoke
   .notifyReady() for each configured notifier when startup is completed.

   Loading of notifiers and a systemd notifier instance will be implemented in
   subsequent commits.

   Part of: https://pagure.io/dogtagpki/issue/1233

@frasertweedale
Copy link
Contributor Author

frasertweedale commented Sep 29, 2020

To test with FreeIPA, to test, add to /etc/pki/pki-tomcat/ca/CS.cfg:

startupNotifiers.list=systemd
startupNotifiers.systemd.class=com.netscape.cmscore.systemd.SystemdStartupNotifier

and modify /etc/systemd/system/pki-tomcatd@pki-tomcat.service.d/ipa.conf, adding Type=notify and
removing/commenting the ExecStartPost directive.

@frasertweedale frasertweedale force-pushed the feature/1233-systemd-startup-notification branch from 8896c29 to 3452cdf Compare September 29, 2020 13:39
Copy link
Contributor

@edewata edewata left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please see my comments below. My main concerns are:

  • availability of JNA library in different platforms (due to modularity)
  • duplicate notifications

base/server/src/com/netscape/cmscore/apps/CMSEngine.java Outdated Show resolved Hide resolved
base/server/src/com/netscape/cmscore/apps/CMSEngine.java Outdated Show resolved Hide resolved
Comment on lines +21 to +9
import com.sun.jna.Library;
import com.sun.jna.Native;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will this restrict us to certain Java versions or JDK vendors? I found this:
https://stackoverflow.com/questions/51278446/java-native-loadlibrary-works-on-openjdk-bug-not-on-oracle-jdk

@cipherboy ^^

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure... from my POV the jna package is supported by Red Hat so it should work on the JDK supported by Red Hat, and that is good enough for us. Otherwise we can switch it to JNI.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I mean, this is fine, but JSS is already a JNI'd library. We in PKI already use JNI (see com.netscape.symkey). I'm less liking adding two different native bindings to the same project than I am JNA. :-)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reductio ad absurdum: We use JNI, so we must continue to use JNI only (or convert everything to another FFI system all at once).

I would sooner convert this PR to exec the systemd-notify(1) program, than convert it to JNI. At the start of this week I had never even heard of JNA, but now that I have seen it, I never want to use JNI ever again :D

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My concern about JNA is the package maintenance. The package is available now on Fedora & RHEL, but there seems to be an effort to move it into Eclipse module on Fedora, and we might be required to add it into pki-deps module on RHEL in the future. Unfortunately this is the current state of Java on Fedora/RHEL. Once we move to Maven we'll have more flexibility on what libraries to use.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@edewata we can switch from JNA to JNI or to exec'ing systemd-notify(1) in the future, if required.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the past systemd-notify was not 100% stable. systemd handles notifications asynchrony. It is possible that the program exits before systemd is able to retrieve its cgroups.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, yeah, we can use JNA now and consider the alternative if it becomes a problem in the future.

base/server/systemd/SystemdStartupNotifier.java Outdated Show resolved Hide resolved
base/server/systemd/SystemdStartupNotifier.java Outdated Show resolved Hide resolved
@frasertweedale
Copy link
Contributor Author

@tiran @edewata

I don't think Tomcat lifecycle listeners help us. Tomcat considers the app "ready" and passes requests to it long before the app is actually ready. That is why we have the getStatus servlet so that we can poll it (indeed, this is what FreeIPA currently does, via
drop-in configuration for the pki-tomcatd@pki-tomcat systemd unit, which invokes a program that checks status via SystemStatusClient (for the ca subsystem only):

% sudo cat /etc/systemd/system/pki-tomcatd@pki-tomcat.service.d/ipa.conf
[Service]
ExecStartPost=/usr/libexec/ipa/ipa-pki-wait-running

Observations:

  • Tomcat lifecycle events don't actually refect CMS readiness and don't help us

  • We currently wait on CA subsystem readiness only - not KRA or ACME. Configuring CA subsystem to perform the sd_notify is semantically equivalent to current behaviour. But unlike polling it allows startup to proceed as early as possible.

  • It is assumed that CA subsystem takes the longest to start up (it loads a substantial amount of data from LDAP including profiles and LWCAs). Other subsystems have much less to do.

  • We should perform measurements to test the validity of this assumption.

  • We agree that relying on this assumption is not ideal. But due to the isolation between subsystems, and Tomcat's lack of insight into CMS readiness, I don't yet see a clean way to depend on readiness of all configured subsystems.

I commend the approach in this PR as addressing the main concern without regression. It is not the ideal solution but it does improve things.

I still have a bit of cleanup and robustness work to do in this PR before it is ready for merge.

Add the StartupNotifier interface.  Update CMSEngine to invoke
.notifyReady() for each configured notifier when startup is
completed.

Loading of notifiers and a systemd notifier instance will be
implemented in subsequent commits.

Part of: https://pagure.io/dogtagpki/issue/1233
Initialise StartupNotifier instances configured in CS.cfg.  The
configuration scheme is:

  startupNotifiers.list=systemd,foo
  startupNotifiers.systemd.class=package.and.ClassName
  startupNotifiers.foo.class=com.netscape.cmscore.apps.FooNotifier
  startupNotifiers.foo.paramA=valueA
  startupNotifiers.foo.paramB=valueB

`startupNotifiers.list' gives a list of substore names, one for each
StartupNotifier instance.  The 'class' parameter of each substore
specifies the Java class name.  The config substore is passed to the
StartupNotifier.init() method.

Part of: https://pagure.io/dogtagpki/issue/1233
Implement SystemdStartupNotifier, which does not actually notify
systemd yet (this will be implemented in a subsequent commit).
Ship this class in its own jar.  The inclusion of this jar in the
pki-server package will be made conditional on an RPM macro in the
next commit.

Part of: https://pagure.io/dogtagpki/issue/1233
Add the `sdnotify` RPM bcond to make the SystemdStartupNotifier
class optional.  When enabled, it is supplied in a separate JAR
(part of the pki-server package), and the webapp symlink is added in
the pki-ca package.  It is enabled by default.

Note that on Fedora and presumably RHEL also, libsystemd.so is
always present.  sd_booted(3) can be used to determine whether pid 1
is systemd or not, so having a systemd notifier implementation
present doesn't imply that systemd must be used.

Nevertheless, it was requested to make this component optional.  So
here we are.

Part of: https://pagure.io/dogtagpki/issue/1233
Complete the implementation of SystemdStartupNotifier.  We use JNA
to bind to libsystemd.  The dependency on 'jna' package only occurs
when %{with sdnotify}.

The systemd unit template file is left alone, retaining Type=simple.
In order to enable systemd startup notification, you can override
the Type in the "drop-in" directory. For example, if the instance
name is 'pki-tomcat', write to the file:

  /etc/systemd/system/pki-tomcatd@pki-tomcat.service.d/notify.conf

the content:

  [Service]
  Type=notify

See systemd.unit(5) for more details.

Fixes: https://pagure.io/dogtagpki/issue/1233
@frasertweedale frasertweedale force-pushed the feature/1233-systemd-startup-notification branch from 3452cdf to 764516e Compare October 1, 2020 05:04
In the SystemdStartupNotifier, in order to better handle errors load
the library in init() rather than as static class configuration.
This gives better control over when Dogtag attempts to load the
library, and how it can handle errors.

Part of: https://pagure.io/dogtagpki/issue/1233
Update the StartupNotifier interface to return a NotifyResult to the
caller.  The NotifyResult contains a success/failure enum and a
string message.  Update CMSEngine to interpret the NotifyResult and
log a message if appropriate.

The SystemdStartupNotifier no longer writes to stderr/stdout.  As a
result, there is no longer any ambiguity about which CMS subsystem a
failure occurred in.

Part of: https://pagure.io/dogtagpki/issue/1233
@frasertweedale frasertweedale force-pushed the feature/1233-systemd-startup-notification branch from 764516e to e82af0b Compare October 1, 2020 05:38
Copy link
Member

@cipherboy cipherboy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a good start to a notification system IMO. If JNA proves to be an issue later, we can rewrite it to JNI fairly easily. ACK'd.

Copy link
Contributor

@edewata edewata left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Feel free to merge.

@frasertweedale
Copy link
Contributor Author

Thanks @cipherboy and @edewata. @tiran did you want to review / test this before I press merge?

@frasertweedale
Copy link
Contributor Author

frasertweedale commented Oct 14, 2020

@cipherboy @edewata : we (@tiran and me) are wondering if we can also modify Dogtag to enable this notifier by default (in CS.cfg) and update pki-tomcatd@.service unit config from Type=simple to Type=notify.

Although the systemd startup notifier plugin depends on libsystemd as a runtime dependency, it does not depend on the pki-tomcatd actually running under systemd. In the event that pid 1 is not systemd, nothing happens. So we believe it's quite safe and reasonable to enable it by default. This would avoid extra configuration on the IPA side.

Let us know your thoughts. If we wish, we can move this discussion to a mailing list for wider audience.

@edewata
Copy link
Contributor

edewata commented Oct 19, 2020

I don't think PKI should enable this by default since this is kind of a workaround only used by IPA (i.e. the CA notifies the listeners when it's up, regardless of other subsystems). Some of PKI use cases involve having CA and other subsystems on separate instances, so if it's only enabled by default in CA it will require extra configuration on other subsystems (i.e. it's no longer a default behavior) and we'll have to explain this IPA-specific behavior in PKI docs.

If this mechanism can work for PKI in general (i.e. Tomcat notifies the listeners when it's up, instead of the subsystems), then we probably can enable it by default.

How about we merge this PR now as is, and we can discuss further improvements either in the mailing list or in a separate PR?

@frasertweedale
Copy link
Contributor Author

@edewata that's fair. Solving the "notify when all subsystems on this instance are ready" problem will be a trickier and more invasive change and I'm not really interested in tackling it. So let's merge this as-is, and it will be something for IPA to configure.

@frasertweedale frasertweedale merged commit 592f412 into dogtagpki:master Oct 20, 2020
@frasertweedale frasertweedale deleted the feature/1233-systemd-startup-notification branch October 20, 2020 01:09
edewata added a commit to edewata/pki that referenced this pull request Apr 26, 2021
The JNA dependency was originally added to notify
systemd listeners when the CA subsystem is ready:

https://pagure.io/dogtagpki/issue/1233
dogtagpki#569

However, the jna package may not be available on all
platforms, so the JNA dependency needs to be dropped.

As a replacement, the listeners can use the pki-server
command instead of systemctl to start/restart PKI which
can wait until the CA subsystem becomes ready.

The following commands can be used to start/restart
Tomcat and wait until it starts listening to the ports:

 $ pki-server start/stop/restart --wait

The following commands can be used to start/restart
the CA subsystem and wait until it's ready to receive
requests:

 $ pki-server ca-deploy/undeploy --wait

Resolves: #1953671
@edewata edewata mentioned this pull request Apr 26, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants