New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
systemd sd_notify support #569
systemd sd_notify support #569
Conversation
frasertweedale
commented
Sep 29, 2020
To test with FreeIPA, to test, add to
and modify |
8896c29
to
3452cdf
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please see my comments below. My main concerns are:
- availability of JNA library in different platforms (due to modularity)
- duplicate notifications
import com.sun.jna.Library; | ||
import com.sun.jna.Native; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will this restrict us to certain Java versions or JDK vendors? I found this:
https://stackoverflow.com/questions/51278446/java-native-loadlibrary-works-on-openjdk-bug-not-on-oracle-jdk
@cipherboy ^^
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure... from my POV the jna package is supported by Red Hat so it should work on the JDK supported by Red Hat, and that is good enough for us. Otherwise we can switch it to JNI.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I mean, this is fine, but JSS is already a JNI'd library. We in PKI already use JNI (see com.netscape.symkey). I'm less liking adding two different native bindings to the same project than I am JNA. :-)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reductio ad absurdum: We use JNI, so we must continue to use JNI only (or convert everything to another FFI system all at once).
I would sooner convert this PR to exec the systemd-notify(1)
program, than convert it to JNI. At the start of this week I had never even heard of JNA, but now that I have seen it, I never want to use JNI ever again :D
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My concern about JNA is the package maintenance. The package is available now on Fedora & RHEL, but there seems to be an effort to move it into Eclipse module on Fedora, and we might be required to add it into pki-deps module on RHEL in the future. Unfortunately this is the current state of Java on Fedora/RHEL. Once we move to Maven we'll have more flexibility on what libraries to use.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@edewata we can switch from JNA to JNI or to exec'ing systemd-notify(1)
in the future, if required.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In the past systemd-notify
was not 100% stable. systemd handles notifications asynchrony. It is possible that the program exits before systemd is able to retrieve its cgroups.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK, yeah, we can use JNA now and consider the alternative if it becomes a problem in the future.
I don't think Tomcat lifecycle listeners help us. Tomcat considers the app "ready" and passes requests to it long before the app is actually ready. That is why we have the
Observations:
I commend the approach in this PR as addressing the main concern without regression. It is not the ideal solution but it does improve things. I still have a bit of cleanup and robustness work to do in this PR before it is ready for merge. |
Add the StartupNotifier interface. Update CMSEngine to invoke .notifyReady() for each configured notifier when startup is completed. Loading of notifiers and a systemd notifier instance will be implemented in subsequent commits. Part of: https://pagure.io/dogtagpki/issue/1233
Initialise StartupNotifier instances configured in CS.cfg. The configuration scheme is: startupNotifiers.list=systemd,foo startupNotifiers.systemd.class=package.and.ClassName startupNotifiers.foo.class=com.netscape.cmscore.apps.FooNotifier startupNotifiers.foo.paramA=valueA startupNotifiers.foo.paramB=valueB `startupNotifiers.list' gives a list of substore names, one for each StartupNotifier instance. The 'class' parameter of each substore specifies the Java class name. The config substore is passed to the StartupNotifier.init() method. Part of: https://pagure.io/dogtagpki/issue/1233
Implement SystemdStartupNotifier, which does not actually notify systemd yet (this will be implemented in a subsequent commit). Ship this class in its own jar. The inclusion of this jar in the pki-server package will be made conditional on an RPM macro in the next commit. Part of: https://pagure.io/dogtagpki/issue/1233
Add the `sdnotify` RPM bcond to make the SystemdStartupNotifier class optional. When enabled, it is supplied in a separate JAR (part of the pki-server package), and the webapp symlink is added in the pki-ca package. It is enabled by default. Note that on Fedora and presumably RHEL also, libsystemd.so is always present. sd_booted(3) can be used to determine whether pid 1 is systemd or not, so having a systemd notifier implementation present doesn't imply that systemd must be used. Nevertheless, it was requested to make this component optional. So here we are. Part of: https://pagure.io/dogtagpki/issue/1233
Complete the implementation of SystemdStartupNotifier. We use JNA to bind to libsystemd. The dependency on 'jna' package only occurs when %{with sdnotify}. The systemd unit template file is left alone, retaining Type=simple. In order to enable systemd startup notification, you can override the Type in the "drop-in" directory. For example, if the instance name is 'pki-tomcat', write to the file: /etc/systemd/system/pki-tomcatd@pki-tomcat.service.d/notify.conf the content: [Service] Type=notify See systemd.unit(5) for more details. Fixes: https://pagure.io/dogtagpki/issue/1233
3452cdf
to
764516e
Compare
In the SystemdStartupNotifier, in order to better handle errors load the library in init() rather than as static class configuration. This gives better control over when Dogtag attempts to load the library, and how it can handle errors. Part of: https://pagure.io/dogtagpki/issue/1233
Update the StartupNotifier interface to return a NotifyResult to the caller. The NotifyResult contains a success/failure enum and a string message. Update CMSEngine to interpret the NotifyResult and log a message if appropriate. The SystemdStartupNotifier no longer writes to stderr/stdout. As a result, there is no longer any ambiguity about which CMS subsystem a failure occurred in. Part of: https://pagure.io/dogtagpki/issue/1233
764516e
to
e82af0b
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a good start to a notification system IMO. If JNA proves to be an issue later, we can rewrite it to JNI fairly easily. ACK'd.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM. Feel free to merge.
Thanks @cipherboy and @edewata. @tiran did you want to review / test this before I press merge? |
@cipherboy @edewata : we (@tiran and me) are wondering if we can also modify Dogtag to enable this notifier by default (in Although the systemd startup notifier plugin depends on libsystemd as a runtime dependency, it does not depend on the pki-tomcatd actually running under systemd. In the event that pid 1 is not systemd, nothing happens. So we believe it's quite safe and reasonable to enable it by default. This would avoid extra configuration on the IPA side. Let us know your thoughts. If we wish, we can move this discussion to a mailing list for wider audience. |
I don't think PKI should enable this by default since this is kind of a workaround only used by IPA (i.e. the CA notifies the listeners when it's up, regardless of other subsystems). Some of PKI use cases involve having CA and other subsystems on separate instances, so if it's only enabled by default in CA it will require extra configuration on other subsystems (i.e. it's no longer a default behavior) and we'll have to explain this IPA-specific behavior in PKI docs. If this mechanism can work for PKI in general (i.e. Tomcat notifies the listeners when it's up, instead of the subsystems), then we probably can enable it by default. How about we merge this PR now as is, and we can discuss further improvements either in the mailing list or in a separate PR? |
@edewata that's fair. Solving the "notify when all subsystems on this instance are ready" problem will be a trickier and more invasive change and I'm not really interested in tackling it. So let's merge this as-is, and it will be something for IPA to configure. |
The JNA dependency was originally added to notify systemd listeners when the CA subsystem is ready: https://pagure.io/dogtagpki/issue/1233 dogtagpki#569 However, the jna package may not be available on all platforms, so the JNA dependency needs to be dropped. As a replacement, the listeners can use the pki-server command instead of systemctl to start/restart PKI which can wait until the CA subsystem becomes ready. The following commands can be used to start/restart Tomcat and wait until it starts listening to the ports: $ pki-server start/stop/restart --wait The following commands can be used to start/restart the CA subsystem and wait until it's ready to receive requests: $ pki-server ca-deploy/undeploy --wait Resolves: #1953671