libzfs binding for Java

About libzfs.jar

The libzfs.jar is a Java wrapper for native ZFS functionality as implemented by a host operating system.

A bit of turbulent history

Since the ZFS filesystem, data integrity and volume management subsystem was introduced in the Sun Solaris 10 operating system over a decade ago (in 2005), the main way to interact with it has been through command-line tools. An OS-internal native library, libzfs.so, was also available, but its purpose was to interface those tools with the OS kernel. In the absence of other ABIs, a few projects still took the risk of linking against this "uncommitted" library. These include the libzfs.jar wrapper, which the Jenkins continuous integration and automation server uses on matching platforms for tasks like snapshot management, and a number of other projects, e.g. on GitHub.

As time went on and Solaris evolved, some function signatures in the library changed, causing a moderate headache for consumers like this wrapper -- but that could be waved away by requiring an upgrade to the newest release of the single OS.

As more time passed, the Sun OpenSolaris project, and later the illumos project and numerous open-source distributions based on it, splintered off from Solaris (which, as a brand and product, went along its own proprietary path and versioning under Oracle stewardship).

The ZFS technologies were also adopted into multiple other operating system kernels (including several BSDs, Linux and MacOS efforts), cross-pollinating and evolving under the roof of the OpenZFS project. The libzfs.so library is still an "uncommitted implementation detail" of the respective distributions, and its random ABI changes now cause a much bigger headache due to fragmentation -- both because there is no single ZFS-wielding OS that you can require an upgrade to, and because there are dozens of combinations of possible function signatures that can be present in a particular instance and update-level of an OS deployment.

The effect for libzfs.jar, and for Jenkins in particular, was that when the CI application server booted up on an OS with incompatible function signatures (and began interacting with ZFS datasets), its Java process just dumped core and died. As the operating systems that consume and contribute to OpenZFS evolved, a wrapper built around a single fixed ABI became more likely than not to expect the wrong one.

Recent (2017) changes in this project aimed specifically to improve the portability of Jenkins to illumos distros -- the "SunOS" descendants based on modern OpenZFS. They also made this wrapper's codebase testable via Jenkins and Travis CI as integrated on GitHub, so contributors can test their improvements before posting a pull request.

The accepted solution for improving the portability of the libzfs.jar wrapper was to identify which of the wrapped routines have different native ABI signatures "present in the wild" and to introduce a way to pick and call the correct signature at run-time. While the updated JAR tries its best to guess the correct set for a given host OS, it also includes a way for the end-user (sysadmin) to enforce a particular implementation for each such routine, as well as the overall default, using environment variables or Java properties. Since such settings can be passed from the environment outside the Jenkins web-application, they survive eventual upgrades of the jenkins.war. This technique is also expandable, to uniformly handle more such cases as the future comes down upon us and the libzfs.jar sources have to be updated again :)
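As a rough illustration of how such a toggle might be resolved, here is a minimal sketch. It is not the actual libzfs.jar code: the class `AbiToggles`, its methods, and the lookup order (per-routine setting first, then the overall default, with Java properties checked before environment variables) are assumptions made for this example. Only the `LIBZFS4J_ABI` and `LIBZFS4J_ABI_<routine>` names come from the examples in this README.

```java
import java.util.Map;
import java.util.Properties;

/** Hypothetical helper for illustration; not part of libzfs.jar. */
final class AbiToggles {
    /** Name of the overall default toggle, as used in this README. */
    static final String PREFIX = "LIBZFS4J_ABI";

    /** Looks a key up in Java properties first, then in the environment (assumed order). */
    static String lookup(String key, Properties props, Map<String, String> env) {
        String value = props.getProperty(key);
        if (value == null || value.isEmpty()) {
            value = env.get(key);
        }
        return (value == null || value.isEmpty()) ? null : value;
    }

    /**
     * Resolves the ABI variant for one wrapped routine:
     * the per-routine toggle wins, then the overall default, then the guessed value.
     */
    static String resolve(String routine, Properties props,
                          Map<String, String> env, String guessed) {
        String value = lookup(PREFIX + "_" + routine, props, env);
        if (value == null) {
            value = lookup(PREFIX, props, env);
        }
        return value != null ? value : guessed;
    }

    public static void main(String[] args) {
        // "openzfs" stands in for whatever the wrapper would guess for the host OS.
        String abi = resolve("zfs_iter_snapshots",
                System.getProperties(), System.getenv(), "openzfs");
        System.out.println("zfs_iter_snapshots ABI: " + abi);
    }
}
```

Passing the property and environment maps in as parameters, rather than reading them inside `resolve`, keeps the lookup easy to test without touching the real process environment.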

For details about currently supported names and values for such toggles please see the source code for your version of libzfs.jar (this is at the moment regarded as "implementation detail" so options are not listed here), or refer to the current Git HEAD status:

The settings ultimately applied to each toggle can be seen in the application server (or standalone Jetty app) log if you start it with a FINE or more verbose log4j logging level. See also the lockpick-libzfs-abi.sh script, which tries out the currently known toggle options and their values to pick the correct settings for an end-user's deployment.

From practice, for late versions of Sun Solaris and for roughly half a decade's worth of operating systems and ZFS modules based on the illumos and OpenZFS codebases, a likely end-user setup (e.g. in application server settings) would be:

LIBZFS4J_ABI=openzfs LIBZFS4J_ABI_zfs_iter_snapshots=legacy

while for illumos-based OSes with kernels from mid-2016 onward it would be all-new:

LIBZFS4J_ABI=openzfs

Note that more work is possible in this area, in particular expanding Jenkins ZFS support to operating systems that do not identify as SunOS, but this improvement is out of scope for this update (the decision is made outside the libzfs.jar codebase). It could help, though, to ask the wrapper whether it can represent ZFS on the host OS, rather than guessing from some strings the OS provides.

At this time one can wrap the initialization of a LibZFS instance in the caller's set-up method (rather than using a pre-initialized static final class member) and catch the resulting exceptions. This should cover both the absence of ZFS on the host OS (or other inability to use it) and the end-user's explicit request to not use the wrapper via -DLIBZFS4J_ABI=off. See LibZFSTest.java for more details.
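The guarded set-up described above could look roughly like the sketch below. The `ZfsSetup` class and its `tryInit` method are hypothetical names invented for this example, and the set of throwable types caught (`Exception` and `LinkageError`) is an assumption about what a failed initialization may raise; LibZFSTest.java remains the authoritative example.

```java
import java.util.Optional;
import java.util.function.Supplier;

/** Hypothetical helper for illustration; not part of libzfs.jar. */
final class ZfsSetup {
    /**
     * Runs the given factory and returns its result, or an empty Optional
     * if construction fails (for example because ZFS is absent, the native
     * library cannot be linked, or the wrapper was explicitly switched off).
     */
    static <T> Optional<T> tryInit(Supplier<T> factory) {
        try {
            return Optional.ofNullable(factory.get());
        } catch (Exception | LinkageError e) {
            // Assumed failure types; adjust to what your libzfs.jar version throws.
            System.err.println("ZFS support not available: " + e);
            return Optional.empty();
        }
    }
}
```

A caller would then, presumably, write something like `Optional<LibZFS> zfs = ZfsSetup.tryInit(LibZFS::new);` in its set-up method and skip the ZFS-specific features when the result is empty, instead of letting a static initializer fail at class-load time.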

Kudos

  • Kohsuke Kawaguchi
  • Jim Klimov
  • Oleg Nenashev
  • Adam Stevko