v4.0.0: Check for ABI compatibility with v3.1.x #5447

Closed
jsquyres opened this issue Jul 17, 2018 · 4 comments

@jsquyres
Member

Per https://www.mail-archive.com/ompi-packagers@lists.open-mpi.org/msg00015.html and webex discussion on 2018-07-17, @hppritcha and @gpaulsen will be conducting an ABI compatibility check between the v4.0.x branch (which will be created tomorrow) and the v3.1.x branch.

If possible, we would like to not change libmpi/liboshmem's .so major version for all the reasons cited in https://www.mail-archive.com/ompi-packagers@lists.open-mpi.org/msg00015.html. We're pretty sure that this means increasing both c and a by 10.
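
To illustrate why bumping c and a together leaves the .so major version alone: on Linux, libtool derives the soname major number as current minus age, so adding the same amount to both keeps the difference fixed. The sketch below uses made-up c:r:a values (not Open MPI's actual ones), and the helper names are hypothetical.

```python
def soname_major(current: int, revision: int, age: int) -> int:
    """Major number of the soname that libtool derives on Linux (current - age)."""
    if min(current, revision, age) < 0 or age > current:
        raise ValueError("need non-negative values with age <= current")
    return current - age


def bump_compatibly(current: int, revision: int, age: int, step: int):
    """Add `step` interface versions while still supporting every old one.

    Per libtool's versioning rules, adding interfaces raises `current`,
    resets `revision` to 0, and (when nothing is removed) raises `age`.
    """
    return current + step, 0, age + step


if __name__ == "__main__":
    old = (50, 3, 10)                 # hypothetical c:r:a, for illustration only
    new = bump_compatibly(*old, 10)   # increase both c and a by 10
    print(new)                        # (60, 0, 20)
    print(soname_major(*old), soname_major(*new))  # 40 40
```

Because the major number is unchanged, binaries linked against the old library would still resolve the new one by the same soname; whether they actually run correctly is what the manual testing described in this issue is meant to check.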

Howard and Geoff will need to check the branches manually, and probably do a bunch of testing, such as:

  1. Compile and install Open MPI v3.1.x into /path
  2. Compile a bunch of tests against that Open MPI installation
  3. rm -rf /path
  4. Compile and install Open MPI v4.0.x (with appropriate c:r:a values and possibly --enable-mpi1-compat) into /path
  5. Run the tests (without recompiling/relinking) and make sure they all work properly
@jsquyres
Member Author

FYI: @amckinstry

@amckinstry

Thanks for the pointer; this plan makes sense to me. Postponing symbol versioning and adding it in v5.0 would be best.

Since symbols will be dropped in v5.0, a new major SOVERSION would be required.
From a Debian perspective, the next release is Q2/Q3 2019, with a transition freeze in January, so we'd go with v4.0, or, if a new major SOVERSION is required, probably 3.x, as it's unlikely we would get a full transition through before January (transitions basically cause traffic jams in our release planning).
As soon as a test release of v4.0 is available to work with, I'm going to do an experimental package in Debian and test-build all our MPI packages against it. This will show whether anyone is using the MPI-1 features or the CXX interface; I'll disable them in the test build and only enable them if necessary, in prep for v5.0.

@gpaulsen
Member

gpaulsen commented Sep 7, 2018

I have completed this testing. It is important to acknowledge that testing alone can never fully guarantee ABI compatibility. This testing was done in combination with discussions in the weekly webex meetings with our core developers, and was deemed the gating factor for whether we can keep our .so versions compatible so as NOT to require users to rebuild/relink their MPI applications.

I built and ran the Open MPI ibm test suite (generally considered good coverage of the MPI API) with the top of the v3.1.x branch. I then deleted that Open MPI v3.1.x install and built Open MPI v4.0.x twice (I repeated this process both with and without the new configure flag --enable-mpi1-compatibility). I then updated the variables:

OPAL_PREFIX
OPAL_LIBDIR
PATH
LD_LIBRARY_PATH
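
A minimal sketch of pointing those four variables at a new install before launching already-built binaries. It assumes the install keeps its executables in `bin` and its libraries in `lib` under the prefix; the function name and the example prefix are hypothetical.

```python
import os


def _prepend(entry: str, existing: str) -> str:
    """Put `entry` in front of an existing search path, if there is one."""
    return entry if not existing else entry + os.pathsep + existing


def relocated_env(prefix: str, base_env=None) -> dict:
    """Return a copy of the environment that points at the install under `prefix`."""
    env = dict(os.environ if base_env is None else base_env)
    bindir = os.path.join(prefix, "bin")   # assumption: default layout
    libdir = os.path.join(prefix, "lib")   # assumption: default layout
    env["OPAL_PREFIX"] = prefix
    env["OPAL_LIBDIR"] = libdir
    env["PATH"] = _prepend(bindir, env.get("PATH", ""))
    env["LD_LIBRARY_PATH"] = _prepend(libdir, env.get("LD_LIBRARY_PATH", ""))
    return env


if __name__ == "__main__":
    env = relocated_env("/path/to/ompi-4.0.x")   # hypothetical prefix
    for name in ("OPAL_PREFIX", "OPAL_LIBDIR", "PATH", "LD_LIBRARY_PATH"):
        print(name, "=", env[name])
```

Building a separate dictionary rather than mutating `os.environ` makes it easy to pass the same settings to each test run and leaves the calling shell's environment untouched.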

I was then able to RUN the binaries under each subdirectory of the ibm test suite.
A few tests failed in both test runs because there were not enough ranks (I used --np 4, along with the vader btl). I also killed the int_overflow test in the collectives subdir, as that test is a resource hog. These failures ensured that I correctly saw what a failure looked like. I then checked the return codes of all of the other tests, and they all correctly returned 0.
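
The return-code check can be sketched as a small runner that launches each command and records its exit status. This is an illustration rather than the script used for the testing above; the function names are hypothetical, and the commented `mpirun` command line is an assumption about how one such test might be launched.

```python
import subprocess
import sys


def run_and_collect(commands: dict, env=None, timeout=None) -> dict:
    """Run each argv list and map its name to the exit code (None on timeout)."""
    results = {}
    for name, argv in commands.items():
        try:
            proc = subprocess.run(
                argv,
                env=env,
                timeout=timeout,
                stdout=subprocess.DEVNULL,
                stderr=subprocess.DEVNULL,
            )
            results[name] = proc.returncode
        except subprocess.TimeoutExpired:
            results[name] = None  # treated as a failure below
    return results


def failures(results: dict) -> list:
    """Names of the runs that did not exit with status 0."""
    return sorted(name for name, rc in results.items() if rc != 0)


if __name__ == "__main__":
    # Stand-in commands so the sketch runs anywhere. A real entry might look like
    # ["mpirun", "--np", "4", "--mca", "btl", "vader,self", "./some_test"]
    # (assumed command line, not taken from the issue).
    demo = {
        "passes": [sys.executable, "-c", "raise SystemExit(0)"],
        "fails": [sys.executable, "-c", "raise SystemExit(3)"],
    }
    results = run_and_collect(demo, timeout=60)
    print(results)
    print("failed:", failures(results))
```

Including a command that is known to fail, as in the demo, serves the same purpose as the expected failures mentioned above: it confirms that the harness reports a non-zero status when one occurs.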

PASSED.

I will close this issue and create a PR of the VERSION file to v4.0.x that reflects NO need to require customers who built with v3.1.x to recompile/relink with v4.0.0.

gpaulsen added a commit to gpaulsen/ompi that referenced this issue Sep 7, 2018
  This was done after discussions with core developers about any
  potential ABI breakage for any of the libs the user directly
  links against.  Also compaitiblity tests were done using the
  ibm test suite and building with v3.1.x and running with v4.0.x
  see: open-mpi#5447
@jsquyres
Member Author

Per 2018-09-11 webex, @gpaulsen says this is now complete / ok to close.
