Check the runtime version of PMIx
It has been reported (and confirmed) that building against
one version of PMIx and then running with another version
can cause PRRTE to segfault. This isn't a universal rule.
For example, one can switch between v5.0 and master without
a problem. However, switching between v5.0 and v4.2 reliably
segfaults.

The root cause of the problem is a change in the layout
of the base pmix_object_t definition. This renders all
PMIx objects binary incompatible when crossing between
the v5 and v4 (and below) series.
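
To illustrate why a change in the base object layout breaks every derived object, here is a minimal sketch. The structs `old_object_t`, `new_object_t`, `old_item_t`, and `new_item_t` and all of their members are hypothetical stand-ins invented for this example; they are not the real `pmix_object_t` definitions.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical "older series" base object (NOT the real pmix_object_t). */
typedef struct {
    void   *obj_class;
    int32_t ref_count;
} old_object_t;

/* Hypothetical "newer series" base object: one member was inserted,
 * standing in for whatever state the new shared memory support needs. */
typedef struct {
    void   *obj_class;
    void   *allocator;
    int32_t ref_count;
} new_object_t;

/* Derived types embed the base object as their first member, so any
 * change to the base shifts every field that follows it. */
typedef struct { old_object_t super; int payload; } old_item_t;
typedef struct { new_object_t super; int payload; } new_item_t;

/* Returns 1 if code compiled against one layout could safely touch
 * objects created with the other layout, 0 otherwise. */
static int layouts_match(void)
{
    return offsetof(old_object_t, ref_count) == offsetof(new_object_t, ref_count)
        && offsetof(old_item_t, payload) == offsetof(new_item_t, payload)
        && sizeof(old_item_t) == sizeof(new_item_t);
}
```

In this sketch `layouts_match()` returns 0: code built against one definition would read `ref_count` and `payload` at the wrong offsets in objects allocated by a library built with the other definition.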

Changing the v5 definition back to match v4 is an
overly complex task. The changes were required to
accommodate the new shared memory support that
was introduced in v5.

So instead, we check the runtime version of PMIx against
the build version. If the runtime version is incompatible
with the build version, then we print an explanatory
error message and error out.
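
The check described above can be sketched as a standalone pair of helpers. This is an illustration under stated assumptions, not the code in this commit: `parse_version` and `series_compatible` are hypothetical names, the version string is assumed to have the shape "<name> <major>.<minor>.<release>" (which is what the `sscanf` pattern in the diff expects), and the rule is generalized to "v5 and above versus v4 and below" as stated in the message.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: extract the version triplet from a string of the
 * assumed form "<name> <major>.<minor>.<release>[ ...]".
 * Returns 0 on success, -1 if the string does not match. */
static int parse_version(const char *s, unsigned *major,
                         unsigned *minor, unsigned *release)
{
    char token[100];

    if (NULL == s) {
        return -1;
    }
    /* %99s bounds the write into token; all four fields must convert */
    if (4 != sscanf(s, "%99s %u.%u.%u", token, major, minor, release)) {
        return -1;
    }
    return 0;
}

/* Hypothetical helper: the build and runtime libraries are treated as
 * compatible only if both are v5-or-above or both are v4-or-below. */
static int series_compatible(unsigned build_major, unsigned runtime_major)
{
    int build_v5plus = (build_major >= 5);
    int runtime_v5plus = (runtime_major >= 5);

    return build_v5plus == runtime_v5plus;
}
```

A caller would parse the runtime string first, then pass its major number together with the build-time major number to `series_compatible`, and print the error and abort if the result is 0.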

Signed-off-by: Ralph Castain <rhc@pmix.org>

rhc54 committed Jun 5, 2024
1 parent 7e0ff9b commit 0e150a4
Showing 1 changed file with 40 additions and 0 deletions.
40 changes: 40 additions & 0 deletions src/runtime/prte_init.c
@@ -38,6 +38,9 @@
#ifdef HAVE_SYS_STAT_H
# include <sys/stat.h>
#endif
#ifdef HAVE_STRING_H
#include <string.h>
#endif

#include "src/util/error.h"
#include "src/util/error_strings.h"
@@ -122,16 +125,53 @@ static bool check_exist(char *path)
    return false;
}

static void print_error(unsigned major,
                        unsigned minor,
                        unsigned release)
{
    fprintf(stderr, "************************************************\n");
    fprintf(stderr, "We have detected that the runtime version\n");
    fprintf(stderr, "of the PMIx library we were given is binary\n");
    fprintf(stderr, "incompatible with the version we were built against:\n\n");
    fprintf(stderr, " Runtime: 0x%x%02x%02x\n", major, minor, release);
    fprintf(stderr, " Build: 0x%0x\n\n", PMIX_NUMERIC_VERSION);
    fprintf(stderr, "Please update your LD_LIBRARY_PATH to point\n");
    fprintf(stderr, "us to the same PMIx version used to build PRRTE.\n");
    fprintf(stderr, "************************************************\n");
}

int prte_init_minimum(void)
{
    int ret;
    char *path = NULL;
    const char *rvers;
    char token[100];
    unsigned int major, minor, release;

    if (min_initialized) {
        return PRTE_SUCCESS;
    }
    min_initialized = true;

    /* check to see if the version of PMIx we were given in the
     * library path matches the version we were built against.
     * Because we are using PMIx internals, we cannot support
     * cross version operations from inside of PRRTE.
     */
    rvers = PMIx_Get_version();
    /* bound the %s conversion so it cannot overflow token[100] */
    ret = sscanf(rvers, "%99s %u.%u.%u", token, &major, &minor, &release);

    /* check the version triplet - we know that version
     * 5 and above are not runtime compatible with version
     * 4 and below. Since PRRTE has a minimum PMIx requirement
     * in the v4.x series, we only need to check v4 vs 5
     * and above */
    if ((PMIX_VERSION_MAJOR > 4 && 4 == major) ||
        (PMIX_VERSION_MAJOR == 4 && 5 <= major)) {
        print_error(major, minor, release);
        return PRTE_ERR_SILENT;
    }

    /* carry across the toolname */
    pmix_tool_basename = prte_tool_basename;
