Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ABI: add option to build libmpi_abi.so #6390

Merged
merged 26 commits into from
Dec 13, 2023
Merged

ABI: add option to build libmpi_abi.so #6390

merged 26 commits into from
Dec 13, 2023

Conversation

hzhou
Copy link
Contributor

@hzhou hzhou commented Feb 7, 2023

Pull Request Description

Reference the current MPI ABI working group: https://github.com/mpiwg-abi

This pr enable mpich to build libmpi_abi.so, which conforms to the new (in-progress) MPI ABI standard

Use --enable-mpi-abi to enable it.

To test

$ ./autogen.sh
$ ./configure --prefix=[path] --eanble-mpi-abi
$ make install
$ [set path]
$ mpicc_abi -o cpi examples/cpi.c

$ mpirun -n 2 ./cpi

$ ldd ./cpi

$ nm [path/to/]libmpi_abi.so | less

Points

For users

  • mpicc builds mpich abi, mpicc_abi builds mpi5 abi
  • libmpi.so uses mpich abi, libmpi_abi.so uses mpi5 abi
  • mpi.h will effectively become mpi_abi.h when mpicc_abi is used. User code always #include <mpi.h>

implementations

  • src/binding/abi/mpi_abi_util.c provides the shim for ABI link-time constants
  • src/binding/abi/mpi_abi_util.h provides conversion routines
  • src/binding/abi/c_binding_abi.c provides all the MPI_ and PMPI_ functions
  • generate mpi_abi_internal.h from mpi_abi.h for internal use (mpi_abi_util.c and c_binding_abi.c)

Notes

  • Let's push for MPI_Aint/Offset/Count/Fint to be compatible ABI

  • MPICH will need to support both MPICH ABI and MPI-5 ABI for the foreseeable future

  • Open MPI likely can use MPI-5 ABI directly for its future versions

  • MPI handler constant objects need to use pointers in ABI; MPICH can dereference its handle

  • Scalar constants

    • option 1: translate in and translate out -- shim overhead
    • option 2: re-compile all objects with ABI constants -- build everything twice -- current choice

Author Checklist

  • Provide Description
    Particularly focus on why, not what. Reference background, issues, test failures, xfail entries, etc.
  • Commits Follow Good Practice
    Commits are self-contained and do not do two things at once.
    Commit message is of the form: module: short description
    Commit message explains what's in the commit.
  • Passes All Tests
    Whitespace checker. Warnings test. Additional tests via comments.
  • Contribution Agreement
    For non-Argonne authors, check contribution agreement.
    If necessary, request an explicit comment from your companies PR approval manager.

@hzhou hzhou force-pushed the 2302_abi branch 3 times, most recently from 237b902 to 086c8d6 Compare February 7, 2023 23:02
@hzhou hzhou marked this pull request as draft February 8, 2023 15:22
@raffenet
Copy link
Contributor

raffenet commented Feb 8, 2023

So instead of calling PMPI from the ABI compat layer, we now have basically a duplicate binding layer, right? To be clear, IMO, this is a good thing. It should allow support for PMPI/QMPI tools and avoids the patent issue noted by @jeffhammond.

@hzhou
Copy link
Contributor Author

hzhou commented Feb 8, 2023

So instead of calling PMPI from the ABI compat layer, we now have basically a duplicate binding layer, right? To be clear, IMO, this is a good thing. It should allow support for PMPI/QMPI tools and avoids the patent issue noted by @jeffhammond.

Yes. The ABI binding will not contain any QMPI or MPIX stuff (until standardized). I essentially grabbed the mpi_abi.h from @jeffhammond https://github.com/jeffhammond/mukautuva -- BTW, it is missing MPI_T typedefs.

This PR will omit the MPI-IO for now since we haven't migrated the ROMIO binding layer yet.

@hzhou
Copy link
Contributor Author

hzhou commented Feb 8, 2023

@jeffhammond Could you consolidate the header files into mpi_abi.h and push to https://github.com/mpiwg-abi/compatibility-layer? We should do PRs over there for better review of progress.

@jeffhammond
Copy link
Member

Yeah, I think I am close enough to do that. I'll work on it tomorrow.

@hzhou
Copy link
Contributor Author

hzhou commented Feb 17, 2023

test:mpich/ch3/most ✔️
test:mpich/ch4/most ✔️

1 unrelated failure - ch4-ofi-amonly:
impls/mpich/threads/pt2pt/multinic_infohints 2 MPIR_CVAR_CH4_OFI_ENABLE_MULTI_NIC_STRIPING=0 MPIR_CVAR_CH4_OFI_ENABLE_MULTI_NIC_HASHING=0

* Try avoid print handle as int
* MPI_MESSAGE_NULL may not be MPI_REQUEST_NULL
* MPI_Comm may not be the same as MPI_Win
* Cast to (long long) to print MPI_Count
We should not assume handles to be of integer type.
Custom clean up code (ref. MPI_Comm_spawn_multiple) are loaded in
func['code-clean_up']. We need avoid overwriting this attribute or we
will lose the custom cleanup code then function is being processed
twice.
Newer gcc warns when non-null array constants such as MPI_UNWEIGHTED is
passed to paramber typed as "int param[]". Make the parameter type
simply a pointer avoids such warnings.
Remove the assumption that MPI_PROC_NULL is -1 and MPI_ANY_SOURCE is -2.
The constants may be defined as negative integers, as in the current ABI
draft.
MPID_MAX_PORT_NAME is 1024, and it is the same as abi MPI_MAX_PORT_NAME.
Alternatively, we can raise MPID_MAX_PORT_NAME to higher value.
Then internal code need deal with both the types from ABI header and the
actual types used by MPICH. Generate mpi_abi_internal.h from mpi_abi.h
by renaming all MPI_ typenames into ABI_ prefix. Generate mpi_abi_util.c
to initialize an internal table for builtin datatypes and ops
conversions.

Generates:
    src/binding/abi/mpi_abi_internal.h
    src/binding/abi/mpi_abi_util.c
Various conversion functions.
Use --enable-mpi-abi configure option to enable build and installing
libmpi_abi.so.
Dump c_binding_abi.c, to be linked into libmpi_abi.so.
If MPI_ABI is defined, make mpi.h to effectively the same as mpi_abi.h.
This allows applications to use mpicc_abi to compile their code without
any changes.
When BUILD_MPI_ABI is defined, include mpi_abi_internal.h, and exclude
the portion that defines constants and typedefs that are already defined
in mpi_abi_internal.h. We essentially swapping the ABI for all internal
code.

Note that we need to always define the MPICH specific parts or some
internal code won't build.

MPICH_SUPPRESS_PROTOTYPES is no longer used, so we removed it.
Internal all handles object may not be part of the ABI. Add check macro
to replace it on entry.
The ABI signature of callback functions can be different. We need
convert the handle input upon calling callbacks.
MPI-5 ABI may define various DUP_FN as constants. We can easily swap
them during input.
MPI_COMBINER_HVECTOR_INTEGER, MPI_COMBINER_HINDEXED_INTEGER, and
MPI_COMBINER_STRUCT_INTEGER are removed in the current MPI standards and
thus can be missing from the abi header. Add macro-guard to prevent
compile failures.
* Pass --enable-mpi-abi to romio configure
* Build libromio_abi.so and libpromio_abi.so when enabled
* Include mpi_abi.h instead of mpi.h when BUILD_MPI_ABI is defined
* internally adio.h will include mpio.h without abi, and
  romio_abi_internal.h with abi. I.e. mpio.h is skipped ifdef
  BUILD_MPI_ABI.
Three routines are defined in ROMIO, potentially using ABI header. We
need handler conversion to use inside MPICH.

Only include mpir_ext.h in the code that needs it. With BUILD_MPI_ABI,
we need alter the function parameter types for conversions.

Since we no longer include mpir_ext.h in the binding files, we need
explicitly declare them. We can treat them as void pointers to avoid the
distinction between ABI and non-ABI versions.
ROMIO directly uses the MPI ABI header since it sits on top of the MPI
layer. As such, we need add ABI handle conversions for functions defined
in MPICH and called by ROMIO.

Define them all in src/glue/romio/glue_romio.c.
Update builtin handle constants based on upstream huffman coding scheme.  Ref:
https://github.com/mpiwg-abi/specification-text-draft/blob/main/print-handle-constants.py
@hzhou hzhou merged commit 0b71273 into pmodels:main Dec 13, 2023
4 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants