Proposal: provide mpiacc wrapper script #12228

Open
gonzalobg opened this issue Jan 12, 2024 · 3 comments

@gonzalobg

Is your feature request related to a problem? Please describe.

The problem is described in issue #12225. Users with applications written in heterogeneous programming languages, where every translation unit is, e.g., a .cu CUDA file (or HIP, or SYCL), often find that they cannot compile their application with any of the provided wrappers. They try, and struggle, to compile it with, e.g., the C++ wrapper as follows:

OMPI_CXX=nvcc mpicxx .... main.cu

Heterogeneous compilers often invoke other compilers themselves. For example, nvcc expands the source code into a device part, compiled with a device-only compiler, and a host C++ file that is then compiled - in the case of a CUDA C++ MPI application - with a host C++ MPI compiler wrapper like mpicxx (and with a plain host C++ compiler like g++ or clang++ otherwise).
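For illustration, a minimal sketch of what this looks like in practice today, assuming Open MPI's mpicxx is on the PATH and hello_world.cu is a hypothetical single-file MPI + CUDA C++ program:

  # Tell nvcc to use the MPI C++ wrapper as its host compiler, so the host part
  # of the translation unit picks up MPI's include and link flags:
  nvcc -ccbin mpicxx hello_world.cu -o hello_world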

The feedback I've gotten from users, multiple times, is that they struggle to do this: they spend time fiddling with compiler wrapper options and environment variables, end up modifying their application (e.g., splitting the code that uses an accelerator from the code that initializes the program, just to simplify compilation), or have to reach for a full build system like CMake to compile a single-file "MPI + CUDA C++ hello world", since CMake will correctly query the include/link flags and preferred compiler from the wrapper and pass those on to the heterogeneous compiler.

Describe the solution you'd like

Compiling an application that mixes MPI with a heterogeneous language (like CUDA C++, HIP, etc.) should be as easy as:

mpiacc hello_world.cu

Compiling multiple translation units should be as easy as compiling them with mpiacc, and linking them together.
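For example (mpiacc is the proposed wrapper, which does not exist yet; file names are hypothetical):

  mpiacc -c kernels.cu
  mpiacc -c main.cu
  mpiacc kernels.o main.o -o app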

Describe alternatives you've considered
See above. There are many workarounds, but none of them provide a smooth experience for beginner MPI programmers who want to extend a single-GPU application to multiple GPUs.

Additional context
This proposal was discussed in this week's MPICH developer call, and there is an issue tracking it there: pmodels/mpich#6867
It would be best for users if the MPI wrappers for heterogeneous compilers had a similar API in both implementations.

@bosilca
Member

bosilca commented Jan 12, 2024

We could make a gesture for a particular community, but where do we stop?

I don't share your concerns here: configuration tools (autoconf, cmake) have their own ways of identifying the compile and link flags that need to be passed to any compiler to build applications. For everything else you can always fall back on mpicc --showme or mpicxx --showme.
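For instance, a rough sketch of that fallback for a hypothetical MPI + CUDA C++ file, letting nvcc do the compilation with the flags extracted from the wrapper:

  nvcc $(mpicxx --showme:compile) -c hello_world.cu
  nvcc hello_world.o $(mpicxx --showme:link) -o hello_world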

@jsquyres
Member

I understand the pain of users just trying to compile / link their applications, particularly when trying to mix multiple tools -- such as MPI and CUDA.

However, I'm not sure that MPI needs to be the integration point for all compilation and linking. For example, if Open MPI includes an mpicudacc wrapper compiler, how will it know what flags to pull from CUDA? More specifically, what if I'm using an old Open MPI and a new CUDA release -- will the old Open MPI know how to pull the newest set of flags from CUDA? More generally: how does Open MPI keep up with these CUDA flags over time? Also, some flags are necessary, but others are configuration-dependent, or user-chosen. How should mpicudacc know what choices to make for all of these?

And to @bosilca's point, how does Open MPI also keep up with the ROCm flags that are needed over time? ... etc. Open MPI's release schedule is orthogonal to the release schedules of all the other HPC tools; what happens when incompatible changes are made and Open MPI now has stale / incorrect flags for a peer tool? That seems undesirable, and just creates more user confusion and frustration.

Even if you flip the script and make CUDA be the integration point, how would CUDA keep up with the changing set of Open MPI (and MPICH and ...) flags over time?

Rather than everyone having to keep up with everyone else's flags, Open MPI's approach has been to provide multiple mechanisms to extract the flags from our wrapper compilers, and also to allow nesting of wrapper compilers. We -- Open MPI -- can't know exactly what the end user will want from their other tools, or what systems they will want to compile/link against. As such, all we can do is provide both standardized and Open MPI-specific ways to extract what is needed to compile/link against Open MPI:

  1. Nesting wrapper compilers via environment variables
  2. The Open MPI wrapper --showme* CLI options to extract wrapper compiler flags
  3. Installed .pc files to allow use of pkg-config to extract wrapper compiler flags
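A rough sketch of each mechanism for a hypothetical MPI + CUDA C++ file (the exact flags depend on the installation, and the pkg-config example assumes the install provides an ompi-cxx.pc file):

  # 1. Nest the wrappers: have mpicxx invoke nvcc as the underlying compiler
  #    (this is the pattern the original report says users struggle with):
  OMPI_CXX=nvcc mpicxx hello_world.cu -o hello_world

  # 2. Extract the flags from the wrapper and hand them to nvcc directly:
  nvcc $(mpicxx --showme:compile) hello_world.cu $(mpicxx --showme:link) -o hello_world

  # 3. Same idea via pkg-config (may require PKG_CONFIG_PATH to point at
  #    Open MPI's lib/pkgconfig directory):
  nvcc $(pkg-config --cflags --libs ompi-cxx) hello_world.cu -o hello_world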

Are these existing mechanisms not sufficient?

Note: I'm not asking if they're trivially easy to use -- I'm asking whether they're sufficient to allow correct compiling and linking of Open MPI applications against other systems.

I understand that compiling / linking large HPC applications can be challenging. But no matter how it is done, some level of expertise is going to be needed by the end user. Perhaps better documentation and/or examples are needed...? If there's something that can be done in Open MPI's docs, for example, I'm open to suggestions (let's do this in the Open MPI v5.0.x docs and beyond -- i.e., https://docs.open-mpi.org/ -- there's not much point in doing this for v4.1.x and earlier).

@ggouaillardet
Contributor

FWIW, that can be achieved locally by the end users.

From the install directory:

  • symlink bin/mpiacc to opal_wrapper
  • copy share/openmpi/mpicc-wrapper-data.txt into share/openmpi/mpiacc-wrapper-data.txt
  • edit share/openmpi/mpiacc-wrapper-data.txt and replace the line compiler=... with compiler=nvcc

As @jsquyres pointed out, some other adjustments might be required.
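A minimal shell sketch of those steps, assuming a default Open MPI install layout and a hypothetical install prefix (GNU sed shown for the in-place edit):

  cd /opt/openmpi                          # hypothetical install prefix
  ln -s opal_wrapper bin/mpiacc            # bin/mpiacc -> bin/opal_wrapper
  cp share/openmpi/mpicc-wrapper-data.txt share/openmpi/mpiacc-wrapper-data.txt
  sed -i 's/^compiler=.*/compiler=nvcc/' share/openmpi/mpiacc-wrapper-data.txt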
