
Switch to mpi_f08 #2486

Merged: 9 commits, merged on Jan 10, 2023

Conversation

@fstein93 (Contributor) commented Jan 4, 2023

This PR replaces the old Fortran 90-style MPI module with its Fortran 2008 counterpart. This solves a bunch of issues with the old Fortran wrapper.

Because the mpi_f08 module is not always available or is buggy with certain combinations of compilers and libraries, the use of the new module has to be actively enabled by adding the preprocessor flag -D__MPI_F08.

It will be used automatically via the toolchain if OpenMPI or Intel MPI is used. A few adjustments to the Dashboard will probably be necessary later.
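For illustration, a minimal sketch of what such a compile-time switch between the two Fortran bindings could look like (hypothetical module and routine names, not the actual CP2K wrapper code):

```fortran
! Hypothetical wrapper module: selects the MPI Fortran binding at compile time
! via the -D__MPI_F08 preprocessor flag described above.
MODULE mp_example
#if defined(__MPI_F08)
   USE mpi_f08          ! Fortran 2008 bindings: typed handles such as TYPE(MPI_Comm)
#else
   USE mpi              ! legacy Fortran 90 bindings: plain INTEGER handles
#endif
   IMPLICIT NONE
   PRIVATE
   PUBLIC :: mp_rank

CONTAINS

   FUNCTION mp_rank(comm) RESULT(rank)
#if defined(__MPI_F08)
      TYPE(MPI_Comm), INTENT(IN) :: comm
#else
      INTEGER, INTENT(IN)        :: comm
#endif
      INTEGER :: rank, ierr
      CALL MPI_Comm_rank(comm, rank, ierr)
   END FUNCTION mp_rank

END MODULE mp_example
```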

@fstein93 (Contributor, PR author) commented Jan 5, 2023

I am not really sure why Intel or MPICH have these issues. There are not too many changes on the CP2K side.

@fstein93 (Contributor, PR author) commented Jan 8, 2023

I suspect a bug on either the MPICH or the compiler side. I will consult the developers.

Furthermore, there is a bug in the Intel implementation that does not allow passing non-contiguous arrays, which is allowed by the MPI standard. I will open a bug report there as well.

@oschuett (Member) commented Jan 8, 2023

does not allow passing non-contiguous arrays, which is allowed by the MPI standard

Do we actually pass non-contiguous arrays to MPI? Generally, I think it would be good if we started using CONTIGUOUS wherever possible.

@fstein93 (Contributor, PR author) commented Jan 8, 2023

There is one spot. I forgot which one. This is what triggered the Intel compiler. What we could do is add the contiguous keyword if we do not use mpi_f08.
I have just done some research. There is a compile-time constant called MPI_SUBARRAYS_SUPPORTED which indicates whether this behavior is supported. If it is not set, we have to enforce contiguity of all arrays in our wrapper. The devil is in the details, but that should be easy to implement.
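For illustration, a minimal sketch of what enforcing contiguity could look like for a blocking routine (hypothetical routine name, not the actual CP2K wrapper). With the CONTIGUOUS attribute on the dummy argument, a non-contiguous actual argument is copied to a contiguous temporary before the call and copied back afterwards, which is safe here because the communication has completed before the temporary is released:

```fortran
! Hypothetical blocking wrapper: the CONTIGUOUS dummy argument guarantees that
! MPI always sees a contiguous buffer, regardless of what the caller passes.
! MPI_SUBARRAYS_SUPPORTED is the logical compile-time constant provided by the
! MPI library that indicates whether it could handle non-contiguous buffers itself.
SUBROUTINE mp_bcast_dp(buffer, root, comm)
   USE mpi_f08
   IMPLICIT NONE
   REAL(KIND(1.0D0)), CONTIGUOUS, INTENT(INOUT) :: buffer(:)
   INTEGER, INTENT(IN)                          :: root
   TYPE(MPI_Comm), INTENT(IN)                   :: comm
   INTEGER :: ierr

   CALL MPI_Bcast(buffer, SIZE(buffer), MPI_DOUBLE_PRECISION, root, comm, ierr)
END SUBROUTINE mp_bcast_dp
```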

@oschuett (Member) commented Jan 8, 2023

There is one spot. ...we have to enforce contiguity of all arrays in our wrapper.

Yes, I think the best solution would be to refactor that one spot and then simply declare all arrays in our wrapper as CONTIGUOUS. This should spare us a lot of hassle and produce more performance-portable code.

@alazzaro (Member) commented Jan 8, 2023

For the record, I've already spent some time adding the CONTIGUOUS keyword in DBCSR...

@fstein93 (Contributor, PR author) commented Jan 9, 2023

I added the contiguous keyword wherever possible, except for the non-blocking routines, where I used the is_contiguous function instead to prevent accidental programming mistakes (see commit message).
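For illustration, a minimal sketch of the run-time check for a non-blocking routine (hypothetical routine name, not the actual CP2K code); the dummy argument deliberately does not carry the CONTIGUOUS attribute, so a non-contiguous actual argument is detected instead of being silently copied:

```fortran
! Hypothetical non-blocking wrapper: contiguity is verified at run time with
! the Fortran 2008 intrinsic IS_CONTIGUOUS rather than enforced via a
! CONTIGUOUS dummy, so no hidden temporary copy can outlive the call.
SUBROUTINE mp_isend_dp(buffer, dest, tag, comm, request)
   USE mpi_f08
   IMPLICIT NONE
   REAL(KIND(1.0D0)), DIMENSION(:), INTENT(IN), ASYNCHRONOUS :: buffer
   INTEGER, INTENT(IN)                                       :: dest, tag
   TYPE(MPI_Comm), INTENT(IN)                                :: comm
   TYPE(MPI_Request), INTENT(OUT)                            :: request
   INTEGER :: ierr

   IF (.NOT. IS_CONTIGUOUS(buffer)) &
      ERROR STOP "mp_isend_dp: non-contiguous buffer passed to a non-blocking routine"

   CALL MPI_Isend(buffer, SIZE(buffer), MPI_DOUBLE_PRECISION, dest, tag, comm, request, ierr)
END SUBROUTINE mp_isend_dp
```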

@oschuett (Member) commented Jan 9, 2023

I added the contiguous keyword wherever possible, except for the non-blocking routines ...

Oh, good thinking!

As an alternative, we could use CONTIGUOUS POINTER or ALLOCATABLE for those async routines, because they are treated as distinct types and get checked at compile time. However, it's probably not worth the effort.

@fstein93 (Contributor, PR author) commented Jan 9, 2023

As an alternative, we could use CONTIGUOUS POINTER or ALLOCATABLE for those async routines, because they are treated as distinct types and get checked at compile time. However, it's probably not worth the effort.

That does not work if the array has a constant size, or the caller would have to artificially use an allocatable instead.

@fstein93 (Contributor, PR author) commented Jan 9, 2023

The issue with MPICH looks to me like a bug on the library or compiler side. If there are no complaints, I will just add a commit to enforce the use of the old MPI interface with MPICH in our install script, and then merge.

@oschuett (Member) commented Jan 9, 2023

Looks like this GCC bug, which I found via Stack Overflow.

@oschuett (Member) commented Jan 9, 2023

I will just add a commit to enforce the use of the old MPI interface with MPICH in our install script and merge.

I don't know. It feels like this could still trip up a lot of people. Maybe we should make it opt-in for now via a -D__MPI_F08 flag?

Frederick Stein added 9 commits on January 10, 2023, 11:34, including:

- For legacy/debugging
- In the case of non-blocking communication, we check contiguity with is_contiguous to prevent accidental copies on the caller side. If we called a non-blocking routine with a non-contiguous array while enforcing contiguity in the interface, the compiler would make the array contiguous before calling the routine and copy it back afterwards. MPI would then not copy the result into the actual array and might even throw a segfault, because the contiguous copy of the non-contiguous array might already have been cleaned up.
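To illustrate the hazard described in this commit message, a minimal sketch of the pattern that must be avoided (hypothetical code, not part of this PR):

```fortran
! Hypothetical, INCORRECT non-blocking wrapper: the CONTIGUOUS attribute on
! the dummy argument is the mistake here.
MODULE hazard_example
   USE mpi_f08
   IMPLICIT NONE
CONTAINS
   SUBROUTINE bad_irecv_dp(buffer, source, tag, comm, request)
      REAL(KIND(1.0D0)), CONTIGUOUS, INTENT(INOUT) :: buffer(:)
      INTEGER, INTENT(IN)                          :: source, tag
      TYPE(MPI_Comm), INTENT(IN)                   :: comm
      TYPE(MPI_Request), INTENT(OUT)               :: request
      INTEGER :: ierr
      CALL MPI_Irecv(buffer, SIZE(buffer), MPI_DOUBLE_PRECISION, source, tag, comm, request, ierr)
   END SUBROUTINE bad_irecv_dp
END MODULE hazard_example
! Calling bad_irecv_dp(matrix(1, :), ...) with a strided, non-contiguous slice
! makes the compiler pass a contiguous temporary and copy it back as soon as
! bad_irecv_dp returns. The receive only completes later, at MPI_Wait, so the
! data never reaches matrix and MPI may write into memory that has already
! been released, potentially causing a segfault.
```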
@fstein93 (Contributor, PR author) commented:

Just for reference: #1030
