Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support for passing mpi4py comm through cython. #818

Merged
merged 10 commits into from
May 6, 2022

Conversation

hmcezar
Copy link
Contributor

@hmcezar hmcezar commented Apr 21, 2022

Description

As mentioned in #817, currently it's not possible to setMPIComm using the Python interface.

Target release

I would like my code to appear in the next point releases.

Type of contribution
  • changes to code or doc authored by PLUMED developers, or additions of code in the core or within the default modules
  • changes to a module not authored by you
  • new module contribution or edit of a module authored by you
Copyright
  • I agree to transfer the copyright of the code I have written to the PLUMED developers or to the author of the code I am modifying.
  • the module I added or modified contains a COPYRIGHT file with the correct license information. Code should be released under an open source license. I also used the command cd src && ./header.sh mymodulename in order to make sure the headers of the module are correct.
Tests
  • I added a new regtest or modified an existing regtest to validate my changes.
  • I verified that all regtests are passed successfully on GitHub Actions.

@GiovanniBussi
Copy link
Member

@hmcezar thanks for your contribution!

Could you please check if it's possible to add a regression test? You could just add one extra test in this directory. I think that in our master branch they are run with pytest, so you could be able to just do something like

cd python
pytest -v

Then, for the test to work on GitHub actions, you should install the mpi4py package there as well.

A possible strategy would be to make sure that the test is skipped when the package is not available (by properly adding a try import in the new test you are going to add) and then install the mpi4py package in the GHActions jobs where MPI is installed. This can be done by adding after this line something like

pip install mpi4py

@hmcezar
Copy link
Contributor Author

hmcezar commented Apr 22, 2022

@GiovanniBussi I know the test is too simple, but I'm not sure what could be done to really test the implementation.

About the releases, maybe it should be included in 2.7 and 2.8?

@GiovanniBussi
Copy link
Member

@hmcezar the test is fine for start.

I am a bit skeptical in including mpi4py as a dependency in conda. The more dependencies you add, the more you have problems with upgrades on conda.

An ideal solution would be one where:

  • having mpi4py installed at compile time is irrelevant
  • if mpi4py is installed at run time, the py-plumed package can use it
  • otherwise, it can't

For this to work, at least the cython compilation should work without mpi4py. Notice that we have type checkings in the python interface (that's why points are set to, e.g. <double*>, here). However, the type of a MPI communicator is not checked (it might be a typedef for int or a pointer depending on the implementation). Thus, if the only problem is with the conversion, you can use a <void*> cast. Would that fix the compilation without the need to install mpi4py before compiling plumed?

This reverts commit 4fd8c3f.
@hmcezar
Copy link
Contributor Author

hmcezar commented Apr 22, 2022

@GiovanniBussi No problem, I reverted the commit.

The CI problem seems to be in this line when I check if the object is MPI.Comm.
Any suggestions on how to make the cython compilation work in this case? Since it's in the Python part, it's not possible to cast it to something else, and the check is needed to make sure the right cmd is called.

@GiovanniBussi
Copy link
Member

Did you try to use a normal import here instead of a cimport?

In addition, here you could just convert to <void>.

@hmcezar
Copy link
Contributor Author

hmcezar commented Apr 25, 2022

Indeed, I think using import instead of cimport may solve the issue. Let's see what the CI thinks.

@hmcezar
Copy link
Contributor Author

hmcezar commented Apr 27, 2022

Well, after implementing the suggested changes I still get the CI errors below.

The plumed.pyx:136:30: 'MPI' is not a cimported **module** could be solved by not specifying the MPI.Comm of val in the cdef declaration, and that's ok, but the plumed.pyx:137:14: 'MPI_Comm' is not a type identifier I don't know how I can solve. I need to declare the buffer as the proper type before passing it as <void>, correct?

[1/1] Cythonizing plumed.pyx

Error compiling Cython file:
------------------------------------------------------------
...
except ImportError:
     HAS_NUMPY=False

try:
     import mpi4py.MPI as MPI
     cimport mpi4py.libmpi as libmpi
            ^
------------------------------------------------------------

plumed.pyx:52:13: 'mpi4py/libmpi.pxd' not found

Error compiling Cython file:
------------------------------------------------------------
...
         self.c_plumed.cmd( ckey, <long*>&abuffer[0], len(abuffer))
     cdef cmd_float(self, ckey, double val ):
         self.c_plumed.cmd_float( ckey, val)
     cdef cmd_int(self, ckey, int val):
         self.c_plumed.cmd_int( ckey, val)
     cdef cmd_mpi(self, ckey, MPI.Comm val):
                             ^
------------------------------------------------------------

plumed.pyx:136:30: 'MPI' is not a cimported module

Error compiling Cython file:
------------------------------------------------------------
...
     cdef cmd_float(self, ckey, double val ):
         self.c_plumed.cmd_float( ckey, val)
     cdef cmd_int(self, ckey, int val):
         self.c_plumed.cmd_int( ckey, val)
     cdef cmd_mpi(self, ckey, MPI.Comm val):
         cdef libmpi.MPI_Comm buffer = val.ob_mpi
             ^
------------------------------------------------------------

plumed.pyx:137:14: 'MPI_Comm' is not a type identifier

@hmcezar
Copy link
Contributor Author

hmcezar commented May 5, 2022

Ok, so digging deeper into this issue, it looks like cython supports conditional compilation.

The solution probably should go in this direction. Detect mpi4py prior to the actual cython compilation and DEF a variable based on that. Then, a simple IF would do the trick.

@GiovanniBussi
Copy link
Member

@hmcezar this is a possible solution, but I think it would be suboptimal.

In practice, if you do so, one should compile the python extension with a MPI compiler. This is not done by default when you install plumed, and it is not done in the py-plumed conda and Macports packages.

Notice that at the level of the python extension there should be no need to include the <mpi.h> header. Indeed, since we are passing MPI communicators by pointer, it is not necessary to know their size. It would be ideal if there was a way to find the address of the MPI communicator so that we can pass it as void*, and this should be feasible even if we do not know its sizeof (which is the information contained in the <mpi.h> file.

@lgtm-com
Copy link
Contributor

lgtm-com bot commented May 5, 2022

This pull request introduces 1 alert when merging feb1795 into 99bcb04 - view on LGTM.com

new alerts:

  • 1 for Unused import

@GiovanniBussi
Copy link
Member

GiovanniBussi commented May 5, 2022

I am not sure how portable this would be, but you might use id() to get the address. It looks like the cdef class Comm in py4mpi only contains the ob_mpi object. So, it is possible (though I didn't test) that by using id(mpi4py.MPI.COMM_WORLD) you can get an integer representation of a pointer that could be passed to plumed.

@hmcezar
Copy link
Contributor Author

hmcezar commented May 5, 2022

@GiovanniBussi The problem is that even though I understand that and it should be possible in the sense that we are just passing pointers, we still have to make some checks before knowing which cfunction we should call.
The isinstance check should be done against a MPI.comm, which is what the Python codes will pass to the interface.
To perform this check, we will need mpi4py, otherwise we get stuff like:

Error compiling Cython file:
------------------------------------------------------------
...
     HAS_NUMPY=True
except ImportError:
     HAS_NUMPY=False

try:
     cimport mpi4py.MPI as MPI
            ^
------------------------------------------------------------

plumed.pyx:51:13: 'mpi4py/MPI.pxd' not found

Error compiling Cython file:
------------------------------------------------------------
...
               raise ValueError("ndarrays should be double (size=8), int, or long")
         elif isinstance(val, str ) :
              py_bytes = val.encode()
              cval = py_bytes
              self.c_plumed.cmd( ckey, <const char*>cval )
         elif HAS_MPI4PY and isinstance(val, MPI.Comm):
                                               ^
------------------------------------------------------------

plumed.pyx:171:48: cimported module has no attribute 'Comm'

@GiovanniBussi
Copy link
Member

I think you can use a plain import (no cimport) to this aim:

[bussi@giorgione mpi4py]$ python3
Python 3.9.10 (main, Feb  2 2022, 22:47:17) 
[GCC 4.8.5 20150623 (Red Hat 4.8.5-44)] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import mpi4py.MPI as MPI
[giorgione.phys.sissa.it:16911] mca_base_component_repository_open: unable to open mca_oob_ud: libosmcomp.so.3: cannot open shared object file: No such file or directory (ignored)
[giorgione.phys.sissa.it:16907] mca_base_component_repository_open: unable to open mca_oob_ud: libosmcomp.so.3: cannot open shared object file: No such file or directory (ignored)
[giorgione.phys.sissa.it:16907] mca_base_component_repository_open: unable to open mca_btl_openib: libosmcomp.so.3: cannot open shared object file: No such file or directory (ignored)
>>> MPI.Comm
<class 'mpi4py.MPI.Comm'>

The only problem that I've seen is that the resulting MPI.Comm object do not have a visible ob_mpi member.

One thing that I don't like very much is that it looks like just importing mpi4py is slowing down things by initializing MPI. We could thing about options such as only importing that only when reaching a failure in the isinstance test here. In this way, however, at the first mistake, one would have MPI imported for no reason.

Another option could be to have a specific python method such as plumed.enable_mpi().

Finally, we could try to import mpi4py when the cmd string contains an MPI substring (then I would add some check within plumed that we do not add more ...MPI... commands).

In any case, this can be solved if you find a way to pass a pointer to the ob_mpi member without cimporting mpi4py.

@hmcezar
Copy link
Contributor Author

hmcezar commented May 5, 2022

So, I found this issue where the main mpi4py developer gives a suggestion on how to get the address of the communicator.

I've implemented it, and things seem to work:

  • If I don't have mpi4py at compile time, it's fine
  • If I have mpi4py at runtime then I can use setMPIcomm.

I reverted the conditional compilation changes.

About importing mpi4py slowing down things, I might be wrong, but I think it doesn't matter. The first time you import, Python will load it to sys.modules and then it'll stay there. It might add a few milliseconds to the Plumed() object creation, but that'll be negligible comparing with the time spent in the MD run.

@codecov-commenter
Copy link

Codecov Report

Merging #818 (0557f5f) into master (4197025) will increase coverage by 0.78%.
The diff coverage is n/a.

@@            Coverage Diff             @@
##           master     #818      +/-   ##
==========================================
+ Coverage   85.03%   85.82%   +0.78%     
==========================================
  Files         599      599              
  Lines       49223    49268      +45     
==========================================
+ Hits        41857    42282     +425     
+ Misses       7366     6986     -380     
Impacted Files Coverage Δ
src/tools/Tools.h 82.71% <0.00%> (-13.37%) ⬇️
src/core/Atoms.cpp 94.23% <0.00%> (-1.11%) ⬇️
src/maze/Memetic.h 59.78% <0.00%> (-0.55%) ⬇️
src/tools/AtomNumber.h 100.00% <0.00%> (ø)
src/colvar/PathMSDBase.cpp 97.60% <0.00%> (ø)
src/core/ActionAtomistic.h 100.00% <0.00%> (ø)
src/eds/EDS.cpp 92.15% <0.00%> (+<0.01%) ⬆️
src/core/ActionAtomistic.cpp 92.39% <0.00%> (+0.18%) ⬆️
src/cltools/SimpleMD.cpp 94.01% <0.00%> (+0.45%) ⬆️
src/membranefusion/FusionPoreExpansionP.cpp 87.09% <0.00%> (+75.48%) ⬆️
... and 2 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 4197025...0557f5f. Read the comment docs.

@GiovanniBussi
Copy link
Member

GiovanniBussi commented May 5, 2022

Thanks!

Regarding slowdown: the problematic case is a user with mpi4py installed that however does not want to use it. When plumed module is loaded, mpi4py will be loaded as well.

I tried to fix it here: 19cdea3 This is the same thing that we do with MDAnalysis here. Can you double check if things are working for you?

In this branch https://github.com/plumed/plumed2/tree/python-mpi I also added a couple more little change.

If you are ok with this, I would merge it to master. I think it's better not to merge this in v2.7 or v2.8 since it's to be considered a new feature.

@hmcezar
Copy link
Contributor Author

hmcezar commented May 6, 2022

I just checked and everything seems to work in your branch. I cythonized with and without mpi4py and everything works fine in my application.

About merging it to master, in the end, it is up to you guys, but if possible, since I'm using this implementation in HyMD it would be nice to have it in 2.7 and 2.8, so the code would be compatible with more versions of Plumed.
IMHO, it is not a new feature, just a fix to a functionality that was broken in the Python interface.
But again, up to you guys!

Thanks for your help in implementing this!

@GiovanniBussi
Copy link
Member

About merging it to master, in the end, it is up to you guys, but if possible, since I'm using this implementation in HyMD it would be nice to have it in 2.7 and 2.8, so the code would be compatible with more versions of Plumed.
IMHO, it is not a new feature, just a fix to a functionality that was broken in the Python interface.

Notice that you can install the plumed python wrappers from the master branch and then load (setting PLUMED_KERNEL) an older PLUMED version.

Anyway, I will check if this can be backported easily to 2.8. v2.7 is more difficult since the plumed.pyx file was extensively changed.

@GiovanniBussi
Copy link
Member

GiovanniBussi commented May 6, 2022

Thanks for your help in implementing this!

Thanks to you, you actually made most of the job!

@hmcezar
Copy link
Contributor Author

hmcezar commented May 6, 2022

Great, just let me know if you need anything else for merging.

@GiovanniBussi GiovanniBussi merged commit 0557f5f into plumed:master May 6, 2022
@GiovanniBussi
Copy link
Member

@hmcezar FYI I just backported this fix to the v2.8 branch

@hmcezar
Copy link
Contributor Author

hmcezar commented May 9, 2022

Thanks!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants