
cogent3.util.parallel: is there a flexible replacement, e.g. joblib? #3

Closed

GavinHuttley opened this issue Mar 4, 2019 · 5 comments
@GavinHuttley
Collaborator

Original report by GavinH (Bitbucket: 557058:e40c23e1-e273-4527-a2f8-5de5876e870d).


What we need is something that can go from threads on a single box to using MPI.

Look into joblib and dask.

Do they also support a manager/worker style parallelisation?
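
As a point of comparison, joblib's Parallel/delayed interface covers the single-machine case; a minimal sketch of what it looks like (the square function, the inputs and n_jobs=4 are placeholders for illustration, not anything from cogent3):

from joblib import Parallel, delayed

def square(x):
    return x * x

# fan the calls out over 4 worker processes on a single machine
results = Parallel(n_jobs=4)(delayed(square)(x) for x in range(10))
print(results)

Whether joblib (or dask) can also drive MPI or a manager/worker layout is exactly the open question above.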

Abstract module for parallel computations using mpi4py.futures or concurrent.futures.

def map(f, s, max_workers=None, use_mpi=False):
    # warn the user if max_workers is 1 unless the number of CPUs is 1
    # if MPI is None and use_mpi: raise an exception
    # check that max_workers fits within the number of available processes
    # parallel map using the selected backend

Find out how to test this (a decorator from unittest, or pytest?)
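
To make the sketch above concrete, here is a minimal, hedged implementation of the proposed map(): it uses concurrent.futures.ProcessPoolExecutor by default and switches to mpi4py.futures.MPIPoolExecutor when use_mpi=True. Everything beyond the signature sketched above is illustrative, not an agreed design.

import os
import warnings
from concurrent.futures import ProcessPoolExecutor

try:
    from mpi4py.futures import MPIPoolExecutor
except ImportError:
    MPIPoolExecutor = None

def map(f, s, max_workers=None, use_mpi=False):
    """Apply f to every element of s in parallel; return the results as a list."""
    if max_workers == 1 and (os.cpu_count() or 1) > 1:
        warnings.warn("max_workers=1 gives no parallelism on a multi-core machine")
    if use_mpi:
        if MPIPoolExecutor is None:
            raise RuntimeError("use_mpi=True but mpi4py is not available")
        executor = MPIPoolExecutor(max_workers=max_workers)
    else:
        executor = ProcessPoolExecutor(max_workers=max_workers)
    with executor:
        return list(executor.map(f, s))

For the testing question, pytest's skipif marker, e.g. @pytest.mark.skipif(MPIPoolExecutor is None, reason="mpi4py not installed"), is one way to guard the MPI-specific cases.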

@GavinHuttley
Collaborator Author

Original comment by GavinH (Bitbucket: 557058:e40c23e1-e273-4527-a2f8-5de5876e870d).


Here's a link to the MPI variant of master/children.
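
The linked code itself isn't reproduced here; purely as an illustration of the master/children (manager/worker) pattern with mpi4py point-to-point messaging, a sketch might look like the following (the work function and task list are placeholders):

from mpi4py import MPI

comm = MPI.COMM_WORLD
rank = comm.Get_rank()

def work(x):
    return x * x  # placeholder computation

if rank == 0:
    # manager: hand tasks out, collect results, keep workers busy
    tasks = list(range(20))
    n_workers = comm.Get_size() - 1
    status = MPI.Status()
    results, next_task, active = [], 0, 0
    for dest in range(1, n_workers + 1):
        if next_task < len(tasks):
            comm.send(tasks[next_task], dest=dest)
            next_task += 1
            active += 1
        else:
            comm.send(None, dest=dest)  # nothing to do, tell worker to stop
    while active:
        result = comm.recv(source=MPI.ANY_SOURCE, status=status)
        results.append(result)
        src = status.Get_source()
        if next_task < len(tasks):
            comm.send(tasks[next_task], dest=src)
            next_task += 1
        else:
            comm.send(None, dest=src)  # stop signal
            active -= 1
    print(sorted(results))
else:
    # worker: receive tasks until the manager sends None
    while True:
        task = comm.recv(source=0)
        if task is None:
            break
        comm.send(work(task), dest=0)

This would be launched with something like mpiexec -n 4 python manager_worker.py (file name hypothetical).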

@GavinHuttley
Collaborator Author

Original comment by GavinH (Bitbucket: 557058:e40c23e1-e273-4527-a2f8-5de5876e870d).


It looks like concurrent.futures.ProcessPoolExecutor provides a simplified interface for doing what we want. Additionally, mpi4py has a similar API (?).

Can you work up a very simple example that does the same basic calculation using the two different backends?

@GavinHuttley
Collaborator Author

Original comment by Sheng Han Moses Koh (Bitbucket: 5c6a02d4d3e7b93ea1c22610, GitHub: u6052029).


The attached example is derived from the example in the standard library concurrent.futures documentation.

It is meant to be run using "mpiexec -n 1 --bind-to none python executorPoolExample.py" to show the performance of both the multiprocessing pool and the mpi4py processing pool.

Run sequentially, the example takes roughly 15 seconds.
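
The attachment isn't reproduced in this thread; the following is a hedged reconstruction of what executorPoolExample.py plausibly contained, using the prime-checking workload from the standard-library concurrent.futures documentation and timing it under both executors. The file name and the run command come from the comment above; the structure and other names are assumptions.

# executorPoolExample.py (reconstruction)
import math
import time
from concurrent.futures import ProcessPoolExecutor
from mpi4py.futures import MPIPoolExecutor

# prime-checking example from the concurrent.futures documentation
PRIMES = [
    112272535095293,
    112582705942171,
    112272535095293,
    115280095190773,
    115797848077099,
    1099726899285419,
]

def is_prime(n):
    if n < 2:
        return False
    if n == 2:
        return True
    if n % 2 == 0:
        return False
    sqrt_n = int(math.floor(math.sqrt(n)))
    for i in range(3, sqrt_n + 1, 2):
        if n % i == 0:
            return False
    return True

def timed(label, executor):
    print(label)
    start = time.time()
    with executor:
        for number, prime in zip(PRIMES, executor.map(is_prime, PRIMES)):
            print("%d is prime: %s" % (number, prime))
    print("run time: %f" % (time.time() - start))

if __name__ == "__main__":
    timed("Multiprocessing", ProcessPoolExecutor())
    timed("MPI", MPIPoolExecutor())

As described above, this would be launched with mpiexec -n 1 --bind-to none python executorPoolExample.py.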

@GavinHuttley
Collaborator Author

Original comment by GavinH (Bitbucket: 557058:e40c23e1-e273-4527-a2f8-5de5876e870d).


Below is output from different executions (edited for brevity) for discussion:

 $ python executorPoolExample.py
Multiprocessing
...
run time: 0.641617
MPI
...
run time: 2.415331
(c3dev) [gavin@Eratosthenes.local ~/Desktop/Inbox]
 $ mpiexec -n 1 --bind-to none python executorPoolExample.py
Multiprocessing
...
run time: 0.635012
MPI
...
run time: 1.045748
(c3dev) [gavin@Eratosthenes.local ~/Desktop/Inbox]
 $ mpiexec -n 2 --bind-to none python executorPoolExample.py
Multiprocessing
Multiprocessing
...
run time: 1.229716
MPI
...
run time: 1.234165
MPI
--------------------------------------------------------------------------
All nodes which are allocated for this job are already filled.
--------------------------------------------------------------------------
(c3dev) [gavin@Eratosthenes.local ~/Desktop/Inbox]
 $ 

@GavinHuttley
Collaborator Author

Original comment by Sheng Han Moses Koh (Bitbucket: 5c6a02d4d3e7b93ea1c22610, GitHub: u6052029).


7e4e531
