
Conversation

@jan-janssen commented Feb 16, 2024

@ligerzero-ai Based on your performance analysis I designed a small benchmark, and with it I found the bug that this pull request fixes. The updated benchmark results are posted below: pympipool is now competitive with the ProcessPoolExecutor. Can you test this branch with your own benchmark?

Code

# Benchmark: 
# >>> python benchmark.py static
# Result: 79.38484382629395
# >>> python benchmark.py process
# Result: 33.66195225715637
# >>> python benchmark.py thread
# Result: 52.65341782569885
# >>> mpiexec -n 4 python -m mpi4py.futures benchmark.py mpi4py
# Result: 39.54463863372803
# >>> python benchmark.py pympipool
# Result: 34.79054880142212
# >>> python benchmark.py flux
# Result: 36.002100467681885

import sys
from time import time


def llh_numpy(mean, sigma):
    # Evaluate the Gaussian log-likelihood of 10**8 standard-normal samples;
    # numpy is imported inside the function so each worker loads it itself.
    import numpy
    data = numpy.random.normal(size=100000000).astype('float64')
    s = (data - mean) ** 2 / (2 * (sigma ** 2))
    pdfs = numpy.exp(-s)
    pdfs /= numpy.sqrt(2 * numpy.pi) * sigma
    return numpy.log(pdfs).sum()


def run_with_executor(executor=None, mean=0.1, sigma=1.1, runs=32, **kwargs):
    # Submit `runs` independent evaluations to the given executor class and
    # block until all futures have completed.
    with executor(**kwargs) as exe:
        future_lst = [exe.submit(llh_numpy, mean=mean, sigma=sigma) for _ in range(runs)]
        return [f.result() for f in future_lst]


def run_static(mean=0.1, sigma=1.1, runs=32):
    # Serial reference: run the same workload sequentially in a plain loop.
    return [llh_numpy(mean=mean, sigma=sigma) for _ in range(runs)]


if __name__ == "__main__":
    run_mode = sys.argv[1]
    start_time = time()
    if run_mode == "static":
        run_static(mean=0.1, sigma=1.1, runs=32)
    elif run_mode == "process":
        from concurrent.futures import ProcessPoolExecutor
        run_with_executor(executor=ProcessPoolExecutor, mean=0.1, sigma=1.1, runs=32, max_workers=4)
    elif run_mode == "thread":
        from concurrent.futures import ThreadPoolExecutor
        run_with_executor(executor=ThreadPoolExecutor, mean=0.1, sigma=1.1, runs=32, max_workers=4)
    elif run_mode == "pympipool":
        from pympipool.mpi.executor import PyMPIExecutor
        run_with_executor(executor=PyMPIExecutor, mean=0.1, sigma=1.1, runs=32, max_workers=4)
    elif run_mode == "flux":
        from pympipool.flux.executor import PyFluxExecutor
        run_with_executor(executor=PyFluxExecutor, mean=0.1, sigma=1.1, runs=32, max_workers=4)
    elif run_mode == "mpi4py":
        from mpi4py.futures import MPIPoolExecutor
        run_with_executor(executor=MPIPoolExecutor, mean=0.1, sigma=1.1, runs=32, max_workers=4)
    else:
        raise ValueError(run_mode)
    stop_time = time()
    print("Result:", stop_time-start_time)

Results

Command                                                      Runtime on cmpc06 [s]
python benchmark.py static                                   79.38484382629395
python benchmark.py process                                  33.66195225715637
python benchmark.py thread                                   52.65341782569885
mpiexec -n 4 python -m mpi4py.futures benchmark.py mpi4py    39.54463863372803
python benchmark.py pympipool                                34.79054880142212
flux start; python benchmark.py flux                         36.002100467681885

With four workers this corresponds to speedups over the serial run of roughly 2.36x (process), 2.28x (pympipool), 2.21x (flux), 2.01x (mpi4py), and 1.51x (thread).

jan-janssen marked this pull request as draft February 16, 2024 21:26
@jan-janssen

As an internal reference, the benchmark from @ligerzero-ai: https://github.com/orgs/pyiron/discussions/211#discussioncomment-8034046

jan-janssen marked this pull request as ready for review February 17, 2024 10:11
@jan-janssen

@XzzX just for your awareness: I am currently benchmarking the Executor we use inside pyiron for parallelisation. Since you have considerable experience optimising the performance of simulation codes, any suggestions are highly appreciated.

@jan-janssen

The benchmark is now included in the continuous integration setup.
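
The exact CI wiring is not shown in this thread; as a rough sketch, a scaled-down pytest-style check along these lines could exercise the same submission pattern quickly (all names here are hypothetical, and ProcessPoolExecutor stands in for the executors benchmarked above):

# test_benchmark.py - hypothetical sketch, not the actual CI code
from concurrent.futures import ProcessPoolExecutor
from time import time

import numpy


def llh_numpy(mean, sigma, size=1000000):
    # Same kernel as the benchmark, but with far fewer samples so it fits CI.
    data = numpy.random.normal(size=size).astype("float64")
    s = (data - mean) ** 2 / (2 * (sigma ** 2))
    pdfs = numpy.exp(-s) / (numpy.sqrt(2 * numpy.pi) * sigma)
    return numpy.log(pdfs).sum()


def test_executor_smoke():
    runs, workers = 8, 4
    start = time()
    with ProcessPoolExecutor(max_workers=workers) as exe:
        futures = [exe.submit(llh_numpy, mean=0.1, sigma=1.1) for _ in range(runs)]
        results = [f.result() for f in futures]
    runtime = time() - start
    # Each result sums logs of pdf values below one, so it is negative.
    assert all(numpy.isfinite(r) and r < 0 for r in results)
    # Loose sanity bound so a hung worker fails the test quickly.
    assert runtime < 120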

@jan-janssen

OpenMPI vs. MPICH

Mode         OpenMPI [s]            MPICH [s]
static       119.52881598472595     119.12273931503296
process      50.263325691223145     50.84898900985718
thread       70.6196928024292       70.48506259918213
mpi4py       55.50081467628479      56.26586318016052
pympipool    53.83580446243286      54.29321050643921

jan-janssen merged commit d9edd2d into main Feb 18, 2024
jan-janssen deleted the mpi_only_for_parallel branch February 18, 2024 12:47
@ligerzero-ai

Sure, I'll get it done this week (hopefully).
