
Could you code an archipelago to use multithreading when it initializes solutions? #170

Closed
bsugerman opened this issue May 4, 2018 · 6 comments

Comments

@bsugerman

bsugerman commented May 4, 2018

I have noticed that while archi.evolve() uses multithreading to evolve as many islands as possible in parallel, the archipelago class does not seem to take advantage of multithreading when it generates the initial candidate solutions within the given bounds at construction time.

I have defined my own problem CM(x,bounds) which happens to run fairly slowly because it is solving a huge problem. I initialize an archipelago as follows:

 es = pg.algorithm(pg.de1220(gen=20, variant_adptv=2, xtol=0.01))
 archi = pg.archipelago(algo=es, pop_size=50, prob=pg.problem(CM(x, bounds)), n=8)

I notice that only one processor is busy (for about a minute in my case) while this archipelago is being set up. But when I run archi.evolve(), all my processors are busy. It seems to me that each island should initialize its own candidate solutions on its own processor, just as happens when the populations are being evolved.

Is that a feature that could be added?

@bluescarni
Member

This seems like a duplicate of #135. It is a feature request we receive fairly regularly, and we plan to add the capability eventually.

@MikolajMizera

MikolajMizera commented May 9, 2018

@bsugerman here is my solution using ipyparallel:

import ipyparallel as ipp
import pygmo as pg

pop_size = 50
n_islands = 8

def pop_init(pop_size, x, bounds):
    prob_def = CM(x, bounds)
    prob = pg.problem(prob_def)
    return pg.population(prob, pop_size)

if __name__ == '__main__':
    rc = ipp.Client()
    lview = rc.load_balanced_view()
    # build one population per island in parallel on the ipengines
    populations = list(lview.map(pop_init, [pop_size]*n_islands,
                                 [x]*n_islands, [bounds]*n_islands))

    udi = pg.ipyparallel_island()
    es = pg.algorithm(pg.de1220(gen=20, variant_adptv=2, xtol=0.01))
    archi = pg.archipelago()  # start from an empty archipelago
    for pop in populations:
        archi.push_back(algo=es, pop=pop, udi=udi)
    archi.evolve(20)

Just start ipcluster with the desired number of ipengines, and each island will initialize on a different ipengine (a separate process).

@bsugerman
Author

@MikolajMizera, thanks for that option. I have been able to do the same thing with multiprocessing.Pool, and the code needed is even shorter. However, I hope the good folks maintaining pygmo can add this feature into their overall structure, since they already have archipelagos running on multiple processors.
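For reference, the multiprocessing.Pool pattern being described might look roughly like this. This is a minimal sketch, not the poster's actual code: `expensive_pop_init` is a hypothetical stand-in for building a pg.population for the slow CM problem (constructing a population evaluates every initial candidate, which is the expensive part), so the sketch runs with the standard library alone.

```python
from multiprocessing import Pool

def expensive_pop_init(seed):
    # Hypothetical stand-in for the slow step: in the real code this
    # would construct a pg.population, evaluating pop_size candidate
    # solutions of the costly CM problem.
    import random
    rng = random.Random(seed)
    return [rng.uniform(-5.0, 5.0) for _ in range(50)]

if __name__ == "__main__":
    n_islands = 8
    # one worker per island; each initialisation runs in its own process
    with Pool(processes=n_islands) as pool:
        populations = pool.map(expensive_pop_init, range(n_islands))
    assert len(populations) == n_islands
```

The resulting `populations` list can then be pushed onto an empty archipelago one island at a time, exactly as in the ipyparallel version above.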

@bluescarni
Member

The main reason why we haven't done this yet is that there's not much overlap with the existing parallel computing infrastructure we have in pagmo, and it seems like we would end up having to maintain dual codepaths for parallel initialisation and evolution.

(A secondary reason is that before tackling other parallel-related tasks, we first need to finish up the work on migration/topology, which is likely to induce further constraints and possibly internal API changes in the portions of the code dealing with parallel computing)

@Argysh

Argysh commented May 6, 2019

This is how I do it with multiprocessing instead of ipyparallel:

import os
import sys
import multiprocessing as mp

import pygmo as pg

def pop_init(pop_size):
    prob_def = pg.rosenbrock(1000)  # your own or a standard udp
    prob = pg.problem(prob_def)
    return pg.population(prob, pop_size)

if __name__ == "__main__":
    # find the number of threads and use nThreads*usage of them
    def getThreads():
        if sys.platform == 'win32':
            return int(os.environ['NUMBER_OF_PROCESSORS'])
        return int(os.popen('grep -c cores /proc/cpuinfo').read())
    # nIslands, usage, nPop and algorithm are assumed to be defined by you
    nWorker = min(int(nIslands), int(getThreads()*usage))

    # pre-processing
    [. . .]

    # initialise your pops in parallel in an mp pool
    pool = mp.Pool(nWorker)
    populations = pool.map(pop_init, [nPop]*nIslands)

    # add them to new islands in the otherwise empty archipelago
    archipelago = pg.archipelago()
    for pop in populations:
        archipelago.push_back(algo=algorithm, pop=pop, udi=pg.mp_island())
    archipelago.wait()

    # post-processing
    [. . .]

On Linux this will only work with Python > 3.4.

@bluescarni
Member

The batch fitness evaluation framework (which includes parallel initialisation for populations/islands/archipelagos) has now been completed on the Python side with the release of pagmo 2.13. I will close this report.
