Add pool functionality #2105

richardliaw · 2018-05-20T00:42:31Z

This utility that multiprocessing provides is quite nice:

    with closing(multiprocessing.Pool()) as pool:
        X = pool.map(fn, [os.urandom(4) for _ in range(num_trials)])
        episode_rewards = np.hstack(X)

We could add an equivalent to ray.experimental.


robertnishihara · 2018-05-20T23:12:19Z

This is specifically about having a pool of "stateless actors", right?

richardliaw · 2018-05-20T23:22:45Z

This was actually posted separately from that discussion, but it would make sense for this to include discussion of an API for a pool of stateless actors.

robertnishihara · 2018-05-20T23:27:46Z

We can make a separate issue for that.

What do you like about the pool functionality? E.g., it looks like the code snippet is equivalent to

    X = ray.get([fn.remote(os.urandom(4)) for _ in range(num_trials)])
    episode_rewards = np.hstack(X)

richardliaw · 2018-05-21T05:16:48Z

Ideally, it would be something like

    items = [os.urandom(4) for _ in range(num_trials)]
    X = ray.map(fn, items)

As a user, I like this a lot more than the snippet above: I know I want to apply one function to a list of items, and I don't need to think about making the function remote, creating futures, getting their results, whether all these function calls will have enough resources, etc.

The reason I think stateless actors and this functionality are closely related is that, depending on the function, I may want map to be either completely stateless (tasks) or sort-of stateless (stateless actors, e.g. a neural network eval).

vakker · 2019-05-09T15:19:46Z

It's an old thread, but I've come across a use case that could also benefit from some sort of pool feature.
E.g. having the following:

    @ray.remote
    def fn(i):
        # some setup
        # process input
        return output

has the problem of running the setup in every function call.
The same thing with actors is better because you can just run the setup part once. E.g.:

    @ray.remote
    class ProcessStuff:
        def __init__(self):
            # some setup

        def fn(self, i):
            # process input
            return output

However, the problem is that you cannot just create a list (or queue) for the items and pass it to a list of actors, but you need to iterate over the actors or chunk the input list. It would be better to have a pool of actors where each actor takes an item and processes it until the queue is empty.

The experimental streaming library might be a way to do this, but it seems to me it's a bit more complex than just a queue and a set of actors.

If there's already a neat way to do this, then let me know. Currently I have to fall back to Python multiprocessing to achieve this.
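To make the requested behaviour concrete, here is a minimal sketch using only the standard library: a fixed set of workers, each with one-time setup, pulling items from a shared queue until it is empty. It is an illustration under stated assumptions, not Ray code. Threads stand in for actors, and `Worker`, `run_pool`, and `setup_calls` are hypothetical names that do not come from Ray or from the snippets above.

```python
import queue
import threading


class Worker:
    """Stand-in for an actor: setup runs once, then many items are processed."""

    def __init__(self):
        # Placeholder for expensive one-time setup (hypothetical).
        self.setup_calls = 1

    def fn(self, item):
        # Placeholder for the real per-item processing.
        return item * 2


def run_pool(workers, items):
    """Let each worker pull items from a shared queue until it is empty."""
    tasks = queue.Queue()
    for index, item in enumerate(items):
        tasks.put((index, item))

    results = {}
    lock = threading.Lock()

    def drain(worker):
        while True:
            try:
                index, item = tasks.get_nowait()
            except queue.Empty:
                return  # no work left for this worker
            output = worker.fn(item)
            with lock:
                results[index] = output

    threads = [threading.Thread(target=drain, args=(w,)) for w in workers]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    # Return results in input order, regardless of completion order.
    return [results[i] for i in range(len(results))]
```

Calling `run_pool([Worker() for _ in range(4)], range(10))` would spread the ten items across four workers without chunking the input by hand. A faster worker simply takes more items, which is the load-balancing property a pool of actors would need to provide.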

ericl · 2019-10-17T19:35:16Z

Try this?

    import ray


    class ActorPool(object):
        def __init__(self, actors):
            self.actors = actors

        def map(self, fn, values):
            values = list(values)
            idle = list(self.actors)
            running = {}  # future -> (input index, actor)
            results = {}  # input index -> result
            for i, v in enumerate(values):
                if not idle:
                    # All actors are busy: wait for one task to finish.
                    [r], _ = ray.wait(list(running), num_returns=1)
                    j, a = running.pop(r)
                    results[j] = ray.get(r)
                    idle.append(a)
                a = idle.pop()
                running[fn(a, v)] = (i, a)
            # Collect the tasks that are still in flight.
            while running:
                [r], _ = ray.wait(list(running), num_returns=1)
                j, _ = running.pop(r)
                results[j] = ray.get(r)
            # Return results in input order.
            return [results[i] for i in range(len(results))]


    if __name__ == "__main__":

        @ray.remote
        class MyActor(object):
            def __init__(self):
                pass

            def f(self, x):
                return x * 2

        ray.init()
        actors = [MyActor.remote() for _ in range(4)]
        pool = ActorPool(actors)
        print(pool.map(
            lambda actor, v: actor.f.remote(v),
            range(10)))

edoakes · 2020-03-05T23:28:11Z

Stale. Please open a new issue if this is still relevant.

edoakes closed this as completed Mar 5, 2020
