-
Notifications
You must be signed in to change notification settings - Fork 6.7k
Experimental asyncio support #2015
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Here's a piece of testing code. I shows how this PR works. I will integrate it into the code later. import asyncio
import ray
from ray.plasma.plasma_client import AsyncPlasmaClient
def cvt(s):
return [ray.pyarrow.plasma.ObjectID(t.id()) for t in s]
address_dict = ray.init()
async def test_wait(client: AsyncPlasmaClient):
a = ray.put(2342342)
@ray.remote
def delay():
import time
time.sleep(10)
return 'ready'
b = delay.remote()
print(ray.wait([a, b]))
print(await client.wait(cvt([a, b])))
await asyncio.sleep(5)
print(await client.wait(cvt([a, b])))
print(ray.wait([a, b]))
await asyncio.sleep(10)
print(await client.wait(cvt([a, b])))
print(ray.wait([a, b]))
print(await client.wait(cvt([b])))
print(ray.wait([b]))
async def test_await_get(client):
@ray.remote
def delay():
import time
time.sleep(10)
return 'ready'
k = delay.remote()
k = client.wrap_objectid_with_future(k)
result = await k
print(result)
async def test_client(client: AsyncPlasmaClient):
await client.connect()
print("store_capacity = %d" % client.store_capacity)
await test_wait(client)
await test_await_get(client)
object_store_address = address_dict['object_store_addresses'][0]
client = AsyncPlasmaClient(store_socket_name=object_store_address.name, manager_socket_name=object_store_address.manager_name)
loop = asyncio.get_event_loop()
try:
loop.run_until_complete(test_client(client))
except KeyboardInterrupt:
client.disconnect() |
Test PASSed. |
There are still some problems remaining to be solved.
|
So looking at this code, it looks like a lot of code is just copied over to new classes unmodified? This is because inheritance does not work well? |
python/ray/plasma/format/plasma.fbs
Outdated
@@ -0,0 +1,331 @@ | |||
// Licensed to the Apache Software Foundation (ASF) under one |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This file is not needed, right? (same with python/ray/plasma/format/__init__.py
)
It takes some words to clarify. Currently, Ray has a plasma manager ( However, Ray does not implement a plasma client for itself. Instead, it uses Arrow's client using Arrow's And now, Ray's plasma manager uses an event loop (so it's async) but Arrow's plasma client doesn't, so I have to override parts of codes of Arrow's plasma client. To override it, Arrow's Arrow's |
Thanks a lot for the PR. I'm hoping to try it out this weekend. This seems like really useful functionality to have. It also seems like it belongs more in Arrow than in Ray. E.g., as the plasma client API changes we'll want to make sure this code gets updated/tested. |
How is it is to make an async code into blocking code? Would it be reasonable to have underlying implementation fully and just async, and then just expose blocking API for those who prefer that? |
The challenge with making async the default implementation is that it requires some sort of event loop, and users who just want to use the blocking version may not have an event loop on hand. We can expose non-blocking calls along with a socket for receiving notifications from the store. We have something like this that is used internally by the local scheduler and object manager (both of which have an event loop). |
So, yea. A blocking implementation would get a loop, do a call, wait for call to finish in a loop (block), which would then finish the loop, ending the blocking call, return the value. For example: import asyncio
import aiohttp
async def fetch(url):
async with aiohttp.ClientSession() as session:
async with session.get(url) as response:
return await response.text()
def block_fetch(url):
loop = asyncio.get_event_loop()
return loop.run_until_complete(fetch(url))
print(block_fetch('https://common.tnode.com')) |
I think we have to make choices: Options group A:
Options group B:
Options group C:
Currently I prefer A2, B2, C3 |
For group A, modifying Arrow makes a lot more sense to me, since Plasma is part of Arrow, it makes sense to do all Plasma development there. Otherwise updates to Plasma will break the async client, so it's important to have the client tested in the Arrow CI. For group B, I think it makes sense to do as much as possible in C++. This will make it easier to wrap from other languages like Java. It'd be ok to do something quick in Python using the existing C++ API, but that doesn't feel like a long-term solution. For group C, creating an event loop for every blocking call sounds very heavyweight to me, but I could be completely wrong about that, so it's a question of performance. However, if we want to use an underlying async implementation, then that would require the async implementation to be in C++. |
I do not think that code above creates an event loop every time, but it just reuses "main" one. (There is also
Not sure how we would benchmark this code and compare regular blocking call to non-blocking call. |
I tried: import asyncio
import time
def block_sleep():
time.sleep(1)
async def sleep():
block_sleep()
def loop_sleep():
loop = asyncio.get_event_loop()
return loop.run_until_complete(sleep())
print("start blocking")
results = []
for i in range(100):
before = time.perf_counter()
block_sleep()
after = time.perf_counter()
results.append(after - before)
print("end blocking", sum(results) / len(results))
print("start async")
results = []
for i in range(100):
before = time.perf_counter()
loop_sleep()
after = time.perf_counter()
results.append(after - before)
print("end async", sum(results) / len(results)) Results:
This does not look like big difference? |
Currently, I am implementing a new async plasma client. Here are some ideas: C++ Part of Arrow
Python Part of Arrow
So then we can have asyncio-friendly async ray tasks. |
@suquark the The easiest way to implement an async |
@robertnishihara That's cool. I think it could work but a better idea may be using And about |
Yes, that is the subscribe that I am referring to.
…On Sun, Jun 3, 2018 at 10:36 PM Si-Yuan ***@***.***> wrote:
@robertnishihara <https://github.com/robertnishihara> That's cool. I
think it could work but a better idea may be using ray.wait as a selector
like linux poll so we can implement general and asyncio-friendly async
programming model.
And about subscribe, do you mean
ray.worker.global_worker.plasma_client.subscribe?
—
You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
<#2015 (comment)>, or mute
the thread
<https://github.com/notifications/unsubscribe-auth/AAPOrdqaW20a8Z-RQHkTkEQoLpzmLVrpks5t5Md7gaJpZM4T2yCr>
.
|
Test PASSed. |
Test PASSed. |
Test FAILed. |
Test PASSed. |
Test PASSed. |
Test PASSed. |
Test FAILed. |
Test FAILed. |
Test FAILed. |
Test PASSed. |
Test FAILed. |
Test FAILed. |
python/ray/worker.py
Outdated
if sys.version_info >= (3, 5): | ||
from ray.experimental import async_api | ||
# Initialize | ||
async_api.init() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We shouldn't init unless the user imports async_api right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
But it depends on ray.init(). Because users typically do ray.init() after importing modules, I have to put it here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We already init on as_future(), which already depends on ray.init().
So we can remove these lines?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
But when initializing it, we should ensure the eventloop is not running. I suppose it is not a good idea to let users judge if it is safe to use as_future() to initialize them. However, in most cases, ray.init() is called before the eventloop starts, so initialize it in connect
is safe and it also works for remote functions/actors.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good, but we don't need to init() on worker start right?
@suquark I made some edits to clarify that async_api.init() or to_future() must be called before the event loop starts. Let me know if this works. I also deleted the init block in worker.py. Basically the issue there is that you would be running this code even if the user is not using the async api, which is too risky for an experimental API. |
Test PASSed. |
@ericl That looks to me, thanks. |
Test FAILed. |
Test FAILed. |
Test PASSed. |
Test FAILed. |
Test FAILed. |
Test FAILed. |
Test FAILed. |
Test FAILed. |
Awesome! Thanks @suquark for this. |
What do these changes do?
This is a prototype implementation for #493 which provides awaitable interface for
ray.wait
& ray'sObjectID
As a prototype, these codes are meant to be modified later.
How do these changes work?
AsyncPlasmaClient
is implemeted to override originalpyarrow.plasma.PlasmaClient
.pyarrow.plasma.PlasmaClient
is created bypyarrow.plasma.connect
and is attached toray.worker.global_worker
to handle basic ray functions. It also create an interface for wrapping ray's ObjectID.AsyncPlasmaSocket
is created for async socket messaging with PlasmaStore & PlasmaManager. It is the core of async.pyarrow.plasma.PlasmaClient
does not make use of event loops and only create a single socket connection, it is why original ray does not support much of async functions.AsyncPlasmaSocket
uses asyncio event loop and is capable of creating multiple socket connections with PlasmaManager.plasma.fbs
underformat
directory needs to be compiled with flatbuffer ahead of time.Related issue number
#493
cc @mitar