Use new event loop instead of the current loop for pipeline #1352
Conversation
lmdeploy/serve/async_engine.py (outdated)

```diff
@@ -476,8 +475,8 @@ async def gather():
             ])
         outputs.put(None)

-        proc = Thread(
-            target=lambda: self.loop.run_until_complete(gather()))
+        proc = Thread(target=lambda: asyncio.new_event_loop().
```
Why do you use a thread instead of a coroutine here? `asyncio.get_event_loop().create_task` would create a new coroutine task that runs concurrently with the current coroutine, and `asyncio.Queue` is awaitable.
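A minimal sketch of the suggested pattern, with a stand-in producer instead of lmdeploy's real `gather()` coroutine (all names here are illustrative):

```python
import asyncio

async def produce(outputs: asyncio.Queue) -> None:
    # Stand-in for the real batched-inference coroutine:
    # push a few results, then a None sentinel.
    for i in range(3):
        await outputs.put(i)
    await outputs.put(None)

async def consume() -> list:
    outputs = asyncio.Queue()
    # Schedule the producer on the current loop instead of spawning a Thread.
    task = asyncio.get_event_loop().create_task(produce(outputs))
    items = []
    while True:
        out = await outputs.get()  # asyncio.Queue.get() is awaitable
        if out is None:
            break
        items.append(out)
    await task
    return items

print(asyncio.run(consume()))  # [0, 1, 2]
```

The producer and consumer share one event loop, so no cross-thread synchronization is needed.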
I just want to yield items without async. Do you have demo code?
`Thread` -> `asyncio.get_event_loop().create_task`
`Queue` -> `asyncio.Queue`
How can I return a generator for stream infer?
lmdeploy/lmdeploy/serve/async_engine.py, lines 484 to 488 in 3d355b5:

```python
try:
    out = outputs.get(timeout=0.001)
    if out is None:
        break
    yield out
```
```python
try:
    out = await outputs.get()
    if out is None:
        break
    yield out
```
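That `await` only compiles inside an `async def`, i.e. if the stream method becomes an async generator. A self-contained sketch of that shape (illustrative names, not lmdeploy's actual API):

```python
import asyncio

async def stream_infer(outputs: asyncio.Queue):
    """Async generator draining the queue until the None sentinel."""
    while True:
        out = await outputs.get()
        if out is None:
            break
        yield out

async def main() -> list:
    outputs = asyncio.Queue()
    for tok in ("a", "b", None):  # None marks end of stream
        outputs.put_nowait(tok)
    # Consume the async generator with an async comprehension.
    return [out async for out in stream_infer(outputs)]

print(asyncio.run(main()))  # ['a', 'b']
```

Callers would then iterate with `async for` instead of a plain `for`, which changes the public interface.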
But this is not an async function, so `await outputs.get()` cannot be used there.
lmdeploy/lmdeploy/pytorch/engine/engine.py, line 1103 in 3d355b5:

```python
def __call_async():
```
To avoid the performance drop, I used the original `Thread` and `Queue`.
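The retained Thread-plus-Queue pattern from the diff, reduced to a standalone sketch (`gather()` here is a stand-in for the real inference coroutine):

```python
import asyncio
import queue
from threading import Thread

def sync_stream():
    outputs: queue.Queue = queue.Queue()

    async def gather():
        # Stand-in for the real batched-inference coroutine.
        for i in range(3):
            outputs.put(i)
        outputs.put(None)  # sentinel marks end of stream

    # Run the coroutine on a fresh event loop in a background thread,
    # so the caller's (possibly already running) loop is never touched.
    proc = Thread(
        target=lambda: asyncio.new_event_loop().run_until_complete(gather()))
    proc.start()
    while True:
        out = outputs.get()  # blocking get keeps this a plain generator
        if out is None:
            break
        yield out
    proc.join()

print(list(sync_stream()))  # [0, 1, 2]
```

The caller gets an ordinary synchronous generator, so no existing call sites need to switch to `async for`.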
LGTM
@AllentDan what's the performance of llama-7b now?
It does not influence the api_server performance. As for the
Tested the
After recovering the original implementation, performance remained the same.
@grimoire According to @AllentDan's test result, the coroutine version slows down the performance.
rollback |
For some Jupyter notebook users, the current loop cannot be used directly.
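In a notebook, an event loop is already running in the main thread, so calling `run_until_complete` on that loop raises `RuntimeError`; a fresh loop on a worker thread, as in this PR, sidesteps the problem. A small check for that situation (a sketch, not lmdeploy code):

```python
import asyncio

def loop_is_busy() -> bool:
    """Return True if the current thread already has a running event loop
    (as inside a Jupyter notebook cell), in which case
    loop.run_until_complete() would raise RuntimeError."""
    try:
        asyncio.get_running_loop()
        return True
    except RuntimeError:
        return False

# False in a plain script; True when called from inside a running loop.
print(loop_is_busy())
```

This is why the diff creates the loop with `asyncio.new_event_loop()` inside the worker thread instead of reusing `self.loop`.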