
Fix asyncio in AsyncLLM to use the running event loop if any #501

Merged

4 commits merged into core-refactor on Apr 4, 2024

Conversation

@alvarobartt (Member) commented Apr 3, 2024

Description

This PR adds a hot-fix for the AsyncLLM class and its subclasses: the event loop was always created from scratch, but in Colab there is already an active running event loop that should be reused. Otherwise, the AsyncLLM subclasses (or any other asyncio code creating event loops) fail in Colab with the following exception: RuntimeError: Cannot run the event loop while another loop is running.

On top of that, a utility function named in_notebook has been included to identify whether the code is running in a notebook, and if so, apply nest-asyncio so that asyncio can be used successfully.
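
For reference, here is a minimal sketch of the pattern described above. Only in_notebook is named in the PR; the kernel-based detection and the get_event_loop helper are illustrative assumptions, not necessarily the PR's exact code.

import asyncio

def in_notebook() -> bool:
    # Assumed detection logic: a Jupyter/Colab shell exposes a `kernel` attribute.
    try:
        from IPython import get_ipython
    except ImportError:
        return False
    shell = get_ipython()
    return shell is not None and hasattr(shell, "kernel")

def get_event_loop() -> asyncio.AbstractEventLoop:
    # Hypothetical helper illustrating the fix.
    if in_notebook():
        # Notebooks already run an event loop; nest-asyncio allows re-entering it.
        import nest_asyncio
        nest_asyncio.apply()
    try:
        # Reuse the running event loop if there is one (e.g. in Colab)...
        return asyncio.get_running_loop()
    except RuntimeError:
        # ...otherwise create a fresh loop, as in a plain Python script.
        loop = asyncio.new_event_loop()
        asyncio.set_event_loop(loop)
        return loop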

Thanks @burtenshaw for reporting and @gabrielmbmb for the hints!

@alvarobartt alvarobartt added the fix label Apr 3, 2024
@alvarobartt alvarobartt added this to the 1.0.0 milestone Apr 3, 2024
@alvarobartt alvarobartt self-assigned this Apr 3, 2024
@alvarobartt alvarobartt changed the base branch from main to core-refactor April 3, 2024 11:14
@gabrielmbmb (Member) left a comment


LGTM!

@burtenshaw (Contributor) commented Apr 3, 2024

How did you test this? I merged it into the DEITA branch docs/deita-tutorial and tried it in the notebook, but I got the same error.

I would expect this to work:

from distilabel.llm.openai import OpenAILLM
from distilabel.pipeline.local import Pipeline
from distilabel.steps.task.evol_instruct.base import EvolInstruct

pipeline = Pipeline(name="DEITA")

evol_instruction_complexity = EvolInstruct(
    name="evol_instruction_complexity",
    llm=OpenAILLM(model="gpt-3.5-turbo"),
    num_evolutions=5,
    store_evolutions=True,
    generate_answers=True,
    include_original_instruction=True,
    pipeline=pipeline,
)

next(evol_instruction_complexity.process([{"instruction": "How many fish are there in a dozen fish?"}]))

@gabrielmbmb (Member) commented

For me it works, but I needed to use nest_asyncio:

import nest_asyncio
from distilabel.llm.openai import OpenAILLM
from distilabel.pipeline.local import Pipeline
from distilabel.steps.task.evol_instruct.base import EvolInstruct

nest_asyncio.apply()

pipeline = Pipeline(name="DEITA")

evol_instruction_complexity = EvolInstruct(
    name="evol_instruction_complexity",
    llm=OpenAILLM(model="gpt-3.5-turbo"),
    num_evolutions=5,
    store_evolutions=True,
    generate_answers=True,
    include_original_instruction=True,
    pipeline=pipeline,
)

next(evol_instruction_complexity.process([{"instruction": "How many fish are there in a dozen fish?"}]))

Maybe we can add nest_asyncio as a dependency and, in distilabel/__init__.py, check whether we are running in an IPython environment, and if so call nest_asyncio.apply().
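
A rough sketch of what that check could look like (the get_ipython-based detection is an assumption, not a settled design):

# distilabel/__init__.py (hypothetical)
try:
    from IPython import get_ipython

    if get_ipython() is not None:
        # Running under IPython/Jupyter, so patch asyncio to allow nested loops.
        import nest_asyncio
        nest_asyncio.apply()
except ImportError:
    # IPython is not installed, so we cannot be in a notebook; nothing to do.
    pass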

@alvarobartt alvarobartt merged commit 02e2a54 into core-refactor Apr 4, 2024
4 checks passed
@alvarobartt alvarobartt deleted the asyncio-colab-fix branch April 4, 2024 06:40