
Fix asyncio in AsyncLLM to use the running event loop if any #501

Merged

4 commits merged into core-refactor on Apr 4, 2024

Conversation

@alvarobartt (Member) commented Apr 3, 2024

Description

This PR adds a hot-fix for the AsyncLLM class and its subclasses: the event loop was always created from scratch, but in Colab there is already an active running event loop that should be reused. Otherwise, the AsyncLLM subclasses (or any other asyncio code creating event loops) fail in Colab with the following exception: RuntimeError: Cannot run the event loop while another loop is running.

On top of that, a utility function named in_notebook has been included to identify whether the code is running in a notebook, and if so, apply nest-asyncio so that asyncio can be used successfully.
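
For reference, here is a minimal sketch of the pattern described above. Only in_notebook is named in the PR; the kernel-based detection and the get_event_loop helper are illustrative assumptions, not necessarily the PR's exact code.

import asyncio

def in_notebook() -> bool:
    # Assumed detection logic: a Jupyter/Colab shell exposes a `kernel` attribute.
    try:
        from IPython import get_ipython
    except ImportError:
        return False
    shell = get_ipython()
    return shell is not None and hasattr(shell, "kernel")

def get_event_loop() -> asyncio.AbstractEventLoop:
    # Hypothetical helper illustrating the fix.
    if in_notebook():
        # Notebooks already run an event loop; nest-asyncio allows re-entering it.
        import nest_asyncio
        nest_asyncio.apply()
    try:
        # Reuse the running event loop if there is one (e.g. in Colab)...
        return asyncio.get_running_loop()
    except RuntimeError:
        # ...otherwise create a fresh loop, as in a plain Python script.
        loop = asyncio.new_event_loop()
        asyncio.set_event_loop(loop)
        return loop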

Thanks @burtenshaw for reporting and @gabrielmbmb for the hints!

@alvarobartt alvarobartt added the fix label Apr 3, 2024
@alvarobartt alvarobartt added this to the 1.0.0 milestone Apr 3, 2024
@alvarobartt alvarobartt self-assigned this Apr 3, 2024
@alvarobartt alvarobartt changed the base branch from main to core-refactor April 3, 2024 11:14
@gabrielmbmb (Member) left a comment


LGTM!

@burtenshaw (Contributor) commented Apr 3, 2024

How did you test this? I merged it into the DEITA branch docs/deita-tutorial and tried it in the notebook, but I got the same error.

I would expect this to work:

from distilabel.llm.openai import OpenAILLM
from distilabel.pipeline.local import Pipeline
from distilabel.steps.task.evol_instruct.base import EvolInstruct

pipeline = Pipeline(name="DEITA")

evol_instruction_complexity = EvolInstruct(
    name="evol_instruction_complexity",
    llm=OpenAILLM(model="gpt-3.5-turbo"),
    num_evolutions=5,
    store_evolutions=True,
    generate_answers=True,
    include_original_instruction=True,
    pipeline=pipeline,
)

next(evol_instruction_complexity.process([{"instruction": "How many fish are there in a dozen fish?"}]))

@gabrielmbmb (Member) commented

For me it works, but I needed to use nest_asyncio:

import nest_asyncio
from distilabel.llm.openai import OpenAILLM
from distilabel.pipeline.local import Pipeline
from distilabel.steps.task.evol_instruct.base import EvolInstruct

nest_asyncio.apply()

pipeline = Pipeline(name="DEITA")

evol_instruction_complexity = EvolInstruct(
    name="evol_instruction_complexity",
    llm=OpenAILLM(model="gpt-3.5-turbo"),
    num_evolutions=5,
    store_evolutions=True,
    generate_answers=True,
    include_original_instruction=True,
    pipeline=pipeline,
)

next(evol_instruction_complexity.process([{"instruction": "How many fish are there in a dozen fish?"}]))

Maybe we can add nest_asyncio as a dependency and, in distilabel/__init__.py, check whether we are running in an IPython environment, and if so call nest_asyncio.apply().
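
A rough sketch of what that check could look like (the get_ipython-based detection is an assumption, not a settled design):

# distilabel/__init__.py (hypothetical)
try:
    from IPython import get_ipython

    if get_ipython() is not None:
        # Running under IPython/Jupyter, so patch asyncio to allow nested loops.
        import nest_asyncio
        nest_asyncio.apply()
except ImportError:
    # IPython is not installed, so we cannot be in a notebook; nothing to do.
    pass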

@alvarobartt alvarobartt merged commit 02e2a54 into core-refactor Apr 4, 2024
4 checks passed
@alvarobartt alvarobartt deleted the asyncio-colab-fix branch April 4, 2024 06:40