Evaluate model sometimes leads to 'Runtime Error: bound to a different event loop' #1983

EdIzaguirre · 2024-07-18T22:47:56Z

I've been enjoying the Weave library quite a bit, but I keep running into an issue with the Evaluation `evaluate` method. Roughly 20% of the time, my evaluation run fails with the RuntimeError shown below, which refers to an object being bound to a different event loop. The other 80% of the time the evaluation completes normally. I don't see where a second event loop is being created.

Part of the complication is that I am using Ragas to score a few metrics for my chatbot, and I don't want to modify the Ragas library to accomplish my goal. I have tried following both of the examples given in the tutorials, but they each do it in a slightly different way. For the record, making evaluate_with_ragas async did not work, nor did replacing invoke with ainvoke and awaiting it in my predict method. I have already limited WEAVE_PARALLELISM to rule out rate-limiting issues. Any help would be appreciated!

Code

evaluate_model.py

import weave
import asyncio
from ragas import evaluate
from ragas.metrics import AnswerRelevancy, ContextRelevancy, Faithfulness
from ragas.metrics.critique import harmfulness
from datasets import Dataset
from FilmSearchModel import FilmSearchModel
import os
from langchain_openai import ChatOpenAI
from langchain_openai import OpenAIEmbeddings
import json

# Set environment variable to limit parallel workers
os.environ['WEAVE_PARALLELISM'] = '3'

with open('./config.json') as f:
    config = json.load(f)


@weave.op()
def evaluate_with_ragas(query, model_output):
    # Put data into a Dataset object
    data = {
        "question": [query],
        "contexts": [[model_output['context']]],
        "answer": [model_output['answer']]
    }
    dataset = Dataset.from_dict(data)

    # Define metrics to judge
    metrics = [
        AnswerRelevancy(),
        ContextRelevancy(),
        Faithfulness(),
        harmfulness
    ]

    judge_model = ChatOpenAI(model=config['JUDGE_MODEL_NAME'])
    embeddings_model = OpenAIEmbeddings(model=config['EMBEDDING_MODEL_NAME'])

    evaluation = evaluate(dataset=dataset, metrics=metrics, llm=judge_model, embeddings=embeddings_model)

    return {
        "answer_relevancy": float(evaluation['answer_relevancy']),
        "context_relevancy": float(evaluation['context_relevancy']),
        "faithfulness": float(evaluation['faithfulness']),
        "harmfulness": float(evaluation['harmfulness'])
    }


def run_evaluation():
    # Initialize FilmSearch model
    model = FilmSearchModel()

    # Define evaluation questions
    questions = [
        {"query": "Recommend some funny zombie movies that are streaming on Hulu."},
        {"query": "Find me drama movies in English that are less than 2 hours long and feature pets."},
        {"query": "I want some fantasy movies featuring dragons that are under 90 minutes long."},
        {"query": "Can you suggest some disney movies suitable for adults?"},
        {"query": "Suggest some romantic comedies available on Netflix."}
    ]

    # Create Weave Evaluation object
    evaluation = weave.Evaluation(dataset=questions, scorers=[evaluate_with_ragas])

    # Run the evaluation
    asyncio.run(evaluation.evaluate(model))


if __name__ == "__main__":
    weave.init('film-search')
    run_evaluation()

Relevant portion of FilmSearchModel.py:

    @weave.op()
    async def predict(self, query: str):
        weave.init('film-search')

        try:
            result = self.rag_chain_with_source.invoke(query)
            return {
                'answer': result['answer'],
                'context': "\n".join([doc.page_content for doc in result['context']])
            }
        except Exception as e:
            return {'answer': f"An error occurred: {e}", 'context': ""}

Error:

Traceback (most recent call last):
  File "/Users/ed/miniconda3/envs/filmsearchprod/lib/python3.12/site-packages/openai/_base_client.py", line 1558, in _request
    response = await self._client.send(
               ^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/ed/miniconda3/envs/filmsearchprod/lib/python3.12/site-packages/httpx/_client.py", line 1661, in send
    response = await self._send_handling_auth(
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/ed/miniconda3/envs/filmsearchprod/lib/python3.12/site-packages/httpx/_client.py", line 1689, in _send_handling_auth
    response = await self._send_handling_redirects(
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/ed/miniconda3/envs/filmsearchprod/lib/python3.12/site-packages/httpx/_client.py", line 1726, in _send_handling_redirects
    response = await self._send_single_request(request)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/ed/miniconda3/envs/filmsearchprod/lib/python3.12/site-packages/httpx/_client.py", line 1763, in _send_single_request
    response = await transport.handle_async_request(request)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/ed/miniconda3/envs/filmsearchprod/lib/python3.12/site-packages/httpx/_transports/default.py", line 373, in handle_async_request
    resp = await self._pool.handle_async_request(req)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/ed/miniconda3/envs/filmsearchprod/lib/python3.12/site-packages/httpcore/_async/connection_pool.py", line 216, in handle_async_request
    raise exc from None
  File "/Users/ed/miniconda3/envs/filmsearchprod/lib/python3.12/site-packages/httpcore/_async/connection_pool.py", line 196, in handle_async_request
    response = await connection.handle_async_request(
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/ed/miniconda3/envs/filmsearchprod/lib/python3.12/site-packages/httpcore/_async/connection.py", line 101, in handle_async_request
    return await self._connection.handle_async_request(request)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/ed/miniconda3/envs/filmsearchprod/lib/python3.12/site-packages/httpcore/_async/http11.py", line 143, in handle_async_request
    raise exc
  File "/Users/ed/miniconda3/envs/filmsearchprod/lib/python3.12/site-packages/httpcore/_async/http11.py", line 113, in handle_async_request
    ) = await self._receive_response_headers(**kwargs)
        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/ed/miniconda3/envs/filmsearchprod/lib/python3.12/site-packages/httpcore/_async/http11.py", line 186, in _receive_response_headers
    event = await self._receive_event(timeout=timeout)
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/ed/miniconda3/envs/filmsearchprod/lib/python3.12/site-packages/httpcore/_async/http11.py", line 224, in _receive_event
    data = await self._network_stream.read(
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/ed/miniconda3/envs/filmsearchprod/lib/python3.12/site-packages/httpcore/_backends/anyio.py", line 35, in read
    return await self._stream.receive(max_bytes=max_bytes)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/ed/miniconda3/envs/filmsearchprod/lib/python3.12/site-packages/anyio/streams/tls.py", line 196, in receive
    data = await self._call_sslobject_method(self._ssl_object.read, max_bytes)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/ed/miniconda3/envs/filmsearchprod/lib/python3.12/site-packages/anyio/streams/tls.py", line 138, in _call_sslobject_method
    data = await self.transport_stream.receive()
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/ed/miniconda3/envs/filmsearchprod/lib/python3.12/site-packages/anyio/_backends/_asyncio.py", line 1203, in receive
    await self._protocol.read_event.wait()
  File "/Users/ed/miniconda3/envs/filmsearchprod/lib/python3.12/asyncio/locks.py", line 209, in wait
    fut = self._get_loop().create_future()
          ^^^^^^^^^^^^^^^^
  File "/Users/ed/miniconda3/envs/filmsearchprod/lib/python3.12/asyncio/mixins.py", line 20, in _get_loop
    raise RuntimeError(f'{self!r} is bound to a different event loop')
RuntimeError: <asyncio.locks.Event object at 0x14cae8650 [unset]> is bound to a different event loop

gtarpenning · 2024-07-18T23:37:47Z

Hey! Thanks for writing in, let's get to the bottom of this. What version of weave are you running? We recently made changes relating to parallelism in the weave library, so upgrading might be worth a shot. Also, you say that everything works 80% of the time; I'm assuming the runtime errors in the other 20% occur non-deterministically?

I am seeing notes internally that we ran into this issue with Ragas around a month ago, but interestingly, similar RAG scoring frameworks like Tonic worked fine. You might also be interested to know that an integration with Ragas is on our roadmap.

Internal ticket for tracking.

EdIzaguirre · 2024-07-19T18:26:52Z

I am using weave==0.50.10, the latest one. Correct, the error occurs non-deterministically. I may have to check out the Tonic library.

gtarpenning · 2024-07-19T20:33:31Z

Perfect, thanks. This issue has been triaged and added to our backlog. A deeper investigation is planned, but because the issue is an interaction with an external library, it can't be prioritized over other internal issues. We will keep you posted on progress. Thanks again for writing this up!
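While the investigation is pending, one way to find where a second event loop comes from is to log the thread and running loop at each call site. The sketch below is a diagnostic aid only, not a fix; `describe_context` and `demo` are hypothetical names, and nothing here relies on Weave or Ragas APIs. Calling the helper at the top of `predict` and of `evaluate_with_ragas` would show whether they run on different threads or loops.

```python
import asyncio
import threading


def describe_context(label: str) -> str:
    """Report the current thread and the identity of the running loop, if any."""
    try:
        loop_id = hex(id(asyncio.get_running_loop()))
    except RuntimeError:
        # get_running_loop() raises when no loop is running in this thread.
        loop_id = "no running loop"
    return f"{label}: thread={threading.current_thread().name} loop={loop_id}"


async def demo() -> list[str]:
    """Compare the context inside a coroutine with that of a worker thread."""
    inside = describe_context("coroutine")
    in_thread = await asyncio.to_thread(describe_context, "worker thread")
    return [inside, in_thread]


if __name__ == "__main__":
    for line in asyncio.run(demo()):
        print(line)
```

If a synchronous function reports "no running loop" and then calls code that starts its own loop, that is one place where a second loop can appear.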
