New executor throws RuntimeError: ... got Future <..> attached to a different loop
#681
I can consistently reproduce this with the calling code shown below. Error trace:

```
Exception in thread Thread-83:
Traceback (most recent call last):
File "/Users/rbellamy/.pyenv/versions/3.11.6/lib/python3.11/threading.py", line 1045, in _bootstrap_inner
self.run()
File "/Users/rbellamy/test/.venv/lib/python3.11/site-packages/ragas/executor.py", line 93, in run
results = self.loop.run_until_complete(self._aresults())
...
<SNIP/>
...
RuntimeError: Task <Task pending name='Task-178' coro=<as_completed.<locals>.sema_coro()
running at /Users/rbellamy/test/.venv/lib/python3.11/site-packages/ragas/executor.py:36>
cb=[as_completed.<locals>._on_completion() at /Users/rbellamy/.pyenv/versions/3.11.6/lib/python3.11/asyncio/tasks.py:602]>
got Future <Task pending name='Task-227'
coro=<UnaryUnaryCall._invoke() running at /Users/rbellamy/test/.venv/lib/python3.11/site-packages/grpc/aio/_call.py:568>> attached to a different loop
```

Calling code:

```python
# Presumed imports (not shown in the original snippet):
from datasets import Dataset, Features, Sequence, Value
from ragas import evaluate
from ragas.llms import LangchainLLMWrapper
from ragas.metrics import (
    answer_relevancy,
    answer_similarity,
    context_precision,
    context_recall,
    context_relevancy,
    faithfulness,
)


def get_eval_chain_results(question, answer, ground_truth_answer, llm, embeddings, splitter):
    # testset = get_testset_data(answer["source_documents"], llm, embeddings, splitter)
    features = Features(
        {
            "question": Value(dtype="string", id=None),
            "answer": Value(dtype="string", id=None),
            "contexts": Sequence(feature=Value(dtype="string", id=None)),
        }
    )
    mapping = {
        "question": [question],
        "answer": [answer["answer"]],
        "contexts": [[source.page_content for source in answer["source_documents"]]],
    }
    metrics = [
        answer_relevancy,
        context_relevancy,
        faithfulness,
    ]
    if ground_truth_answer is not None:
        features["ground_truths"] = Sequence(feature=Value(dtype="string", id=None))
        mapping["ground_truths"] = [[ground_truth_answer]]
        metrics.extend([answer_similarity, context_precision, context_recall])
    try:
        results = evaluate(
            dataset=Dataset.from_dict(
                mapping=mapping,
                features=features,
            ),
            metrics=metrics,
            llm=LangchainLLMWrapper(llm),
            embeddings=embeddings,
            is_async=False,
        )
    except Exception as e:
        results = {
            "answer_relevancy": float("nan"),
            "context_relevancy": float("nan"),
            "faithfulness": float("nan"),
            "answer_similarity": float("nan"),
            "context_precision": float("nan"),
            "context_recall": float("nan"),
        }
        print(f"Exception occurred while evaluating question: {question}")
        print(f"answer: {answer['answer']}")
        print(f"Error: {e}")
    return results
```

RC1 didn't have this issue.
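For context, here is a minimal, self-contained sketch (illustrative only, not ragas code) of how this kind of RuntimeError arises: an asyncio Future created on one event loop is awaited by a Task running on a different loop, which is essentially what happens when the executor thread runs its own loop while the grpc call's Future is bound to another.

```python
import asyncio

loop_a = asyncio.new_event_loop()
loop_b = asyncio.new_event_loop()

# A Future bound to loop_a, standing in for the grpc call's Future.
fut = loop_a.create_future()


async def await_foreign():
    # This coroutine runs as a Task on loop_b but awaits loop_a's Future.
    await fut


try:
    loop_b.run_until_complete(await_foreign())
except RuntimeError as exc:
    # RuntimeError: Task <...> got Future <...> attached to a different loop
    print(exc)
finally:
    loop_a.close()
    loop_b.close()
```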
#689 seems to fix this issue, since I can no longer reproduce the error, but I don't know from a theoretical and technical point of view whether it is really the correct fix. I haven't had time to verify exactly.
@rbellamy could that have anything to do with jupyter/jupyter_console#241?
I do not. I think it's more likely something like the issue outlined in using-autoawait-in-a-notebook-ipykernel and ipython/ipython#11338. I've followed those directions and things are working with:

```python
import nest_asyncio

nest_asyncio.apply()
```
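For anyone hitting this in a notebook, here is a minimal sketch of what `nest_asyncio` changes: it patches the loop so that `run_until_complete()` can be called re-entrantly while that loop is already running, which is exactly the situation inside Jupyter. (This is a generic illustration, not ragas code.)

```python
import asyncio

import nest_asyncio

nest_asyncio.apply()  # patch the current event loop to allow re-entrant use


async def inner():
    return 42


async def outer():
    # Without nest_asyncio this raises "This event loop is already running",
    # because we call run_until_complete() on a loop that is mid-run.
    return asyncio.get_event_loop().run_until_complete(inner())


loop = asyncio.get_event_loop()
print(loop.run_until_complete(outer()))  # 42
```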
I've tried a bunch of different methods of initiating the loop in my copy of the executor.
…loop (#979)

In issue #963 I commented about the RuntimeError when trying to use TestsetGenerator. In issue #681 you discussed that using get_event_loop() was the optimal way to solve the issue, but it had a DeprecationWarning.

[screenshot of the DeprecationWarning omitted]

In newer Python versions get_event_loop() raises a RuntimeError when no event loop is active. Thus we can avoid creating unnecessary new loops, which seem to be the problem mentioned in both issues. I have modified the code as little as possible to make my intent clear, although in this setup the runner could be avoided, as was proposed in #689.

Setup:
Ragas version: 0.1.7
Python version: 3.10.13
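To make the idea concrete, here is one common pattern for this (a hypothetical helper, not necessarily the exact code in the PR): prefer the already-running loop via `get_running_loop()` and only create a new one when nothing is running.

```python
import asyncio


def get_or_create_event_loop() -> asyncio.AbstractEventLoop:
    """Hypothetical helper illustrating the pattern discussed above."""
    try:
        # Reuse the loop already running in this thread (e.g. inside Jupyter).
        return asyncio.get_running_loop()
    except RuntimeError:
        # No running loop (plain script): create one and make it current,
        # instead of relying on the deprecated get_event_loop() fallback.
        loop = asyncio.new_event_loop()
        asyncio.set_event_loop(loop)
        return loop
```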
Hey @joy13975, I took inspiration from your code and removed the runner as you suggested in the above PR, so I'm closing this one for now. Thanks a lot for your input ❤️. If you get some time to look at that PR, I would really appreciate it too!
ragas `0.1.10` broke how the executor was functioning and there were a few hard-to-reproduce bugs overall. This fixes all of those:
- uses `nest_asyncio` in Jupyter notebooks
- added tests in a Jupyter notebook with `nbmake`
- removed the Runner; this means all the `Tasks` will run in the same event loop as main. For Jupyter notebooks we will use the `nest_asyncio` lib.

Takes inspiration from #689
Fixes: #681
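A rough sketch of that approach (assumed names, not the actual ragas executor code): run everything on the caller's loop, applying `nest_asyncio` only when a loop is already running, as in a notebook.

```python
import asyncio


def run_async(coros):
    # Hypothetical sketch of "no separate Runner thread": all tasks run on the
    # caller's event loop instead of a loop owned by a worker thread.
    async def _gather():
        return await asyncio.gather(*coros)

    try:
        loop = asyncio.get_running_loop()
    except RuntimeError:
        # Plain script: nothing is running, so let asyncio manage a fresh loop.
        return asyncio.run(_gather())

    # Notebook: a loop is already running in this thread; make it re-entrant
    # so run_until_complete() can be called from synchronous code.
    import nest_asyncio

    nest_asyncio.apply(loop)
    return loop.run_until_complete(_gather())
```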
Describe the bug
Calling `TestsetGenerator.generate_with_langchain_docs()` throws the following error, and subsequent attempts to embed (using the same LangChain cache) get stuck and never proceed past 0%. I suspect cache DB (sqlite3) corruption, but the integrity check reported no problem. This requires deeper investigation, but I'm raising it before I forget/lose the details.
Ragas version: 0.1.3
Python version: 3.9.13
Code to Reproduce
Error trace
Expected behavior
No error should be thrown, and subsequent embedding should not get stuck.
Additional context
After getting this error, my LangChain sqlite3 cache always gets corrupted, i.e. the next cached embedding access never completes. However, the DB integrity check reports OK.
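For reference, the SQLite integrity check can be run directly against the cache file; the path below assumes LangChain's default `SQLiteCache` location (`.langchain.db`), so adjust it if yours differs.

```python
import sqlite3

# Assumes the default LangChain SQLiteCache path; change if you configured another.
conn = sqlite3.connect(".langchain.db")
# Prints ('ok',) when SQLite considers the file structurally sound.
print(conn.execute("PRAGMA integrity_check;").fetchone())
conn.close()
```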