Description
[x] I have checked the documentation and related resources and couldn't resolve my bug.
Describe the bug
Hi! I'm currently working with ragas to test different RAG architectures, using Ollama, HuggingFace and the LangChain framework on top of it. I'm facing an issue when trying to implement a unit test around synthetic generation: the test set produced by ragas.testset.generator.TestsetGenerator has empty rows.
I think it comes from how each framework is configured. After looking through the repository, it seems you tried to guard against this situation, but the check does not raise any error.
Ragas version: 0.1.11
Python version: 3.10.12
Code to Reproduce
The example.pdf file used here can be found at: https://css4.pub/2015/usenix/example.pdf
from langchain_community.chat_models import ChatOllama
from langchain_community.document_loaders.pdf import UnstructuredPDFLoader
from ragas.testset import TestsetGenerator

# Local project wrapper around HuggingFace embeddings (not part of ragas)
from rag_sandbox.embedding.huggingface import HuggingFaceEmbeddings

if __name__ == "__main__":
    # The same local Ollama model is used as both generator and critic
    llm = ChatOllama(base_url="http://localhost:11434", model="qwen2:7b")
    generator = TestsetGenerator.from_langchain(
        generator_llm=llm,
        critic_llm=llm,
        embeddings=HuggingFaceEmbeddings(),
    )

    # Load the PDF as individual elements, including tables and images
    documents = UnstructuredPDFLoader(
        "./tests/data/pdf/example.pdf",
        mode="elements",
        strategy="hi_res",
        infer_table_structure=True,
        hi_res_model_name="yolox",
        extract_images_in_pdf=True,
        extract_image_block_output_dir="./results/data_extraction/images",
    ).load()

    # Returns without raising, but the resulting test set has empty rows
    dataset = generator.generate_with_langchain_docs(
        documents=documents,
        test_size=1,
    )
Error trace
No error, but that is the problem.
Expected behavior
According to your code, one can expect to get a ragas.exceptions.ExceptionInRunner in such a situation.
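As a stopgap on the test side, a check along these lines could make the unit test fail loudly instead of passing silently. This is only a sketch under assumptions: is_blank and assert_no_empty_rows are hypothetical helpers, not part of ragas, and the rows are assumed to be available as a list of dicts (for example by converting the generated test set to a pandas DataFrame and calling to_dict("records") on it).

```python
import math
from typing import Any, Iterable, Mapping


def is_blank(value: Any) -> bool:
    """Return True for None, NaN, or whitespace-only strings."""
    if value is None:
        return True
    if isinstance(value, float) and math.isnan(value):
        return True
    if isinstance(value, str) and not value.strip():
        return True
    return False


def assert_no_empty_rows(rows: Iterable[Mapping[str, Any]]) -> int:
    """Hypothetical helper: raise ValueError if the test set has no rows
    or if any row consists only of blank values; return the row count."""
    rows = list(rows)
    if not rows:
        raise ValueError("generated test set contains no rows")
    # A row counts as empty when every one of its values is blank
    empty = [
        i for i, row in enumerate(rows)
        if all(is_blank(v) for v in row.values())
    ]
    if empty:
        raise ValueError(f"generated test set has empty rows at indices {empty}")
    return len(rows)
```

In the unit test, the helper would be called right after generation, so an empty result surfaces as a test failure rather than as a dataset that merely looks valid.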
Additional context
I can open a PR to fix this issue, but I'm not sure whether it would conflict with another part of the code.