Empty generation does not raise a ragas.exceptions.ExceptionInRunner #1137

@Gwenn-LR

[x] I have checked the documentation and related resources and couldn't resolve my bug.

Describe the bug
Hi! I'm currently using ragas to test different RAG architectures, with Ollama, HuggingFace and LangChain on top of it. I ran into an issue while writing a unit test around synthetic test set generation: the test set produced by ragas.testset.generator.TestsetGenerator has empty rows.

I think it comes from the specific configuration of these frameworks. After looking through the repository, it seems you've tried to guard against this situation, but the check does not raise any error.

Ragas version: 0.1.11
Python version: 3.10.12

Code to Reproduce
The example.pdf file used here can be found at: https://css4.pub/2015/usenix/example.pdf

from langchain_community.chat_models import ChatOllama
from ragas.testset import TestsetGenerator
from rag_sandbox.embedding.huggingface import HuggingFaceEmbeddings
from langchain_community.document_loaders.pdf import UnstructuredPDFLoader

if __name__ == "__main__":
    llm = ChatOllama(base_url="http://localhost:11434", model="qwen2:7b")

    generator = TestsetGenerator.from_langchain(
        generator_llm=llm,
        critic_llm=llm,
        embeddings=HuggingFaceEmbeddings()
    )

    documents = UnstructuredPDFLoader(
        "./tests/data/pdf/example.pdf",
        mode="elements",
        strategy="hi_res",
        infer_table_structure=True,
        hi_res_model_name="yolox",
        extract_images_in_pdf=True,
        extract_image_block_output_dir="./results/data_extraction/images"
    ).load()

    dataset = generator.generate_with_langchain_docs(
        documents=documents,
        test_size=1
    )

Error trace
No error, but that is the problem.

Expected behavior
Based on your code, I would expect a ragas.exceptions.ExceptionInRunner to be raised in this situation.
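To make the expected behavior concrete, here is a minimal sketch of the kind of check I have in mind, written as standalone user-side code rather than as the actual ragas implementation. `EmptyTestsetError` is a hypothetical stand-in for ragas.exceptions.ExceptionInRunner, the helper names are made up for illustration, and the column names `question` and `ground_truth` are my assumption about what a generated row should contain.

```python
import math


class EmptyTestsetError(RuntimeError):
    """Hypothetical stand-in for ragas.exceptions.ExceptionInRunner."""


def _is_blank(value):
    """True for None, NaN (as pandas uses for missing cells) or blank strings."""
    if value is None:
        return True
    if isinstance(value, float) and math.isnan(value):
        return True
    return isinstance(value, str) and not value.strip()


def find_empty_rows(rows, required=("question", "ground_truth")):
    """Return the indices of rows where any required field is missing or blank.

    The default column names are an assumption, not taken from ragas.
    """
    return [
        index
        for index, row in enumerate(rows)
        if any(_is_blank(row.get(key)) for key in required)
    ]


def ensure_testset_not_empty(rows, required=("question", "ground_truth")):
    """Raise instead of silently returning an empty or partially empty test set."""
    rows = list(rows)
    if not rows:
        raise EmptyTestsetError("generation produced no rows")
    empty = find_empty_rows(rows, required)
    if empty:
        raise EmptyTestsetError(f"generation produced empty rows at indices {empty}")
    return rows
```

In a unit test, I would pass the generated rows through it, for example `ensure_testset_not_empty(dataset.to_pandas().to_dict("records"))`, assuming the returned dataset exposes `to_pandas()`.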

Additional context
I'll open a PR to fix this issue, but I'm not sure whether it will conflict with other parts of the code.

Labels: bug
