Closed
Description
Describe the bug
When running the synthetic data generation example from https://docs.ragas.io/en/latest/getstarted/testset_generation.html#get-started-testset-generation, I get the error:
KeyError: 'file_name'
on the line:
testset = testsetgenerator.generate(documents, test_size=test_size)
Ragas version: 0.0.23.dev37+g041b20c
Python version: 3.11
Code to Reproduce
import os
os.environ["OPENAI_API_KEY"] = "KEY GOES HERE"
from llama_index import download_loader
SemanticScholarReader = download_loader("SemanticScholarReader")
loader = SemanticScholarReader()
# Narrow down the search space
query_space = "large language models"
# Increase the limit to obtain more documents
documents = loader.load_data(query=query_space, limit=10)
from ragas.testset import TestsetGenerator
testsetgenerator = TestsetGenerator.from_default()
test_size = 10
testset = testsetgenerator.generate(documents, test_size=test_size)
Error trace
testset = testsetgenerator.generate(documents, test_size=test_size)
0%| | 0/10 [00:00<?, ?it/s]
Traceback (most recent call last):
  File "C:\Users\DMBO\AppData\Local\Anaconda3\envs\docgpt_ragas_py311\Lib\site-packages\IPython\core\interactiveshell.py", line 3553, in run_code
    exec(code_obj, self.user_global_ns, self.user_ns)
  File "<ipython-input-3-62b08db69f64>", line 1, in <module>
    testset = testsetgenerator.generate(documents, test_size=test_size)
              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\DMBO\AppData\Local\Anaconda3\envs\docgpt_ragas_py311\Lib\site-packages\ragas\testset\testset_generator.py", line 427, in generate
    neighbor_nodes = doc_nodes_map[curr_node.metadata["file_name"]]
                                   ~~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^
KeyError: 'file_name'
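The failing line in the trace looks up metadata["file_name"] on a node, so the documents returned by SemanticScholarReader presumably carry no such key. Below is a minimal sketch of that failure and of one possible workaround: filling in a file_name value before generation. It uses a hypothetical stand-in class (FakeDocument) and a hypothetical helper (ensure_file_name) rather than the real llama_index or Ragas objects, and it assumes that nodes inherit their document's metadata and that a "title" entry may be present. None of this has been verified against Ragas 0.0.23.dev37.

```python
from dataclasses import dataclass, field


@dataclass
class FakeDocument:
    """Hypothetical stand-in for a loaded document; not a llama_index class."""
    text: str
    metadata: dict = field(default_factory=dict)


def ensure_file_name(documents, key="file_name"):
    """Give every document a metadata[key] value if it lacks one.

    Assumption: the generator groups nodes by metadata["file_name"], as the
    traceback line suggests, and nodes inherit their document's metadata.
    """
    for index, doc in enumerate(documents):
        # Fall back to the title (assumed key) or a positional placeholder.
        fallback = doc.metadata.get("title", f"document_{index}")
        # setdefault leaves an existing file_name untouched.
        doc.metadata.setdefault(key, fallback)
    return documents


docs = [
    FakeDocument("first abstract", {"title": "Paper A"}),
    FakeDocument("second abstract"),
]

# Same kind of lookup as the failing line in the trace.
try:
    docs[0].metadata["file_name"]
    missing_key = None
except KeyError as exc:
    missing_key = exc.args[0]

ensure_file_name(docs)
file_names = [doc.metadata["file_name"] for doc in docs]
```

With the real objects, the same loop would presumably run over documents before the call to testsetgenerator.generate(documents, test_size=test_size). Whether that is enough to get past line 427 is untested.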
Expected behavior
I expected the test set to be generated without error.