Description
Describe the bug
This error "IndexError: Invalid key: 0 is out of bounds for size 0" occurs in the middle of evaluation.
I believe this is a corner case: evaluation succeeded on a smaller dataset, but on a longer one it failed partway through.
Ragas version: git+https://github.com/explodinggradients/ragas.git@5f105c08b7579188aea1113334dae6b6a8a15660.
Python version: 3.9
Error trace
Traceback (most recent call last):
File "C:\Users\allenpan\Anaconda\envs\rag_experiment\Lib\site-packages\ragas\evaluation.py", line 176, in evaluate
raise e
File "C:\Users\allenpan\Anaconda\envs\rag_experiment\Lib\site-packages\ragas\evaluation.py", line 159, in evaluate
results = executor.results()
^^^^^^^^^^^^^^^^^^
File "C:\Users\allenpan\Anaconda\envs\rag_experiment\Lib\site-packages\ragas\executor.py", line 118, in results
raise e
File "C:\Users\allenpan\Anaconda\envs\rag_experiment\Lib\site-packages\ragas\executor.py", line 114, in results
r = future.result()
^^^^^^^^^^^^^^^
File "C:\Users\allenpan\Anaconda\envs\rag_experiment\Lib\concurrent\futures\_base.py", line 449, in result
return self.__get_result()
^^^^^^^^^^^^^^^^^^^
File "C:\Users\allenpan\Anaconda\envs\rag_experiment\Lib\concurrent\futures\_base.py", line 401, in __get_result
raise self._exception
File "C:\Users\allenpan\Anaconda\envs\rag_experiment\Lib\concurrent\futures\thread.py", line 58, in run
result = self.fn(*self.args, **self.kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\Users\allenpan\Anaconda\envs\rag_experiment\Lib\site-packages\ragas\executor.py", line 36, in wrapped_callable
return counter, callable(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\Users\allenpan\Anaconda\envs\rag_experiment\Lib\site-packages\ragas\metrics\base.py", line 75, in score
raise e
File "C:\Users\allenpan\Anaconda\envs\rag_experiment\Lib\site-packages\ragas\metrics\base.py", line 71, in score
score = self._score(row=row, callbacks=group_cm)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\Users\allenpan\Anaconda\envs\rag_experiment\Lib\site-packages\ragas\metrics\_answer_relevance.py", line 136, in _score
return self._calculate_score(response, row)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\Users\allenpan\Anaconda\envs\rag_experiment\Lib\site-packages\ragas\metrics\_answer_relevance.py", line 114, in _calculate_score
cosine_sim = self.calculate_similarity(question, gen_questions)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\Users\allenpan\Anaconda\envs\rag_experiment\Lib\site-packages\ragas\metrics\_answer_relevance.py", line 92, in calculate_similarity
norm = np.linalg.norm(gen_question_vec, axis=1) * np.linalg.norm(
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\Users\allenpan\Anaconda\envs\rag_experiment\Lib\site-packages\numpy\linalg\linalg.py", line 2583, in norm
return sqrt(add.reduce(s, axis=axis, keepdims=keepdims))
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
numpy.exceptions.AxisError: axis 1 is out of bounds for array of dimension 1
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "C:\Users\allenpan\repos\autogenProject\rag_evaluation.py", line 95, in <module>
result = evaluate(
^^^^^^^^^
File "C:\Users\allenpan\Anaconda\envs\rag_experiment\Lib\site-packages\ragas\evaluation.py", line 178, in evaluate
result = Result(
^^^^^^^
File "<string>", line 6, in __init__
File "C:\Users\allenpan\Anaconda\envs\rag_experiment\Lib\site-packages\ragas\evaluation.py", line 207, in __post_init__
for cn in self.scores[0].keys():
~~~~~~~~~~~^^^
File "C:\Users\allenpan\Anaconda\envs\rag_experiment\Lib\site-packages\datasets\arrow_dataset.py", line 2800, in __getitem__
return self._getitem(key)
^^^^^^^^^^^^^^^^^^
File "C:\Users\allenpan\Anaconda\envs\rag_experiment\Lib\site-packages\datasets\arrow_dataset.py", line 2784, in _getitem
pa_subtable = query_table(self._data, key, indices=self._indices if self._indices is not None else None)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\Users\allenpan\Anaconda\envs\rag_experiment\Lib\site-packages\datasets\formatting\formatting.py", line 583, in query_table
_check_valid_index_key(key, size)
File "C:\Users\allenpan\Anaconda\envs\rag_experiment\Lib\site-packages\datasets\formatting\formatting.py", line 526, in _check_valid_index_key
raise IndexError(f"Invalid key: {key} is out of bounds for size {size}")
IndexError: Invalid key: 0 is out of bounds for size 0
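One possible reading of the first traceback, offered as an assumption and not a confirmed diagnosis: if no generated questions (and so no embeddings) come back for some row, the array passed to `np.linalg.norm(..., axis=1)` is 1-D with shape `(0,)`, and NumPy raises the `AxisError` shown above. The sketch below reproduces that failure with plain NumPy and shows one way a guard could look. The function names `reproduces_axis_error` and `cosine_similarity_or_empty` are hypothetical and are not part of ragas.

```python
import numpy as np


def reproduces_axis_error(gen_question_vec):
    """Return True if taking row-wise norms fails the way the traceback shows."""
    try:
        np.linalg.norm(np.asarray(gen_question_vec), axis=1)
    except ValueError:
        # NumPy's AxisError subclasses both ValueError and IndexError.
        return True
    return False


def cosine_similarity_or_empty(question_vec, gen_question_vec):
    """Hypothetical guard: return an empty array when there are no embeddings.

    This is an illustration only, not the ragas implementation.
    """
    q = np.asarray(question_vec, dtype=float).reshape(1, -1)
    g = np.asarray(gen_question_vec, dtype=float)
    if g.size == 0:
        # Assumed corner case: nothing was generated for this row.
        return np.empty(0)
    g = np.atleast_2d(g)
    norm = np.linalg.norm(g, axis=1) * np.linalg.norm(q, axis=1)
    return (g @ q.T).reshape(-1) / norm


if __name__ == "__main__":
    print(reproduces_axis_error([]))            # empty input, 1-D array
    print(reproduces_axis_error([[1.0, 0.0]]))  # normal 2-D input
    print(cosine_similarity_or_empty([1.0, 0.0], []))
    print(cosine_similarity_or_empty([1.0, 0.0], [[1.0, 0.0], [0.0, 1.0]]))
```

If this is what happens, the second error would follow from the first: the `Result` object is built from an empty scores dataset, so `self.scores[0]` fails with `IndexError`.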
Expected behavior
Evaluation should complete without error regardless of dataset size.