Describe the bug
Following up on #528 and #550, unfortunately I'm still encountering NaN values sometimes.
Ragas version: v0.1.1
Python version: 3.11
Model: Azure OpenAI endpoint gpt 3.5-turbo-4k
Code to Reproduce
The data itself is not publicly available, but the code is a generic evaluate call:
results = evaluate(
    ragas_ds["eval"],
    metrics=[faithfulness, context_precision, context_recall, answer_similarity],
    llm=azure_model,
    embeddings=azure_embeddings,
)
Error trace
evaluation.py:276: RuntimeWarning: Mean of empty slice
  value = np.nanmean(self.scores[cn])
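For context, the warning itself is straightforward to reproduce in isolation: if every per-row score for a metric is NaN (my assumption being that `self.scores[cn]` holds the per-row scores for metric `cn` and failed rows are stored as NaN), `np.nanmean` both emits "Mean of empty slice" and returns nan. A minimal sketch:

```python
import warnings

import numpy as np

# Stand-in for self.scores[cn] when every row of a metric failed to score
# (assumption about how ragas stores per-row results).
scores = np.array([np.nan, np.nan, np.nan])

with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    value = np.nanmean(scores)  # all values are NaN -> nothing left to average

print(value)  # nan
print(str(caught[0].message))  # Mean of empty slice
```

So the warning is a symptom: by the time `np.nanmean` runs, the per-row scores for that metric are already all NaN.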
Expected behavior
Ideally no NaNs, or at least a more precise indication of where the 'empty slice' was encountered.
Additional context
I've run my evaluation set a couple of times; because the NaNs land in different places each run, I suspect it has to do with the LLM output format during metric scoring.
Of the metrics I'm testing, faithfulness seems to throw NaNs more often than the others, but as mentioned it's not consistent.
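To narrow down which rows fail, I've been filtering the per-row scores for NaNs. A sketch of that check (using a hand-built stand-in DataFrame here, on the assumption that the evaluate result can be turned into a pandas frame with one column per metric, e.g. via `results.to_pandas()`):

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for results.to_pandas(): one row per sample,
# one column per metric, NaN where scoring failed.
df = pd.DataFrame({
    "question": ["q1", "q2", "q3"],
    "faithfulness": [0.9, np.nan, 0.7],
    "context_precision": [1.0, 0.5, np.nan],
})

# Rows where any metric came back NaN -- candidates for malformed LLM output.
metric_cols = ["faithfulness", "context_precision"]
nan_rows = df[df[metric_cols].isna().any(axis=1)]
print(nan_rows["question"].tolist())  # ['q2', 'q3']
```

Running this across repeated evaluations shows different rows failing each time, which is what makes me suspect the LLM output parsing rather than the data.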
I'm happy to answer any questions that could help clarify the issue.
Regards, Koen