fix: failsafe for non-valid json and failed LLM calls #7723

Merged
merged 27 commits into from
May 23, 2024

davidsbatista
Contributor

@davidsbatista davidsbatista commented May 22, 2024

Related Issues

Proposed Changes:

The LLM-based evaluators can fail due to:

  • an error when making a call to the LLM
  • the LLM returning output that is not valid JSON

This PR adds safeguards to LLM-based evaluators:

  • If an LLM-based evaluator (e.g., Faithfulness or ContextRelevance) is initialised with raise_on_failure=False and a call to the LLM fails or the LLM outputs invalid JSON, the evaluator returns np.nan for that input and continues the evaluation.
  • The user is notified with a warning that reports the number of requests that failed.
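
The behaviour described above can be illustrated with a minimal, self-contained sketch. This is not the code from this PR or Haystack's API: the names `evaluate_all` and `call_llm` and the `"score"` key are hypothetical, and `math.nan` stands in for `np.nan` to avoid a NumPy dependency.

```python
import json
import math
import warnings
from typing import Callable, List


def evaluate_all(
    prompts: List[str],
    call_llm: Callable[[str], str],  # hypothetical: takes a prompt, returns the raw LLM reply
    raise_on_failure: bool = True,
) -> List[float]:
    """Score each prompt; on failure either re-raise or record NaN and continue."""
    scores: List[float] = []
    failures = 0
    for prompt in prompts:
        try:
            reply = call_llm(prompt)    # may raise (network error, rate limit, ...)
            parsed = json.loads(reply)  # raises json.JSONDecodeError on invalid JSON
            scores.append(float(parsed["score"]))  # "score" key is an assumption
        except Exception:
            if raise_on_failure:
                raise
            failures += 1
            scores.append(math.nan)     # placeholder result for the failed request
    if failures:
        # One summary warning after the loop, reporting how many requests failed.
        warnings.warn(
            f"{failures} out of {len(prompts)} LLM requests failed; their scores are NaN."
        )
    return scores
```

With `raise_on_failure=False`, one bad reply no longer aborts the whole run, but any aggregate computed over the scores has to account for NaN entries.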

@github-actions github-actions bot added topic:tests type:documentation Improvements on the docs labels May 22, 2024
@coveralls
Collaborator

coveralls commented May 22, 2024

Pull Request Test Coverage Report for Build 9210542884

Details

  • 0 of 0 changed or added relevant lines in 0 files are covered.
  • 5 unchanged lines in 1 file lost coverage.
  • Overall coverage decreased (-0.03%) to 90.563%

Files with Coverage Reduction | New Missed Lines | %
components/evaluators/llm_evaluator.py | 5 | 95.41%

Totals (Coverage Status)
Change from base Build 9209846779: -0.03%
Covered Lines: 6641
Relevant Lines: 7333

💛 - Coveralls

@davidsbatista davidsbatista marked this pull request as ready for review May 22, 2024 10:25
@davidsbatista davidsbatista requested a review from a team as a code owner May 22, 2024 10:25
@davidsbatista davidsbatista requested review from julian-risch and shadeMe and removed request for a team May 22, 2024 10:25
@davidsbatista davidsbatista changed the title from "Failsafe for non valid json" to "fix: failsafe for non-valid json and failed LLM calls" May 22, 2024
@davidsbatista davidsbatista requested a review from a team as a code owner May 22, 2024 10:44
@davidsbatista davidsbatista requested review from dfokina and removed request for a team May 22, 2024 10:44
Collaborator

@shadeMe shadeMe left a comment

LGTM. Good to merge after @julian-risch's review.

davidsbatista and others added 3 commits May 23, 2024 09:21
Co-authored-by: Madeesh Kannan <shadeMe@users.noreply.github.com>
Co-authored-by: Madeesh Kannan <shadeMe@users.noreply.github.com>
Member

@julian-risch julian-risch left a comment

Looks quite good to me already. Two small notes and then I'll approve. Please update the docs pages once we merge this PR.
https://docs.haystack.deepset.ai/docs/llmevaluator
https://docs.haystack.deepset.ai/docs/faithfulnessevaluator
https://docs.haystack.deepset.ai/docs/contextrelevanceevaluator

@davidsbatista davidsbatista enabled auto-merge (squash) May 23, 2024 15:31
@davidsbatista davidsbatista merged commit 38747ff into main May 23, 2024
25 checks passed
@davidsbatista davidsbatista deleted the failsafe-for-non-valid-JSON branch May 23, 2024 15:41
Development

Successfully merging this pull request may close these issues.

LLM-based evaluators stoping conditions and valid JSON
4 participants