Implementation of Noise sensitivity metrics from RAGChecker #1190

sahusiddharth · 2024-08-12T18:52:18Z

Solves:

[R-293] Noise sensitivity metrics from RAGChecker #1185
Took inspiration from the noise sensitivity metrics in RAGChecker from AWS.
Tested locally; it runs and returns results.

Input

from datasets import Dataset
from ragas.metrics import noise_sensitivity_relevant, noise_sensitivity_irrelevant
from ragas import evaluate

# One sample: a question, its reference answer, the generated answer, and the retrieved chunks.
data_sample = {
    "question": ["What is the Life Insurance Corporation of India (LIC) known for?"],
    "ground_truth": ["The Life Insurance Corporation of India (LIC) is the largest insurance company in India, established in 1956 through the nationalization of the insurance industry. It is known for managing a large portfolio of investments."],
    "answer": ["The Life Insurance Corporation of India (LIC) is the largest insurance company in India, known for its vast portfolio of investments. LIC contributes to the financial stability of the country."],
    "contexts": [["The Life Insurance Corporation of India (LIC) was established in 1956 following the nationalization of the insurance industry in India.",
        "LIC is the largest insurance company in India, with a vast network of policyholders and huge investments.",
        "As the largest institutional investor in India, LIC manages substantial funds, contributing to the financial stability of the country.",
        "The Indian economy is one of the fastest-growing major economies in the world, thanks to sectors like finance, technology, manufacturing, etc."]]
}

dataset = Dataset.from_dict(data_sample)
# Score both variants in a single evaluation run.
metrics = [noise_sensitivity_relevant, noise_sensitivity_irrelevant]
score = evaluate(dataset, metrics=metrics)
score.to_pandas()

shahules786

Thanks for the PR, great work. @sahusiddharth
Two more suggestions

Can you fix the conflict?
Can you also add documentation for noise sensitivity?
TBH, understanding the use and internals of this is a little overwhelming; users will find it hard to adopt if they can't understand it.

src/ragas/metrics/_noise_sensitivity.py

sahusiddharth · 2024-08-13T14:45:26Z

Could you please provide some suggestions on the types of documentation that might be required? Your input would be greatly appreciated.

shahules786 · 2024-08-13T17:19:07Z

Hey @sahusiddharth, I did see this doc written by one of the authors of RAGChecker. I have also emailed him asking for an intuitive explanation of noise sensitivity (let's wait a few hours and see if he replies). Otherwise, it would be nice to follow the format used for the other metrics, as here: https://docs.ragas.io/en/stable/concepts/metrics/context_recall.html

sahusiddharth · 2024-08-13T17:35:22Z

Hi @shahules786,

Got it. Please let me know when you hear back from the author. In the meantime, I’ll check the metrics format you mentioned.

Thanks!

shahules786

Hey @sahusiddharth, the changes you made look good. I also made some corrections on top of that. I think it's best to move forward and do the docs now; if the author responds later, we can get his feedback then. Can you prepare the docs to go with this?

It needs 3 sections:
A brief, intuitive description of the metric
An example
How it's calculated, using the example

sahusiddharth · 2024-08-14T09:28:22Z

Hi @shahules786,

I wanted to clarify something regarding the noise sensitivity implementation. There are two types: one for when relevant context is retrieved and another for when irrelevant context is retrieved. The current implementation only addresses the relevant context.

Would you prefer that I add the handling for irrelevant context first, or should I complete the documentation for the basic implementation before proceeding with the additional functionality?

shahules786 · 2024-08-14T10:13:49Z

@sahusiddharth I did notice that. I thought that using relevant might be more useful, but now I think it would be better if the user could switch between the two using an argument. Can you modify it to include that behavior? Then we can add documentation for both on the same page.

sahusiddharth · 2024-08-14T10:28:36Z

I’m happy to make the necessary modifications. I would appreciate some additional guidance on how best to return the results when asked for both. Since the output could be returned in multiple types of data, such as dictionaries, named tuples, or tuples, I’m considering the most appropriate format.

{'noise_sensitivity_relevant': 0.0, 'noise_sensitivity_irrelevant': 0.0}

I was thinking of returning only the number when a specific one is requested.

Could you please advise on the preferred format for returning these results?

shahules786 · 2024-08-14T10:47:11Z

@sahusiddharth Thanks for asking that. I don't think "both" should be an option; it should be one of 'relevant' or 'irrelevant'. By default, it should stay as 'relevant'.
In upcoming versions, we will introduce caching to avoid the LLM recalculating the same intermediate results, as in this case. Until then, if someone wants both, they have to make two calls.

When you write the doc, make sure to give credit to RAGChecker by citing the work.
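
A minimal sketch of the switch described above, assuming the score is the fraction of answer claims that are incorrect against the ground truth yet supported by a retrieved chunk of the selected kind (one reading of the RAGChecker definition). The function name, the `focus` parameter name, and the hand-labelled verdicts are hypothetical; this is not the code merged in this PR, where the verdicts would come from LLM judgments.

```python
from typing import List, Literal


def noise_sensitivity(
    claim_is_correct: List[bool],
    claim_supported_by_chunk: List[List[bool]],
    chunk_is_relevant: List[bool],
    focus: Literal["relevant", "irrelevant"] = "relevant",
) -> float:
    """Fraction of answer claims that are incorrect but supported by a chunk
    of the requested kind (hypothetical helper, verdicts supplied by caller)."""
    if focus not in ("relevant", "irrelevant"):
        raise ValueError("focus must be 'relevant' or 'irrelevant'")
    if not claim_is_correct:
        return 0.0

    want_relevant = focus == "relevant"
    hits = 0
    for correct, support in zip(claim_is_correct, claim_supported_by_chunk):
        if correct:
            continue  # only incorrect claims count as noise-induced errors
        # The claim counts if any chunk of the requested kind supports it.
        if any(s and (rel == want_relevant)
               for s, rel in zip(support, chunk_is_relevant)):
            hits += 1
    return hits / len(claim_is_correct)


# Hand-labelled toy verdicts: 3 answer claims, 4 retrieved chunks.
claim_is_correct = [True, True, False]
claim_supported_by_chunk = [
    [False, True, False, False],
    [False, True, False, False],
    [False, False, True, False],  # the incorrect claim is backed by chunk 3
]
chunk_is_relevant = [True, True, True, False]

relevant_score = noise_sensitivity(
    claim_is_correct, claim_supported_by_chunk, chunk_is_relevant
)
irrelevant_score = noise_sensitivity(
    claim_is_correct, claim_supported_by_chunk, chunk_is_relevant,
    focus="irrelevant",
)
print(relevant_score, irrelevant_score)
```

Getting both numbers takes two calls, which matches the "one of them, default 'relevant'" decision above.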

sahusiddharth · 2024-08-14T13:05:16Z

@shahules786, I'm almost done with the documentation. To properly show the power of noise sensitivity, the example is getting long. Do you have a problem with that?

shahules786 · 2024-08-14T13:15:19Z

@sahusiddharth We can refine it later, but I also saw some basic examples here

sahusiddharth · 2024-08-14T13:22:37Z

@shahules786, I have gone through them. I didn't like them that much: the answers generated by the LLM rarely use the information provided in the context, and I didn't find them intuitive enough.

shahules786 · 2024-08-14T14:10:26Z

@sahusiddharth I agree, their dataset generation is naive.

sahusiddharth · 2024-08-14T18:10:27Z

@shahules786, I tried returning a tuple when noise sensitivity is requested for both relevant and irrelevant, but make-ci gave me this error:

ragas/src/ragas/metrics/_noise_sensitivity.py:208:15 - error: Method "_ascore" overrides class "Metric" in an incompatible manner
    Return type mismatch: base method returns type "Coroutine[Any, Any, float]", override returns type "Coroutine[Any, Any, Coroutine[Any, Any, float | Tuple[float, float]]]"
      "Coroutine[Any, Any, Coroutine[Any, Any, float | Tuple[float, float]]]" is incompatible with "Coroutine[Any, Any, float]"
        Type parameter "_ReturnT_co_nd@Coroutine" is covariant, but "Coroutine[Any, Any, float | Tuple[float, float]]" is not a subtype of "float"
          "Coroutine[Any, Any, float | Tuple[float, float]]" is incompatible with "float" (reportIncompatibleMethodOverride)
  /Users/nexus/Desktop/ankit/ragas/src/ragas/metrics/_noise_sensitivity.py:244:16 - error: Expression of type "Unknown | tuple[Unknown, Unknown]" is incompatible with return type "Coroutine[Any, Any, float | Tuple[float, float]]"
    Type "Unknown | tuple[Unknown, Unknown]" is incompatible with type "Coroutine[Any, Any, float | Tuple[float, float]]"
      "tuple[Unknown, Unknown]" is incompatible with "Coroutine[Any, Any, float | Tuple[float, float]]" (reportReturnType)

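The type error above can be avoided by keeping the override's return type as a plain `float` and selecting the variant through an argument. The sketch below is an illustration under assumptions, not the merged code: `Metric` here is a hypothetical stand-in for the base class, and `NoiseSensitivitySketch`, `focus`, and `_both_scores` are made-up names. The nested `Coroutine[..., Coroutine[...]]` in the message also suggests (my reading) that the override annotated its return as a `Coroutine` while already being `async def`, which wraps the type a second time.

```python
import asyncio
from typing import Literal, Tuple


class Metric:
    """Hypothetical stand-in for the base class; only the signature matters."""

    async def _ascore(self, row: dict) -> float:
        raise NotImplementedError


class NoiseSensitivitySketch(Metric):
    def __init__(self, focus: Literal["relevant", "irrelevant"] = "relevant") -> None:
        if focus not in ("relevant", "irrelevant"):
            raise ValueError("focus must be 'relevant' or 'irrelevant'")
        self.focus = focus

    async def _both_scores(self, row: dict) -> Tuple[float, float]:
        # Placeholder: a real metric would derive these from LLM verdicts.
        return float(row["relevant"]), float(row["irrelevant"])

    # Annotate the awaited value (float), not Coroutine[...]:
    # `async def` already makes the call return a coroutine.
    async def _ascore(self, row: dict) -> float:
        relevant, irrelevant = await self._both_scores(row)
        return relevant if self.focus == "relevant" else irrelevant


row = {"relevant": 0.25, "irrelevant": 0.5}  # made-up intermediate scores
default_score = asyncio.run(NoiseSensitivitySketch()._ascore(row))
irrelevant_score = asyncio.run(NoiseSensitivitySketch(focus="irrelevant")._ascore(row))
print(default_score, irrelevant_score)
```

Because the override still returns a single `float`, it stays compatible with the base method's `Coroutine[Any, Any, float]` signature.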
shahules786

LGTM, I have made a few changes to polish it up.

HuXiangkun · 2024-09-29T01:53:54Z

Hi @sahusiddharth @shahules786, thanks for your interest in RAGChecker. I'm a coauthor of RAGChecker. This is nice work integrating Noise Sensitivity into Ragas.

Regarding your comments on ground truth answer generation, I want to make some clarifications. We took the short answers and the annotated ground truth passages (the context) as input to generate long-form answers. Then, to ensure the answers are faithful to the context, we used RefChecker to filter out answers that contain hallucinations. So the generated answers always stick to the provided context.

Please refer to Appendix A.2 in RAGChecker paper. Thanks!


jjmachan · 2024-09-30T16:18:28Z

hey @HuXiangkun thanks a lot for clearing that up ❤️
would love to have you in our community and stay in touch if you would be interested 🙂

HuXiangkun · 2024-10-01T01:41:10Z

Hi @jjmachan , I'm glad to stay in touch!

jjmachan · 2024-10-05T18:22:53Z

just sent you a mail 🙂

nibeditaSw · 2024-11-20T11:29:53Z

Hi @sahusiddharth
Which ragas version are you using to evaluate the noise sensitivity metrics?

Basic implementation of Noise sensitivity metrics from RAGChecker

99e4848

dosubot bot added the size:L label (this PR changes 100-499 lines, ignoring generated files) Aug 12, 2024

shahules786 self-requested a review August 13, 2024 04:04

shahules786 reviewed Aug 13, 2024

shahules786 requested changes Aug 13, 2024

Implemented requested changes

dfe069f

sahusiddharth force-pushed the feat/adding_noise_metric branch from a5b2c28 to dfe069f Compare August 13, 2024 15:12

Merge branch 'main' into feat/adding_noise_metric

1a3ad3d

sahusiddharth requested a review from shahules786 August 13, 2024 15:25

shahules786 added 2 commits August 14, 2024 10:06

remove unnecssary declarations

d89ecb4

convert verdicts to bool

e029f08

shahules786 requested changes Aug 14, 2024

sahusiddharth added 2 commits August 14, 2024 23:28

added docs

8128f80

corrected the numbering

574323e

modified content

9731ef7

sahusiddharth changed the title ~~Basic implementation of Noise sensitivity metrics from RAGChecker~~ Implementation of Noise sensitivity metrics from RAGChecker Aug 14, 2024

shahules786 added 2 commits August 15, 2024 10:12

remove changes from experimental

e9426b6

minor fixes

529372f

shahules786 and others added 3 commits August 15, 2024 10:19

reflect changes in docs

2fbd51c

fix imports

a0786f1

Merge branch 'main' into feat/adding_noise_metric

246944a

shahules786 requested a review from jjmachan August 15, 2024 04:56

shahules786 added 2 commits August 15, 2024 10:29

fix typo

8188daa

add noise sensitivity to metrics

1e84692

shahules786 approved these changes Aug 15, 2024

remove all unneccessary linting changes

1acb14b

jjmachan linked an issue Aug 15, 2024 that may be closed by this pull request

[R-293] Noise sensitivity metrics from RAGChecker #1185

Closed

shahules786 merged commit 8da231d into explodinggradients:main Aug 23, 2024

sahusiddharth deleted the feat/adding_noise_metric branch November 6, 2024 13:07
