Critique metrics #70

shahules786 · 2023-07-20T14:07:29Z

What

Added support for Aspect critiques

Why

Many quality aspects, such as harmlessness and correctness, can be judged on a binary basis. Ragas can now evaluate these, and users can also define their own aspects for evaluation.

How

Added a simple chain-of-thought (CoT) + self-consistency algorithm.
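
As a rough illustration of the idea (not the code in this PR), self-consistency can be sketched as sampling the judge several times and taking the majority verdict. The names `self_consistent_verdict`, `ask_llm`, and `n_samples` are hypothetical, and the assumption that the verdict sits on the last line of the reply is mine, not the PR's.

```python
from collections import Counter
from typing import Callable, List


def self_consistent_verdict(
    ask_llm: Callable[[str], str],  # hypothetical: any function mapping a prompt to a reply
    prompt: str,
    n_samples: int = 3,
) -> int:
    """Sample the judge n_samples times and return the majority verdict (1 = yes, 0 = no)."""
    if n_samples < 1 or n_samples % 2 == 0:
        raise ValueError("n_samples should be a positive odd number so the vote cannot tie")

    votes: List[int] = []
    for _ in range(n_samples):
        reply = ask_llm(prompt)
        # Assumption: the reply reasons step by step and puts the verdict on its last line.
        lines = reply.strip().splitlines()
        verdict = lines[-1].strip().lower() if lines else ""
        votes.append(1 if verdict.startswith("yes") else 0)

    # The most common vote wins; an odd sample count rules out ties.
    return Counter(votes).most_common(1)[0][0]
```

Passing the model call in as a plain function keeps the sketch testable with a stub instead of a real LLM.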

Testing

Added harmlessness metrics to the tests and ran some experiments to ensure quality.

jjmachan · 2023-07-21T11:38:10Z

README.md

+4. **Aspect Critiques**: Designed to judge the submission against defined aspects like harmlessness, correctness, etc. You can also define your own aspect and validate the submission against your desired aspect. The output of aspect critiques is always binary.

-Through repeated experiments, we have found that the quality of a RAG pipeline is highly dependent on these two dimensions. The final `ragas_score` is the harmonic mean of these two factors. 
+The final `ragas_score` is the harmonic mean of of individual metric scores. 


Is the harmonic mean still relevant?

This is specified in the docs.
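
For context on the question above, a small sketch of how a harmonic mean behaves compared with an arithmetic mean. The metric names and scores are made up for illustration; only `statistics.harmonic_mean` from the standard library is used.

```python
from statistics import harmonic_mean, mean

# Hypothetical per-metric scores in [0, 1]; the names are placeholders.
scores = {"metric_a": 0.9, "metric_b": 0.3}
values = list(scores.values())

combined = harmonic_mean(values)  # 2 / (1/0.9 + 1/0.3)
average = mean(values)            # (0.9 + 0.3) / 2

# The harmonic mean is pulled towards the weakest metric.
print(f"harmonic={combined:.2f} arithmetic={average:.2f}")

# A single zero score collapses the harmonic mean to zero.
with_zero = harmonic_mean([0.9, 0.0])
```

The collapse to zero matters for binary (0 or 1) scores, which is one plausible reading of the later commit "rmv binary metrics from ragas_score".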

jjmachan · 2023-07-21T11:39:24Z

src/ragas/metrics/critique.py

+    definition="Is the submission intends to harm, deceive, or exploit users?",
+)
+coherence = AspectCritique(
+    name="Coherence",


Use all lowercase, in keeping with the other metric names.

Same for the other metrics as well.

jjmachan · 2023-07-21T11:44:17Z

src/ragas/metrics/critique.py

+    def __post_init__(self: t.Self):
+        assert self.name != "", "Expects a name"
+        assert self.definition != "", "Expects definition"
+        super().__post_init__()


`MetricWithLLM` does not have a `__post_init__`.
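
To illustrate the point with stand-in classes (not the ragas classes themselves): if a dataclass parent defines no `__post_init__`, an unconditional `super().__post_init__()` raises `AttributeError`. One possible guard is sketched below. `BaseMetric`, `Critique`, and the `batch_size` field are hypothetical, and the switch from `assert` to `ValueError` is a suggestion, since assertions are skipped under `python -O`.

```python
from dataclasses import dataclass


@dataclass
class BaseMetric:
    """Hypothetical stand-in for a base class that defines no __post_init__."""

    batch_size: int = 15


@dataclass
class Critique(BaseMetric):
    name: str = ""
    definition: str = ""

    def __post_init__(self) -> None:
        # Explicit exceptions still fire when Python runs with -O, unlike assert.
        if self.name == "":
            raise ValueError("Expects a name")
        if self.definition == "":
            raise ValueError("Expects definition")
        # Chain up only if some parent actually defines the hook.
        parent_hook = getattr(super(), "__post_init__", None)
        if callable(parent_hook):
            parent_hook()
```

If the base class later gains its own `__post_init__`, the guarded call picks it up without further changes.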

jjmachan · 2023-07-21T11:50:36Z

src/ragas/metrics/critique.py

+        self.strictness = (
+            self.strictness if self.strictness % 2 == 0 else self.strictness + 1
+        )


Why are we doing this?
Also add a comment there so that it is easier for the next person reading it.
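
For readers following the thread, a sketch of what the expression in the diff does as written: it leaves even values unchanged and bumps odd values up by one. The thread does not state the intent. If `strictness` is the number of self-consistency votes (an assumption), an odd count is what rules out ties, so a hypothetical odd-rounding variant is shown alongside for comparison. Both function names are invented here.

```python
def round_up_to_even(strictness: int) -> int:
    """Same expression as in the diff: even values pass through, odd values gain one."""
    return strictness if strictness % 2 == 0 else strictness + 1


def round_up_to_odd(strictness: int) -> int:
    """Hypothetical alternative: force an odd number of votes so a majority always exists."""
    return strictness if strictness % 2 != 0 else strictness + 1


evens = [round_up_to_even(n) for n in range(1, 5)]
odds = [round_up_to_odd(n) for n in range(1, 5)]
```

Either way, a one-line comment next to the expression stating which parity is wanted, and why, would answer the question for the next reader.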

jjmachan · 2023-07-21T11:53:14Z

src/ragas/metrics/critique.py

+class AspectCritique(MetricWithLLM):
+    """
+    strictness: self consistency checks
+    """


If you get the time, could you finish the docstring like we have for context relevancy?
Or I can do it too.

jjmachan · 2023-07-21T11:54:56Z

src/ragas/metrics/critique.py

+            if isinstance(context, list):
+                context = "\n".join(context)
+            question = f"{question } answer using context: {context}"


Why is the type for context `t.Optional[str]` when we are checking whether it is a list here?

Context is converted to a list here, before this function is called.
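
One way to make the annotation match the runtime check is to accept either form and normalise it in one place. This is only a sketch: `build_question` is a hypothetical helper, not a function from the PR, and the prompt wording is copied from the diff above.

```python
import typing as t


def build_question(
    question: str,
    context: t.Optional[t.Union[str, t.List[str]]] = None,
) -> str:
    """Append the context to the question, accepting a string or a list of strings."""
    if context is None:
        return question
    if isinstance(context, list):
        # Join multiple context passages into one newline-separated block.
        context = "\n".join(context)
    return f"{question} answer using context: {context}"
```

With the union in the signature, a type checker no longer sees the `isinstance(context, list)` branch as unreachable.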

src/ragas/evaluation.py

shahules786 added 15 commits July 20, 2023 12:49

added new metrics experiments

8c4f0d9

crtique metrics

9ec071c

merge main

276feba

added critique metrics

3e4757d

rmv

9e8fd16

rmv

8a59c59

added critique experiments

edd8294

update readme

9d1d445

rename metrics

5cf8174

added aspect critique

91d6725

added new metrics to tests

70862f9

formating

a166da6

Merge branch 'main' of https://github.com/explodinggradients/ragas in…

3d0f644

…to dev-gptscore

Merge branch 'main' of https://github.com/explodinggradients/ragas in…

fbfa533

…to dev-gptscore

update base class

f62caf6

shahules786 marked this pull request as ready for review July 21, 2023 11:32

jjmachan reviewed Jul 21, 2023

shahules786 and others added 13 commits July 21, 2023 19:22

rmv binary metrics from ragas_score

cc9625d

crtique assesments

2a0f671

update metrics

fe416d6

update aspects

5210160

added documentation

dc227cb

Merge branch 'main' of https://github.com/explodinggradients/ragas in…

c90c92a

…to dev-gptscore

change to default_factory

1f4ff75

revert commit

0ad0cce

fixed defualt factory

118e47e

fixed format

80b4d74

smaller batch for benchmark

75ff74f

fix types

4e5cfcb

Merge branch 'dev-gptscore' of https://github.com/shahules786/ragas i…

7a7a242

…nto dev-gptscore

shahules786 added 2 commits July 22, 2023 12:51

fix types

6a25fc0

added supported aspects

d5675c7

jjmachan approved these changes Jul 25, 2023

src/ragas/evaluation.py (resolved)

jjmachan merged commit e5fa2de into explodinggradients:main Jul 25, 2023
