# RAG Evaluation Basic Demo

This example shows a basic RAG evaluation pipeline based on the [Ragas](https://github.com/explodinggradients/ragas) framework. It focuses on two basic concepts:

- **Creating a test set**: This is a set of questions and answers that we'll use to evaluate a RAG pipeline.
- **Evaluation metrics**: Which metrics do we use to score a RAG pipeline? In this example, we measure the following:
    - *Faithfulness*: Can every claim made in the answer be inferred from the given contexts?
    - *Context Precision*: Did our retriever return good results that matched the question it was being asked?
    - *Answer Correctness*: Was the generated answer correct? Was it complete?
    
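To make the first two metrics concrete, here is a minimal sketch of the arithmetic behind them, assuming the per-claim and per-chunk verdicts are already known. In Ragas those verdicts come from an LLM judge; the function names `faithfulness_ratio` and `context_precision_at_k` are hypothetical helpers written for this illustration and are not part of the Ragas API. Returning NaN when no claims are available is a choice made for this sketch only.

```python
import math


def faithfulness_ratio(claim_supported):
    """Fraction of the answer's claims that the retrieved context supports.

    `claim_supported` is a list of booleans, one per claim in the answer.
    """
    if not claim_supported:
        return float("nan")  # assumption of this sketch: undefined without claims
    return sum(claim_supported) / len(claim_supported)


def context_precision_at_k(relevance):
    """Mean of precision@k taken over the ranks k that hold a relevant chunk.

    `relevance` is a list of booleans in retrieval order, one per chunk.
    """
    hits = 0
    total = 0.0
    for k, is_relevant in enumerate(relevance, start=1):
        if is_relevant:
            hits += 1
            total += hits / k  # precision@k at this relevant rank
    return total / hits if hits else 0.0


# Two of three claims are supported by the context.
print(faithfulness_ratio([True, True, False]))
# Relevant chunks at ranks 1 and 3: (1/1 + 2/3) / 2.
print(context_precision_at_k([True, False, True]))
# No relevant chunk retrieved at all.
print(context_precision_at_k([False, False]))  # 0.0
```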
**Requirements:**
- You will need an OpenAI API key, which requires a paid account. You can sign up at https://platform.openai.com/signup.
- After obtaining the key, store it in plain text in the `~/.openai.key` file in your home directory.

## Set up the evaluation environment

In [1]:
from datasets import Dataset 
import os
from pathlib import Path
from ragas import evaluate
from ragas.metrics import faithfulness, answer_correctness, context_precision

This implementation requires an OpenAI key.

In [2]:
key_path = Path.home() / ".openai.key"
try:
    # Read the key and strip surrounding whitespace, including the trailing newline.
    os.environ["OPENAI_API_KEY"] = key_path.read_text().strip()
except OSError as err:
    print(f"Could not read your OpenAI API key. To run the RAG evaluation, store the key in plain text in {key_path}: {err}")

## Create a test set: data samples we'll use to evaluate our RAG pipeline

In the `data_samples` structure below, the **answer** attribute contains the answers that a RAG pipeline might have returned to the questions asked under **question**. Try changing these answers to see how that affects the score in the next section.

In [3]:
rag_answer_1 = "The first superbowl was held on Jan 15, 1967"
rag_answer_2 = "The most super bowls have been won by The New England Patriots"

rag_context_1 = [
    'The First AFL–NFL World Championship Game was an American football game played on January 15, 1967, at the Los Angeles Memorial Coliseum in Los Angeles,'
]
rag_context_2 = [
    'The Green Bay Packers...Green Bay, Wisconsin.',
    'The Packers compete...Football Conference'
]

test_set = {
    'question': [
        'When was the first super bowl?', 
        'Who won the most super bowls?'
    ],
    'answer': [
        rag_answer_1,
        rag_answer_2 
    ],
    'contexts' : [
        rag_context_1, 
        rag_context_2
    ],
    'ground_truth': [
        'The first superbowl was held on January 15, 1967', 
        'The New England Patriots have won the Super Bowl a record six times'
    ]
}
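
When editing the answers or adding more samples, it is easy to leave the columns with different lengths or to pass a context as a single string instead of a list. The sketch below shows one way to check the structure before building the dataset. `check_test_set` is a hypothetical helper written for this illustration (it is not part of Ragas or `datasets`), and the one-row `sample` only mirrors the column names used above.

```python
REQUIRED_COLUMNS = ("question", "answer", "contexts", "ground_truth")


def check_test_set(test_set):
    """Validate the column layout of a test set and return the sample count."""
    missing = [name for name in REQUIRED_COLUMNS if name not in test_set]
    if missing:
        raise ValueError(f"Missing columns: {missing}")

    # Every column must describe the same number of samples.
    lengths = {name: len(test_set[name]) for name in REQUIRED_COLUMNS}
    if len(set(lengths.values())) != 1:
        raise ValueError(f"Columns have different lengths: {lengths}")

    # Each sample's contexts must be a list of strings, not a bare string.
    for i, contexts in enumerate(test_set["contexts"]):
        if not isinstance(contexts, list) or not all(isinstance(c, str) for c in contexts):
            raise ValueError(f"contexts[{i}] must be a list of strings")

    return lengths["question"]


# Hypothetical one-row sample with the same columns as the test set above.
sample = {
    "question": ["When was the first super bowl?"],
    "answer": ["The first superbowl was held on Jan 15, 1967"],
    "contexts": [["The First AFL–NFL World Championship Game was played on January 15, 1967"]],
    "ground_truth": ["The first superbowl was held on January 15, 1967"],
}
print(check_test_set(sample))  # 1
```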

## Now evaluate the RAG pipeline

Evaluate based on the metrics mentioned above: **faithfulness**, **context precision**, **answer correctness**.
    
Ragas provides other metrics as well; see [Ragas metrics](https://docs.ragas.io/en/latest/concepts/metrics/index.html).

Preview our test set before sending it for evaluation:

In [4]:
dataset = Dataset.from_dict(test_set)
dataset.to_pandas()

|   | question | answer | contexts | ground_truth |
|---|---|---|---|---|
| 0 | When was the first super bowl? | The first superbowl was held on Jan 15, 1967 | [The First AFL–NFL World Championship Game was... | The first superbowl was held on January 15, 1967 |
| 1 | Who won the most super bowls? | The most super bowls have been won by The New ... | [The Green Bay Packers...Green Bay, Wisconsin.... | The New England Patriots have won the Super Bo... |


Evaluation results:

In [5]:
score = evaluate(dataset, metrics=[faithfulness, context_precision, answer_correctness])
score.to_pandas()

|   | question | answer | contexts | ground_truth | faithfulness | context_precision | answer_correctness |
|---|---|---|---|---|---|---|---|
| 0 | When was the first super bowl? | The first superbowl was held on Jan 15, 1967 | [The First AFL–NFL World Championship Game was... | The first superbowl was held on January 15, 1967 | 0.0 | 1.0 | 0.749093 |
| 1 | Who won the most super bowls? | The most super bowls have been won by The New ... | [The Green Bay Packers...Green Bay, Wisconsin.... | The New England Patriots have won the Super Bo... | NaN | 0.0 | 0.731061 |
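
Per-sample scores are often easier to compare across runs once they are averaged per metric. The sketch below shows one way to do that while skipping missing values such as the empty faithfulness score in the second row. The numbers are copied from the results above, and `mean_ignoring_nan` is a hypothetical helper written for this illustration, not a Ragas function.

```python
import math


def mean_ignoring_nan(values):
    """Average the values that are not NaN; return NaN if none remain."""
    valid = [v for v in values if not math.isnan(v)]
    return sum(valid) / len(valid) if valid else float("nan")


# Per-sample scores copied from the evaluation results above.
scores = {
    "faithfulness": [0.0, float("nan")],
    "context_precision": [1.0, 0.0],
    "answer_correctness": [0.749093, 0.731061],
}

summary = {name: mean_ignoring_nan(vals) for name, vals in scores.items()}
for name, value in summary.items():
    print(f"{name}: {value:.3f}")
```

Skipping NaN means a metric's average may rest on fewer samples than the others, so it is worth reporting the number of valid samples alongside the mean when the test set grows.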
