# Introduction

Ragas is a framework that helps you evaluate your Retrieval Augmented Generation (RAG) pipelines. RAG denotes a class of LLM applications that use external data to augment the LLM’s context. 

Note: This notebook is a remake of activities provided by AI-Makerspace when I enrolled in Cohort 2 of AI Engineering.

# Why Evaluate

Because you cannot improve what you cannot measure.

Evaluating a Retrieval-Augmented Generation (RAG) pipeline tells us whether the retriever surfaces pertinent passages from trusted sources and whether the generated answers are accurate, reliable, and grounded in that context. It also helps identify and mitigate biases in the responses, and it highlights where the pipeline can be tuned for better quality, efficiency, and response times. Consistently high-quality, relevant answers are what build user trust, so thorough evaluation is essential for refining a RAG system and improving the user experience.

# Evaluation of RAG Using Ragas

In this notebook we'll explore how to evaluate RAG pipelines using Ragas, a powerful open-source tool. It gives us both component-wise metrics and end-to-end metrics for the performance of our RAG pipelines.

We'll complete the following tasks:

  1. Install required libraries
  2. Set environment variables
  3. Create a simple RAG pipeline with LangChain
  4. Generate a synthetic dataset for evaluation using Ragas
  5. Evaluate our pipeline with Ragas
  6. Make adjustments to our RAG pipeline
  7. Evaluate our adjusted pipeline against our baseline
  8. Test OpenAI's claim

The only way to get started is to get started - so let's grab our dependencies for the day!

## Task 2: Set Environment Variables

Let's set up our OpenAI API key so we can leverage their API later on.

In [3]:
import os
import openai
from getpass import getpass

openai.api_key = getpass("Please provide your OpenAI Key: ")
os.environ["OPENAI_API_KEY"] = openai.api_key
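If the key is already exported in your shell, prompting for it again is unnecessary. Here is a minimal sketch, assuming a hypothetical helper called `ensure_api_key` (it is not part of the original notebook), that only prompts when the variable is missing:

```python
import os
from getpass import getpass


def ensure_api_key(var_name="OPENAI_API_KEY", prompt_fn=getpass):
    """Return the key from the environment, prompting only if it is missing.

    `ensure_api_key` and `prompt_fn` are hypothetical names used for illustration.
    """
    key = os.environ.get(var_name)
    if not key:
        # Ask the user once and store the value for later library calls.
        key = prompt_fn(f"Please provide your {var_name}: ")
        os.environ[var_name] = key
    return key
```

Calling `ensure_api_key()` at the top of the notebook would then behave like the cell above when no key is set, and stay silent otherwise.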

## Task 3: Creating a Simple RAG Pipeline with LangChain

We'll be leveraging LangChain and LCEL to build a simple RAG pipeline that we can baseline with Ragas.

## Building our RAG pipeline

Let's review the basic steps of RAG again:

- Create an Index
- Use retrieval to obtain pieces of context from our Index that are similar to our query
- Use an LLM to generate responses based on the retrieved context

Let's get started by creating our index.

> NOTE: We're going to start leaning on the term "index" to refer to our `VectorStore`, `VectorDatabase`, etc. We can think of "index" as the catch-all term, whereas `VectorStore` and the like relate to the specific technologies used to create, store, and interact with the index.

### Creating an Index

You'll notice that the largest changes (outside of some import changes) are that our old favourite chains are back to being bundled in an easily usable abstraction.

We can still create custom chains using LCEL - but we can also be more confident that our pre-packaged chains are created using LCEL under the hood.

In [3]:
from langchain_community.document_loaders import PyMuPDFLoader

loader = PyMuPDFLoader(
    "data/lotr.pdf",
)

all_documents = loader.load()

In [4]:
all_documents[0].metadata


{'producer': 'Acrobat Distiller 6.0 (Windows)',
 'creator': 'Adobe Acrobat 9.0.0',
 'creationdate': '2009-05-23T14:56:28+05:30',
 'source': 'data/lotr.pdf',
 'file_path': 'data/lotr.pdf',
 'total_pages': 1210,
 'format': 'PDF 1.5',
 'title': 'The Lord of the Rings',
 'author': 'J. R. R. Tolkien',
 'subject': '',
 'keywords': '',
 'moddate': '2010-08-13T20:05:05+02:00',
 'trapped': '',
 'modDate': "D:20100813200505+02'00'",
 'creationDate': "D:20090523145628+05'30'",
 'page': 0}

In [5]:
#limit to BOOK One - Page 21 to 215
print('Total pages', len(all_documents))
documents = all_documents[21:215]
print('# of pages - Book One', len(documents))
print('First page', documents[0].page_content)

Total pages 1210
# of pages - Book One 194
First page TH E L ORD OF THE RI NGS 
The crucial chapter, ‘The Shadow of the Past’, is one of the oldest 
parts of the tale. It was written long before the foreshadow of 1939 
had yet become a threat of inevitable disaster, and from that point 
the story would have developed along essentially the same lines, if 
that disaster had been averted. Its sources are things long before in 
mind, or in some cases already written, and little or nothing in it was 
modiﬁed by the war that began in 1939 or its sequels. 
The real war does not resemble the legendary war in its process 
or its conclusion. If it had inspired or directed the development 
of the legend, then certainly the Ring would have been seized and 
used against Sauron; he would not have been annihilated but en-
slaved, and Barad-duˆr would not have been destroyed but occupied. 
Saruman, failing to get possession of the Ring, would in the confusion 
and treacheries of the time have found in

#### Transforming Data

Now that we've got our pages loaded as documents - let's split them into smaller pieces so we can more effectively leverage them with our retrieval chain!

We'll start with the classic: `RecursiveCharacterTextSplitter`.

In [6]:
from langchain.text_splitter import RecursiveCharacterTextSplitter

text_splitter = RecursiveCharacterTextSplitter(
    chunk_size = 500,
    chunk_overlap = 50
)

chunks = text_splitter.split_documents(documents)

Let's confirm we've split our document.

In [7]:
chunks[300].page_content

'52 \nTH E L ORD OF THE RI NGS \nwill command them all again, wherever they be, even the Three, and \nall that has been wrought with them will be laid bare, and he will be \nstronger than ever. \n‘And this is the dreadful chance, Frodo. He believed that the One \nhad perished; that the Elves had destroyed it, as should have been \ndone. But he knows now that it has not perished, that it has been \nfound. So he is seeking it, seeking it, and all his thought is bent on'

In [9]:
print(f"{len(chunks)} chunks were generated from the documents.")

1121 chunks were generated from the documents.
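To build some intuition for what `chunk_size` and `chunk_overlap` mean, here is a deliberately simplified sketch of fixed-size chunking with overlap. It is not how `RecursiveCharacterTextSplitter` is implemented (that splitter also tries to break on separators such as paragraphs and newlines). The function name `sliding_window_chunks` is hypothetical.

```python
def sliding_window_chunks(text, chunk_size=500, chunk_overlap=50):
    """Split text into fixed-size character windows that overlap.

    Simplified illustration only: it ignores separators entirely.
    """
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")

    step = chunk_size - chunk_overlap  # how far each new window advances
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last window already reached the end of the text
    return chunks


# Tiny example: windows of 4 characters that share 1 character with the previous one.
print(sliding_window_chunks("abcdefghij", chunk_size=4, chunk_overlap=1))
# ['abcd', 'defg', 'ghij']
```

The overlap means the end of one chunk is repeated at the start of the next, which helps avoid cutting a sentence in a way that hides it from retrieval.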


<div style="background-color: #204B8E; color: white; padding: 10px; border-radius: 5px;">

#### ❓ Question #1:

How many chunks were generated?

</div>


<div style="background-color: #204B8E; color: white; padding: 10px; border-radius: 5px;">

### Answer :

* 1,121 chunks were generated from the documents.
</div>


#### Loading OpenAI Embeddings Model

We'll need a process by which we can convert our text into vectors that allow us to compare to our query vector.

Let's use OpenAI's `text-embedding-ada-002` for this task!

In [17]:
from langchain_openai import OpenAIEmbeddings

embeddings = OpenAIEmbeddings(
    model="text-embedding-ada-002"
)
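Once text is embedded, "comparing to our query vector" usually means computing a similarity score between two vectors. Cosine similarity is a common choice; the sketch below uses tiny made-up vectors rather than real embeddings, and `cosine_similarity` is a hypothetical helper written for illustration.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # treat an all-zero vector as matching nothing
    return dot / (norm_a * norm_b)


# Made-up 2-dimensional "embeddings" for illustration only.
query_vector = [1.0, 0.0]
similar_chunk = [2.0, 0.0]    # same direction as the query
unrelated_chunk = [0.0, 3.0]  # orthogonal to the query

print(cosine_similarity(query_vector, similar_chunk))    # 1.0
print(cosine_similarity(query_vector, unrelated_chunk))  # 0.0
```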

#### Creating a Qdrant VectorStore

Now that we have documents - we'll need a place to store them alongside their embeddings.

In [18]:
from langchain_community.vectorstores import Qdrant

qdrant_vector_store = Qdrant.from_documents(
    chunks,
    embeddings,
    location=":memory:",
    collection_name="LOTR",
)

#### Creating a Retriever

To complete our index, all that's left to do is expose our vectorstore as a retriever - which we can do the same way we would in previous versions of LangChain!

In [19]:
retriever = qdrant_vector_store.as_retriever()

#### Testing our Retriever

Now that we've gone through the trouble of creating our retriever - let's see it in action!

In [20]:
retrieved_documents = retriever.invoke("Who is the bearer of the ring?")
for doc in retrieved_documents:
  print(doc.page_content)
  print('')

Master-ring, the One Ring to rule them all. This is the One Ring

‘Well, so I have!’ cried Bilbo. ‘And my will and all the other 
documents too. You had better take it and deliver it for me. That 
will be safest.’ 
‘No, don’t give the ring to me,’ said Gandalf. ‘Put it on the mantel-
piece. It will be safe enough there, till Frodo comes. I shall wait for him.’ 
Bilbo took out the envelope, but just as he was about to set it by 
the clock, his hand jerked back, and the packet fell on the ﬂoor.

Bilbo’s ring. I longed to disappear.’ 
‘Don’t do that!’ said Gandalf, sitting down. ‘Do be careful of that 
ring, Frodo! In fact, it is partly about that that I have come to say a 
last word.’ 
‘Well, what about it?’ 
‘What do you know already?’ 
‘Only what Bilbo told me. I have heard his story: how he found 
it, and how he used it: on his journey, I mean.’ 
‘Which story, I wonder,’ said Gandalf. 
‘Oh, not what he told the dwarves and put in his book,’ said Frodo.

It was suggested to Bilbo, as h
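The retriever returned four passages for our query. Conceptually, retrieval scores every chunk against the query and keeps only the `k` best. The sketch below shows that cutoff step with made-up scores; `top_k` is a hypothetical helper, and `k=4` is chosen only to mirror the four results shown above.

```python
import heapq


def top_k(scored_chunks, k=4):
    """Return the texts of the k highest-scoring (score, text) pairs."""
    best = heapq.nlargest(k, scored_chunks, key=lambda pair: pair[0])
    return [text for _score, text in best]


# Made-up similarity scores for illustration only.
scored = [
    (0.20, "chunk about the weather"),
    (0.90, "chunk about the One Ring"),
    (0.50, "chunk about Bilbo's party"),
]

print(top_k(scored, k=2))
# ['chunk about the One Ring', "chunk about Bilbo's party"]
```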

### Creating a RAG Chain

Now that we have the "R" in RAG taken care of - let's look at creating the "AG"!

#### Creating a Prompt Template

There are a few different ways we could create our prompt template - we could create a custom template, as seen in the code below, or we could simply pull a prompt from the prompt hub! Let's look at an example of that!

In [10]:
!pip install -U -q langchainhub ipywidgets nest_asyncio


In [11]:
from langchain import hub

retrieval_qa_prompt = hub.pull("langchain-ai/retrieval-qa-chat")



In [12]:
print(retrieval_qa_prompt.messages[0].prompt.template)

Answer any use questions based solely on the context below:

<context>
{context}
</context>


As you can see - the prompt template is simple (and has a small error) - so we'll create our own to be a bit more specific!

In [13]:
from langchain.prompts import ChatPromptTemplate

template = """Answer the question based only on the following context. If you cannot answer the question with the context, please respond with 'I don't know':

Context:
{context}

Question:
{question}
"""

prompt = ChatPromptTemplate.from_template(template)
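To see what the filled-in prompt roughly looks like before it reaches the model, here is a plain `str.format` stand-in. It is only an illustration of the placeholder substitution, not how `ChatPromptTemplate` is implemented; `TEMPLATE` and `render_prompt` are hypothetical names, and joining chunks with blank lines is an assumption made for readability.

```python
TEMPLATE = """Answer the question based only on the following context. If you cannot answer the question with the context, please respond with 'I don't know':

Context:
{context}

Question:
{question}
"""


def render_prompt(question, context_chunks):
    """Fill the two placeholders with a question and the retrieved chunks."""
    context = "\n\n".join(context_chunks)  # assumption: chunks separated by blank lines
    return TEMPLATE.format(context=context, question=question)


print(render_prompt(
    "Who is the bearer of the ring?",
    ["Do be careful of that ring, Frodo!", "Bilbo took out the envelope."],
))
```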

#### Setting Up our Basic QA Chain

Now we can instantiate our basic RAG chain!

We'll use LCEL directly just to see an example of it - but you could just as easily use an abstraction here to achieve the same goal!

We'll also make sure to pass our context through to the output - which is critical for Ragas.

In [21]:
from operator import itemgetter

from langchain_openai import ChatOpenAI
from langchain_core.output_parsers import StrOutputParser
from langchain_core.runnables import RunnablePassthrough

primary_qa_llm = ChatOpenAI(model_name="gpt-3.5-turbo", temperature=0)

retrieval_augmented_qa_chain = (
    # INVOKE CHAIN WITH: {"question" : "<<SOME USER QUESTION>>"}
    # "question" : populated by getting the value of the "question" key
    # "context"  : populated by getting the value of the "question" key and chaining it into the retriever
    {"context": itemgetter("question") | retriever, "question": itemgetter("question")}
    # "context"  : re-assigned with RunnablePassthrough.assign, so every key from the previous step
    #              (including "question") is carried forward unchanged into the next step
    | RunnablePassthrough.assign(context=itemgetter("context"))
    # "response" : the "context" and "question" values are used to format our prompt object and then piped
    #              into the LLM and stored in a key called "response"
    # "context"  : populated by getting the value of the "context" key from the previous step
    | {"response": prompt | primary_qa_llm, "context": itemgetter("context")}
)
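The data flow through the chain can be hard to picture from the LCEL syntax alone. The sketch below walks through the same three steps with plain Python dictionaries and stand-in functions. `fake_retriever`, `fake_llm`, and `run_chain` are hypothetical stubs, not LangChain APIs, and the real chain returns a message object under `"response"` (which is why the notebook reads `.content` from it) rather than a plain string.

```python
def fake_retriever(question):
    """Stand-in for the real retriever: returns canned chunks."""
    return ["chunk about Bilbo", "chunk about Frodo"]


def fake_llm(prompt_text):
    """Stand-in for the chat model: returns a placeholder string."""
    return f"(answer generated from {len(prompt_text)} characters of prompt)"


def run_chain(inputs):
    # Step 1: build "context" from the question, and keep the question itself.
    step1 = {
        "context": fake_retriever(inputs["question"]),
        "question": inputs["question"],
    }
    # Step 2: keep every key and re-assign "context" (mirrors the assign step).
    step2 = {**step1, "context": step1["context"]}
    # Step 3: format the prompt, call the model, and also return the context
    # so that it can be collected later for evaluation.
    prompt_text = f"Context:\n{step2['context']}\n\nQuestion:\n{step2['question']}"
    return {"response": fake_llm(prompt_text), "context": step2["context"]}


result = run_chain({"question": "Who is the bearer of the ring?"})
print(sorted(result.keys()))  # ['context', 'response']
```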

Let's test it out!

In [28]:
question = "Who is the bearer of the ring?"

result = retrieval_augmented_qa_chain.invoke({"question" : question})

print(result["response"].content)

Bilbo is the bearer of the ring.


In [29]:
question = "Who is the eventual bearer of the ring?"

result = retrieval_augmented_qa_chain.invoke({"question" : question})

print(result["response"].content)

The eventual bearer of the ring is Frodo.


We can already see that there are some improvements we could make here.

For now, let's switch gears to RAGAS to see how we can leverage that tool to provide us insight into how our pipeline is performing!

## Task 4: Evaluation using Ragas

Ragas is a powerful library that lets us evaluate our RAG pipeline by collecting input/output/context triplets and obtaining metrics relating to a number of different aspects of our RAG pipeline.

We'll be evaluating on every core metric today, but in order to do that - we'll need a test set. Luckily for us, Ragas can generate one directly! Here we load a test set that was saved to `data/lotr_testset.csv`.

In [22]:
# load the CSV - lotr_testset.csv
import pandas as pd
import nest_asyncio

nest_asyncio.apply()
test_df = pd.read_csv('data/lotr_testset.csv')

Let's look at the output and see what we can learn about it!

In [23]:
test_df

Unnamed: 0,question,contexts,ground_truth,evolution_type,metadata,episode_done
0,How did the speaker's bad cold affect their ex...,"['been asked to ﬁll up the required number, li...",The speaker's bad cold affected their experien...,simple,"[{'source': 'lotr.pdf', 'file_path': 'lotr.pdf...",True
1,What happened to Pippin after he fell asleep i...,"['After a while Pippin fell fast asleep, and w...",Pippin was lifted up and borne away to a bower...,simple,"[{'source': 'lotr.pdf', 'file_path': 'lotr.pdf...",True
2,What was Frodo often seen doing far from home ...,['over the Shire with them; but more often he ...,Frodo was often seen walking in the hills and ...,simple,"[{'source': 'lotr.pdf', 'file_path': 'lotr.pdf...",True
3,What is Bilbo's attitude towards going on the ...,"['‘Well, that’s that,’ he said. ‘Now I’m off !...",Bilbo's attitude towards going on the Road wit...,simple,"[{'source': 'lotr.pdf', 'file_path': 'lotr.pdf...",True
4,What is the significance of the Shire in the c...,['over the Shire with them; but more often he ...,The answer to given question is not present in...,simple,"[{'source': 'lotr.pdf', 'file_path': 'lotr.pdf...",True
5,Who did Frodo find sitting in the rushes by th...,['By that pool long ago I found the River-daug...,"The River-daughter, fair young Goldberry",simple,"[{'source': 'lotr.pdf', 'file_path': 'lotr.pdf...",True
6,What led to the increased interest in Hobbit h...,['14 \nTH \nE L \nORD OF THE RI \nNGS \nof his...,The part played by the Hobbits in the great ev...,simple,"[{'source': 'lotr.pdf', 'file_path': 'lotr.pdf...",True
7,"Who did Bilbo address as ""My dear Bagginses an...",['silk waistcoat. They could all see him stand...,"Bilbo addressed the Bagginses and Boffins as ""...",simple,"[{'source': 'lotr.pdf', 'file_path': 'lotr.pdf...",True
8,Who can Frodo trust to stick by his side and k...,"['has some sense, mind you; and when you said ...","Frodo can trust Sam, Merry, and the rest of hi...",simple,"[{'source': 'lotr.pdf', 'file_path': 'lotr.pdf...",True
9,How did courage play a role in Gandalf's decis...,['but free and alive himself. Gandalf would ad...,The courage that had been awakened in Gandalf ...,simple,"[{'source': 'lotr.pdf', 'file_path': 'lotr.pdf...",True


### Generating Responses with RAG Pipeline

Now that we have some QC pairs, and some ground truths, let's evaluate our RAG pipeline using Ragas.

The process is, again, quite straightforward - thanks to Ragas and LangChain!

Let's start by extracting our questions and ground truths from our test set.

The test set is already a Pandas DataFrame, so we can pull each column out as a list.

In [24]:
test_questions = test_df["question"].values.tolist()
test_groundtruths = test_df["ground_truth"].values.tolist()

Now we'll run each test question through our RAG pipeline to generate a response - we'll also need to collect the retrieved contexts for each question.

We'll do this in a simple loop to see exactly what's happening!

In [25]:
answers = []
contexts = []

for question in test_questions:
  response = retrieval_augmented_qa_chain.invoke({"question" : question})
  answers.append(response["response"].content)
  contexts.append([context.page_content for context in response["context"]])

Now we can wrap our information in a Hugging Face dataset for use in the Ragas library.

In [26]:
from datasets import Dataset

response_dataset = Dataset.from_dict({
    "question" : test_questions,
    "answer" : answers,
    "contexts" : contexts,
    "ground_truth" : test_groundtruths
})

Let's take a peek and see what that looks like!

In [27]:
response_dataset[0]

{'question': "How did the speaker's bad cold affect their experience at the banquet?",
 'answer': "The speaker's bad cold affected their experience at the banquet by making them only able to say 'thag you very buch' instead of properly thanking the guests.",
 'contexts': ['ﬁfty-one then, and birthdays did not seem so important. The banquet was \nvery splendid, however, though I had a bad cold at the time, I remember, \nand could only say ‘thag you very buch’. I now repeat it more correctly: \nThank you very much for coming to my little party. Obstinate silence. \nThey all feared that a song or some poetry was now imminent; and \nthey were getting bored. Why couldn’t he stop talking and let them \ndrink his health? But Bilbo did not sing or recite. He paused for a',
  'they had felt it was impossible to refuse. Besides, their cousin, Bilbo, \nhad been specializing in food for many years and his table had a high \nreputation. \nAll the one hundred and forty-four guests expected a pleasan

## Task 5: Evaluating our Pipeline with Ragas

Now that we have our response dataset - we can finally get into the "meat" of Ragas - evaluation!

First, we'll import the desired metrics, then we can use them to evaluate our created dataset!

Check out the specific metrics we'll be using in the Ragas documentation:

- [Faithfulness](https://docs.ragas.io/en/stable/concepts/metrics/faithfulness.html)
- [Answer Relevancy](https://docs.ragas.io/en/stable/concepts/metrics/answer_relevance.html)
- [Context Precision](https://docs.ragas.io/en/stable/concepts/metrics/context_precision.html)
- [Context Recall](https://docs.ragas.io/en/stable/concepts/metrics/context_recall.html)
- [Answer Correctness](https://docs.ragas.io/en/stable/concepts/metrics/answer_correctness.html)

See the accompanying presentation for more in-depth explanations about each of the metrics!

In [30]:
from ragas import evaluate
from ragas.metrics import (
    faithfulness,
    answer_relevancy,
    answer_correctness,
    context_recall,
    context_precision,
)

metrics = [
    faithfulness,
    answer_relevancy,
    context_recall,
    context_precision,
    answer_correctness,
]

All that's left to do is call "evaluate" and away we go!

In [31]:
import nest_asyncio

nest_asyncio.apply()
results = evaluate(response_dataset, metrics)

Evaluating:   0%|          | 0/90 [00:00<?, ?it/s]

In [32]:
results

{'faithfulness': 0.5917, 'answer_relevancy': 0.8391, 'context_recall': 0.7037, 'context_precision': 0.6034, 'answer_correctness': 0.5623}

In [33]:
results_df = results.to_pandas()
results_df

Unnamed: 0,user_input,retrieved_contexts,response,reference,faithfulness,answer_relevancy,context_recall,context_precision,answer_correctness
0,How did the speaker's bad cold affect their ex...,"[ﬁfty-one then, and birthdays did not seem so ...",The speaker's bad cold affected their experien...,The speaker's bad cold affected their experien...,1.0,0.980049,1.0,1.0,0.890641
1,What happened to Pippin after he fell asleep i...,"[Pippin, and he threw himself upon a bed and f...",Pippin fell asleep and dreamed pleasantly.,Pippin was lifted up and borne away to a bower...,1.0,0.935327,1.0,0.5,0.661209
2,What was Frodo often seen doing far from home ...,[falling asleep. The hobbits sat still before ...,Frodo was often seen walking and talking with ...,Frodo was often seen walking in the hills and ...,0.0,0.978512,0.0,0.416667,0.455312
3,What is Bilbo's attitude towards going on the ...,[used it on their way to their mines in the Bl...,Bilbo's attitude towards going on the Road wit...,Bilbo's attitude towards going on the Road wit...,1.0,0.997583,1.0,0.25,0.846095
4,What is the significance of the Shire in the c...,"[there the king had once had many farms, cornl...",The Shire is significant in the context of the...,The answer to given question is not present in...,0.75,0.989411,1.0,1.0,0.179858
5,Who did Frodo find sitting in the rushes by th...,[him.’ But at that moment Mr. Butterbur was ca...,Strider,"The River-daughter, fair young Goldberry",0.0,0.795878,0.0,0.0,0.198854
6,What led to the increased interest in Hobbit h...,[western Hobbits fell in love with their new l...,The part played by the Hobbits in the great ev...,The part played by the Hobbits in the great ev...,1.0,0.862027,1.0,0.638889,1.0
7,"Who did Bilbo address as ""My dear Bagginses an...","[point. No, I was not troubled about dear Bilb...","Bilbo addressed the hobbits as ""My dear Baggin...","Bilbo addressed the Bagginses and Boffins as ""...",1.0,0.896503,1.0,1.0,0.547381
8,Who can Frodo trust to stick by his side and k...,[Sam looked at him unhappily. ‘It all depends ...,Frodo can trust Merry and Sam to stick by his ...,"Frodo can trust Sam, Merry, and the rest of hi...",1.0,0.999999,1.0,0.805556,0.997077
9,How did courage play a role in Gandalf's decis...,[the incantation. Then a wild thought of escap...,I don't know.,The courage that had been awakened in Gandalf ...,0.0,0.0,0.666667,0.833333,0.185671
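To make one of these columns less of a black box, here is a rough sketch of how a context precision score can be computed from per-chunk relevance judgments. It assumes the definition described in the Ragas docs (the mean of precision@k over the ranks that hold a relevant chunk). In Ragas the relevance judgments come from an LLM; the flags below are hypothetical inputs chosen for illustration, and `context_precision_from_flags` is not a Ragas function.

```python
def context_precision_from_flags(relevance_flags):
    """Mean of precision@k over the ranks k where the chunk is relevant.

    relevance_flags: list of 0/1 values, ordered as the chunks were retrieved.
    """
    relevant_seen = 0
    weighted_sum = 0.0
    for rank, is_relevant in enumerate(relevance_flags, start=1):
        if is_relevant:
            relevant_seen += 1
            weighted_sum += relevant_seen / rank  # precision@rank
    if relevant_seen == 0:
        return 0.0  # nothing relevant was retrieved
    return weighted_sum / relevant_seen


# Hypothetical judgments for four retrieved chunks.
print(round(context_precision_from_flags([1, 0, 1, 0]), 6))  # 0.833333
print(round(context_precision_from_flags([0, 0, 1, 1]), 6))  # 0.416667
```

Values such as 0.833333 and 0.416667 in the `context_precision` column above are consistent with this formula, and it shows why the metric rewards placing relevant chunks near the top of the ranking.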


## Task 6: Making Adjustments to our RAG Pipeline

Now that we have established a baseline - we can see how any changes impact our pipeline's performance!

Let's modify our retriever and see how that impacts our Ragas metrics!

> NOTE: MultiQueryRetriever is expanded on [here](https://python.langchain.com/docs/modules/data_connection/retrievers/MultiQueryRetriever) but for now, the implementation is not important to our lesson!

In [34]:
from langchain.retrievers import MultiQueryRetriever

advanced_retriever = MultiQueryRetriever.from_llm(retriever=retriever, llm=primary_qa_llm)

We'll also re-create our RAG pipeline using the abstractions that come packaged with LangChain v0.1.0!

First, let's create a chain to "stuff" our documents into our context!

In [35]:
from langchain.chains.combine_documents import create_stuff_documents_chain

document_chain = create_stuff_documents_chain(primary_qa_llm, retrieval_qa_prompt)

Next, we'll create the retrieval chain!

In [36]:
from langchain.chains import create_retrieval_chain

retrieval_chain = create_retrieval_chain(advanced_retriever, document_chain)

In [37]:
response = retrieval_chain.invoke({"input": "Who is the bearer of the ring?"})

In [38]:
print(response["answer"])

The bearer of the ring is Frodo.


In [39]:
response = retrieval_chain.invoke({"input": "Who are the friends of Frodo?"})

In [40]:
print(response["answer"])

Frodo's closest friends are Peregrin Took (Pippin) and Merry Brandybuck (Meriadoc). Other friends mentioned are Folco Boffin and Fredegar Bolger.


<div style="background-color: #204B8E; color: white; padding: 10px; border-radius: 5px;">

#### 🏗️ Question #3:

What does the MultiQueryRetriever do? Why did the model respond better this time?

</div>

<div style="background-color: #204B8E; color: white; padding: 10px; border-radius: 5px;">

### Answer :

* The MultiQueryRetriever uses an LLM to generate several differently worded versions of the user's input query, retrieves documents for each version, and combines the results. This automates the query-rewording step we would otherwise tune by hand. The model responded better this time because more relevant and complementary chunks were retrieved, giving the LLM fuller context.
</div>

Well, just from those responses this chain *feels* better - but let's see how it performs on our eval!

Let's do the same process we did before to collect our pipeline's contexts and answers.

In [41]:
answers = []
contexts = []

for question in test_questions:
  response = retrieval_chain.invoke({"input" : question})
  answers.append(response["answer"])
  contexts.append([context.page_content for context in response["context"]])

Now we can convert this into a dataset, just like we did before.

In [42]:
response_dataset_advanced_retrieval = Dataset.from_dict({
    "question" : test_questions,
    "answer" : answers,
    "contexts" : contexts,
    "ground_truth" : test_groundtruths
})

Let's evaluate on the same metrics we did for the first pipeline and see how it does!

In [43]:
advanced_retrieval_results = evaluate(response_dataset_advanced_retrieval, metrics)

Evaluating:   0%|          | 0/90 [00:00<?, ?it/s]

In [44]:
advanced_retrieval_results_df = advanced_retrieval_results.to_pandas()
advanced_retrieval_results_df

Unnamed: 0,user_input,retrieved_contexts,response,reference,faithfulness,answer_relevancy,context_recall,context_precision,answer_correctness
0,How did the speaker's bad cold affect their ex...,[they had felt it was impossible to refuse. Be...,The speaker's bad cold affected their experien...,The speaker's bad cold affected their experien...,1.0,0.980072,1.0,0.5,0.74737
1,What happened to Pippin after he fell asleep i...,"[Pippin, and he threw himself upon a bed and f...","Pippin fell asleep dreaming pleasantly, but a ...",Pippin was lifted up and borne away to a bower...,0.8,0.949836,1.0,0.5,0.526391
2,What was Frodo often seen doing far from home ...,[away to a bower under the trees; there he was...,Frodo was often seen walking and talking with ...,Frodo was often seen walking in the hills and ...,0.5,0.877174,1.0,0.416667,0.535651
3,What is Bilbo's attitude towards going on the ...,"[Frodo himself, after the ﬁrst shock, found th...",Bilbo seems to be excited and ready to go on t...,Bilbo's attitude towards going on the Road wit...,0.6,0.968762,0.0,0.0,0.566873
4,What is the significance of the Shire in the c...,"[there the king had once had many farms, cornl...",The Shire is a region where the Hobbits live a...,The answer to given question is not present in...,0.714286,0.975736,1.0,0.966667,0.175587
5,Who did Frodo find sitting in the rushes by th...,[him.’ But at that moment Mr. Butterbur was ca...,Frodo found Sam sitting on the grass near the ...,"The River-daughter, fair young Goldberry",1.0,0.883318,0.0,0.0,0.195708
6,What led to the increased interest in Hobbit h...,[western Hobbits fell in love with their new l...,The increased interest in Hobbit history durin...,The part played by the Hobbits in the great ev...,0.5,0.995804,0.0,0.638889,0.633621
7,"Who did Bilbo address as ""My dear Bagginses an...",[dants of the Old Took) who had as children be...,"Bilbo addressed the hobbits as ""My dear Baggin...","Bilbo addressed the Bagginses and Boffins as ""...",1.0,0.896291,1.0,0.5,0.545685
8,Who can Frodo trust to stick by his side and k...,[Sam looked at him unhappily. ‘It all depends ...,Frodo can trust Merry and Sam to stick by his ...,"Frodo can trust Sam, Merry, and the rest of hi...",0.75,0.980996,1.0,0.8875,0.744855
9,How did courage play a role in Gandalf's decis...,[the incantation. Then a wild thought of escap...,"Courage played a role in Frodo's decision, not...",The courage that had been awakened in Gandalf ...,0.833333,0.920506,0.666667,0.833333,0.343957


## Task 7: Evaluating our Adjusted Pipeline Against Our Baseline

Now we can compare our results and see what directional changes occurred!

Let's refresh with our initial metrics.

In [45]:
results

{'faithfulness': 0.5917, 'answer_relevancy': 0.8391, 'context_recall': 0.7037, 'context_precision': 0.6034, 'answer_correctness': 0.5623}

And see how our advanced retrieval modified our chain!

In [46]:
advanced_retrieval_results

{'faithfulness': 0.7401, 'answer_relevancy': 0.8921, 'context_recall': 0.6852, 'context_precision': 0.5542, 'answer_correctness': 0.5625}
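Reading two dictionaries side by side is error-prone, so a small sketch like the one below can print the per-metric change. The numbers are copied by hand from the two outputs above into plain dictionaries; `baseline` and `advanced` are hypothetical names, not the Ragas result objects themselves.

```python
# Aggregate scores copied from the two result outputs above.
baseline = {
    "faithfulness": 0.5917,
    "answer_relevancy": 0.8391,
    "context_recall": 0.7037,
    "context_precision": 0.6034,
    "answer_correctness": 0.5623,
}
advanced = {
    "faithfulness": 0.7401,
    "answer_relevancy": 0.8921,
    "context_recall": 0.6852,
    "context_precision": 0.5542,
    "answer_correctness": 0.5625,
}

# Positive delta means the MultiQueryRetriever pipeline scored higher.
deltas = {name: round(advanced[name] - baseline[name], 4) for name in baseline}

for name, delta in deltas.items():
    print(f"{name:<20} {baseline[name]:.4f} -> {advanced[name]:.4f} ({delta:+.4f})")
```

With only ten test questions, small differences such as the change in answer correctness are best read as directional hints rather than firm conclusions.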

<div style="background-color: #204B8E; color: white; padding: 10px; border-radius: 5px;">

#### 🏗️ Question #4:

How does standard RAG compare with the MultiQueryRetriever?

</div>

<div style="background-color: #204B8E; color: white; padding: 10px; border-radius: 5px;">

### Answers :

* Standard RAG runs a single semantic search with the user's query and returns the top-k chunks. This is faster and less expensive, but it may overlook relevant passages when their wording or vocabulary differs from the query. In contrast, the MultiQueryRetriever uses an LLM to generate multiple queries from different perspectives for a given user input query. For each query, it retrieves a set of relevant documents and takes the unique union across all queries to get a larger set of potentially relevant documents. As a result, it can overcome some limitations of distance-based retrieval and provide a richer array of results, which can improve recall and context diversity.
* In this run, faithfulness improved from 0.5917 to 0.7401 and answer relevancy from 0.8391 to 0.8921, while context recall (0.7037 to 0.6852) and context precision (0.6034 to 0.5542) dipped slightly and answer correctness was essentially unchanged (0.5623 vs. 0.5625).
</div>
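The "unique union across all queries" step can be sketched in a few lines. This is only an illustration of the idea, not the LangChain implementation: the query variants are hard-coded here (in the real retriever an LLM generates them), and `multi_query_retrieve` and `toy_index` are hypothetical names.

```python
def multi_query_retrieve(query_variants, retrieve_fn):
    """Retrieve for every query variant and keep each chunk only once."""
    seen = set()
    merged = []
    for query in query_variants:
        for chunk in retrieve_fn(query):
            if chunk not in seen:  # drop duplicates, keep first-seen order
                seen.add(chunk)
                merged.append(chunk)
    return merged


# Toy "index": each query wording maps to the chunks it would retrieve.
toy_index = {
    "Who is the bearer of the ring?": ["chunk A", "chunk B"],
    "Which character carries the One Ring?": ["chunk B", "chunk C"],
}

print(multi_query_retrieve(list(toy_index), lambda q: toy_index[q]))
# ['chunk A', 'chunk B', 'chunk C']
```

The merged list is larger than either single result, which is where the extra coverage comes from, and also why the LLM may receive more context that is only loosely relevant.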