*Author: [Daniel Puente Viejo](https://www.linkedin.com/in/danielpuenteviejo/)*

## **Automatic RAG Dataset Creation and Evaluation with Giskard & RAGAS**

<img src="imgs/cover_page.png" height="200"> 

A practical example of how to automatically create a RAG test dataset with Giskard and evaluate retrieval and answers with RAGAS.

### **Index:**

- <a href='#1'><ins>1. 🔧 Setup</ins></a>
- <a href='#2'><ins>2. 📦 Chunking & Vector DB creation</ins></a>
- <a href='#3'><ins>3. ⚙️ Create dataset</ins></a>
- <a href='#4'><ins>4. 🔄 Retrieve examples & Evaluate</ins></a>
- <a href='#5'><ins>5. 🎯 Answer questions & Evaluate</ins></a>

### <a id='1' style="color: skyblue;">**1. 🔧 Setup**</a>

We import the libraries and define configuration variables such as the index name and the model names.

```bash
conda create -n <name> python=3.10.15
conda activate <name>
pip install -r requirements.txt
```

In [1]:
import warnings
warnings.filterwarnings("ignore")

from dotenv import load_dotenv
load_dotenv()

import pandas as pd
import os

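# The vector store and loaders live in langchain_community in current LangChain releases.
from langchain_community.vectorstores import FAISS
from langchain_community.document_loaders import DataFrameLoader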
from langchain_google_genai import GoogleGenerativeAIEmbeddings, ChatGoogleGenerativeAI

index_name = 'sample_index'
embedding_model_name = 'embedding-001'
llm_model_name = 'gemini-1.5-flash'
use_langsmith = True

It is important to configure the environment variables and to instantiate the embedding model and the LLM that will be used. Optionally, LangSmith can be enabled to trace runs and track metrics.

Configure the environment variables with a `.env` file.
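Before running the next cell, a quick check along these lines can catch variables that are missing from the `.env` file. This is only a sketch: the helper `missing_env_vars` is hypothetical (it is not part of the notebook or of any library), and the variable names are the ones this notebook reads.

```python
import os

# Variables this notebook reads; the LangSmith ones only matter when tracing is on.
REQUIRED_VARS = ["GEMINI_API_KEY"]
LANGSMITH_VARS = ["LANGSMITH_API_KEY", "LANGSMITH_PROJECT"]


def missing_env_vars(env, use_langsmith=True):
    """Return the names of required variables that are absent or empty in `env`."""
    required = REQUIRED_VARS + (LANGSMITH_VARS if use_langsmith else [])
    return [name for name in required if not env.get(name)]


# Hypothetical usage: report anything missing before the clients are created.
missing = missing_env_vars(os.environ, use_langsmith=True)
if missing:
    print("Missing environment variables:", ", ".join(missing))
```

Passing the environment as an argument keeps the check easy to try with a plain dictionary.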

In [2]:
# Fail early with a KeyError if the key is missing from the .env file.
gemini_api_key = os.environ["GEMINI_API_KEY"]
os.environ["GOOGLE_API_KEY"] = gemini_api_key  # langchain_google_genai reads GOOGLE_API_KEY

if use_langsmith:
    # LANGSMITH_API_KEY and LANGSMITH_PROJECT are already in the environment via load_dotenv().
    os.environ["LANGSMITH_TRACING"] = "true"
    from langsmith import utils
    print("LangSmith tracing enabled:", utils.tracing_is_enabled())

embeddings = GoogleGenerativeAIEmbeddings(model=f"models/{embedding_model_name}")
llm = ChatGoogleGenerativeAI(model=llm_model_name,
                             temperature=0.0, 
                             max_output_tokens=2048)

We load the clients. These are helper classes defined in the **`/src` folder** that group common functionality so that the notebook stays short. They are:

- **Chunking** - Facilitates generating chunks given a set of paths.
- **RAGDataset** - Automatically generates the dataset for us.
- **Retrieval** - Given a set of questions, returns the retrieved results.
- **Answering** - Given a set of questions, returns the answers.
- **Evaluation** - Allows us to evaluate both the retrieval and the answers.

In [3]:
from src.chunk_generation import Chunking
from src.dataset_creation import RAGDataset
from src.retrieval import Retrieval
from src.answer import Answering
from src.evaluation import Evaluation

chunking_client = Chunking()
datasetcreation_client = RAGDataset(llm_model=f"gemini/{llm_model_name}",
                                    embedding_model=f"gemini/{embedding_model_name}")
retrieval_client = Retrieval()
answering_client = Answering()
evaluation_client = Evaluation(llm, embeddings)

### <a id='2' style="color: skyblue;">**2. 📦 Chunking & Vector DB creation**</a>

In [4]:
data_folder = "data/source"
paths = [f"{data_folder}/{f}" for f in os.listdir(data_folder) if f.endswith(".pdf")]
print(paths)

['data/source/Breaking Bad.pdf', 'data/source/Game of Thrones.pdf', 'data/source/La casa de papel.pdf', 'data/source/Suits.pdf']


If the chunks have not been generated yet, the following cell creates them and saves them to a CSV file; otherwise it loads the existing file.

In [None]:
chunks_path = "data/chunks"
chunks_filename = "chunks.csv"
chunks_complete_path = f"{chunks_path}/{chunks_filename}"

if not os.path.exists(chunks_complete_path):
    print("Creating chunks...")
    os.makedirs(chunks_path, exist_ok=True)  # make sure the output folder exists before writing
    df_chunks = chunking_client.preprocess_chunking(paths=paths, chunk_size=500, chunk_overlap=50)
    df_chunks.to_csv(chunks_complete_path, index=False)

else:
    print("Loading chunks...")
    df_chunks = pd.read_csv(chunks_complete_path)
    
df_chunks.head()

Creating chunks...


|   | chunk_id | content | filename | page |
|---|---|---|---|---|
| 0 | abc41dd7-fa60-4e14-babb-ae468feb72b9 | Synopsis \nWhat would you do if you found out... | Breaking Bad.pdf | 1 |
| 1 | 51dd275d-3370-4c7d-8a37-51ab5a54d5a9 | . With \nhis business at risk, Walter hires la... | Breaking Bad.pdf | 1 |
| 2 | 88906208-3410-4029-9db8-61f1b8c34a6c | divorce. \nGus is determined to bring Walter ... | Breaking Bad.pdf | 2 |
| 3 | 66a16eba-cc39-4768-8982-24e54fa8dc46 | \nThe choice of the title Breaking Bad offe... | Breaking Bad.pdf | 2 |
| 4 | 265195cb-90f4-4355-8750-c5a9d460e2ae | Game of Thrones (Juego de tronos) , also commo... | Game of Thrones.pdf | 1 |
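To make the `chunk_size=500` and `chunk_overlap=50` arguments concrete, here is a minimal sketch of a sliding-window splitter. It assumes both values are measured in characters and that splitting ignores sentence boundaries; the `Chunking` client in `/src` may work differently (for example by splitting on separators or tokens), and `split_with_overlap` is a hypothetical name used only for illustration.

```python
def split_with_overlap(text, chunk_size=500, chunk_overlap=50):
    """Split `text` into windows of at most `chunk_size` characters.

    Each window starts `chunk_size - chunk_overlap` characters after the
    previous one, so consecutive windows share `chunk_overlap` characters.
    """
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be >= 0 and smaller than chunk_size")
    step = chunk_size - chunk_overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # this window already reaches the end of the text
    return chunks


# Toy input: 1200 characters split with the same settings as the cell above.
pieces = split_with_overlap("x" * 1200, chunk_size=500, chunk_overlap=50)
print([len(p) for p in pieces])
```

The overlap means that a sentence cut at the end of one chunk is likely to appear whole at the start of the next, at the cost of storing some text twice.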


If the index does not exist yet, the following cell creates it and saves it to disk; otherwise it loads the saved index.

For this code, we use the LangChain and FAISS libraries. **FAISS (Facebook AI Similarity Search)** is a library for efficient similarity search and clustering of dense vectors. It allows fast retrieval of the vectors most similar to a query vector, even in large datasets. It is commonly used for tasks such as recommendation systems, image retrieval, and semantic search, which makes it a valuable tool for RAG applications.

In [None]:
if not os.path.exists(index_name):
    print("Creating index...")
    loader = DataFrameLoader(df_chunks, page_content_column="content")
    data = loader.load()
    db = FAISS.from_documents(data, embedding=embeddings)
    db.save_local(index_name)

else:
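    print("Loading index...")
    # load_local unpickles the stored docstore, so only enable this flag for index files you created yourself.
    db = FAISS.load_local(index_name, embeddings, allow_dangerous_deserialization=True)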
db

Creating index...


<langchain_community.vectorstores.faiss.FAISS at 0x191fa719570>
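Conceptually, a similarity search ranks the stored vectors by their distance to the query vector and returns the closest ones. The sketch below does this by brute force in plain Python, using Euclidean (L2) distance as an illustrative choice. It is not FAISS and does not use its API; FAISS indexes answer the same question far more efficiently, and the metric used by a given index depends on how it was built.

```python
import math


def l2_distance(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def top_k(query, vectors, k=2):
    """Return (index, distance) pairs for the k vectors closest to `query`."""
    scored = [(i, l2_distance(query, v)) for i, v in enumerate(vectors)]
    return sorted(scored, key=lambda pair: pair[1])[:k]


# Toy 2-D "embeddings"; real embedding vectors have hundreds of dimensions.
toy_vectors = [[0.0, 0.0], [1.0, 0.0], [0.0, 3.0], [5.0, 5.0]]
print(top_k([0.9, 0.1], toy_vectors, k=2))
```

Brute force compares the query with every stored vector, which is fine for a handful of chunks but scales linearly with the size of the collection.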

### <a id='3' style="color: skyblue;">**3. ⚙️ Create dataset**</a>

In [7]:
num_questions = 5

testset_path = "data/testset"
testset_filename = "testset.csv"
testset_complete_path = f"{testset_path}/{testset_filename}"

if not os.path.exists(testset_complete_path):
    print("Creating dataset...")
    os.makedirs(testset_path, exist_ok=True)  # make sure the output folder exists before writing
    testset_df = datasetcreation_client.dataset_creation(df_chunks, num_questions=num_questions)
    testset_df.to_csv(testset_complete_path, index=False)

else:
    print("Loading dataset...")
    testset_df = pd.read_csv(testset_complete_path)

testset_df.head()

Creating dataset...
2025-04-01 11:07:42,302 pid:5704 MainThread giskard.rag  INFO     Finding topics in the knowledge base.

2025-04-01 11:08:09,281 pid:5704 MainThread giskard.rag  INFO     Found 4 topics in the knowledge base.


Generating questions: 100%|██████████| 5/5 [00:11<00:00,  2.32s/it]


|   | id | question | reference_answer | reference_context | reference_context_id | reference_metadata | reference_context_metadata |
|---|---|---|---|---|---|---|---|
| 0 | 069ea02e-88be-49e8-b129-5c9b01b0dd35 | What happens in Season 1 of Suits, focusing on... | Season 1 introduces Mike Ross, hired by Harvey... | [Suits \nSuits follows Mike Ross, a brillian... | [9901b576-2d51-4340-a1a1-9b2bf655a75a] | {'question_type': 'simple', 'seed_document_id'... | [{'chunk_id': '9901b576-2d51-4340-a1a1-9b2bf65... |
| 1 | 7d737df0-f795-4752-9425-33330aebadf1 | Considering the American fantasy television se... | The television series is called Game of Throne... | [Game of Thrones (Juego de tronos) , also comm... | [265195cb-90f4-4355-8750-c5a9d460e2ae] | {'question_type': 'complex', 'seed_document_id... | [{'chunk_id': '265195cb-90f4-4355-8750-c5a9d46... |
| 2 | a863ca13-a910-4e86-9a7c-150757c9f50f | Considering the events of Season 5 of Money He... | In Season 4, Joffrey is poisoned at his weddin... | [Daenerys travels to the city of Astapor, wher... | [bee6f43a-6a79-485d-bd82-ab7dc870f382] | {'question_type': 'distracting element', 'seed... | [{'chunk_id': 'bee6f43a-6a79-485d-bd82-ab7dc87... |
| 3 | 358f1a49-3471-44d6-9110-9182a021dc0c | Hi, I'm writing a piece comparing the ethical ... | The gang members escape the Bank in body bags—... | [the theft. \nThe military disables all explo... | [7ef06b8c-2969-4dfb-a76c-0eaeb64b74f2] | {'question_type': 'situational', 'seed_documen... | [{'chunk_id': '7ef06b8c-2969-4dfb-a76c-0eaeb64... |
| 4 | 4698ce12-87eb-4f1f-9e8b-e5a6b2173195 | What is the central conflict in Season 1 of Su... | Season 1 of Suits centers on Mike Ross's secre... | [Suits \nSuits follows Mike Ross, a brillian... | [9901b576-2d51-4340-a1a1-9b2bf655a75a] | {'question_type': 'double', 'original_question... | [{'chunk_id': '9901b576-2d51-4340-a1a1-9b2bf65... |
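The `reference_metadata` column records a `question_type` for each generated question (`simple`, `complex`, `distracting element`, `situational` and `double` in the output above). A small helper like the one below can summarise how the types are distributed. It is a sketch under two assumptions: that each entry is either a dict or, after a CSV round trip, the string form of a dict made of plain Python literals, and that the hypothetical helper name `question_type_counts` is free to use.

```python
import ast
from collections import Counter


def question_type_counts(metadata_column):
    """Count `question_type` values; entries may be dicts or their string form."""
    counts = Counter()
    for entry in metadata_column:
        # Strings (e.g. read back from CSV) are parsed into dicts first.
        meta = ast.literal_eval(entry) if isinstance(entry, str) else entry
        counts[meta.get("question_type", "unknown")] += 1
    return counts


# Toy rows: only the question_type key is shown; the real metadata has more fields.
toy_metadata = [
    "{'question_type': 'simple'}",
    {"question_type": "complex"},
    "{'question_type': 'simple'}",
]
print(question_type_counts(toy_metadata))
```

In the notebook this could be called as `question_type_counts(testset_df["reference_metadata"])`.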


### <a id='4' style="color: skyblue;">**4. 🔄 Retrieve examples & Evaluate**</a>

**Perform** a simple retrieval, fetching the top 5 chunks for each question.


In [8]:
chunks, chunks_id = retrieval_client.retrieval_multiple_queries(
    db = db,
    queries = testset_df["question"].tolist(),
    top_k = 5
)
testset_df['generated_context'], testset_df['generated_context_id'] = chunks, chunks_id
testset_df.head()

|   | id | question | reference_answer | reference_context | reference_context_id | reference_metadata | reference_context_metadata | generated_context | generated_context_id |
|---|---|---|---|---|---|---|---|---|---|
| 0 | 069ea02e-88be-49e8-b129-5c9b01b0dd35 | What happens in Season 1 of Suits, focusing on... | Season 1 introduces Mike Ross, hired by Harvey... | ["Suits \nSuits follows Mike Ross, a brillia... | ['9901b576-2d51-4340-a1a1-9b2bf655a75a'] | {'question_type': 'simple', 'seed_document_id'... | [{'chunk_id': '9901b576-2d51-4340-a1a1-9b2bf65... | [Suits \nSuits follows Mike Ross, a brillian... | [9901b576-2d51-4340-a1a1-9b2bf655a75a, bfcc042... |
| 1 | 7d737df0-f795-4752-9425-33330aebadf1 | Considering the American fantasy television se... | The television series is called Game of Throne... | ["Game of Thrones (Juego de tronos) , also com... | ['265195cb-90f4-4355-8750-c5a9d460e2ae'] | {'question_type': 'complex', 'seed_document_id... | [{'chunk_id': '265195cb-90f4-4355-8750-c5a9d46... | [Game of Thrones (Juego de tronos) , also comm... | [265195cb-90f4-4355-8750-c5a9d460e2ae, d777c6c... |
| 2 | a863ca13-a910-4e86-9a7c-150757c9f50f | Considering the events of Season 5 of Money He... | In Season 4, Joffrey is poisoned at his weddin... | ['Daenerys travels to the city of Astapor, whe... | ['bee6f43a-6a79-485d-bd82-ab7dc870f382'] | {'question_type': 'distracting element', 'seed... | [{'chunk_id': 'bee6f43a-6a79-485d-bd82-ab7dc87... | [Daenerys travels to the city of Astapor, wher... | [bee6f43a-6a79-485d-bd82-ab7dc870f382, 265195c... |
| 3 | 358f1a49-3471-44d6-9110-9182a021dc0c | Hi, I'm writing a piece comparing the ethical ... | The gang members escape the Bank in body bags—... | ['the theft. \nThe military disables all expl... | ['7ef06b8c-2969-4dfb-a76c-0eaeb64b74f2'] | {'question_type': 'situational', 'seed_documen... | [{'chunk_id': '7ef06b8c-2969-4dfb-a76c-0eaeb64... | [the theft. \nThe military disables all explo... | [7ef06b8c-2969-4dfb-a76c-0eaeb64b74f2, 0346d33... |
| 4 | 4698ce12-87eb-4f1f-9e8b-e5a6b2173195 | What is the central conflict in Season 1 of Su... | Season 1 of Suits centers on Mike Ross's secre... | ["Suits \nSuits follows Mike Ross, a brillia... | ['9901b576-2d51-4340-a1a1-9b2bf655a75a'] | {'question_type': 'double', 'original_question... | [{'chunk_id': '9901b576-2d51-4340-a1a1-9b2bf65... | [Suits \nSuits follows Mike Ross, a brillian... | [9901b576-2d51-4340-a1a1-9b2bf655a75a, 16c9176... |


**Evaluate** the retrieval with the `context_precision` and `context_recall` metrics.

* `context_precision`: Measures whether the retrieved chunks that are relevant to the question are ranked near the top of the retrieved list.
* `context_recall`: Measures how much of the information in the reference answer can be found in the retrieved context.

In [None]:
retrieval_results = evaluation_client.evaluate(testset_df = testset_df,
                                               retrieval_metrics = ['context_precision', 'context_recall'])
retrieval_results

Evaluating: 100%|██████████| 15/15 [00:25<00:00,  2.51s/it]


|   | question | generated_context | reference_contexts | reference_answer | context_precision | context_recall |
|---|---|---|---|---|---|---|
| 0 | What happens in Season 1 of Suits, focusing on... | [Suits \nSuits follows Mike Ross, a brillian... | [Suits \nSuits follows Mike Ross, a brillian... | Season 1 introduces Mike Ross, hired by Harvey... | 1.0 | 1.0 |
| 1 | Considering the American fantasy television se... | [Game of Thrones (Juego de tronos) , also comm... | [Game of Thrones (Juego de tronos) , also comm... | The television series is called Game of Throne... | 1.0 | 1.0 |
| 2 | Considering the events of Season 5 of Money He... | [Daenerys travels to the city of Astapor, wher... | [Daenerys travels to the city of Astapor, wher... | In Season 4, Joffrey is poisoned at his weddin... | 1.0 | 1.0 |
| 3 | Hi, I'm writing a piece comparing the ethical ... | [the theft. \nThe military disables all explo... | [the theft. \nThe military disables all explo... | The gang members escape the Bank in body bags—... | 1.0 | 1.0 |
| 4 | What is the central conflict in Season 1 of Su... | [Suits \nSuits follows Mike Ross, a brillian... | [Suits \nSuits follows Mike Ross, a brillian... | Season 1 of Suits centers on Mike Ross's secre... | 1.0 | 1.0 |
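Because the test set stores the ids of the chunks each question was generated from (`reference_context_id`) and the retrieval step stores the ids it returned (`generated_context_id`), a cheap ID-based check can complement the scores above. The sketch below is a different and much simpler measure than the two metrics in the table: it only compares id sets, and `id_precision_recall` is a hypothetical helper, not part of the `Evaluation` client.

```python
def id_precision_recall(retrieved_ids, reference_ids):
    """Set-based precision and recall of retrieved chunk ids against reference ids."""
    retrieved, reference = set(retrieved_ids), set(reference_ids)
    hits = len(retrieved & reference)
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(reference) if reference else 0.0
    return precision, recall


# Toy ids: one reference chunk and five retrieved chunks, one of which matches.
print(id_precision_recall(["a", "b", "c", "d", "e"], ["a"]))
```

With `top_k = 5` and a single reference chunk per question, this set-based precision cannot exceed 0.2, so it is not comparable to the `context_precision` column; the recall value is the more informative of the two here. If the id columns were read back from CSV they may be strings and would need parsing into lists first.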


### <a id='5' style="color: skyblue;">**5. 🎯 Answer questions & Evaluate**</a>

**Answer** the questions, passing the retrieved context for each question to the LLM.

In [None]:
answers = answering_client.answer_multiple_queries(queries = testset_df["question"].tolist(), 
                                                   contexts = testset_df["generated_context"].tolist(),
                                                   llm = llm)
testset_df['generated_answer'] = answers
testset_df

|   | id | question | reference_answer | reference_context | reference_context_id | reference_metadata | reference_context_metadata | generated_context | generated_context_id | generated_answer |
|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 069ea02e-88be-49e8-b129-5c9b01b0dd35 | What happens in Season 1 of Suits, focusing on... | Season 1 introduces Mike Ross, hired by Harvey... | ["Suits \nSuits follows Mike Ross, a brillia... | ['9901b576-2d51-4340-a1a1-9b2bf655a75a'] | {'question_type': 'simple', 'seed_document_id'... | [{'chunk_id': '9901b576-2d51-4340-a1a1-9b2bf65... | [Suits \nSuits follows Mike Ross, a brillian... | [9901b576-2d51-4340-a1a1-9b2bf655a75a, bfcc042... | Season 1 of Suits introduces Mike Ross, a coll... |
| 1 | 7d737df0-f795-4752-9425-33330aebadf1 | Considering the American fantasy television se... | The television series is called Game of Throne... | ["Game of Thrones (Juego de tronos) , also com... | ['265195cb-90f4-4355-8750-c5a9d460e2ae'] | {'question_type': 'complex', 'seed_document_id... | [{'chunk_id': '265195cb-90f4-4355-8750-c5a9d46... | [Game of Thrones (Juego de tronos) , also comm... | [265195cb-90f4-4355-8750-c5a9d460e2ae, d777c6c... | The title of the American fantasy television s... |
| 2 | a863ca13-a910-4e86-9a7c-150757c9f50f | Considering the events of Season 5 of Money He... | In Season 4, Joffrey is poisoned at his weddin... | ['Daenerys travels to the city of Astapor, whe... | ['bee6f43a-6a79-485d-bd82-ab7dc870f382'] | {'question_type': 'distracting element', 'seed... | [{'chunk_id': 'bee6f43a-6a79-485d-bd82-ab7dc87... | [Daenerys travels to the city of Astapor, wher... | [bee6f43a-6a79-485d-bd82-ab7dc870f382, 265195c... | Answer not found. The provided text describes... |
| 3 | 358f1a49-3471-44d6-9110-9182a021dc0c | Hi, I'm writing a piece comparing the ethical ... | The gang members escape the Bank in body bags—... | ['the theft. \nThe military disables all expl... | ['7ef06b8c-2969-4dfb-a76c-0eaeb64b74f2'] | {'question_type': 'situational', 'seed_documen... | [{'chunk_id': '7ef06b8c-2969-4dfb-a76c-0eaeb64... | [the theft. \nThe military disables all explo... | [7ef06b8c-2969-4dfb-a76c-0eaeb64b74f2, 0346d33... | The gang's escape and the gold's fate in \*Mone... |
| 4 | 4698ce12-87eb-4f1f-9e8b-e5a6b2173195 | What is the central conflict in Season 1 of Su... | Season 1 of Suits centers on Mike Ross's secre... | ["Suits \nSuits follows Mike Ross, a brillia... | ['9901b576-2d51-4340-a1a1-9b2bf655a75a'] | {'question_type': 'double', 'original_question... | [{'chunk_id': '9901b576-2d51-4340-a1a1-9b2bf65... | [Suits \nSuits follows Mike Ross, a brillian... | [9901b576-2d51-4340-a1a1-9b2bf655a75a, 16c9176... | The central conflict in Season 1 of \*Suits\* re... |


**Evaluate** the answers with the `answer_relevancy`, `answer_similarity` and `faithfulness` metrics.

* `answer_relevancy`: Evaluates how well the generated answer addresses the user’s query.
* `answer_similarity`: Measures how close the meaning of the generated answer is to the reference answer. It appears in the results as the `semantic_similarity` column.
* `faithfulness`: Evaluates how factually consistent the generated answer is with the retrieved context, that is, whether its claims can be supported by that context.

In [None]:
answer_results = evaluation_client.evaluate(testset_df = testset_df,
                                            answer_metrics = ['answer_relevancy', 'answer_similarity', 'faithfulness'])
answer_results

Evaluating: 100%|██████████| 15/15 [00:32<00:00,  2.20s/it]


|   | question | generated_context | reference_contexts | generated_answer | reference_answer | answer_relevancy | semantic_similarity | faithfulness |
|---|---|---|---|---|---|---|---|---|
| 0 | What happens in Season 1 of Suits, focusing on... | [Suits \nSuits follows Mike Ross, a brillian... | [Suits \nSuits follows Mike Ross, a brillian... | Season 1 of Suits introduces Mike Ross, a coll... | Season 1 introduces Mike Ross, hired by Harvey... | 0.719353 | 0.903579 | 1.0 |
| 1 | Considering the American fantasy television se... | [Game of Thrones (Juego de tronos) , also comm... | [Game of Thrones (Juego de tronos) , also comm... | The title of the American fantasy television s... | The television series is called Game of Throne... | 0.749348 | 0.927763 | 0.833333 |
| 2 | Considering the events of Season 5 of Money He... | [Daenerys travels to the city of Astapor, wher... | [Daenerys travels to the city of Astapor, wher... | Answer not found. The provided text describes... | In Season 4, Joffrey is poisoned at his weddin... | 0.0 | 0.720141 | 0.859311 |
| 3 | Hi, I'm writing a piece comparing the ethical ... | [the theft. \nThe military disables all explo... | [the theft. \nThe military disables all explo... | The gang's escape and the gold's fate in \*Mone... | The gang members escape the Bank in body bags—... | 0.615349 | 0.804304 | 0.809524 |
| 4 | What is the central conflict in Season 1 of Su... | [Suits \nSuits follows Mike Ross, a brillian... | [Suits \nSuits follows Mike Ross, a brillian... | The central conflict in Season 1 of \*Suits\* re... | Season 1 of Suits centers on Mike Ross's secre... | 0.724379 | 0.918573 | 0.933333 |
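Embedding-based similarity scores such as the `semantic_similarity` column are commonly computed as the cosine similarity between the embedding of the generated answer and the embedding of the reference answer. The sketch below shows that calculation on toy vectors. It assumes this is the underlying formula and does not reproduce how the `Evaluation` client obtains its embeddings; the vectors are invented for illustration.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Toy vectors standing in for the embeddings of a generated and a reference answer.
generated_vec = [0.2, 0.7, 0.1]
reference_vec = [0.25, 0.65, 0.05]
print(round(cosine_similarity(generated_vec, reference_vec), 3))
```

Cosine similarity depends only on the direction of the vectors, not their length, so two answers of different lengths can still score close to 1 when they say the same thing.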


---