<img src="https://drive.google.com/uc?export=view&id=1wYSMgJtARFdvTt5g7E20mE4NmwUFUuog" width="200">

[![Build Fast with AI](https://img.shields.io/badge/BuildFastWithAI-GenAI%20Bootcamp-blue?style=for-the-badge&logo=artificial-intelligence)](https://www.buildfastwithai.com/genai-course)
[![EduChain GitHub](https://img.shields.io/github/stars/satvik314/educhain?style=for-the-badge&logo=github&color=gold)](https://github.com/satvik314/educhain)

[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/drive/1sy5nKIdTpy16VbYQzn3BvhDhfb5D1uyL?usp=sharing)

# Ragas: Evaluation Framework for RAG Systems

[Ragas](https://github.com/explodinggradients/ragas) is an open-source evaluation framework for retrieval-augmented generation (RAG) systems. It assesses a RAG workflow by scoring its retrieval and generation components. Its metrics include context recall, faithfulness, and factual correctness, all of which are used in this notebook, and it integrates with common RAG stacks such as LangChain.


### **Setup and Installation**



In [None]:
pip install ragas sacrebleu langchain-openai

In [None]:
pip install git+https://github.com/explodinggradients/ragas.git


### **Setup Keys**

In [None]:
from google.colab import userdata
import os

os.environ["OPENAI_API_KEY"] = userdata.get("OPENAI_API_KEY")
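
Outside Colab, `google.colab.userdata` is not available. A minimal sketch of one alternative, assuming the key is supplied through an environment variable or typed interactively. The helper name `resolve_api_key` is hypothetical and not part of any library:

```python
import os
from getpass import getpass


def resolve_api_key(name, env=None, prompt=getpass):
    """Return the secret `name` from `env`, prompting only if it is missing."""
    env = os.environ if env is None else env
    value = env.get(name, "").strip()
    if not value:
        # Fall back to an interactive prompt so the key is never written into the notebook.
        value = prompt(f"Enter {name}: ").strip()
    if not value:
        raise ValueError(f"{name} is required but was not provided")
    return value


# Example with an in-memory mapping instead of the real environment:
demo_env = {"OPENAI_API_KEY": "dummy-key"}
print(resolve_api_key("OPENAI_API_KEY", env=demo_env))
```

In a real session the result would be assigned to `os.environ["OPENAI_API_KEY"]` before any OpenAI client is created.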

## **Building a simple Q&A application**

In [None]:
import os
from dotenv import load_dotenv
from langchain_core.documents import Document

load_dotenv()

content_list = [
    "Andrew Ng is the CEO of Landing AI and is known for his pioneering work in deep learning. He is also widely recognized for democratizing AI education through platforms like Coursera.",
    "Sam Altman is the CEO of OpenAI and has played a key role in advancing AI research and development. He is a strong advocate for creating safe and beneficial AI technologies.",
    "Demis Hassabis is the CEO of DeepMind and is celebrated for his innovative approach to artificial intelligence. He gained prominence for developing systems like AlphaGo that can master complex games.",
    "Sundar Pichai is the CEO of Google and Alphabet Inc., and he is praised for leading innovation across Google's vast product ecosystem. His leadership has significantly enhanced user experiences on a global scale.",
    "Arvind Krishna is the CEO of IBM and is recognized for transforming the company towards cloud computing and AI solutions. He focuses on providing cutting-edge technologies to address modern business challenges.",
]

# Wrap each text snippet in a LangChain Document so it can be added to a vector store.
langchain_documents = [Document(page_content=content) for content in content_list]

### **LangChain Vector Store with OpenAI Embeddings**


In [None]:
from langchain_openai.embeddings import OpenAIEmbeddings
from langchain_core.vectorstores import InMemoryVectorStore

embeddings = OpenAIEmbeddings(model="text-embedding-3-small")
vector_store = InMemoryVectorStore(embeddings)

_ = vector_store.add_documents(langchain_documents)

### **Retriever Configuration with Vector Store**


In [None]:
retriever = vector_store.as_retriever(search_kwargs={"k": 1})
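
With `k=1`, the retriever returns only the single closest document for each query. The idea can be illustrated with a toy sketch that ranks hand-made vectors by cosine similarity. The vectors and the `top_k` helper are hypothetical stand-ins, not the embeddings or the search code that `InMemoryVectorStore` actually uses:

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k(query_vec, doc_vecs, k=1):
    """Return the ids of the k documents most similar to the query."""
    ranked = sorted(
        doc_vecs,
        key=lambda doc_id: cosine(query_vec, doc_vecs[doc_id]),
        reverse=True,
    )
    return ranked[:k]


# Made-up 3-dimensional "embeddings" for three documents.
toy_docs = {
    "andrew_ng": [0.9, 0.1, 0.0],
    "sam_altman": [0.1, 0.9, 0.1],
    "demis_hassabis": [0.0, 0.2, 0.9],
}
toy_query = [0.2, 0.8, 0.0]  # pretend embedding of "Who is the CEO of OpenAI?"

print(top_k(toy_query, toy_docs, k=1))
print(top_k(toy_query, toy_docs, k=2))
```

Raising `k` gives the model more context per query, at the cost of passing along text that may be less relevant.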

In [None]:
from langchain_openai import ChatOpenAI

llm = ChatOpenAI(model="gpt-4o-mini")

### **Creating QA Chain with ChatPromptTemplate**


In [None]:
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.output_parsers import StrOutputParser


template = """Answer the question based only on the following context:
{context}

Question: {query}
"""
prompt = ChatPromptTemplate.from_template(template)

qa_chain = prompt | llm | StrOutputParser()
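
To see what the model receives, the placeholders can be filled by hand. This sketch uses plain `str.format` as a stand-in for the prompt template, so no LLM is called and the result is an ordinary string rather than a chat message:

```python
template = """Answer the question based only on the following context:
{context}

Question: {query}
"""

# Plain str.format stands in for the prompt template here; no LLM is called.
filled = template.format(
    context="Sam Altman is the CEO of OpenAI.",
    query="Who is the CEO of OpenAI?",
)
print(filled)
```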

### **Query Processing with Relevant Documents**


In [None]:
def format_docs(relevant_docs):
    return "\n".join(doc.page_content for doc in relevant_docs)


query = "Who is the CEO of OpenAI?"

relevant_docs = retriever.invoke(query)
qa_chain.invoke({"context": format_docs(relevant_docs), "query": query})

'The CEO of OpenAI is Sam Altman.'

### **Sample Queries and Expected Responses**


In [None]:
sample_queries = [
    "Which CEO is widely recognized for democratizing AI education through platforms like Coursera?",
    "Who is Sam Altman?",
    "Who is Demis Hassabis and how did he gain prominence?",
    "Who is the CEO of Google and Alphabet Inc., praised for leading innovation across Google's product ecosystem?",
    "How did Arvind Krishna transform IBM?",
]

expected_responses = [
    "Andrew Ng is the CEO of Landing AI and is widely recognized for democratizing AI education through platforms like Coursera.",
    "Sam Altman is the CEO of OpenAI and has played a key role in advancing AI research and development. He strongly advocates for creating safe and beneficial AI technologies.",
    "Demis Hassabis is the CEO of DeepMind and is celebrated for his innovative approach to artificial intelligence. He gained prominence for developing systems like AlphaGo that can master complex games.",
    "Sundar Pichai is the CEO of Google and Alphabet Inc., praised for leading innovation across Google's vast product ecosystem. His leadership has significantly enhanced user experiences globally.",
    "Arvind Krishna is the CEO of IBM and has transformed the company towards cloud computing and AI solutions. He focuses on delivering cutting-edge technologies to address modern business challenges.",
]

### **Creating Evaluation Dataset for RAGAS**


In [None]:
from ragas import EvaluationDataset


dataset = []

for query, reference in zip(sample_queries, expected_responses):
    relevant_docs = retriever.invoke(query)
    response = qa_chain.invoke({"context": format_docs(relevant_docs), "query": query})
    dataset.append(
        {
            "user_input": query,
            "retrieved_contexts": [rdoc.page_content for rdoc in relevant_docs],
            "response": response,
            "reference": reference,
        }
    )

evaluation_dataset = EvaluationDataset.from_list(dataset)
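
Each record uses four fields: `user_input`, `retrieved_contexts`, `response`, and `reference`. A small check like the following can catch malformed records before they are handed to `EvaluationDataset.from_list`. `find_invalid_records` is a hypothetical helper written for this illustration, not a Ragas function:

```python
REQUIRED_FIELDS = {
    "user_input": str,
    "retrieved_contexts": list,
    "response": str,
    "reference": str,
}


def find_invalid_records(records):
    """Return (index, problem) pairs for records that miss a field or use the wrong type."""
    problems = []
    for index, record in enumerate(records):
        for field, expected_type in REQUIRED_FIELDS.items():
            if field not in record:
                problems.append((index, f"missing '{field}'"))
            elif not isinstance(record[field], expected_type):
                problems.append((index, f"'{field}' should be {expected_type.__name__}"))
    return problems


sample_records = [
    {
        "user_input": "Who is Sam Altman?",
        "retrieved_contexts": ["Sam Altman is the CEO of OpenAI."],
        "response": "Sam Altman is the CEO of OpenAI.",
        "reference": "Sam Altman is the CEO of OpenAI.",
    },
    # A common mistake: one string instead of a list of contexts, and no reference.
    {
        "user_input": "Who is Demis Hassabis?",
        "retrieved_contexts": "Demis Hassabis is the CEO of DeepMind.",
        "response": "He is the CEO of DeepMind.",
    },
]

print(find_invalid_records(sample_records))
```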

### **Evaluating Model with RAGAS Metrics**


In [None]:
from ragas import evaluate
from ragas.llms import LangchainLLMWrapper
from ragas.metrics import LLMContextRecall, Faithfulness, FactualCorrectness

evaluator_llm = LangchainLLMWrapper(llm)

result = evaluate(
    dataset=evaluation_dataset,
    metrics=[LLMContextRecall(), Faithfulness(), FactualCorrectness()],
    llm=evaluator_llm,
)

result


{'context_recall': 1.0000, 'faithfulness': 0.9500, 'factual_correctness': 0.9140}
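
Context recall rewards retrieved context that supports the statements in the reference answer. As a rough intuition only, the toy function below scores the fraction of reference sentences whose words all appear in the retrieved context. `LLMContextRecall` relies on an LLM to judge support, so this word-overlap stand-in (`toy_context_recall`, a hypothetical name) is not its implementation:

```python
import re


def _words(text):
    """Lowercase word set, ignoring punctuation."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def toy_context_recall(reference, retrieved_contexts):
    """Fraction of reference sentences fully covered by words in the retrieved contexts."""
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", reference.strip()) if s]
    if not sentences:
        return 0.0
    context_words = _words(" ".join(retrieved_contexts))
    supported = sum(1 for s in sentences if _words(s) <= context_words)
    return supported / len(sentences)


contexts = ["Sam Altman is the CEO of OpenAI and has played a key role in advancing AI research."]
# The second sentence is deliberately made up so that the context cannot support it.
reference = "Sam Altman is the CEO of OpenAI. He also owns a lighthouse."

print(toy_context_recall(reference, contexts))
```

One of the two reference sentences is covered, so the toy score is 0.5. A score of 1.0, as in the run above, means every statement in the references was judged to be supported by the retrieved context.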

## **Evaluate a simple LLM application**

In [None]:
from ragas.llms import LangchainLLMWrapper
from ragas.embeddings import LangchainEmbeddingsWrapper
from langchain_openai import ChatOpenAI
from langchain_openai import OpenAIEmbeddings
evaluator_llm = LangchainLLMWrapper(ChatOpenAI(model="gpt-4o"))
evaluator_embeddings = LangchainEmbeddingsWrapper(OpenAIEmbeddings())

### **Evaluating Summary Accuracy with AspectCritic**

In [None]:
from ragas import SingleTurnSample
from ragas.metrics import AspectCritic

test_data = {
    "user_input": "summarise given text\nThe company reported an 8% rise in Q3 2024, driven by strong performance in the Asian market. Sales in this region have significantly contributed to the overall growth. Analysts attribute this success to strategic marketing and product localization. The positive trend in the Asian market is expected to continue into the next quarter.",
    "response": "The company experienced an 8% increase in Q3 2024, largely due to effective marketing strategies and product adaptation, with expectations of continued growth in the coming quarter.",
}

metric = AspectCritic(
    name="summary_accuracy",
    llm=evaluator_llm,
    definition="Verify if the summary is accurate.",
)
test_data = SingleTurnSample(**test_data)
await metric.single_turn_ascore(test_data)

1

### **Load Dataset**


In [None]:
from datasets import load_dataset
from ragas import EvaluationDataset
eval_dataset = load_dataset("explodinggradients/earning_report_summary", split="train")
eval_dataset = EvaluationDataset.from_hf_dataset(eval_dataset)
print("Features in dataset:", eval_dataset.features())
print("Total samples in dataset:", len(eval_dataset))

Features in dataset: ['user_input', 'response']
Total samples in dataset: 50


### **Evaluate Dataset and Display Results**


In [None]:
results = evaluate(eval_dataset, metrics=[metric])
results


{'summary_accuracy': 0.9800}
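
`AspectCritic` returns a binary verdict per sample (the single-sample call above returned `1`), so a dataset score of 0.98 over 50 samples is consistent with 49 passing and 1 failing, assuming the reported figure is the mean of the per-sample scores. A short sketch of that arithmetic with made-up verdicts:

```python
from statistics import mean

# Hypothetical per-sample verdicts: 49 passes and 1 failure.
verdicts = [1] * 49 + [0]

score = mean(verdicts)
print(f"summary_accuracy: {score:.4f}")
```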

### **Convert Results to Pandas DataFrame**


In [None]:
results.to_pandas()


| | user_input | response | summary_accuracy |
|---|---|---|---|
| 0 | summarise given text\nThe Q2 earnings report r... | The Q2 earnings report showed a 15% revenue in... | 1 |
| 1 | summarise given text\nIn 2023, North American ... | Companies are strategizing to adapt to market ... | 1 |
| 2 | summarise given text\nIn 2022, European expans... | Many companies experienced a notable 15% growt... | 1 |
| 3 | summarise given text\nSupply chain challenges ... | Supply chain challenges in North America, caus... | 1 |
| 4 | summarise given text\nIn Q2 2023, the company ... | The company experienced a notable increase in ... | 1 |
| 5 | summarise given text\nIn 2023, marketing campa... | In 2023, marketing campaigns in North America ... | 1 |
| 6 | summarise given text\nThe company's internatio... | The company's international expansion strategy... | 1 |
| 7 | summarise given text\nIn 2024, companies are i... | Companies are using data analytics to customiz... | 1 |
| 8 | summarise given text\nIn 2023, logistics inves... | Driven by technological and infrastructural ad... | 1 |
| 9 | summarise given text\nIn 2023, the company exp... | The company faced challenges due to competitio... | 1 |
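
The per-sample table is most useful for locating the samples that failed. A minimal sketch, assuming the rows have been converted to a list of dictionaries (for instance with pandas' `DataFrame.to_dict("records")`). The `failing_rows` helper and the sample rows are hypothetical:

```python
def failing_rows(rows, metric_name, threshold=1):
    """Return the rows whose score for `metric_name` is below `threshold`."""
    return [row for row in rows if row[metric_name] < threshold]


# Made-up rows shaped like the table above.
rows = [
    {"user_input": "summarise given text\n...", "response": "Summary A", "summary_accuracy": 1},
    {"user_input": "summarise given text\n...", "response": "Summary B", "summary_accuracy": 0},
]

for row in failing_rows(rows, "summary_accuracy"):
    print(row["response"])
```

Reading the failing inputs and responses side by side is usually the quickest way to decide whether the summary or the metric definition needs adjusting.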


In [None]:
import os
os.environ["RAGAS_APP_TOKEN"] = userdata.get('RAGAS_API_KEY')

In [None]:
results.upload()


Evaluation results uploaded! View at https://app.ragas.io/dashboard/alignment/evaluation/77f55e23-b57c-4336-bf95-379ef4ba080b


'https://app.ragas.io/dashboard/alignment/evaluation/77f55e23-b57c-4336-bf95-379ef4ba080b'