<a href="https://colab.research.google.com/github/anshupandey/generative-ai/blob/main/gemini/use-cases/retrieval-augmented-generation/rag-evaluation/ragas_with_gemini.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Getting Started with RAGAS & Vertex AI Gemini API

## Overview

### [RAGAS](https://docs.ragas.io/en/stable/)

Ragas is a framework that helps you evaluate your Retrieval Augmented Generation (RAG) pipelines. RAG denotes a class of LLM applications that use external data to augment the LLM's context. Existing tools and frameworks help you build these pipelines, but evaluating them and quantifying their performance can be hard. This is where Ragas (RAG Assessment) comes in.

### Gemini

Gemini is a family of generative AI models developed by Google DeepMind that is designed for multimodal use cases. The Gemini API gives you access to the Gemini Pro Vision and Gemini Pro models.

### Vertex AI Gemini API

The Vertex AI Gemini API provides a unified interface for interacting with Gemini models. There are currently two models available in the Gemini API:

- **Gemini Pro model** (`gemini-pro`): Designed to handle natural language tasks, multiturn text and code chat, and code generation.
- **Gemini Pro Vision model** (`gemini-pro-vision`): Supports multimodal prompts. You can include text, images, and video in your prompt requests and get text or code responses.

You can interact with the Gemini API using the following methods:

- Use the [Vertex AI Studio](https://cloud.google.com/generative-ai-studio) for quick testing and command generation
- Use cURL commands
- Use the Vertex AI SDK

This notebook focuses on using the **Gemini model with RAGAS**.

For more information, see the [Generative AI on Vertex AI](https://cloud.google.com/vertex-ai/docs/generative-ai/learn/overview) documentation.


### Objectives

In this notebook, we focus on using the Vertex AI Gemini API with RAGAS. The code below loads the `gemini-1.5-flash-001` model for Q&A generation and evaluation.

You will complete the following tasks:

- Install the Vertex AI SDK for Python
- Use the Vertex AI Gemini API to interact with the model
  - Gemini (`gemini-1.5-flash-001`) model:
    - Q&A Generation
    - Evaluate Q&A performance with RAGAS

## Getting Started


### Install Vertex AI SDK for Python


In [7]:
!pip install --user ragas==0.1.6 datasets==2.18.0 langchain langchain-google-vertexai langchain-chroma==0.1.1 chromadb==0.5.0 pypdf==4.2.0 -q


### Restart current runtime

To use the newly installed packages in this Jupyter runtime, it is recommended to restart the runtime. Run the following cell to restart the current kernel.

The restart process might take a minute or so.


In [8]:
import IPython

app = IPython.Application.instance()
app.kernel.do_shutdown(True)

{'status': 'ok', 'restart': True}

After the restart is complete, continue to the next step.


<div class="alert alert-block alert-warning">
<b>⚠️ Wait for the kernel to finish restarting before you continue. ⚠️</b>
</div>


## Import libraries


In [1]:
import pandas as pd
import vertexai

from langchain_community.document_loaders import PyPDFLoader
from langchain_chroma import Chroma
from langchain.chains import RetrievalQA
from langchain.prompts import PromptTemplate

from langchain_google_vertexai import VertexAI, VertexAIEmbeddings

# Important to make Gemini Work with RAGAS
from ragas.llms.base import LangchainLLMWrapper
from ragas import evaluate
from ragas.metrics import (
    answer_relevancy,
    context_recall,
    context_precision,
    answer_similarity,
)
from ragas.metrics.critique import harmfulness

from datasets import Dataset

In [6]:
import sys

if "google.colab" in sys.modules:
    from google.colab import auth

    auth.authenticate_user()

In [8]:
# TODO(developer): Update the below lines
PROJECT_ID = "jrproject-402905"
LOCATION = "us-central1"

vertexai.init(project=PROJECT_ID, location=LOCATION)

In [32]:
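import os
from getpass import getpass

# Optional: enable LangSmith tracing. Never hard-code the API key in a notebook.
os.environ["LANGCHAIN_TRACING_V2"] = "true"
if "LANGCHAIN_API_KEY" not in os.environ:
    os.environ["LANGCHAIN_API_KEY"] = getpass("LangSmith API key: ")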

## Use Vertex AI models

The [Gemini](https://cloud.google.com/vertex-ai/generative-ai/docs/learn/overview) models are designed to handle natural language tasks, multiturn text and code chat, and code generation. This notebook loads `gemini-1.5-flash-001`.


In [11]:
# Load the Gemini model used for answer generation and for RAGAS evaluation
llm = VertexAI(model_name="gemini-1.5-flash-001")

The [Vertex AI Embeddings](https://cloud.google.com/vertex-ai/generative-ai/docs/embeddings/get-text-embeddings) models are designed to convert text to dense vector representations.

In [12]:
# Load Embeddings Models
embeddings = VertexAIEmbeddings(model_name="textembedding-gecko@003")
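
To make "dense vector representations" more concrete, here is a minimal sketch of how two embedding vectors are commonly compared using cosine similarity. The three-dimensional vectors are made-up values for illustration (real embedding vectors are much longer), and `cosine_similarity` is a hypothetical helper, not part of the Vertex AI SDK.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Return the cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # convention chosen in this sketch for zero vectors
    return dot / (norm_a * norm_b)


# Toy vectors standing in for real embeddings (assumed values).
query_vec = [1.0, 0.0, 1.0]
similar_vec = [2.0, 0.0, 2.0]    # same direction as the query, different length
unrelated_vec = [0.0, 1.0, 0.0]  # orthogonal to the query

print(cosine_similarity(query_vec, similar_vec))
print(cosine_similarity(query_vec, unrelated_vec))
```

Vectors pointing in the same direction score close to 1, while orthogonal vectors score 0, regardless of vector length.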

## Create a local Vector DB
### Load the document

In [13]:
# source document
document_uri = "https://arxiv.org/pdf/1706.03762"

In [14]:
# Use PyPDFLoader to read and chunk the input document
loader = PyPDFLoader(document_uri)
docs = loader.load_and_split()

# Verify if pages are loaded correctly
docs[0]


Document(metadata={'source': 'https://arxiv.org/pdf/1706.03762', 'page': 0}, page_content='Provided proper attribution is provided, Google hereby grants permission to\nreproduce the tables and figures in this paper solely for use in journalistic or\nscholarly works.\nAttention Is All You Need\nAshish Vaswani∗\nGoogle Brain\navaswani@google.comNoam Shazeer∗\nGoogle Brain\nnoam@google.comNiki Parmar∗\nGoogle Research\nnikip@google.comJakob Uszkoreit∗\nGoogle Research\nusz@google.com\nLlion Jones∗\nGoogle Research\nllion@google.comAidan N. Gomez∗ †\nUniversity of Toronto\naidan@cs.toronto.eduŁukasz Kaiser∗\nGoogle Brain\nlukaszkaiser@google.com\nIllia Polosukhin∗ ‡\nillia.polosukhin@gmail.com\nAbstract\nThe dominant sequence transduction models are based on complex recurrent or\nconvolutional neural networks that include an encoder and a decoder. The best\nperforming models also connect the encoder and decoder through an attention\nmechanism. We propose a new simple network architecture, 

In [25]:
print(len(docs))

16


### Create local Vector DB

In [15]:
# Create an in-memory Vector DB using Chroma
vectordb = Chroma.from_documents(docs, embeddings)

In [18]:
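# Set Vector DB as retriever; search_kwargs controls how many chunks are returned
# (this value is reset to 3 in the Q&A chain section below)
retriever = vectordb.as_retriever(search_kwargs={"k": 2})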

### Create Q&A Chain

In [19]:
# Create Q&A template for the Gemini Model
template = """Your task is to answer questions about documents.
Use the following context to answer the question at the end.
{context}

Answers should be crisp.

Question: {question}
Helpful Answer:"""

# Create a prompt template for the q&a chain
PROMPT = PromptTemplate(
    template=template,
    input_variables=["context", "question"],
)

# Pass prompts to q&a chain
chain_type_kwargs = {"prompt": PROMPT}

# Retriever arguments
retriever.search_kwargs = {"k": 3}

In [20]:
# Setup a RetrievalQA Chain
qa = RetrievalQA.from_chain_type(
    llm=llm,
    chain_type="stuff",
    retriever=retriever,
    return_source_documents=True,
    chain_type_kwargs=chain_type_kwargs,
)
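
The `stuff` chain type places all retrieved chunks into the `{context}` slot of a single prompt. The sketch below imitates that step with plain string formatting. `build_stuff_prompt` is a hypothetical helper, the chunk separator is an assumption, and LangChain's own implementation may differ in detail.

```python
TEMPLATE = """Use the following context to answer the question at the end.
{context}

Question: {question}
Helpful Answer:"""


def build_stuff_prompt(chunks: list[str], question: str, separator: str = "\n\n") -> str:
    """Join all retrieved chunks and substitute them into one prompt."""
    context = separator.join(chunks)
    return TEMPLATE.format(context=context, question=question)


# Made-up chunks standing in for retrieved document pages.
chunks = ["Chunk about attention.", "Chunk about the encoder."]
prompt = build_stuff_prompt(chunks, "What is attention?")
print(prompt)
```

Because every chunk lands in one prompt, the number of retrieved chunks (`k`) directly affects prompt length.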

In [21]:
# Test the chain with a sample question
query = "Who are the authors of paper on Attention is all you need"
result = qa.invoke({"query": query})  # invoke() replaces the deprecated qa(...) call style
result


{'query': 'Who are the authors of paper on Attention is all you need',
 'result': 'The authors are: Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N. Gomez, Łukasz Kaiser, and Illia Polosukhin. \n',
 'source_documents': [Document(metadata={'page': 0, 'source': 'https://arxiv.org/pdf/1706.03762'}, page_content='Provided proper attribution is provided, Google hereby grants permission to\nreproduce the tables and figures in this paper solely for use in journalistic or\nscholarly works.\nAttention Is All You Need\nAshish Vaswani∗\nGoogle Brain\navaswani@google.comNoam Shazeer∗\nGoogle Brain\nnoam@google.comNiki Parmar∗\nGoogle Research\nnikip@google.comJakob Uszkoreit∗\nGoogle Research\nusz@google.com\nLlion Jones∗\nGoogle Research\nllion@google.comAidan N. Gomez∗ †\nUniversity of Toronto\naidan@cs.toronto.eduŁukasz Kaiser∗\nGoogle Brain\nlukaszkaiser@google.com\nIllia Polosukhin∗ ‡\nillia.polosukhin@gmail.com\nAbstract\nThe dominant sequence transduction mo

## Evaluation
### Create the evaluation set

In [22]:
# Evaluation set with questions and ground_truth
questions = [
    "Who is the author of paper Attention is all you need",
    "What architecture is proposed in paper titled Attention is all you need?",
]
ground_truth = [
    "The authors of the paper 'Attention is all you need' are:\n\n* Ashish Vaswani\n* Noam Shazeer\n* Niki Parmar\n* Jakob Uszkoreit\n* Llion Jones\n* Aidan N. Gomez\n* Łukasz Kaiser\n* Illia Polosukhin",
    "Transformers architecture",
]

### Run the [Q&A chain](#create-qa-chain) on the evaluation dataset

In [24]:
import time

contexts = []
answers = []

# Generate contexts and answers for each question
for query in questions:
    result = qa.invoke({"query": query})
    # time.sleep(30)  # uncomment if you hit quota limits
    contexts.append(
        [document.page_content for document in result["source_documents"]]
    )
    answers.append(result["result"])
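
As an alternative to the fixed `time.sleep(30)` pause in the loop above, quota errors can be handled by retrying with exponentially growing delays. This is a minimal sketch: `call_with_backoff` and `flaky` are hypothetical names, and catching the broad `Exception` is a simplification (in practice you would catch only the quota error your client raises).

```python
import time


def call_with_backoff(fn, max_attempts=4, base_delay=1.0, sleep=time.sleep):
    """Call fn(); on failure wait base_delay * 2**attempt seconds and retry."""
    for attempt in range(max_attempts):
        try:
            return fn()
        except Exception:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the last error
            sleep(base_delay * 2 ** attempt)


# Demo with a stand-in function that fails twice before succeeding.
calls = {"count": 0}


def flaky():
    calls["count"] += 1
    if calls["count"] < 3:
        raise RuntimeError("simulated quota error")
    return "ok"


delays = []
# Record the delays instead of actually sleeping, so the demo runs instantly.
outcome = call_with_backoff(flaky, sleep=delays.append)
print(outcome, delays)
```

To use it with the chain, wrap the chain call in a `lambda` and pass it as `fn`.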

In [26]:
# Convert into a dataset and prepare for consumption by the RAGAS API
data = {
    "question": questions,
    "contexts": contexts,
    "ground_truth": ground_truth,
    "answer": answers,
}

dataset = Dataset.from_dict(data)
dataset

Dataset({
    features: ['question', 'contexts', 'ground_truth', 'answer'],
    num_rows: 2
})
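
Before building the dataset, it can help to check that the four columns line up row by row and that each `contexts` entry is a list of strings. The following is a minimal sketch of such a check; `validate_eval_columns` is a hypothetical helper and the sample values are made up.

```python
def validate_eval_columns(data: dict) -> None:
    """Raise ValueError if the evaluation columns are missing or misaligned."""
    required = ["question", "contexts", "ground_truth", "answer"]
    missing = [name for name in required if name not in data]
    if missing:
        raise ValueError(f"missing columns: {missing}")
    lengths = {name: len(data[name]) for name in required}
    if len(set(lengths.values())) != 1:
        raise ValueError(f"column lengths differ: {lengths}")
    for i, ctx in enumerate(data["contexts"]):
        if not isinstance(ctx, list) or not all(isinstance(c, str) for c in ctx):
            raise ValueError(f"contexts[{i}] must be a list of strings")


# Made-up sample with the same column names as the evaluation data.
sample = {
    "question": ["q1", "q2"],
    "contexts": [["chunk a", "chunk b"], ["chunk c"]],
    "ground_truth": ["gt1", "gt2"],
    "answer": ["a1", "a2"],
}
validate_eval_columns(sample)  # passes silently when the columns are consistent
```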

In [27]:
# Compile list of RAGAS Metrics
metrics = [
    answer_relevancy,
    context_recall,
    context_precision,
    harmfulness,
    answer_similarity,
]
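
To give a feel for what a ranking-aware metric such as `context_precision` rewards, here is a simplified sketch of an average-precision style score over ranked chunks. It assumes the relevance verdict for each retrieved chunk is already known (RAGAS obtains such verdicts with an LLM), and `context_precision_at_k` is a hypothetical helper, not the RAGAS implementation.

```python
def context_precision_at_k(verdicts: list[int]) -> float:
    """Average precision over ranked chunks; verdicts[i] is 1 if chunk i is relevant."""
    relevant_so_far = 0
    weighted_sum = 0.0
    for rank, verdict in enumerate(verdicts, start=1):
        if verdict:
            relevant_so_far += 1
            # precision@rank, counted only at ranks that hold a relevant chunk
            weighted_sum += relevant_so_far / rank
    return weighted_sum / relevant_so_far if relevant_so_far else 0.0


print(context_precision_at_k([1, 1, 1]))  # every retrieved chunk is relevant
print(context_precision_at_k([0, 0, 1]))  # the only relevant chunk is ranked last
```

The score is highest when relevant chunks appear at the top of the retrieved list and drops when they are ranked below irrelevant ones.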

## IMPORTANT : Gemini with RAGAS
> RAGAS is designed to work with OpenAI models by default. We must set a few attributes on each metric to make it work with Gemini.

In [28]:
class RAGASVertexAIEmbeddings(VertexAIEmbeddings):
    """Wrapper for RAGAS"""

    async def embed_text(self, text: str) -> list[float]:
        """Embed a single text for semantic similarity."""
        return self.embed([text], 1, "SEMANTIC_SIMILARITY")[0]

In [29]:
# Wrapper to make RAGAS work with Gemini and Vertex AI Embeddings Models
embeddings = RAGASVertexAIEmbeddings(model_name="textembedding-gecko@003")
ragas_llm = LangchainLLMWrapper(llm)

for m in metrics:
    # change LLM for metric
    m.llm = ragas_llm

    # check if this metric needs embeddings
    if hasattr(m, "embeddings"):
        # if so change with Vertex AI Embeddings
        m.embeddings = embeddings

### Run the RAGAS Evaluation

In [30]:
# Run the evaluation on every row of the dataset
result_set = []
for i in range(len(dataset)):
    result = evaluate(
        dataset=Dataset.from_dict(dataset[i : i + 1]),
        metrics=metrics,
        raise_exceptions=False,
    )
    result_set.append(result.to_pandas())
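
Slicing a `Dataset` with `dataset[i : i + 1]` returns a plain dictionary of column lists that holds a single row, which is why the slice can be passed back to `Dataset.from_dict`. Below is a pure-Python sketch of the same slicing; `single_row_slices` is a hypothetical helper and the data is made up.

```python
def single_row_slices(columns: dict) -> list[dict]:
    """Split a dict of equal-length column lists into one-row dicts of lists."""
    num_rows = len(next(iter(columns.values()), []))
    return [
        {name: values[i : i + 1] for name, values in columns.items()}
        for i in range(num_rows)
    ]


columns = {"question": ["q1", "q2"], "answer": ["a1", "a2"]}
for row in single_row_slices(columns):
    print(row)
```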


In [31]:
# View results in Pandas DataFrame
results_df = pd.concat(result_set)
results_df

| | question | contexts | ground_truth | answer | answer_relevancy | context_recall | context_precision | harmfulness | answer_similarity |
|---|---|---|---|---|---|---|---|---|---|
| 0 | Who is the author of paper Attention is all yo... | [Provided proper attribution is provided, Goog... | The authors of the paper 'Attention is all you... | The authors of the paper "Attention is All You... | 0.816914 | 1.0 | 1.0 | 0 | 0.996786 |
| 0 | What architecture is proposed in paper titled ... | [Figure 1: The Transformer - model architectur... | Transformers architecture | The paper proposes a new architecture called t... | 0.778555 | 1.0 | 1.0 | 0 | 0.743849 |
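
With more than a handful of rows, a per-metric average is usually easier to read than the row-level scores. The sketch below computes those averages with the standard library, using the scores copied from the two result rows shown above; on the real `results_df` the same summary could be obtained with pandas.

```python
from statistics import mean

# Scores copied from the two result rows shown above.
scores = {
    "answer_relevancy": [0.816914, 0.778555],
    "context_recall": [1.0, 1.0],
    "context_precision": [1.0, 1.0],
    "harmfulness": [0, 0],
    "answer_similarity": [0.996786, 0.743849],
}

# Average each metric across the evaluated rows.
summary = {metric: mean(values) for metric, values in scores.items()}
for metric, value in summary.items():
    print(f"{metric}: {value:.4f}")
```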


## Conclusion

In this notebook, you learned:

1. How RAGAS serves as a framework for evaluating RAG pipelines.
2. How to make RAGAS work with the Vertex AI Gemini API.