<a href="https://colab.research.google.com/github/xprilion/gemini-as-a-judge-for-rag-evals/blob/main/Step_1_Problem_Context_The_RAG.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Gemini As A Judge for RAG Evals

## The RAG

### Load the dataset

In [1]:
!wget https://raw.githubusercontent.com/567-labs/systematically-improving-rag/refs/heads/main/cohort_1/week1_bootstrap_evals/reviews.json

--2025-03-01 02:02:55--  https://raw.githubusercontent.com/567-labs/systematically-improving-rag/refs/heads/main/cohort_1/week1_bootstrap_evals/reviews.json
Resolving raw.githubusercontent.com (raw.githubusercontent.com)... 185.199.109.133, 185.199.110.133, 185.199.108.133, ...
Connecting to raw.githubusercontent.com (raw.githubusercontent.com)|185.199.109.133|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 507865 (496K) [text/plain]
Saving to: ‘reviews.json.1’


2025-03-01 02:02:56 (2.67 MB/s) - ‘reviews.json.1’ saved [507865/507865]



### Packages

In [2]:
%%capture
!pip install "qdrant-client[fastembed]"

### Imports

In [3]:
import pandas as pd
import json
import os
import time
from tqdm import tqdm
from google import genai
from google.genai import types
from qdrant_client import QdrantClient
from qdrant_client.models import Distance, VectorParams
import uuid

from google.colab import userdata


### Helpers

In [4]:
collection_name = "product_reviews"

In [5]:
GEMINI_KEY = userdata.get('GEMINI_API_KEY')
gemini_client = genai.Client(
    api_key=GEMINI_KEY
)

In [6]:
def getGeminiResponse(prompt, max_tokens=8192, response_type="text/plain"):
    """Send a single-turn prompt to Gemini and return the response text."""
    # Wrap the prompt as one user message
    contents = [
        types.Content(
            role="user",
            parts=[
                types.Part.from_text(
                    text=prompt
                ),
            ],
        ),
    ]
    # temperature=0 keeps the output as repeatable as possible for evals
    generate_content_config = types.GenerateContentConfig(
        temperature=0,
        top_p=0.95,
        top_k=40,
        max_output_tokens=max_tokens,
        response_mime_type=response_type,
    )
    response = gemini_client.models.generate_content(
        model="gemini-2.0-flash", contents=contents, config=generate_content_config
    )
    return response.text

In [7]:
getGeminiResponse("What is 2+3?")

'2 + 3 = 5\n'
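The helper above makes one network call per prompt, and the notebook imports `time` without using it. As a hedged sketch, a small retry wrapper with exponential backoff could guard such calls against transient failures. The name `call_with_retry` and its parameters are hypothetical and not part of the notebook or of any Gemini SDK; the demo uses a fake function so it runs without an API key.

```python
import time


def call_with_retry(fn, *args, max_attempts=3, base_delay=1.0, sleep=time.sleep, **kwargs):
    """Call fn(*args, **kwargs), retrying on any exception.

    Hypothetical helper: waits base_delay, then 2 * base_delay, and so on
    between attempts, and re-raises the last error once attempts run out.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return fn(*args, **kwargs)
        except Exception:
            if attempt == max_attempts:
                raise
            sleep(base_delay * 2 ** (attempt - 1))


# Demo with a stand-in for the model call that fails twice, then succeeds.
calls = {"n": 0}


def flaky(prompt):
    calls["n"] += 1
    if calls["n"] < 3:
        raise RuntimeError("transient failure")
    return f"echo: {prompt}"


delays = []  # record the requested delays instead of actually sleeping
result = call_with_retry(flaky, "What is 2+3?", base_delay=0.5, sleep=delays.append)
print(result, delays)
```

In the notebook this would be used as `call_with_retry(getGeminiResponse, "What is 2+3?")`. Catching every `Exception` is a simplification; a real version would probably retry only on rate-limit or server errors.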

### EDA

In [8]:
df = pd.read_json('reviews.json')

In [9]:
df.head()

|   | product_title | product_description | review |
|---|---|---|---|
| 0 | Hammer | This 16 oz claw hammer is perfect for general ... | I've been using this hammer for a few months n... |
| 1 | Hammer | This 16 oz claw hammer is perfect for general ... | This hammer is a solid addition to my toolbox.... |
| 2 | Hammer | This 16 oz claw hammer is perfect for general ... | I purchased this hammer for some home renovati... |
| 3 | Hammer | This 16 oz claw hammer is perfect for general ... | As a professional carpenter, I rely on my tool... |
| 4 | Hammer | This 16 oz claw hammer is perfect for general ... | This hammer is a great value for the price. Th... |
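`df.head()` only shows the first five rows, all for the same product. A slightly broader look, sketched below, groups by `product_title` to count reviews and distinct descriptions per product. The function name `summarize_reviews` and the toy data are illustrative assumptions; only the three column names come from the dataset shown above.

```python
import pandas as pd


def summarize_reviews(df: pd.DataFrame) -> pd.DataFrame:
    """Per-product review count, distinct descriptions, and mean review length."""
    return (
        df.assign(review_chars=df["review"].str.len())
        .groupby("product_title")
        .agg(
            n_reviews=("review", "size"),
            n_descriptions=("product_description", "nunique"),
            mean_review_chars=("review_chars", "mean"),
        )
        .sort_values("n_reviews", ascending=False)
    )


# Toy frame with the same columns as reviews.json (made-up rows).
toy = pd.DataFrame(
    {
        "product_title": ["Hammer", "Hammer", "Saw"],
        "product_description": ["16 oz claw hammer", "12 oz hammer", "Hand saw"],
        "review": ["Solid grip.", "Good value", "Cuts cleanly."],
    }
)
summary = summarize_reviews(toy)
print(summary)
```

Running `summarize_reviews(df)` on the real frame would show how many products there are and whether a title such as "Hammer" covers several distinct descriptions, which matters when judging retrieval results later.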


### Connect Qdrant

In [10]:
QDRANT_URL = "https://qdrant-1.sg-1.cloudtop.dev"
QDRANT_KEY = userdata.get('PERSONAL_QDRANT_KEY')

In [11]:
qdrant_client = QdrantClient(url=QDRANT_URL, api_key=QDRANT_KEY, port=None)

### Create Documents

In [12]:
documents = []
metadatas = []
ids = []

In [13]:
for index, row in df.iterrows():
    product_title = row['product_title']
    product_description = row['product_description']
    review = row['review']

    # Combine product information and review into a single document
    document = f"Title: {product_title}\nDescription: {product_description}\nReview: {review}"

    # Create metadata dictionary
    metadata = {
        "product_title": product_title,
        "product_description": product_description,
        "review": review,
        "index": index, # add index for reference
    }

    # Generate a unique ID
    doc_id = str(uuid.uuid4())

    documents.append(document)
    metadatas.append(metadata)
    ids.append(doc_id)

In [14]:
len(documents)

900
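Because `uuid.uuid4()` is random, re-running the cell above produces 900 new IDs, and loading them again would likely add duplicate points rather than overwrite the old ones. One possible alternative, sketched here as an assumption rather than the notebook's method, derives the ID from the document text with `uuid.uuid5`, so the same document always maps to the same ID. The namespace value and the name `stable_doc_id` are hypothetical.

```python
import uuid

# Hypothetical namespace for this collection; any fixed UUID would do.
REVIEW_NAMESPACE = uuid.uuid5(uuid.NAMESPACE_URL, "product_reviews")


def stable_doc_id(document: str) -> str:
    """Return a deterministic UUID string derived from the document text."""
    return str(uuid.uuid5(REVIEW_NAMESPACE, document))


doc = "Title: Hammer\nDescription: A lightweight 12 oz hammer\nReview: Great value."
id_first = stable_doc_id(doc)
id_again = stable_doc_id(doc)
id_other = stable_doc_id(doc + " Would buy again.")
print(id_first)
```

The trade-off is that two rows with byte-identical text collapse to a single ID; if exact duplicates must be kept, the row index could be mixed into the hashed string.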

### Load data into Qdrant

In [15]:
%%capture

qdrant_client.add(
    collection_name=collection_name,
    documents=documents,
    metadata=metadatas,
    ids=ids
)

### Test Retrieval

In [16]:
user_query = "I want to hang some shelves with a hammer"

In [17]:
search_result = qdrant_client.query(collection_name=collection_name, query_text=user_query)

In [18]:
for result in search_result:
    print(result.document)
    print("---")

Title: Hammer
Description: A lightweight 12 oz hammer ideal for small household repairs. Its compact size makes it easy to store and handle.
Review: I've been using this hammer for a few months now, and I'm very impressed with its performance. The 12 oz weight is ideal for small tasks, and the compact size makes it easy to store in my tool bag. The handle is ergonomic and provides a secure grip, even when I'm working in tight spaces. The hammer is also very durable; it has withstood several drops and still works perfectly. Great value for the price.
---
Title: Hammer
Description: A lightweight 12 oz hammer ideal for small household repairs. Its compact size makes it easy to store and handle.
Review: I bought this hammer for my daughter, who recently moved into her first apartment. She loves it! The lightweight design makes it easy for her to handle, and the compact size means she can store it in her small tool kit. The hammer is sturdy and well-made, and the grip is comfortable. She's 

In [19]:
def getRagResponse(question):
    search_result = qdrant_client.query(collection_name=collection_name, query_text=question)
    system_prompt = """
      You are an intelligent assistant designed to provide accurate and informative answers based on retrieved documents.

      Your primary task is to:

      Understand the user's query.
      Retrieve relevant information from the provided context (documents).
      Synthesize the retrieved information into a coherent and accurate response.

      documents:

      """

    # Number each retrieved document so the model can tell them apart
    documents_text = "".join(
        f"{doc_count}: \n{result.document}\n\n"
        for doc_count, result in enumerate(search_result, start=1)
    )

    users_query = "\n\n The user is asking: " + question

    prompt = system_prompt + documents_text + users_query

    return getGeminiResponse(prompt)
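`getRagResponse` mixes retrieval, prompt assembly, and generation, so the prompt cannot be inspected without calling both Qdrant and Gemini. A minimal sketch of pulling the assembly step into a pure function follows. The name `build_rag_prompt` is hypothetical; the numbering and the "The user is asking:" suffix copy the format used in the function above.

```python
from typing import Sequence


def build_rag_prompt(system_prompt: str, documents: Sequence[str], question: str) -> str:
    """Concatenate the system prompt, numbered documents, and the user's question."""
    documents_text = "".join(
        f"{i}: \n{doc}\n\n" for i, doc in enumerate(documents, start=1)
    )
    return system_prompt + documents_text + "\n\n The user is asking: " + question


# Stand-in inputs; in the notebook the documents would come from
# [result.document for result in search_result].
prompt = build_rag_prompt(
    "documents:\n\n",
    ["Title: Hammer", "Title: Saw"],
    "Which tool drives nails?",
)
print(prompt)
```

With this split, the exact prompt sent for a given query can be logged or stored next to the answer, which is useful once a judge model has to evaluate the response against the retrieved context.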

### Test RAG

In [20]:
getRagResponse(user_query)

"Based on the provided documents, the 12 oz lightweight hammer is suitable for hanging shelves. The reviews mention using it for hanging shelves, pictures, assembling furniture, and small household repairs. The hammer's lightweight design, compact size, and comfortable grip make it ideal for such tasks.\n"