## Introduction to Semantic Chunking
Text chunking is an essential step in Retrieval-Augmented Generation (RAG), where large text bodies are divided into meaningful segments to improve retrieval accuracy.
Unlike fixed-length chunking, semantic chunking splits text based on the content similarity between sentences.

### Breakpoint Methods
- **Percentile**: As implemented in this notebook, computes the Xth percentile of the similarity scores between consecutive sentences and splits wherever the similarity falls below that value.
- **Standard Deviation**: Splits where similarity falls more than X standard deviations below the mean.
- **Interquartile Range (IQR)**: Splits where similarity falls below Q1 - 1.5 × (Q3 - Q1), the usual lower outlier fence.
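
To make the three rules concrete, the following minimal sketch computes each cutoff on a made-up list of similarity scores. The scores, the `lower_thresholds` helper, and its default parameter values are illustrative assumptions, not values taken from this notebook.

```python
import numpy as np

# Hypothetical similarity scores between consecutive sentences (made up for illustration).
similarities = np.array([0.82, 0.79, 0.31, 0.85, 0.80, 0.77, 0.28, 0.83])

def lower_thresholds(sims, percentile=25, num_std=1.0):
    """Return the cutoff each rule would use; a split occurs where similarity < cutoff."""
    q1, q3 = np.percentile(sims, [25, 75])
    return {
        "percentile": float(np.percentile(sims, percentile)),
        "standard_deviation": float(np.mean(sims) - num_std * np.std(sims)),
        "interquartile": float(q1 - 1.5 * (q3 - q1)),
    }

thresholds = lower_thresholds(similarities)
splits = {
    name: [i for i, s in enumerate(similarities) if s < cutoff]
    for name, cutoff in thresholds.items()
}
for name in thresholds:
    print(f"{name}: cutoff={thresholds[name]:.4f}, split after sentences {splits[name]}")
```

On this toy data all three rules flag the same two low-similarity positions. On real text they can disagree, which is why the cutoff rule is worth treating as a tunable choice.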

This notebook implements semantic chunking **using the percentile method** and evaluates its performance on a sample text.

## Setting Up the Environment
We begin by importing necessary libraries.

In [3]:
import fitz # PyMuPDF
import os
import numpy as np
import json
import google.generativeai as genai

## Extracting Text from a PDF File
To implement RAG, we first need a source of textual data. Here we extract text from a PDF file using the PyMuPDF library. The same cell also configures the Gemini API and asks the model to summarize the extracted text.

In [5]:
import fitz
import os
import google.generativeai as genai
from dotenv import load_dotenv

# --- PDF text extraction ---
def extract_text_from_pdf(pdf_path):
    """
    Extracts text from a PDF file.
    Args:
    pdf_path (str): Path to the PDF file.
    Returns:
    str: Extracted text from the PDF.
    """
    # Open the document in a context manager so the file is closed afterwards
    with fitz.open(pdf_path) as mypdf:
        # Join the plain text of every page, separated by a space
        all_text = " ".join(page.get_text("text") for page in mypdf)
    return all_text.strip()

# --- Gemini API Configuration ---
load_dotenv()
GOOGLE_API_KEY = os.getenv("GOOGLE_API_KEY")
if not GOOGLE_API_KEY:
    raise ValueError("GOOGLE_API_KEY environment variable is not set.")
genai.configure(api_key=GOOGLE_API_KEY)

# --- Define a function for summarization ---
def summarize_text_with_gemini(text_to_summarize):
    """
    Summarizes text using the Gemini API.
    """
    model = genai.GenerativeModel('gemini-1.5-flash')
    prompt = f"Summarize the following text:\n\n{text_to_summarize}"
    response = model.generate_content(prompt)
    return response.text

# --- Main Logic ---
if __name__ == "__main__":
    pdf_path = "/Users/kekunkoya/Desktop/770 Google /AI_Information.pdf"
    
    # 1. Extract text from the PDF
    extracted_text = extract_text_from_pdf(pdf_path)
    print("Extracted first 500 characters of text from PDF.")
    print(extracted_text[:500])

    # 2. Use Gemini to summarize the extracted text
    print("\nGenerating a summary with Gemini...")
    summary = summarize_text_with_gemini(extracted_text)
    print("Summary:")
    print(summary)

Extracted first 500 characters of text from PDF.
Understanding Artificial Intelligence 
Chapter 1: Introduction to Artificial Intelligence 
Artificial intelligence (AI) refers to the ability of a digital computer or computer-controlled robot 
to perform tasks commonly associated with intelligent beings. The term is frequently applied to 
the project of developing systems endowed with the intellectual processes characteristic of 
humans, such as the ability to reason, discover meaning, generalize, or learn from past 
experience. Over the past f

Generating a summary with Gemini...
Summary:
This book comprehensively explores artificial intelligence (AI), covering its history, core concepts, applications, ethical implications, and future directions.  It begins by defining AI and tracing its development from the Dartmouth Workshop to the current era of deep learning.  The core concepts explained include machine learning (supervised, unsupervised, and reinforcement learning), deep learning 

## Setting Up the Gemini API Client
We configure the `google.generativeai` library with an API key so we can generate embeddings and responses.

In [7]:
import os
import google.generativeai as genai

# Retrieve the API key from environment variables
GOOGLE_API_KEY = os.getenv("GOOGLE_API_KEY")

# Configure the genai library with your API key
if not GOOGLE_API_KEY:
    raise ValueError("GOOGLE_API_KEY environment variable is not set.")
genai.configure(api_key=GOOGLE_API_KEY)

## Creating Sentence-Level Embeddings
We split text into sentences and generate embeddings.
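
The cell in this section splits on the literal string `". "`, which drops the trailing periods and ignores `!` and `?`. As a hedged alternative, the sketch below uses a regular expression that splits after sentence-ending punctuation and keeps it. The `split_sentences` name is hypothetical, and the pattern is deliberately simple: it will still split incorrectly after abbreviations such as "Dr." or "e.g.".

```python
import re
from typing import List

def split_sentences(text: str) -> List[str]:
    """Split text after '.', '!' or '?' followed by whitespace, keeping the punctuation."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    # Drop empty strings, which appear when the input itself is empty
    return [part for part in parts if part]

sample = "This is a sample text. It has multiple sentences! Does it work? Yes."
sentences = split_sentences(sample)
for sentence in sentences:
    print(repr(sentence))
```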

In [8]:
import numpy as np
import google.generativeai as genai
import os

# --- Gemini API Configuration ---
GOOGLE_API_KEY = os.getenv("GOOGLE_API_KEY")
if not GOOGLE_API_KEY:
    raise ValueError("GOOGLE_API_KEY environment variable not set.")
genai.configure(api_key=GOOGLE_API_KEY)

# --- Define the get_embedding function for Gemini ---
def get_embedding(text: str, model: str = "models/embedding-001") -> np.ndarray:
    """
    Creates an embedding for the given text using Gemini.

    Args:
        text (str): Input text.
        model (str): Embedding model name.

    Returns:
        np.ndarray: The embedding vector.
    """
    try:
        response = genai.embed_content(model=model, content=text)
        return np.array(response['embedding'], dtype=np.float32)
    except Exception as e:
        print(f"An error occurred: {e}")
        return np.array([])

# --- Main Logic ---
# A short sample text keeps this cell self-contained and overrides the PDF text from the previous step.
# Remove this assignment to embed the text extracted from the PDF instead.
extracted_text = "This is a sample text. It has multiple sentences. We will split them and get embeddings for each."

# Splitting text into sentences (basic split)
sentences = extracted_text.split(". ")

# Generate embeddings for each sentence
embeddings = [get_embedding(sentence) for sentence in sentences]

# Filter out any empty embeddings if an error occurred
embeddings = [emb for emb in embeddings if emb.size > 0]

print(f"Generated {len(embeddings)} sentence embeddings.")
if embeddings:
    print(f"Embedding vector shape: {embeddings[0].shape}")

Generated 3 sentence embeddings.
Embedding vector shape: (768,)


## Calculating Similarity Differences
We compute cosine similarity between consecutive sentences.
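
For longer documents, the pairwise loop can be replaced by a single vectorized computation. The sketch below assumes the embeddings have already been stacked into a 2-D array with one row per sentence; the `consecutive_similarities` helper and the toy 2-D vectors are illustrative assumptions, not part of the notebook.

```python
import numpy as np

def consecutive_similarities(embeddings: np.ndarray) -> np.ndarray:
    """Cosine similarity between each row and the next row of a (n, d) array."""
    norms = np.linalg.norm(embeddings, axis=1, keepdims=True)
    norms[norms == 0] = 1.0  # zero vectors get similarity 0 instead of a division error
    unit = embeddings / norms
    # Row-wise dot product of each unit vector with its successor
    return np.sum(unit[:-1] * unit[1:], axis=1)

# Toy 2-D "embeddings": identical, orthogonal, then 45 degrees apart
toy = np.array([[1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
sims = consecutive_similarities(toy)
print(sims)
```

The result has one fewer entry than the number of sentences, matching the list built by the loop in the cell for this section.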

In [9]:
import numpy as np
import google.generativeai as genai
import os
from typing import List

# --- Gemini API Configuration ---
GOOGLE_API_KEY = os.getenv("GOOGLE_API_KEY")
if not GOOGLE_API_KEY:
    raise ValueError("GOOGLE_API_KEY environment variable not set.")
genai.configure(api_key=GOOGLE_API_KEY)

# --- Define the get_embedding function for Gemini ---
def get_embedding(text: str, model: str = "models/embedding-001") -> np.ndarray:
    """
    Creates an embedding for the given text using the Gemini API.

    Args:
        text (str): Input text.
        model (str): Embedding model name.

    Returns:
        np.ndarray: The embedding vector.
    """
    try:
        response = genai.embed_content(model=model, content=text)
        return np.array(response['embedding'], dtype=np.float32)
    except Exception as e:
        print(f"An error occurred: {e}")
        return np.array([])

# --- Cosine similarity helper ---
def cosine_similarity(vec1: np.ndarray, vec2: np.ndarray) -> float:
    """
    Computes cosine similarity between two vectors.

    Args:
    vec1 (np.ndarray): First vector.
    vec2 (np.ndarray): Second vector.

    Returns:
    float: Cosine similarity.
    """
    dot_product = np.dot(vec1, vec2)
    norm_product = np.linalg.norm(vec1) * np.linalg.norm(vec2)
    return dot_product / norm_product if norm_product != 0 else 0.0

# --- Main Logic Example ---
if __name__ == "__main__":
    # Sample sentences to demonstrate the workflow
    sentences = [
        "The quick brown fox jumps over the lazy dog.",
        "A fast, ginger canine leaps above a sluggish hound.",
        "The car drives on the highway.",
        "A quick brown dog runs away from home."
    ]

    # Generate embeddings for each sentence using the Gemini API
    print("Generating embeddings with Gemini...")
    embeddings = [get_embedding(sentence) for sentence in sentences]

    # Filter out any potential empty embeddings from errors
    embeddings = [emb for emb in embeddings if emb.size > 0]
    
    if len(embeddings) < 2:
        # Stop this cell cleanly; at least two embeddings are needed for one similarity score
        raise SystemExit("Not enough embeddings to compute similarity.")

    # Compute similarity between consecutive sentences
    similarities = [cosine_similarity(embeddings[i], embeddings[i + 1]) for i in range(len(embeddings) - 1)]

    print(f"\nGenerated {len(embeddings)} sentence embeddings.")
    print("Computed similarities between consecutive sentences:")
    for i, sim in enumerate(similarities):
        print(f"  Similarity between '{sentences[i]}' and '{sentences[i+1]}': {sim:.4f}")

Generating embeddings with Gemini...

Generated 4 sentence embeddings.
Computed similarities between consecutive sentences:
  Similarity between 'The quick brown fox jumps over the lazy dog.' and 'A fast, ginger canine leaps above a sluggish hound.': 0.7653
  Similarity between 'A fast, ginger canine leaps above a sluggish hound.' and 'The car drives on the highway.': 0.6351
  Similarity between 'The car drives on the highway.' and 'A quick brown dog runs away from home.': 0.6918


## Implementing Semantic Chunking
We implement three different methods for finding breakpoints.
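
Because the percentile rule in this notebook splits where similarity falls below a percentile of the scores themselves, raising the percentile produces more breakpoints, not fewer. The sketch below illustrates this using the three similarity scores printed in this section's example output; the `percentile_breakpoints` helper is a hypothetical restatement of the percentile branch only.

```python
import numpy as np

# Similarity scores reported in this section's example output
similarities = [0.6596, 0.5876, 0.8088]

def percentile_breakpoints(sims, percentile):
    """Indices whose similarity is strictly below the given percentile of all scores."""
    cutoff = np.percentile(sims, percentile)
    return [i for i, s in enumerate(sims) if s < cutoff]

results = {p: percentile_breakpoints(similarities, p) for p in (0, 50, 90)}
for p, bps in results.items():
    print(f"percentile={p}: breakpoints={bps}")
```

With a percentile of 0 nothing is split, the median splits only at the weakest link, and a high percentile splits almost everywhere.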

In [10]:
import numpy as np
import google.generativeai as genai
import os
from typing import List

# --- Breakpoint computation ---
def compute_breakpoints(similarities, method="percentile", threshold=90):
    """
    Computes chunking breakpoints based on similarity drops.

    Args:
    similarities (List[float]): List of similarity scores between consecutive sentences.
    method (str): 'percentile', 'standard_deviation', or 'interquartile'.
    threshold (float): Percentile of the similarity scores for 'percentile', number of
        standard deviations below the mean for 'standard_deviation'; ignored for 'interquartile'.

    Returns:
    List[int]: Indices i where the similarity between sentence i and sentence i + 1 falls
        below the cutoff, i.e. a split should occur after sentence i.
    """
    if method == "percentile":
        threshold_value = np.percentile(similarities, threshold)
    elif method == "standard_deviation":
        mean = np.mean(similarities)
        std_dev = np.std(similarities)
        threshold_value = mean - (threshold * std_dev)
    elif method == "interquartile":
        q1, q3 = np.percentile(similarities, [25, 75])
        threshold_value = q1 - 1.5 * (q3 - q1)
    else:
        raise ValueError("Invalid method. Choose 'percentile', 'standard_deviation', or 'interquartile'.")

    return [i for i, sim in enumerate(similarities) if sim < threshold_value]

# --- Helper functions to integrate with Gemini ---
# (These functions would be defined in a larger script)
def get_embedding(text: str, model: str = "models/embedding-001") -> np.ndarray:
    """Creates an embedding for text using the Gemini API."""
    response = genai.embed_content(model=model, content=text)
    return np.array(response['embedding'], dtype=np.float32)

def cosine_similarity(vec1: np.ndarray, vec2: np.ndarray) -> float:
    """Computes cosine similarity between two vectors."""
    return np.dot(vec1, vec2) / (np.linalg.norm(vec1) * np.linalg.norm(vec2))

# --- Full workflow example ---
if __name__ == "__main__":
    # Setup for Gemini (requires API key set as environment variable)
    GOOGLE_API_KEY = os.getenv("GOOGLE_API_KEY")
    if not GOOGLE_API_KEY:
        raise ValueError("GOOGLE_API_KEY env var not set.")
    genai.configure(api_key=GOOGLE_API_KEY)

    # 1. Sample sentences to be processed
    sentences = [
        "Artificial intelligence is a powerful technology.",
        "It can be used in many applications, from self-driving cars to medical diagnosis.",
        "The sun is the center of our solar system.", # A topic shift here
        "Our solar system consists of eight planets and many moons."
    ]

    # 2. Get embeddings for all sentences using the Gemini API
    print("Generating embeddings with Gemini...")
    embeddings = [get_embedding(s) for s in sentences]

    # 3. Compute similarity between consecutive sentences
    similarities = [cosine_similarity(embeddings[i], embeddings[i+1]) for i in range(len(embeddings)-1)]
    print("\nComputed similarities between consecutive sentences:", [f'{s:.4f}' for s in similarities])

    # 4. Find the breakpoints
    print("Computing breakpoints...")
    # Split wherever similarity falls below the median (50th percentile) of the scores
    breakpoints = compute_breakpoints(similarities, method="percentile", threshold=50)
    print("Breakpoints (indices of sentences where similarity dropped):", breakpoints)

    # 5. Reconstruct chunks based on the breakpoints
    start_idx = 0
    chunks = []
    for bp in breakpoints:
        chunks.append(" ".join(sentences[start_idx : bp+1]))
        start_idx = bp + 1
    # Add the last chunk
    chunks.append(" ".join(sentences[start_idx:]))

    print("\nReconstructed chunks:")
    for i, chunk in enumerate(chunks):
        print(f"Chunk {i+1}: {chunk}\n")

Generating embeddings with Gemini...

Computed similarities between consecutive sentences: ['0.6596', '0.5876', '0.8088']
Computing breakpoints...
Breakpoints (indices of sentences where similarity dropped): [1]

Reconstructed chunks:
Chunk 1: Artificial intelligence is a powerful technology. It can be used in many applications, from self-driving cars to medical diagnosis.

Chunk 2: The sun is the center of our solar system. Our solar system consists of eight planets and many moons.



## Splitting Text into Semantic Chunks
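We split the text at the computed breakpoints: each breakpoint index `bp` closes a chunk after sentence `bp`, and the sentences after the last breakpoint form the final chunk. This is the reconstruction performed in step 5 of the workflow in the previous section.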
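
The splitting step can be packaged as a small reusable function. This is a minimal sketch that mirrors the reconstruction loop used in this notebook; the `split_into_chunks` name is an assumption, and the sentences are the sample sentences from the workflow example.

```python
from typing import List

def split_into_chunks(sentences: List[str], breakpoints: List[int]) -> List[str]:
    """Join sentences into chunks, ending a chunk after each breakpoint index."""
    chunks, start = [], 0
    for bp in sorted(breakpoints):
        chunks.append(" ".join(sentences[start:bp + 1]))
        start = bp + 1
    # Remaining sentences after the last breakpoint form the final chunk
    if start < len(sentences):
        chunks.append(" ".join(sentences[start:]))
    return chunks

sentences = [
    "Artificial intelligence is a powerful technology.",
    "It can be used in many applications, from self-driving cars to medical diagnosis.",
    "The sun is the center of our solar system.",
    "Our solar system consists of eight planets and many moons.",
]
chunks = split_into_chunks(sentences, breakpoints=[1])
for i, chunk in enumerate(chunks, start=1):
    print(f"Chunk {i}: {chunk}")
```

Unlike the inline loop, this version skips the final append when the last breakpoint already consumed every sentence, which avoids an empty trailing chunk.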




## Creating Embeddings for Semantic Chunks
We create embeddings for each chunk for later retrieval.
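
One detail matters for retrieval: if failed (empty) embeddings are filtered out of the embedding list alone, the remaining embeddings no longer line up index-by-index with the chunk list. The sketch below keeps the two aligned by filtering them together. `embed_chunks_aligned` is a hypothetical helper, and `fake_embed` is a stand-in for an embedding function such as `get_embedding` so that the example runs without an API key.

```python
import numpy as np
from typing import Callable, List, Tuple

def embed_chunks_aligned(
    chunks: List[str], embed_fn: Callable[[str], np.ndarray]
) -> Tuple[List[str], np.ndarray]:
    """Embed each chunk and keep only (chunk, vector) pairs whose embedding succeeded."""
    kept, vectors = [], []
    for chunk in chunks:
        vec = embed_fn(chunk)
        if vec.size > 0:  # an empty array signals a failed embedding call
            kept.append(chunk)
            vectors.append(vec)
    matrix = np.vstack(vectors) if vectors else np.empty((0, 0), dtype=np.float32)
    return kept, matrix

def fake_embed(text: str) -> np.ndarray:
    """Stand-in embedder (assumption): fails on blank text, else returns 3 toy features."""
    if not text.strip():
        return np.array([])
    return np.array([len(text), text.count(" "), 1.0], dtype=np.float32)

kept_chunks, chunk_matrix = embed_chunks_aligned(["a b", "   ", "c"], fake_embed)
print(kept_chunks, chunk_matrix.shape)
```

Row `i` of the returned matrix always corresponds to `kept_chunks[i]`, so indices returned by a similarity search can be mapped back to text safely.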

In [12]:
import numpy as np
import google.generativeai as genai
import os
from typing import List

# --- Gemini API Configuration ---
GOOGLE_API_KEY = os.getenv("GOOGLE_API_KEY")
if not GOOGLE_API_KEY:
    raise ValueError("GOOGLE_API_KEY environment variable not set.")
genai.configure(api_key=GOOGLE_API_KEY)

# --- Define the get_embedding function for Gemini ---
def get_embedding(text: str, model: str = "models/embedding-001") -> np.ndarray:
    """
    Creates an embedding for a single text using the Gemini API.

    Args:
        text (str): Input text.
        model (str): Embedding model name.

    Returns:
        np.ndarray: The embedding vector.
    """
    try:
        response = genai.embed_content(model=model, content=text)
        return np.array(response['embedding'], dtype=np.float32)
    except Exception as e:
        print(f"An error occurred: {e}")
        return np.array([])

# --- Embed every chunk ---
def create_embeddings(text_chunks: List[str]) -> List[np.ndarray]:
    """
    Creates embeddings for each text chunk.

    Args:
    text_chunks (List[str]): List of text chunks.

    Returns:
    List[np.ndarray]: List of embedding vectors.
    """
    # Generate embeddings for each text chunk using the get_embedding function
    return [get_embedding(chunk) for chunk in text_chunks]

# --- Main Logic Example ---
if __name__ == "__main__":
    # 1. Define some sample text chunks
    text_chunks = [
        "Artificial intelligence is a powerful technology.",
        "It can be used in many applications, such as self-driving cars and medical diagnosis.",
        "The sun is the center of our solar system.",
        "Our solar system consists of eight planets and many moons."
    ]

    # 2. Create chunk embeddings using the create_embeddings function
    print("Creating chunk embeddings with Gemini...")
    chunk_embeddings = create_embeddings(text_chunks)

    # 3. Filter out any empty embeddings from potential errors
    chunk_embeddings = [emb for emb in chunk_embeddings if emb.size > 0]
    
    # 4. Print the results
    if chunk_embeddings:
        print(f"\nSuccessfully created {len(chunk_embeddings)} chunk embeddings.")
        print(f"Embedding vector shape: {chunk_embeddings[0].shape}")
        print("\nFirst embedding vector (first 5 values):")
        print(chunk_embeddings[0][:5])
    else:
        print("Failed to create any embeddings.")

Creating chunk embeddings with Gemini...

Successfully created 4 chunk embeddings.
Embedding vector shape: (768,)

First embedding vector (first 5 values):
[-0.04203147 -0.03221899  0.01675237 -0.01887356  0.02924487]


## Performing Semantic Search
We rank the chunks by cosine similarity to the query embedding and return the top `k`.
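
When the chunk embeddings are stacked into a matrix, the ranking reduces to one matrix-vector product. The following is a sketch under assumptions: `top_k_chunks` is a hypothetical helper, the 2-D vectors are toy values standing in for real embeddings, and all vectors are assumed to be non-zero.

```python
import numpy as np
from typing import List, Tuple

def top_k_chunks(
    query_vec: np.ndarray, chunk_matrix: np.ndarray, chunks: List[str], k: int = 2
) -> List[Tuple[str, float]]:
    """Return the k chunks most similar to the query, with their cosine scores."""
    q = query_vec / np.linalg.norm(query_vec)
    m = chunk_matrix / np.linalg.norm(chunk_matrix, axis=1, keepdims=True)
    scores = m @ q  # cosine similarity of every chunk with the query
    k = min(k, len(chunks))  # never ask for more results than there are chunks
    order = np.argsort(scores)[::-1][:k]
    return [(chunks[i], float(scores[i])) for i in order]

chunks = ["cats purr", "dogs bark", "cars drive"]
chunk_matrix = np.array([[1.0, 0.0], [0.6, 0.8], [0.0, 1.0]])
query_vec = np.array([1.0, 0.05])

ranked = top_k_chunks(query_vec, chunk_matrix, chunks, k=2)
for text, score in ranked:
    print(f"{score:.4f}  {text}")
```

Returning the scores alongside the text makes it easier to apply a minimum-similarity cutoff later if desired.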

In [78]:
def semantic_search(query, text_chunks, chunk_embeddings, k=5):
    """
    Finds the most relevant text chunks for a query.

    Args:
    query (str): Search query.
    text_chunks (List[str]): List of text chunks.
    chunk_embeddings (List[np.ndarray]): List of chunk embeddings.
    k (int): Number of top results to return.

    Returns:
    List[str]: Top-k relevant chunks.
    """
    # Generate an embedding for the query
    query_embedding = get_embedding(query)
    
    # Calculate cosine similarity between the query embedding and each chunk embedding
    similarities = [cosine_similarity(query_embedding, emb) for emb in chunk_embeddings]
    
    # Get the indices of the top-k most similar chunks
    top_indices = np.argsort(similarities)[-k:][::-1]
    
    # Return the top-k most relevant text chunks
    return [text_chunks[i] for i in top_indices]

In [14]:
import numpy as np
import google.generativeai as genai
import os
from typing import List

# --- Gemini API Configuration ---
GOOGLE_API_KEY = os.getenv("GOOGLE_API_KEY")
if not GOOGLE_API_KEY:
    raise ValueError("GOOGLE_API_KEY environment variable not set.")
genai.configure(api_key=GOOGLE_API_KEY)

# --- Helper functions ---
def get_embedding(text: str, model: str = "models/embedding-001") -> np.ndarray:
    """
    Creates an embedding for a single text using the Gemini API.

    Args:
        text (str): Input text.
        model (str): Embedding model name.

    Returns:
        np.ndarray: The embedding vector.
    """
    try:
        response = genai.embed_content(model=model, content=text)
        return np.array(response['embedding'], dtype=np.float32)
    except Exception as e:
        print(f"An error occurred: {e}")
        return np.array([])

def cosine_similarity(vec1: np.ndarray, vec2: np.ndarray) -> float:
    """Computes cosine similarity between two vectors."""
    dot_product = np.dot(vec1, vec2)
    norm_product = np.linalg.norm(vec1) * np.linalg.norm(vec2)
    return dot_product / norm_product if norm_product != 0 else 0.0

# --- semantic_search with a guard for failed query embeddings ---
def semantic_search(query: str, text_chunks: List[str], chunk_embeddings: List[np.ndarray], k: int = 5) -> List[str]:
    """
    Finds the most relevant text chunks for a query.

    Args:
    query (str): Search query.
    text_chunks (List[str]): List of text chunks.
    chunk_embeddings (List[np.ndarray]): List of chunk embeddings.
    k (int): Number of top results to return.

    Returns:
    List[str]: Top-k relevant chunks.
    """
    # Generate an embedding for the query
    query_embedding = get_embedding(query)
    
    if query_embedding.size == 0:
        return []

    # Calculate cosine similarity between the query embedding and each chunk embedding
    similarities = [cosine_similarity(query_embedding, emb) for emb in chunk_embeddings]
    
    # Get the indices of the top-k most similar chunks
    top_indices = np.argsort(similarities)[-k:][::-1]
    
    # Return the top-k most relevant text chunks
    return [text_chunks[i] for i in top_indices]



In [16]:


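# --- 1. Imports (the Gemini API key is configured in an earlier cell) ---
import numpy as np
import google.generativeai as genai
from typing import List

# --- 2. Helper functions (repeated so this cell is self-contained) ---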
def get_embedding(text: str, model: str = "models/embedding-001") -> np.ndarray:
    """Creates an embedding for a single text using the Gemini API."""
    try:
        response = genai.embed_content(model=model, content=text)
        return np.array(response['embedding'], dtype=np.float32)
    except Exception as e:
        print(f"An error occurred: {e}")
        return np.array([])

def cosine_similarity(vec1: np.ndarray, vec2: np.ndarray) -> float:
    """Computes cosine similarity between two vectors."""
    dot_product = np.dot(vec1, vec2)
    norm_product = np.linalg.norm(vec1) * np.linalg.norm(vec2)
    return dot_product / norm_product if norm_product != 0 else 0.0

def semantic_search(query: str, text_chunks: List[str], chunk_embeddings: List[np.ndarray], k: int = 5) -> List[str]:
    """Finds the most relevant text chunks for a query."""
    query_embedding = get_embedding(query)
    if query_embedding.size == 0:
        return []
    similarities = [cosine_similarity(query_embedding, emb) for emb in chunk_embeddings]
    top_indices = np.argsort(similarities)[-k:][::-1]
    return [text_chunks[i] for i in top_indices]

# --- 3. Main Logic ---
if __name__ == "__main__":
    # Simulate text chunks and their embeddings from a previous step
    # In a real scenario, these would be generated from a PDF or other source
    text_chunks = [
        "Artificial intelligence (AI) is a wide-ranging branch of computer science.",
        "The development of AI has been influenced by disciplines such as cognitive science and philosophy.",
        "Deep learning is a subfield of machine learning inspired by the structure of the human brain.",
        "Neural networks are a fundamental component of deep learning models.",
        "Quantum computing is a field that uses quantum-mechanical phenomena.",
        "The Mona Lisa is a famous portrait painting by Leonardo da Vinci."
    ]
    chunk_embeddings = [get_embedding(chunk) for chunk in text_chunks]
    
    # Validation data, defined inline here; in a real run it would be loaded from a JSON file
    data = [
        {"question": "What is the relationship between deep learning and neural networks?", "ideal_answer": "Neural networks are a fundamental component of deep learning models."},
        {"question": "What is AI?", "ideal_answer": "AI is a wide-ranging branch of computer science."}
    ]

    # Extract the first query from the validation data
    query = data[0]['question']

    # Get top 2 relevant chunks
    top_chunks = semantic_search(query, text_chunks, chunk_embeddings, k=2)

    # Print the query
    print(f"Query: {query}")

    # Print the top 2 most relevant text chunks
    print("\nTop 2 most relevant text chunks:")
    for i, chunk in enumerate(top_chunks):
        print(f"Context {i+1}:\n{chunk}\n{'='*40}")

Query: What is the relationship between deep learning and neural networks?

Top 2 most relevant text chunks:
Context 1:
Neural networks are a fundamental component of deep learning models.
Context 2:
Deep learning is a subfield of machine learning inspired by the structure of the human brain.


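## Generating a Response Based on Retrieved Chunks
We pass the retrieved chunks to Gemini as context, together with a system prompt that restricts the answer to that context.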

In [18]:

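# --- 1. Imports (the Gemini API key is configured in an earlier cell) ---
import numpy as np
import google.generativeai as genai
from typing import List

# --- 2. Helper functions (repeated so this cell is self-contained) ---
def get_embedding(text: str, model: str = "models/embedding-001") -> np.ndarray: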
    """Creates an embedding for a single text using the Gemini API."""
    try:
        response = genai.embed_content(model=model, content=text)
        return np.array(response['embedding'], dtype=np.float32)
    except Exception as e:
        print(f"Embedding error: {e}")
        return np.array([])

def cosine_similarity(vec1: np.ndarray, vec2: np.ndarray) -> float:
    """Computes cosine similarity between two vectors."""
    dot_product = np.dot(vec1, vec2)
    norm_product = np.linalg.norm(vec1) * np.linalg.norm(vec2)
    return dot_product / norm_product if norm_product != 0 else 0.0

def semantic_search(query: str, text_chunks: List[str], chunk_embeddings: List[np.ndarray], k: int = 5) -> List[str]:
    """Finds the most relevant text chunks for a query."""
    query_embedding = get_embedding(query)
    if query_embedding.size == 0:
        return []
    similarities = [cosine_similarity(query_embedding, emb) for emb in chunk_embeddings]
    top_indices = np.argsort(similarities)[-k:][::-1]
    return [text_chunks[i] for i in top_indices]

# --- 3. Define the generate_response function for Gemini ---
def generate_response(system_prompt: str, user_message: str, model: str = "gemini-1.5-flash") -> str:
    """Generates a response from the Gemini model."""
    try:
        gemini_model = genai.GenerativeModel(model, system_instruction=system_prompt)
        response = gemini_model.generate_content(user_message)
        return response.text
    except Exception as e:
        print(f"Generation error: {e}")
        return "I could not generate a response due to an error."

# --- 4. Main Logic: Revised for a Homelessness Query ---
if __name__ == "__main__":
    # Simulate text chunks that might come from documents about homelessness
    text_chunks = [
        "Homelessness is a complex social problem with various contributing factors, including economic, social, and personal issues.",
        "A key factor is the lack of affordable housing, which disproportionately affects low-income families and individuals.",
        "Social factors like family breakdown, domestic violence, and a lack of social support networks can also lead to homelessness.",
        "Personal crises, such as job loss, mental health challenges, or substance abuse, are often triggers for losing housing.",
        "Government and non-profit organizations offer services like shelters, food banks, and temporary housing to help the homeless.",
        "The best long-term solution is a holistic approach that includes providing affordable housing, mental health services, and job training."
    ]
    
    # Create embeddings for these chunks. In a real application, this would be a one-time process.
    print("Creating chunk embeddings...")
    chunk_embeddings = [get_embedding(chunk) for chunk in text_chunks]
    
    # Define a specific query related to homelessness
    query = "What are the main causes of homelessness?"
    
    # Use semantic search to find the most relevant context
    print(f"\nSearching for chunks relevant to: '{query}'...")
    top_chunks = semantic_search(query, text_chunks, chunk_embeddings, k=3)
    
    # Define the system prompt for the RAG system
    system_prompt = "You are an assistant that strictly answers based on the given context about homelessness. If the answer is not in the context, say 'I cannot answer based on the provided information.' Be concise and direct."
    
    # Combine the top chunks and the user's query into a single prompt for the LLM
    user_prompt = "\n".join([f"Context {i+1}:\n{chunk}\n{'='*40}\n" for i, chunk in enumerate(top_chunks)])
    user_prompt = f"{user_prompt}\nQuestion: {query}"
    
    # Generate the final response using the RAG-style prompt
    print("\nGenerating AI response...")
    ai_response = generate_response(system_prompt, user_prompt)
    
    # Print the result
    print(f"Query: {query}")
    print(f"\nAI Response:\n{ai_response}")

Creating chunk embeddings...

Searching for chunks relevant to: 'What are the main causes of homelessness?'...

Generating AI response...
Query: What are the main causes of homelessness?

AI Response:
Based on the provided text,  family breakdown, domestic violence, lack of social support networks, economic issues, social issues, and personal issues contribute to homelessness.
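
The prompt assembly in the cell above can be isolated into a helper so it is easy to inspect and test without calling the model. This is a minimal sketch; the `build_rag_prompt` name and the `separator` parameter are assumptions, and the layout follows the "Context N" and "Question" format used in this notebook.

```python
from typing import List

def build_rag_prompt(chunks: List[str], question: str, separator: str = "=" * 40) -> str:
    """Number each retrieved chunk, then append the user's question."""
    blocks = [
        f"Context {i}:\n{chunk}\n{separator}"
        for i, chunk in enumerate(chunks, start=1)
    ]
    return "\n\n".join(blocks) + f"\n\nQuestion: {question}"

prompt = build_rag_prompt(
    [
        "A key factor is the lack of affordable housing.",
        "Personal crises are often triggers for losing housing.",
    ],
    "What are the main causes of homelessness?",
)
print(prompt)
```

Printing the assembled prompt before sending it is a cheap way to confirm which chunks the model actually saw when an answer looks incomplete.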



## Evaluating the AI Response
We compare the AI response with the expected answer and assign a score.
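
The evaluator is instructed to return only `1`, `0`, or `0.5`, but a language model may not always comply. The sketch below shows one way to validate the raw text and average scores over several questions. Both helper names are hypothetical, and the sample strings are made up rather than real model outputs.

```python
from typing import List, Optional

VALID_SCORES = {0.0, 0.5, 1.0}

def parse_score(raw: str) -> Optional[float]:
    """Convert the evaluator's text to a score, or None if it is not 0, 0.5, or 1."""
    try:
        value = float(raw.strip().strip("'\""))
    except ValueError:
        return None
    return value if value in VALID_SCORES else None

def mean_score(raw_outputs: List[str]) -> Optional[float]:
    """Average the valid scores, ignoring outputs that could not be parsed."""
    scores = [s for s in map(parse_score, raw_outputs) if s is not None]
    return sum(scores) / len(scores) if scores else None

# Made-up evaluator outputs, including one that does not follow the instructions
raw_outputs = ["1\n", " 0.5 ", "Score: 1"]
print([parse_score(r) for r in raw_outputs])
print(mean_score(raw_outputs))
```

Treating unparseable outputs as missing rather than as zero keeps formatting failures from being counted as wrong answers.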

In [19]:

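# --- 1. Imports (the Gemini API key is configured in an earlier cell) ---
import google.generativeai as genai

# --- 2. Response generation helper ---
def generate_response(system_prompt: str, user_message: str, model: str = "gemini-1.5-flash") -> str: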
    """
    Generates a response from the Gemini model based on the system prompt and user message.

    Args:
    system_prompt (str): The system prompt to guide the AI's behavior.
    user_message (str): The user's message or query.
    model (str): The model to be used for generating the response. Default is "gemini-1.5-flash".

    Returns:
    str: The response from the AI model as a string.
    """
    try:
        # Pass the system prompt to the GenerativeModel's system_instruction parameter
        gemini_model = genai.GenerativeModel(model, system_instruction=system_prompt)
        response = gemini_model.generate_content(user_message)
        return response.text
    except Exception as e:
        print(f"An error occurred during response generation: {e}")
        return "Error: Could not generate a response."

# --- 3. Main Logic: a self-contained evaluation example ---
if __name__ == "__main__":
    # Simulate a query, AI response, and true response from a previous step
    query = "What is the capital of France?"
    ai_response_content = "Paris is the capital of France."
    ideal_answer = "Paris"

    # Define the system prompt for the evaluation system
    evaluate_system_prompt = "You are an intelligent evaluation system tasked with assessing the AI assistant's responses. If the AI assistant's response is very close to the true response, assign a score of 1. If the response is incorrect or unsatisfactory in relation to the true response, assign a score of 0. If the response is partially aligned with the true response, assign a score of 0.5. Provide only the score (e.g., '1', '0', or '0.5') and nothing else."

    # Create the evaluation prompt by combining the user query, AI response, and true response
    evaluation_prompt = (
        f"User Query: {query}\n"
        f"AI Response:\n{ai_response_content}\n"
        f"True Response: {ideal_answer}"
    )

    # Generate the evaluation response using the Gemini API
    print("Generating evaluation response with Gemini...")
    evaluation_response_text = generate_response(evaluate_system_prompt, evaluation_prompt)

    # Print the evaluation score
    print("\nEvaluation Score:")
    print(evaluation_response_text)

Generating evaluation response with Gemini...

Evaluation Score:
1

