# Contextual Chunk Headers (CCH) in Simple RAG

Retrieval-Augmented Generation (RAG) improves the factual accuracy of language models by retrieving relevant external knowledge before generating a response. However, standard chunking often loses important context, making retrieval less effective.

Contextual Chunk Headers (CCH) address this by attaching high-level context, such as a document title or section header, to each chunk before it is embedded. This improves retrieval quality and reduces out-of-context responses.
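
As a minimal sketch of the prepending idea, assuming a header string is already available for each chunk, the text sent to the embedding model could be built as shown here. The helper name `format_chunk_with_header` and the `Header:` layout are illustrative choices, not something this notebook prescribes.

```python
def format_chunk_with_header(header: str, text: str) -> str:
    """Return the string to embed: the header on top, then the chunk body."""
    header = header.strip()
    text = text.strip()
    # Fall back to the bare chunk when no header is available.
    if not header:
        return text
    return f"Header: {header}\n\n{text}"


chunk = {
    "header": "Causes of Homelessness",
    "text": "A key factor is the lack of affordable housing.",
}
contextualized = format_chunk_with_header(chunk["header"], chunk["text"])
print(contextualized)
```

The rest of this notebook takes a slightly different route: it embeds the header and the chunk text separately and averages the two similarity scores at query time.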

## Steps in this Notebook:

1. **Data Ingestion**: Load and preprocess the text data.
2. **Chunking with Contextual Headers**: Split the text into overlapping chunks and generate a descriptive header for each chunk with an LLM.
3. **Embedding Creation**: Convert context-enhanced chunks into numerical representations.
4. **Semantic Search**: Retrieve relevant chunks based on a user query.
5. **Response Generation**: Use a language model to generate a response from retrieved text.
6. **Evaluation**: Score the response against a reference answer using an LLM judge (0, 0.5, or 1).

## Setting Up the Environment
We begin by importing the necessary libraries and configuring the Gemini API key from the environment.

In [1]:
import os
import numpy as np
import json
import fitz
from tqdm import tqdm
import google.generativeai as genai

# --- Gemini API Configuration ---
GOOGLE_API_KEY = os.getenv("GOOGLE_API_KEY")
if not GOOGLE_API_KEY:
    raise ValueError("GOOGLE_API_KEY environment variable is not set. Please set it.")
genai.configure(api_key=GOOGLE_API_KEY)

## Extracting Text and Identifying Section Headers
We extract the raw text from a PDF with PyMuPDF (`fitz`). Section titles are not read from the PDF itself; the headers are generated per chunk in the next step.

In [2]:
import fitz

def extract_text_from_pdf(pdf_path: str) -> str:
    """
    Extracts text from a PDF file.

    Args:
    pdf_path (str): Path to the PDF file.

    Returns:
    str: Extracted text from the PDF, or an empty string if an error occurs.
    """
    all_text = ""
    try:
        # Open the PDF file using a context manager
        with fitz.open(pdf_path) as mypdf:
            # Iterate through each page in the PDF
            for page in mypdf:
                # Extract text from the page
                all_text += page.get_text("text")

    except Exception as e:
        print(f"Error reading PDF file: {e}")
        return ""

    return all_text

## Chunking Text with Contextual Headers
To improve retrieval, we generate a descriptive header for each chunk using an LLM.

In [3]:
import os
import google.generativeai as genai

# --- 0) Initialize Gemini client (make sure GOOGLE_API_KEY is set) ---
GOOGLE_API_KEY = os.getenv("GOOGLE_API_KEY")
if not GOOGLE_API_KEY:
    raise ValueError("GOOGLE_API_KEY environment variable is not set.")
genai.configure(api_key=GOOGLE_API_KEY)

# 1) Define the header generator
def generate_chunk_header(chunk: str, model: str = "gemini-1.5-flash") -> str:
    """
    Generates a title/header for a given text chunk using a Gemini LLM.

    Args:
    chunk (str): The text chunk to summarize as a header.
    model (str): The model to be used for generating the header. Default is "gemini-1.5-flash".

    Returns:
    str: Generated header/title.
    """
    # Define the system prompt to guide the AI's behavior
    system_prompt = "Generate a concise and informative title for the given text. Do not include any other text besides the title."
    
    try:
        # Create a GenerativeModel instance with the system prompt
        gemini_model = genai.GenerativeModel(model, system_instruction=system_prompt)
        
        # Generate the response
        response = gemini_model.generate_content(chunk)

        # Return the generated header/title, stripping any leading/trailing whitespace
        return response.text.strip()
    except Exception as e:
        print(f"An error occurred during header generation: {e}")
        return "Failed to generate header."

# --- Main Logic (Re-implemented for a runnable example) ---
if __name__ == "__main__":
    # Simulate a text chunk
    text_chunk = "Homelessness is a complex social problem with various contributing factors, including economic, social, and personal issues. A key factor is the lack of affordable housing, which disproportionately affects low-income families and individuals."

    print("Generating chunk header with Gemini...")
    # Generate a header for the text chunk
    header = generate_chunk_header(text_chunk)
    
    # Print the result
    print(f"\nOriginal Chunk:\n{text_chunk}")
    print(f"\nGenerated Header:\n{header}")

Generating chunk header with Gemini...

Original Chunk:
Homelessness is a complex social problem with various contributing factors, including economic, social, and personal issues. A key factor is the lack of affordable housing, which disproportionately affects low-income families and individuals.

Generated Header:
The Complex Issue of Homelessness
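
Model-generated titles may come back wrapped in quotation marks or Markdown markers, or with extra lines of commentary. A small normalization step before storing or embedding the header can help. The sketch below is one possible approach; the function name `clean_header` and the fallback string are hypothetical and not part of the original notebook.

```python
def clean_header(raw: str, fallback: str = "Untitled section") -> str:
    """Normalize an LLM-generated title before storing or embedding it."""
    # Keep only the first non-empty line in case the model adds commentary.
    lines = [line.strip() for line in raw.splitlines() if line.strip()]
    if not lines:
        return fallback
    # Drop wrapping quotes, backticks, Markdown emphasis, and heading markers.
    # Note: this also removes a legitimate trailing apostrophe, if any.
    title = lines[0].strip("\"'`*# ").strip()
    return title or fallback


print(clean_header('"The Complex Issue of Homelessness"\n'))
```

A header that is empty after cleaning is replaced by the fallback, so that an empty string is never sent to the embedding model.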


In [4]:
import os
import google.generativeai as genai
from typing import List, Dict

# --- 0) Initialize Gemini client ---
GOOGLE_API_KEY = os.getenv("GOOGLE_API_KEY")
if not GOOGLE_API_KEY:
    raise ValueError("GOOGLE_API_KEY environment variable not set.")
genai.configure(api_key=GOOGLE_API_KEY)

# 1) Define the header generator for Gemini
def generate_chunk_header(chunk: str, model: str = "gemini-1.5-flash") -> str:
    """
    Generates a title/header for a given text chunk using a Gemini LLM.

    Args:
    chunk (str): The text chunk to summarize as a header.
    model (str): The model to be used for generating the header. Default is "gemini-1.5-flash".

    Returns:
    str: Generated header/title.
    """
    system_prompt = "Generate a concise and informative title for the given text. Do not include any other text besides the title."
    
    try:
        gemini_model = genai.GenerativeModel(model, system_instruction=system_prompt)
        response = gemini_model.generate_content(chunk)
        return response.text.strip()
    except Exception as e:
        print(f"An error occurred during header generation: {e}")
        return "Failed to generate header."

# 2) Define the text chunker with headers
def chunk_text_with_headers(text: str, n: int, overlap: int) -> List[Dict[str, str]]:
    """
    Chunks text into smaller segments and generates headers for each.

    Args:
    text (str): The full text to be chunked.
    n (int): The chunk size in characters.
    overlap (int): Overlapping characters between chunks.

    Returns:
    List[Dict]: A list of dictionaries with 'header' and 'text' keys.
    """
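    # Guard against a zero or negative step, which would fail or silently return nothing
    if overlap >= n:
        raise ValueError("overlap must be smaller than n, or the window never advances.")
    chunks = []
    for i in range(0, len(text), n - overlap):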
        chunk = text[i:i + n]
        header = generate_chunk_header(chunk)
        chunks.append({"header": header, "text": chunk})

    return chunks

# --- Main Logic (Re-implemented for a runnable example) ---
if __name__ == "__main__":
    text_to_chunk = """
    Homelessness is a complex social problem with various contributing factors, including economic, social, and personal issues.
    A key factor is the lack of affordable housing, which disproportionately affects low-income families and individuals.
    Social factors like family breakdown, domestic violence, and a lack of social support networks can also lead to homelessness.
    Personal crises, such as job loss, mental health challenges, or substance abuse, are often triggers for losing housing.
    Government and non-profit organizations offer services like shelters, food banks, and temporary housing to help the homeless.
    The best long-term solution is a holistic approach that includes providing affordable housing, mental health services, and job training.
    """
    
    print("Chunking text and generating headers with Gemini...")
    chunked_data = chunk_text_with_headers(text_to_chunk, n=256, overlap=50)

    print("\nGenerated chunks with headers:")
    for item in chunked_data:
        print(f"Header: {item['header']}\nText: {item['text'][:100]}...\n")

Chunking text and generating headers with Gemini...

Generated chunks with headers:
Header: Causes of Homelessness
Text: 
    Homelessness is a complex social problem with various contributing factors, including economic,...

Header: Factors Contributing to Homelessness
Text:  affects low-income families and individuals.
    Social factors like family breakdown, domestic vio...

Header: Addressing Homelessness: Solutions and Support
Text: ob loss, mental health challenges, or substance abuse, are often triggers for losing housing.
    Go...

Header: Addressing Homelessness: A Holistic Approach
Text: elp the homeless.
    The best long-term solution is a holistic approach that includes providing aff...
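
To make the window arithmetic concrete, here is a small sketch of the same sliding-window split without the LLM call. Consecutive windows start `n - overlap` characters apart, so each chunk repeats the last `overlap` characters of the previous one. The function name `sliding_chunks` and the toy input are illustrative only.

```python
from typing import List


def sliding_chunks(text: str, n: int, overlap: int) -> List[str]:
    """Split text into windows of n characters that share `overlap` characters."""
    if not 0 <= overlap < n:
        raise ValueError("overlap must satisfy 0 <= overlap < n")
    step = n - overlap  # distance between consecutive window starts
    return [text[i:i + n] for i in range(0, len(text), step)]


chunks = sliding_chunks("abcdefghij", n=4, overlap=2)
print(chunks)  # ['abcd', 'cdef', 'efgh', 'ghij', 'ij']
```

With `n=256` and `overlap=50`, as in the cell above, the step is 206 characters. The final window is usually shorter than `n`, which is why the last chunk in a run can be only a fragment.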



## Extracting and Chunking Text from a PDF File
Now, we load the PDF, extract text, and split it into chunks.

In [5]:
import os
import fitz
import google.generativeai as genai
from typing import List, Dict

# --- 1. Gemini API Configuration ---
# Your GOOGLE_API_KEY should be set in your environment
GOOGLE_API_KEY = os.getenv("GOOGLE_API_KEY")
if not GOOGLE_API_KEY:
    raise ValueError("GOOGLE_API_KEY environment variable is not set.")
genai.configure(api_key=GOOGLE_API_KEY)

# --- 2. Helper functions ---
def extract_text_from_pdf(pdf_path: str) -> str:
    """
    Extracts text from a PDF file using PyMuPDF (fitz).
    """
    all_text = ""
    try:
        with fitz.open(pdf_path) as mypdf:
            for page in mypdf:
                all_text += page.get_text("text") + " "
    except Exception as e:
        print(f"Error reading PDF: {e}")
        return ""
    return all_text.strip()

def generate_chunk_header(chunk: str, model: str = "gemini-1.5-flash") -> str:
    """
    Generates a concise and informative title for a text chunk using a Gemini LLM.
    """
    system_prompt = "Generate a concise and informative title for the given text. Do not include any other text besides the title."
    try:
        gemini_model = genai.GenerativeModel(model, system_instruction=system_prompt)
        response = gemini_model.generate_content(chunk)
        return response.text.strip()
    except Exception as e:
        print(f"An error occurred during header generation: {e}")
        return "Failed to generate header."

def chunk_text_with_headers(text: str, n: int, overlap: int) -> List[Dict[str, str]]:
    """
    Chunks text into smaller segments and generates headers for each.
    """
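    # Guard against a zero or negative step, which would fail or silently return nothing
    if overlap >= n:
        raise ValueError("overlap must be smaller than n, or the window never advances.")
    chunks = []
    step_size = n - overlap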
    for i in range(0, len(text), step_size):
        chunk = text[i:i + n]
        if chunk.strip():  # Only process non-empty chunks
            header = generate_chunk_header(chunk)
            chunks.append({"header": header, "text": chunk})
    return chunks

# --- 3. Main Logic ---
if __name__ == "__main__":
    # Define the PDF file path
    pdf_path = "/Users/kekunkoya/Desktop/ISEM 770 Class Project/AI_Information.pdf"
    
    # Check if the file exists
    if not os.path.isfile(pdf_path):
        print(f"Error: PDF file not found at '{pdf_path}'")
        raise SystemExit(1)

    # Extract text from the PDF file
    print("Extracting text from PDF...")
    extracted_text = extract_text_from_pdf(pdf_path)

    # Chunk the extracted text with headers
    print("Chunking text and generating headers with Gemini...")
    text_chunks = chunk_text_with_headers(extracted_text, 1000, 200)

    # Check if chunks were created
    if text_chunks:
        # Print a sample chunk with its generated header
        print("\nSample Chunk:")
        print("Header:", text_chunks[0]['header'])
        print("Content:", text_chunks[0]['text'])
    else:
        print("Failed to create any text chunks.")

Extracting text from PDF...
Chunking text and generating headers with Gemini...

Sample Chunk:
Header: Artificial Intelligence: An Introduction
Content: Understanding Artificial Intelligence 
Chapter 1: Introduction to Artificial Intelligence 
Artificial intelligence (AI) refers to the ability of a digital computer or computer-controlled robot 
to perform tasks commonly associated with intelligent beings. The term is frequently applied to 
the project of developing systems endowed with the intellectual processes characteristic of 
humans, such as the ability to reason, discover meaning, generalize, or learn from past 
experience. Over the past few decades, advancements in computing power and data availability 
have significantly accelerated the development and deployment of AI. 
Historical Context 
The idea of artificial intelligence has existed for centuries, often depicted in myths and fiction. 
However, the formal field of AI research began in the mid-20th century. The Dartmouth Wor

## Creating Embeddings for Headers and Text
We embed each chunk's text and its header separately, so that both can be compared against the query at search time.

In [6]:


# --- 2. Define the create_embeddings function for Gemini ---
def create_embeddings(text: str, model: str = "models/embedding-001") -> np.ndarray:
    """
    Creates an embedding for the given text using the Gemini API.

    Args:
        text (str): The input text to be embedded.
        model (str): The embedding model to be used. Default is "models/embedding-001".

    Returns:
        np.ndarray: The embedding vector as a NumPy array.
    """
    try:
        # Create embeddings using the specified model and input text
        response = genai.embed_content(model=model, content=text)
        # Return the embedding from the response as a NumPy array
        return np.array(response['embedding'], dtype=np.float32)
    except Exception as e:
        print(f"An error occurred during embedding: {e}")
        return np.array([])

# --- 3. Main Logic (Re-implemented for a runnable example) ---
if __name__ == "__main__":
    sample_text = "Homelessness is a significant social issue."

    print("Creating embedding with Gemini...")
    embedding = create_embeddings(sample_text)

    if embedding.size > 0:
        print("\nEmbedding created successfully.")
        print(f"Embedding shape: {embedding.shape}")
        print(f"First 5 values: {embedding[:5]}")
    else:
        print("\nFailed to create embedding.")

Creating embedding with Gemini...

Embedding created successfully.
Embedding shape: (768,)
First 5 values: [ 0.05044351 -0.03356895 -0.06893376 -0.04065947  0.04912253]


In [7]:
# Generate embeddings for each chunk
embeddings = []  # Initialize an empty list to store embeddings

# Iterate through each text chunk with a progress bar
for chunk in tqdm(text_chunks, desc="Generating embeddings"):
    # Create an embedding for the chunk's text
    text_embedding = create_embeddings(chunk["text"])
    # Create an embedding for the chunk's header
    header_embedding = create_embeddings(chunk["header"])
    # Append the chunk's header, text, and their embeddings to the list
    embeddings.append({"header": chunk["header"], "text": chunk["text"], "embedding": text_embedding, "header_embedding": header_embedding})

Generating embeddings: 100%|██████████| 42/42 [00:21<00:00,  1.99it/s]


## Performing Semantic Search
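We use cosine similarity to find the most relevant chunks for a user query. Each chunk is scored by averaging the query's similarity to the chunk's text embedding and to its header embedding.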
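
In symbols, with $q$ the query embedding and $t_i$, $h_i$ the text and header embeddings of chunk $i$ (notation introduced here for illustration only), the score computed by `semantic_search` in this notebook can be written as:

$$
\cos(u, v) = \frac{u \cdot v}{\lVert u \rVert \, \lVert v \rVert},
\qquad
\text{score}_i = \tfrac{1}{2}\bigl(\cos(q, t_i) + \cos(q, h_i)\bigr)
$$

The equal weighting of text and header is the choice made in this notebook; other weightings are possible but are not explored here.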

In [8]:


# --- 2. Helper function to get embeddings from Gemini ---
def get_embedding(text: str, model: str = "models/embedding-001") -> np.ndarray:
    """
    Creates an embedding for a given text using the Gemini API.
    """
    try:
        response = genai.embed_content(model=model, content=text)
        return np.array(response['embedding'], dtype=np.float32)
    except Exception as e:
        print(f"An error occurred: {e}")
        return np.array([])

# --- 3. Cosine similarity ---
def cosine_similarity(vec1: np.ndarray, vec2: np.ndarray) -> float:
    """
    Computes cosine similarity between two vectors.
    """
    dot_product = np.dot(vec1, vec2)
    norm_product = np.linalg.norm(vec1) * np.linalg.norm(vec2)
    # Handle the case where a norm is zero to prevent division by zero
    return dot_product / norm_product if norm_product != 0 else 0.0

# --- 4. Main Logic ---
if __name__ == "__main__":
    # Define two sentences with similar meaning
    sentence1 = "The cat sat on the mat."
    sentence2 = "A feline rested on the rug."
    
    # Define a sentence with a different meaning
    sentence3 = "The car drove on the highway."

    print("Generating embeddings with Gemini...")
    vec1 = get_embedding(sentence1)
    vec2 = get_embedding(sentence2)
    vec3 = get_embedding(sentence3)

    if vec1.size > 0 and vec2.size > 0 and vec3.size > 0:
        # Calculate and print the cosine similarity between the similar sentences
        print(f"\nComparing '{sentence1}' and '{sentence2}'...")
        similarity_1_2 = cosine_similarity(vec1, vec2)
        print(f"Cosine Similarity: {similarity_1_2:.4f}")

        # Calculate and print the cosine similarity between the dissimilar sentences
        print(f"\nComparing '{sentence1}' and '{sentence3}'...")
        similarity_1_3 = cosine_similarity(vec1, vec3)
        print(f"Cosine Similarity: {similarity_1_3:.4f}")

        # Expected: similarity_1_2 should be clearly higher than similarity_1_3.
        # The absolute values depend on the embedding model; unrelated sentences
        # can still score well above 0.
    else:
        print("\nFailed to generate embeddings.")

Generating embeddings with Gemini...

Comparing 'The cat sat on the mat.' and 'A feline rested on the rug.'...
Cosine Similarity: 0.8778

Comparing 'The cat sat on the mat.' and 'The car drove on the highway.'...
Cosine Similarity: 0.6581


In [9]:
import numpy as np
import google.generativeai as genai
import os
from typing import List, Dict

# --- 1. Gemini API Configuration ---
GOOGLE_API_KEY = os.getenv("GOOGLE_API_KEY")
if not GOOGLE_API_KEY:
    raise ValueError("GOOGLE_API_KEY environment variable not set.")
genai.configure(api_key=GOOGLE_API_KEY)

# --- 2. Helper Functions for Gemini API ---
def create_embeddings(text: str, model: str = "models/embedding-001") -> np.ndarray:
    """
    Creates an embedding for the given text using the Gemini API.
    
    Args:
        text (str): The input text to be embedded.
        model (str): The embedding model to be used.
        
    Returns:
        np.ndarray: The embedding vector as a NumPy array.
    """
    try:
        response = genai.embed_content(model=model, content=text)
        return np.array(response['embedding'], dtype=np.float32)
    except Exception as e:
        print(f"An error occurred during embedding: {e}")
        return np.array([])

def cosine_similarity(vec1: np.ndarray, vec2: np.ndarray) -> float:
    """
    Computes cosine similarity between two vectors.
    """
    dot_product = np.dot(vec1, vec2)
    norm_product = np.linalg.norm(vec1) * np.linalg.norm(vec2)
    return dot_product / norm_product if norm_product != 0 else 0.0

# --- 3. Semantic search over text and header embeddings ---
def semantic_search(query: str, chunks: List[Dict], k: int = 5) -> List[Dict]:
    """
    Searches for the most relevant chunks based on a query.
    
    Args:
    query (str): User query.
    chunks (List[dict]): List of text chunks with embeddings.
    k (int): Number of top results.
    
    Returns:
    List[dict]: Top-k most relevant chunks.
    """
    # Create an embedding for the query using the Gemini API
    query_embedding = create_embeddings(query)
    
    if query_embedding.size == 0:
        return []

    similarities = []
    
    for chunk in chunks:
        # Compute cosine similarity between query embedding and chunk text embedding
        sim_text = cosine_similarity(query_embedding, np.array(chunk["embedding"]))
        # Compute cosine similarity between query embedding and chunk header embedding
        sim_header = cosine_similarity(query_embedding, np.array(chunk["header_embedding"]))
        
        # Calculate the average similarity score
        avg_similarity = (sim_text + sim_header) / 2
        
        similarities.append((chunk, avg_similarity))

    similarities.sort(key=lambda x: x[1], reverse=True)
    return [x[0] for x in similarities[:k]]

# --- 4. Main Logic (Re-implemented for a runnable example) ---
if __name__ == "__main__":
    # Simulate a list of chunks with their embeddings
    # In a real scenario, these would come from a previous step
    # with generate_chunk_header and create_embeddings functions
    chunks_with_embeddings = [
        {"header": "Homelessness Causes", "text": "Homelessness is a complex social problem with various contributing factors, including economic, social, and personal issues.", "embedding": create_embeddings("Homelessness is a complex social problem with various contributing factors, including economic, social, and personal issues."), "header_embedding": create_embeddings("Homelessness Causes")},
        {"header": "Affordable Housing", "text": "A key factor is the lack of affordable housing, which disproportionately affects low-income families and individuals.", "embedding": create_embeddings("A key factor is the lack of affordable housing, which disproportionately affects low-income families and individuals."), "header_embedding": create_embeddings("Affordable Housing")},
        {"header": "Mental Health and Job Loss", "text": "Personal crises, such as job loss, mental health challenges, or substance abuse, are often triggers for losing housing.", "embedding": create_embeddings("Personal crises, such as job loss, mental health challenges, or substance abuse, are often triggers for losing housing."), "header_embedding": create_embeddings("Mental Health and Job Loss")},
        {"header": "Solar System Planets", "text": "The sun is the star at the center of the Solar System. The eight planets are Mercury, Venus, Earth, Mars, Jupiter, Saturn, Uranus, and Neptune.", "embedding": create_embeddings("The sun is the star at the center of the Solar System. The eight planets are Mercury, Venus, Earth, Mars, Jupiter, Saturn, Uranus, and Neptune."), "header_embedding": create_embeddings("Solar System Planets")}
    ]

    query = "What are the reasons for someone becoming homeless?"
    
    print(f"Searching for relevant chunks for query: '{query}'...")
    top_chunks = semantic_search(query, chunks_with_embeddings, k=2)

    print("\nTop 2 most relevant chunks:")
    for i, chunk in enumerate(top_chunks):
        print(f"[{i+1}] Header: {chunk['header']}")
        print(f"    Content: {chunk['text']}\n")

Searching for relevant chunks for query: 'What are the reasons for someone becoming homeless?'...

Top 2 most relevant chunks:
[1] Header: Homelessness Causes
    Content: Homelessness is a complex social problem with various contributing factors, including economic, social, and personal issues.

[2] Header: Mental Health and Job Loss
    Content: Personal crises, such as job loss, mental health challenges, or substance abuse, are often triggers for losing housing.



## Running a Query on Extracted Chunks

In [11]:
# Load validation data
with open('/Users/kekunkoya/Desktop/ISEM 770 Class Project/valh.json') as f:
    data = json.load(f)

query = data[0]['question']

# Retrieve the top 2 most relevant text chunks
top_chunks = semantic_search(query, embeddings, k=2)

# Print the results
print("Query:", query)
for i, chunk in enumerate(top_chunks):
    print(f"Header {i+1}: {chunk['header']}")
    print(f"Content:\n{chunk['text']}\n")

Query: What is the ETHOS typology?
Header 1: "Understanding Homelessness and Housing Exclusion: The ETHOS Typology in the EU"
Content:
elessness, while people living in insecure and/or inadequate housing 
and/or in social isolation might also be affected by exclusion from one or two domains, 
but their situation is classified under ‘housing exclusion’ rather than ‘homelessness’.
On the basis of this conceptional understanding and to try to grasp the varying 
practices in different EU countries, the ETHOS typology was developed, which 
relates, in its most recent version, thirteen different operational categories and 
twenty-four different living situations to the four conceptional categories: roofless, 
houseless, insecure housing and inadequate housing.4 See Table 1.2.
4	
Apart from documenting progress concerning the measurement of homelessness in different 
EU countries and reporting on the latest available data, the forth and fifth reviews of statistics 
(Edgar and Meert, 2005, 200

In [10]:


# Load validation data
val_path = '/Users/kekunkoya/Desktop/ISEM 770 Class Project/valh.json'
if not os.path.isfile(val_path):
    raise FileNotFoundError(f"Could not find validation file at: {val_path!r}")
with open(val_path, 'r', encoding='utf-8') as f:
    data = json.load(f)

# Use the first question from the validation data
query = data[0]['question']

# Retrieve the top 2 most relevant text chunks
top_chunks = semantic_search(query, chunks_with_embeddings, k=2)

# Print the results
print("Query:", query)
for i, chunk in enumerate(top_chunks):
    print(f"Header {i+1}: {chunk['header']}")
    print(f"Content:\n{chunk['text']}\n")

Query: What is the ETHOS typology?
Header 1: Homelessness Causes
Content:
Homelessness is a complex social problem with various contributing factors, including economic, social, and personal issues.

Header 2: Mental Health and Job Loss
Content:
Personal crises, such as job loss, mental health challenges, or substance abuse, are often triggers for losing housing.



## Generating a Response Based on Retrieved Chunks
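
The cells in this section and the next call `generate_response`, which is not defined anywhere in this notebook. A minimal sketch, assuming it mirrors `generate_chunk_header` (a `genai.GenerativeModel` created with a system instruction, returning the stripped reply text), might look like the following. The `model_factory` parameter is a hypothetical addition that lets the function be exercised without network access, and the empty-string return on failure is likewise an assumption.

```python
from typing import Any, Callable, Optional


def generate_response(
    system_prompt: str,
    user_prompt: str,
    model: str = "gemini-1.5-flash",
    model_factory: Optional[Callable[..., Any]] = None,
) -> str:
    """
    Sends a system prompt and a user prompt to a Gemini model and returns the reply text.

    model_factory is a hypothetical hook for testing; by default the real
    genai.GenerativeModel class is used (genai.configure must have been called).
    """
    if model_factory is None:
        # Imported lazily so the function can be defined without the SDK installed.
        import google.generativeai as genai
        model_factory = genai.GenerativeModel

    try:
        gemini_model = model_factory(model, system_instruction=system_prompt)
        response = gemini_model.generate_content(user_prompt)
        return response.text.strip()
    except Exception as e:
        print(f"An error occurred during response generation: {e}")
        return ""
```

Returning a plain string matches how the later cells print `ai_response` and the evaluation result directly.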

In [12]:



# Define the system prompt for the AI assistant
system_prompt = "You are an AI assistant that strictly answers based on the given context. If the answer cannot be derived directly from the provided context, respond with: 'I do not have enough information to answer that.'"

# Create the user prompt based on the top chunks
user_prompt = "\n".join([f"Header: {chunk['header']}\nContent:\n{chunk['text']}" for chunk in top_chunks])
user_prompt = f"{user_prompt}\nQuestion: {query}"

# Generate AI response
print("Generating AI response with Gemini...")
ai_response = generate_response(system_prompt, user_prompt)

# Print the final AI response
print("\nAI Response:")
print(ai_response)

Generating AI response with Gemini...

AI Response:
Social factors that contribute to homelessness include family breakdown, domestic violence, and a lack of social support networks.



## Evaluating the AI Response
An LLM judge compares the AI response with the reference answer from the validation set and returns a score of 0, 0.5, or 1.

In [13]:
# Define evaluation system prompt
evaluate_system_prompt = """You are an intelligent evaluation system. 
Assess the AI assistant's response based on the provided context. 
- Assign a score of 1 if the response is very close to the true answer. 
- Assign a score of 0.5 if the response is partially correct. 
- Assign a score of 0 if the response is incorrect.
Return only the score (0, 0.5, or 1)."""

# Extract the ground truth answer from validation data
true_answer = data[0]['ideal_answer']

# Construct evaluation prompt (the scoring rubric is passed separately as the system prompt)
evaluation_prompt = f"""
User Query: {query}
AI Response: {ai_response}
True Answer: {true_answer}
"""

# Generate evaluation score
evaluation_response = generate_response(evaluate_system_prompt, evaluation_prompt)

# Print the evaluation score
print("Evaluation Score:", evaluation_response)

Evaluation Score: 1


In [13]:


# --- 3. Main Logic (Re-implemented for a runnable example) ---
if __name__ == "__main__":
    # Simulate a query, AI response, and true answer
    query = "What are the main causes of homelessness?"
    ai_response_content = "A lack of affordable housing is a key contributing factor to homelessness."
    true_answer = "Homelessness is caused by a lack of affordable housing, job loss, and mental health issues."

    # Define evaluation system prompt
    evaluate_system_prompt = """You are an intelligent evaluation system.
Assess the AI assistant's response based on the provided context.
- Assign a score of 1 if the response is very close to the true answer.
- Assign a score of 0.5 if the response is partially correct.
- Assign a score of 0 if the response is incorrect.
Return only the score (0, 0.5, or 1)."""

    # Construct evaluation prompt
    evaluation_prompt = f"""
    User Query: {query}
    AI Response: {ai_response_content}
    True Answer: {true_answer}
    """

    # Generate evaluation score
    print("Generating evaluation score with Gemini...")
    evaluation_response = generate_response(evaluate_system_prompt, evaluation_prompt)
    
    # Print the evaluation score
    print("\nEvaluation Score:", evaluation_response)

Generating evaluation score with Gemini...

Evaluation Score: 0.5
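
The judge returns its score as free text, so it is worth validating before aggregating results over many questions. The sketch below shows one way to do that; the function name `parse_score` is hypothetical, and treating anything other than 0, 0.5, or 1 as invalid is an assumption based on the rubric above.

```python
import re
from typing import Optional


def parse_score(raw: str) -> Optional[float]:
    """Extract 0, 0.5, or 1 from the judge's reply; None if no valid score is found."""
    # The lookarounds reject numbers such as 0.75, 10, or 2.0 that merely
    # contain an allowed value as a substring.
    match = re.search(r"(?<![\d.])(0\.5|1(?:\.0)?|0(?:\.0)?)(?!\.?\d)", raw)
    if match is None:
        return None
    return float(match.group(1))


print(parse_score("0.5"))
print(parse_score("Score: 1"))
print(parse_score("not sure"))
```

Replies that do not contain a valid score come back as `None`, which makes it easy to count or retry them instead of silently treating them as zero.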

