# Retrieval Augmented Generation with ChromaDB and OpenAI

This notebook demonstrates the complete RAG pipeline: retrieving relevant context from the vector store and using it to generate enhanced responses.

**Features:** Local ChromaDB storage | OpenAI embeddings | GPT-4o generation | Cosine similarity evaluation

## 1. Setup and Configuration

Load the required libraries, set the configuration constants, and read the OpenAI API key from the `.env` file.

In [1]:
import chromadb
import openai
from openai import OpenAI
import os
from dotenv import load_dotenv
import time
import textwrap
import re
from IPython.display import display, HTML
import markdown

# Configuration
CHROMA_PATH = "./chroma_db"
COLLECTION_NAME = "space_exploration"
N_RESULTS = 10  # Number of results to retrieve for RAG
GPT_MODEL = "gpt-4o"

# Load API key
load_dotenv()
openai.api_key = os.getenv("OPENAI_API_KEY")

if not openai.api_key:
    raise ValueError("OPENAI_API_KEY not found in .env file")

print("✓ Configuration loaded")
print("  API Key: loaded")  # Avoid echoing any part of the secret into the notebook output
print(f"  Vector Store: {CHROMA_PATH}")
print(f"  Collection: {COLLECTION_NAME}")
print(f"  Retrieval results: {N_RESULTS}")
print(f"  Generation model: {GPT_MODEL}")

✓ Configuration loaded
  API Key: loaded
  Vector Store: ./chroma_db
  Collection: space_exploration
  Retrieval results: 10
  Generation model: gpt-4o


## 2. Load Vector Store

In [2]:
# Initialize ChromaDB client and load existing collection
client = chromadb.PersistentClient(path=CHROMA_PATH)

try:
    collection = client.get_collection(name=COLLECTION_NAME)
    print(f"✓ Loaded existing collection: {COLLECTION_NAME}")
    print(f"  Total documents: {collection.count()}")
except Exception as e:
    print(f"✗ Error loading collection: {e}")
    print("  Make sure you've run 2_Embeddings_vector_store.ipynb first!")
    raise

✓ Loaded existing collection: space_exploration
  Total documents: 1145


## 3. Retrieval Functions

In [3]:
def get_embeddings(texts, model="text-embedding-3-small"):
    """Get embeddings from OpenAI"""
    if isinstance(texts, str):
        texts = [texts]
    texts = [t.replace("\n", " ") for t in texts]
    response = openai.embeddings.create(input=texts, model=model)
    return [data.embedding for data in response.data]

def search_query(query_text, n_results=N_RESULTS):
    """Search the vector store for relevant documents"""
    # Get embedding for the query
    query_embedding = get_embeddings([query_text])[0]
    
    # Search ChromaDB
    results = collection.query(
        query_embeddings=[query_embedding],
        n_results=n_results,
        include=["documents", "metadatas", "distances"]
    )
    
    return results

print("✓ Retrieval functions defined")

✓ Retrieval functions defined


## 4. User Query and Retrieval

In [4]:
# User's question
user_prompt = "Tell me about space exploration on the Moon and Mars."

print(f"User Query: {user_prompt}\n")

# Perform retrieval
search_results = search_query(user_prompt)

# Display search results
print(f"Retrieved {len(search_results['ids'][0])} relevant documents:\n")
print("="*80)

for i in range(len(search_results['ids'][0])):
    distance = search_results['distances'][0][i]
    text = search_results['documents'][0][i]
    source = search_results['metadatas'][0][i]['source']
    
    print(f"\nResult {i+1} (distance: {distance:.4f})")
    print(f"Source: {source}")
    print(f"Text: {text[:200]}...")

print("="*80)

User Query: Tell me about space exploration on the Moon and Mars.

Retrieved 10 relevant documents:


Result 1 (distance: 0.8402)
Source: llm.txt
Text: marked the sixth landing and the most recent human visit.Artemis IIis scheduled to complete a crewed flyby of the Moon in 2025, andArtemis IIIwill perform the first lunar landing since Apollo 17 with ...

Result 2 (distance: 0.8876)
Source: llm.txt
Text: ued their operations aroundMars, providing scientific insights into the planet's surface and atmosphere. In 2025,Mars Expressreceived a software update, which could allow it to stay operational until ...

Result 3 (distance: 0.9456)
Source: llm.txt
Text: al efforts toward returning humans to the Moon and laying the foundation of eventual humanexploration of Mars. Space Policy Directive 1 authorized the lunar-focused campaign. The campaign, later named...

Result 4 (distance: 0.9588)
Source: llm.txt
Text: mis programis aMoon explorationprogram led by the United States'National Aeronautic

## 5. Create Augmented Input

In [5]:
# Get the top result text
top_text = search_results['documents'][0][0].strip()
top_distance = search_results['distances'][0][0]
top_source = search_results['metadatas'][0][0]['source']

print("Top Search Result:")
print(f"Distance: {top_distance:.4f}")
print(f"Source: {top_source}")
print(f"\nText:\n{textwrap.fill(top_text, width=80)}")

Top Search Result:
Distance: 0.8402
Source: llm.txt

Text:
marked the sixth landing and the most recent human visit.Artemis IIis scheduled
to complete a crewed flyby of the Moon in 2025, andArtemis IIIwill perform the
first lunar landing since Apollo 17 with it scheduled for launch no earlier than
2026. Robotic missions are still pursued vigorously. The exploration ofMarshas
been an important part of the space exploration programs of the Soviet Union
(later Russia), the United States, Europe, Japan, and India. Dozens ofrobotic
spacecraft, includingorbiters,landers, androvers, have been launched toward Mars
since the 1960s. These missions were aimed at gathering data about current
conditions and answering questions about the history of Mars. The questions
raised by the scientific community are expected to not only give a better
appreciation of the Red Planet but also yield further insight into the past, and
possible future, of Earth. The exploration of Mars has come at a considerable
fi

In [6]:
# Create augmented input by combining user query with retrieved context
augmented_input = user_prompt + " " + top_text

print(f"\n{'='*80}")
print("AUGMENTED INPUT CREATED")
print(f"{'='*80}")
print(f"Original query length: {len(user_prompt)} characters")
print(f"Retrieved context length: {len(top_text)} characters")
print(f"Total augmented input length: {len(augmented_input)} characters")


AUGMENTED INPUT CREATED
Original query length: 53 characters
Retrieved context length: 1000 characters
Total augmented input length: 1054 characters
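
The cell above augments the query with only the top result, although `N_RESULTS` documents were retrieved. One possible extension, sketched below, is to join several retrieved chunks under a character budget. The function name `build_context`, the `max_chars` parameter, and the separator are hypothetical choices for illustration and are not part of the original notebook.

```python
def build_context(documents, max_chars=3000, separator="\n\n---\n\n"):
    """Join retrieved chunks in rank order without exceeding max_chars."""
    selected = []
    used = 0
    for doc in documents:
        doc = doc.strip()
        # Account for the separator that precedes every chunk except the first
        cost = len(doc) + (len(separator) if selected else 0)
        if used + cost > max_chars:
            break  # stop at the first chunk that does not fit, keeping rank order
        selected.append(doc)
        used += cost
    return separator.join(selected)

# Stand-in chunks; in the notebook this would be search_results['documents'][0]
chunks = ["Moon chunk " * 10, "Mars chunk " * 10, "Other chunk " * 10]
context = build_context(chunks, max_chars=250)
print(len(context))
```

In the notebook, `build_context(search_results['documents'][0])` could stand in for `top_text` when building the augmented input.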


## 6. Generation with GPT-4o

In [7]:
# Use a separate name so the ChromaDB `client` from section 2 is not overwritten
openai_client = OpenAI()

def call_gpt4_with_context(user_query, context):
    """Generate a response with GPT_MODEL, grounded in the retrieved context"""
    prompt = f"Based on the following context, please answer the question.\n\nContext: {context}\n\nQuestion: {user_query}"

    try:
        response = openai_client.chat.completions.create(
            model=GPT_MODEL,
            messages=[
                {"role": "system", "content": "You are a space exploration expert. Answer questions based on the provided context."},
                {"role": "user", "content": prompt}
            ],
            temperature=0.1  # Low temperature keeps the answer close to the context
        )
        return response.choices[0].message.content
    except Exception as e:
        # Surface the failure instead of returning the error text as if it were an answer
        print(f"✗ Generation failed: {e}")
        raise

# Measure response time
start_time = time.time()

gpt4_response = call_gpt4_with_context(user_prompt, top_text)

response_time = time.time() - start_time

print(f"Response Time: {response_time:.2f} seconds\n")
print(f"{GPT_MODEL} Response:\n")
print("="*80)
print(gpt4_response)
print("="*80)

Response Time: 5.33 seconds

gpt-4o Response:

Space exploration on the Moon and Mars has been a significant focus for various space agencies around the world.

**Moon Exploration:**
- The Moon has been a primary target for human space exploration, with the Apollo program marking significant milestones. The last human visit to the Moon was Apollo 17, which was the sixth landing.
- The Artemis program is the current initiative by NASA to return humans to the Moon. Artemis II is scheduled for a crewed flyby of the Moon in 2025, and Artemis III aims to perform the first lunar landing since Apollo 17, with a launch no earlier than 2026.
- Robotic missions continue to play a crucial role in lunar exploration, providing valuable data and paving the way for future human missions.

**Mars Exploration:**
- Mars exploration has been a major focus for space agencies from the Soviet Union (later Russia), the United States, Europe, Japan, and India.
- Since the 1960s, dozens of robotic spacecraft, 

## 7. Formatted Response Display

In [8]:
from IPython.display import display, Markdown

def print_formatted_response(response):
    """Display response with proper formatting (markdown or plain text)"""
    # Check for markdown patterns (anchored so ordinary prose does not match)
    markdown_patterns = [
        r"^#+\s",                     # Headers
        r"^\s*[*+-]\s",               # Bullet list items
        r"\*\*.+?\*\*",               # Bold
        r"(?<!\w)_[^_\n]+_(?!\w)",    # Italics (ignores snake_case identifiers)
        r"\[.+\]\(.+\)",              # Links
        r"```"                        # Code blocks
    ]

    # If any pattern matches, assume markdown and render it
    if any(re.search(pattern, response, re.MULTILINE) for pattern in markdown_patterns):
        display(Markdown(response))
    else:
        # Plain text with wrapping
        wrapper = textwrap.TextWrapper(width=80)
        wrapped_text = wrapper.fill(text=response)
        print("Text Response:")
        print("-"*80)
        print(wrapped_text)
        print("-"*80)

print_formatted_response(gpt4_response)

Space exploration on the Moon and Mars has been a significant focus for various space agencies around the world.

**Moon Exploration:**
- The Moon has been a primary target for human space exploration, with the Apollo program marking significant milestones. The last human visit to the Moon was Apollo 17, which was the sixth landing.
- The Artemis program is the current initiative by NASA to return humans to the Moon. Artemis II is scheduled for a crewed flyby of the Moon in 2025, and Artemis III aims to perform the first lunar landing since Apollo 17, with a launch no earlier than 2026.
- Robotic missions continue to play a crucial role in lunar exploration, providing valuable data and paving the way for future human missions.

**Mars Exploration:**
- Mars exploration has been a major focus for space agencies from the Soviet Union (later Russia), the United States, Europe, Japan, and India.
- Since the 1960s, dozens of robotic spacecraft, including orbiters, landers, and rovers, have been launched toward Mars.
- These missions aim to gather data about Mars' current conditions and answer questions about its history, which could provide insights into the past and possible future of Earth.
- Despite the considerable financial cost, with roughly two-thirds of all spacecraft destined for Mars facing challenges, the exploration of Mars continues to be a priority due to its potential to enhance our understanding of the planet and its implications for Earth.

## 8. Evaluation with Cosine Similarity

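We evaluate the RAG output by measuring the cosine similarity between the generated response and two different inputs:
1. User prompt vs response (the query alone, without the retrieved context)
2. Augmented input vs response (the query plus the retrieved context)

A higher score for the augmented input indicates that the response draws on the retrieved text. The response is the same in both comparisons and was generated with the context, so the gap measures overlap with the retrieved passage rather than a gain over a no-retrieval baseline.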
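
For reference, cosine similarity is the dot product of two vectors divided by the product of their norms. A minimal sketch on raw word counts follows; it uses plain term frequencies rather than TF-IDF weights or learned embeddings, and the helper name `cosine_from_counts` is hypothetical.

```python
import math
from collections import Counter

def cosine_from_counts(text1: str, text2: str) -> float:
    """Cosine similarity between bag-of-words count vectors of two texts."""
    a = Counter(text1.lower().split())
    b = Counter(text2.lower().split())
    # Dot product over the shared vocabulary (other terms contribute zero)
    dot = sum(a[word] * b[word] for word in a.keys() & b.keys())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty text has no direction to compare
    return dot / (norm_a * norm_b)

# Identical texts score (up to floating-point error) 1; disjoint texts score 0
print(cosine_from_counts("moon and mars", "moon and mars"))
print(cosine_from_counts("moon mission", "mars rover"))
```

The two methods below follow the same formula and differ only in how each text is turned into a vector.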

### 8.1 TF-IDF Based Cosine Similarity

In [9]:
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

def calculate_cosine_similarity(text1, text2):
    """Calculate cosine similarity using TF-IDF"""
    vectorizer = TfidfVectorizer()
    tfidf = vectorizer.fit_transform([text1, text2])
    similarity = cosine_similarity(tfidf[0:1], tfidf[1:2])
    return similarity[0][0]

# Compare user prompt (without context) vs response
similarity_without_context = calculate_cosine_similarity(user_prompt, gpt4_response)

# Compare augmented input (with context) vs response
similarity_with_context = calculate_cosine_similarity(augmented_input, gpt4_response)

print("TF-IDF Cosine Similarity Results:")
print("="*80)
print(f"User prompt → GPT-4 response: {similarity_without_context:.3f}")
print(f"Augmented input → GPT-4 response: {similarity_with_context:.3f}")
print(f"\nImprovement with RAG: {(similarity_with_context - similarity_without_context):.3f}")
print("="*80)

TF-IDF Cosine Similarity Results:
User prompt → GPT-4 response: 0.421
Augmented input → GPT-4 response: 0.772

Improvement with RAG: 0.351


### 8.2 Embedding-Based Cosine Similarity

In [11]:
from sentence_transformers import SentenceTransformer

# Load sentence transformer model
model = SentenceTransformer('all-MiniLM-L6-v2')

def calculate_cosine_similarity_with_embeddings(text1, text2):
    """Calculate cosine similarity using sentence embeddings"""
    embeddings1 = model.encode(text1)
    embeddings2 = model.encode(text2)
    similarity = cosine_similarity([embeddings1], [embeddings2])
    return similarity[0][0]

# Compare using embeddings
emb_similarity_without_context = calculate_cosine_similarity_with_embeddings(user_prompt, gpt4_response)
emb_similarity_with_context = calculate_cosine_similarity_with_embeddings(augmented_input, gpt4_response)

print("Embedding-Based Cosine Similarity Results:")
print("="*80)
print(f"User prompt → GPT-4 response: {emb_similarity_without_context:.3f}")
print(f"Augmented input → GPT-4 response: {emb_similarity_with_context:.3f}")
print(f"\nImprovement with RAG: {(emb_similarity_with_context - emb_similarity_without_context):.3f}")
print("="*80)

Embedding-Based Cosine Similarity Results:
User prompt → GPT-4 response: 0.692
Augmented input → GPT-4 response: 0.874

Improvement with RAG: 0.181
