# Lab 3: Building the RAG Query Flow

This lab walks you through the core steps of a Retrieval-Augmented Generation (RAG) query. We'll use Voyage AI to embed a user's question, query MongoDB Atlas Vector Search to retrieve relevant context, and then structure an augmented prompt for a Large Language Model (LLM).

## Objectives
- Embed a user's query using Voyage AI.
- Perform a vector search in MongoDB Atlas to retrieve relevant documents.
- Combine the retrieved documents into a context for the LLM.
- Construct an augmented prompt for an LLM.

## Prerequisites
- Complete Lab 1 and Lab 2 (MongoDB Atlas setup and data ingestion).
- Ensure your MongoDB Atlas Vector Search index is built and ready.
- Python environment set up with `pymongo`, `voyageai`, and `python-dotenv` installed.
- `.env` file containing `MONGO_URI` and `VOYAGE_API_KEY`.

## Step 1: Load Environment Variables and Initialize Clients

In [None]:
from dotenv import load_dotenv
import os
import voyageai
from pymongo import MongoClient

# Load environment variables
load_dotenv()

# Initialize Voyage AI client
voyage_api_key = os.environ.get("VOYAGE_API_KEY")
if not voyage_api_key:
    raise ValueError("VOYAGE_API_KEY not found in .env file or environment variables.")
vo = voyageai.Client(api_key=voyage_api_key)

# Initialize MongoDB Client
mongo_uri = os.environ.get("MONGO_URI")
if not mongo_uri:
    raise ValueError("MONGO_URI not found in .env file or environment variables.")
client = MongoClient(mongo_uri)

# Select your database and collection
db = client['rag_db']
collection = db['documents']

print("Clients initialized successfully.")
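The two checks in the cell above follow the same pattern: read a variable, then fail fast if it is missing. As a minimal sketch, that pattern could be factored into a helper. The name `require_env` is hypothetical and not part of `python-dotenv` or any other library used in this lab.

```python
import os


def require_env(name: str) -> str:
    """Return the value of an environment variable, or raise if it is missing or empty."""
    value = os.environ.get(name)
    if not value:
        raise ValueError(f"{name} not found in .env file or environment variables.")
    return value


# Possible usage after load_dotenv() has run (left commented so this cell has no side effects):
# voyage_api_key = require_env("VOYAGE_API_KEY")
# mongo_uri = require_env("MONGO_URI")
```

Failing at startup with the variable name in the message tends to be easier to debug than an authentication error raised later by one of the clients.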

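## Step 2: Define User Query and Embed It with Voyage AI

We'll take a sample user question and convert it into a vector embedding. Specify `input_type="query"` when embedding search queries, and use the same embedding model that produced the stored document embeddings so that query and document vectors live in the same space.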

In [None]:
user_query = "Tell me about the upcoming features of the software."
print(f"\nUser Query: {user_query}")

print("Generating query embedding with Voyage AI...")
try:
    query_embedding_response = vo.embed(
        texts=[user_query],
        model="voyage-large-2",  # Use the same model as for indexing
        input_type="query"
    )
    query_embedding = query_embedding_response.embeddings[0]
    print(f"Query embedding generated. Dimension: {len(query_embedding)}")
except Exception as e:
    # Re-raise instead of calling exit(), which would shut down the notebook kernel.
    # Raising also stops later cells from running with an undefined query_embedding.
    raise RuntimeError(f"Error generating query embedding: {e}") from e
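To build some intuition for why embeddings are useful for retrieval, the sketch below compares toy vectors with cosine similarity, one common way to measure how close two embeddings are. The three-dimensional vectors are made up for illustration (real embeddings have far more dimensions), and `cosine_similarity` is a hypothetical helper, not a Voyage AI or MongoDB function. Whether your index actually uses cosine similarity depends on how it was defined in Lab 2.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("Vectors must have the same dimension.")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("Cosine similarity is undefined for zero vectors.")
    return dot / (norm_a * norm_b)


# Toy vectors standing in for real embeddings (illustrative values only)
toy_query = [1.0, 0.0, 1.0]
toy_doc_close = [0.9, 0.1, 1.1]   # points in nearly the same direction as the query
toy_doc_far = [-1.0, 1.0, 0.0]    # points in a very different direction

print(f"close: {cosine_similarity(toy_query, toy_doc_close):.3f}")
print(f"far:   {cosine_similarity(toy_query, toy_doc_far):.3f}")
```

The vector pointing in nearly the same direction as the query gets the higher value, which is the property a vector search relies on when ranking stored chunks.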

## Step 3: Perform Vector Search in MongoDB Atlas

Now we'll use the `$vectorSearch` aggregation stage to find documents in our MongoDB collection that are semantically similar to our `user_query`.

In [None]:
pipeline = [
  {
    '$vectorSearch': {
      'queryVector': query_embedding,
      'path': 'embedding',          # The field where your embeddings are stored
      'numCandidates': 50,          # Number of candidates considered during the approximate search (must be >= limit)
      'limit': 3,                   # Number of top results to return
      'index': 'default',           # Name of your Vector Search index (created in Lab 2)
      # 'filter': { 'source': { '$eq': 'specific_document' } } # Optional: pre-filter (the field must be indexed as a filter field)
    }
  },
  {
    '$project': {
      'text_chunk': 1,
      'source': 1,
      'score': { '$meta': 'vectorSearchScore' }, # Include similarity score
      '_id': 0 # Exclude _id from results
    }
  }
]

print("Performing vector search in MongoDB Atlas...")
retrieved_documents = list(collection.aggregate(pipeline))

if retrieved_documents:
    print(f"Retrieved {len(retrieved_documents)} relevant documents:")
    for doc in retrieved_documents:
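        # Use .get() for projected fields, since $project omits fields that a document lacks
        print(f"  - Score: {doc['score']:.4f}, Source: {doc.get('source', 'unknown')}, Text: {doc.get('text_chunk', '')[:70]}...")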
else:
    print("No relevant documents found.")
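Vector search always returns the nearest neighbors it finds, even when none of them is actually relevant to the question. One possible safeguard is to drop results below a minimum score before building the context. The sketch below assumes result documents shaped like the `$project` output above. The helper name `filter_by_score`, the sample documents, and the `0.75` cutoff are all hypothetical; a sensible threshold depends on your index's similarity function and your data.

```python
def filter_by_score(documents, min_score):
    """Keep only documents whose 'score' is at least min_score.

    Documents without a 'score' field are treated as scoring 0.0.
    """
    return [doc for doc in documents if doc.get("score", 0.0) >= min_score]


# Made-up results in the same shape as the $project stage output
sample_results = [
    {"text_chunk": "Sample chunk A", "source": "doc_a", "score": 0.91},
    {"text_chunk": "Sample chunk B", "source": "doc_b", "score": 0.78},
    {"text_chunk": "Sample chunk C", "source": "doc_c", "score": 0.42},
]

relevant = filter_by_score(sample_results, min_score=0.75)
print(f"Kept {len(relevant)} of {len(sample_results)} results")
```

In the lab you could apply the same call to `retrieved_documents`, then inspect a few queries by hand to see which scores correspond to useful chunks before settling on a cutoff.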

## Step 4: Build Context for the LLM

We join the `text_chunk` field of each retrieved document into a single context string that will be passed to the LLM.

In [None]:
context = "\n".join([doc['text_chunk'] for doc in retrieved_documents])

print("\n--- Retrieved Context for LLM ---")
print(context)
print("---------------------------------")

if not context:
    print("Warning: No context was retrieved. The LLM might not be able to answer accurately.")
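A plain newline join works for a handful of short chunks, but it discards the source of each chunk and puts no bound on the context length. As a hedged alternative, the sketch below labels each chunk with its source and stops before a character budget is exceeded. `build_context` and the `max_chars` default are hypothetical choices for illustration; a real application would more likely budget in tokens for its specific LLM.

```python
def build_context(documents, max_chars=4000):
    """Join chunks, each labelled with its source, without exceeding max_chars."""
    parts = []
    used = 0
    for i, doc in enumerate(documents, start=1):
        chunk = doc.get("text_chunk", "").strip()
        if not chunk:
            continue  # skip documents with no usable text
        block = f"[{i}] Source: {doc.get('source', 'unknown')}\n{chunk}"
        # Count the blank-line separator that joins this block to the previous one
        extra = len(block) + (2 if parts else 0)
        if used + extra > max_chars:
            break  # results arrive best-first, so stop at the first block that does not fit
        parts.append(block)
        used += extra
    return "\n\n".join(parts)


# Made-up documents in the same shape as the vector search results
example_docs = [
    {"text_chunk": "alpha", "source": "a.md"},
    {"text_chunk": "beta", "source": "b.md"},
]
print(build_context(example_docs))
```

Keeping the source next to each chunk also makes it possible to ask the LLM to cite which chunk an answer came from.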

## Step 5: Construct Augmented Prompt for LLM

Finally, we create the full prompt that will be sent to our LLM. This prompt includes instructions for the LLM, the retrieved context, and the original user question. For this lab, we will only construct the prompt, not make an actual LLM call.

In [None]:
if context:
    llm_prompt = f"""
You are a helpful assistant. Answer the user's question based on the provided context only.
If you cannot find the answer in the context, politely state that the information is not available.

Context:
{context}

Question: {user_query}

Answer:
"""
else:
    llm_prompt = f"""
You are a helpful assistant. I couldn't find relevant information for the following question.
Please state that the information is not available in the provided knowledge base.

Question: {user_query}

Answer:
"""

print("\n--- LLM Augmented Prompt ---")
print(llm_prompt)
print("----------------------------")

# You would then make your LLM API call here, e.g.:
# from openai import OpenAI
# client_openai = OpenAI(api_key=os.environ.get("OPENAI_API_KEY"))
# completion = client_openai.chat.completions.create(
#     model="gpt-3.5-turbo",
#     messages=[
#         {"role": "user", "content": llm_prompt}
#     ]
# )
# llm_response = completion.choices[0].message.content
# print(f"\nLLM Response: {llm_response}")
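If you plan to reuse this logic for more than one question, the two prompt branches can be wrapped in a function. The following is a minimal sketch that reuses the lab's prompt wording; the function name `build_prompt` is hypothetical.

```python
def build_prompt(context: str, question: str) -> str:
    """Return an augmented prompt, or a fallback prompt when the context is empty."""
    if context.strip():
        return (
            "You are a helpful assistant. Answer the user's question based on the provided context only.\n"
            "If you cannot find the answer in the context, politely state that the information is not available.\n\n"
            f"Context:\n{context}\n\n"
            f"Question: {question}\n\n"
            "Answer:"
        )
    # No usable context: instruct the model to say so rather than guess
    return (
        "You are a helpful assistant. I couldn't find relevant information for the following question.\n"
        "Please state that the information is not available in the provided knowledge base.\n\n"
        f"Question: {question}\n\n"
        "Answer:"
    )
```

With the variables from this lab, the call would be `llm_prompt = build_prompt(context, user_query)`. Treating a whitespace-only context as empty avoids sending the LLM a "Context:" section with nothing in it.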

## Conclusion

You've successfully built the core retrieval and augmentation steps of a RAG pipeline! The next step in a full application would be to send this `llm_prompt` to an actual LLM and process its response.

In [None]:
# Don't forget to close the MongoDB client connection
client.close()
print("MongoDB client connection closed.")