# Lab 3: Building the RAG Query Flow

This lab walks you through the core steps of a Retrieval-Augmented Generation (RAG) query. We'll use Voyage-AI to embed a user's question, query MongoDB Atlas Vector Search to retrieve relevant context, and then call Azure OpenAI via LangChain to generate an answer.

## Objectives
- Embed a user's query using Voyage-AI.
- Perform a vector search in MongoDB Atlas to retrieve relevant documents.
- Combine the retrieved documents into a context for the LLM.
- Construct an augmented prompt for an LLM.
- Generate an answer using Azure OpenAI with LangChain.

## Prerequisites
- Complete Lab 1 and Lab 2 (MongoDB Atlas setup and data ingestion).
- Ensure your MongoDB Atlas Vector Search index is built and ready.
- Python environment set up with `pymongo`, `voyageai`, `langchain-openai`, and `python-dotenv` installed.
- `.env` file containing the following variables:
  - `MONGODB_URI`
  - `VOYAGEAI_API_KEY`
  - `AZURE_OPENAI_API_KEY`
  - `AZURE_OPENAI_ENDPOINT`
  - `AZURE_OPENAI_DEPLOYMENT_NAME`
  - `AZURE_OPENAI_API_VERSION` (optional; Step 6 falls back to `2024-06-01` if it is unset)

In [None]:
%pip install langchain-openai pymongo python-dotenv voyageai
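
Before initializing any clients, it can help to check all of the `.env` variables from the prerequisites in one place. The following is a minimal sketch, not part of the lab's required code: the helper name `missing_env_vars` and the `REQUIRED_ENV_VARS` list are assumptions introduced here for illustration, and `AZURE_OPENAI_API_VERSION` is left out because Step 6 supplies a default for it.

```python
import os

# Variable names copied from the prerequisites list; the list itself is illustrative.
REQUIRED_ENV_VARS = [
    "MONGODB_URI",
    "VOYAGEAI_API_KEY",
    "AZURE_OPENAI_API_KEY",
    "AZURE_OPENAI_ENDPOINT",
    "AZURE_OPENAI_DEPLOYMENT_NAME",
]


def missing_env_vars(names, env=None):
    """Return the names that are unset or empty in the given mapping.

    `env` defaults to os.environ; passing a plain dict makes the check easy to test.
    """
    env = os.environ if env is None else env
    return [name for name in names if not env.get(name)]


# Report rather than raise, so the check can run before the .env file is loaded.
missing = missing_env_vars(REQUIRED_ENV_VARS)
print("Missing variables:", ", ".join(missing) if missing else "none")
```

Run it after `load_dotenv()` if you want values from the `.env` file to be taken into account.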

## Step 1: Load Environment Variables and Initialize Clients

In [None]:
from dotenv import load_dotenv
import os
import voyageai
from pymongo import MongoClient

# Load environment variables
load_dotenv()

# Initialize Voyage-AI Client
voyageai_api_key = os.environ.get("VOYAGEAI_API_KEY")
if not voyageai_api_key:
    raise ValueError("VOYAGEAI_API_KEY not found in .env file or environment variables.")
vo = voyageai.Client(api_key=voyageai_api_key)

# Initialize MongoDB Client
mongo_uri = os.environ.get("MONGODB_URI")
if not mongo_uri:
    raise ValueError("MONGODB_URI not found in .env file or environment variables.")
client = MongoClient(mongo_uri)

# Select your database and collection
db = client['rag_db']
collection = db['documents']

print("Clients initialized successfully.")

## Step 2: Define User Query and Embed It with Voyage-AI

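We'll take a sample user question and convert it into a vector embedding. Voyage-AI distinguishes queries from documents, so pass `input_type="query"` when embedding search queries and `input_type="document"` when embedding the content being indexed.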

In [None]:
user_query = "Tell me about the upcoming features of the software."
print(f"\nUser Query: {user_query}")

print("Generating query embedding with Voyage-AI...")
try:
    query_embedding_response = vo.embed(
        texts=[user_query],
        model="voyage-3-large",  # Use the same model as for indexing
        input_type="query",
    )
    query_embedding = query_embedding_response.embeddings[0]
    print(f"Query embedding generated. Dimension: {len(query_embedding)}")
except Exception as e:
    print(f"Error generating query embedding: {e}")
    raise  # Re-raise so the cell fails visibly; exit() would shut down the notebook kernel
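
Once the query is a vector, retrieval comes down to comparing it with the stored vectors. The sketch below illustrates that idea with an exhaustive cosine-similarity ranking over toy 3-dimensional vectors. It is an illustration only: Atlas Vector Search uses approximate nearest-neighbor search, and the similarity function it applies is whatever the index definition specifies. The function names `cosine_similarity` and `top_k` and the toy data are assumptions made for this example.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("Vectors must have the same dimension.")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("Cosine similarity is undefined for a zero vector.")
    return dot / (norm_a * norm_b)


def top_k(query_vector, documents, k=3):
    """Score every document exhaustively and return the k best (score, doc) pairs."""
    scored = [(cosine_similarity(query_vector, d["embedding"]), d) for d in documents]
    scored.sort(key=lambda pair: pair[0], reverse=True)  # Highest similarity first
    return scored[:k]


# Toy data: real embeddings have hundreds or thousands of dimensions.
toy_docs = [
    {"text_chunk": "a", "embedding": [1.0, 0.0, 0.0]},
    {"text_chunk": "b", "embedding": [0.0, 1.0, 0.0]},
    {"text_chunk": "c", "embedding": [1.0, 1.0, 0.0]},
]
for score, doc in top_k([1.0, 0.0, 0.0], toy_docs, k=2):
    print(f"{doc['text_chunk']}: {score:.3f}")
```

An exhaustive scan like this scales linearly with the collection size, which is why a vector index is used for anything beyond small experiments.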

## Step 3: Perform Vector Search in MongoDB Atlas

Now we'll use the `$vectorSearch` aggregation stage to find documents in our MongoDB collection that are semantically similar to our `user_query`.

In [None]:
pipeline = [
  {
    '$vectorSearch': {
      'queryVector': query_embedding,
      'path': 'embedding',          # The field where your embeddings are stored
      'numCandidates': 50,          # Number of candidate neighbors considered during the approximate search (must be >= limit)
      'limit': 3,                   # Number of top results to return
      'index': 'vector_index',      # Name of your Vector Search index (created in Lab 2)
      # 'filter': { 'source': { '$eq': 'specific_document' } }  # Optional: 'source' must be indexed as a filter field
    }
  },
  {
    '$project': {
      'text_chunk': 1,
      'source': 1,
      'score': { '$meta': 'vectorSearchScore' }, # Include similarity score
      '_id': 0 # Exclude _id from results
    }
  }
]

print("Performing vector search in MongoDB Atlas...")
retrieved_documents = list(collection.aggregate(pipeline))

if retrieved_documents:
    print(f"Retrieved {len(retrieved_documents)} relevant documents:")
    for doc in retrieved_documents:
        print(f"  - Score: {doc['score']:.4f}, Source: {doc['source']}, Text: {doc['text_chunk'][:70]}...")
else:
    print("No relevant documents found.")
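
If you expect to run several searches, the pipeline above can be produced by a small function instead of being edited by hand. The sketch below rebuilds the same two stages and adds the optional `source` filter. The function name `build_vector_search_pipeline` and its parameters are assumptions for illustration, and the filter only works if the Lab 2 index declares `source` as a filter field, which is not shown in this lab.

```python
def build_vector_search_pipeline(query_vector, *, index="vector_index", path="embedding",
                                 limit=3, num_candidates=50, source=None):
    """Return a $vectorSearch + $project pipeline equivalent to the one in Step 3."""
    if limit > num_candidates:
        # Atlas rejects a limit larger than numCandidates, so fail early.
        raise ValueError("limit must not exceed num_candidates.")

    stage = {
        "index": index,
        "path": path,
        "queryVector": query_vector,
        "numCandidates": num_candidates,
        "limit": limit,
    }
    if source is not None:
        # Pre-filter on the 'source' field (assumed to be indexed as a filter field).
        stage["filter"] = {"source": {"$eq": source}}

    return [
        {"$vectorSearch": stage},
        {
            "$project": {
                "text_chunk": 1,
                "source": 1,
                "score": {"$meta": "vectorSearchScore"},
                "_id": 0,
            }
        },
    ]
```

In the notebook, `list(collection.aggregate(build_vector_search_pipeline(query_embedding)))` would then stand in for the hand-written `pipeline`.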

## Step 4: Build Context for the LLM

We combine the `text_chunk` from the retrieved documents to form a single context string that will be passed to the LLM.

In [None]:
context = "\n".join([doc['text_chunk'] for doc in retrieved_documents])

print("\n--- Retrieved Context for LLM ---")
print(context)
print("---------------------------------")

if not context:
    print("Warning: No context was retrieved. The LLM might not be able to answer accurately.")
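
A plain join works for three short chunks, but with larger `limit` values the context can outgrow what you want to send to the model, and the model cannot tell which chunk came from where. The sketch below is one possible variation: it labels each chunk with its `source` and stops adding chunks once a character budget is reached. The function name `build_context`, the `[Source: ...]` label format, and the 4000-character default are assumptions for illustration, and a character count is only a rough proxy for tokens.

```python
def build_context(documents, max_chars=4000, separator="\n\n"):
    """Join chunks in ranked order, labelling each with its source, up to max_chars."""
    parts = []
    used = 0
    for doc in documents:
        chunk = doc.get("text_chunk", "").strip()
        if not chunk:
            continue  # Skip documents without usable text
        block = f"[Source: {doc.get('source', 'unknown')}]\n{chunk}"
        cost = len(block) + (len(separator) if parts else 0)
        if used + cost > max_chars:
            break  # Results are ranked, so drop the lowest-scoring remainder
        parts.append(block)
        used += cost
    return separator.join(parts)


sample_docs = [
    {"text_chunk": "Alpha", "source": "a.md"},
    {"text_chunk": "Beta", "source": "b.md"},
]
print(build_context(sample_docs))
```

Because the retrieved documents arrive sorted by score, truncating from the end keeps the most relevant chunks.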

## Step 5: Construct Augmented Prompt for LLM

We create the system instruction and user message that will be sent to Azure OpenAI. The system message tells the LLM how to behave, while the user message contains the retrieved context and the original question.

In [None]:
system_instruction = """You are a helpful assistant. Answer the user's question based on the provided context only.
If you cannot find the answer in the context, politely state that the information is not available."""

if context:
    user_message = f"""Context:
{context}

Question: {user_query}

Answer:"""
else:
    user_message = f"""I couldn't find relevant information for the following question.
Please state that the information is not available in the provided knowledge base.

Question: {user_query}

Answer:"""

print("\n--- System Instruction ---")
print(system_instruction)
print("\n--- User Message ---")
print(user_message)
print("-" * 30)

## Step 6: Generate Answer with Azure OpenAI (LangChain)

We use `AzureChatOpenAI` from the `langchain-openai` package to call your Azure OpenAI deployment. The model receives the system instruction and the augmented user message, then generates a grounded answer.

In [None]:
from langchain_openai import AzureChatOpenAI
from langchain_core.messages import HumanMessage, SystemMessage

# Read Azure OpenAI configuration from environment variables
azure_openai_api_key = os.environ.get("AZURE_OPENAI_API_KEY")
azure_openai_endpoint = os.environ.get("AZURE_OPENAI_ENDPOINT")
azure_openai_deployment = os.environ.get("AZURE_OPENAI_DEPLOYMENT_NAME")
azure_openai_api_version = os.environ.get("AZURE_OPENAI_API_VERSION", "2024-06-01")

if not all([azure_openai_api_key, azure_openai_endpoint, azure_openai_deployment]):
    raise ValueError(
        "Missing Azure OpenAI environment variables. "
        "Please set AZURE_OPENAI_API_KEY, AZURE_OPENAI_ENDPOINT, "
        "and AZURE_OPENAI_DEPLOYMENT_NAME in your .env file."
    )

# Initialize Azure OpenAI LLM via LangChain
llm = AzureChatOpenAI(
    azure_deployment=azure_openai_deployment,  # Deployment name, which is not necessarily the model name
    azure_endpoint=azure_openai_endpoint,
    api_key=azure_openai_api_key,
    api_version=azure_openai_api_version,
    temperature=0,  # Minimize randomness so answers stay close to the context
)

print(f"Azure OpenAI LLM initialized (deployment: {azure_openai_deployment})")

# Build the messages list
messages = [
    SystemMessage(content=system_instruction),
    HumanMessage(content=user_message),
]

# Invoke the LLM
print("\nCalling Azure OpenAI via LangChain...")
response = llm.invoke(messages)

print("\n--- LLM Response ---")
print(response.content)
print("--------------------")
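
The steps above can also be chained into a single function, which makes the flow easier to reuse and to test without network access. The sketch below is one way to do that under stated assumptions: `answer_question` and its three callables (`embed`, `search`, `generate`) are hypothetical adapters introduced here, not part of Voyage-AI, PyMongo, or LangChain. The prompt wording mirrors Step 5.

```python
def answer_question(question, embed, search, generate):
    """Chain the lab's steps: embed -> retrieve -> build prompt -> generate.

    All three callables are hypothetical adapters, not library APIs:
      embed(question)        -> list of floats
      search(query_vector)   -> list of dicts that contain "text_chunk"
      generate(user_message) -> answer string
    """
    query_vector = embed(question)
    documents = search(query_vector)
    context = "\n".join(doc["text_chunk"] for doc in documents)

    if context:
        user_message = f"Context:\n{context}\n\nQuestion: {question}\n\nAnswer:"
    else:
        # Same fallback wording as Step 5 when nothing was retrieved.
        user_message = (
            "I couldn't find relevant information for the following question.\n"
            "Please state that the information is not available in the provided knowledge base.\n\n"
            f"Question: {question}\n\nAnswer:"
        )
    return generate(user_message)


# Offline demonstration with stand-in callables (no network access needed).
demo_answer = answer_question(
    "What is new?",
    embed=lambda q: [0.0, 1.0],
    search=lambda vec: [{"text_chunk": "Version 2 adds dark mode."}],
    generate=lambda msg: f"(stub LLM received {len(msg)} characters)",
)
print(demo_answer)
```

In the notebook, the adapters would wrap the objects created earlier: `embed` around `vo.embed(...)`, `search` around `collection.aggregate(...)`, and `generate` around `llm.invoke(...)` with the system instruction from Step 5.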

## Conclusion

You've successfully built a complete RAG pipeline! Here's what we accomplished:

1. **Embedded** a user query using Voyage-AI.
2. **Retrieved** relevant documents from MongoDB Atlas via Vector Search.
3. **Constructed** an augmented prompt with retrieved context.
4. **Generated** a grounded answer using Azure OpenAI through LangChain.

In the next lab, we'll explore how to further improve retrieval quality using Voyage-AI's **reranking** capabilities.

In [None]:
# Don't forget to close the MongoDB client connection
client.close()
print("MongoDB client connection closed.")