RAG Pipeline Test with MongoDB Vector Store
Author: Sanjit Verma

In [2]:
from langchain_mongodb import MongoDBAtlasVectorSearch
from langchain_openai import OpenAIEmbeddings
from pymongo import MongoClient
import logging
import os, pprint
from langchain_core.output_parsers import StrOutputParser
from langchain_core.runnables import RunnablePassthrough
from langchain_openai import ChatOpenAI
from langchain.prompts import PromptTemplate
from dotenv import load_dotenv
import openai
from functools import lru_cache

load_dotenv()
logging.basicConfig(level=logging.INFO)
logger = logging.getLogger()
OPENAI_KEY = os.getenv("OPENAI_API_KEY")
MONGODB_URI = os.getenv('MONGODB_URI')
db_name = os.getenv('MONGODB_DATABASE')
collection_name = os.getenv('MONGODB_TEMPUSER')
vector_search_idx = os.getenv('MONGODB_VECTOR_INDEX_TEMPUSER')
openai.api_key = OPENAI_KEY

In [3]:
# Connect to db
client = MongoClient(MONGODB_URI)
db = client[db_name]
collection = db[collection_name]

In [4]:
vector_search = MongoDBAtlasVectorSearch( # retrieve documents from MongoDB collection
   embedding=OpenAIEmbeddings(disallowed_special=()),
   collection=collection,
   index_name=vector_search_idx,
)

retriever = vector_search.as_retriever(  # configure the vector store to retrieve documents by similarity
   search_type="similarity",
   search_kwargs={"k": 5, "score_threshold": 0.75}  # return at most the 5 most similar documents; the threshold is intended to keep only documents with a similarity score of 0.75 or higher
)

# Define the template that formats the input for the language model so that responses stay consistent
template = """
Use the following pieces of context to answer the question at the end.
If you don't know the answer or if it is not provided in the context, just say that you don't know, don't try to make up an answer.
If the answer is in the context, don't say that it is mentioned in the context.
Please provide a detailed explanation and if applicable, give examples or historical context.
{context}
Question: {question}
"""

custom_rag_prompt = PromptTemplate.from_template(template)
llm = ChatOpenAI()

def format_docs(docs):
   # Join the documents retrieved from MongoDB into a single string, with each document separated by two newlines.
   return "\n\n".join(doc.page_content for doc in docs)

rag_chain = (
   # The retriever first fetches the relevant documents, which are then piped (|) into format_docs.
   # RunnablePassthrough forwards the question to the next step in the chain without modification.
   {"context": retriever | format_docs, "question": RunnablePassthrough()}
   | custom_rag_prompt
   | llm
   | StrOutputParser()
)

# Demonstrates caching of answers (and therefore of the MongoDB queries behind them) per question; a very basic example
MAX_CACHE_SIZE = 100
@lru_cache(maxsize=MAX_CACHE_SIZE)
def cached_query(question):
    response = rag_chain.invoke(question)
    return response

RED = '\033[91m'
GREEN = '\033[92m'
YELLOW = '\033[93m'
RESET = '\033[0m'


In [8]:
question = "How were the notes used?"
answer = cached_query(question) # the question is converted to an embedding, and the most relevant documents for that embedding are fetched from the MongoDB collection


print(f"{YELLOW}Cache Info: {cached_query.cache_info()}{RESET}")
print(f"{RED}Question: {question}{RESET}")
print(f"{GREEN}Answer: {answer}{RESET}")

documents = retriever.invoke(question)  # invoke() replaces the deprecated get_relevant_documents()
print("\nSource documents:")
pprint.pprint(documents)

INFO:httpx:HTTP Request: POST https://api.openai.com/v1/embeddings "HTTP/1.1 200 OK"
INFO:httpx:HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"


Cache Info: CacheInfo(hits=1, misses=3, maxsize=100, currsize=3)
Question: How were the notes used?
Answer: The context provided does not specify what type of notes are being referred to, so it is difficult to provide a specific answer. 

Notes could refer to musical notes, in which case they are used in music to represent the pitch and duration of a sound. Musicians read and interpret notes on a musical staff to play or sing a piece of music. 

Notes could also refer to written messages or reminders. In this case, notes are used as a form of communication or to keep track of information. People use notes to jot down ideas, important points, to-do lists, or messages for themselves or others.

Without more specific information on the type of notes being discussed, it is not possible to provide a more detailed answer.


INFO:httpx:HTTP Request: POST https://api.openai.com/v1/embeddings "HTTP/1.1 200 OK"



Source documents:
[]
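
The cache statistics printed in the output above come from functools.lru_cache. As a minimal sketch of how those counters behave, the example below wraps a hypothetical stand-in function (answer_question, which is not part of the notebook) instead of the real chain, so it runs without MongoDB or OpenAI access.

```python
from functools import lru_cache

call_count = 0  # counts how often the function body actually runs


@lru_cache(maxsize=100)
def answer_question(question: str) -> str:
    """Hypothetical stand-in for rag_chain.invoke; makes no network calls."""
    global call_count
    call_count += 1
    return f"answer to: {question}"


answer_question("How were the notes used?")  # miss: the body runs
answer_question("How were the notes used?")  # hit: served from the cache
answer_question("how were the notes used?")  # miss: a different string is a different key

info = answer_question.cache_info()
print(info)
print("body executed", call_count, "times")
```

The cache is keyed on the exact argument, so differently worded or differently cased questions are cached separately. Cached answers also do not change when the collection changes; answer_question.cache_clear() empties the cache in that case.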



STEPS THAT OCCUR IN THE BACKGROUND when .invoke() is called on the rag_chain instance 

Step 1 INVOCATION: The RAG chain is invoked with the question. This method call triggers the chain's operation, starting with the retriever.

Step 2 RETRIEVER FUNC: The retriever converts the question into an embedding and uses it to perform a similarity search 
in the MongoDB collection, retrieving the documents that are most similar to the query.
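
To make the idea in Step 2 concrete, here is a rough pure-Python sketch of a top-k similarity search with a score threshold. It is only an illustration: in the notebook the search runs inside MongoDB Atlas through the vector index, and the cosine metric, the three-dimensional toy vectors, and the helper names (cosine_similarity, top_k) are all assumptions made for this example.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def top_k(query_vec, docs, k=5, score_threshold=0.75):
    """Return up to k (score, text) pairs scoring at least score_threshold, best first."""
    scored = [(cosine_similarity(query_vec, vec), text) for text, vec in docs]
    kept = [pair for pair in scored if pair[0] >= score_threshold]
    kept.sort(key=lambda pair: pair[0], reverse=True)
    return kept[:k]


# Toy "embeddings"; real embeddings have far more dimensions.
toy_docs = [
    ("notes about lectures", [1.0, 0.0, 0.0]),
    ("notes about music", [0.8, 0.6, 0.0]),
    ("unrelated text", [0.0, 0.0, 1.0]),
]

results = top_k([1.0, 0.0, 0.0], toy_docs, k=5, score_threshold=0.75)
for score, text in results:
    print(f"{score:.2f}  {text}")

# A query that is not close enough to any document yields an empty list.
no_match = top_k([0.0, 1.0, 0.0], toy_docs, k=5, score_threshold=0.75)
print(no_match)
```

Both limits apply in this sketch: k caps how many results come back, and the threshold can reduce that number further, down to zero.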

Step 3 DOC FORMATTING: The retrieved documents are passed to the next step in the chain, the format_docs function. It joins the documents 
into a single string, with each document separated by two newlines (in other words, it post-processes the retriever's output).
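
The formatting in Step 3 can be tried in isolation. The sketch below reuses the notebook's format_docs but feeds it a hypothetical Doc dataclass, which models only the page_content attribute and stands in for the documents the retriever would return.

```python
from dataclasses import dataclass


@dataclass
class Doc:
    """Hypothetical stand-in for a retrieved document; only page_content is modeled."""
    page_content: str


def format_docs(docs):
    # Same logic as in the notebook: join the contents with a blank line between documents.
    return "\n\n".join(doc.page_content for doc in docs)


context = format_docs([Doc("First chunk."), Doc("Second chunk.")])
print(repr(context))

# With no retrieved documents the context is an empty string.
empty_context = format_docs([])
print(repr(empty_context))
```

When the retriever returns nothing, the prompt still gets built, only with an empty context section.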

Step 4 CONTEXT ASSEMBLY: The formatted documents and the question are passed to custom_rag_prompt to form the complete prompt input. RunnablePassthrough 
matters here because it forwards the question without modification.

Step 5 PROMPT TEMPLATE: The template receives the output of step 4 and fills in its placeholders: {context} is replaced by the output of format_docs and {question} is replaced by the output of RunnablePassthrough.
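
Steps 4 and 5 amount to placeholder substitution. The sketch below mimics that with plain str.format on a shortened version of the template. It is a stand-in for PromptTemplate, not a description of how that class works internally, and fill_template is a hypothetical helper.

```python
# Shortened template with the same two placeholders as the notebook's template.
TEMPLATE = (
    "Use the following pieces of context to answer the question at the end.\n"
    "{context}\n"
    "Question: {question}\n"
)


def fill_template(context: str, question: str) -> str:
    """Substitute both placeholders; plain str.format stands in for PromptTemplate."""
    return TEMPLATE.format(context=context, question=question)


prompt_text = fill_template(
    context="Notes were used to prepare for the exam.",
    question="How were the notes used?",
)
print(prompt_text)
```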

Step 6 LANGUAGE MODEL: The filled-in prompt is passed to the language model, which generates a response that draws on 
the provided context and directly addresses the query.

Step 7 POST PARSE: StrOutputParser takes the model output and converts it into a string. If the output is already a string, it is returned as is; if it is a structured object, it is converted to a string.
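
The seven steps can be traced end to end with a toy pipeline. This is a sketch of the composition idea only, not LangChain's implementation: the Step class, the fan_out helper, and every stub (toy_retriever, toy_llm, and so on) are hypothetical, and the "model" simply echoes its prompt so the example runs offline.

```python
class Step:
    """Toy runnable: wraps a function and supports `|` composition."""

    def __init__(self, func):
        self.func = func

    def invoke(self, value):
        return self.func(value)

    def __or__(self, other):
        # self | other: run self first, then feed its output to other.
        return Step(lambda value: other.invoke(self.invoke(value)))


def fan_out(**branches):
    """Run every branch on the same input and collect the results in a dict."""
    return Step(lambda value: {name: step.invoke(value) for name, step in branches.items()})


passthrough = Step(lambda value: value)                                  # forwards the question unchanged
toy_retriever = Step(lambda question: ["Notes were used to study."])      # Step 2 stub
join_docs = Step(lambda docs: "\n\n".join(docs))                          # Step 3
fill_prompt = Step(lambda d: f"{d['context']}\nQuestion: {d['question']}")  # Steps 4-5
toy_llm = Step(lambda prompt: {"content": f"ECHO[{prompt}]"})             # Step 6 stub
to_text = Step(lambda msg: msg if isinstance(msg, str) else msg["content"])  # Step 7

toy_chain = (
    fan_out(context=toy_retriever | join_docs, question=passthrough)
    | fill_prompt
    | toy_llm
    | to_text
)

result = toy_chain.invoke("How were the notes used?")  # Step 1
print(result)
```

The single input string reaches both branches of the fan-out: one branch turns it into context, the other passes it along untouched as the question.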
