[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/mongodb-developer/GenAI-Showcase/blob/main/notebooks/rag/mongodb-langchain-cache-memory.ipynb)

[![View Article](https://img.shields.io/badge/View%20Article-blue)](https://www.mongodb.com/developer/products/atlas/advanced-rag-langchain-mongodb/)


# Adding Semantic Caching and Memory to your RAG Application using MongoDB and LangChain

In this notebook, we will see how to use `MongoDBAtlasSemanticCache` and `MongoDBChatMessageHistory` from the `langchain-mongodb` package to add semantic caching and conversation memory to your RAG application.


## Step 1: Install required libraries

- **datasets**: Python library to get access to datasets available on Hugging Face Hub

- **langchain**: Python toolkit for LangChain

- **langchain-mongodb**: Python package to use MongoDB as a vector store, semantic cache, chat history store etc. in LangChain

- **langchain-openai**: Python package to use OpenAI models with LangChain

- **pymongo**: Python toolkit for MongoDB

- **pandas**: Python library for data analysis, exploration, and manipulation

In [45]:
! pip install -qU datasets langchain langchain-mongodb langchain-openai pymongo pandas

## Step 2: Setup pre-requisites

* Set the MongoDB connection string. Follow the steps [here](https://www.mongodb.com/docs/manual/reference/connection-string/) to get the connection string from the Atlas UI.

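* Set the API key for your LLM provider. The code below prompts for a Groq API key. If you switch to OpenAI models instead, the steps to obtain an OpenAI API key are [here](https://help.openai.com/en/articles/4936850-where-do-i-find-my-openai-api-key).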
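When the notebook is run repeatedly, it can be convenient to read secrets from environment variables and prompt only when they are missing. The following is a minimal sketch of that pattern. The helper `read_secret` and the variable name `DEMO_MONGODB_URI` are hypothetical and are not part of the original notebook.

```python
import getpass
import os


def read_secret(env_var: str, prompt: str) -> str:
    """Return the secret from the environment, prompting only if it is unset."""
    value = os.environ.get(env_var)
    if value:
        return value
    # Fall back to an interactive prompt that does not echo the input
    return getpass.getpass(prompt)


# Hypothetical variable name and placeholder value, set here only for the demo
os.environ["DEMO_MONGODB_URI"] = "mongodb+srv://user:pass@cluster.example.mongodb.net"
uri = read_secret("DEMO_MONGODB_URI", "Enter your MongoDB connection string:")
```

Because the variable is set, the call returns immediately without prompting.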

In [46]:
import getpass

In [4]:
MONGODB_URI = getpass.getpass("Enter your MongoDB connection string:")

Enter your MongoDB connection string:··········


In [5]:
GROQ_API_KEY = getpass.getpass("Enter your GROQ API key:")

Enter your GROQ API key:··········


In [6]:
# Optional: enable LangSmith tracing, which is useful for debugging
import os

os.environ["LANGCHAIN_TRACING_V2"] = "true"
os.environ["LANGCHAIN_API_KEY"] = getpass.getpass()

··········


## Step 3: Download the dataset

We will be using MongoDB's [embedded_movies](https://huggingface.co/datasets/MongoDB/embedded_movies) dataset.

In [47]:
import pandas as pd
from datasets import load_dataset

In [48]:
# Load MongoDB's embedded_movies dataset from Hugging Face
data = load_dataset("MongoDB/embedded_movies")

In [49]:
df = pd.DataFrame(data["train"])

## Step 4: Data analysis

Preview the dataset to confirm it looks as expected, drop records with missing values, and rename fields where needed.

In [50]:
# Previewing the contents of the data
df.head(1)

| | plot | runtime | genres | fullplot | directors | writers | countries | poster | languages | cast | title | num_mflix_comments | rated | imdb | awards | type | metacritic | plot_embedding |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 0 | Young Pauline is left a lot of money when her ... | 199.0 | [Action] | Young Pauline is left a lot of money when her ... | [Louis J. Gasnier, Donald MacKenzie] | [Charles W. Goddard (screenplay), Basil Dickey... | [USA] | https://m.media-amazon.com/images/M/MV5BMzgxOD... | [English] | [Pearl White, Crane Wilbur, Paul Panzer, Edwar... | The Perils of Pauline | 0 | | {'id': 4465, 'rating': 7.6, 'votes': 744} | {'nominations': 0, 'text': '1 win.', 'wins': 1} | movie | | [0.0007293965299999999, -0.026834568000000003,... |


In [51]:
# Only keep records where the fullplot field is not null
df = df[df["fullplot"].notna()]

In [52]:
# Rename the embedding field to "embedding", the default embedding field name expected by the LangChain MongoDB vector store
df.rename(columns={"plot_embedding": "embedding"}, inplace=True)
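Before ingesting, it can help to confirm that the cleaning steps had the intended effect. The following is a minimal sketch that checks a list of records (the shape produced by `df.to_dict("records")`) for a non-null text field and a consistent embedding length. The helper `check_records` and the toy records are hypothetical; the real dataset is not loaded here.

```python
from typing import Any


def check_records(
    records: list[dict[str, Any]], text_key: str, embedding_key: str
) -> int:
    """Validate every record and return the shared embedding dimension."""
    dims = set()
    for i, rec in enumerate(records):
        if rec.get(text_key) is None:
            raise ValueError(f"record {i} has no {text_key!r}")
        if embedding_key not in rec:
            raise ValueError(f"record {i} has no {embedding_key!r}")
        dims.add(len(rec[embedding_key]))
    # An empty input or mixed vector lengths both fail this check
    if len(dims) != 1:
        raise ValueError(f"inconsistent embedding dimensions: {sorted(dims)}")
    return dims.pop()


# Toy data standing in for the cleaned movie records
toy_records = [
    {"title": "A", "fullplot": "First plot", "embedding": [0.1, 0.2, 0.3]},
    {"title": "B", "fullplot": "Second plot", "embedding": [0.4, 0.5, 0.6]},
]
dimension = check_records(toy_records, text_key="fullplot", embedding_key="embedding")
```

The returned dimension is also the value a vector search index over this field would need to be configured with.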

## Step 5: Create a simple RAG chain using MongoDB as the vector store

In [56]:
from langchain_mongodb import MongoDBAtlasVectorSearch
from pymongo import MongoClient

# Initialize the MongoDB Python client
client = MongoClient(
    MONGODB_URI, appname="devrel.showcase.mongodb_langchain_cache_memory"
)

DB_NAME = "RagDB"
COLLECTION_NAME = "MongoCacheDBS"
ATLAS_VECTOR_SEARCH_INDEX_NAME = "test_vector_index"
collection = client[DB_NAME][COLLECTION_NAME]

In [57]:
# Delete any existing records in the collection
collection.delete_many({})

DeleteResult({'n': 0, 'electionId': ObjectId('7fffffff00000000000000ba'), 'opTime': {'ts': Timestamp(1741885795, 2), 't': 186}, 'ok': 1.0, '$clusterTime': {'clusterTime': Timestamp(1741885795, 2), 'signature': {'hash': b'\xbb\x11\xf28\x00\x82\x18\xc1\xfb\xb1\xed8\xe5\xe3\xc7\xe6\x08\x84\xf0\x92', 'keyId': 7418469192430518274}}, 'operationTime': Timestamp(1741885795, 2)}, acknowledged=True)

In [58]:
# Data Ingestion
records = df.to_dict("records")
collection.insert_many(records)

print("Data ingestion into MongoDB completed")

Data ingestion into MongoDB completed


In [59]:
%pip install -qU langchain-huggingface

In [60]:
from langchain_huggingface import HuggingFaceEmbeddings

embeddings = HuggingFaceEmbeddings(model_name="sentence-transformers/all-mpnet-base-v2")

In [61]:
# Vector Store Creation
vector_store = MongoDBAtlasVectorSearch.from_connection_string(
    connection_string=MONGODB_URI,
    namespace=f"{DB_NAME}.{COLLECTION_NAME}",
    embedding=embeddings,
    index_name=ATLAS_VECTOR_SEARCH_INDEX_NAME,
    text_key="fullplot",
)
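The vector store refers to an Atlas Vector Search index by name, but this notebook does not show that index being created. The following is a minimal sketch of what an index definition might look like. It assumes the vectors live in the `embedding` field and that the index dimension matches the query-time embedding model (`sentence-transformers/all-mpnet-base-v2` produces 768-dimensional vectors). The helper `build_vector_index_definition` is hypothetical, and the choice of cosine similarity is an assumption.

```python
def build_vector_index_definition(
    path: str, num_dimensions: int, similarity: str = "cosine"
) -> dict:
    """Build an Atlas Vector Search index definition for one vector field."""
    if similarity not in {"cosine", "euclidean", "dotProduct"}:
        raise ValueError(f"unsupported similarity: {similarity}")
    if num_dimensions <= 0:
        raise ValueError("num_dimensions must be positive")
    return {
        "fields": [
            {
                "type": "vector",
                "path": path,
                "numDimensions": num_dimensions,
                "similarity": similarity,
            }
        ]
    }


# Assumption: 768 matches the output size of all-mpnet-base-v2
index_definition = build_vector_index_definition("embedding", 768)
```

If the stored vectors were produced by a different model than the one used for queries, their dimensions or vector spaces may not match, and retrieval can return poor or empty results.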

In [62]:
# Using the MongoDB vector store as a retriever in a RAG chain
retriever = vector_store.as_retriever(search_type="similarity", search_kwargs={"k": 5})
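Conceptually, a similarity retriever with `k` results ranks documents by how close their vectors are to the query vector and keeps the top `k`. The following is a brute-force sketch of that idea using cosine similarity on toy two-dimensional vectors. Atlas Vector Search itself uses an index and the similarity function configured on it, so this is an illustration rather than a description of its internals. The function names are hypothetical.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors, or 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(query: list[float], docs: list[tuple[str, list[float]]], k: int) -> list[str]:
    """Return the texts of the k documents most similar to the query."""
    scored = [(cosine_similarity(query, vec), text) for text, vec in docs]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [text for _, text in scored[:k]]


# Toy documents with hand-picked vectors
toy_docs = [
    ("comedy", [1.0, 0.0]),
    ("drama", [0.0, 1.0]),
    ("dramedy", [0.7, 0.7]),
]
results = top_k([1.0, 0.1], toy_docs, k=2)
```

With this query, the two closest toy documents are `comedy` and `dramedy`, in that order.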

In [63]:
%pip install -qU langchain-groq

In [64]:
from langchain_core.output_parsers import StrOutputParser
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.runnables import RunnablePassthrough
from langchain_groq.chat_models import ChatGroq

# Generate context using the retriever, and pass the user question through
retrieve = {
    "context": retriever | (lambda docs: "\n\n".join([d.page_content for d in docs])),
    "question": RunnablePassthrough(),
}
template = """Answer the question based only on the following context: \
{context}

Question: {question}
"""
# Defining the chat prompt
prompt = ChatPromptTemplate.from_template(template)

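# Defining the model to be used for chat completion
# To use an OpenAI model instead, import ChatOpenAI from langchain_openai,
# define OPENAI_API_KEY, and uncomment the next line:
# model = ChatOpenAI(temperature=0, openai_api_key=OPENAI_API_KEY)
model = ChatGroq(model="Llama3-70b-8192", api_key=GROQ_API_KEY)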

# Parse output as a string
parse_output = StrOutputParser()

# Naive RAG chain
naive_rag_chain = retrieve | prompt | model | parse_output

In [65]:
naive_rag_chain.invoke("What is the best movie to watch when sad?")

'Based on the context provided (which is none), I would say that the best movie to watch when sad is a personal preference. What one person finds comforting and uplifting might not be the same for another. Some people might prefer a light-hearted rom-com to take their mind off their sadness, while others might find solace in a more introspective and emotional drama.\n\nThat being said, if I had to make a general recommendation, I\'d suggest a movie that has a positive and uplifting message, with a dash of humor and heart. Perhaps a classic like "When Harry Met Sally" or "The Princess Bride"? Or maybe a more modern pick like "The Greatest Showman" or "La La Land"? Ultimately, the best movie to watch when sad is one that resonates with your personal tastes and helps take your mind off your troubles.'
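To see what the model receives, the prompt assembly in the naive chain can be sketched without LangChain. The example below joins document contents with blank lines and fills the same template text. `Doc` and `build_prompt` are hypothetical stand-ins for the LangChain document and prompt objects. The sketch also shows that when the retriever returns no documents, the context slot is simply empty, which is one plausible reason for a model to report that no context was provided.

```python
from dataclasses import dataclass


@dataclass
class Doc:
    """Minimal stand-in for a retrieved document."""

    page_content: str


TEMPLATE = (
    "Answer the question based only on the following context: {context}\n"
    "\n"
    "Question: {question}\n"
)


def build_prompt(docs: list[Doc], question: str) -> str:
    """Join document contents with blank lines and fill the template."""
    context = "\n\n".join(d.page_content for d in docs)
    return TEMPLATE.format(context=context, question=question)


prompt_with_docs = build_prompt([Doc("Plot A"), Doc("Plot B")], "What should I watch?")
prompt_without_docs = build_prompt([], "What should I watch?")
```

Printing `prompt_without_docs` makes an empty retrieval easy to spot before any model call is made.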

## Step 6: Create a RAG chain with chat history

In [27]:
from langchain_core.prompts import MessagesPlaceholder
from langchain_core.runnables.history import RunnableWithMessageHistory
from langchain_mongodb.chat_message_histories import MongoDBChatMessageHistory

In [28]:
def get_session_history(session_id: str) -> MongoDBChatMessageHistory:
    return MongoDBChatMessageHistory(
        MONGODB_URI, session_id, database_name=DB_NAME, collection_name="history"
    )
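The role of `get_session_history` is to map a session ID to that session's stored messages, so that repeated calls with the same ID see the same conversation. The following in-memory sketch illustrates that contract. `InMemoryHistory` and `get_demo_history` are hypothetical stand-ins; the notebook itself persists messages in MongoDB through `MongoDBChatMessageHistory`.

```python
from collections import defaultdict


class InMemoryHistory:
    """Toy message store; keeps (role, content) pairs in insertion order."""

    def __init__(self) -> None:
        self.messages: list[tuple[str, str]] = []

    def add_message(self, role: str, content: str) -> None:
        self.messages.append((role, content))


# One history object per session ID, created on first access
_histories: defaultdict[str, InMemoryHistory] = defaultdict(InMemoryHistory)


def get_demo_history(session_id: str) -> InMemoryHistory:
    """Return the history for a session, creating an empty one if needed."""
    return _histories[session_id]


get_demo_history("1").add_message("human", "What is the best movie to watch when sad?")
get_demo_history("1").add_message("ai", "Here are some suggestions.")
session_one = get_demo_history("1").messages
session_two = get_demo_history("2").messages
```

Session `"1"` accumulates both messages, while session `"2"` starts empty, which is why the `session_id` passed at invocation time determines which conversation the model sees.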

In [29]:
# Given a follow-up question and history, create a standalone question
standalone_system_prompt = """
Given a chat history and a follow-up question, rephrase the follow-up question to be a standalone question. \
Do NOT answer the question, just reformulate it if needed, otherwise return it as is. \
Only return the final standalone question. \
"""
standalone_question_prompt = ChatPromptTemplate.from_messages(
    [
        ("system", standalone_system_prompt),
        MessagesPlaceholder(variable_name="history"),
        ("human", "{question}"),
    ]
)

question_chain = standalone_question_prompt | model | parse_output

In [30]:
# Generate context by passing output of the question_chain i.e. the standalone question to the retriever
retriever_chain = RunnablePassthrough.assign(
    context=question_chain
    | retriever
    | (lambda docs: "\n\n".join([d.page_content for d in docs]))
)
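The data flow here is: keep the incoming `question` and `history`, and add a `context` key computed by rewriting the question, retrieving documents for the rewritten question, and joining them. The following sketch mimics that flow with plain functions. `assign`, `fake_question_chain`, and `fake_retriever` are hypothetical stand-ins for `RunnablePassthrough.assign`, the LLM-backed question chain, and the vector store retriever.

```python
from typing import Any, Callable


def assign(inputs: dict[str, Any], **computed: Callable[[dict], Any]) -> dict[str, Any]:
    """Return a copy of the inputs with extra keys computed from them."""
    return {**inputs, **{key: fn(inputs) for key, fn in computed.items()}}


def fake_question_chain(inputs: dict[str, Any]) -> str:
    """Stand-in for the LLM call that rewrites the follow-up question."""
    return f"{inputs['question']} (given {len(inputs['history'])} earlier messages)"


def fake_retriever(query: str) -> list[str]:
    """Stand-in for vector search; returns one fabricated toy document."""
    return [f"doc about: {query}"]


def join_docs(docs: list[str]) -> str:
    return "\n\n".join(docs)


inputs = {"question": "Something lighter?", "history": ["m1", "m2"]}
result = assign(
    inputs,
    context=lambda x: join_docs(fake_retriever(fake_question_chain(x))),
)
```

The result still contains `question` and `history` unchanged, so the downstream prompt can use all three keys.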

In [31]:
# Create a prompt that includes the context, history and the follow-up question
rag_system_prompt = """Answer the question based only on the following context: \
{context}
"""
rag_prompt = ChatPromptTemplate.from_messages(
    [
        ("system", rag_system_prompt),
        MessagesPlaceholder(variable_name="history"),
        ("human", "{question}"),
    ]
)

In [32]:
# RAG chain
rag_chain = retriever_chain | rag_prompt | model | parse_output

In [33]:
# RAG chain with history
with_message_history = RunnableWithMessageHistory(
    rag_chain,
    get_session_history,
    input_messages_key="question",
    history_messages_key="history",
)
with_message_history.invoke(
    {"question": "What is the best movie to watch when sad?"},
    {"configurable": {"session_id": "1"}},
)

'I\'m sorry, but I didn\'t receive any context to provide a specific answer. However, I can give you some general suggestions for movies that might help lift your mood when you\'re feeling sad:\n\n1. Comedies: Watch a light-hearted comedy that makes you laugh, such as "The Hangover," "Bridesmaids," or "Monty Python and the Holy Grail."\n2. Uplifting stories: Movies with inspiring true stories or overcoming adversity can be uplifting, like "Rocky," "Erin Brockovich," or "Hacksaw Ridge."\n3. Animated films: Colorful and imaginative animated movies can take your mind off your sadness, such as "Inside Out," "Zootopia," or "The Lego Movie."\n4. Rom-coms: Romantic comedies can provide a feel-good escape, like "When Harry Met Sally," "Sleepless in Seattle," or "Crazy Rich Asians."\n5. Nostalgic favorites: Re-watch a movie that brings back happy memories or reminds you of a better time, such as a childhood favorite or a film that you loved growing up.\n\nRemember, everyone\'s tastes are differ

In [34]:
with_message_history.invoke(
    {
        "question": "Hmmm..I don't want to watch that one. Can you suggest something else?"
    },
    {"configurable": {"session_id": "1"}},
)

'I apologize, but I didn\'t provide a specific movie suggestion earlier. If you\'re open to more ideas, I can offer some additional suggestions:\n\n1. "Elf" (2003) - A Christmas comedy classic that\'s funny and heartwarming.\n2. "The Princess Bride" (1987) - A swashbuckling adventure with a sweet love story and memorable characters.\n3. "The Sound of Music" (1965) - A timeless musical with iconic songs and a uplifting story.\n4. "Freaky Friday" (2003) - A fun and lighthearted mother-daughter comedy.\n5. "Amélie" (2001) - A quirky and charming French romantic comedy.\n\nIf none of these appeal to you, please let me know what type of movie you\'re in the mood for (e.g., drama, action, fantasy, etc.), and I can try to suggest something else!'

In [35]:
with_message_history.invoke(
    {"question": "How about something more light?"},
    {"configurable": {"session_id": "1"}},
)

'If you\'re looking for something even lighter, here are some more suggestions:\n\n1. "Clueless" (1995) - A classic teen comedy with a fun and upbeat tone.\n2. "Miss Congeniality" (2000) - A silly and entertaining comedy about a tomboy FBI agent going undercover at a beauty pageant.\n3. "10 Things I Hate About You" (1999) - A lighthearted teen rom-com with a fun \'90s vibe.\n4. "My Big Fat Greek Wedding" (2002) - A heartwarming and hilarious comedy about cultural differences and love.\n5. "The Devil Wears Prada" (2006) - A fun and fashionable comedy about the world of high fashion.\n\nThese movies are all relatively light and easy to watch, with a focus on humor and entertainment. Let me know if any of these appeal to you!'

## Step 7: Get faster responses using Semantic Cache

**NOTE:** The semantic cache is keyed on the input (prompt) sent to the LLM. In retrieval chains, the retrieved documents are part of that prompt, so if they change between runs, semantically similar queries can still result in cache misses.
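The effect described in the note can be illustrated with a toy cache. The sketch below embeds prompts as word counts, compares them with cosine similarity, and returns a cached response only when the best match meets a threshold. Everything here is hypothetical: `ToySemanticCache`, the word-count embedding, and the 0.8 threshold are illustrative choices, whereas the real cache uses the configured embedding model and Atlas Vector Search.

```python
import math
from collections import Counter
from typing import Optional


def embed(text: str) -> Counter:
    """Toy embedding: lowercase word counts."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[word] * b[word] for word in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class ToySemanticCache:
    def __init__(self, threshold: float) -> None:
        self.threshold = threshold
        self.entries: list[tuple[Counter, str]] = []

    def update(self, prompt: str, response: str) -> None:
        self.entries.append((embed(prompt), response))

    def lookup(self, prompt: str) -> Optional[str]:
        """Return the closest cached response if it is similar enough."""
        query = embed(prompt)
        best = max(self.entries, key=lambda e: cosine(query, e[0]), default=None)
        if best is not None and cosine(query, best[0]) >= self.threshold:
            return best[1]
        return None


cache = ToySemanticCache(threshold=0.8)
cache.update("context: plot one plot two question: best movie when sad", "cached answer")

# Same retrieved context, slightly different wording of the question
same_docs = cache.lookup("context: plot one plot two question: best film when sad")
# Same question, but the retrieved context is different
different_docs = cache.lookup(
    "context: totally unrelated retrieved text here question: best movie when sad"
)
```

The first lookup hits the cache because the full prompt is nearly identical. The second misses even though the question is unchanged, because the different context lowers the similarity of the prompt as a whole.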

In [41]:
from langchain_core.globals import set_llm_cache
from langchain_mongodb.cache import MongoDBAtlasSemanticCache

set_llm_cache(
    MongoDBAtlasSemanticCache(
        connection_string=MONGODB_URI,
        embedding=embeddings,
        collection_name="semantic_cache",
        database_name=DB_NAME,
        index_name=ATLAS_VECTOR_SEARCH_INDEX_NAME,
        # wait_until_ready=True,  # Optional, waits until the cache is ready to be used
    )
)

In [42]:
%%time
naive_rag_chain.invoke("What is the best movie to watch when sad?")

CPU times: user 1.43 s, sys: 15.9 ms, total: 1.44 s
Wall time: 3.9 s


"I'm so sorry to hear you're feeling sad! Unfortunately, I don't have any specific information about movies to recommend in this context. Can you give me a little more information about what you're in the mood for? Do you want a rom-com to lift your spirits, an animated film to take your mind off things, or perhaps a classic tearjerker to allow yourself a good cry?"

In [43]:
%%time
naive_rag_chain.invoke("What is the best movie to watch when sad?")

CPU times: user 1.32 s, sys: 11.4 ms, total: 1.33 s
Wall time: 2.88 s


'Unfortunately, I don\'t have enough information to give a specific answer. However, I can provide some general suggestions. When people are feeling sad, they often look for movies that can take their mind off their emotions, provide comfort, or offer a distraction. Some popular options might include:\n\n* Uplifting romantic comedies like "When Harry Met Sally" or "The Proposal"\n* Classic feel-good movies like "Forrest Gump" or "The Sound of Music"\n* Light-hearted animated films like "Toy Story" or "Finding Nemo"\n* Inspirational true stories like "Erin Brockovich" or "Hacksaw Ridge"\n* Or even a good cry-fest like "Titanic" or "The Fault in Our Stars"\n\nUltimately, the best movie to watch when sad is a personal preference. What kind of movie do you usually turn to when you\'re feeling down?'

In [44]:
%%time
naive_rag_chain.invoke("Which movie do I watch when sad?")

CPU times: user 1.37 s, sys: 7.68 ms, total: 1.38 s
Wall time: 2.75 s


"I apologize, but there is no context provided to answer this question. The conversation just started, and no information has been shared about your preferences or habits. Could you please provide more context or information about your favorite movies or what you usually watch when you're feeling sad? I'd be happy to help!"