
In [1]:
!pip install -q faiss-cpu

In [2]:
!pip install -qU langchain langchain-community langchain-text-splitters

In [3]:
!pip install -qU langchain-openai langchain-faiss

In [4]:
from langchain_core.prompts import ChatPromptTemplate, MessagesPlaceholder
from langchain_core.messages import HumanMessage, AIMessage

In [5]:
from langchain_core.runnables import RunnableWithMessageHistory
from langchain_core.chat_history import BaseChatMessageHistory, InMemoryChatMessageHistory

In [6]:
from langchain_text_splitters import RecursiveCharacterTextSplitter
from langchain_community.vectorstores import FAISS
from langchain_openai import OpenAIEmbeddings, ChatOpenAI

In [7]:
from langchain_core.runnables import RunnablePassthrough

In [8]:
import os, getpass

In [9]:
os.environ["OPENAI_API_KEY"] = getpass.getpass() # can be replaced by Ollama settings

In [10]:
llm = ChatOpenAI(model="gpt-4o-mini")

In [11]:
texts = [
    "LangChain helps developers build LLM applications.",
    "FAISS is used for vector similarity search.",
    "Chat history must be manually maintained in LangChain 1.1.",
    "Retrievers are used in RAG pipelines.",
    "OpenAI embeddings create vector representations."
]

In [12]:
embeddings = OpenAIEmbeddings()

In [13]:
db = FAISS.from_texts(texts, embeddings)

In [14]:
retriever = db.as_retriever()
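
To make the retrieval step less of a black box, here is a minimal pure-Python sketch of nearest-neighbour lookup over embedded texts. It is not how FAISS is implemented; the 2-D vectors, `l2_distance`, and `top_k` are hypothetical stand-ins, and the use of Euclidean distance is an assumption about the default index.

```python
import math

def l2_distance(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def top_k(query_vec, indexed, k=4):
    """Return the k texts whose vectors lie closest to query_vec."""
    ranked = sorted(indexed, key=lambda item: l2_distance(query_vec, item[1]))
    return [text for text, _ in ranked[:k]]

# Hypothetical 2-D "embeddings"; real embeddings have far more dimensions.
toy_index = [
    ("FAISS is used for vector similarity search.", [0.9, 0.1]),
    ("Retrievers are used in RAG pipelines.", [0.6, 0.5]),
    ("LangChain helps developers build LLM applications.", [0.1, 0.9]),
]

nearest = top_k([1.0, 0.0], toy_index, k=2)
print(nearest)
```

With only five short texts in the real store, a retriever that returns several documents per query will hand most of the corpus to the prompt, which is why the first answer in the output mentions LangChain and OpenAI embeddings as well as FAISS.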

In [15]:
store = {}
def get_session_history(session_id: str) -> BaseChatMessageHistory:
    if session_id not in store:
        store[session_id] = InMemoryChatMessageHistory()
    return store[session_id]

In [16]:
rag_prompt_with_history = ChatPromptTemplate.from_messages([
    ("system", "Use the retrieved context to answer the user."),
    MessagesPlaceholder("history"), # This is where the history will be injected
    ("human", "{question}\n\nContext:\n{context}")
])
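
The effect of the history placeholder can be pictured with plain tuples: prior turns are spliced in between the system message and the new human message. This is only a sketch of the resulting message order, not the prompt template's implementation; `build_messages` is a hypothetical helper.

```python
def build_messages(history, question, context):
    """Assemble messages in the same order as the prompt template."""
    return [
        ("system", "Use the retrieved context to answer the user."),
        *history,  # the placeholder expands to zero or more prior messages
        ("human", f"{question}\n\nContext:\n{context}"),
    ]

history = [("human", "What is FAISS?"), ("ai", "A vector search library.")]
messages = build_messages(
    history,
    "What did I ask earlier?",
    "FAISS is used for vector similarity search.",
)
for role, text in messages:
    print(f"{role}: {text}")
```

Because the earlier turns reach the model as ordinary messages, a follow-up such as "What did I ask earlier?" can be answered from the conversation alone, even when the retrieved context is unrelated.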

In [17]:
def get_context_from_retriever(question_dict):
    # The input to this function will be a dictionary, e.g., {'question': '...'}
    docs = retriever.invoke(question_dict["question"])
    return "\n".join([d.page_content for d in docs])

In [23]:
runnable = RunnablePassthrough.assign(context=get_context_from_retriever)
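
As a rough mental model, the assign step copies the incoming dictionary and adds the computed key, leaving every other key untouched. The sketch below mimics that behaviour in plain Python; `assign`, `fake_context`, and `add_context` are hypothetical names, and the real runnable does more (for example async and streaming support).

```python
def assign(**computed):
    """Return a step that copies its input dict and adds computed keys."""
    def step(inputs):
        extra = {key: fn(inputs) for key, fn in computed.items()}
        return {**inputs, **extra}
    return step

def fake_context(inputs):
    # Stand-in for the retriever lookup; a real one would query the vector store.
    return f"(documents retrieved for: {inputs['question']})"

add_context = assign(context=fake_context)
original = {"question": "What is FAISS?", "history": []}
enriched = add_context(original)
print(enriched)
```

Passing the other keys through matters here: once the chain is wrapped with message history, the input dictionary also carries `history`, and the prompt needs it alongside `question` and `context`.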

In [25]:
rag_chain_with_context = (
    runnable | rag_prompt_with_history | llm
)

In [26]:
conversational_rag_chain_with_history = RunnableWithMessageHistory(
    rag_chain_with_context,
    get_session_history,
    input_messages_key="question", # Key in the input dict for the user's question
    history_messages_key="history", # Key in the prompt for the chat history
)
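
What the history wrapper does on each call can be approximated as: look up the session's messages, run the chain with them, then append the new question and answer. The following is a simplified plain-Python sketch under that assumption, not the library's code; `toy_store`, `with_history`, and `echo_chain` are hypothetical.

```python
toy_store = {}

def get_history(session_id):
    # One message list per session, created on first use.
    return toy_store.setdefault(session_id, [])

def with_history(chain, session_id, question):
    """Load the session's history, run the chain, then record both turns."""
    history = get_history(session_id)
    answer = chain({"question": question, "history": list(history)})
    history.append(("human", question))
    history.append(("ai", answer))
    return answer

def echo_chain(inputs):
    # Stand-in for retriever + prompt + LLM: reports how many prior messages it saw.
    return f"seen {len(inputs['history'])} earlier messages"

first = with_history(echo_chain, "session1", "What is FAISS?")
second = with_history(echo_chain, "session1", "What did I ask earlier?")
other = with_history(echo_chain, "session2", "How does LangChain handle memory?")
print(first, second, other, sep="\n")
```

The second call in `session1` sees the two messages recorded by the first, while `session2` starts empty, which mirrors the session isolation visible in the notebook output.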

In [27]:
def ask_with_managed_history(question: str, session_id: str = "default_session"):
    response = conversational_rag_chain_with_history.invoke(
        {"question": question}, # Input only needs the question now
        config={
            "configurable": {"session_id": session_id}
        }
    )
    return response.content

In [28]:
store.clear()

In [29]:
print("User (session1): What is FAISS?")
print("AI (session1):", ask_with_managed_history("What is FAISS?", session_id="session1"))

print("\nUser (session1): What did I ask earlier?")
print("AI (session1):", ask_with_managed_history("What did I ask earlier?", session_id="session1"))

print("\nUser (session2): How does LangChain handle memory?")
print("AI (session2):", ask_with_managed_history("How does LangChain handle memory?", session_id="session2"))

print("\nUser (session1): And what about LangChain's memory?")
print("AI (session1):", ask_with_managed_history("And what about LangChain's memory?", session_id="session1"))


User (session1): What is FAISS?
AI (session1): FAISS (Facebook AI Similarity Search) is a library designed for efficient similarity search and clustering of dense vectors. It is particularly useful for tasks such as vector similarity search, enabling applications to quickly find similar items based on their vector representations. When working with OpenAI embeddings, which create these vector representations, FAISS can be used to effectively retrieve relevant data in scenarios such as retrieval-augmented generation (RAG) pipelines. Additionally, libraries like LangChain assist developers in building applications leveraging large language models (LLMs), often utilizing FAISS for managing vector searches.

User (session1): What did I ask earlier?
AI (session1): You asked about FAISS.

User (session2): How does LangChain handle memory?
AI (session2): In LangChain, memory management is primarily a manual process, particularly in version 1.1 where chat history must be explicitly maintained 

In [None]:


# 1. Define a store for chat sessions

# 2. New prompt with history placeholder
# This prompt now explicitly expects the chat history as part of its messages.

# 3. Define a processing step to get context from the retriever

# 4. Create the RAG chain with context retrieval and history handling
# RunnablePassthrough.assign is used to add 'context' to the input dictionary
# before passing it to the prompt.

# 5. Wrap this chain with RunnableWithMessageHistory
# This runnable automatically manages adding messages to history
# and retrieving them based on the session_id.

# Example usage function with managed history
