<a href="https://colab.research.google.com/github/Sankalpa0011/LLM-Conversational-RAG-App-LangChain-and-OpenAI/blob/main/LLM_Conversational_RAG_App_LangChain_and_OpenAI.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# **Conversational RAG Application With LangChain And OpenAI LLM**

In [1]:
from google.colab import drive
drive.mount('/content/drive')

Drive already mounted at /content/drive; to attempt to forcibly remount, call drive.mount("/content/drive", force_remount=True).


In [2]:
# Install the necessary packages
!pip install langchain -qU
!pip install langchain-openai -qU
!pip install langchain-chroma -qU
!pip install langchain-community -qU

In [3]:
import os
from google.colab import userdata

## **Initialize OpenAI LLM**

In [4]:
from langchain_openai import ChatOpenAI

# set OpenAI API key
os.environ["OPENAI_API_KEY"] = userdata.get("OPENAI_API_KEY")

# initialize the ChatOpenAI model
llm = ChatOpenAI(
    model="gpt-3.5-turbo",
    temperature=0,  # low-randomness output, useful for factual Q&A
)

## **Initialize Embedding Model**

In [5]:
from langchain_openai import OpenAIEmbeddings

embedding_model = OpenAIEmbeddings(
    model="text-embedding-3-small"
)

## **Load PDF Document**

In [6]:
!pip install pypdf -qU

In [7]:
from langchain_community.document_loaders import PyPDFLoader

# load the pdf document
loader = PyPDFLoader("/content/drive/MyDrive/CodeProLK DL/LLM Conversational RAG App LangChain And OpenAI/kavindusankalpa.pdf")

docs = loader.load()

In [8]:
len(docs)  # page count

3

In [9]:
docs[0]

Document(metadata={'source': '/content/drive/MyDrive/CodeProLK DL/LLM Conversational RAG App LangChain And OpenAI/kavindusankalpa.pdf', 'page': 0}, page_content=' Introduction to Kavindu Sankalpa  \nKavindu Sankalpa is a dedicated Information Technology undergraduate at the Institute of \nTechnology, University of Moratuwa. Hailing from Mandaram Nuwara, Kandy, Sri Lanka, \nKavindu is passionate about leveraging technology to solve real -world problems and drive \ninnovation.  \nAcademic Background and Achievements  \nEducation  \n• Institution:  Institute of Technology, University of Moratuwa  \n• Major:  Information Technology  \n• SGPA:  3.34 \n• First Semester GPA:  3.24 \n• Second Semester GPA:  3.15 \n• Notable Coursework:  Python, NumPy, scikit -learn, MySQL  \nAcademic Projects  \n• Spatial Distribution Patterns of Trees:  Led data analysis and visualization using R \nStudio. Collaborated with Hansika Shamal under the guidance of Pasindu Perera from \nFiverr.  \n• Eye Tracking D

In [10]:
docs[0].metadata

{'source': '/content/drive/MyDrive/CodeProLK DL/LLM Conversational RAG App LangChain And OpenAI/kavindusankalpa.pdf',
 'page': 0}

In [11]:
docs[0].page_content

' Introduction to Kavindu Sankalpa  \nKavindu Sankalpa is a dedicated Information Technology undergraduate at the Institute of \nTechnology, University of Moratuwa. Hailing from Mandaram Nuwara, Kandy, Sri Lanka, \nKavindu is passionate about leveraging technology to solve real -world problems and drive \ninnovation.  \nAcademic Background and Achievements  \nEducation  \n• Institution:  Institute of Technology, University of Moratuwa  \n• Major:  Information Technology  \n• SGPA:  3.34 \n• First Semester GPA:  3.24 \n• Second Semester GPA:  3.15 \n• Notable Coursework:  Python, NumPy, scikit -learn, MySQL  \nAcademic Projects  \n• Spatial Distribution Patterns of Trees:  Led data analysis and visualization using R \nStudio. Collaborated with Hansika Shamal under the guidance of Pasindu Perera from \nFiverr.  \n• Eye Tracking Dataset Analysis:  Analyzed data using Tobii Eye Tracker 2150 to study \nthe effects of static vs. animated graphs on cognitive load, with pupil dilation as the \

## **Split Documents Into Chunks**

In [12]:
from langchain_text_splitters import RecursiveCharacterTextSplitter

# initialize the text splitter
text_splitter = RecursiveCharacterTextSplitter(
    chunk_size=400,    # maximum characters per chunk
    chunk_overlap=50,  # characters shared between neighbouring chunks
)

# split the document into chunks
splits = text_splitter.split_documents(docs)

In [13]:
len(splits)

14

In [14]:
splits

[Document(metadata={'source': '/content/drive/MyDrive/CodeProLK DL/LLM Conversational RAG App LangChain And OpenAI/kavindusankalpa.pdf', 'page': 0}, page_content='Introduction to Kavindu Sankalpa  \nKavindu Sankalpa is a dedicated Information Technology undergraduate at the Institute of \nTechnology, University of Moratuwa. Hailing from Mandaram Nuwara, Kandy, Sri Lanka, \nKavindu is passionate about leveraging technology to solve real -world problems and drive \ninnovation.  \nAcademic Background and Achievements  \nEducation'),
 Document(metadata={'source': '/content/drive/MyDrive/CodeProLK DL/LLM Conversational RAG App LangChain And OpenAI/kavindusankalpa.pdf', 'page': 0}, page_content='Education  \n• Institution:  Institute of Technology, University of Moratuwa  \n• Major:  Information Technology  \n• SGPA:  3.34 \n• First Semester GPA:  3.24 \n• Second Semester GPA:  3.15 \n• Notable Coursework:  Python, NumPy, scikit -learn, MySQL  \nAcademic Projects  \n• Spatial Distribution Pa
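
To make the effect of `chunk_size` and `chunk_overlap` concrete, the sketch below cuts a string into fixed-size windows that share a few characters. It is a simplified illustration, not the algorithm `RecursiveCharacterTextSplitter` uses (that splitter prefers to break on separators such as paragraph and line breaks, which is why the real chunks above end on line boundaries). The helper name `sliding_window_chunks` is hypothetical and introduced only for this example.

```python
def sliding_window_chunks(text: str, chunk_size: int, chunk_overlap: int) -> list[str]:
    """Cut text into fixed-size windows; consecutive windows share chunk_overlap characters."""
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    step = chunk_size - chunk_overlap  # how far the window advances each time
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the window already reached the end of the text
    return chunks


sample = "abcdefghijklmnopqrstuvwxyz"
chunks = sliding_window_chunks(sample, chunk_size=10, chunk_overlap=3)
print(chunks)
# ['abcdefghij', 'hijklmnopq', 'opqrstuvwx', 'vwxyz']
```

The last three characters of each window reappear at the start of the next one, so a sentence that straddles a boundary is still fully visible in at least one chunk.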

## **Create Vector Store And Retriever**

In [15]:
from langchain_chroma import Chroma

# create vector store from the document chunks
vectorstore = Chroma.from_documents(
    documents=splits,
    embedding=embedding_model,
)

In [16]:
# create a retriever from the vector store
retriever = vectorstore.as_retriever()
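
Conceptually, the retriever embeds the question and returns the stored chunks whose vectors are closest to it. The sketch below shows that idea with hand-written three-dimensional vectors and cosine similarity. It is only an illustration under stated assumptions: the vectors, the chunk labels, and the helpers `cosine_similarity` and `top_k` are made up for this example, and the distance metric and index that Chroma actually uses may differ.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors (1.0 means same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def top_k(query_vec: list[float], indexed: list[tuple[str, list[float]]], k: int = 2) -> list[str]:
    """Return the k texts whose vectors are most similar to the query vector."""
    scored = [(cosine_similarity(query_vec, vec), text) for text, vec in indexed]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [text for _, text in scored[:k]]


# Toy "vector store": (chunk label, made-up embedding)
toy_index = [
    ("education chunk", [0.9, 0.1, 0.0]),
    ("skills chunk", [0.1, 0.9, 0.1]),
    ("contact chunk", [0.0, 0.1, 0.9]),
]

query_vec = [0.2, 0.8, 0.0]  # pretend this is the embedded question
results = top_k(query_vec, toy_index, k=2)
print(results)
# ['skills chunk', 'education chunk']
```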

## **Define Prompt Template**

In [17]:
from langchain_core.prompts import ChatPromptTemplate

# define the system prompt
system_prompt = (
    "You are an intelligent chatbot. Use the following context to answer the question. If you don't know the answer, just say that you don't know."
    "\n\n"
    "{context}"
)

# create the prompt template
prompt = ChatPromptTemplate.from_messages(
    [
        ("system", system_prompt),
        ("user", "{input}"),
    ]
)

In [18]:
prompt

ChatPromptTemplate(input_variables=['context', 'input'], messages=[SystemMessagePromptTemplate(prompt=PromptTemplate(input_variables=['context'], template="You are an intelligent chatbot. Use the following context to answer the question. If you don't know the answer, just say that you don't know.\n\n{context}")), HumanMessagePromptTemplate(prompt=PromptTemplate(input_variables=['input'], template='{input}'))])

## **Create RAG Chain**

In [19]:
from langchain.chains import create_retrieval_chain
from langchain.chains.combine_documents import create_stuff_documents_chain

# create the question answering chain
qa_chain = create_stuff_documents_chain(llm, prompt)

# create the RAG chain
rag_chain = create_retrieval_chain(retriever, qa_chain)
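
As a rough mental model, the RAG chain retrieves chunks for the question, "stuffs" them into the `{context}` slot of the system prompt, and asks the model to answer. The sketch below mimics that flow with stand-ins so it runs without an API key. `fake_retriever`, `fake_llm`, `SYSTEM_TEMPLATE`, and `toy_rag_chain` are hypothetical names for illustration, not LangChain internals; the output keys mirror the `input`, `context`, and `answer` keys that the real chain returns.

```python
def fake_retriever(question: str) -> list[str]:
    """Stand-in for the vector store retriever; always returns two fixed chunks."""
    return ["Chunk A about education.", "Chunk B about projects."]


def fake_llm(prompt_text: str) -> str:
    """Stand-in for the chat model; only reports how much prompt text it received."""
    return f"(answer based on {len(prompt_text)} prompt characters)"


SYSTEM_TEMPLATE = "Use the following context to answer the question.\n\n{context}"


def toy_rag_chain(inputs: dict) -> dict:
    docs = fake_retriever(inputs["input"])                  # 1. retrieve
    context = "\n\n".join(docs)                             # 2. stuff chunks into one string
    prompt_text = (
        SYSTEM_TEMPLATE.format(context=context)
        + "\n\nUser: "
        + inputs["input"]
    )
    answer = fake_llm(prompt_text)                          # 3. generate
    return {"input": inputs["input"], "context": docs, "answer": answer}


result = toy_rag_chain({"input": "Who is Sankalpa?"})
print(sorted(result))
# ['answer', 'context', 'input']
```

Because the retrieved chunks come back under `context`, the same key on the real response can be inspected to see which passages an answer was grounded in.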

## **Invoke RAG Chain With Example Question**

In [20]:
response = rag_chain.invoke({"input": "Who is Sankalpa?"})
response["answer"]

'Sankalpa is a dedicated Information Technology undergraduate at the Institute of Technology, University of Moratuwa, who is passionate about leveraging technology to solve real-world problems and drive innovation.'

In [21]:
response = rag_chain.invoke({"input": "What is RAG Architecture"})
response["answer"]

"I'm sorry, but based on the context provided, I don't have information specifically about RAG Architecture. If you have any other questions or need clarification on a different topic, feel free to ask!"

In [22]:
response = rag_chain.invoke({"input": "What are the skills Sankalpa have?"})
response["answer"]

'Kavindu Sankalpa possesses a blend of technical expertise, leadership skills, and a passion for data science. He has honed his ability to lead teams, work collaboratively, and achieve common goals through his experiences as a cadet and a cricket player. Additionally, he offers data analysis services on Fiverr, showcasing his technical skills in providing valuable insights. Kavindu is deeply interested in AI and data science, continually seeking to expand his knowledge and skills in these areas.'

In [23]:
response = rag_chain.invoke({"input": "Do you remember chat history"})

response["answer"]

"I don't have the ability to remember past interactions or chat history. Each conversation is independent."

## **Add Chat History**

In [25]:
from langchain.chains import create_history_aware_retriever
from langchain_core.prompts import MessagesPlaceholder

# define the contextualize system prompt
contextualize_system_prompt = (
    "Using the chat history and the latest user question, reformulate the question as a standalone question if needed; otherwise return it as is."
)

# create the contextualize prompt template
contextualize_prompt = ChatPromptTemplate.from_messages(
    [
        ("system", contextualize_system_prompt),
        MessagesPlaceholder(variable_name="chat_history"),
        ("user", "{input}"),
    ]
)

# create the history aware retriever
history_aware_retriever = create_history_aware_retriever(llm, retriever, contextualize_prompt)
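
The history-aware retriever behaves roughly like a branch: with no chat history, the question goes straight to the retriever; with history, the LLM first rewrites the question into a standalone query, and that query is used for retrieval. The sketch below imitates this with stand-ins. `fake_rewrite`, `fake_history_retriever`, and `toy_history_aware_retriever` are hypothetical helpers, and the "rewrite" is a simple string operation rather than a model call.

```python
def fake_rewrite(chat_history: list[tuple[str, str]], question: str) -> str:
    """Stand-in for the LLM call that turns a follow-up into a standalone question."""
    user_turns = [text for role, text in chat_history if role == "user"]
    topic = user_turns[-1] if user_turns else ""
    return f"{question} (in the context of: {topic})"


def fake_history_retriever(query: str) -> list[str]:
    """Stand-in for the vector store retriever; echoes the query it received."""
    return [f"chunk matching '{query}'"]


def toy_history_aware_retriever(inputs: dict) -> list[str]:
    history = inputs.get("chat_history", [])
    if not history:
        # No history: search with the question exactly as asked.
        return fake_history_retriever(inputs["input"])
    # History present: search with the reformulated, standalone question.
    return fake_history_retriever(fake_rewrite(history, inputs["input"]))


first = toy_history_aware_retriever({"input": "Who is Sankalpa?", "chat_history": []})
follow_up = toy_history_aware_retriever(
    {
        "input": "Can you list them?",
        "chat_history": [
            ("user", "What skills does Sankalpa have?"),
            ("assistant", "He has technical and leadership skills."),
        ],
    }
)
print(first)
print(follow_up)
```

Without the rewrite step, a follow-up such as "Can you list them?" would be embedded on its own and would be unlikely to match the chunks about skills.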

### **Update The Answer Prompt With Chat History**

In [26]:
from langchain.chains import create_history_aware_retriever
from langchain_core.prompts import MessagesPlaceholder

system_prompt = (
    "You are an intelligent chatbot. Use the following context to answer the question. If you don't know the answer, just say that you don't know."
    "\n\n"
    "{context}"
)

# create the prompt template
prompt = ChatPromptTemplate.from_messages(
    [
        ("system", system_prompt),
        MessagesPlaceholder(variable_name="chat_history"),
        ("user", "{input}"),
    ]
)

In [27]:
prompt

ChatPromptTemplate(input_variables=['chat_history', 'context', 'input'], input_types={'chat_history': typing.List[typing.Union[langchain_core.messages.ai.AIMessage, langchain_core.messages.human.HumanMessage, langchain_core.messages.chat.ChatMessage, langchain_core.messages.system.SystemMessage, langchain_core.messages.function.FunctionMessage, langchain_core.messages.tool.ToolMessage]]}, messages=[SystemMessagePromptTemplate(prompt=PromptTemplate(input_variables=['context'], template="You are an intelligent chatbot. Use the following context to answer the question. If you don't know the answer, just say that you don't know.\n\n{context}")), MessagesPlaceholder(variable_name='chat_history'), HumanMessagePromptTemplate(prompt=PromptTemplate(input_variables=['input'], template='{input}'))])

In [28]:
# create the question answering chain
qa_chain = create_stuff_documents_chain(llm, prompt)

# create the RAG chain
rag_chain = create_retrieval_chain(history_aware_retriever, qa_chain)

### **Manage Chat Session History**

In [29]:
from langchain_community.chat_message_histories import ChatMessageHistory
from langchain_core.chat_history import BaseChatMessageHistory
from langchain_core.runnables.history import RunnableWithMessageHistory

# initialize the store for session histories
store = {}

# function to get the session history for a given session ID
def get_session_history(session_id: str) -> BaseChatMessageHistory:
    if session_id not in store:
        store[session_id] = ChatMessageHistory()
    return store[session_id]

# create the conversational RAG chain with session history
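conversational_rag_chain = RunnableWithMessageHistory(
    rag_chain,
    get_session_history,
    input_messages_key="input",           # key holding the new user message
    history_messages_key="chat_history",  # key the prompts read past messages from
    output_messages_key="answer",         # key whose value is saved as the AI reply
)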
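
The session store is a plain dictionary keyed by session ID, so each conversation accumulates its own messages and different sessions never see each other's history. The sketch below shows that pattern in isolation. `ToyHistory`, `toy_store`, and `get_toy_history` are hypothetical stand-ins for `ChatMessageHistory`, `store`, and `get_session_history`, named differently so they do not overwrite the notebook's own objects.

```python
class ToyHistory:
    """Minimal stand-in for a chat message history: an ordered list of (role, text)."""

    def __init__(self) -> None:
        self.messages: list[tuple[str, str]] = []

    def add(self, role: str, text: str) -> None:
        self.messages.append((role, text))


toy_store: dict[str, ToyHistory] = {}


def get_toy_history(session_id: str) -> ToyHistory:
    """Create the history on first use, then keep returning the same object."""
    if session_id not in toy_store:
        toy_store[session_id] = ToyHistory()
    return toy_store[session_id]


# Two turns in session "101", one turn in an unrelated session "202"
get_toy_history("101").add("user", "Who is Sankalpa?")
get_toy_history("101").add("assistant", "An IT undergraduate.")
get_toy_history("202").add("user", "Hello")

print(len(get_toy_history("101").messages))  # 2
print(len(get_toy_history("202").messages))  # 1
```

Since the dictionary lives in memory, the histories are lost when the runtime restarts; persisting them would need a different storage backend.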

## **Invoke Conversational RAG Chain With Example Questions**

In [30]:
response = conversational_rag_chain.invoke(
    {"input": "Who is Sankalpa?"},
    config={"configurable": {"session_id": "101"}},
)

response["answer"]

'Sankalpa is a dedicated Information Technology undergraduate at the Institute of Technology, University of Moratuwa, who is passionate about leveraging technology to solve real-world problems and drive innovation.'

In [31]:
response = conversational_rag_chain.invoke(
    {"input": "What are the skills Sankalpa have?"},
    config={"configurable": {"session_id": "101"}},
)

response["answer"]

'Sankalpa has technical expertise in data analysis, leadership skills gained through experiences as a cadet and cricket player, and a passion for data science and AI.'

In [32]:
response = conversational_rag_chain.invoke(
    {"input": "Can you listdown"},
    config={"configurable": {"session_id": "101"}},
)

response["answer"]

'Sure! Here are the skills that Sankalpa possesses:\n\n1. Proficiency in programming languages: Python, R\n2. Data analysis and machine learning skills using libraries like NumPy and scikit-learn\n3. Database management skills in MySQL\n4. Experience with tools and software such as AMOS, EViews, Jamovi, Minitab, SmartPLS, Mplus\n5. Leadership and teamwork skills honed through roles as a cadet and cricket player\n6. Offering data analysis services on Fiverr, providing valuable insights using technical skills'