# Task 3: Context-Aware Chatbot (LangChain + RAG)

# Problem Statement & Objective
**Problem Statement**

Traditional chatbots fail to remember conversation context and cannot reliably answer questions from external knowledge sources.

**Objective**

To build a context-aware conversational chatbot that:

*   Maintains chat history (context memory)

*   Retrieves relevant information from a custom document corpus

*   Uses Retrieval-Augmented Generation (RAG)

*   Is deployed using Streamlit



In [2]:
#mount google drive
from google.colab import drive
drive.mount('/content/drive')


Mounted at /content/drive


In [3]:
#install required libraries
!pip install langchain faiss-cpu sentence-transformers openai tiktoken

Collecting faiss-cpu
  Downloading faiss_cpu-1.13.2-cp310-abi3-manylinux_2_27_x86_64.manylinux_2_28_x86_64.whl.metadata (7.6 kB)
Downloading faiss_cpu-1.13.2-cp310-abi3-manylinux_2_27_x86_64.manylinux_2_28_x86_64.whl (23.8 MB)
Installing collected packages: faiss-cpu
Successfully installed faiss-cpu-1.13.2


In [1]:
#Load the Dataset
!pip install langchain_community
from langchain_community.document_loaders import TextLoader

file_path = "/content/drive/MyDrive/data/knowledge.txt"

loader = TextLoader(file_path)
document_texts = loader.load()

print(document_texts)

[Document(metadata={'source': '/content/drive/MyDrive/data/knowledge.txt'}, page_content='LangChain is a framework designed to help developers build applications powered by large language models.\nIt provides tools for prompt management, memory, chains, and integrations with external data sources.\n\nRetrieval-Augmented Generation, also known as RAG, is a technique that enhances language models by retrieving relevant documents from an external knowledge base before generating a response.\nRAG helps reduce hallucinations and improves factual accuracy.\n\nFAISS is a vector database library developed by Meta for efficient similarity search.\nIt is commonly used to store and retrieve document embeddings in machine learning applications.\n\nEmbeddings are numerical representations of text that capture semantic meaning.\nThey allow systems to compare text based on meaning rather than exact words.\n\nA context-aware chatbot is capable of remembering previous user interactions and using that h

In [2]:
#split text into chunks
from langchain_text_splitters import RecursiveCharacterTextSplitter

text_splitter = RecursiveCharacterTextSplitter(
    chunk_size=300,
    chunk_overlap=50
)

docs = text_splitter.split_documents(document_texts)

print(f"Number of chunks: {len(docs)}")
print(docs[0])

Number of chunks: 6
page_content='LangChain is a framework designed to help developers build applications powered by large language models.
It provides tools for prompt management, memory, chains, and integrations with external data sources.' metadata={'source': '/content/drive/MyDrive/data/knowledge.txt'}
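
The effect of `chunk_size=300` and `chunk_overlap=50` can be illustrated with a simplified fixed-window splitter. This is only a minimal sketch: it is not the algorithm `RecursiveCharacterTextSplitter` uses (which prefers to break on separators such as paragraph and line boundaries), and the function name `sliding_window_chunks` and the sample text are hypothetical.

```python
def sliding_window_chunks(text: str, chunk_size: int = 300, chunk_overlap: int = 50) -> list[str]:
    """Split text into fixed-size character windows that overlap by chunk_overlap."""
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    step = chunk_size - chunk_overlap  # how far the window advances each time
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # this window already reached the end of the text
    return chunks


# Hypothetical sample text, 700 characters long
sample = "".join(chr(ord("a") + i % 26) for i in range(700))
chunks = sliding_window_chunks(sample)

for i, chunk in enumerate(chunks):
    print(f"chunk {i}: {len(chunk)} characters")
```

The last 50 characters of each chunk reappear at the start of the next one, so a sentence that straddles a boundary is still fully contained in at least one chunk.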


In [3]:
#create embeddings
from langchain_community.embeddings import HuggingFaceEmbeddings

embeddings = HuggingFaceEmbeddings(
    model_name="sentence-transformers/all-MiniLM-L6-v2"
)

The secret `HF_TOKEN` does not exist in your Colab secrets.
To authenticate with the Hugging Face Hub, create a token in your settings tab (https://huggingface.co/settings/tokens), set it as secret in your Google Colab and restart your session.
You will be able to reuse this secret in all of your notebooks.
Please note that authentication is recommended but still optional to access public models or datasets.


In [4]:
#Create FAISS Vector Store
from langchain_community.vectorstores import FAISS

vectorstore = FAISS.from_documents(docs, embeddings)

print("Vector store created successfully!")

Vector store created successfully!


In [5]:
#test queries
query = "What is RAG?"
retrieved_docs = vectorstore.similarity_search(query, k=2)

for i, doc in enumerate(retrieved_docs):
    print(f"\nResult {i+1}:")
    print(doc.page_content)



Result 1:
Retrieval-Augmented Generation, also known as RAG, is a technique that enhances language models by retrieving relevant documents from an external knowledge base before generating a response.
RAG helps reduce hallucinations and improves factual accuracy.

Result 2:
LangChain is a framework designed to help developers build applications powered by large language models.
It provides tools for prompt management, memory, chains, and integrations with external data sources.
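
Conceptually, `similarity_search` embeds the query and returns the `k` stored chunks whose vectors are closest to it. The sketch below illustrates only that ranking step, using cosine similarity over toy word-count vectors. The notebook itself uses dense `all-MiniLM-L6-v2` embeddings and a FAISS index, which may use a different distance metric, so the helper names (`embed`, `cosine`, `top_k`) and the scoring are illustrative assumptions.

```python
import math
from collections import Counter


def embed(text: str) -> Counter:
    """Toy stand-in for an embedding model: lowercase word counts."""
    cleaned = text.lower().replace("?", " ").replace(".", " ").replace(",", " ")
    return Counter(cleaned.split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[word] * b[word] for word in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def top_k(query: str, chunks: list[str], k: int = 2) -> list[str]:
    """Return the k chunks most similar to the query, best match first."""
    query_vec = embed(query)
    ranked = sorted(chunks, key=lambda c: cosine(query_vec, embed(c)), reverse=True)
    return ranked[:k]


# Three sentences taken from the knowledge base loaded earlier
chunks = [
    "LangChain is a framework designed to help developers build applications powered by large language models.",
    "RAG helps reduce hallucinations and improves factual accuracy.",
    "Embeddings are numerical representations of text that capture semantic meaning.",
]

results = top_k("What is RAG?", chunks, k=1)
print(results[0])
```

Word counts only match on exact tokens, whereas sentence-transformer embeddings can also match paraphrases; that difference is the main reason the notebook uses a learned embedding model.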


In [6]:
#install Groq LangChain support
!pip install langchain-groq




In [7]:
#Groq API setup
import os
from getpass import getpass

# Read the key at runtime so it is never stored in the notebook
os.environ["GROQ_API_KEY"] = getpass("Enter your Groq API key: ")

In [8]:
from langchain_groq import ChatGroq

llm = ChatGroq(
    model="llama-3.1-8b-instant",
    temperature=0.2
)


In [9]:
response = llm.invoke("Say hello in one sentence")
print(response.content)

Hello, how are you today?


In [23]:
#uninstall conflicting packages
!pip uninstall -y langchain langchain-core langchain-community langchain-groq

Found existing installation: langchain 1.2.0
Uninstalling langchain-1.2.0:
  Successfully uninstalled langchain-1.2.0
Found existing installation: langchain-core 1.2.5
Uninstalling langchain-core-1.2.5:
  Successfully uninstalled langchain-core-1.2.5
Found existing installation: langchain-community 0.4.1
Uninstalling langchain-community-0.4.1:
  Successfully uninstalled langchain-community-0.4.1
Found existing installation: langchain-groq 1.1.1
Uninstalling langchain-groq-1.1.1:
  Successfully uninstalled langchain-groq-1.1.1


In [24]:
#install compatible packages
!pip install \
langchain==0.2.16 \
langchain-core==0.2.38 \
langchain-community==0.2.16 \
langchain-groq==0.1.9 \
faiss-cpu \
sentence-transformers \
tiktoken


Collecting langchain==0.2.16
  Downloading langchain-0.2.16-py3-none-any.whl.metadata (7.1 kB)
Collecting langchain-core==0.2.38
  Downloading langchain_core-0.2.38-py3-none-any.whl.metadata (6.2 kB)
Collecting langchain-community==0.2.16
  Downloading langchain_community-0.2.16-py3-none-any.whl.metadata (2.7 kB)
Collecting langchain-groq==0.1.9
  Downloading langchain_groq-0.1.9-py3-none-any.whl.metadata (2.9 kB)
Collecting langchain-text-splitters<0.3.0,>=0.2.0 (from langchain==0.2.16)
  Downloading langchain_text_splitters-0.2.4-py3-none-any.whl.metadata (2.3 kB)
Collecting langsmith<0.2.0,>=0.1.17 (from langchain==0.2.16)
  Downloading langsmith-0.1.147-py3-none-any.whl.metadata (14 kB)
Collecting numpy<2.0.0,>=1.26.0 (from langchain==0.2.16)
  Downloading numpy-1.26.4-cp312-cp312-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (61 kB)
Collecting tenacity!

In [10]:
#create langchain conversational memory
from langchain.memory import ConversationBufferMemory

memory = ConversationBufferMemory(
    memory_key="chat_history",
    return_messages=True,
    output_key="answer"
)
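
`ConversationBufferMemory` keeps the full message history and hands it to the chain under the name given by `memory_key`. The sketch below is a simplified plain-Python illustration of that idea, not LangChain's implementation; the class `SimpleBufferMemory`, its method names, and the sample answers are hypothetical.

```python
class SimpleBufferMemory:
    """Hypothetical, simplified buffer memory: stores every turn in order."""

    def __init__(self, memory_key: str = "chat_history"):
        self.memory_key = memory_key
        self.messages: list[tuple[str, str]] = []  # (role, text) pairs

    def save_turn(self, question: str, answer: str) -> None:
        """Record one user question and the chatbot's answer."""
        self.messages.append(("human", question))
        self.messages.append(("ai", answer))

    def load(self) -> dict[str, list[tuple[str, str]]]:
        """Return a copy of the history under memory_key, as a chain would receive it."""
        return {self.memory_key: list(self.messages)}


memory_sketch = SimpleBufferMemory()
memory_sketch.save_turn("What is LangChain?", "A framework for building LLM applications.")
memory_sketch.save_turn("Why is it useful?", "It provides tools for prompts, memory, and chains.")

history = memory_sketch.load()["chat_history"]
print(len(history), "messages stored")
```

Because a buffer memory never trims its history, the stored context grows with every turn; long conversations may eventually need a windowed or summarizing variant.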

In [14]:
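#create conversational RAG chain
from langchain.chains import ConversationalRetrievalChain

# Expose the FAISS store as a retriever (k=2 is an assumption that mirrors the earlier test query)
retriever = vectorstore.as_retriever(search_kwargs={"k": 2})

chatbot = ConversationalRetrievalChain.from_llm(
    llm=llm,
    retriever=retriever,
    memory=memory,
    return_source_documents=True
)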


In [15]:
#response_test
# Use .invoke(); calling the chain object directly is deprecated
response1 = chatbot.invoke({"question": "What is LangChain?"})
print("Answer 1:", response1["answer"])

response2 = chatbot.invoke({"question": "Why is it useful?"})
print("Answer 2:", response2["answer"])


Answer 1: LangChain is a framework designed to help developers build applications powered by large language models. It provides tools for prompt management, memory, chains, and integrations with external data sources.
Answer 2: LangChain is useful for several reasons:

1. **Building applications with large language models**: LangChain provides a framework for developers to build applications that leverage the power of large language models, making it easier to integrate these models into real-world applications.

2. **Prompt management**: LangChain offers tools for managing prompts, which is essential for getting accurate and relevant responses from language models. This helps reduce the complexity of working with large language models.

3. **Memory and chaining**: LangChain's memory and chaining features allow developers to build more sophisticated applications that can store and retrieve information, and chain multiple language model responses together to create a more cohesive and a

In [17]:
response3 = chatbot.invoke({"question": "What did I ask first?"})
print("Answer 3:", response3["answer"])


Answer 3: I don't know. This is the beginning of our conversation, and I don't have any information about previous user interactions.
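
Answer 3 is consistent with how `ConversationalRetrievalChain` is usually described: the chat history is used to rewrite a follow-up into a standalone question for retrieval, while the final answer is generated from the retrieved chunks and that question, so with the default prompt the history itself may never reach the answering step. The sketch below illustrates this two-step flow in plain Python under that assumption. `condense_question`, `build_answer_prompt`, and the prompt wording are hypothetical stand-ins, with naive string handling in place of the LLM calls.

```python
def condense_question(chat_history: list[tuple[str, str]], question: str) -> str:
    """Stand-in for the LLM rewrite step: naively resolve "it" to the last topic.

    A real chain would ask the LLM to produce the standalone question.
    """
    if not chat_history:
        return question
    last_question = chat_history[-1][0]
    topic = last_question.removeprefix("What is ").rstrip("?")
    return question.replace(" it ", f" {topic} ")


def build_answer_prompt(context_chunks: list[str], standalone_question: str) -> str:
    """Assemble the answering prompt from retrieved context and the question only."""
    context = "\n\n".join(context_chunks)
    return f"Context:\n{context}\n\nQuestion: {standalone_question}\nAnswer:"


# (question, answer) pairs from earlier turns; the answer text is abbreviated
chat_history = [("What is LangChain?", "LangChain is a framework ...")]
retrieved = [
    "LangChain is a framework designed to help developers build applications powered by large language models."
]

standalone = condense_question(chat_history, "Why is it useful?")
prompt = build_answer_prompt(retrieved, standalone)

print(standalone)
print("history in prompt:", "What is LangChain?" in prompt)
```

Under this reading, "Why is it useful?" works because the pronoun is resolved before retrieval, while "What did I ask first?" fails because the answer to it exists only in the history, not in the document corpus.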


**Final Summary & Insights**

*   Built a context-aware conversational chatbot using Retrieval-Augmented Generation (RAG).

*   Created a custom knowledge base and split it into overlapping text chunks (300 characters, 50-character overlap).

*   Integrated a Groq-hosted Llama 3.1 model (`llama-3.1-8b-instant`) to generate document-grounded responses.

*   Generated vector embeddings and stored them in a FAISS vector store for semantic retrieval.

*   Implemented conversational memory for multi-turn interactions: the follow-up "Why is it useful?" was correctly resolved to LangChain, although the chatbot could not recall the first question when asked about it directly.

*   Grounding answers in retrieved chunks is intended to improve factual accuracy and reduce hallucinations relative to a standalone LLM; Answer 1 closely follows the source text.







