<a href="https://colab.research.google.com/github/amal2334/NLP-and-LLM-Project/blob/main/chatbots.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# **Building Specialized Chatbots with LangChain: From Prompts to RAG**

In [None]:
!pip install python-dotenv
from dotenv import load_dotenv
import os

# Load .env file
load_dotenv()

# Access the keys
hf_token = os.getenv("HUGGINGFACE_API_KEY")
groq_api_key = os.getenv("GROQ_API_KEY")

# Check if loaded (optional for testing)
print("HuggingFace key loaded:", hf_token is not None)
print("Groq key loaded:", groq_api_key is not None)


HuggingFace key loaded: True
Groq key loaded: True


# **Table of Contents**
In this notebook, we'll explore:

- **Introduction**

- **Setup & Downloads**
  - Install required libraries
  - Import core modules
  - Set API keys and environment

- **1. Human Text Generation with Context**
  - Mini Chatbot: ✈️ Travel Assistant that recommends travel ideas based on user input

- **2. Conversation Management and Multi-User Support**
  - Mini Chatbot: 📅 Personal Productivity Coach that helps users plan goals and routines, with memory per user

- **3. Prompt Engineering**
  - Mini Chatbot: 🛒 Shopping Assistant that helps users choose the right product based on their preferences

- **4. Multilingual Capabilities with Chat History**
  - Mini Chatbot: 🗣️ Language Learning Assistant with Memory that helps users learn and practice Arabic, English, and other languages

- **5. Token Trimming for Long Conversations**
  - Mini Chatbot: 📞 Customer Support Assistant that assists users in long chats without losing coherence

- **6. Retrieval-Augmented Generation (RAG)**
  - Mini Chatbot: 📖 AI Study Assistant (RAG + Memory) that answers research questions using real blog content
  - Tracks chat history for follow-up questions

- **Conclusion**
- **Final Thoughts**


# **Introduction**

- In this project, we build a modern Q&A chatbot using Large Language Models (LLMs) and the LangChain framework. Along the way, we explore generative AI techniques and assemble a functional, context-aware conversational system.

# **Overview**
- Conversational AI is transforming how humans interact with computers, enabling more natural and intuitive interfaces. The techniques covered in this notebook form the foundation of many modern AI applications, from customer service bots to virtual assistants and knowledge management systems.

# **Environment Setup**
The key libraries we'll use include:
- `langchain`: Core framework for working with LLMs
- `langchain_community`: Community-contributed components and integrations
- `langchain_core`: Essential components like message types and prompt templates
- `langchain_chroma`: Vector database integration for knowledge retrieval


**Note:**
- Throughout this notebook, we'll be using API keys from Groq and Hugging Face.

In [None]:
# Install the packages used throughout the notebook
!pip install python-dotenv langchain-groq langchain_chroma bs4 sentence-transformers
!pip install -U langchain-community

import os
from dotenv import load_dotenv
from langchain_groq import ChatGroq

# Load API keys from the .env file
load_dotenv()



In [None]:
from langchain_groq import ChatGroq
model=ChatGroq(model="Gemma2-9b-It",groq_api_key=groq_api_key)
model

ChatGroq(client=<groq.resources.chat.completions.Completions object at 0x7d89dc573dd0>, async_client=<groq.resources.chat.completions.AsyncCompletions object at 0x7d89dc56da90>, model_name='Gemma2-9b-It', model_kwargs={}, groq_api_key=SecretStr('**********'))

**Note:**

The foundation of our chatbot is the Large Language Model (LLM). In this project, we're using **Gemma2 9B-IT**, a powerful instruction-tuned model accessed through the Groq API via `ChatGroq`.

**What is Gemma2?**:
Gemma2 is Google's lightweight, state-of-the-art open model that delivers strong performance while being efficient enough to run on consumer hardware. The "9B" refers to the number of parameters (9 billion), and "IT" indicates it's instruction-tuned, meaning it's specifically optimized for following instructions and generating helpful responses.

**What is Ollama?**
Ollama is a framework that simplifies running open-source LLMs locally. It handles model downloading, optimization, and inference, providing a simple API for applications to interact with these models. This notebook does not use Ollama: the model above is served by Groq's hosted API. Ollama is an option if you prefer to run a model such as Gemma2 on your own machine.



# **1-Human Text Generation with Context**

In [None]:
from langchain_core.messages import AIMessage
from langchain_core.messages import HumanMessage
from langchain_core.messages import SystemMessage

model.invoke(
    [

        HumanMessage(content="Hi , My name is Amal and I am a data analyst , later on inshallah a data scientist"),
        AIMessage(content="Hello Amal! It's nice to meet you. \n\nAs a data  Analyst, what kind of projects are you working on these days? \n\nI'm always eager to learn more about the exciting work being done in the field of AI and data analysis.\n"),
        HumanMessage(content="Hey What's my name and what do I do?")
    ]
)

AIMessage(content="You are Amal, and you are a data analyst, aspiring to become a data scientist! 😊  \n\nIs there anything else you'd like to tell me about yourself or your work?  I'm happy to chat. \n", additional_kwargs={}, response_metadata={'token_usage': {'completion_tokens': 50, 'prompt_tokens': 109, 'total_tokens': 159, 'completion_time': 0.090909091, 'prompt_time': 0.005150232, 'queue_time': 0.09209899499999999, 'total_time': 0.096059323}, 'model_name': 'Gemma2-9b-It', 'system_fingerprint': 'fp_10c08bf97d', 'finish_reason': 'stop', 'logprobs': None}, id='run--5e117fe3-9562-4bdc-83f5-83917a8cd3ca-0', usage_metadata={'input_tokens': 109, 'output_tokens': 50, 'total_tokens': 159})

In [None]:
model.invoke(
    [
        SystemMessage(content="You are a helpful assistant"),
        HumanMessage(content="Hi , My name is Ajay and I am a data analyst"),
        AIMessage(content="Hello Ajay ! It's nice to meet you. \n\nAs a data Analyst, what kind of projects are you working on these days? \n\nI'm always eager to learn more about the exciting work being done in the field of AI and data analysis.\n"),
        HumanMessage(content="Hey What's my name and what do I do?")
    ]
)

AIMessage(content='You are Ajay, a data analyst!  I remember 😊  \n\nHow can I help you with your data analysis work today?\n', additional_kwargs={}, response_metadata={'token_usage': {'completion_tokens': 29, 'prompt_tokens': 103, 'total_tokens': 132, 'completion_time': 0.052727273, 'prompt_time': 0.00569996, 'queue_time': 0.108991695, 'total_time': 0.058427233}, 'model_name': 'Gemma2-9b-It', 'system_fingerprint': 'fp_10c08bf97d', 'finish_reason': 'stop', 'logprobs': None}, id='run--3fb2f998-79e9-48ae-a61b-2d9f689640f1-0', usage_metadata={'input_tokens': 103, 'output_tokens': 29, 'total_tokens': 132})

# **Mini Chatbot: Travel Assistant**

In [None]:
model.invoke(
    [
        SystemMessage(content="You are a smart travel assistant who helps people plan their vacations."),
        HumanMessage(content="Hi, I’m Sarah and I’d like to plan a trip to Italy next month."),
        AIMessage(content="Hi Sarah! That sounds exciting. I'd love to help you plan your trip to Italy. Do you have any cities or activities in mind?"),
        HumanMessage(content="What’s my name and where do I want to go?")
    ]
)


AIMessage(content='You said your name is Sarah and you want to go to Italy! 🇮🇹 \n\nAnything else I can help you with to start planning your trip? Do you have a particular region or city in mind?  Any must-see sights or activities? What kind of budget are you working with?  The more information you give me, the better I can help! 😊\n', additional_kwargs={}, response_metadata={'token_usage': {'completion_tokens': 79, 'prompt_tokens': 95, 'total_tokens': 174, 'completion_time': 0.143636364, 'prompt_time': 0.004291292, 'queue_time': 0.09223461699999999, 'total_time': 0.147927656}, 'model_name': 'Gemma2-9b-It', 'system_fingerprint': 'fp_10c08bf97d', 'finish_reason': 'stop', 'logprobs': None}, id='run--c3db16b4-f192-4c0a-b4b8-cb2c476e5b1e-0', usage_metadata={'input_tokens': 95, 'output_tokens': 79, 'total_tokens': 174})

# **Note:**
Message Types in Conversational AI

In a conversational system, we need to distinguish between different types of messages. LangChain provides several message classes to represent different participants in a conversation:

1. **HumanMessage**: Represents messages from the user
2. **AIMessage**: Represents responses from the AI assistant
3. **SystemMessage**: Contains instructions or context for the AI that aren't part of the visible conversation
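
To make the three roles concrete, here is a minimal plain-Python sketch of a role-tagged conversation. It is not LangChain's implementation: the `Message` dataclass, `ROLE_MAP`, and `to_api_payload` are hypothetical names for illustration, and the `system`/`user`/`assistant` role names are an assumption based on the convention many chat APIs follow.

```python
from dataclasses import dataclass

# Hypothetical stand-in for LangChain's message classes, for illustration only.
@dataclass
class Message:
    role: str      # "system", "human", or "ai"
    content: str

# Role names used by many chat-completion APIs (an assumption for this sketch).
ROLE_MAP = {"system": "system", "human": "user", "ai": "assistant"}

def to_api_payload(messages):
    """Convert role-tagged messages into a list of {"role", "content"} dicts."""
    return [{"role": ROLE_MAP[m.role], "content": m.content} for m in messages]

conversation = [
    Message("system", "You are a smart travel assistant."),
    Message("human", "Hi, I'm Sarah and I'd like to plan a trip to Italy."),
    Message("ai", "Hi Sarah! Do you have any cities in mind?"),
    Message("human", "What's my name and where do I want to go?"),
]

payload = to_api_payload(conversation)
for item in payload:
    print(item["role"], "->", item["content"])
```

The model can answer the last question only because the earlier turns are sent again as part of the same list; the model itself keeps no state between calls.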

# **2-Conversation Management and Multi-User Support**
- We can wrap our model in `RunnableWithMessageHistory` to make it stateful. The wrapper records the model's inputs and outputs and stores them in a datastore (here, an in-memory dictionary keyed by session ID).

In [None]:
from langchain_community.chat_message_histories import ChatMessageHistory
from langchain_core.chat_history import BaseChatMessageHistory
from langchain_core.runnables.history import RunnableWithMessageHistory

store={}

def get_session_history(session_id:str)->BaseChatMessageHistory:
    if session_id not in store:
        store[session_id]=ChatMessageHistory()
    return store[session_id]

with_message_history=RunnableWithMessageHistory(model,get_session_history)

In [None]:
config={"configurable":{"session_id":"chat1"}}

In [None]:
response=with_message_history.invoke(
    [HumanMessage(content="Hi , My name is Amal and I am a data analyst")],
    config=config
)

In [None]:
response.content

"Hello Amal! It's nice to meet you.  \n\nAs a data analyst, what kind of work do you enjoy doing the most? \n\nDo you have any interesting projects you're working on right now? I'm always eager to learn more about how people use data to solve problems and uncover insights.\n"

In [None]:
with_message_history.invoke(
    [HumanMessage(content="What's my name?")],
    config=config,
)

AIMessage(content='Your name is Amal. 😊 \n\nI remembered that from our introduction!  How can I help you today?\n', additional_kwargs={}, response_metadata={'token_usage': {'completion_tokens': 26, 'prompt_tokens': 103, 'total_tokens': 129, 'completion_time': 0.047272727, 'prompt_time': 0.005109321, 'queue_time': 0.066195094, 'total_time': 0.052382048}, 'model_name': 'Gemma2-9b-It', 'system_fingerprint': 'fp_10c08bf97d', 'finish_reason': 'stop', 'logprobs': None}, id='run--75a4ff08-dfe7-4657-844d-d958d26f0feb-0', usage_metadata={'input_tokens': 103, 'output_tokens': 26, 'total_tokens': 129})

In [None]:
# Change the session ID in the config to start a new conversation
config1={"configurable":{"session_id":"chat2"}}
response=with_message_history.invoke(
    [HumanMessage(content="Whats my name")],
    config=config1
)
response.content

"As an AI, I do not have access to any personal information about you, including your name. If you'd like to tell me your name, I'd be happy to know! 😊  \n\n"

# **Note:**
- Model memory allows the AI to remember what was said earlier in a conversation, much like talking to a person who listens and responds based on context. In LangChain, `RunnableWithMessageHistory` stores this memory per session ID.
- When you stay in the same session, the model can recall your name, role, or anything else you mentioned earlier.
- When you change the session ID, it's like starting a brand-new chat: the model has no memory of past messages. Memory works only within a session, and switching sessions starts from an empty history.
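
The session isolation described above can be sketched in plain Python. This is a simplified illustration, not how `RunnableWithMessageHistory` is implemented: `get_history` and `chat` are hypothetical helpers, and the history holds simple `(role, text)` tuples instead of LangChain message objects.

```python
# One history list per session ID, kept in memory (lost when the kernel restarts).
store = {}

def get_history(session_id):
    """Return the history list for a session, creating it on first use."""
    return store.setdefault(session_id, [])

def chat(session_id, user_text):
    """Record the user's turn and return everything this session has seen so far."""
    history = get_history(session_id)
    history.append(("human", user_text))
    return list(history)

chat("chat1", "Hi, my name is Amal")
seen_by_chat1 = chat("chat1", "What's my name?")
seen_by_chat2 = chat("chat2", "What's my name?")

print(len(seen_by_chat1))  # 2: the introduction is still visible in chat1
print(len(seen_by_chat2))  # 1: chat2 starts from an empty history
```

Because each session ID maps to its own list, the introduction sent in `chat1` never reaches a model call made under `chat2`.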

In [None]:
response=with_message_history.invoke(
    [HumanMessage(content="Hey My name is Ajay")],
    config=config1
)
response.content

"Hello Ajay, it's nice to meet you! 👋  \n\nIs there anything I can help you with today? 😊  \n\n"

In [None]:
response=with_message_history.invoke(
    [HumanMessage(content="Whats my name")],
    config=config1
)
response.content

'Your name is Ajay! 😊  \n\nI remember that you told me earlier.  \n\n\n\n\n'

# **Note:**
- If we switch sessions, it’s like talking to the AI for the first time.

- We must build memory step by step within each session

# **Mini Chatbot: Personal Productivity Coach**

In [None]:
from langchain_core.prompts import ChatPromptTemplate, MessagesPlaceholder

store = {}

def get_session_history(session_id: str) -> BaseChatMessageHistory:
    if session_id not in store:
        store[session_id] = ChatMessageHistory()
    return store[session_id]

system_prompt = (
    "You are a personal productivity coach. "
    "Help users build routines, focus on goals, and stay productive. "
    "Remember what each user tells you during the session."
)

prompt = ChatPromptTemplate.from_messages(
    [
        ("system", system_prompt),
        MessagesPlaceholder(variable_name="messages"),
    ]
)


chain = prompt | model


with_memory_chain = RunnableWithMessageHistory(
    chain,
    get_session_history,
    input_messages_key="messages"
)

user1_session = "amal123"
user2_session = "Ajay456"

# Amal's message
response1 = with_memory_chain.invoke(
    {"messages": [HumanMessage(content="give me some advices on how i can  build a Fashion brand")]},
    config={"configurable": {"session_id": user1_session}}
)

# Ajay 's message
response2 = with_memory_chain.invoke(
    {"messages": [HumanMessage(content="What would a meaningful life look like to you.")]},
    config={"configurable": {"session_id": user2_session}}
)

response1.content


'It\'s fantastic that you\'re looking to build a fashion brand! That\'s a creative and exciting goal. To help you get started, let\'s break it down into manageable steps and build a productive routine around it.  \n\n**First, let\'s talk about your vision:**\n\n* **What makes your brand unique?** What\'s your style, your target audience, your values?  \n* **What kind of fashion are you passionate about?**  Clothing, accessories, footwear?  \n* **Do you have any specific niches in mind?** Sustainable fashion, streetwear, luxury wear?\n\n**Next, let\'s focus on building a routine:**\n\n* **Dedicated Time:** Set aside specific time each day or week to work on your brand. Even 30 minutes can be productive! Treat it like an important appointment you can\'t miss.\n* **Actionable Tasks:** Break down your big goals into smaller, actionable tasks. For example, instead of "design a collection," start with "research fabrics" or "sketch 5 outfit ideas."\n* **Prioritization:** Use a to-do list or a

In [None]:
response2.content

"For me, a meaningful life is about continuous growth, learning, and making a positive impact. It's about using my abilities to help others, to create something new and valuable, and to leave the world a little better than I found it.  \n\nAs a personal productivity coach, I get to witness that kind of meaning unfold in the lives of others. Helping someone break through a barrier, achieve a long-held goal, or simply feel more in control of their time and energy is incredibly fulfilling.\n\nNow, tell me about you. What does a meaningful life look like to you? What are you passionate about? What are some things you'd like to achieve?  Let's work together to make those dreams a reality.\n"

# **3-Prompt Engineering**

In [None]:
from langchain_core.prompts import ChatPromptTemplate,MessagesPlaceholder
prompt=ChatPromptTemplate.from_messages(
    [
        ("system","You are a helpful assistant. Answer all the questions to the best of your ability"),
        MessagesPlaceholder(variable_name="messages")
    ]
)

chain=prompt|model

In [None]:
chain.invoke({"messages":[HumanMessage(content="Hi My name is Amal")]})

AIMessage(content="Hello Amal!  It's nice to meet you. \n\nI'm here to help. What can I do for you? 😊  \n\n", additional_kwargs={}, response_metadata={'token_usage': {'completion_tokens': 34, 'prompt_tokens': 30, 'total_tokens': 64, 'completion_time': 0.061818182, 'prompt_time': 0.002158907, 'queue_time': 0.09089634099999999, 'total_time': 0.063977089}, 'model_name': 'Gemma2-9b-It', 'system_fingerprint': 'fp_10c08bf97d', 'finish_reason': 'stop', 'logprobs': None}, id='run--67aac1d3-da33-4545-a70a-95b6aefbc4b8-0', usage_metadata={'input_tokens': 30, 'output_tokens': 34, 'total_tokens': 64})

In [None]:
with_message_history=RunnableWithMessageHistory(chain,get_session_history)

In [None]:
config = {"configurable": {"session_id": "chat3"}}
response=with_message_history.invoke(
    [HumanMessage(content="Hi My name is Amal")],
    config=config
)

response

AIMessage(content="Hi Amal! It's nice to meet you. 😊 \n\nWhat can I do for you today? I'm ready for your questions!  \n\n", additional_kwargs={}, response_metadata={'token_usage': {'completion_tokens': 35, 'prompt_tokens': 30, 'total_tokens': 65, 'completion_time': 0.063636364, 'prompt_time': 0.002145457, 'queue_time': 0.090929571, 'total_time': 0.065781821}, 'model_name': 'Gemma2-9b-It', 'system_fingerprint': 'fp_10c08bf97d', 'finish_reason': 'stop', 'logprobs': None}, id='run--374a453e-9814-48eb-9120-6b27ef8638dc-0', usage_metadata={'input_tokens': 30, 'output_tokens': 35, 'total_tokens': 65})

In [None]:
response = with_message_history.invoke(
    [HumanMessage(content="What's my name?")],
    config=config,
)

response.content

'Your name is Amal.  I remember! 😊  \n\n\nIs there anything else I can help you with?\n'

# **Mini Chatbot: Shopping Assistant**

In [None]:

prompt = ChatPromptTemplate.from_messages(
    [
        ("system", "You are a helpful assistant. Answer all the questions to the best of your ability"),
        MessagesPlaceholder(variable_name="messages")
    ]
)

chain = prompt | model
with_message_history = RunnableWithMessageHistory(chain, get_session_history)

config = {"configurable": {"session_id": "shop_user1"}}

response = with_message_history.invoke(
    {"messages": [HumanMessage(content="Hi, I’m looking for a new laptop for programming.")]},
    config=config
)

response = with_message_history.invoke(
    {"messages": [HumanMessage(content="Hi I care about GPU and RAM ")]},
    config=config
)

print(response.content)


Great! Knowing you prioritize GPU and RAM helps a lot.

To give you the most useful suggestions, could you tell me:

* **What's your budget range for this laptop?**  GPUs and RAM can significantly impact the price. 
* **What kind of programming will you primarily be doing?**  

    *  If it's **data science, machine learning, or AI**, you'll want a powerful GPU. 
    *  For **game development or 3D graphics**, a dedicated GPU is also essential.
    *  If it's more **web development or general coding**, a less powerful GPU might suffice.

* **How much RAM do you ideally need?** 16GB is becoming standard for modern programming, but some tasks might benefit from 32GB or more.


Let me know these details, and I can recommend some excellent laptops with the GPU and RAM power you're looking for!



# **4-Multilingual Capabilities with Chat History**

In [None]:
prompt = ChatPromptTemplate.from_messages(
    [
        (
            "system",
            "You are a helpful assistant. Answer all questions to the best of your ability in {language}.",
        ),
        MessagesPlaceholder(variable_name="messages"),
    ]
)

chain = prompt | model

In [None]:
response=chain.invoke({"messages":[HumanMessage(content="Hi My name is Amal")],"language":"Arabic"})
response.content

In [None]:
response=chain.invoke({"messages":[HumanMessage(content="Hi My name is Ajay")],"language":"Telugu"})
response.content

In [None]:
response=chain.invoke({"messages":[HumanMessage(content="Hi My name is selcuk")],"language":"Turkish"})
response.content

In [None]:
response=chain.invoke({"messages":[HumanMessage(content="Hi My name is Amal")],"language":"English"})
response.content

In [None]:
with_message_history=RunnableWithMessageHistory(
    chain,
    get_session_history,
    input_messages_key="messages"
)

In [None]:
config = {"configurable": {"session_id": "chat4"}}
response=with_message_history.invoke(
    {'messages': [HumanMessage(content="Hi,I am Amal")],"language":"English"},
    config=config
)
response.content

In [None]:
response = with_message_history.invoke(
    {"messages": [HumanMessage(content="whats my name?")], "language": "English"},
    config=config,
)

In [None]:
response.content

# **Mini Chatbot: Language Learning Assistant with Memory**

- We’ve built a smart language learning tutor that responds in your chosen language **(like English or Spanish)** and remembers what you’ve learned during the session. This tutor can help you practice vocabulary, correct your mistakes, and track your progress over time using memory. It's like having a personal language coach who understands you and guides you step by step.

In [None]:



prompt = ChatPromptTemplate.from_messages([
    ("system", "You are a supportive language tutor. Respond in {language}. Encourage and correct the student. Remember what they learned."),
    MessagesPlaceholder(variable_name="messages")
])


chain = prompt | model

store = {}

def get_session_history(session_id: str):
    if session_id not in store:
        store[session_id] = ChatMessageHistory()
    return store[session_id]

with_message_history = RunnableWithMessageHistory(
    chain,
    get_session_history,
    input_messages_key="messages"
)

config = {"configurable": {"session_id": "english_learning_amal"}}


In [None]:
response1 = with_message_history.invoke(
    {
        "messages": [HumanMessage(content="How do I say 'libro' in English?")],
        "language": "English"
    },
    config=config
)
print("Bot:", response1.content)


In [None]:
response2 = with_message_history.invoke(
    {
        "messages": [HumanMessage(content="What new word did I learn?")],
        "language": "English"
    },
    config=config
)
print("Bot:", response2.content)


# **5-Token Trimming for Long Conversations**

In [None]:
from langchain_core.messages import SystemMessage,trim_messages
trimmer=trim_messages(
    max_tokens=45,
    strategy="last",
    token_counter=model,
    include_system=True,
    allow_partial=False,
    start_on="human"
)
messages = [
    SystemMessage(content="you're a good assistant"),
    HumanMessage(content="hi! I'm Amal"),
    AIMessage(content="hi!"),
    HumanMessage(content="I like vanilla ice cream"),
    AIMessage(content="nice"),
    HumanMessage(content="whats 2 + 2"),
    AIMessage(content="4"),
    HumanMessage(content="thanks"),
    AIMessage(content="no problem!"),
    HumanMessage(content="having fun?"),
    AIMessage(content="yes!"),
]
trimmer.invoke(messages)

In [None]:
from operator import itemgetter

from langchain_core.runnables import RunnablePassthrough

chain=(
    RunnablePassthrough.assign(messages=itemgetter("messages")|trimmer)
    | prompt
    | model

)

response=chain.invoke(
    {
    "messages":messages + [HumanMessage(content="What ice cream do i like")],
    "language":"English"
    }
)
response.content

In [None]:
response = chain.invoke(
    {
        "messages": messages + [HumanMessage(content="what math problem did i ask")],
        "language": "English",
    }
)
response.content

**Note:**

- The model remembers the math problem ("whats 2 + 2") because that message appears near the end of the conversation, and the `strategy="last"` trimmer keeps the most recent messages that fit within the 45-token limit. Since the math question and its answer are within that limit, they are included when the model is called.
- In contrast, earlier messages like "I like vanilla ice cream" are further back in the history and may be trimmed out, in which case the model can no longer see them.
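
The "keep the most recent messages" idea can be sketched without LangChain. This is a simplified illustration, not the `trim_messages` implementation: `count_tokens` is a crude one-token-per-word counter (an assumption; real tokenizers count differently), and `trim_last` is a hypothetical helper that imitates `strategy="last"`, `include_system=True`, and `start_on="human"`.

```python
def count_tokens(text):
    """Crude token counter: one token per whitespace-separated word (an assumption)."""
    return len(text.split())

def trim_last(messages, max_tokens):
    """Keep the system message plus the most recent messages that fit the budget."""
    system = [m for m in messages if m[0] == "system"][:1]
    rest = [m for m in messages if m[0] != "system"]
    budget = max_tokens - sum(count_tokens(text) for _, text in system)

    kept = []
    for role, text in reversed(rest):      # walk from newest to oldest
        cost = count_tokens(text)
        if cost > budget:
            break                          # stop at the first message that does not fit
        kept.append((role, text))
        budget -= cost
    kept.reverse()

    # Imitate start_on="human": drop leading messages until a human turn comes first.
    while kept and kept[0][0] != "human":
        kept.pop(0)
    return system + kept

messages = [
    ("system", "you're a good assistant"),
    ("human", "hi! I'm Amal"),
    ("ai", "hi!"),
    ("human", "I like vanilla ice cream"),
    ("ai", "nice"),
    ("human", "whats 2 + 2"),
    ("ai", "4"),
    ("human", "thanks"),
    ("ai", "no problem!"),
    ("human", "having fun?"),
    ("ai", "yes!"),
]

trimmed = trim_last(messages, max_tokens=16)
for role, text in trimmed:
    print(role, ":", text)
```

With this word-based counter and a 16-token budget, the kept window starts at the math question, so the ice cream message and the name introduction are no longer part of what a model would see.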










In [None]:
# Wrap the trimming chain in message history
with_message_history = RunnableWithMessageHistory(
    chain,
    get_session_history,
    input_messages_key="messages",
)
config={"configurable":{"session_id":"chat5"}}

In [None]:
response = with_message_history.invoke(
    {
        "messages": messages + [HumanMessage(content="whats my name?")],
        "language": "English",
    },
    config=config,
)

response.content

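**Note:**
- The model replied “As an AI, I don’t have memory.” for two reasons. First, this was the first time the session ID "chat5" was used, so there was no stored history to draw on. Second, the only message that mentions the name ("hi! I'm Amal") is the oldest one in `messages`, and the trimmer's 45-token budget most likely drops it before the prompt reaches the model.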



# **Mini Chatbot: Customer Support Assistant**

In [None]:


trimmer = trim_messages(
    max_tokens=100,
    strategy="last",  # Keep the most recent messages
    token_counter=model,
    include_system=True,
    allow_partial=False
)


prompt = ChatPromptTemplate.from_messages([
    ("system", "You are a helpful customer support assistant."),
    MessagesPlaceholder(variable_name="messages")
])

chain = (
    RunnablePassthrough.assign(messages=itemgetter("messages") | trimmer)
    | prompt
    | model
)

messages = [
    SystemMessage(content="You are a helpful customer support assistant."),
    HumanMessage(content="Hi, I need help with a refund."),
    AIMessage(content="Sure, can you share your order number?"),
    HumanMessage(content="It's #12345"),
    AIMessage(content="Thanks. I see your order."),
    HumanMessage(content="Also, I was charged twice."),
    AIMessage(content="Sorry to hear that. We'll fix it."),
    HumanMessage(content="And I never received the item."),
    AIMessage(content="Let me check that for you."),
    HumanMessage(content="Can I still get my money back?"),
]

response = chain.invoke({
    "messages": messages + [HumanMessage(content="When can I expect a refund?")],
})

print("Bot:", response.content)


- In this **customer support chatbot**, token **trimming** keeps the conversation within a fixed budget (100 tokens here) by dropping older messages and preserving the most recent ones.
- This keeps the chatbot focused on the latest part of the conversation, which matters most during long chats.
- When combined with message history, the system can track each user’s session separately while still **applying token limits** to avoid exceeding the model’s capacity.
- This approach keeps the **chatbot efficient, responsive, and context-aware**, even during extended support interactions.

# **6- Retrieval-Augmented Generation (RAG)**

In [None]:

# Read the Groq API key from the environment (loaded from .env at the top of the notebook)
# instead of hardcoding it in the notebook.
groq_api_key=os.getenv("GROQ_API_KEY")


llm=ChatGroq(groq_api_key=groq_api_key,model_name="Llama3-8b-8192")

llm


In [None]:

# Expose the Hugging Face token (loaded from .env earlier) under the name the Hugging Face libraries read
os.environ['HF_TOKEN']=hf_token
from langchain_community.embeddings import HuggingFaceEmbeddings
embeddings=HuggingFaceEmbeddings(model_name="all-MiniLM-L6-v2")

In [None]:

from langchain_chroma import Chroma
from langchain_community.document_loaders import WebBaseLoader
from langchain_core.prompts import ChatPromptTemplate
from langchain_text_splitters import RecursiveCharacterTextSplitter
from langchain.chains import create_retrieval_chain
from langchain.chains.combine_documents import create_stuff_documents_chain

**Overview:**
- We load a webpage and use **BeautifulSoup** to **parse and filter the HTML content**. By targeting specific class names (`post-content`, `post-title`, `post-header`), we extract only the relevant sections, namely the title, header, and main content, and ignore the rest of the page.

In [None]:
import bs4
loader = WebBaseLoader(
    web_paths=("https://lilianweng.github.io/posts/2023-06-23-agent/",),
    bs_kwargs=dict(
        parse_only=bs4.SoupStrainer(
            class_=("post-content", "post-title", "post-header")
        )
    ),
)

docs=loader.load()
docs

In [None]:
text_splitter=RecursiveCharacterTextSplitter(chunk_size=1000,chunk_overlap=200)
splits=text_splitter.split_documents(docs)
vectorstore=Chroma.from_documents(documents=splits,embedding=embeddings)
retriever=vectorstore.as_retriever()
retriever

In [None]:
## Prompt Template
system_prompt = (
    "You are an assistant for question-answering tasks. "
    "Use the following pieces of retrieved context to answer "
    "the question. If you don't know the answer, say that you "
    "don't know. Use three sentences maximum and keep the "
    "answer concise."
    "\n\n"
    "{context}"
)

prompt = ChatPromptTemplate.from_messages(
    [
        ("system", system_prompt),
        ("human", "{input}"),
    ]
)

In [None]:
question_answer_chain=create_stuff_documents_chain(llm,prompt)
rag_chain=create_retrieval_chain(retriever,question_answer_chain)

# **Note:**
- This setup creates a RAG-based question-answering system that retrieves relevant information and generates concise answers using a language model. It enhances the accuracy and relevance of responses by grounding them in real-world data, making it ideal for tasks like document-based chat, knowledge assistants, and smart search systems.
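
The retrieve-then-generate flow can be sketched in plain Python. This is a toy illustration under stated assumptions, not the notebook's pipeline: `embed` uses word counts as a stand-in for the Hugging Face embedding model, `retrieve` does a brute-force cosine-similarity search in place of Chroma, and `build_prompt` only assembles the text that would be sent to the LLM. All three names and the sample chunks are hypothetical.

```python
import math
import re
from collections import Counter

def embed(text):
    """Toy embedding: lowercase word counts (a stand-in for a real embedding model)."""
    return Counter(re.findall(r"[a-z]+", text.lower()))

def cosine(a, b):
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[w] * b[w] for w in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def retrieve(question, chunks, k=1):
    """Return the k chunks most similar to the question."""
    q = embed(question)
    ranked = sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)
    return ranked[:k]

def build_prompt(question, chunks, k=1):
    """Place the retrieved chunks into the prompt ahead of the question."""
    context = "\n".join(retrieve(question, chunks, k))
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"

chunks = [
    "Task decomposition splits a complex task into smaller steps.",
    "Self-reflection lets an agent improve by reviewing past actions.",
    "Vector stores index embeddings for similarity search.",
]

question = "What is self-reflection?"
top = retrieve(question, chunks)
prompt_text = build_prompt(question, chunks)
print(prompt_text)
```

The language model never searches the documents itself; it only sees the chunks that retrieval places into the prompt, which is why retrieval quality bounds answer quality.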










In [None]:
response=rag_chain.invoke({"input":"What is Self-Reflection"})
response

In [None]:
response['answer']

In [None]:
rag_chain.invoke({"input":"How do we achieve it"})

In [None]:
from langchain.chains import create_history_aware_retriever
from langchain_core.prompts import MessagesPlaceholder

contextualize_q_system_prompt = (
    "Given a chat history and the latest user question "
    "which might reference context in the chat history, "
    "formulate a standalone question which can be understood "
    "without the chat history. Do NOT answer the question, "
    "just reformulate it if needed and otherwise return it as is."
)
contextualize_q_prompt = ChatPromptTemplate.from_messages(
    [
        ("system", contextualize_q_system_prompt),
        MessagesPlaceholder("chat_history"),
        ("human", "{input}"),
    ]
)

In [None]:
history_aware_retriever=create_history_aware_retriever(llm,retriever,contextualize_q_prompt)
history_aware_retriever

In [None]:
qa_prompt = ChatPromptTemplate.from_messages(
    [
        ("system", system_prompt),
        MessagesPlaceholder("chat_history"),
        ("human", "{input}"),
    ]
)

In [None]:
question_answer_chain=create_stuff_documents_chain(llm,qa_prompt)
rag_chain=create_retrieval_chain(history_aware_retriever,question_answer_chain)

In [None]:
from langchain_core.messages import AIMessage,HumanMessage
chat_history=[]
question="What is Self-Reflection"
response1=rag_chain.invoke({"input":question,"chat_history":chat_history})

chat_history.extend(
    [
        HumanMessage(content=question),
        AIMessage(content=response1["answer"])
    ]
)

question2="Tell me more about it?"
response2=rag_chain.invoke({"input":question2,"chat_history":chat_history})
print(response2['answer'])

In [None]:
chat_history

# **Note:**
- This setup demonstrates a conversational RAG (Retrieval-Augmented Generation) system that can understand and answer questions based not only on retrieved documents but also on previous interactions.
- The first user question is answered using relevant content retrieved from a knowledge source. That question and answer are saved as chat history.
- When the user asks a follow-up question like “Tell me more about it?”, the system uses the stored history to understand what “it” refers to and generate a meaningful, context-aware answer.
- This shows how combining document retrieval with chat history enables more natural, coherent, and intelligent multi-turn conversations.
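
The role of chat history in resolving a follow-up can be sketched as below. This is only a crude stand-in: in the notebook, an LLM rewrites the question through `create_history_aware_retriever`, whereas the hypothetical `contextualize` helper here simply attaches the previous user question so that a retriever would have the missing topic to search on.

```python
def contextualize(question, chat_history):
    """Crude stand-in for the LLM rewrite step: attach the previous user
    question so a reference like "it" has something to point at."""
    previous = [text for role, text in chat_history if role == "human"]
    if not previous:
        return question          # first turn: nothing to resolve
    return f"{question} (follow-up to: {previous[-1]})"

chat_history = []

q1 = "What is Self-Reflection?"
standalone1 = contextualize(q1, chat_history)
# After answering, both turns are appended to the history.
chat_history += [("human", q1), ("ai", "<answer to the first question>")]

q2 = "Tell me more about it"
standalone2 = contextualize(q2, chat_history)

print(standalone1)
print(standalone2)
```

The first question passes through unchanged, while the follow-up carries the earlier topic along with it, so a search for it can reach the self-reflection passages.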

In [None]:
from langchain_community.chat_message_histories import ChatMessageHistory
from langchain_core.chat_history import BaseChatMessageHistory
from langchain_core.runnables.history import RunnableWithMessageHistory

store = {}


def get_session_history(session_id: str) -> BaseChatMessageHistory:
    if session_id not in store:
        store[session_id] = ChatMessageHistory()
    return store[session_id]


conversational_rag_chain = RunnableWithMessageHistory(
    rag_chain,
    get_session_history,
    input_messages_key="input",
    history_messages_key="chat_history",
    output_messages_key="answer",
)

In [None]:
conversational_rag_chain.invoke(
    {"input": "What is Task Decomposition?"},
    config={
        "configurable": {"session_id": "abc123"}
    },  # constructs a key "abc123" in `store`.
)["answer"]

In [None]:
conversational_rag_chain.invoke(
    {"input": "who is the president of US ?"},
    config={"configurable": {"session_id": "abc123"}},
)["answer"]

In [None]:
conversational_rag_chain.invoke(
    {"input": "What are common ways of doing it?"},
    config={"configurable": {"session_id": "abc123"}},
)["answer"]

In [None]:
store

# **Note:**
- The **conversational AI** was able to answer the question “What are common ways of doing it?” because the chat history contained the earlier question about “Task Decomposition.”
- The history-aware retriever used that context to work out what “it” referred to and retrieved the passages of the linked article that discuss strategies for task decomposition.
- In contrast, when asked “**Who is the president of the US**?”, the model responded “**I don’t know**” because the article does not contain that information and the system prompt tells the model to say so when the retrieved context does not answer the question. The answers of this RAG chain are therefore bounded by its source documents, even when the question itself is clear.

# **Mini Chatbot: AI Study Assistant with RAG + Chat History**



In [None]:



session_id = "student_ai_study"

question1 = "What is Task Decomposition?"
response1 = conversational_rag_chain.invoke(
    {"input": question1},
    config={"configurable": {"session_id": session_id}},
)
print("👩‍🎓", question1)
print("🤖", response1["answer"])


question2 = "What are some common ways of doing it?"
response2 = conversational_rag_chain.invoke(
    {"input": question2},
    config={"configurable": {"session_id": session_id}},
)
print("\n👩‍🎓", question2)
print("🤖", response2["answer"])

question3 = "What evidence supports the existence of dark matter in galaxies?"
response3 = conversational_rag_chain.invoke(
    {"input": question3},
    config={"configurable": {"session_id": session_id}},
)
print("\n👩‍🎓", question3)
print("🤖", response3["answer"])


# **Conclusion**
We have built a sophisticated Q&A chatbot with several advanced capabilities:

1. **Context-Aware Conversations**: The chatbot remembers previous exchanges and maintains coherent dialogues
2. **Multi-User Support**: The system keeps a separate conversation history for each session ID
3. **Multilingual Communication**: The assistant can converse in various languages
4. **Memory Management**: Token trimming keeps long conversation histories within the model's context limit
5. **Knowledge Integration**: External information can be retrieved from documents and used to ground the chatbot's answers (RAG)


# **Final Thoughts**

The field of conversational AI is evolving rapidly, with new models and techniques emerging regularly. By understanding the fundamental components covered in this project, we have built a strong foundation that will allow us to adapt to these changes and create increasingly sophisticated AI applications.

Building effective AI systems is an iterative process that benefits from continuous testing and refinement. As we continue to experiment and learn, the chatbots will become more capable, helpful, and natural in their interactions.

**Thank you for exploring this fascinating field with us!**