This chatbot will be able to have a conversation and remember previous interactions.

Note that the chatbot we build here uses only the language model to hold a conversation. You may be looking for one of these related concepts instead:

- Conversational RAG: Enable a chatbot experience over an external source of data
- Agents: Build a chatbot that can take actions

In [10]:
import os
from dotenv import load_dotenv
load_dotenv()

groq_api_key = os.getenv("GROQ_API_KEY")

In [19]:
from langchain_groq import ChatGroq
model = ChatGroq(model = "Gemma2-9b-it", groq_api_key = groq_api_key)
model

ChatGroq(client=<groq.resources.chat.completions.Completions object at 0x10d9e7380>, async_client=<groq.resources.chat.completions.AsyncCompletions object at 0x10da2a9f0>, model_name='Gemma2-9b-it', model_kwargs={}, groq_api_key=SecretStr('**********'))

In [20]:
from langchain_core.messages import HumanMessage
# HumanMessage represents a message sent by the user
model.invoke([HumanMessage(content="Hi, my name is Palak and I am an AI & Data Consultant")])

AIMessage(content="Hi Palak, it's nice to meet you! \n\nThat's a fascinating field.  What kind of projects do you typically work on as an AI & Data Consultant?  \n\nI'm always eager to learn more about how AI is being used in different industries.\n", additional_kwargs={}, response_metadata={'token_usage': {'completion_tokens': 61, 'prompt_tokens': 24, 'total_tokens': 85, 'completion_time': 0.110909091, 'prompt_time': 0.002129204, 'queue_time': 0.017168754, 'total_time': 0.113038295}, 'model_name': 'Gemma2-9b-it', 'system_fingerprint': 'fp_10c08bf97d', 'finish_reason': 'stop', 'logprobs': None}, id='run--df701e4f-f658-4a0f-a823-5c701463748e-0', usage_metadata={'input_tokens': 24, 'output_tokens': 61, 'total_tokens': 85})

In [21]:
from langchain_core.messages import AIMessage
model.invoke([
    HumanMessage(content="Hi, my name is Palak and I am an AI & Data Consultant"),
    AIMessage(content="Hi Palak, it's nice to meet you! \n\nThat's a fascinating field. What kind of projects do you typically work on as an AI & Data Consultant? \n\nI'm always interested in learning more about how AI is being used in different industries.\n"),
    HumanMessage(content="Hey. What's my name and what do I do?")  # the earlier turns are passed in explicitly, so the model can answer from that context
])

AIMessage(content="You said your name is Palak, and you're an AI & Data Consultant!  😊 \n\nIs there anything else you'd like me to know about you or your work?  I'm eager to learn more. \n", additional_kwargs={}, response_metadata={'token_usage': {'completion_tokens': 52, 'prompt_tokens': 105, 'total_tokens': 157, 'completion_time': 0.094545455, 'prompt_time': 0.005551734, 'queue_time': 0.018018244, 'total_time': 0.100097189}, 'model_name': 'Gemma2-9b-it', 'system_fingerprint': 'fp_10c08bf97d', 'finish_reason': 'stop', 'logprobs': None}, id='run--0d034d56-f646-45b3-b649-3fa4132f8196-0', usage_metadata={'input_tokens': 105, 'output_tokens': 52, 'total_tokens': 157})

## Message History

We can use a message history class to wrap our model and make it stateful. It keeps track of the model's inputs and outputs and stores them in a datastore. Future interactions then load those messages and pass them into the chain as part of the input. Let's see how to use this!
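To make the idea concrete before using the real classes, here is a minimal sketch of what "wrapping a model to make it stateful" amounts to. Everything in it is a hypothetical stand-in: `fake_model` replaces the LLM, and a plain list replaces the datastore. It is not how LangChain implements the wrapper, only an illustration of the load, call, save cycle.

```python
# Hypothetical stand-in for a chat model: it only reports how many
# messages it received, which makes the effect of history visible.
def fake_model(messages):
    return ("ai", f"I can see {len(messages)} message(s).")

history = []  # the datastore: a plain in-memory list

def invoke_with_history(user_text):
    new_message = ("human", user_text)
    # Load the stored messages and pass them in together with the new input.
    reply = fake_model(history + [new_message])
    # Save both the input and the output for the next call.
    history.extend([new_message, reply])
    return reply

first = invoke_with_history("Hi, my name is Palak")
second = invoke_with_history("What's my name?")
print(first[1])   # I can see 1 message(s).
print(second[1])  # I can see 3 message(s).
```

The second call sees three messages because the first exchange (one human message and one reply) was stored and replayed.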

In [22]:
pip install langchain_community

Note: you may need to restart the kernel to use updated packages.


In [25]:
# Import the classes needed for message history
from langchain_community.chat_message_histories import ChatMessageHistory
from langchain_core.chat_history import BaseChatMessageHistory
from langchain_core.runnables.history import RunnableWithMessageHistory  # adds each message to the chat history

# To keep one session separate from another, map each session id to its own history.
store = {}

def get_session_history(session_id: str) -> BaseChatMessageHistory:
    """Return the chat history for the given session id."""
    # The first time a session id is seen, create an empty history for it.
    if session_id not in store:
        store[session_id] = ChatMessageHistory()
    return store[session_id]

# Wrap the model so that each call reads from and writes to the session's history
with_message_history = RunnableWithMessageHistory(model, get_session_history)

In [26]:
# The config selects which session's history is used
config = {"configurable": {"session_id": "chat1"}}  # hardcoded for now


In [28]:
#use the session id to chat with LLM model
response = with_message_history.invoke(
    [
        HumanMessage(content="Hi, my name is Palak and I am an AI & Data Consultant")
    ], config = config
)

In [29]:
response.content

"Hi Palak, it's nice to meet you! \n\nThat's a really interesting field. I'm always curious to hear about the kinds of projects AI and data consultants work on.  Could you tell me a bit more about your work? \n\nWhat are some of the challenges and rewards you find in your role?\n"

In [30]:
with_message_history.invoke(
    [
        HumanMessage(content="What's my name?")  # same config, so same session id: the stored history is sent along with this message
    ], config = config
)

AIMessage(content='Your name is Palak.  \n\nI remembered it from our introduction! 😊  Is there anything else I can help you with?  \n', additional_kwargs={}, response_metadata={'token_usage': {'completion_tokens': 31, 'prompt_tokens': 199, 'total_tokens': 230, 'completion_time': 0.056363636, 'prompt_time': 0.009150666, 'queue_time': 0.092487442, 'total_time': 0.065514302}, 'model_name': 'Gemma2-9b-it', 'system_fingerprint': 'fp_10c08bf97d', 'finish_reason': 'stop', 'logprobs': None}, id='run--0d219070-cdcf-48d7-8dd9-89bc1d4828f9-0', usage_metadata={'input_tokens': 199, 'output_tokens': 31, 'total_tokens': 230})

Let's try changing the config so that it uses a different session id.

In [35]:
config1 = {"configurable":{"session_id":"chat2"}}
response = with_message_history.invoke(
    [HumanMessage(content="What's my name?")],
    config=config1
)
response.content

"Since I don't have access to past conversations, I don't know your name.  \n\nIf you'd like to tell me, I'm happy to learn!  😊  \n\n"

In [36]:
response = with_message_history.invoke(
    [HumanMessage(content="Hey, my name is Lisa")],
    config=config1
)
response.content

"Hi Lisa!  It's nice to meet you. 😊 \n\nIs there anything I can help you with today?\n"

In [37]:

response = with_message_history.invoke(
    [HumanMessage(content="What's my name?")],
    config=config1
)
response.content

'Your name is Lisa!  I remember you telling me. 😊  How can I help you today, Lisa?\n'

So, by changing the session id, we can switch the context of the conversation.
Three pieces make this work: BaseChatMessageHistory, an abstract class for storing message history; ChatMessageHistory, the in-memory implementation used here; and RunnableWithMessageHistory, which connects the model to the session history.
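The session switching seen above can be illustrated without a model at all. The sketch below uses a hypothetical `SimpleHistory` class as a stand-in for ChatMessageHistory and a `get_history` function that mirrors the lazy-creation pattern of `get_session_history`; both names are assumptions made for this example.

```python
# Hypothetical stand-in for ChatMessageHistory: just a list of messages.
class SimpleHistory:
    def __init__(self):
        self.messages = []

    def add(self, role, text):
        self.messages.append((role, text))

store = {}

def get_history(session_id):
    # Create the history lazily the first time a session id is seen.
    if session_id not in store:
        store[session_id] = SimpleHistory()
    return store[session_id]

get_history("chat1").add("human", "Hi, my name is Palak")
get_history("chat2").add("human", "Hey, my name is Lisa")

chat1 = [text for _, text in get_history("chat1").messages]
chat2 = [text for _, text in get_history("chat2").messages]
print(chat1)  # ['Hi, my name is Palak']
print(chat2)  # ['Hey, my name is Lisa']
```

Each session id owns its own history object, so nothing said in one session is visible in the other.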

## Prompt templates
Prompt templates help turn raw user input into a format that the LLM can work with. So far, the raw user input has simply been a message that we pass to the LLM. Now, let's add a system message with some custom instructions (while still taking messages as input). After that, we will add more inputs alongside the messages.
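As a rough mental model, a chat prompt template fills in the variables of a system message and then splices the conversation messages into a reserved slot. The sketch below shows that idea in plain Python; `build_prompt` is a hypothetical helper written for this illustration and is not part of LangChain.

```python
def build_prompt(system_template, messages, **variables):
    # Fill the {placeholders} in the system text, then splice the
    # conversation messages in after it, as a placeholder slot would.
    system_message = ("system", system_template.format(**variables))
    return [system_message, *messages]

prompt_messages = build_prompt(
    "You are a helpful assistant. Answer in {language}.",
    [("human", "Hi, my name is Palak")],
    language="Hindi",
)
for role, text in prompt_messages:
    print(f"{role}: {text}")
```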

In [48]:
from langchain_core.prompts import ChatPromptTemplate, MessagesPlaceholder
prompt = ChatPromptTemplate.from_messages(
    [
        ("system", "You are a helpful assistant. Answer all the questions to the best of your capability."),
        MessagesPlaceholder(variable_name="messages")
    ]
)

#create chain
chain = prompt | model

In [50]:
# Invoke the chain
# The input is now a dict; its key must match the MessagesPlaceholder variable name, 'messages'.
response = chain.invoke({"messages":[HumanMessage(content="Hi, my name is Palak")]})
response.content

"Hello, Palak! It's nice to meet you. \n\nHow can I help you today? 😊  \n"

In [51]:
## Add more complexity

prompt = ChatPromptTemplate.from_messages(
    [
        (
            "system",
            "You are a helpful assistant. Answer all questions to the best of your ability in {language}.",
        ),
        MessagesPlaceholder(variable_name="messages"),
    ]
)

chain = prompt | model

In [52]:
response=chain.invoke({"messages":[HumanMessage(content="Hi My name is Palak")],"language":"Hindi"})
response.content

'नमस्ते पालक! मुझे बताएं, आप क्या जानना चाहती हैं? मैं अपनी पूरी कोशिश करूँगा कि आपकी मदद कर सकूँ। 😊  \n'

Let's now wrap this more complicated chain in a Message History class. This time, because there are multiple keys in the input, we need to specify the correct key to use to save the chat history.
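To see why the key matters, consider what a history wrapper has to do with a dict input: it must know which entry holds the conversation so it can prepend the stored messages, while leaving the other entries alone. The following is a simplified sketch of that step only; `merge_history` is a hypothetical function, and the real wrapper also saves the new messages afterwards.

```python
def merge_history(inputs, history, messages_key):
    # Only the value under messages_key is treated as conversation;
    # every other key (for example "language") passes through unchanged.
    merged = dict(inputs)
    merged[messages_key] = history + inputs[messages_key]
    return merged

history = [("human", "Hi, I am Palak"), ("ai", "Hello Palak!")]
inputs = {"messages": [("human", "What's my name?")], "language": "Hindi"}

merged = merge_history(inputs, history, messages_key="messages")
print(len(merged["messages"]))  # 3
print(merged["language"])       # Hindi
```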

In [53]:
with_message_history = RunnableWithMessageHistory(
    chain,
    get_session_history,
    input_messages_key="messages"
)

In [58]:
config = {"configurable":{"session_id":"chat3"}}
response = with_message_history.invoke(
    {'messages':[HumanMessage(content="Hi, I am Palak")], "language":"Hindi"},
    config=config
)
response.content

'नमस्ते, पालक! 😊 \n\nमुझे खुशी है तुमसे मिलने की।  आप कैसे हो? \n'

In [60]:
response=with_message_history.invoke({"messages":[HumanMessage(content="What's my name?")],"language":"Hindi"}, config = config)
response.content

'आपका नाम पालक है। 😊  \n'

## Manage conversation history
An important part of building a chatbot is managing the conversation history. If left unmanaged, the list of messages keeps growing and can overflow the context window of the LLM. So, we need to add a step that limits the size of the messages passed to the model.

'trim_messages' helps reduce how many messages are sent to the model: the trimmer lets us specify how many tokens to keep, along with options such as whether to always keep the system message and whether to allow partial messages.
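The "keep the last messages that fit" idea can be sketched in plain Python. This is an illustration under stated assumptions, not the library's algorithm: `count_tokens` counts whitespace-separated words instead of using a real tokenizer, and `trim_last` is a hypothetical function that keeps the system message, fills the remaining budget from the newest message backwards, and makes the window start on a human turn.

```python
def count_tokens(message):
    # Assumption: one token per whitespace-separated word. A real
    # tokenizer will give different counts.
    return len(message[1].split())

def trim_last(messages, max_tokens):
    """Keep the system message plus the most recent messages that fit."""
    system = [m for m in messages[:1] if m[0] == "system"]
    rest = messages[len(system):]
    budget = max_tokens - sum(count_tokens(m) for m in system)

    kept = []
    # Walk backwards so the newest messages are considered first.
    for message in reversed(rest):
        cost = count_tokens(message)
        if cost > budget:
            break
        kept.insert(0, message)
        budget -= cost

    # Drop leading messages until the window starts on a human turn.
    while kept and kept[0][0] != "human":
        kept.pop(0)
    return system + kept

conversation = [
    ("system", "You are a great assistant!"),
    ("human", "I love vanilla ice cream"),
    ("ai", "That's good to hear"),
    ("human", "What is 2+2?"),
    ("ai", "4"),
    ("human", "Thanks!"),
]
for role, text in trim_last(conversation, max_tokens=12):
    print(f"{role}: {text}")
```

With a budget of 12 word-tokens, the ice cream exchange falls outside the window while the system message and the three most recent messages remain.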

In [69]:
from langchain_core.messages import SystemMessage, trim_messages
trimmer = trim_messages(
    max_tokens = 45,
    strategy = "last", # keep the most recent messages
    token_counter = model, # use the model to count tokens
    include_system = True, # always keep the system message, which tells the LLM what to do
    allow_partial = False, # never cut a message in half
    start_on = "human" # the trimmed history must start with a human message
)

In [67]:
pip install transformers


In [80]:
messages = [
    SystemMessage(content="You are a great assistant!"),
    HumanMessage(content="Hi! I am Palak"),
    AIMessage(content="Hi!"),
    HumanMessage(content="I love vanilla ice cream"),
    AIMessage(content="That's good to hear"),
    HumanMessage(content="What is the 2+2?"),
    AIMessage(content="4"),
    HumanMessage(content="Thanks!"),
    AIMessage(content="No issues :)"),
    HumanMessage(content="Having fun?"),
    AIMessage(content="Yes!!)")


]
trimmer.invoke(messages)

[SystemMessage(content='You are a great assistant!', additional_kwargs={}, response_metadata={}),
 HumanMessage(content='What is the 2+2?', additional_kwargs={}, response_metadata={}),
 AIMessage(content='4', additional_kwargs={}, response_metadata={}),
 HumanMessage(content='Thanks!', additional_kwargs={}, response_metadata={}),
 AIMessage(content='No issues :)', additional_kwargs={}, response_metadata={}),
 HumanMessage(content='Having fun?', additional_kwargs={}, response_metadata={}),
 AIMessage(content='Yes!!)', additional_kwargs={}, response_metadata={})]

In [81]:
from operator import itemgetter

from langchain_core.runnables import RunnablePassthrough

chain = (
    RunnablePassthrough.assign(messages = itemgetter("messages")|trimmer)
    | prompt | model
)

response = chain.invoke(
    {"messages":messages + [HumanMessage(content = "What ice cream do I like?")],
             "language":"English"
             })
response.content # the trimmer dropped the ice cream message, so the model no longer has that context

"As an AI, I have no memory of past conversations or personal information about you. So, I don't know what your favorite ice cream flavor is!\n\nWhat's your favorite flavor?\n"

In [82]:
response = chain.invoke(
    {"messages":messages + [HumanMessage(content = "What math problem did I ask?")],
             "language":"English"
             })
response.content # the new question also counts toward the 45-token budget, so older turns may be trimmed; in this run the model did not recall the math question

"As a large language model, I have no memory of past conversations. Every interaction we have is fresh and new!  \n\nIf you'd like to work on a math problem, I'm happy to help. Just ask! 😊\n"

Let's wrap this chain in message history.

In [84]:
with_message_history = RunnableWithMessageHistory(
    chain,
    get_session_history,
    input_messages_key="messages"
)
config = {"configurable":{"session_id":"chat4"}}

In [85]:
response = with_message_history.invoke(
    {"messages": messages + [HumanMessage(content="What's my name?")],
     "language": "English"},
    config=config,
)
response.content # the trimmer dropped the introduction, so the model no longer knows the name

"As an AI, I don't have memory of past conversations or personal information about you. So I don't know your name.  \n\nWhat's your name? 😊  \n\n"