## Building A Chatbot
In this video, we'll go over an example of how to design and implement an LLM-powered chatbot. This chatbot will be able to hold a conversation and remember previous interactions.

Note that the chatbot we build here uses only the language model to hold a conversation. There are several related concepts you may be looking for:

- Conversational RAG: Enable a chatbot experience over an external source of data
- Agents: Build a chatbot that can take actions

This video tutorial covers the basics, which will be helpful for those two more advanced topics.

In [40]:
import os
from dotenv import load_dotenv

load_dotenv()  # load all environment variables from the .env file

groq_api_key=os.getenv("GROQ_API_KEY")
# groq_api_key

In [2]:
from langchain_groq import ChatGroq
model=ChatGroq(model="openai/gpt-oss-20b",groq_api_key=groq_api_key)
model

ChatGroq(client=<groq.resources.chat.completions.Completions object at 0x00000239F9C36470>, async_client=<groq.resources.chat.completions.AsyncCompletions object at 0x00000239F9C36380>, model_name='openai/gpt-oss-20b', model_kwargs={}, groq_api_key=SecretStr('**********'))

In [3]:
from langchain_core.messages import HumanMessage
model.invoke([HumanMessage(content="Hi , My name is Sushant Twayana and I am a Junior AI/ML Engineer")])

AIMessage(content='Hi Sushant! 👋 Great to meet a Junior AI/ML Engineer. How can I help you today? Whether you’re looking for guidance on projects, need help debugging code, or just want to chat about the latest in AI, I’m here for you.', additional_kwargs={'reasoning_content': 'We need to respond. The user says: "Hi, My name is Sushant Twayana and I am a Junior AI/ML Engineer". They likely want to introduce themselves. We should respond politely, maybe ask how we can help. The user didn\'t ask a question. We should ask what they need. So we respond with greeting, maybe ask if they need help.'}, response_metadata={'token_usage': {'completion_tokens': 142, 'prompt_tokens': 91, 'total_tokens': 233, 'completion_time': 0.148120225, 'prompt_time': 0.007554388, 'queue_time': 0.052941592, 'total_time': 0.155674613, 'completion_tokens_details': {'reasoning_tokens': 78}}, 'model_name': 'openai/gpt-oss-20b', 'system_fingerprint': 'fp_e99e93f2ac', 'service_tier': 'on_demand', 'finish_reason

In [4]:
from langchain_core.messages import AIMessage
model.invoke(
    [
        HumanMessage(content="Hi , My name is Sushant Twayana and I am a Junior AI/ML Engineer"),
        AIMessage(content="Hi Sushant! 👋 Great to meet a Junior AI/ML Engineer. How can I help you today? Whether you’re looking for guidance on projects, need help debugging code, or just want to chat about the latest in AI, I’m here for you."),
        HumanMessage(content="Hey What's my name and in which field do I work??")
    ]
)

AIMessage(content='Your name is **Sushant Twayana** and you work in the **AI/ML (Artificial Intelligence / Machine Learning) engineering field** as a Junior AI/ML Engineer.', additional_kwargs={'reasoning_content': 'User: "Hey What\'s my name and in which field do I work??" They previously introduced themselves: "Hi , My name is Sushant Twayana and I am a Junior AI/ML Engineer". So answer: name Sushant Twayana, works in AI/ML engineering. Provide answer.'}, response_metadata={'token_usage': {'completion_tokens': 112, 'prompt_tokens': 168, 'total_tokens': 280, 'completion_time': 0.113533325, 'prompt_time': 0.018717539, 'queue_time': 0.064875581, 'total_time': 0.132250864, 'completion_tokens_details': {'reasoning_tokens': 65}}, 'model_name': 'openai/gpt-oss-20b', 'system_fingerprint': 'fp_e99e93f2ac', 'service_tier': 'on_demand', 'finish_reason': 'stop', 'logprobs': None, 'model_provider': 'groq'}, id='lc_run--adaba520-d599-4b01-b222-8f328d247a1f-0', usage_metadata={'input_tokens': 168, 

### Message History
We can use a Message History class to wrap our model and make it stateful. This will keep track of inputs and outputs of the model, and store them in some datastore. Future interactions will then load those messages and pass them into the chain as part of the input. Let's see how to use this!
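
Before turning to the library classes, the idea can be shown with a minimal plain-Python sketch. Everything here is hypothetical and only for illustration: `sessions` stands in for the datastore, `fake_model` stands in for a real LLM, and `invoke_with_history` is not a LangChain function. It loads the stored messages for a session, passes them to the model together with the new input, and then saves both the input and the reply.

```python
from typing import Callable, Dict, List, Tuple

Message = Tuple[str, str]  # (role, text), e.g. ("human", "hi")

# Hypothetical in-memory datastore: one message list per session id.
sessions: Dict[str, List[Message]] = {}


def fake_model(messages: List[Message]) -> Message:
    """Stand-in for a real LLM: reports how many messages it was shown."""
    return ("ai", f"I can see {len(messages)} message(s).")


def invoke_with_history(
    model: Callable[[List[Message]], Message],
    session_id: str,
    new_message: Message,
) -> Message:
    history = sessions.setdefault(session_id, [])  # load or create the session
    reply = model(history + [new_message])         # model sees past + new input
    history.extend([new_message, reply])           # persist both sides of the turn
    return reply


first = invoke_with_history(fake_model, "chat_1", ("human", "Hi, I'm Sushant"))
second = invoke_with_history(fake_model, "chat_1", ("human", "What's my name?"))
other = invoke_with_history(fake_model, "chat_2", ("human", "What's my name?"))
print(first[1], second[1], other[1], sep="\n")
```

The second call in `chat_1` sees three messages (the first exchange plus the new question), while the call in `chat_2` sees only one, which is why a separate session has no memory of the first.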

In [5]:
# !pip install langchain_community

* `ChatMessageHistory`: a simple in-memory implementation of a chat message history. It stores messages in an in-memory list.
* `BaseChatMessageHistory`: the abstract base class for storing chat message history.

In [10]:
from langchain_community.chat_message_histories import ChatMessageHistory
from langchain_core.chat_history import BaseChatMessageHistory
from langchain_core.runnables.history import RunnableWithMessageHistory

store={}

def get_session_history(session_id:str)->BaseChatMessageHistory:
    """Return the chat message history for the given session_id.

       The session_id distinguishes one chat session from another. If the session_id is already in the store dict, its existing message history is returned; otherwise a new, empty ChatMessageHistory is created and saved under that session_id.
    """
    if session_id not in store:
        store[session_id]=ChatMessageHistory()
    return store[session_id]

with_message_history=RunnableWithMessageHistory(model,get_session_history)

In [11]:
config={"configurable":{"session_id":"chat_1"}}

In [14]:
response=with_message_history.invoke(
    [HumanMessage(content="Hi , My name is Sushant Twayana and I am a Junior AI/ML Engineer")],
    config=config
)

response

AIMessage(content='Nice to meet you, Sushant! 👋 How can I support you in your AI/ML journey today? Whether it’s a question about a model, a project idea, or anything else, just let me know!', additional_kwargs={'reasoning_content': 'The user repeats introduction. We need to respond in a friendly manner. No policy violation. Just a greeting.'}, response_metadata={'token_usage': {'completion_tokens': 78, 'prompt_tokens': 172, 'total_tokens': 250, 'completion_time': 0.078284153, 'prompt_time': 0.019499299, 'queue_time': 0.05604219, 'total_time': 0.097783452, 'completion_tokens_details': {'reasoning_tokens': 23}}, 'model_name': 'openai/gpt-oss-20b', 'system_fingerprint': 'fp_e99e93f2ac', 'service_tier': 'on_demand', 'finish_reason': 'stop', 'logprobs': None, 'model_provider': 'groq'}, id='lc_run--e53818fb-8d0b-4bb8-b067-dca0655c74d5-0', usage_metadata={'input_tokens': 172, 'output_tokens': 78, 'total_tokens': 250, 'output_token_details': {'reasoning': 23}})

In [13]:
response.content

'Hello Sushant! 👋 It’s great to meet a Junior AI/ML Engineer. How can I help you today? Are you looking for resources, project ideas, debugging tips, or something else? Let me know what’s on your mind!'

In [15]:
with_message_history.invoke(
    [HumanMessage(content="Whats my name??")],
    config=config,
)

AIMessage(content='You’re Sushant\u202fTwayana.', additional_kwargs={'reasoning_content': 'User says "Whats my name??". We know from context: "My name is Sushant Twayana". So answer: Sushant Twayana.'}, response_metadata={'token_usage': {'completion_tokens': 54, 'prompt_tokens': 232, 'total_tokens': 286, 'completion_time': 0.057367443, 'prompt_time': 0.026055311, 'queue_time': 0.051027348, 'total_time': 0.083422754, 'completion_tokens_details': {'reasoning_tokens': 35}}, 'model_name': 'openai/gpt-oss-20b', 'system_fingerprint': 'fp_e99e93f2ac', 'service_tier': 'on_demand', 'finish_reason': 'stop', 'logprobs': None, 'model_provider': 'groq'}, id='lc_run--8880efd5-15b1-4131-837f-97519a5ef2aa-0', usage_metadata={'input_tokens': 232, 'output_tokens': 54, 'total_tokens': 286, 'output_token_details': {'reasoning': 35}})

In [None]:
## Change the config to use a different session_id
config1={"configurable":{"session_id":"chat_2"}}
response=with_message_history.invoke(
    [HumanMessage(content="Whats my name??")],
    config=config1
)

## This is a new session_id, so there is no previous context to retrieve.
response.content

'I don’t have that information—what’s your name?'

In [19]:
response=with_message_history.invoke(
    [HumanMessage(content="Hey My name is John")],
    config=config1
)
response.content

'Nice to meet you, John! How can I help you today?'

In [20]:
response=with_message_history.invoke(
    [HumanMessage(content="Whats my name")],
    config=config1
)
response.content

'Your name is John.'

### Prompt templates
Prompt Templates help to turn raw user information into a format that the LLM can work with. In this case, the raw user input is just a message, which we are passing to the LLM. Let's now make that a bit more complicated. First, let's add in a system message with some custom instructions (but still taking messages as input). Next, we'll add in more input besides just the messages.
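
To make the idea concrete, here is a small plain-Python sketch of what such a template does conceptually. It is not LangChain's implementation: `SYSTEM_TEMPLATE` and `render_prompt` are hypothetical names used only for illustration. The function fills the variables in the system message and then splices the caller's messages in after it, which is the role a messages placeholder plays.

```python
from typing import Any, Dict, List, Tuple

Message = Tuple[str, str]  # (role, text)

# Hypothetical system template with one extra input variable.
SYSTEM_TEMPLATE = "You are a helpful assistant. Answer in {language}."


def render_prompt(inputs: Dict[str, Any]) -> List[Message]:
    """Fill the system template, then splice in the caller's messages."""
    system_text = SYSTEM_TEMPLATE.format(language=inputs["language"])
    messages = list(inputs["messages"])  # plays the role of a messages placeholder
    return [("system", system_text)] + messages


rendered = render_prompt(
    {"messages": [("human", "Hi, my name is Sushant")], "language": "Nepali"}
)
for role, text in rendered:
    print(f"{role}: {text}")
```

The result is an ordinary list of messages, so the model receives the system instruction first and the conversation after it.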

In [21]:
from langchain_core.prompts import ChatPromptTemplate,MessagesPlaceholder
prompt=ChatPromptTemplate.from_messages(
    [
        ("system","You are a helpful assistant. Answer all questions to the best of your ability."),
        MessagesPlaceholder(variable_name="messages")
    ]
)

chain=prompt|model

Whenever we pass a human message to this chain, it must be given as a key-value pair whose key is `messages`, matching the `variable_name` of the `MessagesPlaceholder`.

In [22]:
chain.invoke({"messages":[HumanMessage(content="Hi My name is Sushant")]})

AIMessage(content='Hello Sushant! Nice to meet you. How can I help you today?', additional_kwargs={'reasoning_content': 'The user says "Hi My name is Sushant". They introduced themselves. We should respond politely, maybe ask how to help.'}, response_metadata={'token_usage': {'completion_tokens': 54, 'prompt_tokens': 99, 'total_tokens': 153, 'completion_time': 0.05627913, 'prompt_time': 0.006604082, 'queue_time': 0.091395837, 'total_time': 0.062883212, 'completion_tokens_details': {'reasoning_tokens': 28}}, 'model_name': 'openai/gpt-oss-20b', 'system_fingerprint': 'fp_e99e93f2ac', 'service_tier': 'on_demand', 'finish_reason': 'stop', 'logprobs': None, 'model_provider': 'groq'}, id='lc_run--148ebbea-6833-4910-8c7f-1a9af7bf57b7-0', usage_metadata={'input_tokens': 99, 'output_tokens': 54, 'total_tokens': 153, 'output_token_details': {'reasoning': 28}})

In [23]:
with_message_history=RunnableWithMessageHistory(chain,get_session_history)

In [24]:
config = {"configurable": {"session_id": "chat_3"}}
response=with_message_history.invoke(
    [HumanMessage(content="Hi My name is Sushant")],
    config=config,
)

response

AIMessage(content='Hello Sushant! 👋 How can I help you today?', additional_kwargs={'reasoning_content': 'User says "Hi My name is Sushant". Probably greeting. We should respond politely, maybe ask how can help.'}, response_metadata={'token_usage': {'completion_tokens': 49, 'prompt_tokens': 99, 'total_tokens': 148, 'completion_time': 0.050011184, 'prompt_time': 0.010594667, 'queue_time': 0.093868523, 'total_time': 0.060605851, 'completion_tokens_details': {'reasoning_tokens': 26}}, 'model_name': 'openai/gpt-oss-20b', 'system_fingerprint': 'fp_e99e93f2ac', 'service_tier': 'on_demand', 'finish_reason': 'stop', 'logprobs': None, 'model_provider': 'groq'}, id='lc_run--c1cc8efd-264e-40ee-9349-92e77662ef45-0', usage_metadata={'input_tokens': 99, 'output_tokens': 49, 'total_tokens': 148, 'output_token_details': {'reasoning': 26}})

In [25]:
response = with_message_history.invoke(
    [HumanMessage(content="What's my name?")],
    config=config,
)

response.content

'Your name is Sushant.'

In [26]:
## Add more complexity

prompt = ChatPromptTemplate.from_messages(
    [
        (
            "system",
            "You are a helpful assistant. Answer all questions to the best of your ability in {language}.",
        ),
        MessagesPlaceholder(variable_name="messages"),
    ]
)

chain = prompt | model

In [27]:
response=chain.invoke({"messages":[HumanMessage(content="Hi My name is Sushant")],"language":"Nepali"})
response.content

'नमस्कार सुशान्त! तपाईंलाई भेटेर खुशी लाग्यो। म यहाँ तपाईंको सहायकको रूपमा उपलब्ध छु। तपाईंलाई के मद्दत चाहिन्छ?'

Let's now wrap this more complicated chain in a Message History class. This time, because there are multiple keys in the input, we need to specify the correct key to use to save the chat history.
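
Why the key matters can be seen in a short plain-Python sketch. The names `run_turn`, `fake_chain`, and `history_store` are hypothetical and are not LangChain APIs; the sketch only assumes that the wrapper needs to know which input key holds the messages so that it can merge history into that key and leave the other keys alone.

```python
from typing import Any, Callable, Dict, List, Tuple

Message = Tuple[str, str]  # (role, text)

# Hypothetical per-session store, keyed by session id.
history_store: Dict[str, List[Message]] = {}


def fake_chain(inputs: Dict[str, Any]) -> Message:
    """Stand-in for prompt | model: echoes the language and message count."""
    return ("ai", f"[{inputs['language']}] saw {len(inputs['messages'])} message(s)")


def run_turn(
    chain: Callable[[Dict[str, Any]], Message],
    inputs: Dict[str, Any],
    session_id: str,
    input_messages_key: str,
) -> Message:
    history = history_store.setdefault(session_id, [])
    new_messages = list(inputs[input_messages_key])
    # Only the value under input_messages_key is merged with the history;
    # every other key (here "language") is passed through unchanged.
    merged = {**inputs, input_messages_key: history + new_messages}
    reply = chain(merged)
    history.extend(new_messages + [reply])
    return reply


r1 = run_turn(
    fake_chain,
    {"messages": [("human", "Hi, I am Sushant")], "language": "Nepali"},
    "chat_4",
    "messages",
)
r2 = run_turn(
    fake_chain,
    {"messages": [("human", "whats my name?")], "language": "Nepali"},
    "chat_4",
    "messages",
)
print(r1[1])
print(r2[1])
```

Without a messages key, a wrapper like this would have no way to tell the conversation apart from the `language` setting.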

In [28]:
with_message_history=RunnableWithMessageHistory(
    chain,
    get_session_history,
    input_messages_key="messages"
)

In [29]:
config = {"configurable": {"session_id": "chat_4"}}
response=with_message_history.invoke(
    {'messages': [HumanMessage(content="Hi,I am Sushant Twayana")],"language":"Nepali"},
    config=config
)
response.content

'नमस्ते सुशान्त त्वायना जी! तपाईंलाई भेटेर खुशी लाग्यो। म यहाँ तपाईंलाई मद्दत गर्न तयार छु। आज तपाईंलाई के बारेमा सहयोग चाहिएको छ?'

In [30]:
response = with_message_history.invoke(
    {"messages": [HumanMessage(content="whats my name?")], "language": "Nepali"},
    config=config,
)

In [31]:
response.content

'तपाईंको नाम सुशान्त त्वायना हो।'

### Managing the Conversation History
One important concept to understand when building chatbots is how to manage conversation history. If left unmanaged, the list of messages will grow unbounded and potentially overflow the context window of the LLM. Therefore, it is important to add a step that limits the size of the messages you are passing in.
The `trim_messages` helper reduces how many messages we send to the model. The trimmer lets us specify how many tokens to keep, along with other parameters such as whether to always keep the system message and whether to allow partial messages.
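
The trimming idea itself can be sketched in plain Python. This is a simplified illustration and not how the library implements it: `trim_last` and `count_tokens` are hypothetical names, and the token counter simply counts whitespace-separated words rather than real model tokens. The sketch always keeps the system message, keeps the newest whole messages that fit in the budget, and makes sure the kept window starts on a human message.

```python
from typing import Callable, List, Tuple

Message = Tuple[str, str]  # (role, text)


def count_tokens(message: Message) -> int:
    """Crude stand-in for a tokenizer: one token per whitespace-separated word."""
    return len(message[1].split())


def trim_last(
    messages: List[Message],
    max_tokens: int,
    token_counter: Callable[[Message], int] = count_tokens,
) -> List[Message]:
    """Keep the system message plus the newest messages that fit the budget."""
    system = [m for m in messages[:1] if m[0] == "system"]
    rest = messages[len(system):]
    budget = max_tokens - sum(token_counter(m) for m in system)

    kept: List[Message] = []
    for message in reversed(rest):         # walk from newest to oldest
        cost = token_counter(message)
        if cost > budget:
            break                          # whole messages only, no partial cuts
        kept.insert(0, message)
        budget -= cost

    while kept and kept[0][0] != "human":  # make the window start on a human turn
        kept.pop(0)
    return system + kept


history = [
    ("system", "you're a good assistant"),
    ("human", "hi! I'm bob"),
    ("ai", "hi!"),
    ("human", "I like vanilla ice cream"),
    ("ai", "nice"),
    ("human", "whats 2 + 2"),
    ("ai", "4"),
]
trimmed = trim_last(history, max_tokens=14)
for role, text in trimmed:
    print(f"{role}: {text}")
```

With a budget of 14 word-tokens, only the system message and the final math exchange survive, so a question about ice cream could no longer be answered from the trimmed history.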

In [32]:
from langchain_core.messages import SystemMessage,trim_messages
trimmer=trim_messages(
    max_tokens=45,
    strategy="last",
    token_counter=model,
    include_system=True,
    allow_partial=False,
    start_on="human"
)
messages = [
    SystemMessage(content="you're a good assistant"),
    HumanMessage(content="hi! I'm bob"),
    AIMessage(content="hi!"),
    HumanMessage(content="I like vanilla ice cream"),
    AIMessage(content="nice"),
    HumanMessage(content="whats 2 + 2"),
    AIMessage(content="4"),
    HumanMessage(content="thanks"),
    AIMessage(content="no problem!"),
    HumanMessage(content="having fun?"),
    AIMessage(content="yes!"),
]
trimmer.invoke(messages)


[SystemMessage(content="you're a good assistant", additional_kwargs={}, response_metadata={}),
 HumanMessage(content='I like vanilla ice cream', additional_kwargs={}, response_metadata={}),
 AIMessage(content='nice', additional_kwargs={}, response_metadata={}),
 HumanMessage(content='whats 2 + 2', additional_kwargs={}, response_metadata={}),
 AIMessage(content='4', additional_kwargs={}, response_metadata={}),
 HumanMessage(content='thanks', additional_kwargs={}, response_metadata={}),
 AIMessage(content='no problem!', additional_kwargs={}, response_metadata={}),
 HumanMessage(content='having fun?', additional_kwargs={}, response_metadata={}),
 AIMessage(content='yes!', additional_kwargs={}, response_metadata={})]

In [36]:
from operator import itemgetter

from langchain_core.runnables import RunnablePassthrough

chain=(
    RunnablePassthrough.assign(messages=itemgetter("messages")|trimmer)
    | prompt
    | model.bind(tools=[])
    
)

response=chain.invoke(
    {
    "messages":messages + [HumanMessage(content="What ice cream do i like")],
    "language":"English"
    }
)
response.content

'I’m not sure—what’s your favorite flavor or brand?'

In [37]:
response = chain.invoke(
    {
        "messages": messages + [HumanMessage(content="what math problem did i ask")],
        "language": "English",
    }
)
response.content

'You asked for the result of the math problem: **2\u202f+\u202f2**.'

In [38]:
## Let's wrap this in the message history
with_message_history = RunnableWithMessageHistory(
    chain,
    get_session_history,
    input_messages_key="messages",
)
config={"configurable":{"session_id":"chat_5"}}

In [39]:
response = with_message_history.invoke(
    {
        "messages": messages + [HumanMessage(content="whats my name?")],
        "language": "English",
    },
    config=config,
)

response.content

'I’m not sure—could you remind me what you’d like me to call you?'

In [44]:
response = with_message_history.invoke(
    {
        "messages": [HumanMessage(content="what math problem did i ask?")],
        "language": "English",
    },
    config=config,
)

response.content

"As a large language model, I have no memory of past conversations. If you'd like to ask me a math problem, I'm happy to help! 😊  Just let me know what it is. \n\n"