In this video we'll go over an example of how to design and implement an LLM-powered chatbot. This chatbot
will be able to have a conversation and remember previous interactions.
Note that the chatbot we build will only use the language model to have a conversation. There are several
other related concepts that you may be looking for:

• Conversational RAG: Enable a chatbot experience over an external source of data

• Agents: Build a chatbot that can take actions

This video tutorial covers the basics, which will be helpful for those two more advanced topics.

In [5]:
import os
from dotenv import load_dotenv

load_dotenv()

groq_api_key = os.getenv("GROQ_API_KEY")
groq_api_key

'gsk_**********'
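Echoing `groq_api_key` as the last expression of a cell writes the full key into the saved notebook output. A minimal sketch of a safer check follows, using only the standard library. The helper names `require_env` and `mask_secret` are hypothetical and are not part of python-dotenv or LangChain.

```python
import os


def require_env(name: str) -> str:
    """Return the value of an environment variable or fail with a clear error."""
    value = os.environ.get(name)
    if not value:
        raise RuntimeError(f"Environment variable {name} is not set")
    return value


def mask_secret(secret: str, visible: int = 4) -> str:
    """Keep only the first few characters so the full key never reaches the output."""
    return secret[:visible] + "*" * max(len(secret) - visible, 0)


# In the notebook this would be: key = require_env("GROQ_API_KEY")
# A made-up placeholder is used here so the sketch runs without a real key.
demo_key = "gsk_example_not_a_real_key"
print(mask_secret(demo_key))
```

Failing early when the variable is missing also gives a clearer error than a later authentication failure from the API client.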

In [6]:
from langchain_groq import ChatGroq

model = ChatGroq(model="gemma2-9b-it",groq_api_key=groq_api_key)
model

ChatGroq(client=<groq.resources.chat.completions.Completions object at 0x000002991EFE74F0>, async_client=<groq.resources.chat.completions.AsyncCompletions object at 0x000002991F01D780>, model_name='gemma2-9b-it', model_kwargs={}, groq_api_key=SecretStr('**********'))

In [7]:
from langchain_core.messages import HumanMessage
model.invoke(
    [HumanMessage(content="Hi , My name is Ishank and I am a AI engineer")]
)

AIMessage(content="Hi Ishank,\n\nIt's nice to meet you! I'm Gemma, an AI assistant.  \n\nBeing an AI engineer is fascinating work. What kind of projects are you working on these days? \n\n", additional_kwargs={}, response_metadata={'token_usage': {'completion_tokens': 48, 'prompt_tokens': 22, 'total_tokens': 70, 'completion_time': 0.087272727, 'prompt_time': 0.00133974, 'queue_time': 0.25164254, 'total_time': 0.088612467}, 'model_name': 'gemma2-9b-it', 'system_fingerprint': 'fp_10c08bf97d', 'service_tier': 'on_demand', 'finish_reason': 'stop', 'logprobs': None}, id='run--f5141aa5-8ace-4322-ba85-f1c5d779e8db-0', usage_metadata={'input_tokens': 22, 'output_tokens': 48, 'total_tokens': 70})

In [8]:
from langchain_core.messages import AIMessage
model.invoke(
    [
        HumanMessage(content="Hi , My name is Ishank and I am a AI engineer"),
        AIMessage(content="Hi Ishank, it's nice to meet you!\n\nThat's awesome, being an AI engineer is a really interesting field. \n\nWhat kind of AI work do you do?  Are you working on any cool projects you can tell me about?"),
        HumanMessage(content="Hey , What's my name and what i do?")
    ],
)

AIMessage(content="According to our conversation, your name is Ishank and you are an AI engineer.  😊  \n\nIs there anything else you'd like to know or talk about?\n", additional_kwargs={}, response_metadata={'token_usage': {'completion_tokens': 38, 'prompt_tokens': 97, 'total_tokens': 135, 'completion_time': 0.069090909, 'prompt_time': 0.002688069, 'queue_time': 0.25241205899999997, 'total_time': 0.071778978}, 'model_name': 'gemma2-9b-it', 'system_fingerprint': 'fp_10c08bf97d', 'service_tier': 'on_demand', 'finish_reason': 'stop', 'logprobs': None}, id='run--0d669915-6b44-41d0-9ba6-d69b90d3cbc1-0', usage_metadata={'input_tokens': 97, 'output_tokens': 38, 'total_tokens': 135})

### Message History
We can use a Message History class to wrap our model and make it stateful. This will keep track of 
inputs and outputs of the model, and store them in some datastore. Future interactions will then
load these messages and pass them into the chain as part of the input. Let's see how to use this!

In [9]:
from langchain_community.chat_message_histories import ChatMessageHistory
from langchain_core.chat_history import BaseChatMessageHistory
from langchain_core.runnables.history import RunnableWithMessageHistory

store={}

def get_session_history(session_id:str) -> BaseChatMessageHistory:
    if session_id not in store:
        store[session_id] = ChatMessageHistory()
    return store[session_id]

with_message_history = RunnableWithMessageHistory(model,get_session_history)
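The load, call, and save cycle described above can be sketched in plain Python. This is an illustration of the idea only, not LangChain's implementation: `invoke_with_history`, `get_history`, and `counting_model` are hypothetical names, and messages are simplified to `(role, content)` tuples.

```python
from typing import Callable, Dict, List, Tuple

Message = Tuple[str, str]  # (role, content), a stand-in for LangChain message objects

store: Dict[str, List[Message]] = {}


def get_history(session_id: str) -> List[Message]:
    """Return the message list for a session, creating it on first use."""
    return store.setdefault(session_id, [])


def invoke_with_history(
    model: Callable[[List[Message]], str], text: str, session_id: str
) -> str:
    """Load the stored messages, call the model, then save the new exchange."""
    history = get_history(session_id)
    reply = model(history + [("human", text)])
    history.extend([("human", text), ("ai", reply)])
    return reply


def counting_model(messages: List[Message]) -> str:
    """Hypothetical stand-in for an LLM: reports how many messages it received."""
    return f"I can see {len(messages)} message(s)"


print(invoke_with_history(counting_model, "Hi, my name is Ishank", "chat1"))  # 1 message
print(invoke_with_history(counting_model, "What's my name?", "chat1"))        # 3 messages
print(invoke_with_history(counting_model, "What's my name?", "chat2"))        # 1 message
```

Because each session id maps to its own list, a new id starts with an empty history, while a repeated id sees everything said earlier in that session.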

In [10]:
config = {"configurable":{"session_id":"chat1"}}


In [11]:
with_message_history.invoke(
    [HumanMessage(content="Hi , My name is Ishank Sharma and I am an AI Engineer")],
    config=config
)

AIMessage(content="Hello Ishank Sharma, it's nice to meet you!\n\nThat's a fascinating field. What kind of AI work are you involved in?  \n\nI'm eager to learn more about your experience and expertise.\n", additional_kwargs={}, response_metadata={'token_usage': {'completion_tokens': 49, 'prompt_tokens': 23, 'total_tokens': 72, 'completion_time': 0.089090909, 'prompt_time': 0.001327189, 'queue_time': 0.249129616, 'total_time': 0.090418098}, 'model_name': 'gemma2-9b-it', 'system_fingerprint': 'fp_10c08bf97d', 'service_tier': 'on_demand', 'finish_reason': 'stop', 'logprobs': None}, id='run--fdcdcdf8-14a4-4d33-9f5d-a7bc8ee74ac2-0', usage_metadata={'input_tokens': 23, 'output_tokens': 49, 'total_tokens': 72})

In [12]:
with_message_history.invoke(
    [HumanMessage(content="Whats my name")],
    config=config
)


AIMessage(content='Your name is Ishank Sharma.  \n\nI remember it from our introduction! 😊  Is there anything else I can help you with?\n', additional_kwargs={}, response_metadata={'token_usage': {'completion_tokens': 31, 'prompt_tokens': 83, 'total_tokens': 114, 'completion_time': 0.056363636, 'prompt_time': 0.002504809, 'queue_time': 0.252717904, 'total_time': 0.058868445}, 'model_name': 'gemma2-9b-it', 'system_fingerprint': 'fp_10c08bf97d', 'service_tier': 'on_demand', 'finish_reason': 'stop', 'logprobs': None}, id='run--78638679-025f-4849-8688-ee3f754945cf-0', usage_metadata={'input_tokens': 83, 'output_tokens': 31, 'total_tokens': 114})

In [13]:
config1={"configurable":{"session_id":"chat2"}}
response = with_message_history.invoke(
    [HumanMessage(content="Whats my name")],
    config=config1
)

response.content

"As an AI, I have no memory of past conversations and do not know your name. If you'd like to tell me your name, I'd be happy to use it!\n"

### Prompt Templates
Prompt Templates help to turn raw user information into a format that the LLM can work with. In this case, the
raw user input is just a message, which we are passing to the LLM. Let's now make that a bit more complicated.
First, let's add in a system message with some custom instructions (but still taking messages as input). Next,
we'll add in more input besides just the messages.

In [14]:
from langchain_core.prompts import ChatPromptTemplate, MessagesPlaceholder

prompt = ChatPromptTemplate.from_messages(
    [
        ("system", "You are a helpful assistant. Answer all the questions to the best of your ability"),
        MessagesPlaceholder(variable_name="message"),
    ]
)

chain = prompt|model

In [15]:
chain.invoke({"message":[HumanMessage(content="Hi, My name is Ishank")]})

AIMessage(content="Hi Ishank, it's nice to meet you!  \n\nWhat can I do for you today? 😊  I'm ready for any questions you might have! \n\n", additional_kwargs={}, response_metadata={'token_usage': {'completion_tokens': 40, 'prompt_tokens': 32, 'total_tokens': 72, 'completion_time': 0.072727273, 'prompt_time': 0.00147693, 'queue_time': 0.262127588, 'total_time': 0.074204203}, 'model_name': 'gemma2-9b-it', 'system_fingerprint': 'fp_10c08bf97d', 'service_tier': 'on_demand', 'finish_reason': 'stop', 'logprobs': None}, id='run--fa2ef801-b670-499d-b74f-7359e32e5c3d-0', usage_metadata={'input_tokens': 32, 'output_tokens': 40, 'total_tokens': 72})

In [16]:
with_message_history=RunnableWithMessageHistory(chain,get_session_history)

In [17]:
config = {"configurable":{"session_id":"chat3"}}
response = with_message_history.invoke(
    [HumanMessage(content="Hi, My name is Ishank")],
    config=config
)
response

AIMessage(content="Hello Ishank! It's nice to meet you. \n\nHow can I help you today?  😄  \n\n", additional_kwargs={}, response_metadata={'token_usage': {'completion_tokens': 28, 'prompt_tokens': 32, 'total_tokens': 60, 'completion_time': 0.050909091, 'prompt_time': 0.001487929, 'queue_time': 0.252446624, 'total_time': 0.05239702}, 'model_name': 'gemma2-9b-it', 'system_fingerprint': 'fp_10c08bf97d', 'service_tier': 'on_demand', 'finish_reason': 'stop', 'logprobs': None}, id='run--e081956c-2d6e-4298-9d55-acfc6e97f041-0', usage_metadata={'input_tokens': 32, 'output_tokens': 28, 'total_tokens': 60})

In [18]:
# Add more complexity

prompt = ChatPromptTemplate.from_messages(
    [
        (
            "system",
            "You are a helpful assistant. Answer all questions to the best of your ability in {language}"
        ),
        MessagesPlaceholder(variable_name="messages"),
    ]
)

chain = prompt|model

In [19]:
response = chain.invoke({"messages":[HumanMessage(content="Hi my name is Ishank")],"language":"Hindi"})
response.content

'नमस्ते इशंक!  😊  मुझे तुम्हारी मदद करने में खुशी हो रही है।  किसमें तुम्हें मदद चाहिए? \n'
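To make the role of the template concrete, here is a rough plain-Python sketch of what filling a chat prompt involves: substituting `{language}` into the system text and splicing the input messages in at the placeholder. The names `Placeholder` and `render` are hypothetical, and the real `ChatPromptTemplate` does considerably more (validation, message objects, partial variables).

```python
from typing import Any, Dict, List, Tuple, Union

Message = Tuple[str, str]  # (role, content), a stand-in for LangChain message objects


class Placeholder:
    """Marks where a list of messages from the input should be spliced in."""

    def __init__(self, variable_name: str) -> None:
        self.variable_name = variable_name


def render(template: List[Union[Message, Placeholder]], inputs: Dict[str, Any]) -> List[Message]:
    """Fill {variables} in templated messages and expand placeholders in order."""
    # Only string inputs are used for {variable} substitution in this sketch.
    text_values = {key: value for key, value in inputs.items() if isinstance(value, str)}
    rendered: List[Message] = []
    for part in template:
        if isinstance(part, Placeholder):
            rendered.extend(inputs[part.variable_name])
        else:
            role, text = part
            rendered.append((role, text.format(**text_values)))
    return rendered


template = [
    ("system", "Answer all questions to the best of your ability in {language}."),
    Placeholder("messages"),
]
result = render(
    template,
    {"messages": [("human", "Hi, my name is Ishank")], "language": "Hindi"},
)
for role, text in result:
    print(f"{role}: {text}")
```

The output is an ordinary list of messages, which is why a prompt can be piped straight into a chat model.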

Let's now wrap this more complicated chain in a Message History class. This time, because there are multiple
keys in the input, we need to specify the correct key to use to save the chat history.

In [20]:
with_message_history = RunnableWithMessageHistory(
    chain,
    get_session_history,
    input_messages_key="messages"
)

In [21]:
config = {"configurable":{"session_id":"chat3"}}
response = with_message_history.invoke(
    {"messages":[HumanMessage(content="Hi, I am Ishank")],"language":"Hindi"},
    config=config
)
response



AIMessage(content='नमस्ते इशांक! 😊  \n\nमुझे बहुत अच्छा लग रहा है कि आप मुझसे बात कर रहे हैं। आप क्या पूछना चाहते हैं?  \n', additional_kwargs={}, response_metadata={'token_usage': {'completion_tokens': 40, 'prompt_tokens': 74, 'total_tokens': 114, 'completion_time': 0.072727273, 'prompt_time': 0.002573499, 'queue_time': 0.253285131, 'total_time': 0.075300772}, 'model_name': 'gemma2-9b-it', 'system_fingerprint': 'fp_10c08bf97d', 'service_tier': 'on_demand', 'finish_reason': 'stop', 'logprobs': None}, id='run--354e2f1d-f692-464c-a2ed-5249cbd77849-0', usage_metadata={'input_tokens': 74, 'output_tokens': 40, 'total_tokens': 114})
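When the input is a dictionary, the wrapper needs to know which key holds the messages so it can combine stored history with them and leave the other keys alone. A minimal sketch of that step, assuming history is simply prepended to the new messages; `merge_history` is a hypothetical helper, not a LangChain function.

```python
from typing import Any, Dict, List, Tuple

Message = Tuple[str, str]  # (role, content)


def merge_history(
    inputs: Dict[str, Any], history: List[Message], messages_key: str
) -> Dict[str, Any]:
    """Prepend stored history to the messages under `messages_key`; keep other keys as-is."""
    merged = dict(inputs)  # shallow copy so the caller's dict is not modified
    merged[messages_key] = list(history) + list(inputs[messages_key])
    return merged


history = [("human", "Hi, My name is Ishank"), ("ai", "Hello Ishank!")]
inputs = {"messages": [("human", "Hi, I am Ishank")], "language": "Hindi"}

merged = merge_history(inputs, history, messages_key="messages")
print(len(merged["messages"]))  # 3: two stored messages plus the new one
print(merged["language"])       # Hindi
```

Keys other than the messages key, such as `language` here, pass through untouched and remain available to the prompt.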

### Managing the Conversation History

One important concept to understand when building chatbots is how to manage conversation history. If left
unmanaged, the list of messages will grow unbounded and potentially overflow the context window of the
LLM. Therefore, it is important to add a step that limits the size of the messages you are passing in.

We can use the 'trim_messages' helper to reduce how many messages we're sending to the model. The trimmer allows
us to specify how many tokens we want to keep, along with other parameters such as whether to
always keep the system message and whether to allow partial messages.

In [28]:
from langchain_core.messages import SystemMessage, HumanMessage, AIMessage, trim_messages

# Dummy token counter for demonstration: counts whitespace-separated words, not real model tokens
def dummy_token_counter(messages):
    return sum(len(m.content.split()) for m in messages)

trimmer = trim_messages(
    max_tokens=10,
    strategy="last",
    token_counter=dummy_token_counter,  # Should be a function
    include_system=True,
    allow_partial=False,
    start_on="human"
)

messages = [
    SystemMessage(content="you're a good assistant"),
    HumanMessage(content="hi! I'm bob"),
    AIMessage(content="hii !"),
    HumanMessage(content="I like vanilla ice cream"),
    AIMessage(content="nice"),
    HumanMessage(content="whats 2 + 2"),
    AIMessage(content="4"),
    HumanMessage(content="thanks"),
    AIMessage(content="no problem!"),
    HumanMessage(content="having fun?"),
    AIMessage(content="yes!")
]

trimmed_messages = trimmer.invoke(messages)
for msg in trimmed_messages:
    print(f"{msg.__class__.__name__}: {msg.content}")


SystemMessage: you're a good assistant
HumanMessage: thanks
AIMessage: no problem!
HumanMessage: having fun?
AIMessage: yes!
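The result above can be reproduced with a small plain-Python sketch of the "keep the last messages" strategy. It assumes the same word-count budget and mirrors only the options used in the cell (`include_system=True`, `allow_partial=False`, `start_on="human"`). `trim_last` and `count_words` are hypothetical names, and this is not LangChain's implementation.

```python
from typing import List, Tuple

Message = Tuple[str, str]  # (role, content)


def count_words(messages: List[Message]) -> int:
    """Toy token counter: whitespace-separated words, as in the cell above."""
    return sum(len(content.split()) for _, content in messages)


def trim_last(messages: List[Message], max_tokens: int) -> List[Message]:
    """Keep the system message plus the most recent messages that fit the budget."""
    system = [m for m in messages[:1] if m[0] == "system"]
    rest = messages[len(system):]
    budget = max_tokens - count_words(system)

    kept: List[Message] = []
    for message in reversed(rest):
        cost = count_words([message])
        if cost > budget:
            break  # whole messages only, mirroring allow_partial=False
        kept.insert(0, message)
        budget -= cost

    # Mirror start_on="human": drop leading messages until a human message.
    while kept and kept[0][0] != "human":
        kept.pop(0)
    return system + kept


history = [
    ("system", "you're a good assistant"),
    ("human", "hi! I'm bob"),
    ("ai", "hii !"),
    ("human", "I like vanilla ice cream"),
    ("ai", "nice"),
    ("human", "whats 2 + 2"),
    ("ai", "4"),
    ("human", "thanks"),
    ("ai", "no problem!"),
    ("human", "having fun?"),
    ("ai", "yes!"),
]

trimmed = trim_last(history, max_tokens=10)
for role, content in trimmed:
    print(f"{role}: {content}")
```

With a budget of 10 words, the system message uses 4 and the last four messages use the remaining 6, which matches the five messages printed by the trimmer.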


In [29]:
from operator import itemgetter

from langchain_core.runnables import RunnablePassthrough

chain=(
    RunnablePassthrough.assign(messages=itemgetter("messages")|trimmer)
    | prompt
    | model
    
)

response=chain.invoke(
    {
    "messages":messages + [HumanMessage(content="What ice cream do i like")],
    "language":"English"
    }
)
response.content

"As a helpful assistant, I don't have personal preferences or memories, including your ice cream taste! \n\nTo figure out what ice cream you like, think about:\n\n* **Your favorite flavors:** Do you prefer chocolate, vanilla, fruity flavors, or something more unique?\n* **Your preferred texture:** Do you like your ice cream creamy, icy, or with chunks or swirls?\n* **Any toppings you enjoy:**  Chocolate sauce, sprinkles, nuts, whipped cream? \n\nLet me know if you want to brainstorm some ice cream ideas based on your answers! 🍦 😊  \n\n"

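A likely reason the model cannot recall the ice cream flavor is that the trimmer removed that message before the prompt was built. The arithmetic can be checked with the same word-count rule used by the dummy token counter; this is a sketch of the budget only, assuming the trimmer keeps the system message and fills the remaining budget from the most recent message backwards.

```python
def word_count(text: str) -> int:
    """Same counting rule as the dummy token counter: whitespace-separated words."""
    return len(text.split())


max_tokens = 10
system_cost = word_count("you're a good assistant")      # 4 words
question_cost = word_count("What ice cream do i like")   # 6 words
remaining = max_tokens - system_cost - question_cost

print(system_cost, question_cost, remaining)  # 4 6 0

# The earlier statement would need 5 more words of budget, so it cannot be kept.
needed = word_count("I like vanilla ice cream")
print(needed > remaining)  # True
```

Under these assumptions the system message and the new question use the whole budget, so none of the earlier conversation reaches the model.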
In [30]:
# Let's wrap this in the message history
with_message_history = RunnableWithMessageHistory(
    chain,
    get_session_history,
    input_messages_key="messages",
)
config={"configurable":{"session_id":"chat5"}}

In [31]:
response = with_message_history.invoke(
    {
        "messages": messages + [HumanMessage(content="whats my name?")],
        "language": "English",
    },
    config=config,
)

response.content

"As an AI, I don't have access to any personal information about you, including your name.  \n\nIs there anything else I can help you with? 😊\n"

In [32]:
response = with_message_history.invoke(
    {
        "messages": [HumanMessage(content="what math problem did i ask?")],
        "language": "English",
    },
    config=config,
)

response.content

"As a helpful assistant, I have no memory of past conversations. If you'd like to ask me a math problem, I'm happy to help! 😊  Just let me know what it is.  \n\n"