# Build a Chatbot Powered by an LLM
To build a chatbot powered by an LLM, we need to keep track of the conversation history, because the model itself is stateless: it does not remember earlier turns on its own.


In [2]:
import getpass
import os

os.environ["LANGCHAIN_TRACING_V2"] = "true"
os.environ["LANGCHAIN_API_KEY"] = getpass.getpass()



In [3]:
# pip install langchain
# pip install -qU langchain-mistralai

In [4]:
import getpass
import os

os.environ["MISTRAL_API_KEY"] = getpass.getpass()

from langchain_mistralai import ChatMistralAI

model = ChatMistralAI(model="mistral-large-latest")



In [5]:
from langchain_core.messages import HumanMessage

model.invoke([HumanMessage(content="Hi! I'm Bob")])

AIMessage(content="Hello Bob! It's nice to meet you. How can I assist you today?", response_metadata={'token_usage': {'prompt_tokens': 9, 'total_tokens': 27, 'completion_tokens': 18}, 'model': 'mistral-large-latest', 'finish_reason': 'stop'}, id='run-dc3ea5d4-773a-4576-a9ac-9477329d2e48-0', usage_metadata={'input_tokens': 9, 'output_tokens': 18, 'total_tokens': 27})

In [6]:
model.invoke([HumanMessage(content="What's my name?")])

AIMessage(content="I'm an assistant and I don't have the ability to know your name unless you tell me. Could you please tell me your name so I can assist you better?", response_metadata={'token_usage': {'prompt_tokens': 9, 'total_tokens': 45, 'completion_tokens': 36}, 'model': 'mistral-large-latest', 'finish_reason': 'stop'}, id='run-496a2895-2b22-4a16-b0a2-9633dd08b9c3-0', usage_metadata={'input_tokens': 9, 'output_tokens': 36, 'total_tokens': 45})

The model has no state: it does not remember earlier turns of the conversation.  
Therefore, we need to pass the complete conversation history on every call.

In [8]:
from langchain_core.messages import AIMessage

model.invoke(
    [
        HumanMessage(content="Hi! I'm John"),
        AIMessage(content="Hello John! How can I assist you today?"),
        HumanMessage(content="What's my name?"),
    ]
)

AIMessage(content='Based on your introduction, your name is John.', response_metadata={'token_usage': {'prompt_tokens': 28, 'total_tokens': 38, 'completion_tokens': 10}, 'model': 'mistral-large-latest', 'finish_reason': 'stop'}, id='run-970cc67d-ccea-466e-8b44-15a359938329-0', usage_metadata={'input_tokens': 28, 'output_tokens': 10, 'total_tokens': 38})
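
The same idea can be sketched without any API call. In the minimal example below, `fake_model` is a hypothetical stand-in for a real chat model (it is not part of LangChain) that can only answer from the messages it is handed. The point it illustrates is that the caller owns the history and resends all of it on every turn.

```python
# Hypothetical stand-in for a chat model: it sees only the messages passed in.
def fake_model(messages):
    name = None
    for role, text in messages:
        if role == "human" and text.startswith("Hi! I'm "):
            name = text[len("Hi! I'm "):]
    _, last_text = messages[-1]
    if last_text == "What's my name?":
        return f"Your name is {name}." if name else "I don't know your name."
    return "Hello!"


turns = []  # the caller, not the model, keeps the conversation


def send(user_text):
    turns.append(("human", user_text))
    reply = fake_model(turns)  # resend the full history on every turn
    turns.append(("ai", reply))
    return reply


first_reply = send("Hi! I'm John")
with_history = send("What's my name?")
# The same question sent alone, with no earlier turns attached.
without_history = fake_model([("human", "What's my name?")])

print(with_history)
print(without_history)
```

Only the call that carries the earlier turns can answer the question, which is the behaviour the real model showed above.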

## Message History
We need an object that tracks the entire history of inputs and outputs so that it can be passed to the model on every call.  
Fortunately, LangChain has a class for this.  
We have to install langchain-community to access it.

In [None]:
# ! pip install langchain_community

After that, we can import the relevant classes and set up our chain, which wraps the model and adds in this message history. A key part here is the function we pass in as get_session_history. This function is expected to take a session_id and return a message history object. The session_id distinguishes between separate conversations and should be passed in as part of the config when calling the new chain (we'll show how to do that).

In [9]:
from langchain_community.chat_message_histories import ChatMessageHistory
from langchain_core.chat_history import BaseChatMessageHistory
from langchain_core.runnables.history import RunnableWithMessageHistory

store = {}


def get_session_history(session_id: str) -> BaseChatMessageHistory:
    if session_id not in store:
        store[session_id] = ChatMessageHistory()
    return store[session_id]


with_message_history = RunnableWithMessageHistory(model, get_session_history)

We now need to create a config that we pass into the runnable every time. This config contains information that is not part of the input directly, but is still useful. In this case, we want to include a session_id. This should look like:

In [10]:
config = {"configurable": {"session_id": "001"}}

In [11]:
response = with_message_history.invoke(
    [HumanMessage(content="Hi! I'm Jake")],
    config=config,
)

response.content

"Hello, Jake! It's nice to meet you. How can I assist you today?"

In [12]:
response = with_message_history.invoke(
    [HumanMessage(content="What's my name?")],
    config=config,
)

response.content

'Based on our conversation so far, your name is Jake.'

If we change the session id, the chatbot will have no history, because history is linked to the session id.  
The code below shows what happens when we change the session id:

In [13]:
config = {"configurable": {"session_id": "002"}}

response = with_message_history.invoke(
    [HumanMessage(content="What's my name?")],
    config=config,
)

response.content

"I'm an assistant and I don't have personal information about individuals unless it has been shared with me in the course of our conversation. I'm here to help answer your questions to the best of my ability. If you'd like to tell me your name, I'll be happy to use it during our conversation."

With session ids, the chatbot can support multiple users: each user gets their own session id. We can also support multiple chats per user by giving each chat a different session id.
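
One way to organise this is sketched below in plain Python. The `make_session_id` helper and its `user:conversation` naming scheme are assumptions made for illustration, not a LangChain convention; any string that is unique per conversation would work.

```python
# Plain-Python sketch: one history list per (user, conversation) pair.
histories_by_session = {}


def make_session_id(user_id, conversation_id):
    # Hypothetical naming scheme; it only needs to be unique per conversation.
    return f"{user_id}:{conversation_id}"


def get_history(session_id):
    # Create the history on first use, mirroring get_session_history above.
    return histories_by_session.setdefault(session_id, [])


get_history(make_session_id("alice", "1")).append("Hi! I'm Alice")
get_history(make_session_id("alice", "2")).append("New topic")
get_history(make_session_id("bob", "1")).append("Hi! I'm Bob")

print(sorted(histories_by_session))
```

Each conversation keeps its own list, so Alice's second chat does not see her first one, and Bob sees neither.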

## Prompt Templates
The next step is to add a prompt template. A system message with instructions helps steer the chatbot's responses.
We will use a message placeholder to pass in our messages.

In [14]:
from langchain_core.prompts import ChatPromptTemplate, MessagesPlaceholder

prompt = ChatPromptTemplate.from_messages(
    [
        (
            "system",
            "You are a helpful assistant. Answer all questions to the best of your ability.",
        ),
        MessagesPlaceholder(variable_name="messages"),
    ]
)

chain = prompt | model

Because we used "messages" as the placeholder's variable name, we pass our messages in a dictionary with "messages" as the key.

In [15]:
chain.invoke({"messages": [HumanMessage("Hi, my name is Dalton")]})

AIMessage(content="Hello Dalton! It's nice to meet you. How can I assist you today?", response_metadata={'token_usage': {'prompt_tokens': 28, 'total_tokens': 47, 'completion_tokens': 19}, 'model': 'mistral-large-latest', 'finish_reason': 'stop'}, id='run-a9b163c1-84a3-4d2f-a234-5125794f6b27-0', usage_metadata={'input_tokens': 28, 'output_tokens': 19, 'total_tokens': 47})

Now let's wrap this chain in a message history as before.

In [16]:
with_message_history = RunnableWithMessageHistory(chain, get_session_history)

In [17]:
config = {"configurable": {"session_id": "003"}}

In [18]:
response = with_message_history.invoke(
    [HumanMessage(content="Hi! I'm Elvis")],
    config=config,
)

response.content

"Hello Elvis! I'm here to help you with any questions or information you need to the best of my ability. Just let me know what you need help with, and I'll do my best to provide you with accurate and helpful information."

In [19]:
prompt = ChatPromptTemplate.from_messages(
    [
        (
            "system",
            "You are a helpful assistant. Answer all questions to the best of your ability in {language}.",
        ),
        MessagesPlaceholder(variable_name="messages"),
    ]
)

chain = prompt | model

We have made the prompt more complicated by including a language variable,  
so the output should be in whatever language we specify.
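
Conceptually, the template does two things with the input dictionary: it fills named variables into the system text, and it splices the message list in where the placeholder sits. The plain-Python sketch below illustrates that idea only; `render_prompt` is a hypothetical helper, not LangChain's implementation, and messages are simplified to `(role, text)` tuples.

```python
def render_prompt(system_template, inputs):
    # Every key except "messages" is treated as a template variable.
    variables = {k: v for k, v in inputs.items() if k != "messages"}
    system_text = system_template.format(**variables)
    # The message list is inserted after the system message,
    # where the placeholder would sit.
    return [("system", system_text)] + list(inputs["messages"])


rendered = render_prompt(
    "You are a helpful assistant. Answer in {language}.",
    {"messages": [("human", "hi! I'm bob")], "language": "Hindi"},
)
print(rendered)
```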

In [20]:
response = chain.invoke(
    {"messages": [HumanMessage(content="hi! I'm bob")], "language": "Hindi"}
)

response.content

"Hello Bob! I'm here to help you. However, I must inform you that while I can understand and generate responses in Hindi, our conversation has started in English. If you'd like to continue in Hindi, feel free to ask your question again in Hindi, and I'll do my best to assist you in that language."

Let's now wrap this more complicated chain in a Message History class. This time, because there are multiple keys in the input, we need to specify the correct key to use to save the chat history.

In [21]:
with_message_history = RunnableWithMessageHistory(
    chain,
    get_session_history,
    input_messages_key="messages",
)
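
To see why the key matters, here is a rough plain-Python sketch of what a history wrapper has to do. It is an illustration under simplifying assumptions, not how RunnableWithMessageHistory is implemented: `with_history` and `echo_chain` are hypothetical names, and messages are `(role, text)` tuples.

```python
def with_history(chain_fn, get_history, input_messages_key):
    # Hypothetical wrapper, not LangChain's implementation.
    def invoke(inputs, session_id):
        history = get_history(session_id)
        new_messages = inputs[input_messages_key]
        # Only the messages key is extended with history; other keys pass through.
        merged = dict(inputs)
        merged[input_messages_key] = history + new_messages
        reply = chain_fn(merged)
        # Save both the new input messages and the reply for the next turn.
        history.extend(new_messages)
        history.append(("ai", reply))
        return reply

    return invoke


sessions = {}


def echo_chain(inputs):
    # Hypothetical chain: reports how many messages it saw and the language key.
    return f"{len(inputs['messages'])} messages, language={inputs['language']}"


chat_with_memory = with_history(
    echo_chain, lambda sid: sessions.setdefault(sid, []), "messages"
)
r1 = chat_with_memory({"messages": [("human", "hi")], "language": "Hindi"}, "005")
r2 = chat_with_memory(
    {"messages": [("human", "What is my name?")], "language": "Hindi"}, "005"
)
print(r1)
print(r2)
```

On the second call the chain sees three messages (the first exchange plus the new question), while the `language` key is passed through untouched.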

In [22]:
config = {"configurable": {"session_id": "005"}}

In [23]:
response = with_message_history.invoke(
    {"messages": [HumanMessage(content="hi! I'm Devarsh")], "language": "Hindi"},
    config=config,
)

response.content

'Namaste! Main aapke liye upayukt sahayak hoon. Mujhe yahi ummeed hai ki main aapke sawalon ka sarvottam tarika se jawab de sakta hoon. Kya aap koi sawal poochna chahte hain?\n\n(Hello! I am your helpful assistant. I hope I can answer your questions to the best of my ability. Would you like to ask me a question?)'

In [24]:
response = with_message_history.invoke(
    {"messages": [HumanMessage(content="What is my name?")], "language": "Hindi"},
    config=config,
)

response.content

'Jab main aapko pehli baar se mila tha toh aapne apna naam Devarsh bataya tha, isliye main aapka naam Devarsh hi samajhta hoon.\n\n(When I met you for the first time, you told me your name was Devarsh, so I assume that your name is Devarsh.)'

## Size of Message History
- The history must not exceed the LLM's maximum context size; otherwise it overflows the model's context window.
- Limit the size before the prompt step, just after the previous messages are loaded.
- Use the trim_messages function in LangChain.
- trim_messages can control the maximum number of tokens, among other parameters.

In [32]:
from langchain_core.messages import SystemMessage, trim_messages

trimmer = trim_messages(
    max_tokens=35,
    strategy="last",
    token_counter=model,
    include_system=True,
    allow_partial=False,
    start_on="human",
)

messages = [
    SystemMessage(content="you're a good assistant"),
    HumanMessage(content="hi! I'm bob"),
    AIMessage(content="hi!"),
    HumanMessage(content="I like vanilla ice cream"),
    AIMessage(content="nice"),
    HumanMessage(content="whats 2 + 2"),
    AIMessage(content="4"),
    HumanMessage(content="thanks"),
    AIMessage(content="no problem!"),
    HumanMessage(content="having fun?"),
    AIMessage(content="yes!"),
]

trimmer.invoke(messages)

[SystemMessage(content="you're a good assistant"),
 HumanMessage(content='whats 2 + 2'),
 AIMessage(content='4'),
 HumanMessage(content='thanks'),
 AIMessage(content='no problem!'),
 HumanMessage(content='having fun?'),
 AIMessage(content='yes!')]
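
The "last" strategy can be approximated in plain Python to show the logic. This is a sketch under stated assumptions, not LangChain's implementation: `trim_last` is a hypothetical function, messages are `(role, text)` tuples, and tokens are counted as whitespace-separated words, which real tokenizers do not do.

```python
def count_tokens(text):
    # Assumption: one token per whitespace-separated word.
    return len(text.split())


def trim_last(conversation, max_tokens):
    """Keep the system message plus the newest messages that fit the budget."""
    system = [m for m in conversation[:1] if m[0] == "system"]
    rest = conversation[len(system):]
    budget = max_tokens - sum(count_tokens(text) for _, text in system)

    kept = []
    for role, text in reversed(rest):  # walk from newest to oldest
        cost = count_tokens(text)
        if cost > budget:
            break  # never keep a partial message
        kept.append((role, text))
        budget -= cost
    kept.reverse()

    # Drop leading non-human messages so the window starts on a human turn.
    while kept and kept[0][0] != "human":
        kept.pop(0)
    return system + kept


conversation = [
    ("system", "you're a good assistant"),
    ("human", "hi! I'm bob"),
    ("ai", "hi!"),
    ("human", "I like vanilla ice cream"),
    ("ai", "nice"),
    ("human", "whats 2 + 2"),
    ("ai", "4"),
]
trimmed = trim_last(conversation, max_tokens=10)
print(trimmed)
```

With a budget of 10 words, the system message uses 4, the last two messages use 5, and the leftover "nice" reply is dropped because the window must start on a human message.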

To use it in our chain, we just need to run the trimmer before we pass the messages input to our prompt.

Now if we try asking the model our name, it won't know it since we trimmed that part of the chat history:

In [28]:
from operator import itemgetter

from langchain_core.runnables import RunnablePassthrough

chain = (
    RunnablePassthrough.assign(messages=itemgetter("messages") | trimmer)
    | prompt
    | model
)

response = chain.invoke(
    {
        "messages": messages + [HumanMessage(content="what's my name? also I would like you to book a ticket for me from Mumbai to Torronto. The ticket should be cheapest and also in a non stop flight")],
        "language": "English",
    }
)
response.content

"I'm glad you think I'm a helpful assistant! Unfortunately, I am not able to access personal information such as your name, and I am also unable to perform external tasks such as booking a flight ticket for you. My current capabilities are limited to providing information and answering questions to the best of my ability.\nI can help you find some websites or travel agencies where you can look for cheapest non-stop flight from Mumbai to Toronto and also guide you on how to book it.\nI apologize for any inconvenience."

In [29]:
print(response)

content="I'm glad you think I'm a helpful assistant! Unfortunately, I am not able to access personal information such as your name, and I am also unable to perform external tasks such as booking a flight ticket for you. My current capabilities are limited to providing information and answering questions to the best of my ability.\nI can help you find some websites or travel agencies where you can look for cheapest non-stop flight from Mumbai to Toronto and also guide you on how to book it.\nI apologize for any inconvenience." response_metadata={'token_usage': {'prompt_tokens': 85, 'total_tokens': 197, 'completion_tokens': 112}, 'model': 'mistral-large-latest', 'finish_reason': 'stop'} id='run-4e538063-8da2-4f53-9d4c-7b84d020f0aa-0' usage_metadata={'input_tokens': 85, 'output_tokens': 112, 'total_tokens': 197}


In [34]:
chain = (
    RunnablePassthrough.assign(messages=itemgetter("messages") | trimmer)
    | prompt
    | model
)

In [35]:
with_message_history = RunnableWithMessageHistory(
    chain,
    get_session_history,
    input_messages_key="messages",
)

config = {"configurable": {"session_id": "abc20"}}

In [36]:
response = with_message_history.invoke(
    {
        "messages": messages + [HumanMessage(content="whats my name?")],
        "language": "English",
    },
    config=config,
)

response.content

'Based on the information provided, I do not know your name.'

## Streaming
- An LLM can take some time to produce a full response.
- For a better UX, it is good to stream the LLM's output token by token.
- All chains expose a .stream method, and ones that use message history are no different. We can simply use that method to get back a streaming response.

In [37]:
config = {"configurable": {"session_id": "abc15"}}
for r in with_message_history.stream(
    {
        "messages": [HumanMessage(content="hi! I'm todd. tell me a joke")],
        "language": "English",
    },
    config=config,
):
    print(r.content, end="|")

|Hello| Todd|!| Nice| to| meet| you|.| Here|'|s| a| joke| for| you|:|

Why| don|'|t| scientists| trust| atoms|?|

Because| they| make| up| everything|!||
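
The consuming side of streaming can be sketched with a plain Python generator. `fake_stream` below is a hypothetical stand-in for a model's streaming response, and it splits on spaces rather than on real tokens; the loop that prints each piece as it arrives and joins the pieces afterwards is the part that carries over.

```python
import time


def fake_stream(text, delay=0.0):
    # Hypothetical stand-in for a streaming response:
    # yields the reply one space-delimited piece at a time.
    for piece in text.split(" "):
        time.sleep(delay)  # simulate generation latency
        yield piece + " "


chunks = []
for chunk in fake_stream("Why don't scientists trust atoms?"):
    print(chunk, end="|", flush=True)  # show each piece as soon as it arrives
    chunks.append(chunk)

# The full reply is the concatenation of the streamed pieces.
full_reply = "".join(chunks).strip()
```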

## Chatbot process flow

```mermaid
flowchart LR

A[Message History] -->|store| B(Trimming)
B --> C(Streaming)
```