# Building a Chatbot with LangChain

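Install the required libraries. The version specifier is quoted so the shell does not treat `>` as an output redirect. `langchain` and `langchain-google-genai` are included because the cells below import `init_chat_model` and use the `google_genai` provider:

`pip install langchain langchain-core "langgraph>0.2.27" langchain-google-genai`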

## Import the model
* Choose a chat model for the chatbot
* Many providers are supported; this notebook uses Google Gemini, so get a Google API key and initialize the model

In [69]:
import getpass
import os

if not os.environ.get("GOOGLE_API_KEY"):
  os.environ["GOOGLE_API_KEY"] = getpass.getpass("Enter API key for Google Gemini: ")

from langchain.chat_models import init_chat_model

model = init_chat_model("gemini-2.5-flash", model_provider="google_genai")

### Memory of a chatbot
Compare the two outputs below. Each call to `model.invoke` is independent, so the model does not retain the name Bob from the first call when it answers the second.

In [70]:
from langchain_core.messages import HumanMessage

model.invoke([HumanMessage(content="Hi! I'm Bob")])

AIMessage(content="Hi Bob! Nice to meet you. I'm an AI, here to help.\n\nWhat can I do for you today?", additional_kwargs={}, response_metadata={'prompt_feedback': {'block_reason': 0, 'safety_ratings': []}, 'finish_reason': 'STOP', 'model_name': 'gemini-2.5-flash', 'safety_ratings': []}, id='run--60fea2bd-5d44-4d11-8408-4c19c8f02c5f-0', usage_metadata={'input_tokens': 7, 'output_tokens': 396, 'total_tokens': 403, 'input_token_details': {'cache_read': 0}, 'output_token_details': {'reasoning': 369}})

In [71]:
model.invoke([HumanMessage(content="What's my name?")])

AIMessage(content="As an AI, I don't have access to your personal information, including your name. You would need to tell me your name if you'd like me to know it!", additional_kwargs={}, response_metadata={'prompt_feedback': {'block_reason': 0, 'safety_ratings': []}, 'finish_reason': 'STOP', 'model_name': 'gemini-2.5-flash', 'safety_ratings': []}, id='run--bb9d948a-d8f9-462f-8d0f-ac9e8eb32d93-0', usage_metadata={'input_tokens': 7, 'output_tokens': 454, 'total_tokens': 461, 'input_token_details': {'cache_read': 0}, 'output_token_details': {'reasoning': 417}})

When the full conversation history is passed in, the model can answer that the user's name is Bob, which lets the conversation flow naturally.

In [76]:
from langchain_core.messages import AIMessage

model.invoke(
    [
        HumanMessage(content="Hi! I'm Bob"),
        AIMessage(content="Hello Bob! How can I assist you today?"),
        HumanMessage(content="What's my name?"),
    ]
)

AIMessage(content='Your name is Bob.', additional_kwargs={}, response_metadata={'prompt_feedback': {'block_reason': 0, 'safety_ratings': []}, 'finish_reason': 'STOP', 'model_name': 'gemini-2.5-flash', 'safety_ratings': []}, id='run--be49f151-c695-43c1-83cd-d73907734d5e-0', usage_metadata={'input_tokens': 25, 'output_tokens': 39, 'total_tokens': 64, 'input_token_details': {'cache_read': 0}, 'output_token_details': {'reasoning': 34}})
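
As a rough, library-free sketch of the pattern above: the only "memory" is the message list that gets resent on every call. `fake_model` and `chat_turn` are hypothetical names, and plain `(role, text)` tuples stand in for `HumanMessage` and `AIMessage`; no real LLM is called.

```python
from typing import Callable, List, Tuple

Message = Tuple[str, str]  # (role, text); simplified stand-in for HumanMessage / AIMessage


def fake_model(messages: List[Message]) -> str:
    """Hypothetical stand-in for model.invoke: it only 'knows' what is in `messages`."""
    prefix = "Hi! I'm "
    for role, text in messages:
        if role == "human" and text.startswith(prefix):
            name = text[len(prefix):].rstrip(".!")
            return f"Your name is {name}."
    return "I don't know your name."


def chat_turn(
    history: List[Message],
    user_text: str,
    model: Callable[[List[Message]], str] = fake_model,
) -> str:
    """Append the user turn, call the model on the full history, and record the reply."""
    history.append(("human", user_text))
    reply = model(history)
    history.append(("ai", reply))
    return reply


history: List[Message] = []
first_reply = chat_turn(history, "Hi! I'm Bob")
second_reply = chat_turn(history, "What's my name?")

# Without the accumulated history, the same question cannot be answered.
stateless_reply = fake_model([("human", "What's my name?")])

print(second_reply)
print(stateless_reply)
```

Keeping this list by hand works for a single conversation, but it has to be stored and reloaded somewhere between calls.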

## Memory Persistence
* LangGraph has a built-in persistence layer that stores messages throughout the conversation, so the chatbot can converse naturally across turns
* We wrap the model in a LangGraph graph and compile it with a checkpointer, which provides the memory
* Different backends (for example, databases) can be used for persistence; here we use the in-memory `MemorySaver`

In [77]:
from langgraph.checkpoint.memory import MemorySaver
from langgraph.graph import START, MessagesState, StateGraph

# Define a new graph
workflow = StateGraph(state_schema=MessagesState)


# Define the function that calls the model
def call_model(state: MessagesState):
    response = model.invoke(state["messages"])
    return {"messages": response}


# Define the (single) node in the graph
workflow.add_edge(START, "model")
workflow.add_node("model", call_model)

# Add memory
memory = MemorySaver()
app = workflow.compile(checkpointer=memory)

### What is config?
Passing a `thread_id` in the config gives the conversation an identifier. The checkpointer saves the conversation under that ID, so later calls with the same `thread_id` pick up where it left off.

In [78]:
config = {"configurable": {"thread_id": "abc123"}}

In [79]:
query = "Hi! I'm Bob."

input_messages = [HumanMessage(query)]
output = app.invoke({"messages": input_messages}, config)
output["messages"][-1].pretty_print()  # output contains all messages in state


Hi Bob! Nice to meet you. I'm a large language model, here to help you.

How can I assist you today?


In [80]:
query = "What's my name?"

input_messages = [HumanMessage(query)]
output = app.invoke({"messages": input_messages}, config)
output["messages"][-1].pretty_print()


Your name is Bob! You told me that in your first message. 😊


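Changing the `thread_id` in the config starts a new conversation with no saved history, so the chatbot no longer knows the user's name. Switching back to `abc123` restores the earlier conversation, as the second cell below shows.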

In [81]:
config = {"configurable": {"thread_id": "abc234"}}

input_messages = [HumanMessage(query)]
output = app.invoke({"messages": input_messages}, config)
output["messages"][-1].pretty_print()


As an AI, I don't have access to personal information about you, including your name. I don't store user data or remember past conversations.

How can I help you today?


In [82]:
config = {"configurable": {"thread_id": "abc123"}}

input_messages = [HumanMessage(query)]
output = app.invoke({"messages": input_messages}, config)
output["messages"][-1].pretty_print()


Your name is Bob!

You've asked me that a couple of times now, and my answer is still the same. Is there something specific you're trying to figure out or remember?
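
The thread behaviour seen above can be pictured as a dictionary keyed by `thread_id`. The sketch below is an assumption-laden toy, not how `MemorySaver` is actually implemented; `ThreadMemory` is a hypothetical class that stores plain strings instead of message objects.

```python
from collections import defaultdict
from typing import Dict, List


class ThreadMemory:
    """Toy per-thread store; a simplified illustration of keying history by thread_id."""

    def __init__(self) -> None:
        self._threads: Dict[str, List[str]] = defaultdict(list)

    def append(self, config: dict, message: str) -> List[str]:
        """Add a message to the thread named in `config` and return that thread's history."""
        thread_id = config["configurable"]["thread_id"]
        self._threads[thread_id].append(message)
        return list(self._threads[thread_id])


store = ThreadMemory()
first = {"configurable": {"thread_id": "abc123"}}
second = {"configurable": {"thread_id": "abc234"}}

store.append(first, "Hi! I'm Bob.")
history_second = store.append(second, "What's my name?")  # new thread: no earlier messages
history_first = store.append(first, "What's my name?")    # original thread: name is still there

print(history_second)
print(history_first)
```

Each thread only ever sees its own messages, which matches the forgetting and remembering shown in the cells above.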


In [83]:
# Async function for node:
async def call_model(state: MessagesState):
    response = await model.ainvoke(state["messages"])
    return {"messages": response}


# Define graph as before. The new MemorySaver starts empty, so history from earlier cells is gone:
workflow = StateGraph(state_schema=MessagesState)
workflow.add_edge(START, "model")
workflow.add_node("model", call_model)
app = workflow.compile(checkpointer=MemorySaver())

# Async invocation:
output = await app.ainvoke({"messages": input_messages}, config)
output["messages"][-1].pretty_print()


As an AI, I don't have access to any personal information about you, including your name. I don't store or remember details about our past conversations or your identity.


## Prompt Templates
A prompt template sets up the LLM before the conversation is passed in. Until now, we passed messages to the LLM with no initial instructions. With a template, the messages are combined with a system message before they reach the model. In this example, the conversation fills the `MessagesPlaceholder`, and the system message makes the LLM reply in a princess-like style.

In [84]:
from langchain_core.prompts import ChatPromptTemplate, MessagesPlaceholder

prompt_template = ChatPromptTemplate.from_messages(
    [
        (
            "system",
            "You talk like a princess. Answer all questions to the best of your ability.",
        ),
        MessagesPlaceholder(variable_name="messages"),
    ]
)

Inside `call_model`, the state is first passed through the template, and the resulting prompt is sent to the LLM, which returns a message:
```python
    prompt = prompt_template.invoke(state)
    response = model.invoke(prompt)
```
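
Conceptually, the template step fills any named slots in the system message and places the conversation where the placeholder sits. The following is a minimal sketch under that assumption, using plain tuples rather than LangChain message objects; `render_prompt` is a hypothetical helper, not part of `langchain_core`.

```python
from typing import Any, Dict, List, Tuple

Message = Tuple[str, str]  # (role, text)


def render_prompt(system_template: str, state: Dict[str, Any]) -> List[Message]:
    """Fill the system template from the state and splice the conversation in after it."""
    conversation = list(state["messages"])
    # Every key other than "messages" is treated as a template variable.
    variables = {key: value for key, value in state.items() if key != "messages"}
    system_text = system_template.format(**variables)
    return [("system", system_text)] + conversation


princess_prompt = render_prompt(
    "You talk like a princess. Answer all questions to the best of your ability.",
    {"messages": [("human", "Hi! I'm Jim.")]},
)

# A template with a named slot, filled from an extra key in the state.
spanish_prompt = render_prompt(
    "You are a helpful assistant. Answer in {language}.",
    {"messages": [("human", "Hi! I'm Bob.")], "language": "Spanish"},
)

print(princess_prompt)
print(spanish_prompt)
```

The model never sees the template itself, only the rendered list of messages.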

In [85]:
workflow = StateGraph(state_schema=MessagesState)


def call_model(state: MessagesState):
    prompt = prompt_template.invoke(state)
    response = model.invoke(prompt)
    return {"messages": response}


workflow.add_edge(START, "model")
workflow.add_node("model", call_model)

memory = MemorySaver()
app = workflow.compile(checkpointer=memory)

In [86]:
config = {"configurable": {"thread_id": "abc345"}}
query = "Hi! I'm Jim."

input_messages = [HumanMessage(query)]
output = app.invoke({"messages": input_messages}, config)
output["messages"][-1].pretty_print()


Oh, hello there, Jim! It is such a delightful pleasure to make your acquaintance. I am simply enchanted to meet you!


The LLM remembers from the history that I gave my name as Jim, and it still responds in a princess-like style.

In [87]:
query = "What is my name?"

input_messages = [HumanMessage(query)]
output = app.invoke({"messages": input_messages}, config)
output["messages"][-1].pretty_print()


Why, your name is Jim! You told me just a moment ago, and I remember it perfectly, just like a lovely melody! It's a very nice name, if I do say so myself.


In [88]:
prompt_template = ChatPromptTemplate.from_messages(
    [
        (
            "system",
            "You are a helpful assistant. Answer all questions to the best of your ability in {language}.",
        ),
        MessagesPlaceholder(variable_name="messages"),
    ]
)

Since the template now has an extra `language` parameter, the application's state has to be revised to include it. The state now has two keys: `messages` and `language`.

```python
class State(TypedDict):
    messages: Annotated[Sequence[BaseMessage], add_messages]
    language: str
```

In [89]:
from typing import Sequence

from langchain_core.messages import BaseMessage
from langgraph.graph.message import add_messages
from typing_extensions import Annotated, TypedDict


class State(TypedDict):
    messages: Annotated[Sequence[BaseMessage], add_messages]
    language: str


workflow = StateGraph(state_schema=State)


def call_model(state: State):
    prompt = prompt_template.invoke(state)
    response = model.invoke(prompt)
    return {"messages": [response]}


workflow.add_edge(START, "model")
workflow.add_node("model", call_model)

memory = MemorySaver()
app = workflow.compile(checkpointer=memory)

In [90]:
config = {"configurable": {"thread_id": "abc456"}}
query = "Hi! I'm Bob."
language = "Spanish"

input_messages = [HumanMessage(query)]
output = app.invoke(
    {"messages": input_messages, "language": language},
    config,
)
output["messages"][-1].pretty_print()


¡Hola, Bob! ¿En qué puedo ayudarte hoy?


In the previous cell, we had to pass in the language because it was the first time this thread's state was set. The entire state is persisted, so as long as the language does not change we can omit it from the input and the chatbot keeps responding in Spanish.

In [91]:
query = "What is my name?"

input_messages = [HumanMessage(query)]
output = app.invoke(
    {"messages": input_messages},
    config,
)
output["messages"][-1].pretty_print()


Tu nombre es Bob.
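
One way to picture why `language` can be omitted on the second call: the saved state is merged with each new input, with new messages appended and other keys kept unless they are overwritten. The sketch below assumes that simplified merge rule; `merge_state` is a hypothetical helper, and the real `add_messages` reducer also handles details such as message IDs.

```python
from typing import Any, Dict


def merge_state(saved: Dict[str, Any], update: Dict[str, Any]) -> Dict[str, Any]:
    """Merge an update into saved state: append messages, overwrite other keys if given."""
    merged = dict(saved)
    for key, value in update.items():
        if key == "messages":
            # Reducer-style behaviour: new messages extend the existing list.
            merged["messages"] = list(saved.get("messages", [])) + list(value)
        else:
            # Keys without a reducer are simply replaced.
            merged[key] = value
    return merged


saved: Dict[str, Any] = {}
saved = merge_state(saved, {"messages": ["Hi! I'm Bob."], "language": "Spanish"})
saved = merge_state(saved, {"messages": ["What is my name?"]})  # language omitted

print(saved)
```

Passing a different `language` on a later call would overwrite the stored value in the same way.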


## Managing Conversation History
Left unmanaged, the list of messages keeps growing: every turn adds more messages to the conversation history. The history needs a limit because LLMs have a finite context window (similar to a human's attention span). As the history approaches that limit, the LLM may:
* Forget parts of an older conversation
* Generate incomplete responses
* Become slower or more expensive

In [93]:
from langchain_core.messages import SystemMessage, trim_messages

trimmer = trim_messages(
    max_tokens=65,
    strategy="last",
    token_counter=model,
    include_system=True,
    allow_partial=False,
    start_on="human",
)

messages = [
    SystemMessage(content="you're a good assistant"),
    HumanMessage(content="hi! I'm bob"),
    AIMessage(content="hi!"),
    HumanMessage(content="I like vanilla ice cream"),
    AIMessage(content="nice"),
    HumanMessage(content="whats 2 + 2"),
    AIMessage(content="4"),
    HumanMessage(content="thanks"),
    AIMessage(content="no problem!"),
    HumanMessage(content="having fun?"),
    AIMessage(content="yes!"),
]

trimmer.invoke(messages)

[SystemMessage(content="you're a good assistant", additional_kwargs={}, response_metadata={}),
 HumanMessage(content="hi! I'm bob", additional_kwargs={}, response_metadata={}),
 AIMessage(content='hi!', additional_kwargs={}, response_metadata={}),
 HumanMessage(content='I like vanilla ice cream', additional_kwargs={}, response_metadata={}),
 AIMessage(content='nice', additional_kwargs={}, response_metadata={}),
 HumanMessage(content='whats 2 + 2', additional_kwargs={}, response_metadata={}),
 AIMessage(content='4', additional_kwargs={}, response_metadata={}),
 HumanMessage(content='thanks', additional_kwargs={}, response_metadata={}),
 AIMessage(content='no problem!', additional_kwargs={}, response_metadata={}),
 HumanMessage(content='having fun?', additional_kwargs={}, response_metadata={}),
 AIMessage(content='yes!', additional_kwargs={}, response_metadata={})]

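The trimmer bounds how many tokens of history are sent to the model, which conserves resources. With `strategy="last"` it keeps the most recent messages, `include_system=True` keeps the system message, and `start_on="human"` makes the kept history begin with a human message. In the output above every message was returned, so this list fit within `max_tokens` on this run.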
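
To make the trimming rules concrete, here is a simplified sketch of a "keep the last messages that fit" strategy. It is not the `trim_messages` implementation: `trim_last` and `count_words` are hypothetical, and counting whitespace-separated words is only a crude stand-in for a real token counter.

```python
from typing import Callable, List, Tuple

Message = Tuple[str, str]  # (role, text)


def count_words(message: Message) -> int:
    """Crude token counter (an assumption): one token per whitespace-separated word."""
    return len(message[1].split())


def trim_last(
    messages: List[Message],
    max_tokens: int,
    counter: Callable[[Message], int] = count_words,
) -> List[Message]:
    """Keep a leading system message plus the newest messages that fit in max_tokens."""
    system = [m for m in messages[:1] if m[0] == "system"]
    rest = messages[len(system):]
    budget = max_tokens - sum(counter(m) for m in system)

    kept: List[Message] = []
    for message in reversed(rest):  # walk from newest to oldest
        cost = counter(message)
        if cost > budget:
            break  # whole messages only, in the spirit of allow_partial=False
        kept.append(message)
        budget -= cost
    kept.reverse()

    # Drop leading non-human messages so the window starts on a human turn.
    while kept and kept[0][0] != "human":
        kept.pop(0)
    return system + kept


sample = [
    ("system", "you're a good assistant"),
    ("human", "hi! I'm bob"),
    ("ai", "hi!"),
    ("human", "whats 2 + 2"),
    ("ai", "4"),
    ("human", "having fun?"),
    ("ai", "yes!"),
]
trimmed = trim_last(sample, max_tokens=12)
print(trimmed)
```

With this small budget the oldest exchange, which contains the name, is dropped while the system message and the recent turns survive.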

In [94]:
workflow = StateGraph(state_schema=State)


def call_model(state: State):
    trimmed_messages = trimmer.invoke(state["messages"])
    prompt = prompt_template.invoke(
        {"messages": trimmed_messages, "language": state["language"]}
    )
    response = model.invoke(prompt)
    return {"messages": [response]}


workflow.add_edge(START, "model")
workflow.add_node("model", call_model)

memory = MemorySaver()
app = workflow.compile(checkpointer=memory)

The trimmer is applied before each model call. If the message containing the name falls outside the retained window, the LLM never sees it and cannot recall the name.

In [95]:
config = {"configurable": {"thread_id": "abc567"}}
query = "What is my name?"
language = "English"

input_messages = messages + [HumanMessage(query)]
output = app.invoke(
    {"messages": input_messages, "language": language},
    config,
)
output["messages"][-1].pretty_print()


I don't know your name. As an AI, I don't store personal information about you.


The math question is among the most recent messages, so it survives trimming and the LLM can answer.

In [96]:
config = {"configurable": {"thread_id": "abc678"}}
query = "What math problem did I ask?"
language = "English"

input_messages = messages + [HumanMessage(query)]
output = app.invoke(
    {"messages": input_messages, "language": language},
    config,
)
output["messages"][-1].pretty_print()


You asked "What's 2 + 2?"


## Streaming
LLMs can take a while to respond, so chatbots often stream the response: text is shown piece by piece as it is generated, the way ChatGPT does. The user sees progress in real time instead of waiting for the full reply.

In [98]:
config = {"configurable": {"thread_id": "abc789"}}
query = "Hi I'm Todd, please tell me a joke."
language = "English"

input_messages = [HumanMessage(query)]
for chunk, metadata in app.stream(
    {"messages": input_messages, "language": language},
    config,
    stream_mode="messages",
):
    if isinstance(chunk, AIMessage):  # Filter to just model responses
        print(chunk.content, end="|")

Hi Todd! Here's another one for you:

What do you call a fish with no eyes?

Fsh!

Hope that made you smile!|
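
The idea behind streaming can be sketched with a plain Python generator that yields a reply in pieces. `stream_words` is a hypothetical stand-in for the chunks produced by `app.stream`; real chunk boundaries are decided by the model provider, not by word splitting.

```python
from typing import Iterator, List


def stream_words(text: str) -> Iterator[str]:
    """Yield the reply piece by piece, as a stand-in for chunks from a streaming model."""
    words = text.split(" ")
    for index, word in enumerate(words):
        # Re-attach the space that split() removed, except after the last word.
        yield word if index == len(words) - 1 else word + " "


chunks: List[str] = []
for chunk in stream_words("What do you call a fish with no eyes?"):
    print(chunk, end="|")  # same delimiter as the cell above, to show chunk boundaries
    chunks.append(chunk)
print()
```

In the cell above only one `|` appears, which suggests the whole reply arrived as a single chunk on that run.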