# 4. Building a Chatbot

## Setup

In [9]:
%pip install langgraph

import os

try:
    # load environment variables from .env file (requires `python-dotenv`)
    from dotenv import load_dotenv

    load_dotenv()
except ImportError:
    pass

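# Fail early with a clear message if a required environment variable is missing.
# (os.environ["X"] raises KeyError before an `is not None` check could ever run.)
for var in ("LANGSMITH_TRACING", "LANGSMITH_API_KEY", "LANGSMITH_PROJECT", "OPENAI_API_KEY"):
    assert os.environ.get(var), f"{var} is not set"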



In [10]:
from langchain.chat_models import init_chat_model
model = init_chat_model("gpt-4o-mini", model_provider="openai")

## 4.1 Introduction to LangGraph

### 4.1.1 LLMs Are Stateless

- **By default, an LLM does not retain context from previous invocations.** For example, if you tell an LLM your name in one invocation, it will not "remember" your name in subsequent invocations.

In [11]:
from langchain_core.messages import HumanMessage

intro_response = model.invoke([HumanMessage(content="Hi! I'm Bob")])
question_response = model.invoke([HumanMessage(content="What's my name?")])

In [12]:
intro_response.pretty_print()


Hi Bob! How can I assist you today?


In [13]:
question_response.pretty_print()


I'm sorry, but I don't know your name. If you tell me, I can use it while we chat!


- For the LLM to remember the name, the chat history has to be sent with each invocation.

In [14]:
from langchain_core.messages import AIMessage

question_response_with_history = model.invoke(
    [
        HumanMessage(content="Hi! I'm Bob"),
        AIMessage(content="Hello Bob! How can I assist you today?"),
        HumanMessage(content="What's my name?"),
    ]
)

question_response_with_history.pretty_print()


Your name is Bob! How can I help you today, Bob?
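
Resending the full history on every call can be wrapped in a small helper. The following is a minimal sketch in plain Python, not LangChain code: `fake_model` and `ChatSession` are hypothetical stand-ins, introduced here so that the snippet runs without an API key.

```python
import re
from typing import Callable

Message = tuple[str, str]  # (role, content)


def fake_model(messages: list[Message]) -> str:
    """Hypothetical stand-in for an LLM: it only 'knows' what is in `messages`."""
    for role, content in messages:
        match = re.search(r"I'm (\w+)", content)
        if role == "human" and match:
            return f"Your name is {match.group(1)}."
    return "I don't know your name."


class ChatSession:
    """Keeps the running history and resends all of it on every turn."""

    def __init__(self, model: Callable[[list[Message]], str]):
        self.model = model
        self.history: list[Message] = []

    def ask(self, text: str) -> str:
        self.history.append(("human", text))
        reply = self.model(self.history)  # full history, not just `text`
        self.history.append(("ai", reply))
        return reply


session = ChatSession(fake_model)
session.ask("Hi! I'm Bob")
with_history = session.ask("What's my name?")
without_history = fake_model([("human", "What's my name?")])
```

Here `with_history` can name the user because the first turn is part of the input, while `without_history` cannot. Keeping this bookkeeping out of application code is the job LangGraph's state and checkpointer take over in the sections that follow.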


### 4.1.2 LangGraph Introduction

- **LangGraph** is an open-source library from the creators of LangChain for building stateful, multi-step agent and workflow applications with reliable execution and persistence.
- Importantly, human/agent interactions are modeled as nodes in a graph, which makes the orchestration between humans and agents visible and easy to debug.

#### Nodes
- A node is a single action: e.g., call an LLM, run a tool, query a database, or invoke custom code.
- Nodes can represent different actors (LLMs, tools, humans).
- When a node represents a human, the graph encodes when to pause for human input, how to resume, and how actors exchange state.

#### Edges
- Edges connect nodes and decide what runs next.
- They can be unconditional, conditional (branching), looping, or fan-out/fan-in (parallel paths and joins).
- Graph branches can run in parallel.

#### Graph
- A workflow is the directed graph of nodes and edges.
- This makes the system explicit and debuggable: you can see the exact path the execution took.
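
To make nodes, edges, and the resulting execution path concrete, here is a toy graph runner in plain Python. It is only a conceptual sketch: `run_graph`, the `END` sentinel, and the `draft`/`review` nodes are hypothetical names for illustration and are not LangGraph's implementation or API.

```python
from typing import Callable, Union

State = dict
Node = Callable[[State], State]
# An edge is either a fixed next-node name or a router function of the state.
Edge = Union[str, Callable[[State], str]]

END = "__end__"  # hypothetical sentinel marking the end of the run


def run_graph(nodes: dict[str, Node], edges: dict[str, Edge], start: str, state: State):
    """Walk the graph from `start`; return the final state and the path taken."""
    path, current = [], start
    while current != END:
        path.append(current)
        state = {**state, **nodes[current](state)}  # merge the node's update
        edge = edges[current]
        current = edge(state) if callable(edge) else edge
    return state, path


nodes = {
    "draft": lambda s: {"attempts": s["attempts"] + 1},
    "review": lambda s: {"approved": s["attempts"] >= 2},
}
edges = {
    "draft": "review",  # unconditional edge
    "review": lambda s: END if s["approved"] else "draft",  # conditional edge (loop)
}

final_state, path = run_graph(nodes, edges, "draft", {"attempts": 0, "approved": False})
```

The recorded `path` is what makes a run debuggable: it shows that the conditional edge looped back to `draft` once before the run ended.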

#### State
- The workflow carries a state object (typically a structured dict/schema) that persists across nodes.
- Nodes read from state and propose updates to state rather than mutating it in place.
- After each node runs, LangGraph materialises a new state version.
- This **immutability per step** yields reproducibility, diff-ability, and clear audit trails.
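
The "propose an update, get a new version" idea can be sketched in a few lines of plain Python. This is an illustration under simplifying assumptions (a flat dict state, overwrite-only merging); `apply_update`, `greet`, and `shout` are hypothetical names, not LangGraph functions.

```python
from types import MappingProxyType


def apply_update(state: dict, update: dict) -> dict:
    """Return a new state version; the previous version is left untouched."""
    return {**state, **update}


def greet(state):
    # A node reads from a read-only view of the state and returns a partial update.
    return {"greeting": f"Hello, {state['name']}!"}


def shout(state):
    return {"greeting": state["greeting"].upper()}


versions = [{"name": "Bob"}]  # version 0: the initial input
for node in (greet, shout):
    update = node(MappingProxyType(versions[-1]))  # nodes cannot mutate in place
    versions.append(apply_update(versions[-1], update))
```

Because every entry in `versions` is a separate object, any two steps can be compared directly, which is the property the audit-trail and replay claims above rely on.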

#### Checkpoints
- Each step can be checkpointed (state snapshot + metadata).
- Checkpoints and state can be stored in memory or external stores (DBs, object storage).

- With checkpoints, you can retry or reroute without losing prior work.
- This enables resume after failure/timeouts, deterministic replay, and precise debugging from any point.

#### Thread
- A thread is one concrete run of the graph (e.g., a single user session).
- Each thread has its own state history and checkpoints, isolating concurrent users/sessions cleanly.
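
The relationship between threads and checkpoints can be illustrated with a small in-memory store keyed by thread ID. This is a conceptual sketch only: `InMemoryCheckpointStore` and `send` are hypothetical and far simpler than LangGraph's real checkpointers, which also record metadata and version information.

```python
import copy
from collections import defaultdict


class InMemoryCheckpointStore:
    """Hypothetical store: one append-only list of state snapshots per thread."""

    def __init__(self):
        self._threads = defaultdict(list)

    def save(self, thread_id: str, state: dict) -> int:
        """Snapshot the state and return its step index within the thread."""
        self._threads[thread_id].append(copy.deepcopy(state))
        return len(self._threads[thread_id]) - 1

    def latest(self, thread_id: str) -> dict:
        """Return a copy of the most recent snapshot, or an empty state."""
        history = self._threads[thread_id]
        return copy.deepcopy(history[-1]) if history else {"messages": []}

    def at(self, thread_id: str, step: int) -> dict:
        """Return a copy of an earlier snapshot, e.g. to resume or replay from it."""
        return copy.deepcopy(self._threads[thread_id][step])


store = InMemoryCheckpointStore()


def send(thread_id: str, text: str) -> dict:
    """Load the thread's latest state, add a message, and checkpoint the result."""
    state = store.latest(thread_id)
    state["messages"].append(text)
    store.save(thread_id, state)
    return state


send("alice", "Hi! I'm Alice")
send("bob", "Hi! I'm Bob")
alice_state = send("alice", "What's my name?")
```

Each thread sees only its own history, and `store.at("alice", 0)` returns the state as it was after the first step, which is the basis for resuming or replaying from an earlier point.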

![](./docs/langgraph-checkpoints.jpg)

## 4.2 Building a Chatbot

We'll build a simple graph. The graph will consist of:
- A single **node** that calls an LLM
- A **state** that is updated every time a node is called.
- A **checkpointer** that saves the state in-memory every time the state is updated.


In [16]:
from langgraph.checkpoint.memory import MemorySaver
from langgraph.graph import START, MessagesState, StateGraph

# Define a new graph
workflow = StateGraph(state_schema=MessagesState)


# Define the function that calls the model
def call_model(state: MessagesState):
    response = model.invoke(state["messages"])
    return {"messages": response}

# Add an edge from START to a node named "model", making it the graph's entry point.
workflow.add_edge(START, "model")
# Register call_model under the name "model", the same key the edge above points to.
workflow.add_node("model", call_model)

# Add memory
memory = MemorySaver()
app = workflow.compile(checkpointer=memory)

We now need to create a config that we pass to the runnable on every call. The config carries information that is not part of the input itself but is still needed. Here, we include a `thread_id`:

In [47]:
config = {"configurable": {"thread_id": "abc123"}}

This enables us to support multiple conversation threads with a single application, a common requirement when your application has multiple users.
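
In a multi-user application, the config is typically derived from whatever identifies the conversation. A minimal sketch, assuming a hypothetical `thread_config` helper and a `user:conversation` naming scheme (both are illustrative choices, not LangGraph requirements):

```python
def thread_config(user_id: str, conversation: str = "default") -> dict:
    """Build a per-conversation config; the key scheme here is an assumption."""
    return {"configurable": {"thread_id": f"{user_id}:{conversation}"}}


alice_config = thread_config("alice")
bob_config = thread_config("bob")
alice_support_config = thread_config("alice", "support")
```

Each distinct `thread_id` maps to its own state history, so Alice's two conversations and Bob's conversation would not see each other's messages.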

In [48]:
query = "Hi! I'm Bob."

input_messages = [HumanMessage(query)]
output = app.invoke({"messages": input_messages}, config)
output["messages"][-1].pretty_print()  # output contains all messages in state


Hi again, Bob! What would you like to talk about?


In [49]:
query = "What's my name?"

input_messages = [HumanMessage(query)]
output = app.invoke({"messages": input_messages}, config)
output["messages"][-1].pretty_print()


Your name is Bob. What’s on your mind?


If we change the `thread_id`, the chatbot no longer remembers who the human is:

In [50]:
config = {"configurable": {"thread_id": "abc234"}}

input_messages = [HumanMessage(query)]
output = app.invoke({"messages": input_messages}, config)
output["messages"][-1].pretty_print()


I'm sorry, but I don't have access to personal information about you unless you've shared it with me in this conversation. How can I assist you today?


**Every time we call the model, the history of messages is retrieved from the state and passed to the model:**
```python
def call_model(state: MessagesState):
    response = model.invoke(state["messages"])
    return {"messages": response}
```

Let's repeat this exercise with a different thread and see how `state["messages"]` changes with each invocation.

In [59]:
workflow = StateGraph(state_schema=MessagesState)

MESSAGES_HISTORY = []

def call_model(state: MessagesState):
    MESSAGES_HISTORY.append(state['messages'])
    response = model.invoke(state["messages"])
    return {"messages": response}


workflow.add_edge(START, "model")
workflow.add_node("model", call_model)

memory = MemorySaver()
app = workflow.compile(checkpointer=memory)

config = {"configurable": {"thread_id": "abc999"}}
queries = ["Hi! I'm Jim. I'm from the UK", "Who am I?", "Where am I from?"]

for query in queries:
    input_messages = [HumanMessage(query)]
    output = app.invoke({"messages": input_messages}, config)

MESSAGES_HISTORY

[[HumanMessage(content="Hi! I'm Jim. I'm from the UK", additional_kwargs={}, response_metadata={}, id='1aafd88c-bf1c-4cc8-aa7e-c34f81b37a3b')],
 [HumanMessage(content="Hi! I'm Jim. I'm from the UK", additional_kwargs={}, response_metadata={}, id='1aafd88c-bf1c-4cc8-aa7e-c34f81b37a3b'),
  AIMessage(content='Hi Jim! Nice to meet you. How can I assist you today?', additional_kwargs={'refusal': None}, response_metadata={'token_usage': {'completion_tokens': 15, 'prompt_tokens': 16, 'total_tokens': 31, 'completion_tokens_details': {'accepted_prediction_tokens': 0, 'audio_tokens': 0, 'reasoning_tokens': 0, 'rejected_prediction_tokens': 0}, 'prompt_tokens_details': {'audio_tokens': 0, 'cached_tokens': 0}}, 'model_name': 'gpt-4o-mini-2024-07-18', 'system_fingerprint': 'fp_b83a7d52ea', 'id': 'chatcmpl-CEU39j4KimKCR4Z9QCil1u0BcPbT4', 'service_tier': 'default', 'finish_reason': 'stop', 'logprobs': None}, id='run--54634dae-694f-4eaa-8a9a-b6785343e095-0', usage_metadata={'input_tokens': 16, 'output_

We see that every time a new human prompt is passed to the LLM, the previous history of human/AI messages is included. We can also see the persisted history in the checkpointer's memory:

In [53]:
memory.get_tuple(config)

CheckpointTuple(config={'configurable': {'thread_id': 'abc234', 'checkpoint_ns': '', 'checkpoint_id': '1f08eccc-0a75-6202-8001-c2a854691bde'}}, checkpoint={'v': 4, 'ts': '2025-09-11T05:04:10.514863+00:00', 'id': '1f08eccc-0a75-6202-8001-c2a854691bde', 'channel_versions': {'__start__': '00000000000000000000000000000002.0.8528319022299968', 'messages': '00000000000000000000000000000003.0.48025719766620845', 'branch:to:model': '00000000000000000000000000000003.0.48025719766620845'}, 'versions_seen': {'__input__': {}, '__start__': {'__start__': '00000000000000000000000000000001.0.3374717961583966'}, 'model': {'branch:to:model': '00000000000000000000000000000002.0.8528319022299968'}}, 'updated_channels': ['messages'], 'channel_values': {'messages': [HumanMessage(content="What's my name?", additional_kwargs={}, response_metadata={}, id='bab3ed37-49ab-4405-848b-41b753aac2b8'), AIMessage(content="I'm sorry, but I don't have access to personal information about you unless you've shared it with 

### 4.2.1 Prompt Templates

As we saw earlier, we can use `PromptTemplate`s to include system directives before the human input:

In [55]:
from langchain_core.prompts import ChatPromptTemplate, MessagesPlaceholder

prompt_template = ChatPromptTemplate.from_messages(
    [
        (
            "system",
            "You talk like a pirate. Answer all questions to the best of your ability.",
        ),
        MessagesPlaceholder(variable_name="messages"),
    ]
)

workflow = StateGraph(state_schema=MessagesState)

def call_model(state: MessagesState):
    prompt = prompt_template.invoke(state)
    response = model.invoke(prompt)
    return {"messages": response}


workflow.add_edge(START, "model")
workflow.add_node("model", call_model)

memory = MemorySaver()
app = workflow.compile(checkpointer=memory)

config = {"configurable": {"thread_id": "abc345"}}
query = "Hi! I'm Jim."

input_messages = [HumanMessage(query)]
output = app.invoke({"messages": input_messages}, config)
output["messages"][-1].pretty_print()


Ahoy there, Jim! What be bringin' ye to these here waters today? Speak yer mind, matey!


In [56]:
query = "What is my name?"

input_messages = [HumanMessage(query)]
output = app.invoke({"messages": input_messages}, config)
output["messages"][-1].pretty_print()


Yer name be Jim, if I be recallin' correctly, matey! What other treasures of knowledge can I assist ye with?


As mentioned earlier, the message history is sent every time we invoke the LLM. To do this with prompt templates, we use `MessagesPlaceholder`:

```python
prompt_template = ChatPromptTemplate.from_messages(
    [
        (
            "system",
            "You talk like a pirate. Answer all questions to the best of your ability.",
        ),
        MessagesPlaceholder(variable_name="messages"), # extract 'messages' from the state and insert them here.
    ]
)

workflow = StateGraph(state_schema=MessagesState)

def call_model(state: MessagesState):
    prompt = prompt_template.invoke(state) # state contains history of messages
    response = model.invoke(prompt)
    return {"messages": response}
```
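
Conceptually, the placeholder splices the list stored under `messages` into the prompt at its position. The plain-Python sketch below mimics that behaviour; `render_prompt` is a hypothetical function for illustration, not how `ChatPromptTemplate` is implemented.

```python
def render_prompt(template: list, state: dict) -> list:
    """Expand a template into a flat message list.

    Tuples are kept as literal (role, content) messages; a string entry names
    a key in `state` whose list of messages is spliced in at that position.
    """
    rendered = []
    for item in template:
        if isinstance(item, str):
            rendered.extend(state[item])  # the "placeholder" case
        else:
            rendered.append(item)
    return rendered


template = [
    ("system", "You talk like a pirate."),
    "messages",  # stands in for MessagesPlaceholder(variable_name="messages")
]
state = {
    "messages": [
        ("human", "Hi! I'm Jim."),
        ("ai", "Ahoy, Jim!"),
        ("human", "What is my name?"),
    ]
}

prompt = render_prompt(template, state)
```

The rendered `prompt` starts with the system directive and is followed by the whole conversation so far, which is why the pirate persona and the memory of the name both survive across turns.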

### 4.2.2 Prompt Templates With Variables

So far, our state only contains messages. What if the state also needs to carry other properties, e.g. `language`?
In this case, we need to extend the state schema to include the additional attributes:

In [64]:
from typing import Sequence

from langchain_core.messages import BaseMessage
from langgraph.graph.message import add_messages
from typing_extensions import Annotated, TypedDict

prompt_template = ChatPromptTemplate.from_messages(
    [
        (
            "system",
            "You are a helpful assistant. Answer all questions to the best of your ability in {language}.",
        ),
        MessagesPlaceholder(variable_name="messages"), # state['messages'] will be inserted in this placeholder.
    ]
)

class State(TypedDict):
    messages: Annotated[Sequence[BaseMessage], add_messages]  # conversation history; add_messages appends new messages
    language: str  # value for the {language} variable in the prompt template, stored alongside the messages


workflow = StateGraph(state_schema=State)

def call_model(state: State):
    prompt = prompt_template.invoke(state)
    response = model.invoke(prompt)
    return {"messages": [response]}


workflow.add_edge(START, "model")
workflow.add_node("model", call_model)

memory = MemorySaver()
app = workflow.compile(checkpointer=memory)

In [67]:
config = {"configurable": {"thread_id": "abc456"}}
query = "Hi! I'm Bob."
language = "Lebanese Arabic"

input_messages = [HumanMessage(query)]
output = app.invoke(
    {"messages": input_messages, "language": language},
    config,
)
output["messages"][-1].pretty_print()


مرحبًا بوب! كيفك اليوم؟ شو فيك تسأل أو تحكي؟


In [68]:
query = "What is my name?"

input_messages = [HumanMessage(query)]
output = app.invoke(
    {"messages": input_messages},  # `language` is omitted: the value persisted in the thread's state is reused
    config,
)
output["messages"][-1].pretty_print()


اسمك هو بوب!
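
The second call can omit `language` because the whole state is checkpointed per thread: `messages` is annotated with the `add_messages` reducer, so new messages are appended, while a plain field such as `language` keeps its last value until it is overwritten. The sketch below mimics that merge in plain Python; `REDUCERS` and `merge_input` are hypothetical names, and the real `add_messages` does more (for example, it can update an existing message by ID).

```python
import operator

# Per-key merge rules: keys without a reducer are simply overwritten.
REDUCERS = {"messages": operator.add}


def merge_input(persisted: dict, new_input: dict) -> dict:
    """Merge a (possibly partial) input into the persisted state."""
    merged = dict(persisted)
    for key, value in new_input.items():
        if key in REDUCERS and key in merged:
            merged[key] = REDUCERS[key](merged[key], value)  # e.g. append messages
        else:
            merged[key] = value  # e.g. set or replace `language`
    return merged


state = merge_input({}, {"messages": ["Hi! I'm Bob."], "language": "Lebanese Arabic"})
state = merge_input(state, {"messages": ["What is my name?"]})  # no `language` this time
```

After the second merge, `state` still carries `"Lebanese Arabic"`, which matches the behaviour seen above: the answer stays in the same language even though the input no longer specifies it.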


## 4.3 Managing Conversation History