# Build an Agent

By themselves, language models can't take actions - they just output text.
A big use case for LangChain is creating **agents**.
[Agents](/docs/concepts/agents) are systems that use [LLMs](/docs/concepts/chat_models) as reasoning engines to determine which actions to take and the inputs necessary to perform the action.
After executing actions, the results can be fed back into the LLM to determine whether more actions are needed, or whether it is okay to finish. This is often achieved via [tool-calling](/docs/concepts/tool_calling).
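
The loop described above can be sketched in plain Python, without any LangChain imports. Everything here is a hypothetical stand-in: `fake_model` plays the LLM with hard-coded rules, and `fake_search` plays a search tool. It only illustrates the control flow (decide, act, feed the result back, finish), not how LangChain implements it.

```python
# A toy agent loop. `fake_model` and `fake_search` are hypothetical stand-ins,
# not LangChain APIs; a real agent would call an LLM and a real tool instead.

def fake_search(query: str) -> str:
    """Pretend search tool that returns a canned result."""
    return f"Results for '{query}': sunny, 18°C"


TOOLS = {"search": fake_search}


def fake_model(messages: list[dict]) -> dict:
    """Pretend LLM: request a search first, then answer once a tool result exists."""
    last = messages[-1]
    if last["role"] == "tool":
        # A tool result is available, so produce a final answer.
        return {
            "role": "ai",
            "content": f"Here is what I found: {last['content']}",
            "tool_calls": [],
        }
    # No tool result yet, so ask for the search tool to be run.
    return {
        "role": "ai",
        "content": "",
        "tool_calls": [{"name": "search", "args": {"query": last["content"]}}],
    }


def run_agent(question: str, max_steps: int = 5) -> list[dict]:
    """Alternate between model and tools until the model stops requesting tools."""
    messages = [{"role": "human", "content": question}]
    for _ in range(max_steps):
        reply = fake_model(messages)
        messages.append(reply)
        if not reply["tool_calls"]:
            break  # The model produced a final answer.
        for call in reply["tool_calls"]:
            result = TOOLS[call["name"]](**call["args"])
            messages.append({"role": "tool", "content": result})
    return messages


history = run_agent("weather in SF")
print([m["role"] for m in history])
```

The printed roles follow the same human, AI, tool, AI sequence that the real agent produces later in this tutorial.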

In this tutorial we will build an agent that can interact with a search engine. You will be able to ask this agent questions, watch it call the search tool, and have conversations with it.

## End-to-end agent

The code snippet below represents a fully functional agent that uses an LLM to decide which tools to use. It is equipped with a generic search tool. It has conversational memory - meaning that it can be used as a multi-turn chatbot.

In the rest of the guide, we will walk through the individual components and what each part does - but if you want to just grab some code and get started, feel free to use this!

In [50]:
# Load environment variables from .env file
import dotenv
import os
dotenv.load_dotenv()

# Import LangChain components for building the agent
from langchain_openai import ChatOpenAI  # OpenAI's language model interface
from langchain_community.tools.tavily_search import TavilySearchResults  # Web search tool
from langchain_core.messages import HumanMessage  # Message structure for human input
from langgraph.checkpoint.memory import MemorySaver  # Memory persistence for conversations
from langgraph.prebuilt import create_react_agent  # Pre-built ReAct agent factory

# Initialize memory for conversation state persistence
memory = MemorySaver()

# Configure the language model (using OpenAI's GPT-4o-mini)
# Alternative Anthropic Claude model is commented out
# (it would also need: from langchain_anthropic import ChatAnthropic)
# model = ChatAnthropic(model_name="claude-3-sonnet-20240229")
model = ChatOpenAI(model="gpt-4o-mini")

# Configure the search tool with maximum 2 results per query
search = TavilySearchResults(max_results=2)

# Create tools list (can add more tools later)
tools = [search]

# Create the ReAct agent with model, tools, and memory checkpointing
agent_executor = create_react_agent(model, tools, checkpointer=memory)

In [52]:
# Test the agent with a simple greeting message
# Configure thread ID for conversation memory persistence
config = {"configurable": {"thread_id": "abc123"}}

# Stream the agent's response in real-time
# The agent processes the human message and responds appropriately
for step in agent_executor.stream(
    {"messages": [HumanMessage(content="hi im chadi! and i live in byblos, lebanon")]},
    config,
    stream_mode="values",  # Stream complete message values
):
    # Pretty print the last message in each step
    step["messages"][-1].pretty_print()


hi im chadi! and i live in byblos, lebanon

Hi Chadi! It's nice to meet you. Byblos is a beautiful city with a rich history. How can I assist you today?


In [54]:
# Demonstrate agent tool usage with a context-aware query
# The agent remembers that Chadi lives in Byblos, Lebanon from the previous message,
# so the search query below targets Byblos even though the question names no location
for step in agent_executor.stream(
    {"messages": [HumanMessage(content="whats the weather where I live? can  go to the beach?")]},
    config,  # Same thread ID to maintain conversation context
    stream_mode="values",
):
    # Display the conversation flow:
    # 1. Human asks about weather
    # 2. AI decides to use search tool
    # 3. Tool returns search results
    # 4. AI synthesizes response from search results
    step["messages"][-1].pretty_print()


whats the weather where I live? can  go to the beach?
Tool Calls:
  tavily_search_results_json (call_iN7YfmWSDyFY5z17p9x9Q4Ea)
 Call ID: call_iN7YfmWSDyFY5z17p9x9Q4Ea
  Args:
    query: Byblos Lebanon weather
Name: tavily_search_results_json

[{"title": "Byblos weather in December 2025 - Weather25.com", "url": "https://www.weather25.com/asia/lebanon/mont-liban/byblos?page=month&month=December", "content": "The temperatures in Byblos in December are usually low and can range between 55°F and 64°F. You can expect about 3 to 8 days of rain in Byblos during the month", "score": 0.9242583}, {"title": "Byblos weather in December 2025 | Lebanon: How hot?", "url": "https://www.weather2travel.com/lebanon/byblos/december/", "content": "[Back to months](#monthly)\n\n### Byblos weather in December\n\nExpect  **daytime maximum temperatures of 11°C** in **Byblos, Lebanon** in **December** based on long-term weather averages. There are **5 hours of sunshine per day** on average with **13 days with s

-----------
## Define tools

We first need to create the tools we want to use. Our main tool of choice will be [Tavily](/docs/integrations/tools/tavily_search), a search engine. LangChain has a built-in tool that makes it easy to use the Tavily search engine as a tool.


In [57]:
# Import and configure the Tavily search tool
from langchain_community.tools.tavily_search import TavilySearchResults

# Create search tool instance with max 2 results per query
search = TavilySearchResults(max_results=2)

# Test the search tool directly to see its output format
search_results = search.invoke("what is the weather in SF")
print(search_results)

# Additional tools can be created and added here as needed
# Once we have all the tools we want, we put them in a list for the agent
tools = [search]

[{'title': 'San Francisco weather in December 2025 - Weather25.com', 'url': 'https://www.weather25.com/north-america/usa/california/san-francisco?page=month&month=December', 'content': '| [7 ![Image 20: Moderate or heavy rain shower](https://res.weather25.com/images/weather_icons/new/day/rain-3.svg) 14°/9°](https://www.weather25.com/north-america/usa/california/san-francisco?page=past-weather#day=7&month=12 "Weather in 7 december 2025") | [8 ![Image 21: Overcast](https://res.weather25.com/images/weather_icons/new/day/overcast.svg) 14°/10°](https://www.weather25.com/north-america/usa/california/san-francisco?page=past-weather#day=8&month=12 "Weather in 8 december 2025") | [9 [...] The temperatures in San Francisco in December are quite cold with **temperatures between 8°C and 14°C**, warm clothes are a must.\n\nYou can expect about **3 to 8 days of rain** in San Francisco during the month of December. It’s a good idea to bring along your umbrella so that you don’t get caught in poor wea

## Using Language Models

Next, let's learn how to use a language model to call tools. LangChain supports many different language models that you can use interchangeably. The cell below uses OpenAI's `gpt-4o-mini`.



In [59]:
# Configure the language model (OpenAI)
# Other chat model integrations can be substituted here
from langchain_openai import ChatOpenAI  # OpenAI's language model interface

# Initialize the gpt-4o-mini model
model = ChatOpenAI(model="gpt-4o-mini")

You can call the language model by passing in a list of messages. The response is an `AIMessage`; by default, its `content` attribute is a string.

In [62]:
# Test basic language model interaction without tools
from langchain_core.messages import HumanMessage

# Send a simple message to the model and get response
response = model.invoke([HumanMessage(content="hi!")])

# Extract and display the text content of the response
response.content

'Hello! How can I assist you today?'

We can now see what it is like to enable this model to do tool calling. To do so, we use `.bind_tools` to give the language model knowledge of these tools.

In [None]:
# Bind tools to the language model to enable tool-calling capabilities
# This gives the model awareness of available tools and their schemas
model_with_tools = model.bind_tools(tools)

We can now call the model. Let's first call it with a normal message, and see how it responds. We can look at both the `content` field as well as the `tool_calls` field.

In [64]:
# Test the tool-enabled model with a non-tool requiring message
response = model_with_tools.invoke([HumanMessage(content="Hi!")])

# Examine both content and tool_calls fields
# For simple greetings, expect normal text response with no tool calls
print(f"ContentString: {response.content}")
print(f"ToolCalls: {response.tool_calls}")

ContentString: Hello! How can I assist you today?
ToolCalls: []


Now, let's try calling it with some input that should trigger a tool call.

In [66]:
# Test the tool-enabled model with a query that requires tool usage
response = model_with_tools.invoke([HumanMessage(content="What's the weather in SF?")])

# Notice how the model now:
# 1. Has empty content (no direct text response)
# 2. Has populated tool_calls with search query
# This shows the model recognized the need to use the search tool
print(f"ContentString: {response.content}")
print(f"ToolCalls: {response.tool_calls}")

ContentString: 
ToolCalls: [{'name': 'tavily_search_results_json', 'args': {'query': 'current weather San Francisco'}, 'id': 'call_Lc1SjHmt6XYe4z6gaZydWFID', 'type': 'tool_call'}]


We can see that there's now no text content, but there is a tool call! It wants us to call the Tavily Search tool.

This isn't calling that tool yet - it's just telling us to. In order to actually call it, we'll want to create our agent.
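
To make the distinction concrete, here is a minimal sketch of acting on a tool call by hand. The `tool_call` dict mirrors the shape printed above (with a made-up `id`), while `stub_search`, `registry`, and `execute` are hypothetical stand-ins so the snippet runs without an API key. It is an illustration of the idea, not the code the agent runs internally.

```python
# The model's output only *describes* a tool call; something else has to run it.
# `stub_search` is a hypothetical stand-in for the Tavily tool.

def stub_search(query: str) -> str:
    return f"(stub) top results for: {query}"


# Map tool names to callables so a requested call can be looked up by name.
registry = {"tavily_search_results_json": stub_search}

# Same shape as an entry in `response.tool_calls`; the id here is made up.
tool_call = {
    "name": "tavily_search_results_json",
    "args": {"query": "current weather San Francisco"},
    "id": "call_example",
    "type": "tool_call",
}


def execute(call: dict) -> dict:
    """Run the requested tool and wrap the result so it can be sent back to the model."""
    tool_fn = registry[call["name"]]
    output = tool_fn(**call["args"])
    # Keep the call id so the result can be matched to the request.
    return {"role": "tool", "tool_call_id": call["id"], "content": output}


tool_message = execute(tool_call)
print(tool_message)
```

An agent automates exactly this step: it runs each requested tool, sends the result back to the model, and repeats until the model stops asking for tools.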

## Create the agent

Now that we have defined the tools and the LLM, we can create the agent. We will be using [LangGraph](/docs/concepts/architecture/#langgraph) to construct the agent.
Currently, we are using a high-level interface to construct the agent, but the nice thing about LangGraph is that this high-level interface is backed by a low-level, highly controllable API in case you want to modify the agent logic.


Now, we can initialize the agent with the LLM and the tools.

Note that we are passing in the `model`, not `model_with_tools`. That is because `create_react_agent` will call `.bind_tools` for us under the hood.

In [68]:
# Create a ReAct (Reasoning and Acting) agent using LangGraph
from langgraph.prebuilt import create_react_agent

# Initialize the agent executor with:
# - model: The language model (without pre-bound tools)
# - tools: List of available tools (search in this case)
# Note: create_react_agent will automatically bind tools to the model
agent_executor = create_react_agent(model, tools)

## Run the agent

We can now run the agent with a few queries! For now, these are all **stateless** queries (the agent won't remember previous interactions). The agent returns the **final** state at the end of the interaction (which includes any inputs; we will see later on how to get only the outputs).

First up, let's see how it responds when there's no need to call a tool:

In [75]:
# Test agent with a simple greeting (no tools needed)
# The invoke method runs the agent synchronously and returns final state
response = agent_executor.invoke({"messages": [HumanMessage(content="hi!")]})

# The response contains all messages in the conversation:
# 1. Human message (input)
# 2. AI message (response)
# Note: No tool calls occurred since greeting doesn't require search
response["messages"]

[HumanMessage(content='hi!', additional_kwargs={}, response_metadata={}, id='27efca6c-b438-4074-8c26-ca7d8058afbc'),
 AIMessage(content='Hello! How can I assist you today?', additional_kwargs={'refusal': None}, response_metadata={'token_usage': {'completion_tokens': 10, 'prompt_tokens': 81, 'total_tokens': 91, 'completion_tokens_details': {'accepted_prediction_tokens': 0, 'audio_tokens': 0, 'reasoning_tokens': 0, 'rejected_prediction_tokens': 0}, 'prompt_tokens_details': {'audio_tokens': 0, 'cached_tokens': 0}}, 'model_name': 'gpt-4o-mini-2024-07-18', 'system_fingerprint': 'fp_34a54ae93c', 'finish_reason': 'stop', 'logprobs': None}, id='run--2d658eef-0273-4046-980d-b6a257297423-0', usage_metadata={'input_tokens': 81, 'output_tokens': 10, 'total_tokens': 91, 'input_token_details': {'audio': 0, 'cache_read': 0}, 'output_token_details': {'audio': 0, 'reasoning': 0}})]

To see exactly what is happening under the hood (and to make sure it's not calling a tool), we can take a look at the [LangSmith trace](https://smith.langchain.com/public/28311faa-e135-4d6a-ab6b-caecf6482aaa/r).

Let's now try it out on an example where it should invoke the tool.

In [77]:
# Test agent with a query requiring tool usage (weather information)
response = agent_executor.invoke(
    {"messages": [HumanMessage(content="whats the weather in sf?")]}
)

# The response now contains 4 messages showing the complete ReAct cycle:
# 1. HumanMessage: User's weather query
# 2. AIMessage: Agent decides to use search tool (tool_calls populated)
# 3. ToolMessage: Search results from Tavily API
# 4. AIMessage: Agent synthesizes final answer from search results
response["messages"]

[HumanMessage(content='whats the weather in sf?', additional_kwargs={}, response_metadata={}, id='8bda8f97-abba-426b-a492-d0eef994c1eb'),
 AIMessage(content='', additional_kwargs={'tool_calls': [{'id': 'call_d5sfNpAbqaUxsvVUyuoH7YLt', 'function': {'arguments': '{"query":"San Francisco weather"}', 'name': 'tavily_search_results_json'}, 'type': 'function'}], 'refusal': None}, response_metadata={'token_usage': {'completion_tokens': 20, 'prompt_tokens': 86, 'total_tokens': 106, 'completion_tokens_details': {'accepted_prediction_tokens': 0, 'audio_tokens': 0, 'reasoning_tokens': 0, 'rejected_prediction_tokens': 0}, 'prompt_tokens_details': {'audio_tokens': 0, 'cached_tokens': 0}}, 'model_name': 'gpt-4o-mini-2024-07-18', 'system_fingerprint': 'fp_34a54ae93c', 'finish_reason': 'tool_calls', 'logprobs': None}, id='run--1cbf92bc-d826-4053-b995-3136e6005c6d-0', tool_calls=[{'name': 'tavily_search_results_json', 'args': {'query': 'San Francisco weather'}, 'id': 'call_d5sfNpAbqaUxsvVUyuoH7YLt', 't

We can check out the [LangSmith trace](https://smith.langchain.com/public/f520839d-cd4d-4495-8764-e32b548e235d/r) to make sure it's calling the search tool effectively.

## Streaming Messages

We've seen how the agent can be called with `.invoke` to get a final response. If the agent executes multiple steps, this may take a while. To show intermediate progress, we can stream back messages as they occur.

In [79]:
# Demonstrate streaming agent execution to show intermediate steps
# stream_mode="values" returns complete message states as they're updated
for step in agent_executor.stream(
    {"messages": [HumanMessage(content="whats the weather in sf?")]},
    stream_mode="values",
):
    # Print each message as it becomes available:
    # 1. First shows human message
    # 2. Then shows AI deciding to use tool
    # 3. Then shows tool execution results
    # 4. Finally shows AI's synthesized response
    step["messages"][-1].pretty_print()


whats the weather in sf?
Tool Calls:
  tavily_search_results_json (call_9M13tm95oNGksicoBSij5u89)
 Call ID: call_9M13tm95oNGksicoBSij5u89
  Args:
    query: current weather in San Francisco
Name: tavily_search_results_json

[{"title": "San Francisco weather in December 2025 - Weather25.com", "url": "https://www.weather25.com/north-america/usa/california/san-francisco?page=month&month=December", "content": "The temperatures in San Francisco in December are quite cold with temperatures between 46°F and 57°F, warm clothes are a must.", "score": 0.85967577}, {"title": "San Francisco weather in June 2025 | Weather25.com", "url": "https://www.weather25.com/north-america/usa/california/san-francisco?page=month&month=June", "content": "| December | **14°** / 8° | 4 | 27 | 0 | 55 mm | Good | [San Francisco in December](https://www.weather25.com/north-america/usa/california/san-francisco?page=month&month=December) | [...] /13°](https://www.weather25.com/north-america/usa/california/san-francisc

## Streaming tokens

In addition to streaming back messages, it is also useful to stream back tokens.
We can do this by specifying `stream_mode="messages"`.


::: note

Below we call `.text()` on each streamed message chunk, which requires `langchain-core>=0.3.37`.

:::

In [81]:
# Demonstrate token-level streaming for real-time response generation
# stream_mode="messages" provides individual message chunks as they're generated
for step, metadata in agent_executor.stream(
    {"messages": [HumanMessage(content="whats the weather in sf?")]},
    stream_mode="messages",
):
    # Filter to only show text tokens from the agent's responses
    # langgraph_node == "agent" keeps LLM output and skips the tool node's results;
    # the truthiness check on text skips chunks that carry no text (e.g. tool calls)
    if metadata["langgraph_node"] == "agent" and (text := step.text()):
        # Print each token as it's generated, separated by | for visibility
        print(text, end="|")

I| couldn't| find| the| current| weather| for| San| Francisco| directly|.| However|,| you| can| check| out| detailed| weather| forecasts| on| sites| like| [|Weather|25|](|https|://|www|.weather|25|.com|/n|orth|-amer|ica|/|usa|/cal|ifornia|/s|an|-fr|anc|isco|)| for| the| latest| updates|.| If| you| need| more| specific| information| or| a| different| source|,| let| me| know|!|
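
The filtering logic in the loop above can be tried offline. In this sketch, `fake_stream` is a hypothetical generator that yields `(text, metadata)` pairs shaped loosely like the ones the loop consumes. Plain strings stand in for message chunks, so `.text()` is not needed here.

```python
from typing import Iterator


def fake_stream() -> Iterator[tuple[str, dict]]:
    """Hypothetical stand-in for a token stream: (text, metadata) pairs."""
    yield "", {"langgraph_node": "agent"}  # tool-call chunk: carries no text
    yield "[search results]", {"langgraph_node": "tools"}  # tool node output
    yield "It", {"langgraph_node": "agent"}
    yield " is", {"langgraph_node": "agent"}
    yield " sunny", {"langgraph_node": "agent"}


def agent_tokens(stream: Iterator[tuple[str, dict]]) -> Iterator[str]:
    """Keep only non-empty text emitted by the 'agent' node."""
    for text, metadata in stream:
        if metadata["langgraph_node"] == "agent" and text:
            yield text


tokens = list(agent_tokens(fake_stream()))
print("|".join(tokens))
```

Both conditions matter: the node check drops the tool's raw results, and the emptiness check drops agent chunks that only carry a tool call.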

## Adding in memory

As mentioned earlier, this agent is stateless. This means it does not remember previous interactions. To give it memory we need to pass in a checkpointer. When passing in a checkpointer, we also have to pass in a `thread_id` when invoking the agent (so it knows which thread/conversation to resume from).
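
Conceptually, a checkpointer keeps a separate saved state for each `thread_id`. The sketch below illustrates that idea with a plain dictionary. `ToyCheckpointer` and `chat` are hypothetical names and are much simpler than `MemorySaver`; they only track a list of user messages to show how threads stay isolated.

```python
from collections import defaultdict


class ToyCheckpointer:
    """Hypothetical in-memory store: one message history per thread_id."""

    def __init__(self) -> None:
        self._threads: dict[str, list[str]] = defaultdict(list)

    def load(self, thread_id: str) -> list[str]:
        # Return a copy so callers cannot mutate stored state by accident.
        return list(self._threads[thread_id])

    def save(self, thread_id: str, messages: list[str]) -> None:
        self._threads[thread_id] = list(messages)


def chat(store: ToyCheckpointer, config: dict, user_message: str) -> list[str]:
    """Resume the thread named in config, append the new message, and save it back."""
    thread_id = config["configurable"]["thread_id"]
    history = store.load(thread_id)
    history.append(user_message)
    store.save(thread_id, history)
    return history


store = ToyCheckpointer()
chat(store, {"configurable": {"thread_id": "abc123"}}, "hi im chadi!")
same_thread = chat(store, {"configurable": {"thread_id": "abc123"}}, "whats my name?")
new_thread = chat(store, {"configurable": {"thread_id": "xyz123"}}, "whats my name?")

print(same_thread)  # history includes the earlier introduction
print(new_thread)   # a different thread_id starts from an empty history
```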

In [83]:
# Initialize memory system for conversation persistence
from langgraph.checkpoint.memory import MemorySaver

# MemorySaver stores conversation state in memory (lost when session ends)
# For production, consider using persistent storage like SQLite or Redis
memory = MemorySaver()

In [85]:
# Recreate agent with memory capabilities
agent_executor = create_react_agent(model, tools, checkpointer=memory)

# Configure conversation thread ID for memory persistence
# Same thread_id = same conversation; different thread_id = new conversation
config = {"configurable": {"thread_id": "abc123"}}

In [87]:
# Test memory functionality by introducing the agent to a user
# This information should be remembered for subsequent interactions
for chunk in agent_executor.stream(
    {"messages": [HumanMessage(content="hi im chadi!")]}, config
):
    # Display each step of the agent's response
    # Agent acknowledges the introduction; the name "Chadi" is saved in the thread's state
    print(chunk)
    print("----")

{'agent': {'messages': [AIMessage(content='Hello Chadi! How can I assist you today?', additional_kwargs={'refusal': None}, response_metadata={'token_usage': {'completion_tokens': 12, 'prompt_tokens': 84, 'total_tokens': 96, 'completion_tokens_details': {'accepted_prediction_tokens': 0, 'audio_tokens': 0, 'reasoning_tokens': 0, 'rejected_prediction_tokens': 0}, 'prompt_tokens_details': {'audio_tokens': 0, 'cached_tokens': 0}}, 'model_name': 'gpt-4o-mini-2024-07-18', 'system_fingerprint': 'fp_34a54ae93c', 'finish_reason': 'stop', 'logprobs': None}, id='run--f3e4c370-5e57-44fe-89a7-b30863d77a49-0', usage_metadata={'input_tokens': 84, 'output_tokens': 12, 'total_tokens': 96, 'input_token_details': {'audio': 0, 'cache_read': 0}, 'output_token_details': {'audio': 0, 'reasoning': 0}})]}}
----


In [89]:
# Test memory recall by asking the agent about previously shared information
# Using the same config (thread_id) ensures conversation continuity
for chunk in agent_executor.stream(
    {"messages": [HumanMessage(content="whats my name?")]}, config
):
    # Agent successfully recalls the name "Chadi" from the previous interaction
    # This demonstrates working conversational memory
    print(chunk)
    print("----")

{'agent': {'messages': [AIMessage(content='Your name is Chadi! How can I help you today?', additional_kwargs={'refusal': None}, response_metadata={'token_usage': {'completion_tokens': 14, 'prompt_tokens': 108, 'total_tokens': 122, 'completion_tokens_details': {'accepted_prediction_tokens': 0, 'audio_tokens': 0, 'reasoning_tokens': 0, 'rejected_prediction_tokens': 0}, 'prompt_tokens_details': {'audio_tokens': 0, 'cached_tokens': 0}}, 'model_name': 'gpt-4o-mini-2024-07-18', 'system_fingerprint': 'fp_34a54ae93c', 'finish_reason': 'stop', 'logprobs': None}, id='run--c6ff644d-6f3d-488f-84ad-c134a5f8fefa-0', usage_metadata={'input_tokens': 108, 'output_tokens': 14, 'total_tokens': 122, 'input_token_details': {'audio': 0, 'cache_read': 0}, 'output_token_details': {'audio': 0, 'reasoning': 0}})]}}
----


If you want to start a new conversation, all you have to do is change the `thread_id` used.

In [91]:
# Demonstrate conversation isolation with a different thread_id
# New thread = fresh conversation with no memory of previous interactions
config = {"configurable": {"thread_id": "xyz123"}}

for chunk in agent_executor.stream(
    {"messages": [HumanMessage(content="whats my name?")]}, config
):
    # Agent doesn't know the name since this is a different conversation thread
    # This proves memory isolation between different thread_ids
    print(chunk)
    print("----")

{'agent': {'messages': [AIMessage(content="I'm sorry, but I don't have access to personal information about you unless you provide it. How can I assist you today?", additional_kwargs={'refusal': None}, response_metadata={'token_usage': {'completion_tokens': 26, 'prompt_tokens': 84, 'total_tokens': 110, 'completion_tokens_details': {'accepted_prediction_tokens': 0, 'audio_tokens': 0, 'reasoning_tokens': 0, 'rejected_prediction_tokens': 0}, 'prompt_tokens_details': {'audio_tokens': 0, 'cached_tokens': 0}}, 'model_name': 'gpt-4o-mini-2024-07-18', 'system_fingerprint': 'fp_34a54ae93c', 'finish_reason': 'stop', 'logprobs': None}, id='run--f9013d6c-451d-4188-bed6-a545e7893058-0', usage_metadata={'input_tokens': 84, 'output_tokens': 26, 'total_tokens': 110, 'input_token_details': {'audio': 0, 'cache_read': 0}, 'output_token_details': {'audio': 0, 'reasoning': 0}})]}}
----


## Conclusion

That's a wrap! In this quick start we covered how to create a simple agent.
We then showed how to stream back a response - not only the intermediate steps, but also tokens!
We also added memory so you can have a conversation with it.
Agents are a complex topic with lots to learn!

For more information on agents, please check out the [LangGraph](/docs/concepts/architecture/#langgraph) documentation. It has its own set of concepts, tutorials, and how-to guides.
