# Advanced LLM agents with LangGraph and LangSmith

In this tutorial, you will learn how to:

- Use language models, in particular their tool-calling ability
- Use a search tool to look up information on the internet
- Compose a [LangGraph](https://langchain-ai.github.io/langgraph/tutorials/introduction/) agent, which uses an LLM to decide which actions to take and then executes them

**LangSmith:**   
Many of the applications you build with LangChain will contain multiple steps, each with one or more LLM calls. As these applications grow more complex, it becomes crucial to be able to inspect exactly what is going on inside your chain or agent. The best way to do this is with LangSmith.

In [1]:
import os

from dotenv import load_dotenv, find_dotenv
_ = load_dotenv(find_dotenv()) # read local .env file

os.environ["LANGCHAIN_TRACING_V2"] = "true"
# os.environ["LANGCHAIN_API_KEY"] = getpass.getpass()

import warnings
warnings.filterwarnings("ignore")

import pprint
# A function for printing nicely
def nprint(text, indent=2):
    pp = pprint.PrettyPrinter(indent=indent)
    pp.pprint(text)

## Define tools
We first need to create the tools we want to use. Our main tool of choice will be Tavily, a search engine. LangChain has a built-in tool that makes it easy to use the Tavily search engine as a tool.

In [19]:
from langchain_community.tools.tavily_search import TavilySearchResults

search = TavilySearchResults(max_results=2)
search_results = search.invoke("what is the weather in SF")
print(search_results)
# If we want, we can create other tools.
# Once we have all the tools we want, we can put them in a list that we will reference later.
tools = [search]

[{'url': 'https://www.weatherapi.com/', 'content': "{'location': {'name': 'San Francisco', 'region': 'California', 'country': 'United States of America', 'lat': 37.78, 'lon': -122.42, 'tz_id': 'America/Los_Angeles', 'localtime_epoch': 1718313877, 'localtime': '2024-06-13 14:24'}, 'current': {'last_updated_epoch': 1718313300, 'last_updated': '2024-06-13 14:15', 'temp_c': 17.8, 'temp_f': 64.0, 'is_day': 1, 'condition': {'text': 'Sunny', 'icon': '//cdn.weatherapi.com/weather/64x64/day/113.png', 'code': 1000}, 'wind_mph': 12.5, 'wind_kph': 20.2, 'wind_degree': 300, 'wind_dir': 'WNW', 'pressure_mb': 1016.0, 'pressure_in': 29.99, 'precip_mm': 0.0, 'precip_in': 0.0, 'humidity': 63, 'cloud': 0, 'feelslike_c': 17.8, 'feelslike_f': 64.0, 'windchill_c': 15.9, 'windchill_f': 60.6, 'heatindex_c': 15.9, 'heatindex_f': 60.7, 'dewpoint_c': 9.6, 'dewpoint_f': 49.3, 'vis_km': 16.0, 'vis_miles': 9.0, 'uv': 5.0, 'gust_mph': 16.7, 'gust_kph': 26.9}}"}, {'url': 'https://www.wunderground.com/hourly/us/ca/san

In [20]:
nprint(search_results[0]['content'])

("{'location': {'name': 'San Francisco', 'region': 'California', 'country': "
 "'United States of America', 'lat': 37.78, 'lon': -122.42, 'tz_id': "
 "'America/Los_Angeles', 'localtime_epoch': 1718313877, 'localtime': "
 "'2024-06-13 14:24'}, 'current': {'last_updated_epoch': 1718313300, "
 "'last_updated': '2024-06-13 14:15', 'temp_c': 17.8, 'temp_f': 64.0, "
 "'is_day': 1, 'condition': {'text': 'Sunny', 'icon': "
 "'//cdn.weatherapi.com/weather/64x64/day/113.png', 'code': 1000}, 'wind_mph': "
 "12.5, 'wind_kph': 20.2, 'wind_degree': 300, 'wind_dir': 'WNW', "
 "'pressure_mb': 1016.0, 'pressure_in': 29.99, 'precip_mm': 0.0, 'precip_in': "
 "0.0, 'humidity': 63, 'cloud': 0, 'feelslike_c': 17.8, 'feelslike_f': 64.0, "
 "'windchill_c': 15.9, 'windchill_f': 60.6, 'heatindex_c': 15.9, "
 "'heatindex_f': 60.7, 'dewpoint_c': 9.6, 'dewpoint_f': 49.3, 'vis_km': 16.0, "
 "'vis_miles': 9.0, 'uv': 5.0, 'gust_mph': 16.7, 'gust_kph': 26.9}}")
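The `content` field printed above is a string holding a Python-style dict (single quotes), not JSON, so `json.loads` would reject it. One way to read individual values is `ast.literal_eval`, as in the minimal sketch below. It uses a shortened, hand-copied subset of the output above; whether the search tool always returns content in this shape is an assumption.

```python
import ast

# Shortened copy of the 'content' string shown above (illustrative subset of fields).
content = (
    "{'location': {'name': 'San Francisco', 'region': 'California'}, "
    "'current': {'temp_c': 17.8, 'temp_f': 64.0, "
    "'condition': {'text': 'Sunny', 'code': 1000}}}"
)

# literal_eval only evaluates Python literals (dicts, strings, numbers),
# so unlike eval() it does not execute arbitrary code.
weather = ast.literal_eval(content)

city = weather["location"]["name"]
temp_c = weather["current"]["temp_c"]
condition = weather["current"]["condition"]["text"]

summary = f"{city}: {temp_c} C, {condition}"
print(summary)
```

In practice it is worth wrapping the parse in a `try`/`except (ValueError, SyntaxError)`, since other search hits may contain plain text rather than a dict literal.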


## Using Language Models
Next, let's learn how to use a language model to call tools. LangChain supports many different language models that you can use interchangeably. Select the one you want to use below!

In [30]:
from langchain_openai import ChatOpenAI
modelID = "gpt-3.5-turbo"
model = ChatOpenAI(model=modelID)

In [27]:
from langchain_core.messages import HumanMessage

response = model.invoke([HumanMessage(content="hi!")])
response.content

'Hello! How can I assist you today?'

We can now see what it is like to enable this model to do tool calling. To do that, we use .bind_tools to give the language model knowledge of these tools.

In [26]:
model_with_tools = model.bind_tools(tools)
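Conceptually, giving the model "knowledge" of a tool means describing it by a name, a description, and a schema for its arguments. The sketch below builds such a description in the function-tool layout used by OpenAI's chat API. It is only an illustration, not LangChain's conversion code: `search_web` and `describe_tool` are hypothetical names, and the type mapping is deliberately incomplete.

```python
import inspect
import json


def search_web(query: str) -> str:
    """Search the web and return results as text."""
    # Hypothetical stand-in tool so the sketch runs offline.
    return f"results for {query}"


# Map a few Python annotations to JSON Schema type names (deliberately incomplete).
_JSON_TYPES = {str: "string", int: "integer", float: "number", bool: "boolean"}


def describe_tool(func):
    """Build a function-tool description from a function's signature and docstring."""
    properties = {}
    required = []
    for name, param in inspect.signature(func).parameters.items():
        properties[name] = {"type": _JSON_TYPES.get(param.annotation, "string")}
        # Parameters without a default value must be supplied by the model.
        if param.default is inspect.Parameter.empty:
            required.append(name)
    return {
        "type": "function",
        "function": {
            "name": func.__name__,
            "description": inspect.getdoc(func) or "",
            "parameters": {
                "type": "object",
                "properties": properties,
                "required": required,
            },
        },
    }


schema = describe_tool(search_web)
print(json.dumps(schema, indent=2))
```

The model never sees the function body, only this description, which is why clear tool names and docstrings matter.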

We can now call the model. Let's first call it with a normal message, and see how it responds. We can look at both the content field as well as the tool_calls field.

In [28]:
response = model_with_tools.invoke([HumanMessage(content="Hi!")])

print(f"ContentString: {response.content}")
print(f"ToolCalls: {response.tool_calls}")

ContentString: Hello! How can I assist you today?
ToolCalls: []


The model did not request any tool call for this prompt.

Now, let's try calling it with some input that should trigger a tool call.

In [29]:
response = model_with_tools.invoke([HumanMessage(content="What's the weather in SF?")])

print(f"ContentString: {response.content}")
print(f"ToolCalls: {response.tool_calls}")
# response

ContentString: 
ToolCalls: [{'name': 'tavily_search_results_json', 'args': {'query': 'current weather in San Francisco'}, 'id': 'call_ZiclSw6CK0kZvbb9k3LYTZa6'}]
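Note that the model has only asked for a tool call here; nothing has been executed yet. The following sketch shows one way such a request could be dispatched by hand. It assumes the `name`/`args`/`id` layout printed above; `fake_search`, `tool_registry`, and `run_tool_calls` are hypothetical names, and the search function is an offline stand-in rather than Tavily.

```python
# Hypothetical stand-in for the real search tool, so the sketch runs offline.
def fake_search(query: str) -> str:
    return f"(pretend search results for: {query})"


# Registry keyed by the tool name the model uses in its tool calls.
tool_registry = {"tavily_search_results_json": fake_search}

# Same shape as the ToolCalls list printed above.
tool_calls = [
    {
        "name": "tavily_search_results_json",
        "args": {"query": "current weather in San Francisco"},
        "id": "call_ZiclSw6CK0kZvbb9k3LYTZa6",
    }
]


def run_tool_calls(calls, registry):
    """Execute each requested call and pair the output with the call's id."""
    results = []
    for call in calls:
        tool = registry.get(call["name"])
        if tool is None:
            # Report unknown tools instead of raising, so the model could recover.
            output = f"Error: unknown tool {call['name']!r}"
        else:
            output = tool(**call["args"])
        results.append({"tool_call_id": call["id"], "content": output})
    return results


tool_results = run_tool_calls(tool_calls, tool_registry)
print(tool_results)
```

Keeping the call id alongside each result is what lets the model match an answer to the request that produced it when several tools are called in one turn.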


## Create the agent
Now that we have defined the tools and the LLM, we can create the agent. We will be using LangGraph to construct the agent. Currently we are using a high-level interface to construct the agent, but the nice thing about LangGraph is that this high-level interface is backed by a low-level, highly controllable API in case you want to modify the agent logic.

Now, we can initialize the agent with the LLM and the tools.

Note that we are passing in the model, not model_with_tools. That is because create_react_agent will call .bind_tools for us under the hood.

In [32]:
from langgraph.prebuilt import create_react_agent

agent_executor = create_react_agent(model, tools)
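To make the idea of "an LLM that decides on actions, which are then executed" concrete, here is a simplified loop in plain Python. It is an illustration of the general pattern only, not LangGraph's implementation: `fake_model`, `fake_search`, and `run_agent` are hypothetical, the model is scripted, and messages are plain dicts rather than LangChain message objects.

```python
# Scripted stand-in for an LLM: it first asks for a tool, then answers.
def fake_model(messages):
    last = messages[-1]
    if last["role"] == "tool":
        answer = f"Based on the search: {last['content']}"
        return {"role": "ai", "content": answer, "tool_calls": []}
    call = {"name": "search", "args": {"query": last["content"]}, "id": "call_1"}
    return {"role": "ai", "content": "", "tool_calls": [call]}


# Hypothetical offline tool.
def fake_search(query):
    return f"sunny (pretend result for {query!r})"


def run_agent(model, tools, messages, max_steps=5):
    """Alternate between the model and the tools until the model stops asking."""
    messages = list(messages)  # do not mutate the caller's list
    for _ in range(max_steps):
        reply = model(messages)
        messages.append(reply)
        if not reply["tool_calls"]:
            break  # no tool requested: this reply is the final answer
        for call in reply["tool_calls"]:
            output = tools[call["name"]](**call["args"])
            messages.append(
                {"role": "tool", "content": output, "tool_call_id": call["id"]}
            )
    return messages


history = run_agent(
    fake_model,
    {"search": fake_search},
    [{"role": "human", "content": "weather in NY"}],
)
for message in history:
    print(message["role"], "->", message["content"] or message.get("tool_calls"))
```

The `max_steps` cap is a design choice for the sketch: it guards against a model that keeps requesting tools forever.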

## Run the agent

In [33]:
response = agent_executor.invoke({"messages": [HumanMessage(content="hi!")]})

response["messages"]

[HumanMessage(content='hi!', id='dc8ec231-05f5-4def-b510-24043bd8c20f'),
 AIMessage(content='Hello! How can I assist you today?', response_metadata={'token_usage': {'completion_tokens': 10, 'prompt_tokens': 83, 'total_tokens': 93}, 'model_name': 'gpt-3.5-turbo', 'system_fingerprint': None, 'finish_reason': 'stop', 'logprobs': None}, id='run-8448bac2-0168-4fa5-bb88-344c577e1893-0', usage_metadata={'input_tokens': 83, 'output_tokens': 10, 'total_tokens': 93})]

In [35]:
response = agent_executor.invoke(
    {"messages": [HumanMessage(content="whats the weather in NY?")]}
)
response["messages"]

[HumanMessage(content='whats the weather in NY?', id='b7b18c7e-5e19-4053-927e-1ed469cbbf47'),
 AIMessage(content='', additional_kwargs={'tool_calls': [{'id': 'call_MvFLjTWS4aBLgVk4z7bQK2GZ', 'function': {'arguments': '{"query":"weather in New York"}', 'name': 'tavily_search_results_json'}, 'type': 'function'}]}, response_metadata={'token_usage': {'completion_tokens': 21, 'prompt_tokens': 88, 'total_tokens': 109}, 'model_name': 'gpt-3.5-turbo', 'system_fingerprint': None, 'finish_reason': 'tool_calls', 'logprobs': None}, id='run-a1564e2e-ec18-422a-855c-4a94c0a2b1d7-0', tool_calls=[{'name': 'tavily_search_results_json', 'args': {'query': 'weather in New York'}, 'id': 'call_MvFLjTWS4aBLgVk4z7bQK2GZ'}], usage_metadata={'input_tokens': 88, 'output_tokens': 21, 'total_tokens': 109}),
 ToolMessage(content='[{"url": "https://www.weatherapi.com/", "content": "{\'location\': {\'name\': \'New York\', \'region\': \'New York\', \'country\': \'United States of America\', \'lat\': 40.71, \'lon\': -74

## Streaming intermediate messages
We've seen how the agent can be called with .invoke to get back a final response. If the agent is executing multiple steps, that may take a while. In order to show intermediate progress, we can stream back messages as they occur.

In [36]:
for chunk in agent_executor.stream(
    {"messages": [HumanMessage(content="whats the weather in sf?")]}
):
    print(chunk)
    print("----")

{'agent': {'messages': [AIMessage(content='', additional_kwargs={'tool_calls': [{'id': 'call_xmOWOBt5cZskvtR6cgCugGvi', 'function': {'arguments': '{"query":"weather in San Francisco"}', 'name': 'tavily_search_results_json'}, 'type': 'function'}]}, response_metadata={'token_usage': {'completion_tokens': 21, 'prompt_tokens': 88, 'total_tokens': 109}, 'model_name': 'gpt-3.5-turbo', 'system_fingerprint': None, 'finish_reason': 'tool_calls', 'logprobs': None}, id='run-89f10d09-3835-4d31-9b65-ad00fdc2580a-0', tool_calls=[{'name': 'tavily_search_results_json', 'args': {'query': 'weather in San Francisco'}, 'id': 'call_xmOWOBt5cZskvtR6cgCugGvi'}], usage_metadata={'input_tokens': 88, 'output_tokens': 21, 'total_tokens': 109})]}}
----
{'tools': {'messages': [ToolMessage(content='[{"url": "https://www.weatherapi.com/", "content": "{\'location\': {\'name\': \'San Francisco\', \'region\': \'California\', \'country\': \'United States of America\', \'lat\': 37.78, \'lon\': -122.42, \'tz_id\': \'Ameri
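Each chunk above is keyed by the step that produced it (`agent` or `tools`). A Python generator is a natural way to model this behavior, as in the sketch below, which yields one update per step instead of returning everything at the end. This is a toy under stated assumptions: `scripted_model`, `lookup`, and `stream_agent` are hypothetical, and the real stream carries message objects rather than plain dicts.

```python
def scripted_model(replies):
    """Return a hypothetical model that replays canned replies in order."""
    remaining = iter(replies)
    return lambda messages: next(remaining)


def lookup(query):
    # Hypothetical offline tool.
    return f"pretend result for {query!r}"


def stream_agent(model, tools, messages, max_steps=5):
    """Yield one {step_name: {"messages": [...]}} update as each step finishes."""
    messages = list(messages)
    for _ in range(max_steps):
        reply = model(messages)
        messages.append(reply)
        yield {"agent": {"messages": [reply]}}
        if not reply["tool_calls"]:
            return  # final answer already yielded
        tool_messages = [
            {"content": tools[c["name"]](**c["args"]), "tool_call_id": c["id"]}
            for c in reply["tool_calls"]
        ]
        messages.extend(tool_messages)
        yield {"tools": {"messages": tool_messages}}


model = scripted_model(
    [
        {
            "content": "",
            "tool_calls": [
                {"name": "lookup", "args": {"query": "weather in sf"}, "id": "call_1"}
            ],
        },
        {"content": "It looks sunny.", "tool_calls": []},
    ]
)

chunks = []
for chunk in stream_agent(
    model, {"lookup": lookup}, [{"content": "whats the weather in sf?"}]
):
    chunks.append(chunk)
    print(chunk)
    print("----")
```

Because the generator yields as soon as a step completes, a caller can show progress while later steps are still running.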

## Adding in memory
So far, this agent is stateless: it does not remember previous interactions. To give it memory, we need to pass in a checkpointer. When passing in a checkpointer, we also have to pass in a thread_id when invoking the agent, so that it knows which thread (conversation) to resume from.

In [37]:
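from langgraph.checkpoint.sqlite import SqliteSaver

# ":memory:" keeps the SQLite database in RAM, so the history is lost when the process exits.
# Note: newer LangGraph releases expose from_conn_string as a context manager
# (used in a `with` block); the call style below matches the version used in this notebook.
memory = SqliteSaver.from_conn_string(":memory:")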


In [38]:
agent_executor = create_react_agent(model, tools, checkpointer=memory)

config = {"configurable": {"thread_id": "abc123"}}

In [39]:
for chunk in agent_executor.stream(
    {"messages": [HumanMessage(content="hi im bob!")]}, config
):
    print(chunk)
    print("----")

{'agent': {'messages': [AIMessage(content='Hello Bob! How can I assist you today?', response_metadata={'token_usage': {'completion_tokens': 11, 'prompt_tokens': 85, 'total_tokens': 96}, 'model_name': 'gpt-3.5-turbo', 'system_fingerprint': None, 'finish_reason': 'stop', 'logprobs': None}, id='run-3cd05824-a0dd-46c0-b19f-be2e5c8c6cd2-0', usage_metadata={'input_tokens': 85, 'output_tokens': 11, 'total_tokens': 96})]}}
----


In [40]:
for chunk in agent_executor.stream(
    {"messages": [HumanMessage(content="whats my name?")]}, config
):
    print(chunk)
    print("----")

{'agent': {'messages': [AIMessage(content='Your name is Bob! How can I help you further, Bob?', response_metadata={'token_usage': {'completion_tokens': 15, 'prompt_tokens': 108, 'total_tokens': 123}, 'model_name': 'gpt-3.5-turbo', 'system_fingerprint': None, 'finish_reason': 'stop', 'logprobs': None}, id='run-4885ff68-9d17-422f-b88d-7f67d27a52ac-0', usage_metadata={'input_tokens': 108, 'output_tokens': 15, 'total_tokens': 123})]}}
----



To start a new conversation, all we have to do is change the thread_id used.



In [41]:
config = {"configurable": {"thread_id": "xyz123"}}
for chunk in agent_executor.stream(
    {"messages": [HumanMessage(content="whats my name?")]}, config
):
    print(chunk)
    print("----")

{'agent': {'messages': [AIMessage(content="I don't have access to personal information like your name. How can I assist you today?", response_metadata={'token_usage': {'completion_tokens': 20, 'prompt_tokens': 86, 'total_tokens': 106}, 'model_name': 'gpt-3.5-turbo', 'system_fingerprint': None, 'finish_reason': 'stop', 'logprobs': None}, id='run-bac537f2-33de-4e38-b38c-fe548be9fe8a-0', usage_metadata={'input_tokens': 86, 'output_tokens': 20, 'total_tokens': 106})]}}
----
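The behavior above (the name is remembered under one thread_id and unknown under another) can be pictured as a store that keeps one message history per thread. The sketch below is a toy model of that idea, not LangGraph's checkpointer: `InMemoryCheckpointer`, `name_bot`, and `chat` are hypothetical, and the "model" is a simple rule that only recalls a name if it appears in the loaded history.

```python
from collections import defaultdict


class InMemoryCheckpointer:
    """Toy checkpointer: one message history per thread_id."""

    def __init__(self):
        self._threads = defaultdict(list)

    def load(self, thread_id):
        return list(self._threads[thread_id])

    def save(self, thread_id, messages):
        self._threads[thread_id] = list(messages)


def name_bot(history):
    """Hypothetical 'model': recalls a name only if it appears in the history."""
    prefix = "hi im "
    names = [t[len(prefix):].rstrip("!").title() for t in history if t.startswith(prefix)]
    if history[-1].startswith(prefix):
        return f"Hello {names[-1]}!"
    return f"Your name is {names[-1]}." if names else "I don't know your name."


def chat(checkpointer, user_text, config):
    """Load the thread's history, answer, and save the updated history."""
    thread_id = config["configurable"]["thread_id"]
    history = checkpointer.load(thread_id)
    history.append(user_text)
    reply = name_bot(history)
    history.append(reply)
    checkpointer.save(thread_id, history)
    return reply


memory = InMemoryCheckpointer()
thread_a = {"configurable": {"thread_id": "abc123"}}
thread_b = {"configurable": {"thread_id": "xyz123"}}

first = chat(memory, "hi im bob!", thread_a)
second = chat(memory, "whats my name?", thread_a)
third = chat(memory, "whats my name?", thread_b)
print(first, second, third, sep="\n")
```

The model itself stays stateless in this picture; all of the "memory" lives in the history that is loaded before each call and saved after it.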


With tracing enabled as in the first cell, the runs above can be inspected in LangSmith: https://smith.langchain.com/