# FIRST TIMER AGENTIC AI

## LangSmith

LangSmith is used for tracing, maintaining, and inspecting models, chains, and agents.

In [1]:
import getpass
import os

os.environ["LANGSMITH_TRACING"] = "true"
os.environ["LANGSMITH_API_KEY"] = getpass.getpass()

## Tavily

Tavily is used as the search engine tool.

In [2]:
os.environ["TAVILY_API_KEY"] = getpass.getpass()
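The cells above prompt for a key every time they run. As a minimal sketch (the helper name `set_env_if_missing` and its `prompt_fn` parameter are our own, not part of any library), we can prompt only when the variable is not already set, which is the same pattern the Groq cell further down uses:

```python
import os
from typing import Callable


def set_env_if_missing(name: str, prompt_fn: Callable[[str], str]) -> bool:
    """Set os.environ[name] from prompt_fn only when it is not already set.

    Returns True if a new value was stored, False otherwise.
    """
    if os.environ.get(name):
        return False  # already configured, do not prompt again
    value = prompt_fn(f"Enter value for {name}: ").strip()
    if not value:
        return False  # nothing entered, leave the variable unset
    os.environ[name] = value
    return True


# In a notebook, pass getpass.getpass as the prompt function, e.g.:
# import getpass
# set_env_if_missing("TAVILY_API_KEY", getpass.getpass)
```

Passing the prompt function as an argument keeps the helper easy to try out without typing a real key.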

## Defining tools

We first need to create the tools we want to use. Our main tool will be Tavily, a search engine. LangChain has a built-in tool that makes it easy to use the Tavily search engine as a tool.



In [3]:
from langchain_community.tools.tavily_search import TavilySearchResults

search = TavilySearchResults(max_results=2)
search_results = search.invoke("what is the weather in SF")
print(search_results)
# If we want, we can create other tools.
# Once we have all the tools we want, we can put them in a list that we will reference later.
tools = [search]



Failed to multipart ingest runs: langsmith.utils.LangSmithError: Failed to POST https://api.smith.langchain.com/runs/multipart in LangSmith API. HTTPError('403 Client Error: Forbidden for url: https://api.smith.langchain.com/runs/multipart', '{"error":"Forbidden"}\n')


[{'url': 'https://www.weatherapi.com/', 'content': "{'location': {'name': 'San Francisco', 'region': 'California', 'country': 'United States of America', 'lat': 37.775, 'lon': -122.4183, 'tz_id': 'America/Los_Angeles', 'localtime_epoch': 1739496524, 'localtime': '2025-02-13 17:28'}, 'current': {'last_updated_epoch': 1739495700, 'last_updated': '2025-02-13 17:15', 'temp_c': 13.9, 'temp_f': 57.0, 'is_day': 1, 'condition': {'text': 'Mist', 'icon': '//cdn.weatherapi.com/weather/64x64/day/143.png', 'code': 1030}, 'wind_mph': 25.5, 'wind_kph': 41.0, 'wind_degree': 227, 'wind_dir': 'SW', 'pressure_mb': 1003.0, 'pressure_in': 29.62, 'precip_mm': 3.6, 'precip_in': 0.14, 'humidity': 83, 'cloud': 100, 'feelslike_c': 11.1, 'feelslike_f': 52.1, 'windchill_c': 9.4, 'windchill_f': 48.9, 'heatindex_c': 12.2, 'heatindex_f': 54.0, 'dewpoint_c': 11.5, 'dewpoint_f': 52.6, 'vis_km': 9.7, 'vis_miles': 6.0, 'uv': 0.0, 'gust_mph': 38.3, 'gust_kph': 61.6}}"}, {'url': 'https://world-weather.info/forecast/usa/sa
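The printed output above is a list of dictionaries, each with a `url` and a `content` key. As a small sketch (the function `summarize_results` and the abbreviated `sample_results` are illustrative assumptions, not part of LangChain), we can shorten each entry so the results are easier to read:

```python
from typing import Dict, List


def summarize_results(results: List[Dict[str, str]], max_chars: int = 80) -> List[str]:
    """Return one 'url: snippet' line per search result."""
    lines = []
    for item in results:
        url = item.get("url", "<no url>")
        content = item.get("content", "")
        snippet = content[:max_chars]
        if len(content) > max_chars:
            snippet += "..."  # mark that the content was cut
        lines.append(f"{url}: {snippet}")
    return lines


# Abbreviated sample shaped like the search output above (not real data).
sample_results = [
    {
        "url": "https://www.weatherapi.com/",
        "content": "{'location': {'name': 'San Francisco'}}",
    },
]

for line in summarize_results(sample_results, max_chars=20):
    print(line)
```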

## Using language models

In [4]:

import getpass
import os

if not os.environ.get("GROQ_API_KEY"):
    os.environ["GROQ_API_KEY"] = getpass.getpass("Enter API key for Groq: ")

from langchain.chat_models import init_chat_model

model = init_chat_model("llama3-8b-8192", model_provider="groq")

You can call the language model by passing in a list of messages. The response is a message object whose `content` attribute is a string.

In [33]:
from langchain_core.messages import HumanMessage

response = model.invoke([HumanMessage(content="who is david alaba?")])
response.content

"David Alaba is an Austrian professional footballer who plays as a left-back or left midfielder for Bayern Munich and the Austria national team. He is considered one of the best left-backs in the world, known for his exceptional defensive skills, vision, and attacking prowess.\n\nAlaba was born on June 24, 1992, in Vienna, Austria. He joined Bayern Munich's youth academy at the age of 10 and rose through the ranks to make his professional debut for the club in 2010. He quickly established himself as a key player for Bayern, winning numerous titles including eight Bundesliga championships, four DFB-Pokals, and the 2020 UEFA Champions League.\n\nAt the international level, Alaba has been a mainstay for the Austria national team since his debut in 2009. He has appeared in several major tournaments, including the World Cup and the European Championship.\n\nAlaba is known for his versatility, ability to play in multiple positions, and his exceptional technical skills. He is also an attackin

We can now see what it is like to enable this model to do tool calling. To enable that, we use `.bind_tools` to give the language model knowledge of these tools.

In [6]:
model_with_tools = model.bind_tools(tools)

We can now call the model. Let's first call it with a normal message and see how it responds. We can look at both the `content` field and the `tool_calls` field.

In [7]:
response = model_with_tools.invoke([HumanMessage(content="give me a handphone recommendation for productivity?")])

print(f"ContentString: {response.content}")
print(f"ToolCalls: {response.tool_calls}")

ContentString: For a handphone recommendation for productivity, I'd suggest considering the following factors: screen size, battery life, multitasking capabilities, and app selection.

Based on these factors, I'd recommend the Samsung Galaxy S22 Ultra or the Google Pixel 6 Pro. Both devices offer large screens, long battery life, and seamless multitasking capabilities.

The Samsung Galaxy S22 Ultra features a massive 6.8-inch Dynamic AMOLED display, a large 5000mAh battery, and up to 16GB of RAM for smooth multitasking. It's also available with a range of storage options, including a microSD card slot.

The Google Pixel 6 Pro, on the other hand, boasts a 6.7-inch OLED display, a 5124mAh battery, and 12GB of RAM. It's also known for its exceptional camera performance and timely software updates.

Both devices are compatible with a wide range of productivity apps, including Microsoft Office, Google Workspace, and Todoist.

Ultimately, the best handphone for productivity will depend on yo

Now, let's try calling it with an input that we would expect to trigger a tool call.

In [8]:
response = model_with_tools.invoke([HumanMessage(content="who is thomas alfa edinson")])

print(f"ContentString: {response.content}")
print(f"ToolCalls: {response.tool_calls}")

# Progress marker: next up is creating the agent

ContentString: 
ToolCalls: [{'name': 'tavily_search_results_json', 'args': {'query': 'Thomas Alfa Edison'}, 'id': 'call_h1m2', 'type': 'tool_call'}]


We can see that there's now no text content, but there is a tool call! It wants us to call the Tavily Search tool.

This isn't calling that tool yet - it's just telling us to. In order to actually call it, we'll want to create our agent.
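To make the idea of "the model only requests a tool call" concrete, here is a minimal sketch of executing such a request by hand. It uses the `tool_calls` structure printed above, but `fake_search`, `TOOL_REGISTRY`, and `run_tool_calls` are hypothetical names for illustration, and the search function returns a canned string rather than calling Tavily:

```python
from typing import Any, Callable, Dict, List


def fake_search(query: str) -> str:
    """Hypothetical stand-in for the Tavily tool; returns a canned string."""
    return f"results for: {query}"


# Map tool names (as they appear in tool_calls) to Python callables.
TOOL_REGISTRY: Dict[str, Callable[..., Any]] = {
    "tavily_search_results_json": fake_search,
}


def run_tool_calls(tool_calls: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Execute each requested tool call and pair the result with its call id."""
    outputs = []
    for call in tool_calls:
        tool = TOOL_REGISTRY.get(call["name"])
        if tool is None:
            result = f"unknown tool: {call['name']}"
        else:
            result = tool(**call["args"])
        outputs.append({"tool_call_id": call["id"], "content": result})
    return outputs


# Same shape as the ToolCalls output above.
tool_calls = [
    {
        "name": "tavily_search_results_json",
        "args": {"query": "Thomas Alfa Edison"},
        "id": "call_h1m2",
        "type": "tool_call",
    }
]
print(run_tool_calls(tool_calls))
```

Keeping the call `id` with each result is what lets a model match a tool output back to the request it made.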

## Creating Agent

Now that we have defined the tools and the LLM, we can create the agent. We will be using LangGraph to construct the agent. Currently, we are using a high level interface to construct the agent, but the nice thing about LangGraph is that this high-level interface is backed by a low-level, highly controllable API in case you want to modify the agent logic.

Now, we can initialize the agent with the LLM and the tools.

Note that we are passing in the model, not model_with_tools. That is because create_react_agent will call .bind_tools for us under the hood.

In [9]:
from langgraph.prebuilt import create_react_agent

agent_executor = create_react_agent(model, tools)
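Conceptually, an agent of this kind alternates between calling the model and running whatever tools the model asks for, until the model replies without a tool call. The sketch below illustrates that loop in plain Python. It is not LangGraph's implementation: `scripted_model`, `search`, and `run_agent` are hypothetical stand-ins, and messages are plain dictionaries rather than LangChain message objects.

```python
from typing import Any, Callable, Dict, List

Message = Dict[str, Any]


def scripted_model(messages: List[Message]) -> Message:
    """Hypothetical model: requests one search, then answers from its result."""
    if not any(m["role"] == "tool" for m in messages):
        call = {"name": "search", "args": {"query": "weather in sf"}, "id": "call_1"}
        return {"role": "ai", "content": "", "tool_calls": [call]}
    answer = "Answer based on: " + messages[-1]["content"]
    return {"role": "ai", "content": answer, "tool_calls": []}


def search(query: str) -> str:
    """Hypothetical tool returning a canned result."""
    return f"canned result for '{query}'"


def run_agent(
    model: Callable[[List[Message]], Message],
    tools: Dict[str, Callable[..., str]],
    user_input: str,
    max_steps: int = 5,
) -> List[Message]:
    """Loop: call the model, run requested tools, stop when none are requested."""
    messages: List[Message] = [{"role": "human", "content": user_input}]
    for _ in range(max_steps):
        reply = model(messages)
        messages.append(reply)
        if not reply["tool_calls"]:
            break  # the model produced a final answer
        for call in reply["tool_calls"]:
            output = tools[call["name"]](**call["args"])
            messages.append(
                {"role": "tool", "content": output, "tool_call_id": call["id"]}
            )
    return messages


history = run_agent(scripted_model, {"search": search}, "whats the weather in sf?")
for m in history:
    print(m["role"], "->", m["content"] or m.get("tool_calls"))
```

The `max_steps` cap is a design choice of this sketch: it prevents an endless loop if the model keeps requesting tools.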

## Run the agent

We can now run the agent with a few queries. For now, these are all stateless queries: the agent won't remember previous interactions. The agent returns the final state at the end of the interaction, which includes the inputs as well as the outputs.

First up, let's see how it responds when there's no need to call a tool:

In [10]:
response = agent_executor.invoke({"messages": [HumanMessage(content="hi!")]})

response["messages"]

[HumanMessage(content='hi!', additional_kwargs={}, response_metadata={}, id='78a8e01d-776d-4d90-b5ff-ea668ac52ffe'),
 AIMessage(content='Hi!', additional_kwargs={}, response_metadata={'token_usage': {'completion_tokens': 3, 'prompt_tokens': 1901, 'total_tokens': 1904, 'completion_time': 0.0025, 'prompt_time': 0.34279927, 'queue_time': -0.783156675, 'total_time': 0.34529927}, 'model_name': 'llama3-8b-8192', 'system_fingerprint': 'fp_6a6771ae9c', 'finish_reason': 'stop', 'logprobs': None}, id='run-0117deb9-64a5-47a4-a9f1-c2b8f10a6f37-0', usage_metadata={'input_tokens': 1901, 'output_tokens': 3, 'total_tokens': 1904})]
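Since the final state contains every message, a common way to get only the answer is to read the last message in the list. The sketch below assumes the last entry is the model's final reply, as in the output above. `FakeMessage` and `final_answer` are hypothetical stand-ins so the example runs without an API key; with the real response the equivalent expression would be `response["messages"][-1].content`.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class FakeMessage:
    """Hypothetical stand-in for HumanMessage / AIMessage; only `content` is used."""

    content: str


def final_answer(state: Dict[str, List[FakeMessage]]) -> str:
    """Return the content of the last message in the agent state."""
    messages = state["messages"]
    if not messages:
        raise ValueError("state contains no messages")
    return messages[-1].content


# Same shape as the state returned above: the input followed by the reply.
state = {"messages": [FakeMessage("hi!"), FakeMessage("Hi!")]}
print(final_answer(state))
```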

In [30]:
response = agent_executor.invoke(
    {"messages": [HumanMessage(content="whats the weather in sf?")]}
)
response["messages"]

[HumanMessage(content='whats the weather in sf?', additional_kwargs={}, response_metadata={}, id='e671f072-934b-4af5-a7c4-a93ed28def78'),
 AIMessage(content='According to the latest data from the National Weather Service, the current weather in San Francisco is mostly cloudy with a high temperature of 63°F (17°C) and a low of 55°F (13°C). There is a 20% chance of light rain showers throughout the day.', additional_kwargs={}, response_metadata={'token_usage': {'completion_tokens': 59, 'prompt_tokens': 1912, 'total_tokens': 1971, 'completion_time': 0.049166667, 'prompt_time': 0.229291882, 'queue_time': -0.324496803, 'total_time': 0.278458549}, 'model_name': 'llama3-8b-8192', 'system_fingerprint': 'fp_6a6771ae9c', 'finish_reason': 'stop', 'logprobs': None}, id='run-d3e97578-965f-4a86-bf54-1d43ed4425bb-0', usage_metadata={'input_tokens': 1912, 'output_tokens': 59, 'total_tokens': 1971})]

## Streaming messages

We've seen how the agent can be called with .invoke to get a final response. If the agent executes multiple steps, this may take a while. To show intermediate progress, we can stream back messages as they occur.

In [38]:
for chunk in agent_executor.stream(
        {"messages": [HumanMessage(content="whats the weather in sf?")]}
):
    print(chunk)
    print("----")

{'agent': {'messages': [AIMessage(content='', additional_kwargs={'tool_calls': [{'id': 'call_kjr7', 'function': {'arguments': '{"query":"what is the current weather in san francisco"}', 'name': 'tavily_search_results_json'}, 'type': 'function'}]}, response_metadata={'token_usage': {'completion_tokens': 78, 'prompt_tokens': 1913, 'total_tokens': 1991, 'completion_time': 0.065, 'prompt_time': 0.343215887, 'queue_time': -2.395552903, 'total_time': 0.408215887}, 'model_name': 'llama3-8b-8192', 'system_fingerprint': 'fp_a97cfe35ae', 'finish_reason': 'tool_calls', 'logprobs': None}, id='run-6fd1721f-45f5-4527-9606-3ef9afc462e8-0', tool_calls=[{'name': 'tavily_search_results_json', 'args': {'query': 'what is the current weather in san francisco'}, 'id': 'call_kjr7', 'type': 'tool_call'}], usage_metadata={'input_tokens': 1913, 'output_tokens': 78, 'total_tokens': 1991})]}}
----
----
{'agent': {'messages': [AIMessage(content='The current weather in San Francisco is partly cloudy with a high of 
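Each streamed chunk above is a dictionary keyed by the name of the step that produced it, holding the new messages from that step. As a rough sketch, we can turn such chunks into short log lines. Here messages are plain dictionaries rather than `AIMessage` objects, the `"tools"` key for the tool step is an assumption (only `'agent'` chunks are visible in the output above), and `describe_chunks` is a hypothetical helper:

```python
from typing import Any, Dict, Iterable, List


def describe_chunks(chunks: Iterable[Dict[str, Dict[str, Any]]]) -> List[str]:
    """Summarize streamed chunks as one line per message."""
    lines = []
    for chunk in chunks:
        for node_name, update in chunk.items():
            for message in update.get("messages", []):
                if message.get("tool_calls"):
                    names = ", ".join(c["name"] for c in message["tool_calls"])
                    lines.append(f"[{node_name}] requested tool(s): {names}")
                else:
                    lines.append(f"[{node_name}] {message['content']}")
    return lines


# Illustrative chunks, simplified from the output above.
sample_chunks = [
    {"agent": {"messages": [
        {"content": "", "tool_calls": [{"name": "tavily_search_results_json"}]}
    ]}},
    {"tools": {"messages": [{"content": "search output (elided)"}]}},
    {"agent": {"messages": [{"content": "Final answer text."}]}},
]

for line in describe_chunks(sample_chunks):
    print(line)
```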

## Streaming Tokens

In addition to streaming back messages, it is also useful to stream back tokens. We can do this with the .astream_events method.

In [40]:
async for event in agent_executor.astream_events(
        {"messages": [HumanMessage(content="whats the weather in sf?")]}, version="v1"
):
    kind = event["event"]
    if kind == "on_chain_start":
        if (
                event["name"] == "Agent"
        ):  # Only matches if the agent was given this name via `.with_config({"run_name": "Agent"})`
            print(
                f"Starting agent: {event['name']} with input: {event['data'].get('input')}"
            )
    elif kind == "on_chain_end":
        if (
                event["name"] == "Agent"
        ):  # Only matches if the agent was given this name via `.with_config({"run_name": "Agent"})`
            print()
            print("--")
            print(
                f"Done agent: {event['name']} with output: {event['data'].get('output')}"
            )
    if kind == "on_chat_model_stream":
        content = event["data"]["chunk"].content
        if content:
            # Empty content usually means the model is asking
            # for a tool to be invoked, so we only print
            # non-empty content.
            print(content, end="|")
    elif kind == "on_tool_start":
        print("--")
        print(
            f"Starting tool: {event['name']} with inputs: {event['data'].get('input')}"
        )
    elif kind == "on_tool_end":
        print(f"Done tool: {event['name']}")
        print(f"Tool output was: {event['data'].get('output')}")
        print("--")

APIError: Failed to call a function. Please adjust your prompt. See 'failed_generation' for more details.
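The event-filtering logic can be tried without a model or API key. The sketch below feeds a hypothetical event stream through the same kind of filter and joins the streamed tokens. `fake_events` and `collect_tokens` are illustrative names, and each `chunk` is a plain dictionary here, whereas the real events carry a message chunk object with a `.content` attribute.

```python
import asyncio
from typing import Any, AsyncIterator, Dict


async def fake_events() -> AsyncIterator[Dict[str, Any]]:
    """Hypothetical event source with roughly the shape used in the cell above."""
    yield {"event": "on_tool_start", "name": "search",
           "data": {"input": {"query": "weather in sf"}}}
    yield {"event": "on_tool_end", "name": "search",
           "data": {"output": "mist, 57F"}}
    for token in ["It", " is", " misty", "."]:
        yield {"event": "on_chat_model_stream", "name": "model",
               "data": {"chunk": {"content": token}}}


async def collect_tokens(events: AsyncIterator[Dict[str, Any]]) -> str:
    """Keep only chat-model stream events and join their non-empty tokens."""
    tokens = []
    async for event in events:
        if event["event"] != "on_chat_model_stream":
            continue
        content = event["data"]["chunk"]["content"]
        if content:  # skip empty chunks, e.g. when a tool call is requested
            tokens.append(content)
    return "".join(tokens)


# As a script, use asyncio.run; in a notebook, `await collect_tokens(...)` instead.
text = asyncio.run(collect_tokens(fake_events()))
print(text)
```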

## Adding in memory

As mentioned earlier, this agent is stateless. This means it does not remember previous interactions. To give it memory we need to pass in a checkpointer. When passing in a checkpointer, we also have to pass in a thread_id when invoking the agent (so it knows which thread/conversation to resume from).

In [43]:
from langgraph.checkpoint.memory import MemorySaver

memori = MemorySaver()

In [44]:
agent_executor = create_react_agent(model, tools, checkpointer=memori)

config = {"configurable": {"thread_id": "abc123"}}

In [45]:
for chunk in agent_executor.stream(
        {"messages": [HumanMessage(content="hi im bob!")]}, config
):
    print(chunk)
    print("----")

{'agent': {'messages': [AIMessage(content='Hello Bob!', additional_kwargs={}, response_metadata={'token_usage': {'completion_tokens': 4, 'prompt_tokens': 1905, 'total_tokens': 1909, 'completion_time': 0.003333333, 'prompt_time': 0.239943398, 'queue_time': -0.29073663, 'total_time': 0.243276731}, 'model_name': 'llama3-8b-8192', 'system_fingerprint': 'fp_179b0f92c9', 'finish_reason': 'stop', 'logprobs': None}, id='run-ea0184d1-5721-48ac-8bf5-baa20caa0afa-0', usage_metadata={'input_tokens': 1905, 'output_tokens': 4, 'total_tokens': 1909})]}}
----


In [46]:
for chunk in agent_executor.stream(
        {"messages": [HumanMessage(content="do you remember my name?")]}, config
):
    print(chunk)
    print("----")

{'agent': {'messages': [AIMessage(content='Yes, I remember that your name is Bob.', additional_kwargs={}, response_metadata={'token_usage': {'completion_tokens': 11, 'prompt_tokens': 1944, 'total_tokens': 1955, 'completion_time': 0.009166667, 'prompt_time': 0.342846021, 'queue_time': -0.403341719, 'total_time': 0.352012688}, 'model_name': 'llama3-8b-8192', 'system_fingerprint': 'fp_6a6771ae9c', 'finish_reason': 'stop', 'logprobs': None}, id='run-e649eef6-093a-4680-b22c-23aff985eada-0', usage_metadata={'input_tokens': 1944, 'output_tokens': 11, 'total_tokens': 1955})]}}
----


To start a new conversation, change the `thread_id` used.

In [47]:
mengconfig = {"configurable": {"thread_id": "diyrad"}}

for chunk in agent_executor.stream(
        {"messages": [HumanMessage(content="what's my name?")]}, mengconfig
):
    print(chunk)
    print("----")

{'agent': {'messages': [AIMessage(content="I'm not sure. I don't have any information about your name.", additional_kwargs={}, response_metadata={'token_usage': {'completion_tokens': 16, 'prompt_tokens': 1909, 'total_tokens': 1925, 'completion_time': 0.013333333, 'prompt_time': 0.245515131, 'queue_time': -0.297238971, 'total_time': 0.258848464}, 'model_name': 'llama3-8b-8192', 'system_fingerprint': 'fp_179b0f92c9', 'finish_reason': 'stop', 'logprobs': None}, id='run-632d92aa-7dd3-4445-87e1-5791bf0f525a-0', usage_metadata={'input_tokens': 1909, 'output_tokens': 16, 'total_tokens': 1925})]}}
----
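The behaviour seen in these three runs (one thread remembers Bob, the other does not) can be pictured as a store of message histories keyed by `thread_id`. The sketch below is only a mental model, not how `MemorySaver` is implemented; `ThreadMemory` and its methods are hypothetical names.

```python
from collections import defaultdict
from typing import Any, DefaultDict, Dict, List


class ThreadMemory:
    """Toy per-thread message store keyed by config['configurable']['thread_id']."""

    def __init__(self) -> None:
        self._threads: DefaultDict[str, List[str]] = defaultdict(list)

    def append(self, config: Dict[str, Any], message: str) -> None:
        """Add a message to the thread named in the config."""
        thread_id = config["configurable"]["thread_id"]
        self._threads[thread_id].append(message)

    def history(self, config: Dict[str, Any]) -> List[str]:
        """Return a copy of the messages stored for the thread in the config."""
        thread_id = config["configurable"]["thread_id"]
        return list(self._threads.get(thread_id, []))


memory = ThreadMemory()
first_thread = {"configurable": {"thread_id": "abc123"}}
second_thread = {"configurable": {"thread_id": "diyrad"}}

memory.append(first_thread, "hi im bob!")
print(memory.history(first_thread))   # the first thread has the earlier message
print(memory.history(second_thread))  # the new thread starts empty
```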


## Conclusion

That's a wrap! In this quick start we covered how to create a simple agent. We then showed how to stream back a response, including both the intermediate steps and individual tokens. We also added memory so you can hold a conversation with the agent. Agents are a complex topic with lots to learn!