# Agents in LangChain

This quickstart takes you from a simple setup to a fully functional AI agent in just a few minutes.

## Questions

- What is an Agent in LangChain, and how do you make one?
- What's the relationship between an LLM and an Agent?
- What can agents do?
- Can I run an agent locally without a provider?

## Setup Virtual Environment

```sh
uv init
uv venv -p 3.12
```

### Activate the virtual environment

::: {.panel-tabset}
#### Windows

```sh
.venv\Scripts\activate.bat
```

#### macOS / Linux

```sh
source .venv/bin/activate
```
:::

## What is an **API Key**?

Think of an API Key as a **hotel key card**.

* **The Hotel (Server):** Has resources (rooms) but keeps them locked.
* **The Guest (Client):** Wants access.
* **The Key Card (API Key):** Identifies you and proves you are allowed to enter specific rooms.

---

### What & Why

An API key is a unique string of characters used to identify the calling program.

* **Identification:** Keys "authenticate the calling project," allowing the server to recognize who is asking for data.
* **Control:** This lets the server track usage for billing and enforce limits (quotas) so one user doesn't crash the system.

---

### Security Risks

Leaking your key is like dropping your credit card.

* **Theft:** Attackers can use your key to make requests on your behalf.
* **Consequences:** You suffer **financial loss** (paying for their usage) or **service denial** (they use up your available quota).

> **Rule:** Never post keys on public sites like GitHub.

### How to Set Your API Key?

This project uses OpenRouter (**The Unified Interface For LLMs**) via LiteLLM to access the DeepSeek model, which requires an API key. If you don't already have an OpenRouter API key, you can create one for free at [OpenRouter](https://openrouter.ai/keys).

Write your API key into a `.env` file as an environment variable, as follows:

```sh
OPENROUTER_API_KEY=...
```

> Note: make sure to add `.env` to `.gitignore` to avoid committing your key to the repository.
>
> Note: the `.env` file is different from the `.venv` directory used for the virtual environment.
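
To make the `.env` mechanism concrete, here is a minimal, stdlib-only sketch of roughly what a loader such as `load_dotenv()` does: it reads `KEY=VALUE` lines and adds them to the environment without overriding values that are already set. The `parse_env_lines` helper, the `env` dictionary (standing in for `os.environ`), and the placeholder values are all hypothetical; the real `python-dotenv` library handles many more cases.

```python
def parse_env_lines(lines):
    """Parse KEY=VALUE lines, skipping blanks and # comments (simplified)."""
    values = {}
    for raw in lines:
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop surrounding whitespace and optional quotes around the value
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


# Hypothetical .env content; never hard-code a real key like this.
example_lines = [
    "# secrets for this project",
    "OPENROUTER_API_KEY=placeholder-not-a-real-key",
    'OPENAI_API_BASE="https://openrouter.ai/api/v1"',
]

# A plain dict stands in for os.environ so the sketch has no side effects.
env = {"OPENROUTER_API_KEY": "already-set"}

parsed = parse_env_lines(example_lines)
for key, value in parsed.items():
    env.setdefault(key, value)  # existing values win, new ones are added

print(env)
```

Keeping existing values mirrors the default behaviour of `load_dotenv()`, which does not override variables already present in the environment unless asked to.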

If we use the OpenAI API, we'll have to add:

```sh
OPENAI_API_BASE="https://openrouter.ai/api/v1"
```

This makes the model use OpenRouter instead of the default OpenAI API.

### Sign up and Set LangSmith API (Free)

* Sign up for LangSmith [here](https://docs.langchain.com/langsmith/create-account-api-key#create-an-account-and-api-key), and find out more about LangSmith and how to use it within your workflow [here](https://www.langchain.com/langsmith).
* Set `LANGSMITH_API_KEY`, `LANGSMITH_TRACING_V2="true"`, and `LANGSMITH_PROJECT="langchain-academy"` in your environment.
* If you are on the EU instance, also set `LANGSMITH_ENDPOINT="https://eu.api.smith.langchain.com"`.

### Set up Tavily API for web search (Free)

* Tavily Search API is a search engine optimized for LLMs and RAG, aimed at efficient, quick, and persistent search results.
* You can sign up for an API key [here](https://tavily.com/). It's easy to sign up and offers a very generous free tier. Some lessons (in Module 4) will use Tavily.
* Set `TAVILY_API_KEY` in your environment.

### Install dependencies

```sh
uv add langchain tavily-python langchain_openai langchain_community python-dotenv
```

In [None]:
import os
from dotenv import load_dotenv

load_dotenv()

# We use OpenRouter for the agent: set OPENROUTER_API_KEY in .env
# Get your key at https://openrouter.ai/keys
if not os.environ.get("OPENROUTER_API_KEY"):
    raise RuntimeError(
        "OPENROUTER_API_KEY is not set. Add it to your .env file, e.g.:\n"
        "OPENROUTER_API_KEY=your-openrouter-api-key"
    )

## Models

[LLMs](https://docs.langchain.com/oss/python/langchain/models) are powerful AI tools that can interpret and generate text like humans. They're versatile enough to write content, translate languages, summarize, and answer questions without needing specialized training for each task.

The quality and capabilities of the model you choose directly impact your agent's baseline reliability and performance. Different models excel at different tasks - some are better at following complex instructions, others at structured reasoning, and some support larger context windows for handling more information.

### Choosing between models

- [Models | OpenRouter.ai](https://openrouter.ai/models)
- [LLM Stats](https://llm-stats.com/)
- [Model Recommendation | Artificial Analysis](https://artificialanalysis.ai/models/recommend)
  - [TTS | Artificial Analysis](https://artificialanalysis.ai/text-to-speech/leaderboard)
- [Arena.ai](https://arena.ai/leaderboard/text-to-image)
- [MTEB: Embedding Leaderboard](https://huggingface.co/spaces/mteb/leaderboard)
- [Open ASR](https://huggingface.co/spaces/hf-audio/open_asr_leaderboard)


Note: **Agents** require [**a model that supports tool calling**](https://openrouter.ai/models?fmt=cards&supported_parameters=tools).

## Basic usage

Models can be utilized in two ways:

1. **With agents** - Models can be dynamically specified when creating an [agent](https://docs.langchain.com/oss/python/langchain/agents#model).
2. **Standalone** - Models can be called directly (outside of the agent loop) for tasks like text generation, classification, or extraction without the need for an agent framework.

[Here](https://docs.langchain.com/oss/python/langchain/models) is a useful how-to for all the things that you can do with chat models, but we'll show a few highlights below.

There are [a few standard parameters](https://docs.langchain.com/oss/python/langchain/models#parameters) that we can set with chat models. Three of the most common are:

* `model`: the name of the model
* `temperature`: the sampling temperature
* `max_tokens`: the maximum number of tokens to generate

`temperature` controls the randomness or creativity of the model's output:

- **Low temperature** (close to 0) produces more deterministic and focused outputs. This is good for tasks requiring accuracy or factual responses.
- **High temperature** (close to 1) produces more varied, less predictable outputs. This is good for creative tasks.

`max_tokens` limits the total number of tokens in the response, effectively controlling how long the output can be.
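
To see why a lower temperature gives more deterministic output, the sketch below applies a temperature-scaled softmax to a few made-up token scores. This is a common textbook formulation, not a description of how any particular provider implements sampling; the `softmax_with_temperature` helper and the `logits` values are assumptions for illustration. Many APIs also treat `temperature=0` as a special case (always pick the most likely token), which this sketch does not model.

```python
import math


def softmax_with_temperature(logits, temperature):
    """Turn raw scores into probabilities; lower temperature sharpens them."""
    if temperature <= 0:
        raise ValueError("temperature must be > 0 in this sketch")
    scaled = [score / temperature for score in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(value - peak) for value in scaled]
    total = sum(exps)
    return [value / total for value in exps]


logits = [2.0, 1.0, 0.5]  # hypothetical scores for three candidate tokens

cold = softmax_with_temperature(logits, 0.2)  # low temperature
hot = softmax_with_temperature(logits, 1.0)   # higher temperature

print("T=0.2:", [round(p, 3) for p in cold])
print("T=1.0:", [round(p, 3) for p in hot])
```

At the lower temperature almost all of the probability mass lands on the top-scoring token, so repeated sampling tends to return the same text.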

LangChain supports many models via [third-party integrations](https://docs.langchain.com/oss/python/integrations/chat). By default, the course will use [ChatOpenAI](https://docs.langchain.com/oss/python/integrations/chat/openai) because it is both popular and performant.

In [None]:
from langchain_openai import ChatOpenAI

# https://openrouter.ai/openai/gpt-5-nano
model_gpt5_nano = ChatOpenAI(
    model="openai/gpt-5-nano",
    temperature=0,
    base_url="https://openrouter.ai/api/v1",
    api_key=os.environ.get("OPENROUTER_API_KEY"),
)

# https://openrouter.ai/nvidia/nemotron-3-nano-30b-a3b:free
model_nemotron3_nano = ChatOpenAI(
    model="nvidia/nemotron-3-nano-30b-a3b:free",
    temperature=0,
    base_url="https://openrouter.ai/api/v1",
    api_key=os.environ.get("OPENROUTER_API_KEY"),
)



### Running a model locally

LangChain supports running models locally on your own hardware. This is useful when data privacy is critical, when you want to invoke a custom model, or when you want to avoid the costs of a cloud-based model.

[Ollama](https://docs.langchain.com/oss/python/integrations/chat/ollama) is one of the easiest ways to run chat and embedding models locally.

## Key Methods

1. Invoke
2. Stream
3. Batch

### 1. Invoke

The most straightforward way to call a model is to use `invoke()` with a single message or a list of messages:

In [None]:
message = model_nemotron3_nano.invoke("why do parrots talk?")


This returns an `AIMessage` object:

In [5]:
message

AIMessage(content='Parrots “talk” because they’re one of the few animals that can **learn and reproduce complex vocalizations** - a skill called vocal learning. Here’s a quick rundown of why and how they do it:\n\n| Reason | What it means for parrots |\n|--------|--------------------------|\n| **Social bonding** | In the wild, parrots use calls to stay in contact with flock members, coordinate movement, and reinforce pair bonds. Mimicking the sounds of their companions (including human speech) helps them stay socially connected. |\n| **Territory & status** | Some species use distinctive vocalizations to claim space or signal dominance. A parrot that can produce a clear, attention-grabbing “speech” may gain more social leverage. |\n| **Mental stimulation** | Parrots are highly intelligent; they need cognitive challenges. Learning new sounds is a form of problem-solving that keeps their brains active and reduces boredom-related behaviors (like feather-plucking). |\n| 

The object has a `content` property, which holds the generated response text:

In [7]:
print(message.content)

Parrots “talk” because they’re one of the few animals that can **learn and reproduce complex vocalizations** - a skill called vocal learning. Here’s a quick rundown of why and how they do it:

| Reason | What it means for parrots |
|--------|--------------------------|
| **Social bonding** | In the wild, parrots use calls to stay in contact with flock members, coordinate movement, and reinforce pair bonds. Mimicking the sounds of their companions (including human speech) helps them stay socially connected. |
| **Territory & status** | Some species use distinctive vocalizations to claim space or signal dominance. A parrot that can produce a clear, attention-grabbing “speech” may gain more social leverage. |
| **Mental stimulation** | Parrots are highly intelligent; they need cognitive challenges. Learning new sounds is a form of problem-solving that keeps their brains active and reduces boredom-related behaviors (like feather-plucking). |
| **Mimicry as a survival to

A list of messages can be provided to a chat model to represent conversation history. Each message has a role that models use to indicate who sent the message in the conversation.



In [None]:
from langchain.messages import SystemMessage, HumanMessage, AIMessage

conversation = [
    SystemMessage(content="You are a helpful assistant that translates English to Arabic."),
    HumanMessage(content="Translate: I love programming."),
    AIMessage(content="أحب البرمجة."),
    HumanMessage(content="I love building applications."),
]

message = model_nemotron3_nano.invoke(conversation)
print(message.content)

أحب تطوير التطبيقات.


### 2. Stream

Most models can stream their output content while it is being generated. By displaying output progressively, streaming significantly improves user experience, particularly for longer responses.

Calling `stream()` returns an iterator that yields output chunks as they are produced. You can use a loop to process each chunk in real-time:



In [9]:
for chunk in model_nemotron3_nano.stream("Why do parrots have colorful feathers?"):
    print(chunk.text, end="|", flush=True)

||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||Par|ro|ts| are| famous| for| their| vivid| plum|age|,| and| that| color|ation| isn|’t| just| for| show| -| it| serves| several| important| functions| that| have| been| shaped| by| evolution|.| Here|’s| a| quick| rund|own| of| the| main| reasons|:

||| Reason| || How| it| works| || Why| it| matters| for| parro|ts| |
|||--------|||------------|--|||----------------|------------||
||| **|Sex|ual| selection|**| || Bright|,| contrasting| colors| signal| health|,| good| genetics|,| and| strong| immune| systems|.| Males| and| females| often| use| plum|age| to| attract| mates| or| to| assess| rivals|.| || In| many| par|rot| species|,| brighter| males| are| preferred| by| females|,| leading| to| stronger| reproductive| success| for| those| with| more| vivid| feathers|.| |
||| **|Species| and| individual| recognition|**| || Dist|inct| color| patterns| help| individuals| identify| members| of| their| own| species| (|and| sometimes|
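
A common follow-up is to keep the complete response while still displaying chunks as they arrive. The sketch below shows that pattern with a stand-in generator instead of a real model, so it runs without an API key; `fake_stream` and its chunk size are hypothetical, and real chunks are message objects rather than plain strings.

```python
from typing import Iterator


def fake_stream(text: str, size: int = 4) -> Iterator[str]:
    """Stand-in for a streaming model: yields the text a few characters at a time."""
    for start in range(0, len(text), size):
        yield text[start:start + size]


pieces = []
for chunk in fake_stream("Parrots are famous for their vivid plumage."):
    print(chunk, end="|", flush=True)  # show progress as chunks arrive
    pieces.append(chunk)               # keep every chunk for later

full_text = "".join(pieces)  # the complete response, once streaming ends
print()
print(full_text)
```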

### 3. Batch

Batching a collection of independent requests to a model can significantly improve performance and reduce costs, as the processing can be done in parallel:

In [15]:
responses = model_nemotron3_nano.batch([
    "What is the capital of Saudi Arabia?",
    "What is 2 + 8",
    "Is the sky blue or is it our perception? give a short and concise answer"
])

for response in responses:
    print(response)

content='The capital of Saudi Arabia is **Riyadh**.' additional_kwargs={'refusal': None} response_metadata={'token_usage': {'completion_tokens': 38, 'prompt_tokens': 24, 'total_tokens': 62, 'completion_tokens_details': {'accepted_prediction_tokens': None, 'audio_tokens': 0, 'reasoning_tokens': 26, 'rejected_prediction_tokens': None}, 'prompt_tokens_details': {'audio_tokens': 0, 'cached_tokens': 0}, 'cost': 0, 'is_byok': False, 'cost_details': {'upstream_inference_cost': 0, 'upstream_inference_prompt_cost': 0, 'upstream_inference_completions_cost': 0}}, 'model_provider': 'openai', 'model_name': 'nvidia/nemotron-3-nano-30b-a3b:free', 'system_fingerprint': None, 'id': 'gen-1771587230-rws1sKd4yd2Ir45jBE92', 'finish_reason': 'stop', 'logprobs': None} id='lc_run--019c7ad3-dab8-7670-aebf-be1a19d73902-0' tool_calls=[] invalid_tool_calls=[] usage_metadata={'input_tokens': 24, 'output_tokens': 38, 'total_tokens': 62, 'input_token_details': {'audio': 0, 'cache_read': 0}, 'output_token_details': {

In [16]:
for i, response in enumerate(responses):
    print(response.content)
    print("="*100)

The capital of Saudi Arabia is **Riyadh**.
2 + 8 = 10.
The sky appears blue because molecules in the atmosphere scatter short-wavelength (blue) sunlight - an objective physical effect that our visual system interprets as the color blue.
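
To illustrate why running independent requests in parallel helps, here is a stdlib-only sketch that sends three simulated requests through a thread pool. It is an analogy for the idea behind `batch()`, not LangChain's actual implementation; `fake_model_call` and its 0.2 second delay are assumptions standing in for a network-bound model request.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def fake_model_call(prompt: str) -> str:
    """Stand-in for a network-bound model request."""
    time.sleep(0.2)  # simulate request latency
    return f"answer to: {prompt}"


prompts = [
    "What is the capital of Saudi Arabia?",
    "What is 2 + 8",
    "Is the sky blue?",
]

start = time.perf_counter()
with ThreadPoolExecutor(max_workers=len(prompts)) as pool:
    # map() returns results in input order, even if calls finish out of order
    answers = list(pool.map(fake_model_call, prompts))
elapsed = time.perf_counter() - start

for answer in answers:
    print(answer)
print(f"{elapsed:.2f}s for {len(prompts)} calls")
```

Because the calls mostly wait on I/O, the total time is close to that of a single call rather than the sum of all three.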


## Structured output

Models can be requested to provide their response in a format matching a given schema. This is useful for ensuring the output can be easily parsed and used in subsequent processing. LangChain supports multiple schema types and methods for enforcing structured output.

[Pydantic models](https://docs.pydantic.dev/latest/concepts/models/#basic-model-usage) provide the richest feature set with field validation, descriptions, and nested structures.

In [18]:
from pydantic import BaseModel, Field

class Movie(BaseModel):
    """A movie with details."""
    title: str = Field(..., description="The title of the movie")
    year: int = Field(..., description="The year the movie was released")
    director: str = Field(..., description="The director of the movie")
    rating: float = Field(..., description="The movie's rating out of 10")

title='Inception' year=2010 director='Christopher Nolan' rating=8.8




In [None]:
model_with_structure = model_nemotron3_nano.with_structured_output(Movie)
response = model_with_structure.invoke("Provide details about the movie Inception")

In [19]:
print("Title:", response.title)
print("Year:", response.year)
print("Director:", response.director)
print("Rating:", response.rating)

Title: Inception
Year: 2010
Director: Christopher Nolan
Rating: 8.8
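
Conceptually, structured output means the model's reply is parsed and validated against the schema before your code uses it. The sketch below shows that idea with the standard library only: a plain dictionary stands in for the Pydantic model, and a hard-coded JSON string stands in for a real model reply. `MOVIE_SCHEMA` and `parse_structured` are hypothetical names, and this is not how `with_structured_output` is implemented.

```python
import json

# Hypothetical schema: field name -> expected Python type
MOVIE_SCHEMA = {"title": str, "year": int, "director": str, "rating": float}


def parse_structured(raw_json: str, schema: dict) -> dict:
    """Parse JSON text and check each field's presence and type (simplified)."""
    data = json.loads(raw_json)
    result = {}
    for name, expected in schema.items():
        if name not in data:
            raise ValueError(f"missing field: {name}")
        value = data[name]
        if expected is float and isinstance(value, int):
            value = float(value)  # accept 9 where 9.0 is expected
        if not isinstance(value, expected):
            raise ValueError(
                f"{name} should be {expected.__name__}, got {type(value).__name__}"
            )
        result[name] = value
    return result


# Hard-coded stand-in for a model reply
reply = '{"title": "Inception", "year": 2010, "director": "Christopher Nolan", "rating": 8.8}'
movie = parse_structured(reply, MOVIE_SCHEMA)
print(movie["title"], movie["year"])
```

A reply with a missing or wrongly typed field raises `ValueError` here, which is the kind of failure a schema is meant to surface early.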


## Tool calling

Models can request to call tools that perform tasks such as fetching data from a database, searching the web, or running code. Tools are pairings of:

1. A schema, including the name of the tool, a description, and/or argument definitions (often a JSON schema)
2. A function or coroutine to execute.

Note: A *coroutine* is a function that can suspend its execution and resume it later.
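
As a small illustration of the coroutine note above, the sketch below defines an `async` function with Python's built-in `asyncio` and runs two calls concurrently. `fetch_weather` is a hypothetical stand-in for a tool body, and the sleep stands in for a network call.

```python
import asyncio


async def fetch_weather(location: str) -> str:
    """Hypothetical async tool body; the sleep stands in for a network call."""
    await asyncio.sleep(0.1)  # execution is suspended here, then resumed
    return f"It's always sunny in {location}."


async def main() -> list:
    # While one coroutine waits, the event loop can run the other
    return await asyncio.gather(fetch_weather("Riyadh"), fetch_weather("Boston"))


results = asyncio.run(main())
print(results)
```

In a notebook, where an event loop is already running, use `await main()` instead of `asyncio.run(main())`.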


In [None]:
from langchain.tools import tool

@tool
def get_weather(location: str) -> str:
    """Get the weather at a location."""
    return f"It's always sunny in {location}."


model_with_tools = model_nemotron3_nano.bind_tools([get_weather])

response = model_with_tools.invoke("What's the weather like in the Moon?")
for tool_call in response.tool_calls:
    # View tool calls made by the model
    print(f"Tool: {tool_call['name']}")
    print(f"Args: {tool_call['args']}")

Tool: get_weather
Args: {'location': 'Boston'}
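
To make the "schema plus function" pairing more tangible, here is a rough sketch of how a tool description could be derived from a plain Python function using the standard `inspect` module. `describe_tool` and `get_weather_plain` are hypothetical, and the resulting dictionary is a simplified stand-in, not the exact schema that `@tool` generates.

```python
import inspect


def get_weather_plain(location: str, units: str = "celsius") -> str:
    """Get the weather at a location."""
    return f"It's always sunny in {location}."


def describe_tool(func) -> dict:
    """Build a simplified tool description from a function's signature."""
    params = {}
    required = []
    for name, param in inspect.signature(func).parameters.items():
        annotation = param.annotation
        if annotation is inspect.Parameter.empty:
            type_name = "any"
        else:
            type_name = getattr(annotation, "__name__", str(annotation))
        params[name] = {"type": type_name}
        if param.default is inspect.Parameter.empty:
            required.append(name)  # no default value means the argument is required
    return {
        "name": func.__name__,
        "description": inspect.getdoc(func) or "",
        "parameters": params,
        "required": required,
    }


description = describe_tool(get_weather_plain)
print(description)
```

The name, docstring, and argument types are what the model sees when deciding whether and how to call a tool, which is why clear docstrings and type hints matter.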


### Tool Input Schemas

Define complex inputs with Pydantic models or JSON schemas:

In [23]:
from pydantic import BaseModel, Field
from typing import Literal

class WeatherInput(BaseModel):
    """Input for weather queries."""
    location: str = Field(description="City name or coordinates")
    units: Literal["celsius", "fahrenheit"] = Field(
        default="celsius",
        description="Temperature unit preference"
    )
    include_forecast: bool = Field(
        default=False,
        description="Include 5-day forecast"
    )

@tool(args_schema=WeatherInput)
def get_weather(location: str, units: str = "celsius", include_forecast: bool = False) -> str:
    """Get current weather and optional forecast."""
    temp = 22 if units == "celsius" else 72
    result = f"Current weather in {location}: {temp} degrees {units[0].upper()}"
    if include_forecast:
        result += "\nNext 5 days: Sunny"
    return result

In [24]:
model_with_tools = model_nemotron3_nano.bind_tools([get_weather])

In [25]:
response = model_with_tools.invoke(
    "What's the weather like in the Moon? "
    "in fahrenheit and include the forecast please."
)
for tool_call in response.tool_calls:
    # View tool calls made by the model
    print(f"Tool: {tool_call['name']}")
    print(f"Args: {tool_call['args']}")

Tool: get_weather
Args: {'location': 'the Moon', 'units': 'fahrenheit', 'include_forecast': True}
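
Note that the model only requests a tool call; something still has to execute it. A minimal sketch of that step, assuming a hypothetical `TOOLS` registry and a plain-function copy of the tool so the example runs without LangChain:

```python
def get_weather_plain(location: str, units: str = "celsius", include_forecast: bool = False) -> str:
    """Plain-function copy of the tool above, so the sketch runs on its own."""
    temp = 22 if units == "celsius" else 72
    result = f"Current weather in {location}: {temp} degrees {units[0].upper()}"
    if include_forecast:
        result += "\nNext 5 days: Sunny"
    return result


# Registry mapping tool names (as the model sees them) to callables
TOOLS = {"get_weather": get_weather_plain}


def run_tool_call(tool_call: dict) -> str:
    """Look up the requested tool and call it with the model-supplied args."""
    func = TOOLS.get(tool_call["name"])
    if func is None:
        return f"Unknown tool: {tool_call['name']}"
    return func(**tool_call["args"])


# Same shape as the tool call printed above
call = {
    "name": "get_weather",
    "args": {"location": "the Moon", "units": "fahrenheit", "include_forecast": True},
}
tool_output = run_tool_call(call)
print(tool_output)
```

An agent automates this dispatch step and feeds the tool's output back to the model.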


## Search Tools

Tavily is a search engine optimized for LLMs and RAG, aimed at efficient, quick, and persistent search results. As mentioned, it's easy to sign up and offers a generous free tier.

In [26]:
from tavily import TavilyClient

tavily_client = TavilyClient(api_key=os.environ["TAVILY_API_KEY"])

In [34]:
def internet_search(
    query: str,
    max_results: int = 5,
    topic: Literal["general", "news", "finance"] = "general",
    include_raw_content: bool = False,
):
    """Run a web search"""
    return tavily_client.search(
        query,
        max_results=max_results,
        include_raw_content=include_raw_content,
        topic=topic,
    )

In [35]:
result = internet_search("What is LangGraph?", max_results=3)
result

{'query': 'What is LangGraph?',
 'response_time': 0.61,
 'follow_up_questions': None,
 'answer': None,
 'images': [],
 'results': [{'url': 'https://www.ibm.com/think/topics/langgraph',
   'title': 'What is LangGraph? - IBM',
   'content': 'LangGraph, created by LangChain, is an open source AI agent framework designed to build, deploy and manage complex generative AI agent workflows. At its core, LangGraph uses the power of graph-based architectures to model and manage the intricate relationships between various components of an AI agent workflow. LangGraph illuminates the processes within an AI workflow, allowing full transparency of the agent’s state. By combining these technologies with a set of APIs and tools, LangGraph provides users with a versatile platform for developing AI solutions and workflows including chatbots, state graphs and other agent-based systems. **Nodes**: In LangGraph, nodes represent individual components or agents within an AI workflow. LangGraph uses enhance

## Create an Agent

Agents combine language models with tools to create systems that can reason about tasks, decide which tools to use, and iteratively work towards solutions.

An LLM Agent runs tools in a loop to achieve a goal. An agent runs until a stop condition is met - i.e., when the model emits a final output or an iteration limit is reached.

![Agent Loop](./assets/agent_loop.png)
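
The loop described above can be sketched in plain Python. This is a conceptual illustration only, not how `create_agent` is implemented: `scripted_model` is a hypothetical stand-in that always requests one search and then answers, and `search` is a hypothetical tool that returns a canned string.

```python
def scripted_model(messages):
    """Stand-in for an LLM: requests one search, then gives a final answer."""
    if not any(m["role"] == "tool" for m in messages):
        return {
            "role": "assistant",
            "content": "",
            "tool_calls": [{"name": "search", "args": {"query": "LangGraph"}}],
        }
    return {
        "role": "assistant",
        "content": "LangGraph is an agent framework.",
        "tool_calls": [],
    }


def search(query: str) -> str:
    """Hypothetical tool; a real one would call a search API."""
    return f"Top result for {query!r}"


def run_agent(model, tools, user_input, max_iterations=5):
    messages = [{"role": "user", "content": user_input}]
    for _ in range(max_iterations):       # stop condition: iteration limit
        reply = model(messages)
        messages.append(reply)
        if not reply["tool_calls"]:       # stop condition: final answer
            return messages
        for call in reply["tool_calls"]:  # run each requested tool
            output = tools[call["name"]](**call["args"])
            messages.append({"role": "tool", "name": call["name"], "content": output})
    return messages


history = run_agent(scripted_model, {"search": search}, "What is LangGraph?")
for message in history:
    print(message["role"], "->", message["content"])
```

The message history grows with each step (user, tool request, tool result, final answer), which is the same overall shape as the `messages` list returned by a real agent.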

`create_agent` provides a production-ready agent implementation.

In [None]:
from langchain.agents import create_agent

# System prompt to steer the agent to be an expert researcher
AGENT_PROMPT = """You are an expert researcher. Your job is to conduct thorough research and then write a polished report.

You have access to an internet search tool as your primary means of gathering information.

Keep it short and concise.

## `internet_search`

Use this to run an internet search for a given query. You can specify the max number of results to return, the topic, and whether raw content should be included.
"""

agent = create_agent(
    model=model_nemotron3_nano,
    tools=[internet_search],
    system_prompt=AGENT_PROMPT
)

### Invoke

In [37]:
result = agent.invoke({"messages": [{"role": "user", "content": "What is langgraph?"}]})

In [38]:
result

{'messages': [HumanMessage(content='What is langgraph?', additional_kwargs={}, response_metadata={}, id='29083314-fa84-4871-ba0d-f594b69ce1fe'),
  AIMessage(content='', additional_kwargs={'refusal': None}, response_metadata={'token_usage': {'completion_tokens': 327, 'prompt_tokens': 447, 'total_tokens': 774, 'completion_tokens_details': {'accepted_prediction_tokens': None, 'audio_tokens': 0, 'reasoning_tokens': 292, 'rejected_prediction_tokens': None}, 'prompt_tokens_details': {'audio_tokens': 0, 'cached_tokens': 0}, 'cost': 0, 'is_byok': False, 'cost_details': {'upstream_inference_cost': 0, 'upstream_inference_prompt_cost': 0, 'upstream_inference_completions_cost': 0}}, 'model_provider': 'openai', 'model_name': 'nvidia/nemotron-3-nano-30b-a3b:free', 'system_fingerprint': None, 'id': 'gen-1771589000-vrLW5z3J6cADEb7ZfFeY', 'finish_reason': 'tool_calls', 'logprobs': None}, id='lc_run--019c7aee-db92-70c2-9d0e-ef86169a1ec1-0', tool_calls=[{'name': 'internet_search', 'args': {'max_results':

In [39]:
# Print the agent's response
print(result["messages"][-1].content)

**LangGraph – A Polished Overview**

---

### 1. Executive Summary
LangGraph is an **open-source orchestration framework** that lets developers build, deploy, and manage **stateful, long-running AI agents** as graph-based workflows. Created by the team behind **LangChain**, it provides low-level primitives for durable execution, human-in-the-loop control, memory, streaming, and debugging. Because it models agent logic as a graph, LangGraph makes complex, multi-step, and multi-agent processes transparent, scalable, and easy to reason about.

> *“LangGraph, created by LangChain, is an open source AI agent framework designed to build, deploy and manage complex generative AI agent workflows.”* – IBM Think article【1†L1-L4】  

---

### 2. Core Concepts

| Concept | What It Is | Why It Matters |
|---------|------------|----------------|
| **Nodes** | Individual components or “actors” (e.g., an LLM call, a tool, a data fetcher). | Represent the atomic steps 

## Key Takeaways

- Three key methods for models: invoke, stream, and batch.
- LLMs can be configured to respond in a structured format.
- Agent = Model + Tools.
- Models (LLMs) are the brain-power of agents.
- Tools are Python functions described to the model by a name, a description, and an argument schema.

## Activity

**Over to you:** create an Agent that is able to answer questions, with an added internet search capability.