# Streaming

<img src="./assets/LC_streaming.png" width="400">

Streaming reduces the latency between generating data and the user receiving it.
Two stream modes are frequently used with agents: `values` and `messages`.

## Setup

Load and check the required environment variables.

In [2]:
from dotenv import load_dotenv
from env_utils import doublecheck_env

# Load environment variables from .env
load_dotenv()

# Check and print results
doublecheck_env(".env")

# Declaring default model
default_model = "google_genai:gemini-2.5-flash"

LANGSMITH_API_KEY=****3d99
LANGSMITH_TRACING=true
LANGSMITH_PROJECT=****ials
LANGSMITH_ENDPOINT=****.com
GOOGLE_API_KEY=****zZa4
DEFAULT_MODEL=****lash


In [3]:
from langchain.agents import create_agent

In [4]:
agent = create_agent(
    model=default_model,
    system_prompt="You are a full-stack comedian",
)

## No Streaming (invoke)

In [6]:
result = agent.invoke({"messages": [{"role": "user", "content": "Tell me a joke"}]})
print(result["messages"][1].content)

Why did the full-stack developer get kicked out of the restaurant?

Because they kept trying to *inspect element* on the menu and complained the *waiter's API* was poorly documented!


## values
The `values` mode emits the full agent state after each step. You have seen this streaming mode in our examples so far.

In [7]:
# Stream = values
for step in agent.stream(
    {"messages": [{"role": "user", "content": "Tell me a Dad joke"}]},
    stream_mode="values",
):
    step["messages"][-1].pretty_print()


Tell me a Dad joke

Why did the scarecrow win an award?

Because he was outstanding in his field!
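
In `values` mode each step carries the whole message list so far, which is why the loop above prints only `messages[-1]`. Here is a minimal sketch of that behavior. It uses a hypothetical generator, `fake_values_stream`, in place of a real agent, and plain strings in place of message objects, so it runs without a model or API key:

```python
from typing import Dict, Iterator, List


def fake_values_stream(user_text: str, reply: str) -> Iterator[Dict[str, List[str]]]:
    """Hypothetical stand-in for agent.stream(..., stream_mode="values").

    Yields a snapshot of the whole message list after each step.
    """
    messages = [user_text]
    yield {"messages": list(messages)}  # snapshot after the user input
    messages.append(reply)
    yield {"messages": list(messages)}  # snapshot after the model reply


snapshots = list(
    fake_values_stream("Tell me a Dad joke", "Why did the scarecrow win an award?")
)
for step in snapshots:
    # Each snapshot repeats earlier messages, so show only the newest one.
    print(len(step["messages"]), "message(s); latest:", step["messages"][-1])
```

Because every snapshot is cumulative, printing the full list at each step would repeat earlier messages; taking the last element avoids that.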


## messages
The `messages` mode streams data token by token, which gives the lowest latency possible. This is ideal for interactive applications like chatbots.

In [8]:
for token, metadata in agent.stream(
    {"messages": [{"role": "user", "content": "Write me a family friendly poem."}]},
    stream_mode="messages",
):
    print(f"{token.content}", end="")

The sun peeks up with a sleepy yawn,
And paints the sky at the break of dawn.
With colors soft, of pink and gold,
A brand new story about to unfold.

Little birds begin to tweet,
A happy song, oh so sweet!
The flowers stretch, and gently sway,
Waving hello to the brand new day.

A tiny squirrel climbs up a tree,
As busy as a bee, you see!
The gentle breeze begins to play,
Whispering secrets all the way.

So let's all smile and laugh and cheer,
And chase away each tiny fear.
For every day, a gift so grand,
With wonders waiting close at hand!
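
If the complete text is also needed once streaming finishes (for logging, say), the tokens can be collected while they are printed. Here is a minimal sketch, assuming each token exposes a string `.content` as in the cell above. `FakeToken`, `fake_stream`, and `print_and_collect` are hypothetical names, and the fake stream stands in for `agent.stream(..., stream_mode="messages")` so the snippet runs without a model:

```python
from dataclasses import dataclass
from typing import Any, Iterable, Iterator, Tuple


@dataclass
class FakeToken:
    """Hypothetical stand-in for a streamed message chunk."""
    content: str


def fake_stream() -> Iterator[Tuple[FakeToken, dict]]:
    """Hypothetical stand-in that yields (token, metadata) pairs."""
    for piece in ["Roses ", "are ", "red."]:
        yield FakeToken(piece), {}


def print_and_collect(stream: Iterable[Tuple[Any, dict]]) -> str:
    """Print each token as it arrives and return the concatenated text."""
    parts = []
    for token, _metadata in stream:
        # Assumption: content is a plain string; anything else is skipped.
        if isinstance(token.content, str):
            print(token.content, end="", flush=True)
            parts.append(token.content)
    print()  # finish the streamed line
    return "".join(parts)


full_text = print_and_collect(fake_stream())
```

With a real agent, the same helper would be called as `print_and_collect(agent.stream(..., stream_mode="messages"))`, provided the string-content assumption holds for the chosen model.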

## Tools can stream too!
Streaming generally means delivering information to the user before the final result is ready, which is useful whenever a step takes noticeable time. The writer returned by `get_stream_writer` lets you stream `custom` data from sources you create, such as tools.

In [11]:
from langchain.agents import create_agent
from langgraph.config import get_stream_writer


def get_weather(city: str) -> str:
    """Get weather for a given city."""
    writer = get_stream_writer()
    # stream any arbitrary data
    writer(f"Looking up data for city: {city}")
    writer(f"Acquired data for city: {city}")
    return f"It's always sunny in {city}!"


agent = create_agent(
    model=default_model,
    tools=[get_weather],
)

for chunk in agent.stream(
    {"messages": [{"role": "user", "content": "What is the weather in São Paulo?"}]},
    stream_mode=["values", "custom"],
):
    print(chunk)

('values', {'messages': [HumanMessage(content='What is the weather in São Paulo?', additional_kwargs={}, response_metadata={}, id='417d7959-6c1a-4806-976a-2ab14c519033')]})
('values', {'messages': [HumanMessage(content='What is the weather in São Paulo?', additional_kwargs={}, response_metadata={}, id='417d7959-6c1a-4806-976a-2ab14c519033'), AIMessage(content='', additional_kwargs={'function_call': {'name': 'get_weather', 'arguments': '{"city": "S\\u00e3o Paulo"}'}}, response_metadata={'prompt_feedback': {'block_reason': 0, 'safety_ratings': []}, 'finish_reason': 'STOP', 'model_name': 'gemini-2.5-flash', 'safety_ratings': [], 'model_provider': 'google_genai'}, id='lc_run--135cc16d-90fd-4804-b200-abbeef94e18c-0', tool_calls=[{'name': 'get_weather', 'args': {'city': 'São Paulo'}, 'id': 'ba7305ad-0441-43d6-a851-abf25d1f0950', 'type': 'tool_call'}], usage_metadata={'input_tokens': 48, 'output_tokens': 78, 'total_tokens': 126, 'input_token_details': {'cache_read': 0}, 'output_token_details'

In [13]:
for chunk in agent.stream(
    {"messages": [{"role": "user", "content": "What is the weather in São Paulo?"}]},
    stream_mode=["custom"],
):
    print(chunk)

('custom', 'Looking up data for city: São Paulo')
('custom', 'Acquired data for city: São Paulo')


## Try different modes on your own!
Modify the stream mode and the selection logic to produce different results.

In [26]:
for chunk in agent.stream(
    {"messages": [{"role": "user", "content": "What is the weather in Cajamar?"}]},
    stream_mode=["values", "custom"],
):
    if chunk[0] == "custom":
        print(chunk[1])
    if chunk[0] == "values":
        print(chunk[1]['messages'][-1].content)

What is the weather in Cajamar?

Looking up data for city: Cajamar
Acquired data for city: Cajamar
It's always sunny in Cajamar!
It's always sunny in Cajamar!
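
When `stream_mode` is a list, each chunk arrives as a `(mode, payload)` tuple, as the printed output above shows. The `if` chain in the last cell can also be written as a small dispatch table, which makes adding or removing modes a one-line change. This is a sketch only: `route_chunks`, `fake_chunks`, and the handler labels are hypothetical, and plain dicts stand in for the message objects a real agent returns:

```python
from typing import Any, Callable, Dict, Iterable, List, Tuple


def route_chunks(
    chunks: Iterable[Tuple[str, Any]],
    handlers: Dict[str, Callable[[Any], str]],
) -> List[str]:
    """Apply the handler registered for each chunk's mode; skip unknown modes."""
    lines = []
    for mode, payload in chunks:
        handler = handlers.get(mode)
        if handler is not None:
            lines.append(handler(payload))
    return lines


# Hypothetical chunks shaped like the (mode, payload) tuples printed above.
fake_chunks = [
    ("values", {"messages": [{"content": "What is the weather in Cajamar?"}]}),
    ("custom", "Looking up data for city: Cajamar"),
    ("updates", {"ignored": True}),  # no handler registered, so it is skipped
]

handlers = {
    "custom": lambda payload: f"[tool] {payload}",
    "values": lambda payload: f"[state] {payload['messages'][-1]['content']}",
}

routed = route_chunks(fake_chunks, handlers)
for line in routed:
    print(line)
```

Prefixing each line with its source also makes it easier to see which lines come from the tool's writer and which come from the agent state.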
