# Streaming

<img src="./assets/LC_streaming.png" width="400">

Streaming reduces the latency between generating data and the user receiving it.
Two stream modes are frequently used with agents: `values` and `messages`.
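To make the latency claim concrete, here is a minimal sketch in plain Python that does not use LangChain at all. `fake_generate` and `time_to_first_output` are hypothetical names invented for this illustration; the generator stands in for a model that produces one word at a time.

```python
import time
from typing import Iterator


def fake_generate(words: list[str], delay: float = 0.01) -> Iterator[str]:
    """Hypothetical stand-in for a model: yields one word at a time."""
    for word in words:
        time.sleep(delay)  # simulate per-token generation time
        yield word


def time_to_first_output(words: list[str], streaming: bool) -> float:
    """Return seconds until the caller has something to show the user."""
    start = time.perf_counter()
    stream = fake_generate(words)
    if streaming:
        next(stream)  # the first word is ready after a single delay
    else:
        list(stream)  # must wait for every word before showing anything
    return time.perf_counter() - start


words = "streaming shows partial results early".split()
streamed = time_to_first_output(words, streaming=True)
blocking = time_to_first_output(words, streaming=False)
print(f"first output when streaming: {streamed:.3f}s")
print(f"first output when blocking:  {blocking:.3f}s")
```

The total generation time is the same in both cases; only the wait before the first visible output changes.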

## Setup

Load and check the required environment variables.

In [1]:
from dotenv import load_dotenv

from env_utils import doublecheck_env

# Load environment variables from .env
load_dotenv()

# Check and print results
doublecheck_env("example.env")

OPENAI_API_KEY=****_1cA
LANGSMITH_API_KEY=****7ac4
LANGSMITH_TRACING=true
LANGSMITH_PROJECT=****ials


In [2]:
from langchain.agents import create_agent

In [3]:
agent = create_agent(
    model="openai:gpt-5",
    system_prompt="You are a full-stack comedian",
)

## No Streaming (invoke)

In [4]:
result = agent.invoke({"messages": [{"role": "user", "content": "Tell me a joke"}]})
print(result["messages"][-1].content)  # the last message is the model's reply

I tried to tell a JavaScript joke, but the punchline was asynchronous—everyone laughed three seconds later.


## `values`
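You have seen this streaming mode in our examples so far. In `values` mode, each streamed step carries the full agent state, so `step["messages"][-1]` is the newest message.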

In [5]:
# Stream = values
for step in agent.stream(
    {"messages": [{"role": "user", "content": "Tell me a Dad joke"}]},
    stream_mode="values",
):
    step["messages"][-1].pretty_print()


Tell me a Dad joke

What do you call fake spaghetti? An impasta.
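As a rough mental model of `values` mode, the sketch below imitates the shape of the stream with plain dictionaries. `fake_values_stream` is a hypothetical stand-in, not part of LangChain; it only assumes that each step yields a snapshot of the whole message list, which is why the loop reads the last element.

```python
from typing import Iterator


def fake_values_stream(user_text: str, reply: str) -> Iterator[dict]:
    """Hypothetical stand-in for agent.stream(..., stream_mode="values")."""
    messages = [{"role": "user", "content": user_text}]
    yield {"messages": list(messages)}  # snapshot after the input is added
    messages.append({"role": "assistant", "content": reply})
    yield {"messages": list(messages)}  # snapshot after the model step


snapshots = list(fake_values_stream("Tell me a Dad joke", "An impasta."))
for step in snapshots:
    newest = step["messages"][-1]  # each snapshot is cumulative
    print(f'{newest["role"]}: {newest["content"]}')
```

Because every snapshot is cumulative, printing the whole list at each step would repeat earlier messages.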


## `messages`
The `messages` mode streams data token by token, giving the lowest latency possible. This makes it a good fit for interactive applications such as chatbots.

In [6]:
for token, metadata in agent.stream(
    {"messages": [{"role": "user", "content": "Write me a family friendly poem."}]},
    stream_mode="messages",
):
    print(f"{token.content}", end="")

The House That Laughs

Our house wakes up like toast that pops,
Sunlight does cartwheels over mops.
The cat holds meetings on the stair—
Agenda item one: “My chair.”

Dad stirs pancakes, flips a pun,
Calls them “flour power,” one by one.
Mom’s eight calendars juggle air;
She high-fives Tuesday, braids its hair.

My brother bargains with his peas:
“Two bites and I’ll sign the treaty, please.”
My sister builds a blanket state,
With teddy guards at every gate.

Grandpa’s remote, an ancient rune,
Finds documentaries by noon.
The dog, a furry vacuum zoom,
Collects the crumbs, patrols the room.

A sock escapes the laundry sea,
Returns as puppet royalty.
Wi‑Fi hiccups—no big deal—
We dust off cards; the laughs are real.

At bedtime, stars tap on the glass;
We chart a route to Dreamland’s pass.
“Love you to the fridge and back,”
Then night tucks in our cozy stack.

## Tools can stream too!
Streaming generally means delivering information to the user before the final result is ready, and there are many cases where this is useful. The writer returned by `get_stream_writer` lets you stream `custom` data from sources you create, such as tools.

In [7]:
from langchain.agents import create_agent
from langchain_core.tools import tool
from langgraph.config import get_stream_writer


@tool
def get_weather(city: str) -> str:
    """Get weather for a given city."""
    writer = get_stream_writer()
    # stream any arbitrary data
    writer(f"Looking up data for city: {city}")
    writer(f"Acquired data for city: {city}")
    return f"It's always sunny in {city}!"


agent = create_agent(
    model="openai:gpt-5-mini",
    tools=[get_weather],
)

for chunk in agent.stream(
    {"messages": [{"role": "user", "content": "What is the weather in SF?"}]},
    stream_mode=["values", "custom"],
):
    print(chunk)

('values', {'messages': [HumanMessage(content='What is the weather in SF?', additional_kwargs={}, response_metadata={}, id='2c3e7c2f-65d3-4c20-8eeb-bdf3ebe6295e')]})
('values', {'messages': [HumanMessage(content='What is the weather in SF?', additional_kwargs={}, response_metadata={}, id='2c3e7c2f-65d3-4c20-8eeb-bdf3ebe6295e'), AIMessage(content='', additional_kwargs={'refusal': None}, response_metadata={'token_usage': {'completion_tokens': 88, 'prompt_tokens': 132, 'total_tokens': 220, 'completion_tokens_details': {'accepted_prediction_tokens': 0, 'audio_tokens': 0, 'reasoning_tokens': 64, 'rejected_prediction_tokens': 0}, 'prompt_tokens_details': {'audio_tokens': 0, 'cached_tokens': 0}}, 'model_provider': 'openai', 'model_name': 'gpt-5-mini-2025-08-07', 'system_fingerprint': None, 'id': 'chatcmpl-DBdVHCSmTFafO0gpe5L9XsO9rk0FQ', 'service_tier': 'default', 'finish_reason': 'tool_calls', 'logprobs': None}, id='lc_run--019c7f83-32f0-78a1-b4b2-0f4015752f90-0', tool_calls=[{'name': 'get_we
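Conceptually, the writer acts as a side channel: progress updates go out through it while the tool's return value travels back to the model as usual. The following sketch shows that separation in plain Python. It is an assumption-laden illustration, not LangGraph's mechanism: `get_weather_with_progress` is a hypothetical function, and the writer here is just a list's `append` method.

```python
from typing import Callable


def get_weather_with_progress(city: str, writer: Callable[[str], None]) -> str:
    """Hypothetical tool: reports progress via `writer`, returns the answer."""
    writer(f"Looking up data for city: {city}")
    writer(f"Acquired data for city: {city}")
    return f"It's always sunny in {city}!"


progress: list[str] = []  # stands in for the "custom" stream
tool_result = get_weather_with_progress("SF", progress.append)

print(progress)
print(tool_result)
```

The progress messages and the return value never mix, which mirrors how `custom` chunks arrive separately from the tool's result in the agent state.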

In [8]:
from langchain_core.messages import AIMessageChunk

answer = []

for chunk in agent.stream(
    {"messages": [{"role": "user", "content": "What is the weather in SF?"}]},
    stream_mode=["messages", "custom"],
):
    chunk_type, content = chunk
    
    print("=====================================================================")
    print(f"chunk type: {chunk_type}")
    
    if chunk_type == "messages":
        print(f"message: {content[0]}")
        print(f"metadata: {content[1]}")
        
        if (isinstance(content[0], AIMessageChunk) and not content[0].tool_calls and 
            not content[0].tool_call_chunks and not content[0].additional_kwargs):
            answer.append(content[0].content)
    else:
        print(f"message: {chunk[-1]}")

chunk type: messages
message: content='' additional_kwargs={} response_metadata={'model_provider': 'openai'} id='lc_run--019c7f83-4b42-7931-a601-367951756b40' tool_calls=[{'name': 'get_weather', 'args': {}, 'id': 'call_l3xIu06rRaoNbUsnAsmT0R9U', 'type': 'tool_call'}] invalid_tool_calls=[] tool_call_chunks=[{'name': 'get_weather', 'args': '', 'id': 'call_l3xIu06rRaoNbUsnAsmT0R9U', 'index': 0, 'type': 'tool_call_chunk'}]
metadata: {'langgraph_step': 1, 'langgraph_node': 'model', 'langgraph_triggers': ('branch:to:model',), 'langgraph_path': ('__pregel_pull', 'model'), 'langgraph_checkpoint_ns': 'model:0955ca57-83f5-516d-4f6a-e055baeafc51', 'checkpoint_ns': 'model:0955ca57-83f5-516d-4f6a-e055baeafc51', 'ls_provider': 'openai', 'ls_model_name': 'gpt-5-mini', 'ls_model_type': 'chat', 'ls_temperature': None}
chunk type: messages
message: content='' additional_kwargs={} response_metadata={'model_provider': 'openai'} id='lc_run--019c7f83-4b42-7931-a601-367951756b40' tool_calls=[{'name': '', 'ar

In [9]:
print("")
print("".join(answer))


According to the weather service: "It's always sunny in San Francisco!" 

Want a real-time temperature, hourly forecast, or a 7-day outlook?


In [10]:
for token, metadata in agent.stream(
    {"messages": [{"role": "user", "content": "What is the weather in SF?"}]},
    stream_mode="messages",
):
    print(f"{token.content}", end="")

It's always sunny in San Francisco!"It's always sunny in San Francisco!" 

Would you like the current temperature, hourly forecast, or weather for a different city?

The output shape changes when `stream_mode` is the single string `"messages"` rather than a list of modes: each streamed item is a `(token, metadata)` tuple with no mode tag, so the loop deals with messages only.

In [11]:
for token, metadata in agent.stream(
    {"messages": [{"role": "user", "content": "What is the weather in SF?"}]},
    stream_mode="messages",
):
    print(token)

content='' additional_kwargs={} response_metadata={'model_provider': 'openai'} id='lc_run--019c7f83-7260-7713-8f87-93bd1fd8998d' tool_calls=[{'name': 'get_weather', 'args': {}, 'id': 'call_rST5a0GaSsa1ABpebH16E9Dk', 'type': 'tool_call'}] invalid_tool_calls=[] tool_call_chunks=[{'name': 'get_weather', 'args': '', 'id': 'call_rST5a0GaSsa1ABpebH16E9Dk', 'index': 0, 'type': 'tool_call_chunk'}]
content='' additional_kwargs={} response_metadata={'model_provider': 'openai'} id='lc_run--019c7f83-7260-7713-8f87-93bd1fd8998d' tool_calls=[{'name': '', 'args': {}, 'id': None, 'type': 'tool_call'}] invalid_tool_calls=[] tool_call_chunks=[{'name': None, 'args': '{"', 'id': None, 'index': 0, 'type': 'tool_call_chunk'}]
content='' additional_kwargs={} response_metadata={'model_provider': 'openai'} id='lc_run--019c7f83-7260-7713-8f87-93bd1fd8998d' tool_calls=[] invalid_tool_calls=[{'name': None, 'args': 'city', 'id': None, 'error': None, 'type': 'invalid_tool_call'}] tool_call_chunks=[{'name': None, 'a
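The two item shapes seen in the cells above can be handled by one loop if they are normalized first. The helper below is a sketch under that assumption; `normalize` and its parameters are hypothetical names, and the fake streams are hand-written stand-ins for what `agent.stream` yields.

```python
from typing import Any, Iterable, Iterator


def normalize(
    stream: Iterable[Any], multi_mode: bool, single_mode: str = "messages"
) -> Iterator[tuple[str, Any]]:
    """Yield (mode, payload) pairs regardless of how stream_mode was given."""
    for item in stream:
        if multi_mode:
            mode, payload = item  # list of modes: items are (mode, payload)
        else:
            mode, payload = single_mode, item  # single mode: item is the payload
        yield mode, payload


# Single mode: each item is (token, metadata).
single = [("Hel", {"langgraph_node": "model"}), ("lo", {"langgraph_node": "model"})]
# List of modes: each item is (mode, payload).
multi = [("messages", ("Hi", {"langgraph_node": "model"})), ("custom", "Looking up data")]

single_pairs = list(normalize(single, multi_mode=False))
multi_pairs = list(normalize(multi, multi_mode=True))
print(single_pairs[0])
print(multi_pairs[1])
```

The caller has to state which shape it expects, because both shapes are two-element tuples and cannot be told apart reliably by inspection.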

## Try different modes on your own!
Modify the stream modes and the selection condition (the `if` test on `chunk[0]`) to produce different results.

In [12]:
for chunk in agent.stream(
    {"messages": [{"role": "user", "content": "What is the weather in SF?"}]},
    stream_mode=["values", "custom"],
):
    if chunk[0] == "custom":
        print(chunk[1])

Looking up data for city: San Francisco
Acquired data for city: San Francisco
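One way to experiment with the selection is to pull the filter into a small generator. This is a sketch only: `select_modes` is a hypothetical helper, and `fake_stream` is a hand-written stand-in for the `(mode, payload)` tuples produced when `stream_mode` is a list.

```python
from typing import Any, Iterable, Iterator


def select_modes(stream: Iterable[tuple[str, Any]], wanted: set[str]) -> Iterator[Any]:
    """Yield only the payloads whose mode tag is in `wanted`."""
    for mode, payload in stream:
        if mode in wanted:
            yield payload


# Hand-written stand-in for agent.stream(..., stream_mode=["values", "custom"]).
fake_stream = [
    ("values", {"messages": ["What is the weather in SF?"]}),
    ("custom", "Looking up data for city: San Francisco"),
    ("custom", "Acquired data for city: San Francisco"),
    ("values", {"messages": ["What is the weather in SF?", "It's always sunny!"]}),
]

custom_only = list(select_modes(fake_stream, {"custom"}))
values_only = list(select_modes(fake_stream, {"values"}))
print(custom_only)
```

With a real agent, the same helper could wrap the `agent.stream(...)` call in place of `fake_stream`, assuming a list was passed as `stream_mode`.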
