# Streaming

<img src="./assets/LC_streaming.png" width="400">

Streaming reduces the latency between generating data and the user receiving it.
Two stream modes are frequently used with agents: `values` and `messages`.

## Setup

Load and check the required environment variables.

In [1]:
from dotenv import load_dotenv
from env_utils import doublecheck_env

# Load environment variables from .env
load_dotenv()

# Check and print results
doublecheck_env(".env")

OPENAI_API_KEY=****here
LANGSMITH_API_KEY=****754b
LANGSMITH_TRACING=true
LANGSMITH_PROJECT=****ials


In [2]:
from langchain_ollama import ChatOllama

llm = ChatOllama(model="llama3.1:8b", temperature=0.8)

In [3]:
from langchain.agents import create_agent

In [4]:
agent = create_agent(
    model=llm,
    system_prompt="You are a full-stack comedian",
)

## No Streaming (invoke)

In [5]:
result = agent.invoke({"messages": [{"role": "user", "content": "Tell me a joke"}]})
# The reply is the last message in the returned state
print(result["messages"][-1].content)

Here's one:

Why don't some people like pizza?

(wait for it...)

Because it's a little "saucy"!

(get it? Saucy, like the sauce on the pizza, but also sassy and annoying... Ahh, I crack myself up!)

What do you think? Should I stick to coding or keep trying out my stand-up skills?
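
To make the contrast with the streaming modes concrete, here is a minimal sketch that does not call the agent at all. `fake_model`, `invoke_like`, and `stream_like` are hypothetical stand-ins, not LangChain APIs. They only illustrate that an invoke-style call returns after the whole reply exists, while a stream-style call hands over each piece as it is produced.

```python
from typing import Iterator, List


def fake_model(prompt: str) -> Iterator[str]:
    """Hypothetical stand-in for a model: yields a fixed reply word by word."""
    for word in ["Why", "did", "the", "chicken", "cross", "the", "road?"]:
        yield word + " "


def invoke_like(prompt: str) -> str:
    """Blocks until every piece is generated, then returns the full reply."""
    return "".join(fake_model(prompt))


def stream_like(prompt: str) -> Iterator[str]:
    """Hands each piece to the caller as soon as it is generated."""
    yield from fake_model(prompt)


full_reply = invoke_like("Tell me a joke")
pieces: List[str] = list(stream_like("Tell me a joke"))
# The first piece is available before the rest of the reply has been produced
first_piece = next(stream_like("Tell me a joke"))
print(first_piece)
```

Both paths produce the same text. The difference is only when the caller first sees something.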


## values
The `values` mode emits the full agent state after each step. You have seen this streaming mode in our examples so far.

In [6]:
# Stream = values
for step in agent.stream(
    {"messages": [{"role": "user", "content": "Tell me a Dad joke"}]},
    stream_mode="values",
):
    step["messages"][-1].pretty_print()


Tell me a Dad joke

Here's one that's sure to "punder" your expectations:

Why did the mushroom go to the party?

Because he was a fun-gi!

(Sorry, I know it's a bit of a groaner!)
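
Because each `values` chunk carries the whole message list, printing only the last message (as above) is one way to avoid repeats. Another option is to track how many messages have already been seen. The sketch below assumes the message list only grows between chunks. `new_messages` is a hypothetical helper, and plain strings stand in for message objects.

```python
from typing import Any, Dict, Iterable, Iterator, List


def new_messages(snapshots: Iterable[Dict[str, List[Any]]]) -> Iterator[Any]:
    """Yield each message once, given successive full-state snapshots.

    Assumes the "messages" list is append-only between snapshots.
    """
    seen = 0
    for state in snapshots:
        messages = state["messages"]
        # Everything past the previously seen length is new in this snapshot
        for message in messages[seen:]:
            yield message
        seen = len(messages)


# Hypothetical snapshots shaped like the chunks from stream_mode="values"
fake_snapshots = [
    {"messages": ["user: Tell me a Dad joke"]},
    {"messages": ["user: Tell me a Dad joke", "ai: Because he was a fun-gi!"]},
]
unique = list(new_messages(fake_snapshots))
print(unique)
```

With the real agent, the same helper could presumably wrap `agent.stream(..., stream_mode="values")`, as long as the append-only assumption holds for your graph.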


## messages
The `messages` mode streams model output token by token, which gives the lowest latency. Each item is a `(token, metadata)` tuple. This mode is well suited to interactive applications like chatbots.

In [7]:
for token, metadata in agent.stream(
    {"messages": [{"role": "user", "content": "Write me a family friendly poem."}]},
    stream_mode="messages",
):
    print(f"{token.content}", end="")

Here's a silly poem for the whole crew:

In the land of laughter and play,
Lived a family in their own special way.
Mom was the sunshine, Dad was the fun,
The kids were the giggles, one by one.

There was Timmy, the jokester supreme,
Samantha, the singer, with a voice so serene.
Emma, the artist, with colors bright,
And Benny, the builder, who loved to ignite.

Together they'd dance in the kitchen space,
Make pancakes and laughter fill the place.
They'd play outside 'til the stars came out high,
A family of friends, with love shining bright in the sky.

Their home was a haven, full of joy and cheer,
Where everyone's quirks were welcome, year after year.
So if you're feeling blue or feeling down,
Just remember this family, spinning around!

Hope you enjoyed it!
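
An interactive app often needs the complete text as well as the live tokens, for example to store the reply afterwards. The sketch below prints tokens as they arrive and also returns the joined text. `FakeChunk` and `collect_text` are hypothetical names. The code assumes each token exposes a string `.content`, as in the loop above.

```python
from dataclasses import dataclass
from typing import Any, Dict, Iterable, List, Tuple


@dataclass
class FakeChunk:
    """Hypothetical stand-in for a streamed message chunk."""
    content: str


def collect_text(stream: Iterable[Tuple[Any, Dict[str, Any]]]) -> str:
    """Print tokens as they arrive and return the full text at the end."""
    parts: List[str] = []
    for token, _metadata in stream:
        if token.content:  # skip chunks that carry no text
            print(token.content, end="", flush=True)
            parts.append(token.content)
    print()  # finish the line once the stream ends
    return "".join(parts)


# Hypothetical stream shaped like the (token, metadata) tuples above
fake_stream = [
    (FakeChunk("Roses "), {}),
    (FakeChunk(""), {}),
    (FakeChunk("are red"), {}),
]
poem = collect_text(fake_stream)
```

Passing `flush=True` keeps the terminal output in step with the stream instead of waiting for a buffer to fill.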

## Tools can stream too!
Streaming generally means delivering information to the user before the final result is ready, which is useful well beyond model tokens. The writer returned by `get_stream_writer()` lets you stream `custom` data from sources you create, such as progress updates from inside a tool.

In [8]:
from langchain.agents import create_agent
from langgraph.config import get_stream_writer


def get_weather(city: str) -> str:
    """Get weather for a given city."""
    writer = get_stream_writer()
    # stream any arbitrary data
    writer(f"Looking up data for city: {city}")
    writer(f"Acquired data for city: {city}")
    return f"It's always sunny in {city}!"


agent = create_agent(
    model=llm,
    tools=[get_weather],
)

for chunk in agent.stream(
    {"messages": [{"role": "user", "content": "What is the weather in SF?"}]},
    stream_mode=["values", "custom"],
):
    print(chunk)

('values', {'messages': [HumanMessage(content='What is the weather in SF?', additional_kwargs={}, response_metadata={}, id='527d46e4-fa50-4c04-b129-71891ed2cc0b')]})
('values', {'messages': [HumanMessage(content='What is the weather in SF?', additional_kwargs={}, response_metadata={}, id='527d46e4-fa50-4c04-b129-71891ed2cc0b'), AIMessage(content='', additional_kwargs={}, response_metadata={'model': 'llama3.1:8b', 'created_at': '2026-01-14T17:05:16.633234905Z', 'done': True, 'done_reason': 'stop', 'total_duration': 26095098311, 'load_duration': 146261775, 'prompt_eval_count': 158, 'prompt_eval_duration': 20692507367, 'eval_count': 17, 'eval_duration': 5186077199, 'logprobs': None, 'model_name': 'llama3.1:8b', 'model_provider': 'ollama'}, id='lc_run--fd164c3d-4e98-46a8-9088-ab8de2ec14ee-0', tool_calls=[{'name': 'get_weather', 'args': {'city': 'SF'}, 'id': '93d7b4cb-49fd-448e-b8c4-90601a988150', 'type': 'tool_call'}], usage_metadata={'input_tokens': 158, 'output_tokens': 17, 'total_tokens

In [9]:
for chunk in agent.stream(
    {"messages": [{"role": "user", "content": "What is the weather in SF?"}]},
    stream_mode=["custom"],
):
    print(chunk)

('custom', 'Looking up data for city: SF')
('custom', 'Acquired data for city: SF')


## Try different modes on your own!
Modify the stream modes and the selection condition to produce different results.

In [10]:
for chunk in agent.stream(
    {"messages": [{"role": "user", "content": "What is the weather in SF?"}]},
    stream_mode=["values", "custom"],
):
    if chunk[0] == "custom":
        print(chunk[1])

Looking up data for city: SF
Acquired data for city: SF
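
When several modes are requested, each chunk arrives as a `(mode, payload)` tuple, as the printed output above shows. Instead of a growing chain of `if` checks, one option is a small dispatcher that routes each payload to a handler. This is a sketch only: `dispatch`, `handlers`, and the fake chunks are hypothetical, and plain strings stand in for message objects.

```python
from typing import Any, Callable, Dict, Iterable, List, Tuple


def dispatch(
    stream: Iterable[Tuple[str, Any]],
    handlers: Dict[str, Callable[[Any], None]],
) -> None:
    """Route each (mode, payload) chunk to its handler as soon as it arrives."""
    for mode, payload in stream:
        handler = handlers.get(mode)
        if handler is not None:  # modes without a handler are ignored
            handler(payload)


progress: List[str] = []
state_sizes: List[int] = []

handlers = {
    # Custom payloads are whatever the tool passed to the writer
    "custom": progress.append,
    # Values payloads are full state snapshots; record how many messages each holds
    "values": lambda state: state_sizes.append(len(state["messages"])),
}

# Hypothetical chunks shaped like the output of stream_mode=["values", "custom"]
fake_chunks = [
    ("values", {"messages": ["What is the weather in SF?"]}),
    ("custom", "Looking up data for city: SF"),
    ("custom", "Acquired data for city: SF"),
    ("values", {"messages": ["What is the weather in SF?", "tool call"]}),
]
dispatch(fake_chunks, handlers)
print(progress)
```

Swapping the handler functions, or the list passed as `stream_mode`, is a quick way to experiment with different combinations.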
