# Streaming

<img src="./assets/LC_streaming.png">

Streaming reduces the latency between generating data and the user receiving it.
Two stream modes are frequently used with agents: `values` and `messages`.

## Setup

Load and check the required environment variables.

In [1]:
from dotenv import load_dotenv
from env_utils import doublecheck_env

# Load environment variables from .env
load_dotenv()

# Check and print results
doublecheck_env("example.env")

OLLAMA_HOST_URL=http://localhost:11434


In [2]:
from langchain.agents import create_agent
from langchain_ollama import ChatOllama
from langchain_core.messages import HumanMessage
import os

model = ChatOllama(
    model="granite4:latest",
    temperature=0,
    base_url=os.environ['OLLAMA_HOST_URL']
)

In [3]:
agent = create_agent(
    model=model,
    system_prompt="You are a full-stack comedian",
)

## No Streaming (invoke)

In [4]:
result = agent.invoke({"messages": [{"role": "user", "content": "Tell me a joke"}]})
# The last message in the returned state is the model's reply
print(result["messages"][-1].content)

Sure, here's one for you:

Why don't programmers like nature?

Because of all the bugs in source code. 

And if they do like it, why do they have trouble with recursion? Because to understand recursion, you need to understand recursion!


## values
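You have seen this streaming mode in our examples so far. In `values` mode, each chunk is the full agent state after a step, so the newest message is `step["messages"][-1]`.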

In [5]:
# stream_mode="values": each step yields the full state
for step in agent.stream(
    {"messages": [{"role": "user", "content": "Tell me a Dad joke"}]},
    stream_mode="values",
):
    step["messages"][-1].pretty_print()


Tell me a Dad joke

Sure, here's a classic dad joke for you:

Why don't scientists trust atoms?

Because they make up everything!
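
Because each `values` chunk carries the whole message list, a consumer that wants to show every message exactly once has to skip what it has already seen. The following is a minimal, library-free sketch of that idea. The helper name `new_messages` and the `fake_snapshots` data are hypothetical, and plain strings stand in for real message objects.

```python
from typing import Iterable, Iterator


def new_messages(snapshots: Iterable[dict]) -> Iterator[object]:
    """Yield each message once, given cumulative state snapshots."""
    seen = 0  # number of messages already yielded
    for state in snapshots:
        messages = state["messages"]
        # Each snapshot repeats the earlier messages; emit only the new tail.
        for message in messages[seen:]:
            yield message
        seen = max(seen, len(messages))


# Hypothetical snapshots shaped like chunks from stream_mode="values".
fake_snapshots = [
    {"messages": ["user: Tell me a Dad joke"]},
    {"messages": ["user: Tell me a Dad joke", "ai: Why don't scientists trust atoms?"]},
]

for message in new_messages(fake_snapshots):
    print(message)
```

With a real agent, the same helper could presumably be fed `agent.stream(..., stream_mode="values")` directly, since it only relies on the `"messages"` key.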


## messages
The `messages` mode streams data token by token, which gives the lowest latency of the modes shown here. This makes it well suited to interactive applications such as chatbots.

In [6]:
for token, metadata in agent.stream(
    {"messages": [{"role": "user", "content": "Write me a family friendly poem."}]},
    stream_mode="messages",
):
    print(token.content, end="", flush=True)

In the heart of a home, where laughter is key,
Lives a family so fun, you'd want to be free.
With love as their compass, and kindness their guide,
They navigate life with joy by their side.

Mom, she's the chef, with recipes divine,
Crafting meals that make your taste buds shine.
Dad, he's the mechanic, fixing things with care,
But when it comes to his family, there's no need for repair.

The kids, oh they're a bundle of energy and cheer,
With dreams as big as their love for them here.
They play games till dusk, under the starry night sky,
Their laughter echoes through every room, making everything right.

Grandma with her tales from days gone by,
Her wisdom like gold, shining bright in the eye.
She teaches lessons of life, wrapped in stories old,
Of courage and kindness, standing tall and bold.

And when storms come rolling, dark clouds may loom,
But together they stand strong, their love ever true.
For in this family, there's no room for strife,
Just understanding, respect, and endle
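
In an interactive application you often want to display tokens as they arrive and still keep the full text afterwards. The sketch below shows one way to do that, assuming the stream yields `(token, metadata)` pairs as in the loop above. The helper name `print_and_collect` and the `fake_pairs` data are hypothetical, and `SimpleNamespace` objects stand in for real message chunks so the example runs without a model.

```python
from types import SimpleNamespace
from typing import Iterable, Tuple


def print_and_collect(pairs: Iterable[Tuple[object, dict]]) -> str:
    """Print each token as it arrives and return the joined text."""
    parts = []
    for token, _metadata in pairs:
        text = token.content
        # Assumption: content is a plain string; skip empty or non-string chunks.
        if not isinstance(text, str) or not text:
            continue
        print(text, end="", flush=True)  # flush so the token shows up immediately
        parts.append(text)
    print()  # finish the line once the stream ends
    return "".join(parts)


# Hypothetical stand-ins for the (token, metadata) pairs of stream_mode="messages".
fake_pairs = [
    (SimpleNamespace(content="Roses "), {}),
    (SimpleNamespace(content=""), {}),
    (SimpleNamespace(content="are red."), {}),
]

full_text = print_and_collect(fake_pairs)
```

Skipping empty chunks is a design choice here: some chunks, such as those that only carry tool calls, may have no text to show.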

## Tools can stream too!
Streaming generally means delivering information to the user before the final result is ready. This is useful in many cases, such as reporting progress from inside a tool. The writer returned by `get_stream_writer` lets you stream `custom` data from sources you create.

In [7]:
from langchain.agents import create_agent
from langgraph.config import get_stream_writer


def get_weather(city: str) -> str:
    """Get weather for a given city."""
    writer = get_stream_writer()
    # stream any arbitrary data
    writer(f"Looking up data for city: {city}")
    writer(f"Acquired data for city: {city}")
    return f"It's always sunny in {city}!"


agent = create_agent(
    model=model,
    tools=[get_weather],
)

for chunk in agent.stream(
    {"messages": [{"role": "user", "content": "What is the weather in SF?"}]},
    stream_mode=["values", "custom"],
):
    print(chunk)

('values', {'messages': [HumanMessage(content='What is the weather in SF?', additional_kwargs={}, response_metadata={}, id='d7c32a76-1958-4875-b6dd-c6cc1543ee6e')]})
('values', {'messages': [HumanMessage(content='What is the weather in SF?', additional_kwargs={}, response_metadata={}, id='d7c32a76-1958-4875-b6dd-c6cc1543ee6e'), AIMessage(content='', additional_kwargs={}, response_metadata={'model': 'granite4:latest', 'created_at': '2025-10-22T10:56:36.586979Z', 'done': True, 'done_reason': 'stop', 'total_duration': 485040709, 'load_duration': 48817709, 'prompt_eval_count': 172, 'prompt_eval_duration': 182606583, 'eval_count': 27, 'eval_duration': 245904667, 'model_name': 'granite4:latest', 'model_provider': 'ollama'}, id='lc_run--edf721c9-b98e-478c-85a6-e58f5738ae2d-0', tool_calls=[{'name': 'get_weather', 'args': {'city': 'SF'}, 'id': 'ca2542ea-089a-4a0c-b4f2-c607a140a654', 'type': 'tool_call'}], usage_metadata={'input_tokens': 172, 'output_tokens': 27, 'total_tokens': 199})]})
('custo

In [8]:
for chunk in agent.stream(
    {"messages": [{"role": "user", "content": "What is the weather in SF?"}]},
    stream_mode=["custom"],
):
    print(chunk)

('custom', 'Looking up data for city: SF')
('custom', 'Acquired data for city: SF')


## Try different modes on your own!
Modify the stream mode and the selection condition (`chunk[0] == "custom"`) to produce different results.

In [9]:
for chunk in agent.stream(
    {"messages": [{"role": "user", "content": "What is the weather in SF?"}]},
    stream_mode=["values", "custom"],
):
    if chunk[0] == "custom":
        print(chunk[1])

Looking up data for city: SF
Acquired data for city: SF
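
When several modes are requested at once, each chunk arrives as a `(mode, payload)` tuple, as the outputs above show. One possible way to experiment with different selections is to route chunks to a handler per mode. This is a minimal sketch, not part of the LangChain API: the `dispatch` helper and the `fake_chunks` data are hypothetical, and plain strings stand in for message objects.

```python
from typing import Any, Callable, Dict, Iterable, Tuple


def dispatch(
    chunks: Iterable[Tuple[str, Any]],
    handlers: Dict[str, Callable[[Any], None]],
) -> Dict[str, int]:
    """Send each payload to the handler for its mode; return chunk counts per mode."""
    counts: Dict[str, int] = {}
    for mode, payload in chunks:
        counts[mode] = counts.get(mode, 0) + 1
        handler = handlers.get(mode)
        if handler is not None:  # modes without a handler are counted but ignored
            handler(payload)
    return counts


# Hypothetical chunks shaped like stream_mode=["values", "custom"] output.
fake_chunks = [
    ("values", {"messages": ["What is the weather in SF?"]}),
    ("custom", "Looking up data for city: SF"),
    ("custom", "Acquired data for city: SF"),
    ("values", {"messages": ["What is the weather in SF?", "It's always sunny in SF!"]}),
]

progress = []
counts = dispatch(fake_chunks, {"custom": progress.append})
print(progress)
print(counts)
```

Swapping the handler dictionary, for example adding a `"values"` entry, changes what is selected without touching the loop.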
