# Streaming

<img src="./assets/LC_streaming.png" width="400">

Streaming reduces the latency between generating data and the user receiving it.
Two stream modes are frequently used with agents: `values` and `messages`.
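To make the latency claim concrete, here is a minimal sketch in plain Python. It does not call any model: `fake_model` and `time_to_first_output` are hypothetical helpers invented for this illustration, and the per-token delay is an arbitrary assumption. It compares how long a caller waits before it has anything to display when it consumes the whole response versus only the first token.

```python
import time
from typing import Iterator


def fake_model(tokens: list[str], delay: float = 0.01) -> Iterator[str]:
    """Hypothetical stand-in for a model: yields one token every `delay` seconds."""
    for tok in tokens:
        time.sleep(delay)
        yield tok


def time_to_first_output(tokens: list[str], streaming: bool) -> float:
    """Return the seconds elapsed until the caller has something to show."""
    start = time.perf_counter()
    gen = fake_model(tokens)
    if streaming:
        next(gen)      # the first token can be displayed right away
    else:
        "".join(gen)   # the caller must wait for the complete response
    return time.perf_counter() - start


tokens = ["Why ", "did ", "the ", "chicken ", "cross ", "the ", "road?"]
blocking = time_to_first_output(tokens, streaming=False)
streamed = time_to_first_output(tokens, streaming=True)
print(f"blocking: {blocking:.3f}s, streamed: {streamed:.3f}s")
```

The total generation time is the same in both cases; streaming only changes how soon the first piece reaches the user.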

## Setup

Load and check the required environment variables.

In [2]:
from dotenv import load_dotenv
from env_utils import doublecheck_env

# Load environment variables from .env
load_dotenv()

# Check and print results
doublecheck_env("example.env")

OPENAI_API_KEY=<not set>
LANGSMITH_API_KEY=****4abe
LANGSMITH_TRACING=true
LANGSMITH_PROJECT=****ad-6


In [3]:
from langchain.agents import create_agent

In [4]:
agent = create_agent(
    model="ollama:gpt-oss:20b",
    system_prompt="You are a full-stack comedian",
)

## No Streaming (invoke)

In [5]:
result = agent.invoke({"messages": [{"role": "user", "content": "Tell me a joke"}]})
print(result["messages"][-1].content)  # the last message is the agent's reply

**Why did the full‑stack developer break up with the database?**  

Because every time they tried to commit, the DB replied, *“I can’t even.”*  

And the developer, still hopeful, said, “You’re just *schema* away from my heart!”  

— *Now that’s a classic relational romance.*


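## values
The `values` mode emits the full agent state after each step, so each iteration of the loop prints the newest message in that state. You have seen this streaming mode in our examples so far.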

In [6]:
# Stream = values
for step in agent.stream(
    {"messages": [{"role": "user", "content": "Tell me a Dad joke"}]},
    stream_mode="values",
):
    step["messages"][-1].pretty_print()


Tell me a Dad joke

Why don't skeletons fight each other?  

They don't have the guts! 😄


## messages
The `messages` mode streams data token by token, which gives the lowest latency of the modes shown here. This makes it a good fit for interactive applications like chatbots.

In [7]:
for token, metadata in agent.stream(
    {"messages": [{"role": "user", "content": "Write me a family friendly poem."}]},
    stream_mode="messages",
):
    print(f"{token.content}", end="")

**The Great Family Picnic Fiasco**

We packed a picnic basket, full of treats,  
A sandwich, a cookie, a jar of sweets.  
Grandpa said, “Let’s spread out on the green,  
And watch the clouds float on a lazy stream.”

The kids all ran, the dog barked loud,  
The sandwich fell to the proud lawn.  
Mama’s “cherry pie” was a perfect sphere,  
But it landed on the cat, who didn’t care!

A bird flew in, ate crumbs from the plate,  
The kids giggled as it chirped, “That’s great!”  
The dog, it chased the bird around,  
While Grandpa laughed and said, “I’m still sound.”

We found a pond that was oddly cold,  
The kids splashed in, so brave and bold.  
Mama whispered, “Let’s not get wet,  
We’re just a family, not a pet.”

The sun began to set, the sky turned gold,  
We gathered back in the evening’s fold.  
We laughed about the picnic’s chaotic fun,  
And promised next time to just use the one—  

The umbrella!  
So the next picnic, we’ll stay dry,  
But the memories—yes—will never die.  

---
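The loop above prints each token as it arrives. Applications often also need the complete text once streaming ends, for example to store it. The sketch below is one possible way to do that. `collect_text` is a hypothetical helper, the `SimpleNamespace` objects are stand-ins for the streamed message chunks, and it assumes `content` is a plain string (chunks whose content is empty or not a string are skipped).

```python
from types import SimpleNamespace
from typing import Any, Iterable, Tuple


def collect_text(stream: Iterable[Tuple[Any, dict]]) -> str:
    """Join the text of streamed (token, metadata) pairs, skipping empty chunks."""
    parts = []
    for token, _metadata in stream:
        content = getattr(token, "content", "")
        # Assumption: only plain-string content is handled in this sketch.
        if isinstance(content, str) and content:
            parts.append(content)
    return "".join(parts)


# Hypothetical stand-ins shaped like the (token, metadata) pairs in the loop above.
fake_stream = [
    (SimpleNamespace(content="Roses "), {}),
    (SimpleNamespace(content=""), {}),  # a chunk with no text
    (SimpleNamespace(content="are red."), {}),
]
full_text = collect_text(fake_stream)
print(full_text)
```

With a real agent, the same helper could in principle be passed the iterator returned by `agent.stream(..., stream_mode="messages")`, at the cost of no longer displaying tokens as they arrive unless you also print inside the loop.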

## Tools can stream too!
Streaming generally means delivering information to the user before the final result is ready. This is useful in many cases, such as reporting progress from a long-running tool. The writer returned by `get_stream_writer` lets you stream `custom` data from sources you create.

In [9]:
from langchain.agents import create_agent
from langgraph.config import get_stream_writer


def get_weather(city: str) -> str:
    """Get weather for a given city."""
    writer = get_stream_writer()
    # stream any arbitrary data
    writer(f"Looking up data for city: {city}")
    writer(f"Acquired data for city: {city}")
    return f"It's always sunny in {city}!"


agent = create_agent(
    model="ollama:gpt-oss:20b",
    tools=[get_weather],
)

for chunk in agent.stream(
    {"messages": [{"role": "user", "content": "What is the weather in SF?"}]},
    stream_mode=["values", "custom"],
):
    print(chunk)

('values', {'messages': [HumanMessage(content='What is the weather in SF?', additional_kwargs={}, response_metadata={}, id='c793f77f-b494-41b0-941f-dc60d9a88739')]})
('values', {'messages': [HumanMessage(content='What is the weather in SF?', additional_kwargs={}, response_metadata={}, id='c793f77f-b494-41b0-941f-dc60d9a88739'), AIMessage(content='', additional_kwargs={}, response_metadata={'model': 'gpt-oss:20b', 'created_at': '2025-10-28T10:07:50.099134Z', 'done': True, 'done_reason': 'stop', 'total_duration': 7335305333, 'load_duration': 163918833, 'prompt_eval_count': 128, 'prompt_eval_duration': 5358105750, 'eval_count': 113, 'eval_duration': 1776516955, 'model_name': 'gpt-oss:20b', 'model_provider': 'ollama'}, id='lc_run--37dfbec9-9e37-4b83-927b-67e3864bb3c8-0', tool_calls=[{'name': 'get_weather', 'args': {'city': 'San Francisco'}, 'id': 'e0dca267-5e46-4332-aca2-1ee7e54c4b21', 'type': 'tool_call'}], usage_metadata={'input_tokens': 128, 'output_tokens': 113, 'total_tokens': 241})]}

In [10]:
for chunk in agent.stream(
    {"messages": [{"role": "user", "content": "What is the weather in SF?"}]},
    stream_mode=["custom"],
):
    print(chunk)

('custom', 'Looking up data for city: SF')
('custom', 'Acquired data for city: SF')
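As the outputs above show, passing a list to `stream_mode` makes each chunk a `(mode, payload)` tuple. A small dispatcher keeps the display logic for each mode in one place. The sketch below is illustrative only: `render_chunk` and its `[progress]` and `[state]` labels are hypothetical, and the fake chunks use `SimpleNamespace` in place of real message objects so the example runs without a model.

```python
from types import SimpleNamespace
from typing import Any, Tuple


def render_chunk(chunk: Tuple[str, Any]) -> str:
    """Turn one (mode, payload) chunk into a line of display text."""
    mode, payload = chunk
    if mode == "custom":
        # Payload is whatever the tool passed to the stream writer.
        return f"[progress] {payload}"
    if mode == "values":
        # Payload is the full state; show only the newest message.
        return f"[state] {payload['messages'][-1].content}"
    # Fallback for modes this sketch does not handle specifically.
    return f"[{mode}] {payload!r}"


# Hypothetical chunks shaped like the printed output above.
fake_chunks = [
    ("custom", "Looking up data for city: SF"),
    ("values", {"messages": [SimpleNamespace(content="It's always sunny in SF!")]}),
]
rendered = [render_chunk(c) for c in fake_chunks]
for line in rendered:
    print(line)
```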


## Try different modes on your own!
Modify the stream mode and the selection logic in the loop to produce different results.

In [12]:
for chunk in agent.stream(
    {"messages": [{"role": "user", "content": "What is the weather in SF?"}]},
    stream_mode=["values"],
):
    if chunk[0] == "custom":
        print(chunk[1])
    if chunk[0] == "values":
        print(chunk[1]["messages"][-1].content, end="")

What is the weather in SF?It's always sunny in San Francisco!I’m sorry for the mistake earlier. The function call you saw didn’t return the actual current weather—it just gave a generic “always sunny” response.

I don’t have real‑time data in this chat, but here’s what you can do to get an up‑to‑date forecast for San Francisco:

| Method | How to use it |
|--------|---------------|
| **Weather app on your phone** | Open the built‑in weather app or a third‑party app (e.g., AccuWeather, Weather Underground). |
| **Web search** | Go to a search engine and type “San Francisco weather” to see the latest forecast, temperature, humidity, wind, and alerts. |
| **Voice assistants** | Ask Siri, Google Assistant, or Alexa “What’s the weather in San Francisco?” |
| **Local news** | Check the local news station’s website or channel for a weather segment. |

If you’d like me to provide the *average* conditions for this time of year: early October in San Francisco typically sees temperatures ranging 