# Streaming

<img src="./assets/LC_streaming.png" width="400">

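Streaming reduces the latency between generating data and the user receiving it.
Two stream modes are frequently used with agents: `values`, which emits the full state after each step, and `messages`, which emits LLM tokens as they are generated.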

## Setup

Load and check the required environment variables.

In [None]:
from dotenv import load_dotenv
from env_utils import doublecheck_env

# Load environment variables from .env
load_dotenv()

# Check and print results
doublecheck_env("example.env")

OPENAI_API_KEY=****eJgA
LANGSMITH_API_KEY=****2eed
LANGSMITH_TRACING=true
LANGSMITH_PROJECT=****ject


In [None]:
!pip install -U langchain langchain-core langchain-community langchain-openai langgraph langchain-groq groq python-dotenv



In [None]:
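import getpass
import os

# Prompt for the key only when it is not already set (e.g. loaded from .env above)
if not os.environ.get("GROQ_API_KEY"):
    os.environ["GROQ_API_KEY"] = getpass.getpass("Enter your Groq API key: ")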

In [None]:
from langchain_groq import ChatGroq

llm = ChatGroq(
    model="openai/gpt-oss-120b",
    temperature=0,
    max_tokens=None,
    timeout=None,
    max_retries=2,
    # other params...
)

In [None]:
from langchain.agents import create_agent

In [None]:
agent = create_agent(
    model=llm,
    system_prompt="You are a full-stack comedian",
)

## No Streaming (invoke)

In [None]:
result = agent.invoke({"messages": [{"role": "user", "content": "Tell me a joke"}]})
print(result["messages"][-1].content)

Sure, here’s a full‑stack joke that compiles on both the front‑end and the back‑end:

> **Why did the full‑stack developer break up with the database?**  
> Because every time they tried to *commit*, the DB kept *rolling back* their feelings, and the UI kept *rendering* a “404 Not Found” error for love. 

*(Bonus: The only thing that stayed consistent was the infinite loop of “I’ll fix it in production.”)*


## values
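You have seen this streaming mode in our examples so far. The `values` mode emits the full agent state after each step, so the last message in each chunk is the newest one.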

In [None]:
# Stream = values
for step in agent.stream(
    {"messages": [{"role": "user", "content": "Tell me a Dad joke"}]},
    stream_mode="values",
):
    step["messages"][-1].pretty_print()


Tell me a Dad joke

Why did the scarecrow become a successful software engineer?

Because he was outstanding in his field… and he finally learned how to handle *bugs* without getting *corn*-fused!
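
To make the shape of `values` output concrete, here is a minimal toy sketch in plain Python. It is not LangGraph's implementation: `stream_values` is a hypothetical stand-in, and plain strings replace message objects. It only illustrates that each chunk is a snapshot of the whole state rather than a delta, which is why the loop above reads `step["messages"][-1]`.

```python
from typing import Dict, Iterator, List


def stream_values(initial_messages: List[str], replies: List[str]) -> Iterator[Dict[str, List[str]]]:
    """Toy stand-in for stream_mode="values" (hypothetical helper, not a library API)."""
    state = {"messages": list(initial_messages)}
    # The first chunk is a snapshot of the input state.
    yield {"messages": list(state["messages"])}
    for reply in replies:
        state["messages"].append(reply)
        # Every later chunk is again a full snapshot, not just the new message.
        yield {"messages": list(state["messages"])}


snapshots = list(
    stream_values(
        ["Tell me a Dad joke"],
        ["Why did the scarecrow become a successful software engineer?"],
    )
)

for snap in snapshots:
    # Print only the newest message in each snapshot.
    print(snap["messages"][-1])
```

Because each snapshot repeats the earlier messages, printing whole chunks grows quickly; printing only the last message keeps the output readable.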


## messages
The `messages` mode streams LLM output token by token, which gives the lowest latency of the modes shown here. Each item is a `(token, metadata)` pair. This is well suited to interactive applications like chatbots.

In [None]:
for token, metadata in agent.stream(
    {"messages": [{"role": "user", "content": "Write me a family friendly poem."}]},
    stream_mode="messages",
):
    print(token.content, end="", flush=True)

**A Day of Sunshine and Giggles**

When morning wakes with golden light,  
The world stretches, “Good‑morning, bright!”  
Birds chirp a tune, the sky turns blue,  
And every day feels fresh and new.

We gather round the kitchen table,  
With pancakes stacked—so tall, so stable!  
A drizzle of syrup, a smile so wide,  
Mom’s secret recipe, love inside.

The garden calls with whispering leaves,  
A playground for the buzzing bees.  
We plant a seed, we water, we wait—  
Soon sprouts a sprout, it’s never too late!

Out in the park, the swings go “whoosh,”  
Laughter bubbles, a joyful swoosh.  
We chase the clouds, we count the stars,  
And share a joke that’s never far.

When evening paints the world in gold,  
We snuggle close, the stories told.  
A bedtime hug, a kiss, a sigh—  
Dreams of tomorrow drifting by.

So here’s a toast to simple cheer,  
To family moments we hold dear.  
May every day be bright and sweet,  
With love and giggles at our feet.
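
In a chat application you often want to show tokens as they arrive and also keep the complete text. The sketch below is one way to do that, assuming each token exposes a string `.content` attribute as in the loop above. `collect_text` is a hypothetical helper, and the fake stream built from `SimpleNamespace` objects stands in for `agent.stream(..., stream_mode="messages")` so the example runs without an API key.

```python
from types import SimpleNamespace
from typing import Any, Dict, Iterable, Tuple


def collect_text(stream: Iterable[Tuple[Any, Dict[str, Any]]]) -> str:
    """Print tokens as they arrive and return the accumulated text."""
    parts = []
    for token, metadata in stream:  # metadata is unused in this sketch
        # Some chunks can carry empty content; skip them.
        if token.content:
            print(token.content, end="", flush=True)
            parts.append(token.content)
    print()  # finish the line once the stream ends
    return "".join(parts)


# Fake (token, metadata) pairs; assumption: real tokens have string content.
fake_stream = [
    (SimpleNamespace(content="Hello"), {}),
    (SimpleNamespace(content=""), {}),
    (SimpleNamespace(content=", world"), {}),
]

full_text = collect_text(fake_stream)
```

With a real agent, the same helper could be called as `collect_text(agent.stream(..., stream_mode="messages"))`, provided the model returns plain string content.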

## Tools can stream too!
Streaming generally means delivering information to the user before the final result is ready, which is useful in many cases, such as reporting progress from a tool. Calling `get_stream_writer()` inside a tool returns a writer that lets you stream `custom` data from sources you create.

In [None]:
from langchain.agents import create_agent
from langgraph.config import get_stream_writer


def get_weather(city: str) -> str:
    """Get weather for a given city."""
    writer = get_stream_writer()
    # stream any arbitrary data
    writer(f"Looking up data for city: {city}")
    writer(f"Acquired data for city: {city}")
    return f"It's always sunny in {city}!"


agent = create_agent(
    model=llm,
    tools=[get_weather],
)

for chunk in agent.stream(
    {"messages": [{"role": "user", "content": "What is the weather in SF?"}]},
    stream_mode=["values", "custom"],
):
    print(chunk)

('values', {'messages': [HumanMessage(content='What is the weather in SF?', additional_kwargs={}, response_metadata={}, id='310e1b20-a616-465b-9d3b-45fa5f653ba0')]})
('values', {'messages': [HumanMessage(content='What is the weather in SF?', additional_kwargs={}, response_metadata={}, id='310e1b20-a616-465b-9d3b-45fa5f653ba0'), AIMessage(content='', additional_kwargs={'reasoning_content': 'User asks: "What is the weather in SF?" Likely San Francisco. Need to get weather via function. Use get_weather with city "San Francisco".', 'tool_calls': [{'id': 'fc_2b4fa69f-831a-4a59-acdf-3e6cff170390', 'function': {'arguments': '{"city":"San Francisco"}', 'name': 'get_weather'}, 'type': 'function'}]}, response_metadata={'token_usage': {'completion_tokens': 61, 'prompt_tokens': 127, 'total_tokens': 188, 'completion_time': 0.126888231, 'completion_tokens_details': {'reasoning_tokens': 33}, 'prompt_time': 0.005084894, 'prompt_tokens_details': None, 'queue_time': 0.018799167, 'total_time': 0.13197312

In [None]:
for chunk in agent.stream(
    {"messages": [{"role": "user", "content": "What is the weather in SF?"}]},
    stream_mode=["custom"],
):
    print(chunk)

('custom', 'Looking up data for city: San Francisco')
('custom', 'Acquired data for city: San Francisco')


## Try different modes on your own!
Modify the stream modes and the filter condition to produce different results.

In [None]:
for chunk in agent.stream(
    {"messages": [{"role": "user", "content": "What is the weather in SF?"}]},
    stream_mode=["values", "custom"],
):
    if chunk[0] == "custom":
        print(chunk[1])
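
The `if chunk[0] == "custom"` filter above can be generalized into a small dispatcher. This is a sketch, assuming chunks arrive as `(mode, payload)` tuples, as they do in the outputs above when `stream_mode` is a list. `dispatch_chunks` is a hypothetical helper, and the fake chunks stand in for a real `agent.stream(...)` call.

```python
from typing import Any, Callable, Dict, Iterable, Tuple


def dispatch_chunks(
    chunks: Iterable[Tuple[str, Any]],
    handlers: Dict[str, Callable[[Any], None]],
) -> Dict[str, int]:
    """Send each payload to the handler for its mode; return per-mode counts."""
    counts: Dict[str, int] = {}
    for mode, payload in chunks:
        handler = handlers.get(mode)
        if handler is None:
            continue  # ignore modes without a registered handler
        handler(payload)
        counts[mode] = counts.get(mode, 0) + 1
    return counts


# Fake chunks shaped like the multi-mode output shown earlier.
fake_chunks = [
    ("values", {"messages": ["What is the weather in SF?"]}),
    ("custom", "Looking up data for city: San Francisco"),
    ("custom", "Acquired data for city: San Francisco"),
]

progress = []
counts = dispatch_chunks(fake_chunks, {"custom": progress.append})
print(progress)
print(counts)
```

Registering a second handler under `"values"` would process the state snapshots as well, without changing the loop.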