# Streaming

Streaming is an important UX consideration for LLM apps, and agents are no exception. Streaming with agents is more complicated because you may want to stream not only tokens, but also the intermediate steps an agent takes.

Our agent will use the OpenAI tools API for tool invocation, and we'll provide the agent with two tools:

1. `where_cat_is_hiding`: A tool that uses an LLM to tell us where the cat is hiding
2. `tell_me_a_joke`: A tool that can use an LLM to tell a joke about the given topic

In this notebook, we'll see how to use `.stream` to stream action / observation pairs, and then we'll see how to use `.astream_log` to stream LLM output token by token, including from within the underlying tools.

In [1]:
from typing import TYPE_CHECKING, Any, Dict, List, Optional, Sequence, TypeVar, Union
from uuid import UUID

from langchain import agents, hub
from langchain.prompts import ChatPromptTemplate
from langchain.tools import tool
from langchain_core.agents import AgentAction, AgentFinish
from langchain_core.callbacks import Callbacks
from langchain_core.callbacks.base import AsyncCallbackHandler
from langchain_core.documents import Document
from langchain_core.messages import BaseMessage
from langchain_core.outputs import ChatGenerationChunk, GenerationChunk, LLMResult
from langchain_openai import ChatOpenAI

## Create the model

**Attention** For older versions of langchain, we must set `streaming=True`

In [2]:
model = ChatOpenAI(temperature=0, streaming=True)

## Tools

We define two tools that rely on a chat model to generate output!

Please note a few different things:

1. We invoke the model using `.stream()` to force the output to stream (unfortunately, for older langchain versions you should still set `streaming=True` on the model)
2. We attach tags to the model so that we can filter on those tags when using `astream_log`

In [46]:
@tool
def where_cat_is_hiding(callbacks: Callbacks) -> str:  # <--- Accept callbacks
    """Where is the cat hiding right now?"""
    chunks = list(
        model.stream(
            "Give one up to three word answer about where the cat might be hiding in the house right now.",
            {
                "tags": ["hiding_spot"],
                "callbacks": callbacks,
            },  # <--- Propagate callbacks and assign a tag to this model
        )
    )
    return "".join(chunk.content for chunk in chunks)


@tool
def tell_me_a_joke(
    topic: str, callbacks: Callbacks
) -> str:  # <--- Accept callbacks
    """Tell a joke about a given topic."""
    template = ChatPromptTemplate.from_messages(
        [
            ("system", "You are a comedian."),
            ("human", "Tell me a joke about {topic}"),
        ]
    )
    chain = template | model.with_config({"tags": ["joke"]})  # <--- Assign a tag to this model
    chunks = list(
        chain.stream({"topic": topic}, {"callbacks": callbacks})
    )  # <--- Propagate callbacks
    return "".join(chunk.content for chunk in chunks)

## Initialize the agent

Here, we'll initialize an OpenAI tools agent.

In [47]:
# Get the prompt to use - you can modify this!
prompt = hub.pull("hwchase17/openai-tools-agent")
# print(prompt.messages) -- to see the prompt
tools = [tell_me_a_joke, where_cat_is_hiding]
agent = agents.create_openai_tools_agent(
    model.with_config({"tags": ["agent"]}), tools, prompt
)
agent_executor = agents.AgentExecutor(agent=agent, tools=tools)

## Stream Intermediate Steps

We'll use the `.stream` method of the AgentExecutor to stream the agent's intermediate steps.

The output from `.stream` alternates between (action, observation) pairs, finally concluding with the answer if the agent achieved its objective. 

It'll look like this:

1. actions output
2. observations output
3. actions output
4. observations output

**... (continue until goal is reached) ...**

Then, if the final goal is reached, the agent will output the **final answer**.


The contents of these outputs are summarized here:

| Output             | Contents                                                                                          |
|----------------------|------------------------------------------------------------------------------------------------------|
| **Actions**   |  <ul> <li> `actions` `AgentAction` or a subclass </li><li> `messages` chat messages corresponding to action invocation </li></ul> |
| **Observations** | <ul> <li> `steps` History of what the agent did so far, including the current action and its observation </li><li> `messages` chat message with function invocation results (aka observations) </li></ul>|
| **Final answer** | <ul> <li> `output` the final answer string returned via `AgentFinish`  </li><li> `messages` chat messages with the final output </li></ul>|

In [48]:
# Note: We use `pprint` to print only to depth 1, it makes it easier to see the output from a high level, before digging in.
import pprint

chunks = []

for chunk in agent_executor.stream({"input": "where is the cat hiding, tell me a joke about that place?"}):
    chunks.append(chunk)
    print("------")
    pprint.pprint(chunk, depth=1)

------
{'actions': [...], 'messages': [...]}
------
{'messages': [...], 'steps': [...]}
------
{'actions': [...], 'messages': [...]}
------
{'messages': [...], 'steps': [...]}
------
{'messages': [...],
 'output': "The cat is hiding under the bed. Here's a joke about that place: \n"
           '\n'
           'Why did the monster put a mattress under his bed? Because he '
           'wanted to sleep tight and scare right!'}


### Using Messages

You can access the underlying `messages` from the outputs. Using messages can be nice when working with chat applications - because everything is a message!

In [6]:
chunks[0]['actions']

[OpenAIToolAgentAction(tool='where_cat_is_hiding', tool_input={}, log='\nInvoking: `where_cat_is_hiding` with `{}`\n\n\n', message_log=[AIMessageChunk(content='', additional_kwargs={'tool_calls': [{'index': 0, 'id': 'call_poauqsU3KMy9EFhqgi1ZNBg0', 'function': {'arguments': '{}', 'name': 'where_cat_is_hiding'}, 'type': 'function'}]})], tool_call_id='call_poauqsU3KMy9EFhqgi1ZNBg0')]

In [7]:
for chunk in chunks:
    print(chunk['messages'])

[AIMessageChunk(content='', additional_kwargs={'tool_calls': [{'index': 0, 'id': 'call_poauqsU3KMy9EFhqgi1ZNBg0', 'function': {'arguments': '{}', 'name': 'where_cat_is_hiding'}, 'type': 'function'}]})]
[FunctionMessage(content='Under the bed.', name='where_cat_is_hiding')]
[AIMessageChunk(content='', additional_kwargs={'tool_calls': [{'index': 0, 'id': 'call_BO6giBvzFzUnZwer7ep1Q5T1', 'function': {'arguments': '{\n  "topic": "under the bed"\n}', 'name': 'tell_me_a_joke'}, 'type': 'function'}]})]
[FunctionMessage(content='Why did the scarecrow bring a ladder under the bed?\n\nBecause it heard there was a "bed spring" party going on!', name='tell_me_a_joke')]
[AIMessage(content='The cat is hiding under the bed. Here\'s a joke about that place: \n\nWhy did the scarecrow bring a ladder under the bed? Because it heard there was a "bed spring" party going on!')]


In addition, they contain full logging information (`actions` and `steps`) which may be easier to process for rendering purposes.
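Returning to the `messages` key: because every chunk carries one, the chunks can be flattened into a single chat transcript. The sketch below is only an illustration. It uses plain dictionaries with `(role, content)` tuples as stand-ins for the real message objects so that it runs without an API key, and `flatten_messages` is a hypothetical helper name, not part of LangChain.

```python
from typing import Any, Dict, List


def flatten_messages(chunks: List[Dict[str, Any]]) -> List[Any]:
    """Concatenate the `messages` list of every streamed chunk, in order."""
    history: List[Any] = []
    for chunk in chunks:
        # Action, observation, and final-answer chunks all have `messages`.
        history.extend(chunk.get("messages", []))
    return history


# Stand-in chunks: (role, content) tuples replace real message objects.
demo_chunks = [
    {"actions": ["..."], "messages": [("ai", "")]},
    {"steps": ["..."], "messages": [("function", "Under the bed.")]},
    {
        "output": "The cat is hiding under the bed.",
        "messages": [("ai", "The cat is hiding under the bed.")],
    },
]

transcript = flatten_messages(demo_chunks)
for role, content in transcript:
    print(f"{role}: {content!r}")
```

With the real `chunks` list collected earlier, the same helper would return the message objects themselves, which a chat UI could then render one by one.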

### Using AgentAction/Observation

The outputs also contain richer structured information inside of `actions` and `steps`, which could be useful in some situations, but can also be harder to parse.

In [8]:
for chunk in agent_executor.stream({"input": "tell me a joke about the place where the cat is hiding"}):
    # Agent Action
    if "actions" in chunk:
        for action in chunk["actions"]:
            print(
                f"Calling Tool: `{action.tool}` with input `{action.tool_input}`"
            )
    # Observation
    elif "steps" in chunk:
        for step in chunk["steps"]:
            print(f"Tool Result: `{step.observation}`")
    # Final result
    elif "output" in chunk:
        print(f'Final Output: {chunk["output"]}')
    else:
        raise ValueError()
    print('---')

Calling Tool: `where_cat_is_hiding` with input `{}`
---
Tool Result: `Under the bed.`
---
Calling Tool: `tell_me_a_joke` with input `{'topic': 'under the bed'}`
---
Tool Result: `Why did the scarecrow bring a ladder under the bed?

Because it heard there was a "bed spring" party going on!`
---
Final Output: Why did the scarecrow bring a ladder under the bed? Because it heard there was a "bed spring" party going on!
---


## Streaming Tokens & More

For some applications, you may want to stream individual LLM tokens, surface information about tool execution, or output custom messages before / after tool executions.

You can do all of that with the [astream_log](https://python.langchain.com/docs/expression_language/interface#async-stream-intermediate-steps) method!

**astream_log** gives a granular log of all the events that occur during the execution of the agent, allowing you to write code that can act on these events.

The log format is based on the [JSONPatch](https://jsonpatch.com/) standard.
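To make the patch format concrete, here is a minimal sketch of how a sequence of JSONPatch operations builds up a state object. It is a simplified stand-in written for illustration, not LangChain's implementation: the hypothetical `apply_ops` helper handles only `add` and `replace`, and it ignores JSON Pointer escaping and list indices other than `-`.

```python
from typing import Any, Dict, List


def apply_ops(state: Any, ops: List[Dict[str, Any]]) -> Any:
    """Apply simplified JSONPatch `add`/`replace` ops and return the state."""
    for op in ops:
        kind, path = op["op"], op["path"]
        if kind not in ("add", "replace"):
            raise NotImplementedError(kind)
        if path == "":
            # The empty path addresses the whole document.
            state = op["value"]
            continue
        # "/logs/ChatOpenAI" -> parents ["logs"], last "ChatOpenAI"
        *parents, last = path[1:].split("/")
        target = state
        for key in parents:
            target = target[key]
        if last == "-":
            # "-" means "append to the end of the list".
            target.append(op["value"])
        else:
            target[last] = op["value"]
    return state


state = apply_ops(
    None,
    [
        {"op": "replace", "path": "", "value": {"logs": {}, "streamed_output": []}},
        {"op": "add", "path": "/logs/ChatOpenAI", "value": {"streamed_output_str": []}},
        {"op": "add", "path": "/logs/ChatOpenAI/streamed_output_str/-", "value": "Under"},
        {"op": "add", "path": "/logs/ChatOpenAI/streamed_output_str/-", "value": " the"},
    ],
)
print(state["logs"]["ChatOpenAI"]["streamed_output_str"])  # ['Under', ' the']
```

The first operation replaces the root, the second registers a new run under `/logs/`, and the `/-` operations append streamed tokens to that run.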


**ATTENTION** Make sure that you set `streaming=True` on the LLM.

In [143]:
log_patches = [
    log_patch
    async for log_patch in agent_executor.astream_log(
        {"input": "tell me a joke about the place where the cat is hiding"},
        include_types=["tool", "llm"],
    )
]

In [159]:
def compute_state(log_patches):
    state = None
    for log_patch in log_patches:
        if state is None:
            state = log_patch
        else:
            state = state + log_patch
    return state

In [141]:
from typing import Iterable

In [150]:
'/a/b'.split('/')

['', 'a', 'b']

In [158]:
log_patches

[RunLogPatch({'op': 'replace',
   'path': '',
   'value': {'final_output': None,
             'id': '5e902c1a-3ac9-4f7b-8434-19585e75790c',
             'logs': {},
             'streamed_output': []}}),
 RunLogPatch({'op': 'add',
   'path': '/logs/ChatOpenAI',
   'value': {'end_time': None,
             'final_output': None,
             'id': '64dcfb63-4879-47a6-8173-f12f3baaf0f5',
             'metadata': {},
             'name': 'ChatOpenAI',
             'start_time': '2024-01-12T04:19:06.974+00:00',
             'streamed_output': [],
             'streamed_output_str': [],
             'tags': ['seq:step:3', 'agent'],
             'type': 'llm'}}),
 RunLogPatch({'op': 'add', 'path': '/logs/ChatOpenAI/streamed_output_str/-', 'value': ''},
  {'op': 'add',
   'path': '/logs/ChatOpenAI/streamed_output/-',
   'value': AIMessageChunk(content='', additional_kwargs={'tool_calls': [{'index': 0, 'id': 'call_12TnV8jJkeDsjQuvxBrPtV7o', 'function': {'arguments': '', 'name': 'where_cat_is_hid

In [157]:
class EventGenerator:
    """Yield one summary dict for every run that is added to the log."""

    def __init__(self, run_log_patch: Iterable):
        self.log_patch_stream = run_log_patch
        self.step_info = {}

    def yield_events(self):
        for run_log_patch in self.log_patch_stream:
            for op in run_log_patch.ops:
                # Only "add" operations introduce new log entries.
                if op["op"] != "add":
                    continue

                path = op["path"]

                # Run entries live directly under /logs/.
                if not path.startswith("/logs/"):
                    continue

                rest = path[len("/logs/"):]

                # A deeper path (e.g. .../streamed_output/-) updates an
                # existing run rather than starting a new one.
                if "/" in rest:
                    continue

                info = {
                    "tags": op["value"].get("tags", []),
                    "type": op["value"].get("type", None),
                    "name": op["value"].get("name", None),
                }
                yield info


s = EventGenerator(log_patches)

list(s.yield_events())

[{'tags': ['seq:step:3', 'agent'], 'type': 'llm', 'name': 'ChatOpenAI'},
 {'tags': [], 'type': 'tool', 'name': 'where_cat_is_hiding'},
 {'tags': ['hiding_spot'], 'type': 'llm', 'name': 'ChatOpenAI'},
 {'tags': ['seq:step:3', 'agent'], 'type': 'llm', 'name': 'ChatOpenAI'},
 {'tags': [], 'type': 'tool', 'name': 'tell_me_a_joke'},
 {'tags': ['seq:step:2', 'joke'], 'type': 'llm', 'name': 'ChatOpenAI'},
 {'tags': ['seq:step:3', 'agent'], 'type': 'llm', 'name': 'ChatOpenAI'}]
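Since each event carries the tags attached earlier, separating the agent's own LLM calls from the LLM calls made inside tools is a simple filter. The sketch below copies the event list from the output above; `events_with_tag` is a hypothetical helper name.

```python
from typing import Any, Dict, List

# Events copied from the output above.
events: List[Dict[str, Any]] = [
    {"tags": ["seq:step:3", "agent"], "type": "llm", "name": "ChatOpenAI"},
    {"tags": [], "type": "tool", "name": "where_cat_is_hiding"},
    {"tags": ["hiding_spot"], "type": "llm", "name": "ChatOpenAI"},
    {"tags": ["seq:step:3", "agent"], "type": "llm", "name": "ChatOpenAI"},
    {"tags": [], "type": "tool", "name": "tell_me_a_joke"},
    {"tags": ["seq:step:2", "joke"], "type": "llm", "name": "ChatOpenAI"},
    {"tags": ["seq:step:3", "agent"], "type": "llm", "name": "ChatOpenAI"},
]


def events_with_tag(events: List[Dict[str, Any]], tag: str) -> List[Dict[str, Any]]:
    """Return the events whose `tags` list contains `tag`."""
    return [event for event in events if tag in event["tags"]]


agent_events = events_with_tag(events, "agent")
tool_llm_events = events_with_tag(events, "hiding_spot") + events_with_tag(events, "joke")
tool_events = [event for event in events if event["type"] == "tool"]

print(len(agent_events), len(tool_llm_events), len(tool_events))  # 3 2 2
```

All three LLM runs share the name `ChatOpenAI`, so the tags are what make it possible to tell them apart.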


In [139]:
compute_state(log_patches[:30]).state['logs']

{'ChatOpenAI': {'id': '9f365199-896c-4c42-b7d9-aebe7f0f24ee',
  'name': 'ChatOpenAI',
  'type': 'llm',
  'tags': ['seq:step:3', 'agent'],
  'metadata': {},
  'start_time': '2024-01-12T04:09:44.036+00:00',
  'streamed_output': [AIMessageChunk(content='', additional_kwargs={'tool_calls': [{'index': 0, 'id': 'call_TVcCk0skzVG8q2J7lTtBTgXh', 'function': {'arguments': '', 'name': 'where_cat_is_hiding'}, 'type': 'function'}]}),
   AIMessageChunk(content='', additional_kwargs={'tool_calls': [{'index': 0, 'id': None, 'function': {'arguments': '{}', 'name': None}, 'type': None}]}),
   AIMessageChunk(content='')],
  'streamed_output_str': ['', '', ''],
  'final_output': {'generations': [[{'text': '',
      'generation_info': {'finish_reason': 'tool_calls'},
      'type': 'ChatGenerationChunk',
      'message': AIMessageChunk(content='', additional_kwargs={'tool_calls': [{'index': 0, 'id': 'call_TVcCk0skzVG8q2J7lTtBTgXh', 'function': {'arguments': '{}', 'name': 'where_cat_is_hiding'}, 'type': 'fu

In [131]:
log_patches[1]

RunLogPatch({'op': 'add',
  'path': '/logs/ChatOpenAI',
  'value': {'end_time': None,
            'final_output': None,
            'id': '9f365199-896c-4c42-b7d9-aebe7f0f24ee',
            'metadata': {},
            'name': 'ChatOpenAI',
            'start_time': '2024-01-12T04:09:44.036+00:00',
            'streamed_output': [],
            'streamed_output_str': [],
            'tags': ['seq:step:3', 'agent'],
            'type': 'llm'}})

In [106]:
compute_state(log_patches[:10])

RunLog({'final_output': {'actions': [OpenAIToolAgentAction(tool='where_cat_is_hiding', tool_input={}, log='\nInvoking: `where_cat_is_hiding` with `{}`\n\n\n', message_log=[AIMessageChunk(content='', additional_kwargs={'tool_calls': [{'index': 0, 'id': 'call_aAmk7c55UDe7Idw7nx7y6i9M', 'function': {'arguments': '{}', 'name': 'where_cat_is_hiding'}, 'type': 'function'}]})], tool_call_id='call_aAmk7c55UDe7Idw7nx7y6i9M')],
                  'messages': [AIMessageChunk(content='', additional_kwargs={'tool_calls': [{'index': 0, 'id': 'call_aAmk7c55UDe7Idw7nx7y6i9M', 'function': {'arguments': '{}', 'name': 'where_cat_is_hiding'}, 'type': 'function'}]})]},
 'id': '3357f7f0-1685-40bf-b75e-fa3b10ccc644',
 'logs': {'ChatOpenAI': {'end_time': '2024-01-12T04:08:05.501+00:00',
                         'final_output': {'generations': [[{'generation_info': {'finish_reason': 'tool_calls'},
                                                            'message': AIMessageChunk(content='', additional_kwargs

In [101]:
log_patches[:8]

[RunLogPatch({'op': 'replace',
   'path': '',
   'value': {'final_output': None,
             'id': 'efe95a91-967b-4240-999b-fb192f00c784',
             'logs': {},
             'streamed_output': []}}),
 RunLogPatch({'op': 'add',
   'path': '/logs/ChatOpenAI',
   'value': {'end_time': None,
             'final_output': None,
             'id': '9be00d9a-837b-4f84-9724-4e67dea75230',
             'metadata': {},
             'name': 'ChatOpenAI',
             'start_time': '2024-01-12T04:07:50.185+00:00',
             'streamed_output': [],
             'streamed_output_str': [],
             'tags': ['seq:step:3', 'agent'],
             'type': 'llm'}}),
 RunLogPatch({'op': 'add', 'path': '/logs/ChatOpenAI/streamed_output_str/-', 'value': ''},
  {'op': 'add',
   'path': '/logs/ChatOpenAI/streamed_output/-',
   'value': AIMessageChunk(content='', additional_kwargs={'tool_calls': [{'index': 0, 'id': 'call_U0wPxbMjVUqtcoqbDcXf9br9', 'function': {'arguments': '', 'name': 'where_cat_is_hid

In [93]:
log_patches[34]

RunLogPatch({'op': 'add',
  'path': '/logs/tell_me_a_joke',
  'value': {'end_time': None,
            'final_output': None,
            'id': 'a27ab17b-1f5f-49a9-bc90-94849ada0bf1',
            'metadata': {},
            'name': 'tell_me_a_joke',
            'start_time': '2024-01-12T03:59:04.965+00:00',
            'streamed_output': [],
            'streamed_output_str': [],
            'tags': [],
            'type': 'tool'}})

In [66]:
from langchain_core.messages import AIMessageChunk

In [67]:
log_patches[4]

RunLogPatch({'op': 'add', 'path': '/logs/ChatOpenAI/streamed_output_str/-', 'value': ''},
 {'op': 'add',
  'path': '/logs/ChatOpenAI/streamed_output/-',
  'value': AIMessageChunk(content='')})

In [74]:
state = None

for log_patch in log_patches[:10]:
    if state is None:
        state = log_patch
    else:
        state = state + log_patch
        print(state.state)

{'id': 'e3302303-fb62-4544-9fe6-b0fe3117f0dd', 'streamed_output': [], 'final_output': None, 'logs': {'ChatOpenAI': {'id': 'c5d84946-f72e-4f9d-80ae-ca8868400d44', 'name': 'ChatOpenAI', 'type': 'llm', 'tags': ['seq:step:3', 'agent'], 'metadata': {}, 'start_time': '2024-01-12T03:59:02.191+00:00', 'streamed_output': [], 'streamed_output_str': [], 'final_output': None, 'end_time': None}}}
{'id': 'e3302303-fb62-4544-9fe6-b0fe3117f0dd', 'streamed_output': [], 'final_output': None, 'logs': {'ChatOpenAI': {'id': 'c5d84946-f72e-4f9d-80ae-ca8868400d44', 'name': 'ChatOpenAI', 'type': 'llm', 'tags': ['seq:step:3', 'agent'], 'metadata': {}, 'start_time': '2024-01-12T03:59:02.191+00:00', 'streamed_output': [AIMessageChunk(content='', additional_kwargs={'tool_calls': [{'index': 0, 'id': 'call_sf39dEbCEAPI4SbFSQrp2jcF', 'function': {'arguments': '', 'name': 'where_cat_is_hiding'}, 'type': 'function'}]})], 'streamed_output_str': [''], 'final_output': None, 'end_time': None}}}
{'id': 'e3302303-fb62-4544-

In [85]:
state

RunLog({'final_output': {'actions': [OpenAIToolAgentAction(tool='where_cat_is_hiding', tool_input={}, log='\nInvoking: `where_cat_is_hiding` with `{}`\n\n\n', message_log=[AIMessageChunk(content='', additional_kwargs={'tool_calls': [{'index': 0, 'id': 'call_sf39dEbCEAPI4SbFSQrp2jcF', 'function': {'arguments': '{}', 'name': 'where_cat_is_hiding'}, 'type': 'function'}]})], tool_call_id='call_sf39dEbCEAPI4SbFSQrp2jcF')],
                  'messages': [AIMessageChunk(content='', additional_kwargs={'tool_calls': [{'index': 0, 'id': 'call_sf39dEbCEAPI4SbFSQrp2jcF', 'function': {'arguments': '{}', 'name': 'where_cat_is_hiding'}, 'type': 'function'}]})]},
 'id': 'e3302303-fb62-4544-9fe6-b0fe3117f0dd',
 'logs': {'ChatOpenAI': {'end_time': '2024-01-12T03:59:02.664+00:00',
                         'final_output': {'generations': [[{'generation_info': {'finish_reason': 'tool_calls'},
                                                            'message': AIMessageChunk(content='', additional_kwargs

In [89]:
log_patches[:10]

[RunLogPatch({'op': 'replace',
   'path': '',
   'value': {'final_output': None,
             'id': 'e3302303-fb62-4544-9fe6-b0fe3117f0dd',
             'logs': {},
             'streamed_output': []}}),
 RunLogPatch({'op': 'add',
   'path': '/logs/ChatOpenAI',
   'value': {'end_time': None,
             'final_output': None,
             'id': 'c5d84946-f72e-4f9d-80ae-ca8868400d44',
             'metadata': {},
             'name': 'ChatOpenAI',
             'start_time': '2024-01-12T03:59:02.191+00:00',
             'streamed_output': [],
             'streamed_output_str': [],
             'tags': ['seq:step:3', 'agent'],
             'type': 'llm'}}),
 RunLogPatch({'op': 'add', 'path': '/logs/ChatOpenAI/streamed_output_str/-', 'value': ''},
  {'op': 'add',
   'path': '/logs/ChatOpenAI/streamed_output/-',
   'value': AIMessageChunk(content='', additional_kwargs={'tool_calls': [{'index': 0, 'id': 'call_sf39dEbCEAPI4SbFSQrp2jcF', 'function': {'arguments': '', 'name': 'where_cat_is_hid

In [91]:
for chunk in log_patches:
    for op in chunk.ops:
        # Only "add" operations carry newly streamed content.
        if op['op'] != 'add':
            continue

        value = op['value']

        # Skip run metadata and string tokens; keep message chunks only.
        if not isinstance(value, AIMessageChunk):
            continue

        if value.content == "":  # Then it's a function invocation message
            continue

        print(value.content, end='|')

Under| the| bed|.|Why| did| the| monster| put| a| mattress| under| his| bed|?

|Because| he| wanted| to| sleep| tight| and| scare| right|!|Here|'s| a| joke| about| the| place| where| the| cat| is| hiding|:

|Why| did| the| monster| put| a| mattress| under| his| bed|?
|Because| he| wanted| to| sleep| tight| and| scare| right|!|

This output may require some additional logic to get into a workable format.
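One possible shape for that logic is to group the streamed tokens by the run they belong to, so that text from different LLM calls does not run together. The sketch below works on plain operation dictionaries rather than real `RunLogPatch` objects, and reads the `streamed_output_str` paths seen in the patches above. The `tokens_by_run` helper and the run keys `ChatOpenAI:2` and `ChatOpenAI:3` are assumptions made for the example; check the keys under `logs` in your own output.

```python
from collections import defaultdict
from typing import Any, Dict, Iterable, List

PREFIX = "/logs/"
SUFFIX = "/streamed_output_str/-"


def tokens_by_run(ops: Iterable[Dict[str, Any]]) -> Dict[str, str]:
    """Join streamed string tokens per run, keyed by the run's log key."""
    buffers: Dict[str, List[str]] = defaultdict(list)
    for op in ops:
        path = op["path"]
        if op["op"] != "add":
            continue
        if not (path.startswith(PREFIX) and path.endswith(SUFFIX)):
            continue
        # "/logs/<run key>/streamed_output_str/-" -> "<run key>"
        run_key = path[len(PREFIX):-len(SUFFIX)]
        buffers[run_key].append(op["value"])
    return {key: "".join(parts) for key, parts in buffers.items()}


# Hypothetical operations; the run keys are assumed for illustration.
demo_ops = [
    {"op": "add", "path": "/logs/ChatOpenAI:2/streamed_output_str/-", "value": "Under"},
    {"op": "add", "path": "/logs/ChatOpenAI:2/streamed_output_str/-", "value": " the"},
    {"op": "add", "path": "/logs/ChatOpenAI:2/streamed_output_str/-", "value": " bed"},
    {"op": "add", "path": "/logs/ChatOpenAI:2/streamed_output_str/-", "value": "."},
    {"op": "add", "path": "/logs/ChatOpenAI:2/final_output", "value": None},
    {"op": "add", "path": "/logs/ChatOpenAI:3/streamed_output_str/-", "value": "Why"},
    {"op": "add", "path": "/logs/ChatOpenAI:3/streamed_output_str/-", "value": " did"},
]

grouped = tokens_by_run(demo_ops)
for run_key, text in grouped.items():
    print(f"{run_key}: {text}")
```

With real patches, the operations could be gathered with `op for patch in log_patches for op in patch.ops` before being passed to the helper.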

In [10]:
path_status = {}
async for chunk in agent_executor.astream_log(
    {"input": "what is the weather in sf", "chat_history": []},
    include_names=["ChatOpenAI"],
):
    for op in chunk.ops:
        if op["op"] == "add":
            if op["path"] not in path_status:
                path_status[op["path"]] = op["value"]
            else:
                path_status[op["path"]] += op["value"]
    print(op["path"])
    print(path_status.get(op["path"]))
    print("----")


None
----
/logs/ChatOpenAI
{'id': '3f6d3587-600f-419b-8225-8908a347b7d2', 'name': 'ChatOpenAI', 'type': 'llm', 'tags': ['seq:step:3'], 'metadata': {}, 'start_time': '2023-12-26T17:56:19.884', 'streamed_output': [], 'streamed_output_str': [], 'final_output': None, 'end_time': None}
----
/logs/ChatOpenAI/streamed_output/-
content='' additional_kwargs={'function_call': {'arguments': '', 'name': 'tavily_search_results_json'}}
----
/logs/ChatOpenAI/streamed_output/-
content='' additional_kwargs={'function_call': {'arguments': '{\n', 'name': 'tavily_search_results_json'}}
----
/logs/ChatOpenAI/streamed_output/-
content='' additional_kwargs={'function_call': {'arguments': '{\n ', 'name': 'tavily_search_results_json'}}
----
/logs/ChatOpenAI/streamed_output/-
content='' additional_kwargs={'function_call': {'arguments': '{\n  "', 'name': 'tavily_search_results_json'}}
----
/logs/ChatOpenAI/streamed_output/-
content='' additional_kwargs={'function_call': {'arguments': '{\n  "query', 'name': 'tav