# Streaming Last Agent Response

To stream only the agent's last response, token by token, use the ``StreamingLastResponseCallbackHandler`` callback.
The underlying LLM must support streaming as well (here, ``streaming=True`` on ``ChatOpenAI``).

In [3]:
from langchain.agents import load_tools, initialize_agent, AgentType
from langchain.chat_models import ChatOpenAI
from langchain.callbacks import StreamingLastResponseCallbackHandler

llm = ChatOpenAI(temperature=0, streaming=True)
tools = load_tools(["serpapi", "llm-math"], llm=llm)
agent = initialize_agent(
    tools, llm, agent=AgentType.ZERO_SHOT_REACT_DESCRIPTION, verbose=False
)

Now initialize the callback handler with ``StreamingLastResponseCallbackHandler.from_agent_type()``, passing the same agent type that was used to create the agent.

In [4]:
stream = StreamingLastResponseCallbackHandler.from_agent_type(
    agent=AgentType.ZERO_SHOT_REACT_DESCRIPTION
)

## 1. Using a callback function decorator

You can attach a callback function by using the ``on_last_response_new_token()`` decorator.

In [5]:
@stream.on_last_response_new_token()
def on_new_token(token: str):
    # The handler passes the StopIteration class as an end-of-stream marker.
    if token is StopIteration:
        print("\n[Done]")
        return
    print(f"Next token: '{token}'")

Now run the agent with ``agent.run()``, passing the handler in ``callbacks``. With ``verbose=False`` set on the agent, you can see that the callback function is called for each token of the last agent response.

In [6]:
agent.run(
    "Who is Leo DiCaprio's girlfriend? What is her current age raised to the 0.43 power?",
    callbacks=[stream],
)

Next token: ' Cam'
Next token: 'ila'
Next token: ' Morr'
Next token: 'one'
Next token: ''s'
Next token: ' current'
Next token: ' age'
Next token: ' raised'
Next token: ' to'
Next token: ' the'
Next token: ' '
Next token: '0'
Next token: '.'
Next token: '43'
Next token: ' power'
Next token: ' is'
Next token: ' approximately'
Next token: ' '
Next token: '4'
Next token: '.'
Next token: '059'
Next token: '.'
Next token: ''

[Done]


"Camila Morrone's current age raised to the 0.43 power is approximately 4.059."

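To make the decorator pattern above concrete, here is a minimal sketch of how a handler could register a callback and signal the end of the stream with ``StopIteration``. ``TokenBroadcaster``, ``on_new_token``, ``emit`` and ``finish`` are hypothetical names chosen for illustration. This is not the LangChain implementation, and no LLM is involved.

```python
from typing import Callable, List, Union

# A text token, or the StopIteration class used as an end-of-stream marker.
Token = Union[str, type]


class TokenBroadcaster:
    """Hypothetical stand-in for a streaming handler (not LangChain code)."""

    def __init__(self) -> None:
        self._callbacks: List[Callable[[Token], None]] = []

    def on_new_token(self) -> Callable:
        """Return a decorator that registers the decorated function."""

        def decorator(func: Callable[[Token], None]) -> Callable[[Token], None]:
            self._callbacks.append(func)
            return func  # hand the function back unchanged

        return decorator

    def emit(self, token: Token) -> None:
        """Forward one token to every registered callback."""
        for callback in self._callbacks:
            callback(token)

    def finish(self) -> None:
        """Signal the end of the stream with the StopIteration class."""
        self.emit(StopIteration)


broadcaster = TokenBroadcaster()
received: List[str] = []


@broadcaster.on_new_token()
def collect(token: Token) -> None:
    if token is StopIteration:
        received.append("[Done]")
    else:
        received.append(token)


# Simulate a streamed answer arriving in three pieces.
for piece in [" Hello", ",", " world"]:
    broadcaster.emit(piece)
broadcaster.finish()

print(received)
```

Because the decorator returns the function unchanged, ``collect`` can still be called directly, for example in a unit test.
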
## 2. Using a for-loop

You can also run the agent in a separate thread and iterate over the handler with a for-loop to receive the tokens of the last agent response as they arrive.

In [7]:
import threading

stream = StreamingLastResponseCallbackHandler.from_agent_type(
    agent=AgentType.ZERO_SHOT_REACT_DESCRIPTION
)


def _run():
    agent.run(
        "Who is Leo DiCaprio's girlfriend? What is her current age raised to the 0.43 power?",
        callbacks=[stream],
    )


threading.Thread(target=_run).start()

for token in stream:
    print(token, end="", flush=True)

 Camila Morrone's current age raised to the 0.43 power is approximately 4.059.
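
The thread-plus-loop pattern above can be illustrated without an agent. The following is a minimal sketch, assuming the handler behaves like a thread-safe queue that one thread fills while another iterates over it. ``TokenQueueStream`` and ``produce`` are hypothetical names, and this is not how LangChain necessarily implements it.

```python
import queue
import threading
from typing import Iterator, List


class TokenQueueStream:
    """Hypothetical iterable token stream backed by a thread-safe queue."""

    _DONE = object()  # private sentinel marking the end of the stream

    def __init__(self) -> None:
        self._queue: "queue.Queue[object]" = queue.Queue()

    def put(self, token: str) -> None:
        """Called by the producer thread for every new token."""
        self._queue.put(token)

    def close(self) -> None:
        """Called by the producer thread once no more tokens will arrive."""
        self._queue.put(self._DONE)

    def __iter__(self) -> Iterator[str]:
        while True:
            item = self._queue.get()  # blocks until the producer supplies something
            if item is self._DONE:
                return
            yield item


def produce(stream: TokenQueueStream, tokens: List[str]) -> None:
    """Stand-in for agent.run(): pushes tokens, then closes the stream."""
    try:
        for token in tokens:
            stream.put(token)
    finally:
        stream.close()  # always unblock the consumer, even on errors


token_stream = TokenQueueStream()
worker = threading.Thread(
    target=produce, args=(token_stream, [" Streaming", " works", "."])
)
worker.start()

# The main thread consumes tokens as they arrive.
collected = [token for token in token_stream]
worker.join()

print("".join(collected))
```

In this sketch the ``finally`` clause matters: if the producer failed without closing the stream, the consuming loop would block forever on ``get()``.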

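## 3. Post-processing on the fly

You can also post-process the tokens on the fly by using the ``postprocess`` decorator. The decorated function receives a list of tokens and returns the list of tokens to emit in their place.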

In [8]:
from typing import List
import tiktoken

enc = tiktoken.get_encoding("cl100k_base")

stream = StreamingLastResponseCallbackHandler.from_agent_type(
    agent=AgentType.ZERO_SHOT_REACT_DESCRIPTION
)


@stream.postprocess(sliding_window_step=1, window_size=3)
def postprocess_func(tokens: List[str]) -> List[str]:
    # Join the tokens so that a word split across several tokens can be matched.
    sentence = "".join(tokens).replace("Python", "LangChain")
    # Re-tokenize the edited text; the output may contain a different
    # number of tokens than the input.
    out_tokens = [enc.decode([t]) for t in enc.encode(sentence)]
    return out_tokens


def _run():
    agent.run("Is python good?", callbacks=[stream])


threading.Thread(target=_run).start()

for token in stream:
    print(token, end="", flush=True)

 Yes, LangChain is considered good.
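
The exact windowing semantics of ``postprocess`` are not spelled out above, so the following is only a sketch of one plausible interpretation: buffer tokens until the window is full, apply the function to the whole window, emit the oldest ``step`` tokens, and keep the rest as context for the next round. ``sliding_postprocess`` and ``replace_python`` are hypothetical helpers, and a regular expression stands in for ``tiktoken`` so the example has no dependencies.

```python
import re
from typing import Callable, Iterable, Iterator, List


def sliding_postprocess(
    tokens: Iterable[str],
    func: Callable[[List[str]], List[str]],
    window_size: int = 3,
    step: int = 1,
) -> Iterator[str]:
    """Apply `func` to a sliding window of tokens (assumed semantics)."""
    window: List[str] = []
    for token in tokens:
        window.append(token)
        if len(window) >= window_size:
            window = func(window)  # the result may be shorter or longer
            yield from window[:step]  # emit the oldest tokens
            window = window[step:]  # keep the rest as context
    if window:
        # Flush whatever is still buffered when the stream ends.
        yield from func(window)


def replace_python(tokens: List[str]) -> List[str]:
    """Join, replace, and re-split into whitespace-prefixed chunks."""
    sentence = "".join(tokens).replace("Python", "LangChain")
    # Lossless split: joining the pieces gives back `sentence`.
    return re.findall(r"\s*\S+|\s+", sentence)


# "Python" arrives split across two tokens, as an LLM might stream it.
raw_tokens = [" Yes", ",", " Py", "thon", " is", " good", "."]
result = list(sliding_postprocess(raw_tokens, replace_python))

print("".join(result))
```

Since the same tokens can pass through ``func`` more than once in this sketch, the function should be idempotent, as the replacement here is.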