# How to stream LLM responses

LangChain provides streaming support for LLMs. Currently, streaming is supported only for the `OpenAI` and `ChatOpenAI` implementations; support for other LLM implementations is on the roadmap. To use streaming, pass a [`CallbackHandler`](https://github.com/hwchase17/langchain/blob/master/langchain/callbacks/base.py) that implements `on_llm_new_token`. This example uses `StreamingStdOutCallbackHandler`, which prints each token to standard output as it arrives.

In [None]:
from langchain.llms import OpenAI
from langchain.chat_models import ChatOpenAI
from langchain.callbacks.base import CallbackManager
from langchain.callbacks.streaming_stdout import StreamingStdOutCallbackHandler
from langchain.schema import HumanMessage

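import os
# Uncomment and set your key if OPENAI_API_KEY is not already in the environment.
# os.environ["OPENAI_API_KEY"] = "sk-"
# streaming=True emits tokens as they are generated; the handler prints each one to stdout.
llm = OpenAI(streaming=True, callback_manager=CallbackManager([StreamingStdOutCallbackHandler()]), verbose=True, temperature=0)
resp = llm("Write a song about a hardworking rooster.")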
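
To make the role of `on_llm_new_token` more concrete, here is a minimal sketch of what a token-collecting handler could look like. It is deliberately a plain Python stand-in and not a LangChain class: `TokenCollector` and `fake_stream` are hypothetical names, and the token list is made up so the example runs without an API key. A real handler would derive from the callback base class linked above and implement whatever other hooks that class requires.

```python
from typing import Any, Iterable, List


class TokenCollector:
    """Stand-in for a streaming callback handler (assumption: not a LangChain class)."""

    def __init__(self) -> None:
        self.tokens: List[str] = []

    def on_llm_new_token(self, token: str, **kwargs: Any) -> None:
        # Called once per streamed token: store it and echo it immediately.
        self.tokens.append(token)
        print(token, end="", flush=True)

    @property
    def text(self) -> str:
        # The full response is the concatenation of all tokens seen so far.
        return "".join(self.tokens)


def fake_stream(tokens: Iterable[str], handler: TokenCollector) -> None:
    """Hypothetical driver that mimics an LLM pushing tokens to a handler."""
    for token in tokens:
        handler.on_llm_new_token(token)


collector = TokenCollector()
fake_stream(["The", " rooster", " crows", "."], collector)
print()  # finish the line after the streamed output
```

Keeping the tokens in a list, rather than only printing them, means the full text is still available once streaming finishes (here through `collector.text`).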

When you call `generate`, the final `LLMResult` is still returned after streaming completes. However, `token_usage` is not currently supported for streaming.

In [None]:
llm.generate(["Tell me a short Chinese joke."])
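
Because `token_usage` is not reported while streaming, one rough workaround is to count how many times `on_llm_new_token` fires. The sketch below assumes each callback invocation corresponds to one streamed chunk; that only approximates completion tokens and says nothing about prompt tokens. `ChunkCounter` is a hypothetical stand-in, not a LangChain class, and the chunks are invented for illustration.

```python
from typing import Any


class ChunkCounter:
    """Hypothetical handler that counts streamed chunks (assumption: not a LangChain class)."""

    def __init__(self) -> None:
        self.chunks = 0

    def on_llm_new_token(self, token: str, **kwargs: Any) -> None:
        # One call per streamed chunk; the count is only an estimate of tokens.
        self.chunks += 1


counter = ChunkCounter()
# Simulated stream; a real model would invoke the handler itself.
for piece in ["Why", " did", " the", " panda", " nap", "?"]:
    counter.on_llm_new_token(piece)
print(f"streamed chunks: {counter.chunks}")
```

For exact figures, a tokenizer-based count of the prompt and the collected output would be more reliable than this estimate.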

Here's an example with `ChatOpenAI`:

In [None]:
chat = ChatOpenAI(streaming=True, callback_manager=CallbackManager([StreamingStdOutCallbackHandler()]), verbose=True, temperature=0)
resp = chat([HumanMessage(content="Write a nursery rhyme about a kitten that loves to stay clean.")])