# Streaming the answer
## Introduction
You might expect an LLM to return its answer in one piece. In practice, most LLMs generate the completion in small pieces, often referred to as *tokens*.

If you wait for the complete result, the response can feel slow. It is often useful to stream the output to the end user as it arrives.
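To make the difference concrete, here is a minimal sketch that does not call any real model. `fake_llm_stream` and `measure` are hypothetical helpers, and the per-token delay is made up; the sketch only illustrates that a streaming consumer sees the first token well before the full answer is complete.

```python
import asyncio
import time


async def fake_llm_stream(tokens, delay=0.05):
    """Hypothetical stand-in for a model: yields one token at a time."""
    for token in tokens:
        await asyncio.sleep(delay)  # simulated generation time per token
        yield token


async def measure(tokens):
    """Consume the stream and record when the first token and the last token arrive."""
    start = time.perf_counter()
    first_token_at = None
    received = []
    async for token in fake_llm_stream(tokens):
        if first_token_at is None:
            first_token_at = time.perf_counter() - start
        received.append(token)
    total = time.perf_counter() - start
    return "".join(received), first_token_at, total


if __name__ == "__main__":
    # In a plain script we need asyncio.run to start an event loop.
    text, first, total = asyncio.run(measure(["Hello", "!", " I'm", " an", " AI"]))
    print(f"first token after {first:.2f}s, full answer after {total:.2f}s")
```

In a notebook an event loop is already running, so you would write `await measure([...])` instead of `asyncio.run(...)`.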

## Installation

In [1]:
%pip install -q langchain langchain-openai

Note: you may need to restart the kernel to use updated packages.


## Enabling streaming in LangChain

- We pass the option `streaming=True` and use the async method `astream`.
- We print the results as they arrive, separated by the `|` symbol.

In [2]:
from langchain_openai import ChatOpenAI

# Pass streaming=True to enable streaming
chat = ChatOpenAI(model="gpt-4o-mini", streaming=True, temperature=0)

# Now the results come in chunks.
# Top-level `async for` works in a notebook; in a script, wrap it in an async function.
chunks = []
async for chunk in chat.astream("hello. tell me something about yourself"):
    chunks.append(chunk)
    # Print each chunk followed by a '|' delimiter to separate it from the next one
    print(chunk.content, end="|", flush=True)

|Hello|!| I'm| an| AI| language| model| created| by| Open|AI|,| designed| to| assist| with| a| wide| range| of| questions| and| topics|.| I| can| provide| information|,| answer| questions|,| help| with| writing|,| and| engage| in| conversation|.| My| knowledge| is| based| on| a| diverse| set| of| texts|,| and| I'm| here| to| help| you| with| whatever| you| need|.| What| would| you| like| to| know|?||
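
The loop above stores every chunk in `chunks`, so the full answer can be rebuilt once the stream ends. The sketch below shows the idea without calling the API: `FakeChunk` is a hypothetical stand-in that models only the `.content` attribute, and it assumes each chunk's content is a plain string, as in the output above.

```python
from dataclasses import dataclass


@dataclass
class FakeChunk:
    """Hypothetical stand-in for a streamed chunk; only `.content` is modeled."""
    content: str


def join_chunks(chunks):
    """Concatenate the text of each chunk in arrival order."""
    return "".join(chunk.content for chunk in chunks)


# The first and last chunks of a stream may be empty, which is why the
# printed output starts and ends with a bare '|'.
chunks = [
    FakeChunk(""),
    FakeChunk("Hello"),
    FakeChunk("!"),
    FakeChunk(" I'm"),
    FakeChunk(" an"),
    FakeChunk(" AI"),
    FakeChunk(""),
]
print(join_chunks(chunks))  # prints: Hello! I'm an AI
```

With the real `chunks` list from the cell above, the same `join_chunks(chunks)` call should give the complete reply as a single string, under the same plain-string assumption.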