# Streaming Chat Completion

Streaming chat completion lets you receive the LLM's response in real time, chunk by chunk, as the model generates it. Users see the first words of the answer right away instead of waiting for the full reply, which makes the application feel more responsive.

**Theory and Explanations**

*   **Stream Parameter**: To enable streaming, set the `stream` parameter to `True` in the `client.chat.completions.create()` method.
*   **Response Stream**: The method returns a response stream, which is an iterable object that yields individual response chunks as they are generated by the model.
*   **Iterating over the Stream**: You can iterate over the response stream using a `for` loop to process each response chunk.
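
The iteration pattern described above can be tried without an API key. The sketch below is not the client library itself: `make_chunk`, `fake_stream`, and `collect_text` are hypothetical helpers, and the stand-in objects only imitate the `choices[0].delta.content` shape used in the example later in this section. It prints each piece as it arrives and also keeps the pieces, so the full reply is available after the loop ends.

```python
from types import SimpleNamespace


def make_chunk(text):
    """Build a stand-in object shaped like a streamed chunk (hypothetical helper)."""
    delta = SimpleNamespace(content=text)
    return SimpleNamespace(choices=[SimpleNamespace(delta=delta)])


def fake_stream():
    """Yield stand-in chunks; a real stream would come from create(..., stream=True)."""
    for text in ["Hello", ", ", None, "world", "!"]:
        yield make_chunk(text)


def collect_text(stream):
    """Print each piece as it arrives and return the assembled reply."""
    parts = []
    for chunk in stream:
        if not chunk.choices:
            continue  # assumption: some chunks may carry no choices
        piece = chunk.choices[0].delta.content
        if piece:  # assumption: content may be None or empty on some chunks
            print(piece, end="", flush=True)
            parts.append(piece)
    print()  # finish the line once the stream is exhausted
    return "".join(parts)


full_reply = collect_text(fake_stream())
```

Collecting the pieces in a list and joining once at the end avoids repeated string concatenation and leaves `full_reply` ready for logging or for appending to the conversation history.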

**Example from Text**

The following example streams a response from the Perplexity API using the OpenAI Python client:


In [1]:
import os
from dotenv import load_dotenv
load_dotenv()
perplexity_api_key = os.getenv("PERPLEXITY_API_KEY")

In [None]:
from openai import OpenAI

YOUR_API_KEY = perplexity_api_key

messages = [
    {
        "role": "system",
        "content": (
            "You are an artificial intelligence assistant and you need to "
            "engage in a helpful, detailed, polite conversation with a user."
        ),
    },
    {
        "role": "user",
        "content": (
            "How many stars are in the universe?"
        ),
    },
]

client = OpenAI(api_key=YOUR_API_KEY, base_url="https://api.perplexity.ai")

# Chat completion with streaming enabled
response_stream = client.chat.completions.create(
    model="sonar",
    messages=messages,
    stream=True,  # must be True, otherwise a single complete response is returned
)
for chunk in response_stream:
    # Skip chunks that carry no choices or no text (delta.content may be None)
    if not chunk.choices:
        continue
    content = chunk.choices[0].delta.content
    if content:
        # end="" prevents a newline after each chunk
        # flush=True ensures immediate output to the console
        print(content, end="", flush=True)

Astronomers estimate that there are approximately 200 billion trillion stars in the observable universe. This number is arrived at by estimating the number of stars in a typical galaxy like our Milky Way, which contains about 100 billion stars, and then multiplying by the estimated number of galaxies in the observable universe, which is around 2 trillion[1][5].

To break it down more clearly:
- The Milky Way has about 100 billion stars.
- There are roughly 2 trillion galaxies in the observable universe.
- Multiplying these gives an estimate of about 200 billion trillion stars (200 sextillion stars) in the observable universe[1][5].

Other sources provide similar orders of magnitude, suggesting the universe contains on the order of \(10^{22}\) to \(10^{24}\) stars, sometimes expressed as one septillion stars (a 1 followed by 24 zeros)[3][4].

It is important to note:
- This estimate applies to the *observable* universe only, the part of the universe from which light has had time to reac

**Practice Exercises**

1.  Modify the user prompt to ask a different question (e.g., "What is the population of the world?"). Run the code and observe the streaming response.
2.  Modify the system prompt to change the assistant's behavior (e.g., "You are an AI assistant that responds in haikus"). Run the code and observe the streaming response.
3.  Add a delay (e.g., `time.sleep(0.1)`) within the loop to slow down the streaming response. Observe the effect.
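
As a possible starting point for exercise 3, the sketch below shows the effect of a delay using a plain list of strings in place of a live stream. `print_slowly` is a hypothetical helper, not part of any library. In the real loop, the same `time.sleep(...)` call would go right after the `print(...)` statement.

```python
import time


def print_slowly(pieces, delay=0.1):
    """Print each text piece, pausing `delay` seconds after each one."""
    shown = []
    for piece in pieces:
        print(piece, end="", flush=True)
        shown.append(piece)
        time.sleep(delay)  # slows the output so each chunk is visible
    print()
    return "".join(shown)


text = print_slowly(["Stars ", "are ", "numerous."], delay=0.05)
```

Without `flush=True`, the console may buffer the output and show it all at once, which hides the delay.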