# Chat Completions Three 

## Basic Connection and Packages

### Importing OpenAI and Initializing the Client

To begin, we'll import the `OpenAI` class from the `openai` library, which allows us to interact with the OpenAI API. Next, we initialize a client instance, which we'll use to send requests and receive responses from the OpenAI models.

In [1]:
"""
This script is a simple example of using the OpenAI API.
It uses the OpenAI Python client library to open a connection to the OpenAI API.
The client reads the OPENAI_API_KEY environment variable to authenticate.
"""
from openai import OpenAI

client = OpenAI()
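
Because `OpenAI()` relies on the `OPENAI_API_KEY` environment variable, it can help to check for the variable up front and fail with a clear message. The following is a minimal sketch, assuming a hypothetical helper named `require_api_key`; it is not part of the `openai` library.

```python
import os


def require_api_key(env=None, name="OPENAI_API_KEY"):
    """Return the API key from the environment, or raise a clear error.

    `require_api_key` is a hypothetical helper for this notebook,
    not an `openai` library function.
    """
    env = os.environ if env is None else env
    key = env.get(name, "").strip()
    if not key:
        raise RuntimeError(f"{name} is not set; export it before creating the client.")
    return key


# Usage (assumes the `openai` package is installed):
# from openai import OpenAI
# client = OpenAI(api_key=require_api_key())
```

Passing the key explicitly is optional; the check simply moves the failure to a point where the cause is obvious.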

## Stream Options

### Real-Time Responses Using the Stream Parameter

In this example, we introduce the `stream` parameter (`stream=True`) to enable real-time streaming of the model's output. Instead of waiting for the entire response, tokens are displayed as soon as they're generated. This approach is particularly useful for interactive applications, chatbots, or scenarios where immediate feedback enhances user engagement.

In [None]:
"""
This script shows how to use the OpenAI API to generate text completions.
We add the stream parameter to show tokens to the user in real time.
In this case, we want the response to start showing as soon as possible.
"""

stream = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[
        {"role": "developer", "content": "You are a brilliant author of children's books."},
        {"role": "user", "content": "Write two paragraphs about a frog."},
    ],
    response_format=None,
    temperature=None,
    max_completion_tokens=None,
    stop=None,
    top_p=None,
    frequency_penalty=None,
    presence_penalty=None,
    stream=True,
)

for chunk in stream:
    if chunk.choices[0].delta.content is not None:
        print(chunk.choices[0].delta.content, end="")
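
Printing each delta shows the text live, but the full response is often needed afterwards as well. Here is a minimal sketch of one way to print and accumulate at the same time. The helper names `collect_text` and `fake_chunk` are assumptions made for illustration, and the stand-in objects only mimic the `choices[0].delta.content` shape used above so the example runs without an API call.

```python
from types import SimpleNamespace


def collect_text(chunks):
    """Print each content delta as it arrives and return the joined text."""
    parts = []
    for chunk in chunks:
        # Skip chunks that carry no choices (for example, a usage-only chunk).
        if not chunk.choices:
            continue
        piece = chunk.choices[0].delta.content
        if piece:  # None or "" carries no text
            print(piece, end="", flush=True)
            parts.append(piece)
    print()
    return "".join(parts)


def fake_chunk(content):
    """Build a stand-in shaped like a streamed chunk (illustration only)."""
    delta = SimpleNamespace(content=content)
    return SimpleNamespace(choices=[SimpleNamespace(delta=delta)])


demo = [fake_chunk(""), fake_chunk("Hello"), fake_chunk(", frog"), fake_chunk(None)]
full_text = collect_text(demo)
```

With a real request, passing the `stream` object from the cell above in place of `demo` should behave the same way, since the helper only reads `choices` and `delta.content`.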

### Displaying Token Usage Statistics with `stream_options`

In this example, we introduce the `stream_options` parameter to access token usage statistics for a streamed response. Setting `stream_options={"include_usage": True}` makes the API send one additional chunk at the end of the stream. That final chunk has an empty `choices` list and a `usage` field covering the whole request; every earlier chunk reports `usage=None`. The `usage` field provides:

- **Prompt Tokens**: Number of tokens used in your prompt.
- **Completion Tokens**: Number of tokens generated by the model in response.
- **Total Tokens**: Combined total of prompt and completion tokens.

This approach is helpful for monitoring and managing token consumption during model interactions, especially when optimizing costs or tracking usage limits.


In [None]:
"""
This script shows how to use the OpenAI API to generate text completions.
We add the stream_options parameter to show token usage statistics.
"""

stream = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[
        {"role": "system", "content": "You are a brilliant author of children's books."},
        {"role": "user", "content": "Write one paragraph about a frog."}
    ],
    response_format=None,  
    temperature=None,
    max_completion_tokens=None, 
    stop=None,
    top_p=None, 
    frequency_penalty=None,
    presence_penalty=None,
    stream=True,
    stream_options={"include_usage": True},
)

for chunk in stream:
    # Handle content chunks
    if hasattr(chunk, 'choices') and len(chunk.choices) > 0:
        if chunk.choices[0].delta.content is not None:
            print(chunk.choices[0].delta.content, end="")
    # Handle the final usage-only chunk (its choices list is empty)
    elif hasattr(chunk, 'usage') and chunk.usage is not None:
        print("\n\nUsage Statistics:")
        print(f"Prompt Tokens: {chunk.usage.prompt_tokens}")
        print(f"Completion Tokens: {chunk.usage.completion_tokens}")
        print(f"Total Tokens: {chunk.usage.total_tokens}")

Once upon a time, in a shimmering emerald pond, there lived a little frog named Fredrick. With his bright green skin speckled with tiny golden spots, Fredrick loved to leap from lily pad to lily pad, singing cheerful songs that made even the sunflowers sway with joy. Every morning, as the mist lifted, he would practice his biggest jumps, dreaming of flying as high as the clouds. Frisky and full of curiosity, Fredrick made friends with fluttering dragonflies and wise old turtles, discovering that every day in the pond was an adventure waiting to happen.

Usage Statistics:
Prompt Tokens: 27
Completion Tokens: 118
Total Tokens: 145
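
The counts printed above can feed a simple cost estimate. The sketch below is illustrative only: `estimate_cost` is a hypothetical helper, and the per-million-token prices are placeholders rather than real rates, so substitute the current prices for your model.

```python
def estimate_cost(prompt_tokens, completion_tokens,
                  input_price_per_million, output_price_per_million):
    """Return the estimated cost in dollars for one request."""
    input_cost = prompt_tokens * input_price_per_million / 1_000_000
    output_cost = completion_tokens * output_price_per_million / 1_000_000
    return input_cost + output_cost


# Token counts taken from the usage statistics printed above.
prompt_tokens, completion_tokens = 27, 118
total_tokens = prompt_tokens + completion_tokens  # matches the reported total of 145

# Placeholder prices in dollars per million tokens (assumed, NOT real rates).
cost = estimate_cost(prompt_tokens, completion_tokens, 1.00, 2.00)
print(f"Total tokens: {total_tokens}, estimated cost: ${cost:.6f}")
```

Input and output tokens are priced separately here because providers commonly bill them at different rates.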


### Inspecting Raw Stream Chunks for Debugging

In this example, we print the raw data (`chunk`) returned from the streaming response. Inspecting raw chunks shows how the OpenAI API structures a streamed response, which is particularly valuable for debugging or when your code has to handle different kinds of chunks, such as content chunks and the chunk containing usage statistics.
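
When reading raw chunks, it can help to label each one by kind. The following is a rough sketch under stated assumptions: `describe_chunk` and `fake` are hypothetical names, the labels are this notebook's own convention, and the stand-in objects only mimic the fields visible in the raw output (`choices`, `delta.content`, `delta.role`, `finish_reason`, `usage`).

```python
from types import SimpleNamespace


def describe_chunk(chunk):
    """Return a short label for a chunk-like object."""
    if not chunk.choices:
        # With include_usage, the last chunk has no choices but carries usage.
        return "usage" if getattr(chunk, "usage", None) is not None else "empty"
    choice = chunk.choices[0]
    if choice.finish_reason is not None:
        return f"finish:{choice.finish_reason}"
    if choice.delta.content:
        return "content"
    if getattr(choice.delta, "role", None):
        return "role"  # opening chunk: role set, content empty
    return "other"


def fake(content=None, role=None, finish_reason=None, usage=None, has_choice=True):
    """Build a stand-in shaped like a streamed chunk (illustration only)."""
    delta = SimpleNamespace(content=content, role=role)
    choice = SimpleNamespace(delta=delta, finish_reason=finish_reason)
    return SimpleNamespace(choices=[choice] if has_choice else [], usage=usage)


demo = [
    fake(content="", role="assistant"),
    fake(content="In"),
    fake(finish_reason="stop"),
    fake(has_choice=False, usage=SimpleNamespace(total_tokens=145)),
]
labels = [describe_chunk(c) for c in demo]
print(labels)
```

The same function can be called on real chunks inside the loop, for example `print(describe_chunk(chunk))`, to get a compact view alongside the full raw output.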


In [11]:
stream = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[
        {"role": "system", "content": "You are a brilliant author of children's books."},
        {"role": "user", "content": "Write one paragraph about a frog."}
    ],
    response_format=None,  
    temperature=None,
    max_completion_tokens=None, 
    stop=None,
    top_p=None, 
    frequency_penalty=None,
    presence_penalty=None,
    stream=True,
    stream_options={"include_usage": True},
)

for chunk in stream:
    # Print the raw chunk data for inspection
    print("\nRaw Chunk:", chunk)
    
    # Handle content chunks
    if hasattr(chunk, 'choices') and len(chunk.choices) > 0:
        if chunk.choices[0].delta.content is not None:
            print(chunk.choices[0].delta.content, end="")
    # Handle the final usage-only chunk (its choices list is empty)
    elif hasattr(chunk, 'usage') and chunk.usage is not None:
        print("\n\nUsage Statistics:")
        print(f"Prompt Tokens: {chunk.usage.prompt_tokens}")
        print(f"Completion Tokens: {chunk.usage.completion_tokens}")
        print(f"Total Tokens: {chunk.usage.total_tokens}")


Raw Chunk: ChatCompletionChunk(id='chatcmpl-B9JNjrYgqrRcdfFp1Bu1ojYIFc164', choices=[Choice(delta=ChoiceDelta(content='', function_call=None, refusal=None, role='assistant', tool_calls=None), finish_reason=None, index=0, logprobs=None)], created=1741559155, model='gpt-4o-mini-2024-07-18', object='chat.completion.chunk', service_tier='default', system_fingerprint='fp_06737a9306', usage=None)

Raw Chunk: ChatCompletionChunk(id='chatcmpl-B9JNjrYgqrRcdfFp1Bu1ojYIFc164', choices=[Choice(delta=ChoiceDelta(content='In', function_call=None, refusal=None, role=None, tool_calls=None), finish_reason=None, index=0, logprobs=None)], created=1741559155, model='gpt-4o-mini-2024-07-18', object='chat.completion.chunk', service_tier='default', system_fingerprint='fp_06737a9306', usage=None)
In
Raw Chunk: ChatCompletionChunk(id='chatcmpl-B9JNjrYgqrRcdfFp1Bu1ojYIFc164', choices=[Choice(delta=ChoiceDelta(content=' a', function_call=None, refusal=None, role=None, tool_calls=None), finish_reason=None, index