# Responses Two

## Basic Connection and Packages

### Importing OpenAI and Initializing the Client

To begin, we'll import the `OpenAI` class from the `openai` library, which allows us to interact with the OpenAI API. Next, we initialize a client instance, which we'll use to send requests and receive responses from the OpenAI models.



In [None]:
"""
This script is a simple example of using the OpenAI API.
It uses the OpenAI Python client library to open a connection to the OpenAI API.
The client reads the OPENAI_API_KEY environment variable to authenticate.
"""
from openai import OpenAI
client = OpenAI()
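
If `OPENAI_API_KEY` is missing, client creation fails with an authentication error. The sketch below checks for the key first and raises a clearer message. The helper name `require_api_key` is hypothetical and not part of the `openai` library; the commented usage lines assume the package is installed and a real key is set.

```python
import os


def require_api_key(var_name: str = "OPENAI_API_KEY") -> str:
    """Return the API key from the environment, or raise a clear error.

    Hypothetical helper for illustration; not part of the openai package.
    """
    key = os.environ.get(var_name, "").strip()
    if not key:
        raise RuntimeError(
            f"Set the {var_name} environment variable before creating the client."
        )
    return key


# Usage (assumes the openai package is installed and a real key is set):
# from openai import OpenAI
# client = OpenAI(api_key=require_api_key())
```

Passing `api_key` explicitly is optional. `OpenAI()` with no arguments reads the same environment variable on its own.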

### Importing Additional Libraries

Next, we import the `json` library. This standard Python library helps us handle JSON data, allowing easy conversion between JSON strings and Python data structures.



In [None]:
"""
Additional imports
These imports are used for various purposes in the script.
"""

import json # common library for working with JSON data
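
As a small illustration of that conversion, the sketch below round-trips the message list used throughout this notebook through a JSON string. It makes no API call; the variable names are chosen only for this example.

```python
import json

# The same message list used in the requests in this notebook
messages = [
    {"role": "developer", "content": "You are a brilliant author of children's books."},
    {"role": "user", "content": "Write two paragraphs about a frog."},
]

as_text = json.dumps(messages, indent=2)  # Python list -> JSON string
restored = json.loads(as_text)            # JSON string -> Python list

print(as_text)
print(restored == messages)  # The round trip preserves the data
```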

## Temperature

### Controlling Creativity with the Temperature Parameter

In this example, we introduce the `temperature` parameter to control the randomness and creativity of the model's responses. The API accepts values from `0` to `2`, with a default of `1`. A lower temperature (e.g., `0`) produces nearly deterministic, predictable outputs suited to clear and consistent writing. A higher temperature yields more creative and varied responses, and values well above `1` can start to lose coherence.

Here, we've set the temperature to `0`, ensuring a consistent and less random output, appropriate for structured or educational content, such as children's books.


In [None]:
"""
This script shows how to use the OpenAI API to generate text completions.
We add the temperature parameter to specify the randomness/creativity of the response.
In this case, we want the response to be less random/creative.
"""
response = client.responses.create(
  model="gpt-4o-mini",
  input=[
    {"role": "developer", "content": "You are a brilliant author of children's books."},
    {"role": "user", "content": "Write two paragraphs about a frog."}
  ],
  text=None,  
  temperature=0 # Lower temperature for more deterministic output
)

print(response.output_text)

### Increasing Creativity with a Higher Temperature

Here, we use a higher `temperature` value (`1.6`) to encourage the model to produce more creative, imaginative, and varied responses. A higher temperature suits output that should be playful, diverse, or surprising, as in storytelling or creative writing tasks.

In this case, the prompt asks the model, acting as a children's book author, to write two paragraphs about a frog. The higher temperature makes the response more inventive, though at values this high the text can occasionally drift or become less coherent.


In [None]:
"""
This script shows how to use the OpenAI API to generate text completions.
We add the temperature parameter to specify the randomness/creativity of the response.
In this case, we want the response to be more random/creative.
"""
response = client.responses.create(
  model="gpt-4o-mini",
  input=[
    {"role": "developer", "content": "You are a brilliant author of children's books."},
    {"role": "user", "content": "Write two paragraphs about a frog."}
  ],
  text=None,  
  temperature=1.6 # Higher temperature for less deterministic output
)

print(response.output_text)
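
To build intuition for the two runs above, here is a toy sketch of how temperature reshapes a probability distribution over candidate tokens. It is the standard softmax-with-temperature formula applied to made-up scores, and it is an illustration only: the function name and the logits are assumptions, and it does not claim to reproduce what the API does internally.

```python
import math


def softmax_with_temperature(logits, temperature):
    """Convert raw scores to probabilities, sharpened or flattened by temperature."""
    if temperature <= 0:
        # Treat temperature 0 as greedy decoding: all probability on the top score
        best = max(range(len(logits)), key=lambda i: logits[i])
        return [1.0 if i == best else 0.0 for i in range(len(logits))]
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # Subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


logits = [2.0, 1.0, 0.1]  # Hypothetical scores for three candidate tokens
for t in (0, 0.5, 1.0, 1.6):
    probs = softmax_with_temperature(logits, t)
    print(t, [round(p, 3) for p in probs])
```

As the temperature rises, the top token's share shrinks and the less likely tokens gain probability, which is why higher settings read as more varied.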

## Max Output Tokens

### Limiting Response Length with `max_output_tokens`

In this example, we demonstrate the use of the `max_output_tokens` parameter, which limits the length of the generated text response. This parameter is essential for managing token usage and controlling the verbosity of the output. Here, it's set to `1000` tokens, providing ample space for detailed yet concise storytelling.

We leave `temperature` as `None`, so the API uses its default of `1`, which balances creativity with coherence and suits engaging children's literature.


In [None]:
"""
This script shows how to use the OpenAI API to generate text completions.
We add the max_output_tokens parameter to cap the number of tokens in the response.
In this case, we want the response to be limited to one thousand tokens.
The maximum number of output tokens for gpt-4o-mini is 16,384.
"""
response = client.responses.create(
  model="gpt-4o-mini",
  input=[
    {"role": "developer", "content": "You are a brilliant author of children's books."},
    {"role": "user", "content": "Write two paragraphs about a frog."}
  ],
  text=None,  
  temperature=None,
  max_output_tokens=1000 # Limit the number of tokens in the response, 16,384 tokens is the maximum for gpt-4o-mini
)

print(response.output_text)

### Effect of a Low `max_output_tokens` Value on Responses

This example highlights how setting the `max_output_tokens` parameter to the lowest value we can (`16` tokens) significantly constrains the length of the model's response. Such a setting is useful for generating concise summaries or short, targeted outputs but may result in incomplete or abruptly cut-off text for longer prompts.

We've left `temperature` as `None`, so the default of `1` still encourages creativity, but the output is heavily restricted by the token limit.


In [None]:
"""
This script shows how to use the OpenAI API to generate text completions.
We add the max_output_tokens parameter to cap the number of tokens in the response.
In this case, we want the response to be limited to sixteen tokens.
The maximum number of output tokens for gpt-4o-mini is 16,384.
"""
response = client.responses.create(
  model="gpt-4o-mini",
  input=[
    {"role": "developer", "content": "You are a brilliant author of children's books."},
    {"role": "user", "content": "Write two paragraphs about a frog."}
  ],
  text=None,  
  temperature=None,
  max_output_tokens=16 # Limit the number of tokens in the response
)

print(response.output_text)
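
When the limit cuts a response short, the response object reports it: `status` is `"incomplete"` and `incomplete_details.reason` is `"max_output_tokens"`. The sketch below checks those two fields. The helper name `truncation_note` is hypothetical, and the `SimpleNamespace` objects are stand-ins that mimic only those fields so the example runs without an API call.

```python
from types import SimpleNamespace


def truncation_note(response) -> str:
    """Report whether a response was cut off by the output token limit."""
    details = getattr(response, "incomplete_details", None)
    reason = getattr(details, "reason", None)
    if getattr(response, "status", None) == "incomplete" and reason == "max_output_tokens":
        return "Output was cut off by max_output_tokens."
    return "Output finished normally."


# Stand-in objects for illustration; a real call would pass the API response
cut_off = SimpleNamespace(
    status="incomplete",
    incomplete_details=SimpleNamespace(reason="max_output_tokens"),
)
finished = SimpleNamespace(status="completed", incomplete_details=None)

print(truncation_note(cut_off))
print(truncation_note(finished))
```

With a live client, calling `truncation_note(response)` after the 16-token request above would be one way to confirm that the text ended early.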

## Top P

### Using Top-p (Nucleus) Sampling to Control Response Randomness

In this example, we introduce the `top_p` parameter, also known as nucleus sampling, to control the randomness of generated text. By setting `top_p` to a low value (`0.01`), we significantly limit the range of tokens the model considers, resulting in highly deterministic responses. A higher `top_p` value allows for more variability and creativity.




In [None]:
"""
This script shows how to use the OpenAI API to generate text completions.
We add the top_p parameter to specify the randomness/creativity of the response.
In this case, we want the response to be less random/creative.
"""
response = client.responses.create(
  model="gpt-4o-mini",
  input=[
    {"role": "developer", "content": "You are a brilliant author of children's books."},
    {"role": "user", "content": "Write two paragraphs about a frog."}
  ],
  text=None,  
  temperature=None,
  max_output_tokens=None, 
  top_p=0.01, # Top-p sampling (nucleus sampling) to control randomness
)

print(response.output_text)

### Increasing Randomness with Higher Top-p Values

In this example, we've set the `top_p` parameter (nucleus sampling) to a higher value (`0.90`). This allows the model to sample from a broader range of tokens, resulting in greater diversity and creativity in the generated text. Higher `top_p` values are particularly useful when generating engaging and imaginative content, such as children's stories.


In [None]:
"""
This script shows how to use the OpenAI API to generate text completions.
We add the top_p parameter to specify the randomness/creativity of the response.
In this case, we want the response to be more random/creative.
"""
response = client.responses.create(
  model="gpt-4o-mini",
  input=[
    {"role": "developer", "content": "You are a brilliant author of children's books."},
    {"role": "user", "content": "Write two paragraphs about a frog."}
  ],
  text=None,  
  temperature=None,
  max_output_tokens=None, 
  top_p=0.90, # Top-p sampling (nucleus sampling) to control randomness
)

print(response.output_text)
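
The difference between `0.01` and `0.90` is easier to see on a toy distribution. The sketch below implements the usual description of nucleus sampling: keep the most probable tokens until their cumulative probability reaches `top_p`, then renormalize. The function name and the token probabilities are made up for illustration, and the code is not a claim about the API's internal implementation.

```python
def nucleus_filter(probs, top_p):
    """Keep the smallest set of top tokens whose cumulative probability reaches top_p."""
    ranked = sorted(probs.items(), key=lambda kv: kv[1], reverse=True)
    kept, cumulative = [], 0.0
    for token, p in ranked:
        kept.append((token, p))
        cumulative += p
        if cumulative >= top_p:
            break
    total = sum(p for _, p in kept)
    # Renormalize so the kept probabilities sum to 1
    return {token: p / total for token, p in kept}


# Hypothetical next-token probabilities
probs = {"pond": 0.5, "lily": 0.3, "moon": 0.15, "rocket": 0.05}

print(nucleus_filter(probs, 0.01))  # Only the single most likely token survives
print(nucleus_filter(probs, 0.90))  # Three tokens remain candidates
```

With `top_p=0.01` the model effectively always picks the top token, while `top_p=0.90` leaves several candidates to sample from.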

## Streaming

### Real-Time Responses Using the Stream Parameter

In this example, we introduce the `stream` parameter (`stream=True`) to enable real-time streaming of the model's output. Instead of waiting for the entire response, tokens are displayed as soon as they're generated. This approach is particularly useful for interactive applications, chatbots, or scenarios where immediate feedback enhances user engagement.


In [None]:
"""
This script shows how to use the OpenAI API to generate text completions.
We add the stream parameter to dynamically show tokens to the user in real-time.
In this case, we want the response to start showing as soon as possible.
"""

stream = client.responses.create(
    model="gpt-4o-mini",
    input=[
        {"role": "developer", "content": "You are a brilliant author of children's books."},
        {"role": "user", "content": "Write two paragraphs about a frog."}
    ],
    text=None,  
    temperature=None,
    max_output_tokens=None, 
    top_p=None, 
    stream=True  # Enable streaming
)

for event in stream:  # Iterate through the streaming events
    if event.type == "response.output_text.delta": # Check if the event is a text delta
        print(event.delta, end='', flush=True)  # Print each text delta as it arrives
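
Streaming code often needs the full text as well, for example to save it after displaying it. The sketch below prints each delta as it arrives and also accumulates the pieces. The helper name `collect_text` is hypothetical, and the `SimpleNamespace` events are stand-ins that carry only the `type` and `delta` fields used in the loop above, so the example runs without an API call.

```python
from types import SimpleNamespace


def collect_text(events):
    """Print text deltas as they arrive and return the assembled text."""
    parts = []
    for event in events:
        if event.type == "response.output_text.delta":
            print(event.delta, end="", flush=True)
            parts.append(event.delta)
    print()  # Finish the line once the stream ends
    return "".join(parts)


# Stand-in events for illustration; a real stream comes from client.responses.create
fake_stream = [
    SimpleNamespace(type="response.created"),
    SimpleNamespace(type="response.output_text.delta", delta="Once upon "),
    SimpleNamespace(type="response.output_text.delta", delta="a lily pad."),
    SimpleNamespace(type="response.completed"),
]

full_text = collect_text(fake_stream)
```

With a live client, `collect_text(stream)` could replace the `for` loop above while keeping the same on-screen behavior.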

### Receiving Complete Responses with Streaming Disabled

In this example, we've set `stream=False` (the default) to disable real-time token streaming. With this configuration, the model fully generates the response before the result is returned. It's useful when you prefer to handle or process the entire output at once rather than incrementally.


In [None]:
"""
This script shows how to use the OpenAI API to generate text completions.
We set the stream parameter to False to turn off real-time token streaming.
In this case, we want to wait until the response is complete before showing it.
"""

nostream = client.responses.create(
    model="gpt-4o-mini",
    input=[
        {"role": "developer", "content": "You are a brilliant author of children's books."},
        {"role": "user", "content": "Write two paragraphs about a frog."}
    ],
    text=None,
    temperature=None,
    max_output_tokens=None,
    top_p=None,
    stream=False  # Disable streaming
)

print(nostream.output_text)  # Print the entire response at once