# Working with Images

## Universal Code Used for the Entire Notebook

Let's set up our libraries and client.

In [None]:
# Import necessary modules
import requests  # For making HTTP requests
import base64  # For encoding and decoding binary data to ASCII
from openai import OpenAI  # Importing the OpenAI class from the openai module
from IPython.display import Image, display, HTML  # For displaying images and HTML in Jupyter notebooks
from PIL import Image as PILImage  # Importing the Image class from PIL and renaming it to avoid conflict with IPython.display.Image


In [None]:
# Create an instance of the OpenAI client
client = OpenAI()

## Generations

### Simple Generation
The image generations endpoint allows you to create an original image given a text prompt. 

DALL·E 2 supports three sizes: 256x256, 512x512, or 1024x1024.

With DALL·E 3, images can have a size of 1024x1024, 1024x1792, or 1792x1024 pixels.

By default, images are generated at standard quality, but when using DALL·E 3 you can set quality: "hd" for enhanced detail. Square, standard quality images are the fastest to generate.

You can request 1 image at a time with DALL·E 3 (request more by making parallel requests) or up to 10 images at a time using DALL·E 2 with the n parameter.
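
Because DALL·E 3 returns one image per request, one way to obtain several images is to send the requests concurrently. The following is a minimal sketch using a thread pool. The names `generate_many` and `generate_one` are hypothetical helpers introduced here for illustration; they are not part of the OpenAI library. In this notebook, `generate_one` would be a small function that wraps a single `client.images.generate(...)` call.

```python
from concurrent.futures import ThreadPoolExecutor


def generate_many(generate_one, prompt, count, max_workers=4):
    """Call generate_one(prompt) `count` times in parallel and collect the results.

    generate_one is any callable that takes a prompt and returns one result
    (hypothetical helper; it is expected to wrap a single-image API request).
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = [pool.submit(generate_one, prompt) for _ in range(count)]
        # result() re-raises any exception that occurred in the worker thread
        return [future.result() for future in futures]


# Possible usage in this notebook (assumes `client` from the setup cell):
# def one_dalle3_url(prompt):
#     response = client.images.generate(model="dall-e-3", prompt=prompt, n=1)
#     return response.data[0].url
#
# urls = generate_many(one_dalle3_url, "a shark in a suit", count=3)
```

Each parallel request is still a separate API call, so it is worth keeping `max_workers` modest.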

In [None]:
# Generate an image using the OpenAI client
# Defaults to the DALL-E 2 model
image = client.images.generate(
    prompt="a shark in a suit inside the NYSE trading floor",
)

# Extract the URL of the generated image
image_url = image.data[0].url

# Display the image URL and the image itself
print(f"Image URL: {image_url}")
display(Image(url=image_url))


### Full Parameter Generation

When creating an image you can have the following parameters:

**prompt**  
*string*  

Required  
A text description of the desired image(s). The maximum length is 1000 characters for dall-e-2 and 4000 characters for dall-e-3.

**model**  
*string*  

Optional  
Defaults to dall-e-2  
The model to use for image generation.

**n**  
*integer* or *null*  

Optional  
Defaults to 1  
The number of images to generate. Must be between 1 and 10. For dall-e-3, only n=1 is supported.

**quality**  
*string*  

Optional  
Defaults to standard  
The quality of the image that will be generated. hd creates images with finer details and greater consistency across the image. This param is only supported for dall-e-3.

**response_format**  
*string* or *null*  

Optional  
Defaults to url  
The format in which the generated images are returned. Must be one of url or b64_json. URLs are only valid for 60 minutes after the image has been generated.

**size**  
*string* or *null*  

Optional  
Defaults to 1024x1024  
The size of the generated images. Must be one of 256x256, 512x512, or 1024x1024 for dall-e-2. Must be one of 1024x1024, 1792x1024, or 1024x1792 for dall-e-3 models.

**style**  
*string* or *null*  

Optional  
Defaults to vivid  
The style of the generated images. Must be one of vivid or natural. Vivid causes the model to lean towards generating hyper-real and dramatic images. Natural causes the model to produce more natural, less hyper-real looking images. This param is only supported for dall-e-3.

**user**  
*string*  

Optional  
A unique identifier representing your end-user, which can help OpenAI to monitor and detect abuse.
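
The per-model limits listed above can be checked locally before a request is sent. The sketch below is one possible way to do that; `LIMITS` and `check_image_params` are hypothetical names, and the values are copied from the parameter list in this section rather than read from the API, which remains the final authority.

```python
# Limits copied from the parameter list above (not fetched from the API)
LIMITS = {
    "dall-e-2": {
        "sizes": {"256x256", "512x512", "1024x1024"},
        "max_n": 10,
        "max_prompt_chars": 1000,
    },
    "dall-e-3": {
        "sizes": {"1024x1024", "1792x1024", "1024x1792"},
        "max_n": 1,
        "max_prompt_chars": 4000,
    },
}


def check_image_params(prompt, model="dall-e-2", n=1, size="1024x1024",
                       quality=None, style=None):
    """Return a list of problems found; an empty list means nothing was flagged."""
    if model not in LIMITS:
        return [f"unknown model: {model}"]
    limits = LIMITS[model]
    problems = []
    if len(prompt) > limits["max_prompt_chars"]:
        problems.append(f"prompt exceeds {limits['max_prompt_chars']} characters")
    if not 1 <= n <= limits["max_n"]:
        problems.append(f"n must be between 1 and {limits['max_n']}")
    if size not in limits["sizes"]:
        problems.append(f"size {size} is not supported by {model}")
    if model != "dall-e-3" and (quality is not None or style is not None):
        problems.append("quality and style are only supported for dall-e-3")
    return problems
```

For example, `check_image_params("a cat", model="dall-e-3", n=2)` flags the `n` value, because DALL·E 3 only supports `n=1`.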


In [None]:
# Generate an image using the OpenAI client with full parameters
image_full_parameters = client.images.generate(
    prompt="a shark in a suit inside the NYSE trading floor",  # Description of the image to generate
    model="dall-e-3",  # Specify the model to use
    n=1,  # Number of images to generate
    quality="standard",  # Quality of the generated image
    response_format="url",  # Format of the response (URL of the image)
    size="1024x1024",  # Size of the generated image
    style="natural",  # Style of the image
    user="user-id-123"  # User identifier for tracking purposes
)

# Extract the URL of the generated image
image_url = image_full_parameters.data[0].url

# Display the image URL and the image itself
print(f"Image URL: {image_url}")
display(Image(url=image_url))


### Multiple Images (DALL-E 2 Only)

Only the DALL-E 2 model currently supports creating more than one image at a time. To use this feature, specify the dall-e-2 model and set n to the number of images you want, up to 10.
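
If more than 10 images are needed, the total has to be split across several requests. The helper below is a small sketch of that arithmetic; `batch_sizes` is a hypothetical name, and each value it returns would be passed as `n` in a separate `client.images.generate` call.

```python
def batch_sizes(total, max_per_request=10):
    """Split `total` images into per-request counts of at most max_per_request."""
    if total < 0:
        raise ValueError("total must be non-negative")
    full, remainder = divmod(total, max_per_request)
    # Full batches first, then one smaller batch for whatever is left over
    return [max_per_request] * full + ([remainder] if remainder else [])
```

For instance, 25 images would be requested as batches of 10, 10, and 5.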

In [None]:
# Generate multiple images using the OpenAI client with specified parameters
images = client.images.generate(
    prompt="a pitbull dog",  # Description of the images to generate
    model="dall-e-2",  # Specify the model to use
    n=10,  # Number of images to generate
    response_format="url",  # Format of the response (URLs of the images)
    size="256x256",  # Size of the generated images (quality and style are dall-e-3 only, so they are omitted)
    user="user-id-123"  # User identifier for tracking purposes
)

# Extract URLs from the responses and create HTML for displaying the images
html = ""
for image in images.data:
    image_url = image.url  # Access the URL from the Image object
    html += f'<img src="{image_url}" style="display:inline; margin:10px; width:256px; height:256px;">'
    print(f"{image_url}\n")  # Print the URL of each image

# Display the images side by side in the notebook
display(HTML(html))


## Examining the Revised Prompt

With the release of DALL·E 3, the model now takes the prompt you provide and automatically rewrites it, both for safety reasons and to add more detail (more detailed prompts generally result in higher quality images).

While it is not currently possible to disable this feature, you can use prompting to get outputs closer to your requested image by adding the following to the start of your prompt: "I NEED to test how the tool works with extremely simple prompts. DO NOT add any detail, just use it AS-IS:"

The updated prompt is visible in the revised_prompt field of the data response object.


In [None]:
# Define the prompt for image generation
prompt = "a shark in a suit inside the NYSE trading floor"

# Generate an image using the OpenAI client with full parameters
image_full_parameters = client.images.generate(
    prompt=prompt,  # Description of the image to generate
    model="dall-e-3",  # Specify the model to use
    n=1,  # Number of images to generate
    quality="standard",  # Quality of the generated image
    response_format="url",  # Format of the response (URL of the image)
    size="1024x1024",  # Size of the generated image
    style="natural",  # Style of the image
    user="user-id-123"  # User identifier for tracking purposes
)

# Extract the URL of the generated image
image_url = image_full_parameters.data[0].url

# Show the original and revised prompts
print("Original Prompt:")
print(prompt)
print("\nRevised Prompt:")
print(image_full_parameters.data[0].revised_prompt)

# Display the image URL and the image itself
print(f"\nImage URL: {image_url}")
display(Image(url=image_url))


### Overriding the Revised Prompt
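
The prefix described earlier can be wrapped in a small helper so that any prompt can opt in to it. This is only a convenience sketch: `AS_IS_PREFIX` and `as_is_prompt` are hypothetical names, and the prefix is a prompting technique, so the model may still revise the prompt to some degree.

```python
AS_IS_PREFIX = (
    "I NEED to test how the tool works with extremely simple prompts. "
    "DO NOT add any detail, just use it AS-IS: "
)


def as_is_prompt(prompt):
    """Prepend the 'use it AS-IS' instruction unless it is already present."""
    if prompt.startswith(AS_IS_PREFIX):
        return prompt
    return AS_IS_PREFIX + prompt
```

The cell below builds the same prefixed prompt by hand.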

In [None]:
# Define the prompt for image generation
prompt = (
    "I NEED to test how the tool works with extremely simple prompts. "
    "DO NOT add any detail, just use it AS-IS: a shark in a suit inside the NYSE trading floor"
)

# Generate an image using the OpenAI client with full parameters
image_full_parameters = client.images.generate(
    prompt=prompt,  # Description of the image to generate
    model="dall-e-3",  # Specify the model to use
    n=1,  # Number of images to generate
    quality="standard",  # Quality of the generated image
    response_format="url",  # Format of the response (URL of the image)
    size="1024x1024",  # Size of the generated image
    style="natural",  # Style of the image
    user="user-id-123"  # User identifier for tracking purposes
)

# Extract the URL of the generated image
image_url = image_full_parameters.data[0].url

# Show the original and revised prompts
print("Original Prompt:")
print(prompt)
print("\nRevised Prompt:")
print(image_full_parameters.data[0].revised_prompt)

# Display the image URL and the image itself
print(f"\nImage URL: {image_url}")
display(Image(url=image_url))


## Downloading Images

To preserve the images you have generated, you can download them from their URLs or decode them from Base64 output.

**NOTE: Image URLs are only valid for ONE HOUR, after which they expire.**
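
To keep track of that one-hour window, the expiry time can be estimated from the `created` field of the response, which holds a Unix timestamp in seconds. The sketch below assumes the window starts at `created`; that starting point is an assumption made here, not something this notebook confirms. `url_expiry` and `is_expired` are hypothetical helper names.

```python
from datetime import datetime, timedelta, timezone


def url_expiry(created, ttl_minutes=60):
    """Return the UTC time at which a URL created at Unix time `created` expires."""
    created_at = datetime.fromtimestamp(created, tz=timezone.utc)
    return created_at + timedelta(minutes=ttl_minutes)


def is_expired(created, now=None, ttl_minutes=60):
    """True once the URL's lifetime has elapsed (now defaults to the current UTC time)."""
    now = now or datetime.now(timezone.utc)
    return now >= url_expiry(created, ttl_minutes)
```

With a response object named `image_full_parameters`, the call would be `url_expiry(image_full_parameters.created)`.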

### Download with a URL


In [None]:
# Generate an image using the OpenAI client with full parameters
image_full_parameters = client.images.generate(
    prompt="a cabin on a snowy hillside at night",  # Description of the image to generate
    model="dall-e-3",  # Specify the model to use
    n=1,  # Number of images to generate
    quality="standard",  # Quality of the generated image
    response_format="url",  # Format of the response (URL of the image)
    size="1024x1024",  # Size of the generated image
    style="natural",  # Style of the image
    user="user-id-123"  # User identifier for tracking purposes
)

# Extract the URL of the generated image
image_url = image_full_parameters.data[0].url

# Display the image
display(Image(url=image_url))

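# Download the image, raising an error if the request failed (for example, an expired URL)
download = requests.get(image_url, timeout=30)
download.raise_for_status()
image_data = download.content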

# Save the image to a file
with open('snowy_cabin.png', 'wb') as handler:
    handler.write(image_data)

print("Image downloaded and saved as 'snowy_cabin.png'")


### Download a Base64 Image

In [None]:
# Generate an image using the OpenAI client with full parameters
image_full_parameters = client.images.generate(
    prompt="a cabin on a snowy hillside at night",  # Description of the image to generate
    model="dall-e-3",  # Specify the model to use
    n=1,  # Number of images to generate
    quality="standard",  # Quality of the generated image
    response_format="b64_json",  # Format of the response (Base64 encoded JSON)
    size="1024x1024",  # Size of the generated image
    style="natural",  # Style of the image
    user="user-id-123"  # User identifier for tracking purposes
)

# Get the Base64 image data from the response
b64_image = image_full_parameters.data[0].b64_json

# Decode the Base64 image data to binary
image_data = base64.b64decode(b64_image)

# Display the image in the notebook
display(Image(data=image_data))

# Save the image to a file
with open('b64_snowy_cabin.png', 'wb') as handler:
    handler.write(image_data)

print("Image displayed and saved as 'b64_snowy_cabin.png'")
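
When a response contains several Base64 images (for example, a DALL-E 2 request with n greater than 1), the same decode-and-save step can be applied in a loop. The following is a minimal sketch; `save_b64_images` is a hypothetical helper, and the `.png` extension assumes the decoded data is PNG, as in the cell above.

```python
import base64
from pathlib import Path


def save_b64_images(b64_strings, out_dir, prefix="image"):
    """Decode each Base64 string and write it to out_dir as prefix_N.png."""
    out_path = Path(out_dir)
    out_path.mkdir(parents=True, exist_ok=True)
    saved = []
    for index, b64_string in enumerate(b64_strings, start=1):
        target = out_path / f"{prefix}_{index}.png"
        target.write_bytes(base64.b64decode(b64_string))
        saved.append(target)
    return saved


# Possible usage with a response requested with response_format="b64_json":
# paths = save_b64_images([item.b64_json for item in response.data], "images", prefix="cabin")
```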


## Editing Images (DALL-E 2 Only)

Also known as "inpainting", the image edits endpoint allows you to edit or extend an image by uploading an image and a mask that indicates which areas should be replaced. The transparent areas of the mask indicate where the image should be edited, and the prompt should describe the full new image, not just the erased area. This endpoint can enable experiences like DALL·E image editing in ChatGPT Plus.

In [None]:
# Generate an image using the OpenAI client with full parameters
image_full_parameters = client.images.generate(
    prompt="a pool with a nice house in the background",  # Description of the image to generate
    model="dall-e-2",  # Specify the model to use
    n=1,  # Number of images to generate
    response_format="url",  # Format of the response (URL of the image)
    size="1024x1024",  # Size of the generated image (quality and style are dall-e-3 only, so they are omitted)
    user="user-id-123"  # User identifier for tracking purposes
)

# Extract the URL of the generated image
image_url = image_full_parameters.data[0].url

# Display the image in the notebook
display(Image(url=image_url))

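# Download the image from the URL, raising an error if the request failed
download = requests.get(image_url, timeout=30)
download.raise_for_status()
image_data = download.content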

# Save the image to a file
with open('pool_original.png', 'wb') as handler:
    handler.write(image_data)

print("Image downloaded and saved as 'pool_original.png'")


In [None]:
# Open the original image using PIL
original_image = PILImage.open("pool_original.png")

# Convert the image to RGBA format (adds an alpha channel)
rgba_image = original_image.convert("RGBA")

# Save the converted image to a new file
rgba_image.save("pool_rgba.png")

print("Original image converted to RGBA format and saved as 'pool_rgba.png'")
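
The edit request below expects a mask file, `pool_rgba_mask.png`, that this notebook does not otherwise create. One way to produce it is to copy the RGBA image and make a rectangular region fully transparent. The sketch below uses PIL for that; `make_transparent_mask` is a hypothetical helper, and the box coordinates in the usage comment are placeholder values to adjust to the area you want edited.

```python
from PIL import Image as PILImage


def make_transparent_mask(source_path, mask_path, box):
    """Save a copy of the source image with the `box` region made fully transparent.

    box is (left, upper, right, lower) in pixels, following PIL's convention.
    """
    mask = PILImage.open(source_path).convert("RGBA")
    left, upper, right, lower = box
    # A fully transparent patch (alpha = 0) marks the area to be edited
    patch = PILImage.new("RGBA", (right - left, lower - upper), (0, 0, 0, 0))
    # Pasting without a mask argument overwrites all channels, including alpha
    mask.paste(patch, (left, upper))
    mask.save(mask_path)
    return mask


# Possible usage (placeholder coordinates for a region in the lower half of the image):
# make_transparent_mask("pool_rgba.png", "pool_rgba_mask.png", (256, 512, 768, 896))
```

Because the mask is derived from the image itself, it automatically has the same dimensions as the image it accompanies.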


In [None]:
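# Now use the converted image in your API request to edit the image
# Note: pool_rgba_mask.png must already exist. It should have the same dimensions
# as the image, with the area to edit made transparent.
with open("pool_rgba.png", "rb") as image_file, open("pool_rgba_mask.png", "rb") as mask_file:
    response = client.images.edit(
        image=image_file,  # The converted image file, opened in binary mode
        mask=mask_file,  # The mask image file, opened in binary mode
        # The prompt describes the full new image, not just the edited area
        prompt="a pool with a nice house in the background and a flamingo toy floating in the water",
        n=1,  # Number of edited images to generate
        size="1024x1024"  # Size of the edited image
    )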

# Print the API response for debugging purposes
print(response)

# Get the URL of the generated image
image_url = response.data[0].url

# Display the edited image
display(Image(url=image_url))

print("Edited image with a flamingo toy added to the water has been displayed.")


## Variations (DALL-E 2 Only)

The image variations endpoint generates one or more new versions of an existing image. It takes an image as input rather than a text prompt.

Variations offer multiple interpretations of the same source image, which provides a range of creative possibilities. Each generated image may differ from the original in style, composition, and detail.

In [None]:
# Create a variation of the image using the OpenAI client
with open("pool_rgba.png", "rb") as image_file:  # Close the file once the request completes
    response = client.images.create_variation(
        model="dall-e-2",  # Specify the model to use
        image=image_file,  # The image file, opened in binary mode
        n=1,  # Number of variations to generate
        size="1024x1024"  # Size of the generated image
    )

# Extract the URL of the generated variation
image_url = response.data[0].url

# Display the original image
display(Image(filename="pool_rgba.png"))

# Display the generated variation of the image
display(Image(url=image_url))

print("Original image and its variation have been displayed.")
