## 4.5. Image Generation (DALL-E)



The OpenAI Images API provides three methods for working with images: creating images from scratch from a text prompt (DALL-E 3 and DALL-E 2); creating edited versions of an existing image by having the model replace some areas based on a new text prompt (DALL-E 2 only); and creating variations of an existing image (DALL-E 2 only).

This guide covers the basics of each method, with code examples.



### 4.5.1 Image Generation 

The image generation method allows you to create an original image with a text prompt.

The example starts by importing the display function and the Image class from the IPython.display module, which are needed to display the image inside a Jupyter notebook.

It also imports os, the module for operating system operations mentioned earlier when creating a venv environment (it is not called directly in this example), and the OpenAI class from the openai module, which is the interface to the API.

The code then creates an instance of the OpenAI() client, which by default reads the API key from the OPENAI_API_KEY environment variable.
A request is made to the API to generate an image, with the following parameters:

* __model:__ specifies the model used to generate the image, in this case "dall-e-2", which specialises in generating images based on textual descriptions;
* __prompt:__ descriptive text that serves as input for the model to generate the image, in this example "A sunlit indoor lounge area with a pool containing a flamingo";
* __size:__ specifies the image size, here 1024x1024 pixels;
* __quality:__ sets the quality, here "standard". With DALL-E 3 it is also possible to set quality to "hd", which produces finer detail, although standard-quality square images are generated more quickly;
* __n:__ defines the number of images to generate, here 1.
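
The parameters above interact with the chosen model, so it can help to check a request locally before sending it. The following is a minimal sketch and not part of the OpenAI library: check_image_request and the MODEL_LIMITS table are hypothetical names, and the limits they encode (supported sizes, "hd" quality only for DALL-E 3, and the maximum value of n) are assumptions based on OpenAI's DALL-E documentation at the time of writing, so they should be verified against the current API reference.

```python
# Assumed per-model limits; verify against the current OpenAI API reference.
MODEL_LIMITS = {
    "dall-e-2": {
        "sizes": {"256x256", "512x512", "1024x1024"},
        "qualities": {"standard"},
        "max_n": 10,
    },
    "dall-e-3": {
        "sizes": {"1024x1024", "1792x1024", "1024x1792"},
        "qualities": {"standard", "hd"},
        "max_n": 1,
    },
}


def check_image_request(model, size, quality="standard", n=1):
    """Hypothetical helper: return a list of problems found in the arguments."""
    limits = MODEL_LIMITS.get(model)
    if limits is None:
        return [f"unknown model: {model}"]
    problems = []
    if size not in limits["sizes"]:
        problems.append(f"size {size} is not supported by {model}")
    if quality not in limits["qualities"]:
        problems.append(f"quality {quality} is not supported by {model}")
    if not 1 <= n <= limits["max_n"]:
        problems.append(f"n must be between 1 and {limits['max_n']} for {model}")
    return problems


print(check_image_request("dall-e-2", "1024x1024", "standard", 1))
print(check_image_request("dall-e-2", "1024x1024", "hd", 1))
```

An empty list means that no problem was detected; the second call reports that "hd" quality is not available for "dall-e-2" under the assumed limits.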

After the API generates the image, the image URL is extracted from the response and stored in the image_url variable.
Finally, the image URL is printed and the image is displayed using the display function with the Image class, passing the URL as a parameter and setting the image width to 500 pixels.

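```python
from IPython.display import display, Image

import os
from openai import OpenAI

# By default the client reads the API key from the OPENAI_API_KEY environment variable.
client = OpenAI()

# Request one standard-quality 1024x1024 image from DALL-E 2.
response = client.images.generate(
    model="dall-e-2",
    prompt="A sunlit indoor lounge area with a pool containing a flamingo",
    size="1024x1024",
    quality="standard",
    n=1,
)

# Extract the URL of the generated image, print it and display it in the notebook.
image_url = response.data[0].url
print(image_url)
display(Image(url=image_url, width=500))
```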

https://oaidalleapiprodscus.blob.core.windows.net/private/org-WBkw2zo1WHpT09Ib6xAN0sL0/user-n6ilnWbPionjiMJDmHYLi7un/img-DScd1CnuxYh4qoZdjZYNuHRk.png?st=2024-04-24T09%3A44%3A20Z&se=2024-04-24T11%3A44%3A20Z&sp=r&sv=2021-08-06&sr=b&rscd=inline&rsct=image/png&skoid=6aaadede-4fb3-4698-a8f6-684d7786b067&sktid=a48cca56-e6da-484e-a814-9c849652bcb3&skt=2024-04-23T18%3A04%3A07Z&ske=2024-04-24T18%3A04%3A07Z&sks=b&skv=2021-08-06&sig=RRvJlK3ApDq5ZpoWyCD6DEjMe2ZBwtvnTkhLuBmamI8%3D
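
The printed URL contains the query parameters st and se. Assuming these are the start and expiry timestamps of a signed storage URL, the link is only valid for a limited time, so the image should be downloaded if it needs to be kept. The sketch below reads that validity window from a URL; signed_url_window is a hypothetical helper, and the shortened example.com URL is a stand-in that keeps only the two timestamp values from the output above.

```python
from datetime import datetime, timezone
from urllib.parse import urlparse, parse_qs


def signed_url_window(url):
    """Return (start, expiry) datetimes read from the st and se query parameters."""
    query = parse_qs(urlparse(url).query)  # parse_qs also decodes %3A to ":"
    fmt = "%Y-%m-%dT%H:%M:%SZ"
    start = datetime.strptime(query["st"][0], fmt).replace(tzinfo=timezone.utc)
    expiry = datetime.strptime(query["se"][0], fmt).replace(tzinfo=timezone.utc)
    return start, expiry


# Shortened, hypothetical URL that keeps only the two timestamp parameters.
example_url = (
    "https://example.com/img.png"
    "?st=2024-04-24T09%3A44%3A20Z&se=2024-04-24T11%3A44%3A20Z"
)
start, expiry = signed_url_window(example_url)
print(expiry - start)
```

For the timestamps in the output above, the window is two hours.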


### 4.5.2 Edits (DALL-E 2 only)

Also known as "inpainting", the image editing endpoint allows you to edit an image by uploading the image together with a mask that indicates the areas to be replaced. You can use tools such as [GIMP](https://www.gimp.org/) or Adobe Photoshop to erase the relevant area of the image when preparing the mask.
The transparent areas of the mask indicate where the image should be edited, and the prompt should describe the complete new image, not just the erased area.

The image and mask must be square PNG images of less than 4 MB each and must have the same dimensions. The non-transparent areas of the mask are not used to generate the output, so they do not necessarily have to match the original image. The following example shows the original image, the mask and the resulting image after the editing process.



<figure>
    <img src="original.png" width="400" height="400" alt="">
    <figcaption>Original</figcaption>
</figure>

<figure>
    <img src="mask1.png" width="400" height="400" alt="">
    <figcaption>Mask</figcaption>
</figure>

<figure>
    <img src="output.png" width="400" height="400" alt="">
    <figcaption>Output</figcaption>
</figure>
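
The constraints described above (square PNG, less than 4 MB, identical dimensions) can be checked locally before the files are uploaded. The following sketch uses only the standard library and reads the width and height from the PNG header; check_edit_inputs, png_dimensions and fake_png_header are hypothetical helpers, and interpreting 4 MB as 4 × 1024 × 1024 bytes is an assumption.

```python
import struct

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"
MAX_BYTES = 4 * 1024 * 1024  # "less than 4 MB", read here as 4 MiB (assumption)


def png_dimensions(data):
    """Read width and height from the IHDR chunk at the start of PNG bytes."""
    if len(data) < 24 or data[:8] != PNG_SIGNATURE or data[12:16] != b"IHDR":
        raise ValueError("not a PNG file")
    return struct.unpack(">II", data[16:24])


def check_edit_inputs(image_bytes, mask_bytes):
    """Hypothetical helper: list violations of the image and mask constraints."""
    problems = []
    sizes = []
    for name, data in (("image", image_bytes), ("mask", mask_bytes)):
        if len(data) >= MAX_BYTES:
            problems.append(f"{name} is not smaller than 4 MB")
        width, height = png_dimensions(data)
        if width != height:
            problems.append(f"{name} is not square ({width}x{height})")
        sizes.append((width, height))
    if sizes[0] != sizes[1]:
        problems.append("image and mask dimensions differ")
    return problems


def fake_png_header(width, height):
    """Build just enough bytes for png_dimensions; this is not a complete PNG."""
    return (PNG_SIGNATURE + struct.pack(">I", 13) + b"IHDR"
            + struct.pack(">II", width, height))


print(check_edit_inputs(fake_png_header(1024, 1024), fake_png_header(512, 512)))
```

In practice the two arguments would be the bytes read from original.png and mask1.png, opened in "rb" mode as in the example that follows.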

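```python
from IPython.display import display, Image
from openai import OpenAI

client = OpenAI()

# Open the original image and the mask in binary mode; the files are closed
# automatically once the request has completed.
with open("original.png", "rb") as image_file, open("mask1.png", "rb") as mask_file:
    response = client.images.edit(
        model="dall-e-2",
        image=image_file,
        mask=mask_file,
        prompt="A sunlit indoor lounge area with a pool containing a flamingo",
        n=1,
        size="1024x1024",
    )

# Extract the URL of the edited image, print it and display it in the notebook.
image_url = response.data[0].url
print(image_url)
display(Image(url=image_url, width=500))
```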

https://oaidalleapiprodscus.blob.core.windows.net/private/org-WBkw2zo1WHpT09Ib6xAN0sL0/user-n6ilnWbPionjiMJDmHYLi7un/img-jsriaE8IV2LJXJbpOi9UQFoA.png?st=2024-04-28T07%3A11%3A58Z&se=2024-04-28T09%3A11%3A58Z&sp=r&sv=2021-08-06&sr=b&rscd=inline&rsct=image/png&skoid=6aaadede-4fb3-4698-a8f6-684d7786b067&sktid=a48cca56-e6da-484e-a814-9c849652bcb3&skt=2024-04-27T19%3A23%3A03Z&ske=2024-04-28T19%3A23%3A03Z&sks=b&skv=2021-08-06&sig=2%2B%2BDBFl9IaomB3gJUF6hJFbm1Et4/%2BAg/ZuMlFOPnao%3D


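### 4.5.3 Variations (DALL-E 2 only)

The image variations endpoint generates a variation of an existing image. The example below passes output.png, the result of the edit in the previous section, and requests one 1024x1024 variation. As with edits, the input image must be a square PNG of less than 4 MB.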

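```python
from IPython.display import display, Image
from openai import OpenAI

client = OpenAI()

# Open the source image in binary mode and request one variation of it.
with open("output.png", "rb") as image_file:
    response = client.images.create_variation(
        model="dall-e-2",
        image=image_file,
        n=1,
        size="1024x1024",
    )

# Extract the URL of the variation, print it and display it in the notebook.
image_url = response.data[0].url
print(image_url)
display(Image(url=image_url, width=500))
```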

https://oaidalleapiprodscus.blob.core.windows.net/private/org-WBkw2zo1WHpT09Ib6xAN0sL0/user-n6ilnWbPionjiMJDmHYLi7un/img-bm5FFHuJZztZu9h0j4ngo9Ab.png?st=2024-04-28T07%3A12%3A18Z&se=2024-04-28T09%3A12%3A18Z&sp=r&sv=2021-08-06&sr=b&rscd=inline&rsct=image/png&skoid=6aaadede-4fb3-4698-a8f6-684d7786b067&sktid=a48cca56-e6da-484e-a814-9c849652bcb3&skt=2024-04-27T19%3A24%3A18Z&ske=2024-04-28T19%3A24%3A18Z&sks=b&skv=2021-08-06&sig=6n03Xd9tS3VhqET9hQ7au0DS5TjfNb9Fmlkeyw05Ivc%3D
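
The example above requests a single variation and reads only response.data[0]. When n is greater than 1, the response is assumed to contain one entry per generated image, which can be collected as sketched below. image_urls is a hypothetical helper, and a stand-in object with placeholder example.com URLs replaces the real API response so that the sketch runs without a network call.

```python
from types import SimpleNamespace


def image_urls(response):
    """Return the URL of every image in a response, in order."""
    return [item.url for item in response.data]


# Stand-in for an API response obtained with n=3 (not a real API call).
fake_response = SimpleNamespace(
    data=[
        SimpleNamespace(url=f"https://example.com/variation-{i}.png")
        for i in range(3)
    ]
)

for url in image_urls(fake_response):
    print(url)
```

With a real response, each URL could then be passed to display(Image(url=url, width=500)) in turn.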
