## What is a Large Language Model (LLM)?

Large language models (LLMs) are a type of foundation model trained on vast datasets. This training enables them to interpret and generate human language, as well as other forms of content. As a result, LLMs can perform a wide variety of tasks, such as answering questions, summarizing information, translating between languages, and generating creative text.

---

## What is Multimodal?

Multimodal refers to a model’s ability to work with more than one type of data, such as text, images, audio, and video. In artificial intelligence, multimodal systems can process and combine information across these formats. Common multimodal tasks include:

* **Text to image**: Creating images from written descriptions, as DALL·E does
* **Text to audio**: Transforming written text into spoken language or sounds
* **Image to text**: Interpreting images to produce captions or descriptions
* **Audio to text**: Converting spoken words into written text
* **Video analysis**: Understanding video content by analyzing both visual and audio elements

By integrating multiple data types, multimodal models can draw on context that a single-modality model would miss. For example, they can generate images from text prompts or describe visual scenes in words, enabling more advanced applications like interactive creative tools and context-aware conversational AI.

---

## What is DALL·E 2?

DALL·E 2 is an artificial intelligence system created by OpenAI that generates realistic images and artwork from text prompts. Released in 2022, it builds on the original DALL·E model and offers several enhanced capabilities, including:

* **Text-to-image generation** for creating visuals from natural language
* **Image editing** to modify existing images
* **Image variations** that produce multiple versions of the same concept
* **Multiple output resolutions**: `256×256`, `512×512`, or `1024×1024` pixels

DALL·E 2 is proprietary technology: a commercial product developed by OpenAI.

---

## What is DALL·E 3?

DALL·E 3 is a text-to-image model that OpenAI released in 2023 as the successor to DALL·E 2. It improves on DALL·E 2 by offering:

* **Higher-quality image generation** with greater detail and accuracy
* **Improved prompt comprehension**, especially for complex instructions
* **Enhanced text rendering**, allowing readable text within images
* **Stronger artistic style replication**
* **Advanced safety mechanisms** for responsible use
* **Direct integration with ChatGPT**, enabling users to refine prompts interactively

With higher-resolution outputs and closer alignment to user intent, DALL·E 3 is especially useful for professional design, creative projects, and detailed visual storytelling.


## Image Generation

The Images API provides three different endpoints, each designed for a specific image-related task:

* **Generations**: Creates new images entirely from a text description.
* **Edits**: Modifies existing images by replacing selected areas based on a text prompt.
* **Variations**: Produces alternative versions of an existing image.
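
In the `openai` Python SDK (v1.x), these three endpoints are exposed as methods on `client.images`. The lookup table below is a minimal sketch of that correspondence; `IMAGES_ENDPOINTS` and `call_images_endpoint` are hypothetical names introduced here for illustration and are not part of the SDK.

```python
# Endpoint name -> (REST path, method name on `client.images` in the openai v1 SDK).
IMAGES_ENDPOINTS = {
    "generations": ("/v1/images/generations", "generate"),
    "edits": ("/v1/images/edits", "edit"),
    "variations": ("/v1/images/variations", "create_variation"),
}


def call_images_endpoint(client, endpoint, **params):
    """Dispatch to the SDK method behind the named Images API endpoint.

    Hypothetical helper for illustration; `client` is expected to be an
    `openai.OpenAI()` instance (or any object with the same `images` methods).
    """
    try:
        _path, method_name = IMAGES_ENDPOINTS[endpoint]
    except KeyError:
        raise ValueError(f"unknown Images API endpoint: {endpoint!r}") from None
    return getattr(client.images, method_name)(**params)
```

For example, `call_images_endpoint(OpenAI(), "generations", model="dall-e-2", prompt="a white siamese cat")` would be equivalent to calling `client.images.generate(...)` directly.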

### DALL·E 2

In [None]:
from openai import OpenAI
from IPython import display

client = OpenAI()

response = client.images.generate(
    model="dall-e-2",
    prompt="a white siamese cat",
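    size="512x512",  # DALL·E 2 accepts only 256x256, 512x512, or 1024x1024
    # quality="standard",  # not needed: DALL·E 2 supports standard quality only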
    n=1,
)

url = response.data[0].url
display.Image(url=url, width=512)

In [None]:
# a beautiful lake with a sunset

from openai import OpenAI
from IPython import display

client = OpenAI()

response = client.images.generate(
    model="dall-e-2",
    prompt="a beautiful lake with a sunset",
    size="1024x1024",
    n=1,
)

url = response.data[0].url
display.Image(url=url, width=512)

### DALL·E 3

In [None]:
from openai import OpenAI
from IPython import display

client = OpenAI()

response = client.images.generate(
    model="dall-e-3",
    prompt="a white siamese cat",
    size="1024x1024",
    quality="standard",
    n=1,
)

url = response.data[0].url
display.Image(url=url, width=512)

In [None]:
# a beautiful lake with a sunset, this time generated with DALL·E 3
from openai import OpenAI
from IPython import display

client = OpenAI()

response = client.images.generate(
    model="dall-e-3",
    prompt="a beautiful lake with a sunset",
    size="1024x1024",
    quality="standard",
    n=1,
)

url = response.data[0].url
display.Image(url=url, width=512)

### Which Model Should You Use?

DALL·E 2 and DALL·E 3 offer different capabilities for image generation.

| Model    | Supported Endpoints            | Best Use Cases                                                                                                                               |
| -------- | ------------------------------ | -------------------------------------------------------------------------------------------------------------------------------------------- |
| DALL·E 2 | Generations, Edits, Variations | Greater flexibility with image editing and variations, finer prompt control, and the ability to generate multiple images in a single request |
| DALL·E 3 | Generations only               | Higher-quality outputs and support for larger image dimensions                                                                               |
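
The table can be read as a small decision rule. The sketch below encodes one possible reading of it; `choose_model` is a hypothetical helper, and preferring DALL·E 3 whenever it is able to serve the request is an assumption made here, not a recommendation from the API itself.

```python
def choose_model(task, images_per_request=1):
    """Pick a model name following the comparison table above.

    Hypothetical helper: DALL·E 3 is preferred for single-image generations
    (an assumption), and DALL·E 2 is used for everything DALL·E 3 cannot do.
    """
    if task not in ("generations", "edits", "variations"):
        raise ValueError(f"unknown task: {task!r}")
    if task != "generations":
        return "dall-e-2"  # edits and variations are DALL·E 2 only
    if images_per_request > 1:
        return "dall-e-2"  # several images in a single request
    return "dall-e-3"      # single generation, higher-quality output
```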

---

### Image Generations

The image generation endpoint allows users to create original images using text prompts. Generated images can be returned as a URL or as Base64-encoded data, selected with the `response_format` parameter. URLs expire after an hour, so download any image you want to keep.
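
When Base64 output is requested (`response_format="b64_json"`), the image data arrives in `response.data[0].b64_json` instead of `response.data[0].url`. A minimal sketch of saving such a payload to disk follows; `save_b64_image` is a hypothetical helper name, and the file name in the usage comment is a placeholder.

```python
import base64
from pathlib import Path


def save_b64_image(b64_json, path):
    """Decode a Base64 image payload and write the raw bytes to `path`."""
    image_bytes = base64.b64decode(b64_json)
    out = Path(path)
    out.write_bytes(image_bytes)
    return out


# Usage with the Images API (not executed here; requires an API key):
# response = client.images.generate(
#     model="dall-e-2",
#     prompt="a beautiful lake with a sunset",
#     response_format="b64_json",
# )
# save_b64_image(response.data[0].b64_json, "lake.png")
```

Saving the decoded bytes locally avoids depending on a URL that will later expire.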

**Size and Quality Options**

Square images with standard quality are the fastest to generate. By default, images are created at a resolution of `1024×1024` pixels, though supported sizes vary by model. The API's `size` parameter is written with an ASCII `x`, as in `size="1024x1024"`.

| Model    | Supported Sizes (pixels)              | Quality Settings                                           | Request Limits                                                      |
| -------- | ------------------------------------- | ---------------------------------------------------------- | ------------------------------------------------------------------- |
| DALL·E 2 | `256×256`, `512×512`, `1024×1024`     | Standard quality only                                      | Up to 10 images per request                                         |
| DALL·E 3 | `1024×1024`, `1024×1792`, `1792×1024` | Standard by default, with an optional high-definition mode | One image per request (additional images require parallel requests) |
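
Checking a request against these limits before sending it avoids a rejected API call. The sketch below transcribes the table into a dictionary; `MODEL_LIMITS` and `build_generation_request` are hypothetical names, and `"hd"` is the `quality` value the API uses for the high-definition mode.

```python
# Limits transcribed from the table above (sizes use the ASCII "x" the API expects).
MODEL_LIMITS = {
    "dall-e-2": {
        "sizes": {"256x256", "512x512", "1024x1024"},
        "qualities": {"standard"},
        "max_n": 10,
    },
    "dall-e-3": {
        "sizes": {"1024x1024", "1024x1792", "1792x1024"},
        "qualities": {"standard", "hd"},
        "max_n": 1,
    },
}


def build_generation_request(model, prompt, size="1024x1024", quality="standard", n=1):
    """Validate the arguments and return kwargs for `client.images.generate`."""
    limits = MODEL_LIMITS.get(model)
    if limits is None:
        raise ValueError(f"unknown model: {model!r}")
    if size not in limits["sizes"]:
        raise ValueError(f"{model} does not support size {size!r}")
    if quality not in limits["qualities"]:
        raise ValueError(f"{model} does not support quality {quality!r}")
    if not 1 <= n <= limits["max_n"]:
        raise ValueError(f"{model} allows 1 to {limits['max_n']} images per request")
    request = {"model": model, "prompt": prompt, "size": size, "n": n}
    if model == "dall-e-3":
        # Only DALL·E 3 has a quality choice, so the key is omitted for DALL·E 2.
        request["quality"] = quality
    return request
```

The result can be passed straight through, for example `client.images.generate(**build_generation_request("dall-e-2", "a white siamese cat", size="512x512"))`.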

---

### Image Edits (DALL·E 2 Only)

The image editing endpoint allows users to modify or extend an image by providing the original image along with a mask that specifies which areas should be replaced. This technique is commonly known as **inpainting**.

The transparent sections of the mask indicate where changes will occur. The accompanying text prompt should describe the entire updated image, not just the edited portion.
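
To make the role of transparency concrete, the sketch below builds a square RGBA mask that is fully transparent inside a rectangle and opaque black everywhere else. It writes the PNG with the standard library only so that it runs without extra dependencies; in practice an imaging library such as Pillow would be the more usual tool. `make_mask_png` is a hypothetical helper name.

```python
import struct
import zlib


def _chunk(tag, data):
    """Serialize one PNG chunk: length, tag, data, CRC over tag + data."""
    return struct.pack(">I", len(data)) + tag + data + struct.pack(">I", zlib.crc32(tag + data))


def make_mask_png(size, box):
    """Return PNG bytes for a size x size mask that is transparent inside `box`.

    `box` is (left, top, right, bottom) in pixels, right/bottom exclusive.
    """
    left, top, right, bottom = box
    if not (0 <= left < right <= size and 0 <= top < bottom <= size):
        raise ValueError("box must lie inside the image")
    rows = bytearray()
    for y in range(size):
        rows.append(0)  # PNG filter type 0 (none) for this scanline
        for x in range(size):
            inside = left <= x < right and top <= y < bottom
            # RGBA pixel: alpha 0 marks the area to regenerate, 255 keeps the original.
            rows += bytes((0, 0, 0, 0 if inside else 255))
    # IHDR: width, height, bit depth 8, color type 6 (RGBA), default methods.
    ihdr = struct.pack(">IIBBBBB", size, size, 8, 6, 0, 0, 0)
    return (
        b"\x89PNG\r\n\x1a\n"
        + _chunk(b"IHDR", ihdr)
        + _chunk(b"IDAT", zlib.compress(bytes(rows)))
        + _chunk(b"IEND", b"")
    )
```

Writing `make_mask_png(1024, (256, 256, 768, 768))` to a file would give a mask that asks the model to repaint the central square of a `1024×1024` image.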

To use this feature, both the original image and the mask must be square PNG files, under 4 MB in size, and share the same dimensions. Non-transparent areas of the mask are ignored during image generation and do not need to exactly match the original image.
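
These requirements can be checked locally before uploading. The following sketch reads the width and height from the PNG header and applies the rules above; `png_dimensions` and `check_edit_inputs` are hypothetical helpers, and treating "4 MB" as 4 × 1024 × 1024 bytes is an assumption.

```python
import struct

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"
MAX_BYTES = 4 * 1024 * 1024  # assumption: "4 MB" read as 4 MiB


def png_dimensions(data):
    """Return (width, height) read from the IHDR chunk of PNG bytes."""
    if len(data) < 24 or data[:8] != PNG_SIGNATURE or data[12:16] != b"IHDR":
        raise ValueError("not a PNG file")
    return struct.unpack(">II", data[16:24])


def check_edit_inputs(image_bytes, mask_bytes):
    """Validate an image/mask pair for the edits endpoint; return their size."""
    dims = []
    for label, data in (("image", image_bytes), ("mask", mask_bytes)):
        if len(data) >= MAX_BYTES:
            raise ValueError(f"{label} must be under 4 MB")
        width, height = png_dimensions(data)
        if width != height:
            raise ValueError(f"{label} must be square, got {width}x{height}")
        dims.append((width, height))
    if dims[0] != dims[1]:
        raise ValueError("image and mask must have the same dimensions")
    return dims[0]
```

Once the check passes, the request itself would look like `client.images.edit(model="dall-e-2", image=open("original.png", "rb"), mask=open("mask.png", "rb"), prompt="...")`, where the file names are placeholders.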

---

### Image Variations (DALL·E 2 Only)

The image variation endpoint generates alternative versions of a provided image. These variations retain the core structure of the original while introducing visual differences.

As with image edits, the input image must be a square PNG file smaller than 4 MB.
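
A minimal sketch of a variations request with the `openai` Python SDK is shown below. `request_variations` is a hypothetical wrapper; it takes the client as an argument so the function can be exercised without network access, and it assumes the default URL response format.

```python
def request_variations(client, image_path, n=2, size="1024x1024"):
    """Ask the variations endpoint for `n` alternatives of a square PNG.

    `client` is expected to be an `openai.OpenAI()` instance. Returns the
    list of result URLs.
    """
    with open(image_path, "rb") as image_file:
        response = client.images.create_variation(
            model="dall-e-2",  # variations are available for DALL·E 2 only
            image=image_file,
            n=n,
            size=size,
        )
    return [item.url for item in response.data]
```

Because DALL·E 2 accepts up to 10 images per request, several variations can be requested in a single call by raising `n`.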

---
