# 📓 The GenAI Revolution Cookbook

**Title:** AI Image Generation Simplified: Using DALL·E 2 for Stunning Results

**Description:** Unlock the full potential of AI image generation with DALL·E 2. Master API integration, prompt engineering, and output management for stunning results.

**📖 Read the full article:** [AI Image Generation Simplified: Using DALL·E 2 for Stunning Results](https://blog.thegenairevolution.com/article/ai-image-generation-simplified-using-dalle-2-for-stunning-results-2)

---

*This Jupyter notebook contains executable code examples. Run the cells below to try out the code yourself!*



## Introduction
So I was messing around with DALL·E 2 the other day, and honestly? I was kind of blown away by how simple the whole thing was to get running. I mean, here's this incredibly powerful AI that can create images from text, and the API is just... straightforward. No weird hoops to jump through, no cryptic documentation that makes you want to pull your hair out.

Look, I'm going to walk you through the actual nuts and bolts here - how to set it up, generate images, create variations, all that good stuff. And I'll try to keep it practical because, let's face it, we're all just trying to build something that works.

## Why Use DALL·E 2?
Okay, real talk - DALL·E 2 isn't the newest kid on the block anymore, but it's still really solid for most things you'd want to build:

- **The quality is genuinely good**: Not every image is a masterpiece (sometimes you get weird hands or whatever), but most of the time? Totally usable.
- **The API is dead simple**: RESTful, works with Python, you can literally get it running in like 5 minutes.
- **It does what you need**: Text-to-image, variations, edits - that covers basically everything.
- **OpenAI's infrastructure**: You're not dealing with random server crashes at 3 AM.

I've been using it mostly for generating custom visuals on the fly. Actually, wait - let me be more specific. Last week I built this little tool for our marketing team where they could generate social media graphics just by describing what they wanted. Saved them hours of stock photo hunting.

## Core Concepts
Before we dive into the code, here's what you actually need to know:

**Prompts**: This is where most people mess up. "A red car" will give you... well, a red car. But probably not what you actually wanted. Try something like "A vintage red Ferrari on a mountain road at sunset, cinematic lighting, shot on film" - now we're talking.

**Image Generation**: You get three sizes - 256x256, 512x512, or 1024x1024. Bigger = more expensive (as of 2024, roughly $0.016, $0.018, and $0.020 per image). I usually test with 512x512 because I'm cheap like that.

**Variations**: This is super handy. Got an image that's almost perfect but not quite? Generate variations. Sometimes the third or fourth one is exactly what you needed.

**Edits (Inpainting)**: You can change specific parts of an image using masks. To be honest, this one's a bit finicky. Sometimes it works beautifully, sometimes it completely ignores what you asked for. But when it works? Chef's kiss.
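
The size and count limits used in this notebook (three square sizes, 1 to 10 images per call) are cheap to check locally before spending an API call. Here's a minimal sketch of that check. The helper name `validate_request` is my own, not something from the OpenAI SDK.

```python
# The allowed values mirror the ones used in this notebook.
# validate_request is a hypothetical helper, not an SDK function.
VALID_SIZES = ("256x256", "512x512", "1024x1024")


def validate_request(size: str, n: int) -> None:
    """Raise ValueError before any API call if size or n is out of range."""
    if size not in VALID_SIZES:
        raise ValueError(f"size must be one of {VALID_SIZES}, got {size!r}")
    if not 1 <= n <= 10:
        raise ValueError(f"n must be between 1 and 10, got {n}")


validate_request("512x512", 2)  # A valid request passes silently

try:
    validate_request("800x600", 1)
except ValueError as err:
    print(f"Rejected: {err}")
```

Failing locally like this gives you a readable message instead of a round trip to the API.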

## Setup
Alright, let's get our hands dirty. First, install what you need:

In [None]:
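# The code below uses the legacy (pre-1.0) openai SDK interface, e.g. openai.Image.create,
# which was removed in openai 1.0, so pin the older release line.
!pip install "openai<1.0" pillow requests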

Then set up your imports and API key:

In [None]:
import openai
import os
from PIL import Image
import requests
from io import BytesIO

# Set your OpenAI API key
openai.api_key = os.getenv("OPENAI_API_KEY")  # or set directly: "your-api-key-here"

Get your API key from [OpenAI's platform](https://platform.openai.com/api-keys). And seriously, use environment variables. I once accidentally pushed my API key to GitHub and... yeah, that was a fun conversation with accounting.
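
If the environment variable is missing, `os.getenv` quietly returns `None` and you only find out when the first API call fails. Here's a small sketch that fails early instead. `load_api_key` is a hypothetical helper name, and the error wording is just a suggestion.

```python
import os


def load_api_key(var_name: str = "OPENAI_API_KEY") -> str:
    """Return the API key from the environment, or raise if it is missing."""
    key = os.environ.get(var_name, "").strip()
    if not key:
        raise RuntimeError(
            f"{var_name} is not set. Export it in your shell or notebook "
            "environment instead of hard-coding the key."
        )
    return key
```

In this notebook you could then write `openai.api_key = load_api_key()` in place of the plain `os.getenv` call.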

## Generating Images from Text
This is the bread and butter - turning your words into pictures:

In [None]:
def generate_image(prompt, size="1024x1024", n=1):
    """
    Generate images using DALL·E 2
    
    Args:
        prompt: Text description of the desired image
        size: Image dimensions (256x256, 512x512, or 1024x1024)
        n: Number of images to generate (1-10)
    """
    response = openai.Image.create(
        prompt=prompt,
        n=n,
        size=size
    )
    return response['data']

# Generate a single image
prompt = "A futuristic city skyline at sunset with flying cars, digital art style"
images = generate_image(prompt, size="1024x1024", n=1)

# The response contains URLs to the generated images
image_url = images[0]['url']
print(f"Generated image URL: {image_url}")

And here's how you actually save the thing:

In [None]:
def download_image(url):
    """Download image from URL and return PIL Image object"""
    response = requests.get(url, timeout=30)
    response.raise_for_status()  # Fail loudly on an expired or invalid URL
    return Image.open(BytesIO(response.content))

# Download and save the image
img = download_image(image_url)
img.save("generated_image.png")
print("Image saved as generated_image.png")
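
The download step above fetches a hosted URL. The Images API can also return the image inline as base64 if you pass `response_format="b64_json"` to the generation call, which skips the second HTTP request. Here's a sketch of decoding that kind of payload. `save_b64_image` is a hypothetical helper, and the payload below is a stand-in so the example runs offline.

```python
import base64
import os
import tempfile
from pathlib import Path


def save_b64_image(b64_data: str, filepath: str) -> int:
    """Decode a base64 image payload, write it to disk, return bytes written."""
    raw = base64.b64decode(b64_data)
    Path(filepath).write_bytes(raw)
    return len(raw)


# Stand-in payload (just the 8-byte PNG signature). With the real API this
# would be images[0]["b64_json"] after requesting response_format="b64_json".
fake_payload = base64.b64encode(b"\x89PNG\r\n\x1a\n").decode("ascii")
target = os.path.join(tempfile.gettempdir(), "decoded_example.png")
written = save_b64_image(fake_payload, target)
print(f"Wrote {written} bytes to {target}")
```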

Some prompt tips I've picked up (mostly through trial and error):

- Style descriptors are your friend - "oil painting", "3D render", "photograph", whatever
- Lighting matters way more than you'd think
- Be weirdly specific about composition
- Actually, the more specific you are about everything, the better. It's like the AI needs you to paint a picture with words before it can... paint a picture
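
One way to make those tips a habit is to assemble prompts from named parts, so you notice when style or lighting is missing. This is just a sketch. `build_prompt` and its parameter names are mine, and the comma-joined ordering is a convention I find readable, not something the API requires.

```python
def build_prompt(subject, style=None, lighting=None, composition=None):
    """Join the non-empty parts into one comma-separated prompt."""
    parts = [subject, composition, lighting, style]
    return ", ".join(p.strip() for p in parts if p and p.strip())


prompt = build_prompt(
    "A vintage red Ferrari on a mountain road at sunset",
    style="shot on film",
    lighting="cinematic lighting",
)
print(prompt)
```
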
## Creating Image Variations
This feature has saved my bacon more times than I can count. You know when you generate something that's like 90% perfect? That's when variations come in:

In [None]:
def create_variation(image_path, n=1, size="1024x1024"):
    """
    Create variations of an existing image
    
    Args:
        image_path: Path to the source image (must be PNG, < 4MB, square)
        n: Number of variations to generate
        size: Output image size
    """
    with open(image_path, "rb") as image_file:
        response = openai.Image.create_variation(
            image=image_file,
            n=n,
            size=size
        )
    return response['data']

# Create variations of the generated image
variations = create_variation("generated_image.png", n=2)

# Download variations
for i, var in enumerate(variations):
    img = download_image(var['url'])
    img.save(f"variation_{i}.png")
    print(f"Saved variation_{i}.png")

Quick heads up - your source image needs to be square, PNG format, and under 4MB. Found this out after spending 20 minutes wondering why my rectangular JPEGs weren't working. The error messages aren't always super helpful.
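
Since the API errors don't always tell you which rule you broke, it can help to check the file yourself first. The sketch below reads only the PNG header with the standard library (a PNG starts with an 8-byte signature, followed by an `IHDR` chunk that holds width and height). `check_source_image` is a hypothetical helper. You could do the same check with Pillow.

```python
import os
import struct
import tempfile

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"
MAX_BYTES = 4 * 1024 * 1024  # The 4MB limit mentioned above


def check_source_image(path: str) -> list:
    """Return a list of problems; an empty list means the file looks acceptable."""
    problems = []
    if os.path.getsize(path) >= MAX_BYTES:
        problems.append("file is 4MB or larger")
    with open(path, "rb") as f:
        header = f.read(24)
    if len(header) < 24 or header[:8] != PNG_SIGNATURE or header[12:16] != b"IHDR":
        problems.append("not a PNG file")
        return problems
    # Width and height are big-endian 32-bit integers at bytes 16-23
    width, height = struct.unpack(">II", header[16:24])
    if width != height:
        problems.append(f"not square ({width}x{height})")
    return problems


def _write_fake_png_header(width: int, height: int) -> str:
    """Demo only: write just enough of a PNG header for the check above."""
    data = PNG_SIGNATURE + struct.pack(">I", 13) + b"IHDR" + struct.pack(">II", width, height)
    with tempfile.NamedTemporaryFile(suffix=".png", delete=False) as f:
        f.write(data)
    return f.name


square_path = _write_fake_png_header(512, 512)
wide_path = _write_fake_png_header(640, 480)
square_problems = check_source_image(square_path)
wide_problems = check_source_image(wide_path)
print(square_problems, wide_problems)
os.remove(square_path)
os.remove(wide_path)
```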

## Editing Images with Inpainting
Okay, this is where things get interesting. Want to change just one part of an image? Here's how:

In [None]:
def edit_image(image_path, mask_path, prompt, n=1, size="1024x1024"):
    """
    Edit an image using a mask and text prompt
    
    Args:
        image_path: Path to the original image (PNG, < 4MB, square)
        mask_path: Path to the mask image (PNG, transparent areas indicate edit regions)
        prompt: Description of the desired edit
        n: Number of edited versions to generate
        size: Output image size
    """
    with open(image_path, "rb") as image_file, open(mask_path, "rb") as mask_file:
        response = openai.Image.create_edit(
            image=image_file,
            mask=mask_file,
            prompt=prompt,
            n=n,
            size=size
        )
    return response['data']

# Example: Edit a specific region
# The mask must be a PNG with the same dimensions as the image; transparent areas mark what to edit
edited = edit_image(
    "generated_image.png",
    "mask.png",
    "Add a bright full moon in the sky",
    n=1
)

img = download_image(edited[0]['url'])
img.save("edited_image.png")

But wait - creating masks in Photoshop is annoying. Let me show you a programmatic way that's saved me tons of time:

In [None]:
from PIL import Image, ImageDraw

def create_circular_mask(image_path, center, radius, output_path):
    """Create a circular mask for inpainting"""
    img = Image.open(image_path).convert("RGBA")
    mask = Image.new("RGBA", img.size, (0, 0, 0, 255))  # Opaque black
    draw = ImageDraw.Draw(mask)
    
    # Draw transparent circle (area to edit)
    left = center[0] - radius
    top = center[1] - radius
    right = center[0] + radius
    bottom = center[1] + radius
    draw.ellipse([left, top, right, bottom], fill=(0, 0, 0, 0))
    
    mask.save(output_path)
    return output_path

# Create a mask for the top-right region
create_circular_mask("generated_image.png", center=(768, 256), radius=200, output_path="mask.png")
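
One gotcha: the center and radius above are pixel values for a 1024x1024 image. If you test at 512x512, a center of (768, 256) falls outside the image. Here's a small sketch that expresses the circle as fractions of the image size instead. `circle_for_size` is a hypothetical helper of mine.

```python
def circle_for_size(size: int, cx_frac: float, cy_frac: float, r_frac: float):
    """Convert a fractional center/radius (0.0-1.0) to pixels for a square image."""
    center = (round(size * cx_frac), round(size * cy_frac))
    radius = round(size * r_frac)
    return center, radius


# 0.75 / 0.25 / (200 / 1024) reproduces the top-right circle used above
large = circle_for_size(1024, 0.75, 0.25, 200 / 1024)
small = circle_for_size(512, 0.75, 0.25, 200 / 1024)
print(large, small)
```

You can then pass the returned center and radius straight to `create_circular_mask`.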

## Building a Production-Ready Generator
Alright, so everything above works great when you're just playing around. But if you're actually shipping something? You need error handling, retries, the whole nine yards. Here's what I use in production:

In [None]:
import time
from typing import List

class DALLEGenerator:
    """Production-ready DALL·E 2 image generator with error handling"""
    
    def __init__(self, api_key: str, max_retries: int = 3):
        openai.api_key = api_key
        self.max_retries = max_retries
    
    def generate(self, prompt: str, size: str = "1024x1024", n: int = 1) -> List[str]:
        """Generate images with retry logic"""
        for attempt in range(self.max_retries):
            try:
                response = openai.Image.create(
                    prompt=prompt,
                    n=n,
                    size=size
                )
                return [img['url'] for img in response['data']]
            
            except openai.error.RateLimitError:
                if attempt < self.max_retries - 1:
                    wait_time = 2 ** attempt  # Exponential backoff
                    print(f"Rate limit hit. Waiting {wait_time}s...")
                    time.sleep(wait_time)
                else:
                    raise
            
            except openai.error.InvalidRequestError as e:
                print(f"Invalid request: {e}")
                raise
            
            except Exception as e:
                print(f"Unexpected error: {e}")
                if attempt < self.max_retries - 1:
                    time.sleep(1)
                else:
                    raise
    
    def download_and_save(self, url: str, filepath: str) -> None:
        """Download image from URL and save to disk"""
        try:
            response = requests.get(url, timeout=30)
            response.raise_for_status()
            
            img = Image.open(BytesIO(response.content))
            img.save(filepath)
            print(f"Saved image to {filepath}")
        
        except Exception as e:
            print(f"Failed to download image: {e}")
            raise

# Usage example
generator = DALLEGenerator(api_key=os.getenv("OPENAI_API_KEY"))

prompts = [
    "A serene mountain landscape with a lake reflection, photorealistic",
    "Abstract geometric patterns in vibrant colors, modern art style",
    "A cozy coffee shop interior with warm lighting, architectural photography"
]

for i, prompt in enumerate(prompts):
    print(f"\nGenerating image {i+1}: {prompt}")
    urls = generator.generate(prompt, size="1024x1024", n=1)
    generator.download_and_save(urls[0], f"output_{i}.png")
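
The retry loop inside `DALLEGenerator.generate` can also be pulled out into a reusable helper. The sketch below is a generic version with two changes that are mine, not part of the class above: a little random jitter on each wait, and an injectable `sleep` so the demo runs instantly. It catches every exception for brevity. In real use you'd probably narrow that to the rate-limit error.

```python
import random
import time


def retry_with_backoff(func, max_retries=3, base_delay=1.0, sleep=time.sleep):
    """Call func(); on failure wait base_delay * 2**attempt (plus jitter) and retry."""
    for attempt in range(max_retries):
        try:
            return func()
        except Exception:
            if attempt == max_retries - 1:
                raise  # Out of attempts: surface the last error
            delay = base_delay * (2 ** attempt) + random.uniform(0, 0.5)
            sleep(delay)


# Demo: a function that fails twice, then succeeds
calls = {"count": 0}


def flaky():
    calls["count"] += 1
    if calls["count"] < 3:
        raise RuntimeError("temporary failure")
    return "ok"


waits = []  # Record the delays instead of actually sleeping
result = retry_with_backoff(flaky, sleep=waits.append)
print(result, [round(w, 1) for w in waits])
```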

## Run and Evaluate
Let's put it all together and see if this thing actually works:

In [None]:
# Initialize generator
generator = DALLEGenerator(api_key=os.getenv("OPENAI_API_KEY"))

# Test prompt
test_prompt = "A robot reading a book in a library, warm lighting, digital illustration"

# Generate image
print("Generating image...")
image_urls = generator.generate(test_prompt, size="512x512", n=1)

# Download and save
generator.download_and_save(image_urls[0], "test_output.png")

# Create variations
print("\nCreating variations...")
variations = create_variation("test_output.png", n=2, size="512x512")

for i, var in enumerate(variations):
    generator.download_and_save(var['url'], f"test_variation_{i}.png")

print("\nGeneration complete. Check the output files.")
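
Fixed names like `test_output.png` get overwritten on every run. Here's one possible naming scheme, sketched below: a timestamp plus a slug of the prompt, so you can tell later what produced each file. `output_filename` and its format are my own convention.

```python
import re
from datetime import datetime
from typing import Optional


def output_filename(prompt: str, index: int = 0,
                    when: Optional[datetime] = None, max_len: int = 40) -> str:
    """Build a filename like 20240115-093000_a-robot-reading_0.png from a prompt."""
    when = when or datetime.now()
    # Lowercase, collapse anything that is not a-z or 0-9 into single hyphens
    slug = re.sub(r"[^a-z0-9]+", "-", prompt.lower()).strip("-")
    slug = slug[:max_len].rstrip("-")
    return f"{when:%Y%m%d-%H%M%S}_{slug}_{index}.png"


name = output_filename(
    "A robot reading a book in a library, warm lighting",
    index=0,
    when=datetime(2024, 1, 15, 9, 30, 0),  # Fixed time so the demo is repeatable
)
print(name)
```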

Oh, and here's something I wish someone had told me earlier - keep track of your costs:

In [None]:
# Track generation costs (approximate)
def estimate_cost(size: str, n: int) -> float:
    """Estimate cost for DALL·E 2 generation (as of 2024)"""
    costs = {
        "1024x1024": 0.020,  # $0.020 per image
        "512x512": 0.018,    # $0.018 per image
        "256x256": 0.016     # $0.016 per image
    }
    return costs.get(size, 0.020) * n

# Calculate cost for batch generation
total_cost = sum(estimate_cost("1024x1024", 1) for _ in prompts)
print(f"Estimated cost for batch: ${total_cost:.3f}")

I learned this one the hard way when I accidentally left a loop running overnight. That was a fun expense report.
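
An estimate only helps if something acts on it. Below is a sketch of a hard spending cap you could call before each generation, so a runaway loop stops itself. `BudgetGuard` is a hypothetical class of mine, and the prices are the same approximate 2024 figures used in `estimate_cost` above.

```python
# Approximate per-image prices, copied from estimate_cost above
PRICES = {"1024x1024": 0.020, "512x512": 0.018, "256x256": 0.016}


class BudgetGuard:
    """Track estimated spend and refuse requests that would exceed a cap."""

    def __init__(self, limit_usd: float):
        self.limit_usd = limit_usd
        self.spent_usd = 0.0

    def charge(self, size: str, n: int = 1) -> float:
        """Record the cost of a request, or raise if it would break the budget."""
        cost = PRICES[size] * n
        # Small tolerance so float rounding does not block an exact-limit request
        if self.spent_usd + cost > self.limit_usd + 1e-9:
            raise RuntimeError(
                f"Budget of ${self.limit_usd:.2f} would be exceeded "
                f"(spent ${self.spent_usd:.3f}, request ${cost:.3f})"
            )
        self.spent_usd += cost
        return self.spent_usd


guard = BudgetGuard(limit_usd=0.05)
guard.charge("1024x1024")
guard.charge("1024x1024")
try:
    guard.charge("1024x1024")  # A third image would pass the $0.05 cap
    blocked = False
except RuntimeError as err:
    blocked = True
    print(err)
```

Call `guard.charge(size, n)` right before `generator.generate(...)` and the loop dies at the cap instead of at sunrise.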

## Conclusion
So there you have it - everything you need to get DALL·E 2 up and running in your projects. The big takeaways?

- Your prompts are everything. Seriously, spend time on them
- Always, always add retry logic. The API will hiccup, usually at the worst possible moment
- Test with smaller images first. Your wallet will thank you
- When you get something close, use variations instead of regenerating from scratch

What's next? Just start building something. Hook it up to a web interface, combine it with GPT for automated prompt generation, build that weird idea you've been thinking about. The API is stable enough now that you can actually ship stuff without constantly worrying about it breaking.

Actually, let me put it this way - I've been running this in production for months now, and the only issues I've had were my own stupid mistakes (like that overnight loop thing). The technology itself? Rock solid.

For the latest updates and all the nitty-gritty details, check the [OpenAI image generation guide](https://platform.openai.com/docs/guides/images).