# DALL-E Image Generation Agent with LangChain

This notebook demonstrates how to build an AI agent that can generate images using DALL-E 3.

**Features:**
- ✅ Generate images with DALL-E 3
- ✅ Conversational memory for iterative refinements
- ✅ Display images directly in notebook
- ✅ Base64 image encoding

## Step 1: Install Required Dependencies

In [None]:
!pip install --pre -U langchain langchain-openai langgraph openai pillow matplotlib
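
Because `--pre -U` installs whatever pre-release is newest, it can help to record which versions actually ended up in the environment. The following is a minimal sketch using the standard library's `importlib.metadata`; the helper name `installed_versions` is illustrative and not part of this notebook.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_versions(packages):
    """Map each distribution name to its installed version, or None if missing."""
    report = {}
    for name in packages:
        try:
            report[name] = version(name)
        except PackageNotFoundError:
            # The distribution is not installed in this environment
            report[name] = None
    return report


if __name__ == "__main__":
    wanted = ["langchain", "langchain-openai", "langgraph", "openai", "pillow", "matplotlib"]
    for name, ver in installed_versions(wanted).items():
        print(f"{name}: {ver or 'not installed'}")
```

Keeping this output with the notebook makes it easier to tell whether a later failure comes from an API change between releases.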

## Step 2: Setup OpenAI API Key

Choose one of the following methods to set your OpenAI API key:

In [None]:
import os

# METHOD 1: For Google Colab - Using Colab Secrets
# Uncomment the lines below if using Google Colab
# from google.colab import userdata
# OPENAI_API_KEY = userdata.get('OPENAI_API_KEY')

# METHOD 2: Direct input (Not recommended for production)
# Uncomment and replace with your key
# OPENAI_API_KEY = "sk-your-api-key-here"

# METHOD 3: Environment variable (Recommended)
OPENAI_API_KEY = os.getenv('OPENAI_API_KEY')

# METHOD 4: Input prompt (Interactive)
# Uncomment to enter key interactively
# from getpass import getpass
# OPENAI_API_KEY = getpass('Enter your OpenAI API Key: ')

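# Set as environment variable (fail early with a clear message if no key was found)
if not OPENAI_API_KEY:
    raise ValueError("OPENAI_API_KEY is not set. Use one of the methods above.")
os.environ['OPENAI_API_KEY'] = OPENAI_API_KEY

print("✅ API Key configured!")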

## Step 3: Import Libraries

In [None]:
import base64
from io import BytesIO
from openai import OpenAI
from typing import Annotated
from langchain.agents import create_agent
from langgraph.checkpoint.memory import InMemorySaver
from IPython.display import Image, display
import matplotlib.pyplot as plt
from PIL import Image as PILImage

print("✅ Libraries imported successfully!")

## Step 4: Initialize OpenAI Client

In [None]:
# Initialize OpenAI client
client = OpenAI(api_key=OPENAI_API_KEY)

# Global variable to store the last generated image
last_image_base64 = None

print("✅ OpenAI client initialized!")

## Step 5: Define DALL-E Image Generation Tool

In [None]:
def generate_image_with_dalle(prompt: Annotated[str, "The detailed description of the image to generate"]) -> str:
    """
    Generate an image using DALL-E 3 based on a text prompt.
    Returns base64 encoded image data.
    """
    global last_image_base64
    
    try:
        print(f"🎨 Generating image with prompt: {prompt[:100]}...")
        
        response = client.images.generate(
            model="dall-e-3",
            prompt=prompt,
            size="1024x1024",
            quality="standard",
            n=1,
            response_format="b64_json"  # Request base64 format
        )
        
        # Store base64 image data
        last_image_base64 = response.data[0].b64_json
        
        return "✅ Image generated successfully! The image has been stored and is ready to display."
    
    except Exception as e:
        return f"❌ Error generating image: {str(e)}"

print("✅ DALL-E tool function defined!")
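
The tool hard-codes `size`, `quality`, and `n`. If you later expose them as parameters, checking them locally avoids a failed API round trip. The sketch below assumes the three DALL-E 3 sizes listed under Next Steps in this notebook, the `standard` and `hd` quality values, and one image per request; `validate_dalle3_options` is a hypothetical helper, not an OpenAI SDK function.

```python
# Option values assumed for DALL-E 3 (check the current API reference before relying on them)
DALLE3_SIZES = {"1024x1024", "1792x1024", "1024x1792"}
DALLE3_QUALITIES = {"standard", "hd"}


def validate_dalle3_options(size="1024x1024", quality="standard", n=1):
    """Return keyword arguments for an image request, or raise ValueError."""
    if size not in DALLE3_SIZES:
        raise ValueError(f"Unsupported size {size!r}; expected one of {sorted(DALLE3_SIZES)}")
    if quality not in DALLE3_QUALITIES:
        raise ValueError(f"Unsupported quality {quality!r}; expected one of {sorted(DALLE3_QUALITIES)}")
    if n != 1:
        raise ValueError("DALL-E 3 generates one image per request (n=1)")
    return {"size": size, "quality": quality, "n": n}
```

The returned dictionary could be unpacked into the existing call, for example `client.images.generate(model="dall-e-3", prompt=prompt, response_format="b64_json", **validate_dalle3_options(size="1792x1024"))`.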

## Step 6: Define Image Display Function

In [None]:
def display_last_image():
    """
    Display the last generated image in the notebook.
    """
    global last_image_base64
    
    if last_image_base64 is None:
        print("⚠️ No image has been generated yet. Please generate an image first.")
        return
    
    try:
        # Decode base64 to image
        image_data = base64.b64decode(last_image_base64)
        image = PILImage.open(BytesIO(image_data))
        
        # Display using matplotlib
        plt.figure(figsize=(10, 10))
        plt.imshow(image)
        plt.axis('off')
        plt.tight_layout()
        plt.show()
        
    except Exception as e:
        print(f"❌ Error displaying image: {str(e)}")

print("✅ Display function defined!")
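
To check what a stored payload contains without plotting it, the base64 string can be decoded and the image header read directly. This sketch uses only the standard library. It assumes the payload is a PNG, which is an assumption about the API output rather than something this notebook verifies, and `png_dimensions` is a hypothetical helper.

```python
import base64
import struct

# Every PNG file starts with this fixed 8-byte signature
PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def png_dimensions(b64_data):
    """Decode a base64 PNG and return (width, height) read from its IHDR chunk."""
    raw = base64.b64decode(b64_data)
    # Bytes 8-11 hold the chunk length, bytes 12-15 the chunk type
    if raw[:8] != PNG_SIGNATURE or raw[12:16] != b"IHDR":
        raise ValueError("Payload is not a PNG image")
    # IHDR data begins with two big-endian unsigned 32-bit integers
    width, height = struct.unpack(">II", raw[16:24])
    return width, height
```

After a successful generation, `png_dimensions(last_image_base64)` should match the `size` requested in the tool if the PNG assumption holds.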

## Step 7: Create the DALL-E Agent

In [None]:
# Create memory for the agent
checkpointer = InMemorySaver()

# Create DALL-E Agent with tools and memory
# (LangChain 1.0 names this argument `system_prompt`; older pre-releases accepted `prompt`)
dalle_agent = create_agent(
    model="openai:gpt-4o-mini",
    tools=[generate_image_with_dalle],
    system_prompt="""You are a creative AI assistant with image generation capabilities using DALL-E 3.
    
    When a user requests an image:
    1. Use the generate_image_with_dalle tool with a detailed, creative prompt
    2. Enhance the user's description with artistic details if needed
    3. Remember the conversation context for follow-up modifications
    4. After generating, inform the user the image is ready to view
    
    Be creative, helpful, and remember all previous images in the conversation.""",
    checkpointer=checkpointer
)

print("✅ DALL-E Agent created successfully!")
print("🎨 The agent has:")
print("   - DALL-E 3 image generation capability")
print("   - Conversational memory")
print("   - Context awareness for iterations")

---
## Usage Examples

Now let's test the agent with various image generation requests!

### Example 1: Generate a Simple Image

In [None]:
# Create a session configuration
config = {"configurable": {"thread_id": "session_1"}}

# Generate an image
print("🚀 Requesting image generation...\n")
result = dalle_agent.invoke(
    {"messages": [{"role": "user", "content": "Create an image of a cat in a space suit floating in space with Earth in the background"}]},
    config
)

print("\n" + "="*60)
print("Agent Response:")
print("="*60)
print(result['messages'][-1].content)

In [None]:
# Display the generated image
display_last_image()

### Example 2: Iterative Refinement (Using Memory)

In [None]:
# Follow-up request: the agent remembers the earlier request and generates a new image from an updated prompt
print("🔄 Requesting modification...\n")
result = dalle_agent.invoke(
    {"messages": [{"role": "user", "content": "Now add a spaceship passing by in the background"}]},
    config  # Same session
)

print("\n" + "="*60)
print("Agent Response:")
print("="*60)
print(result['messages'][-1].content)

In [None]:
# Display the modified image
display_last_image()

### Example 3: Another Iteration

In [None]:
# Another modification
print("🔄 Requesting another modification...\n")
result = dalle_agent.invoke(
    {"messages": [{"role": "user", "content": "Make it more dramatic with nebula clouds and stars"}]},
    config
)

print("\n" + "="*60)
print("Agent Response:")
print("="*60)
print(result['messages'][-1].content)

In [None]:
# Display the final image
display_last_image()

### Example 4: Start a New Image (New Topic)

In [None]:
# Create a completely different image
print("🎨 Requesting a new image...\n")
result = dalle_agent.invoke(
    {"messages": [{"role": "user", "content": "Create a serene Japanese garden with a koi pond at sunset"}]},
    config
)

print("\n" + "="*60)
print("Agent Response:")
print("="*60)
print(result['messages'][-1].content)

In [None]:
# Display the new image
display_last_image()
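
Example 4 reuses `config`, so the earlier cat-in-space conversation is still part of the agent's context. To start an unrelated image without that history, one option is a different `thread_id`, since the checkpointer keeps state per thread. A minimal sketch; `new_session_config` is a hypothetical helper and the `session` prefix is arbitrary.

```python
import uuid


def new_session_config(prefix="session"):
    """Build an invoke config with a fresh, unique thread_id."""
    return {"configurable": {"thread_id": f"{prefix}_{uuid.uuid4().hex}"}}
```

Passing `new_session_config()` as the second argument to `dalle_agent.invoke` should give the agent an empty conversation. Note that `last_image_base64` is a module-level global, so it is shared across sessions regardless of `thread_id`.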

### Example 5: Test Memory - Ask About Previous Images

In [None]:
# Test if the agent remembers what we've created
print("🧠 Testing memory...\n")
result = dalle_agent.invoke(
    {"messages": [{"role": "user", "content": "What images have we created in this session?"}]},
    config
)

print("\n" + "="*60)
print("Agent Response:")
print("="*60)
print(result['messages'][-1].content)
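
The final message is only part of what the session holds. To see the whole exchange, including tool results, you can walk `result['messages']`. The sketch below assumes each message exposes `type` and `content` attributes, as LangChain message objects do; `format_history` is a hypothetical helper, and the stand-in objects only mimic that shape so the example runs without an API call.

```python
from types import SimpleNamespace


def format_history(messages, width=80):
    """Return one '[role] text' line per message, truncating long content."""
    lines = []
    for msg in messages:
        role = getattr(msg, "type", "unknown")
        content = getattr(msg, "content", "")
        # Content may be a list of parts rather than a plain string
        text = content if isinstance(content, str) else str(content)
        if len(text) > width:
            text = text[: width - 3] + "..."
        lines.append(f"[{role}] {text}")
    return lines


# Stand-in messages for illustration only
demo = [
    SimpleNamespace(type="human", content="Create a cat in a space suit"),
    SimpleNamespace(type="ai", content=""),
    SimpleNamespace(type="tool", content="Image generated successfully!"),
]
for line in format_history(demo):
    print(line)
```

In the notebook, `format_history(result['messages'])` would replace the stand-in list.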

---
## Your Turn!

Try creating your own images below:

In [None]:
# Your custom image request
your_prompt = "Create an image of a robot reading a book in a cozy library"  # Change this!

result = dalle_agent.invoke(
    {"messages": [{"role": "user", "content": your_prompt}]},
    config
)

print(result['messages'][-1].content)

In [None]:
# Display your image
display_last_image()

---
## Summary

**What we built:**
- ✅ LangChain agent with DALL-E 3 integration
- ✅ Conversational memory for iterative refinements
- ✅ Base64 image encoding and display
- ✅ Context-aware image generation

**Key Features:**
- The agent remembers your conversation
- You can iteratively refine images (each refinement generates a new image from an updated prompt)
- Images are displayed directly in the notebook
- The agent enhances your prompts for better results

**Next Steps:**
- Try different image sizes (1024x1024, 1792x1024, 1024x1792)
- Experiment with different quality settings (`standard` or `hd`)
- Save images to disk
- Build a web interface for the agent
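
Saving to disk needs only the base64 string the tool already stores. A minimal sketch using the standard library; `save_base64_image` and the default filename are hypothetical, and the `.png` extension assumes the payload is a PNG.

```python
import base64
from pathlib import Path


def save_base64_image(b64_data, path="generated_image.png"):
    """Decode base64 image data and write it to path, creating parent folders."""
    if not b64_data:
        raise ValueError("No image data to save")
    target = Path(path)
    target.parent.mkdir(parents=True, exist_ok=True)
    target.write_bytes(base64.b64decode(b64_data))
    return target
```

In this notebook the call would be `save_base64_image(last_image_base64, "images/cat_in_space.png")` after a successful generation.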

---
**Created with:** LangChain + OpenAI DALL-E 3 + LangGraph