# AI Image Model Benchmark Analysis

## Overview
This notebook provides a framework for comparing AI image generation models using standardized prompts and qualitative analysis.

## How to Use This Notebook
1. Manually generate an image for each prompt with each model (DALL-E 3, Midjourney, Stable Diffusion XL)
2. Save the images to the matching folders under `generated_images/`
3. Use the cells below to load and compare images side-by-side
4. Document your observations and insights

In [None]:
# Import required libraries
import os
from PIL import Image
import matplotlib.pyplot as plt
import pandas as pd

print("✅ Libraries imported successfully!")

## Setup: Folder Structure
Create these folders manually in your repository:
- `generated_images/dalle-3/`
- `generated_images/midjourney/`
- `generated_images/stable-diffusion-xl/`

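Save your test images in these folders as `.jpg` files, using the same file name in every folder for the same prompt (e.g., `photorealism_dog.jpg`). The comparison function looks images up by that shared name.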
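
If you would rather not create the folders by hand, they can also be created from Python. The sketch below is one way to do it, assuming the three folder names listed above; `create_model_folders` and its `base_dir` parameter are hypothetical names introduced here for illustration.

```python
import os

# Folder names taken from the list above
MODEL_FOLDERS = ["dalle-3", "midjourney", "stable-diffusion-xl"]

def create_model_folders(base_dir="generated_images", folders=MODEL_FOLDERS):
    """Create one sub-folder per model under base_dir and return the paths."""
    paths = []
    for name in folders:
        path = os.path.join(base_dir, name)
        # exist_ok=True makes the call safe to re-run on an existing folder
        os.makedirs(path, exist_ok=True)
        paths.append(path)
    return paths
```

Calling `create_model_folders()` from the repository root should leave existing folders and their contents untouched.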

In [None]:
# Define your image directories
image_dirs = {
    'DALL-E 3': 'generated_images/dalle-3',
    'Midjourney': 'generated_images/midjourney', 
    'Stable Diffusion XL': 'generated_images/stable-diffusion-xl'
}

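# Check that each directory exists and count the image files in it
# Extensions counted as images are an assumption; adjust to match your files
image_extensions = ('.jpg', '.jpeg', '.png', '.webp')
for model, directory in image_dirs.items():
    if os.path.isdir(directory):
        image_files = [f for f in os.listdir(directory) if f.lower().endswith(image_extensions)]
        print(f"✅ {model} directory found: {directory}")
        print(f"   Images found: {len(image_files)}")
    else:
        print(f"❌ {model} directory missing: {directory}")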

## Image Comparison Function
This function displays each model's image for the same prompt side by side, with a placeholder panel for any model whose image is missing.

In [None]:
def compare_prompt(image_name, prompt_description):
    """
    Compare the same prompt across all models listed in image_dirs.

    Args:
        image_name (str): Base name of the image file, without the .jpg extension (e.g., 'photorealism_dog')
        prompt_description (str): The actual prompt used, shown as the figure title
    """
    models = list(image_dirs.keys())

    # One column per model; squeeze=False keeps axes two-dimensional even for a single model
    fig, axes = plt.subplots(1, len(models), figsize=(6 * len(models), 6), squeeze=False)
    fig.suptitle(f'Prompt: "{prompt_description}"', fontsize=16, y=1.05)

    for ax, model in zip(axes[0], models):
        image_path = os.path.join(image_dirs[model], f"{image_name}.jpg")

        if os.path.exists(image_path):
            img = Image.open(image_path)
            ax.imshow(img)
        else:
            # Placeholder panel so a missing image does not break the layout
            ax.text(0.5, 0.5, f'Image not found\n{image_path}',
                    ha='center', va='center', transform=ax.transAxes)
        ax.set_title(model, fontsize=12)
        ax.axis('off')

    plt.tight_layout()
    plt.show()

print("✅ Comparison function ready!")

## Example Comparison
When you have images, you can compare them like this:

```python
# compare_prompt('photorealism_dog', 'A photorealistic image of a wet German Shepherd playing in a sprinkler')
```

Remove the leading `#` to run the comparison once you have saved images for that prompt.
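
Before running comparisons, it can help to know which image names exist for every model. The following is a minimal sketch, assuming the `.jpg` naming that `compare_prompt` expects and a dict shaped like `image_dirs`; `find_common_images` is a hypothetical helper, not part of the notebook itself.

```python
import os

def find_common_images(image_dirs, extension=".jpg"):
    """Return the sorted base names that have a file in every model folder."""
    common = None
    for directory in image_dirs.values():
        if not os.path.isdir(directory):
            # A missing folder means no name can be present for every model
            return []
        # Match the extension exactly, mirroring the lookup in compare_prompt
        names = {
            os.path.splitext(f)[0]
            for f in os.listdir(directory)
            if f.endswith(extension)
        }
        common = names if common is None else common & names
    return sorted(common or [])
```

Each name returned by `find_common_images(image_dirs)` can then be passed to `compare_prompt` together with its prompt text.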

## Qualitative Analysis Template

### Prompt: [Prompt Name]
**Observations:**
- **DALL-E 3:** [Your notes here]
- **Midjourney:** [Your notes here] 
- **Stable Diffusion XL:** [Your notes here]

**Key Takeaways:**
- [Main insights about model differences]
- [Surprising results or limitations]
- [Implications for product marketing]

## Summary Scoring
Rate each model on key dimensions (1-5 scale):

In [None]:
# Create an evaluation dataframe
evaluation_data = {
    'Model': ['DALL-E 3', 'Midjourney', 'Stable Diffusion XL'],
    'Prompt_Adherence': [0, 0, 0],  # 0 = not yet scored; replace with 1-5 after testing
    'Aesthetic_Quality': [0, 0, 0],
    'Style_Range': [0, 0, 0],
    'Text_Rendering': [0, 0, 0],
    'Technical_Execution': [0, 0, 0]
}

df_evaluation = pd.DataFrame(evaluation_data)
print("Evaluation framework ready!")
print("\nFill in your scores (1-5) after testing:")
df_evaluation
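
Once the scores are filled in, one simple way to summarize them is an unweighted mean per model. This is a sketch under assumptions: it takes a dict shaped like `evaluation_data`, treats `0` as "not yet scored", and weights all dimensions equally. `overall_scores` is a hypothetical helper, and the numbers in the example call are placeholders, not benchmark results.

```python
from statistics import mean

def overall_scores(evaluation_data):
    """Return {model: mean of its non-zero scores}, or None if nothing is scored yet."""
    models = evaluation_data["Model"]
    dimensions = [key for key in evaluation_data if key != "Model"]
    summary = {}
    for i, model in enumerate(models):
        # Skip zeros, which mark dimensions that have not been rated
        scores = [evaluation_data[d][i] for d in dimensions if evaluation_data[d][i] > 0]
        summary[model] = mean(scores) if scores else None
    return summary

# Placeholder values for illustration only, not real results
example = {
    "Model": ["Model A", "Model B"],
    "Prompt_Adherence": [4, 0],
    "Aesthetic_Quality": [2, 0],
}
print(overall_scores(example))
```

If a column in the table is preferred, the result could be attached with `df_evaluation['Overall'] = df_evaluation['Model'].map(overall_scores(evaluation_data))`.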