# AI Text-to-Image Generator (DALL·E 3 & Stable Diffusion)

# Install required libraries (Uncomment in Google Colab)
# !pip install torch torchvision diffusers transformers openai pillow gradio flask

"""
### Project Overview
This project is a **Text-to-Image Generator** that takes text descriptions as input and generates images using two models:
1. **DALL·E 3 (OpenAI API)** – Generates high-quality AI images via OpenAI’s cloud service.
2. **Stable Diffusion (Local GPU Processing)** – Runs a locally hosted AI model to generate images.

### Key Features
✅ Supports both **DALL·E 3 (API-based)** and **Stable Diffusion (Local Processing)**.
✅ Provides a **Flask API** for generating images via HTTP requests.
✅ Includes a **Gradio Web UI** for an easy user interface.
✅ Runs Stable Diffusion on the GPU when one is available, which speeds up image generation.

### Dependencies & Setup
- **Python Libraries**: `torch`, `diffusers`, `transformers`, `openai`, `pillow` (imported as `PIL`), `gradio`, `flask`
- **Hardware**: A **GPU (NVIDIA recommended)** for Stable Diffusion
- **API Key**: Required for OpenAI’s DALL·E 3

"""

In [1]:
!pip install torch torchvision torchaudio
!pip install diffusers transformers accelerate
!pip install pillow
!pip install matplotlib
!pip install openai
!pip install gradio
!pip install flask


In [2]:
import torch
from diffusers import StableDiffusionPipeline
from PIL import Image
import matplotlib.pyplot as plt
import openai
import os
import gradio as gr
from flask import Flask, request, jsonify


In [3]:
openai.api_key = "your_api_key"
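A key typed directly into a notebook is easy to leak when the notebook is shared. One way to avoid that is sketched below, assuming the key is stored in an environment variable named `OPENAI_API_KEY` (the name the `openai` library looks for by default). The helper `load_api_key` is hypothetical and not part of the original project or of the `openai` package.

```python
import os


def load_api_key(env_var="OPENAI_API_KEY"):
    """Return the API key stored in `env_var`, or raise if it is missing.

    Illustrative helper only; not part of the openai library.
    """
    key = os.environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError(
            f"Set the {env_var} environment variable before generating images."
        )
    return key
```

With this in place, the assignment above could read `openai.api_key = load_api_key()`, which fails early with a readable message instead of sending a placeholder key to the API.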

"""
### Function: Generate Image using DALL·E 3
- Calls OpenAI’s API with a given text prompt.
- Returns the URL of the generated image.
"""


In [4]:
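def generate_image_dalle(prompt):
    # openai>=1.0 removed openai.Image.create; images.generate replaces it.
    response = openai.images.generate(
        model="dall-e-3",  # the API defaults to DALL·E 2 when no model is given
        prompt=prompt,
        n=1,
        size="1024x1024"
    )
    # The response is an object in openai>=1.0, so use attribute access.
    image_url = response.data[0].url
    return image_url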
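The request parameters can be checked locally before a paid API call is made. The sketch below assumes the DALL·E 3 limits documented at the time of writing (three supported sizes, one image per request); those limits should be verified against the current API reference. `dalle3_params` and `DALLE3_SIZES` are hypothetical names introduced for illustration.

```python
# Sizes documented for DALL·E 3 at the time of writing (assumption; verify
# against the current OpenAI API reference before relying on this set).
DALLE3_SIZES = {"1024x1024", "1792x1024", "1024x1792"}


def dalle3_params(prompt, size="1024x1024"):
    """Validate inputs and build keyword arguments for an image request."""
    if not prompt or not prompt.strip():
        raise ValueError("prompt must be a non-empty string")
    if size not in DALLE3_SIZES:
        raise ValueError(
            f"unsupported size {size!r}; expected one of {sorted(DALLE3_SIZES)}"
        )
    # DALL·E 3 produces one image per request, so n is fixed at 1.
    return {"model": "dall-e-3", "prompt": prompt.strip(), "n": 1, "size": size}
```

The returned dictionary could be unpacked into the image request inside `generate_image_dalle`, so an invalid size is rejected before any network traffic occurs.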

"""
### Function: Generate Image using Stable Diffusion
- Loads the Stable Diffusion model for local generation.
- Uses GPU if available for faster processing.
"""

In [5]:
device = "cuda" if torch.cuda.is_available() else "cpu"
pipe = StableDiffusionPipeline.from_pretrained("CompVis/stable-diffusion-v1-4").to(device)


In [6]:
def generate_image_stable_diffusion(prompt):
    image = pipe(prompt).images[0]
    return image
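The call `pipe(prompt)` relies on the pipeline's defaults. A small helper can make the main quality and speed settings explicit and catch invalid image sizes early. This is a sketch: `sd_call_kwargs` is a hypothetical helper, while the dictionary keys are keyword arguments accepted by `StableDiffusionPipeline.__call__`. The default values shown (50 steps, guidance scale 7.5, 512x512) are, to my understanding, the pipeline's own defaults for this checkpoint.

```python
def sd_call_kwargs(steps=50, guidance_scale=7.5, width=512, height=512):
    """Validate settings and return keyword arguments for `pipe(prompt, **kwargs)`."""
    if steps < 1:
        raise ValueError("steps must be at least 1")
    # Stable Diffusion works on latents downsampled by a factor of 8,
    # so both image dimensions must be multiples of 8.
    if width % 8 or height % 8:
        raise ValueError("width and height must be multiples of 8")
    return {
        "num_inference_steps": steps,      # fewer steps: faster, less detail
        "guidance_scale": guidance_scale,  # higher: follows the prompt more closely
        "width": width,
        "height": height,
    }
```

A faster draft could then be requested with `pipe(prompt, **sd_call_kwargs(steps=25)).images[0]`.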

"""
### Flask API: Handle Requests for Image Generation
- Accepts JSON input with `prompt` and `model`.
- Calls the appropriate model (DALL·E or Stable Diffusion).
- Returns JSON containing the image URL (DALL·E) or the local file path (Stable Diffusion).
"""

In [7]:
app = Flask(__name__)

@app.route("/generate", methods=["POST"])
def generate():
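    # Tolerate a missing or malformed JSON body and fall back to the defaults below.
    data = request.get_json(silent=True) or {}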
    prompt = data.get("prompt", "A beautiful landscape")
    model = data.get("model", "dalle")

    if model == "dalle":
        image_url = generate_image_dalle(prompt)
        return jsonify({"image_url": image_url})
    else:
        image = generate_image_stable_diffusion(prompt)
        image.save("output.png")
        return jsonify({"image_path": "output.png"})
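A client can exercise the `/generate` endpoint with nothing more than the standard library. The sketch below assumes the server is reachable at `http://127.0.0.1:5000`, Flask's default development address; `build_generate_request` and `call_generate` are hypothetical helper names.

```python
import json
import urllib.request


def build_generate_request(prompt, model="dalle", base_url="http://127.0.0.1:5000"):
    """Build (but do not send) a POST request for the /generate endpoint."""
    body = json.dumps({"prompt": prompt, "model": model}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/generate",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def call_generate(prompt, model="dalle"):
    """Send the request and return the decoded JSON reply (server must be running)."""
    with urllib.request.urlopen(build_generate_request(prompt, model)) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

Calling `call_generate("a red fox", model="stable_diffusion")` should return a dictionary with an `image_path` key, while the default model should return one with an `image_url` key.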

"""
### Gradio Web UI: User-Friendly Interface for Image Generation
- Allows users to enter a text prompt.
- Select between **DALL·E 3** or **Stable Diffusion**.
- Displays the generated image once generation finishes.
"""

In [8]:
def generate_ui(prompt, model="dalle"):
    if model == "dalle":
        return generate_image_dalle(prompt)
    else:
        return generate_image_stable_diffusion(prompt)

demo = gr.Interface(
    fn=generate_ui,
    # Preselect a model; with no selection Gradio passes None and the else branch runs.
    inputs=[gr.Textbox(label="Enter Text Prompt"), gr.Radio(["dalle", "stable_diffusion"], value="dalle", label="Select Model")],
    outputs=gr.Image(label="Generated Image"),
    title="AI Text-to-Image Generator",
    description="Generate images using OpenAI's DALL·E 3 or Stable Diffusion"
)

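if __name__ == "__main__":
    # launch() blocks in a plain script, so keep it non-blocking and let Flask hold the process.
    demo.launch(prevent_thread_lock=True)
    # The reloader restarts the process and would load the model a second time.
    app.run(debug=True, use_reloader=False)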

Running Gradio in a Colab notebook requires sharing enabled. Automatically setting `share=True` (you can turn this off by setting `share=False` in `launch()` explicitly).

Colab notebook detected. To show errors in colab notebook, set debug=True in launch()
* Running on public URL: https://9c7295411422d0b02c.gradio.live

This share link expires in 72 hours. For free permanent hosting and GPU upgrades, run `gradio deploy` from the terminal in the working directory to deploy to Hugging Face Spaces (https://huggingface.co/spaces)


 * Serving Flask app '__main__'
 * Debug mode: on


 * Running on http://127.0.0.1:5000
INFO:werkzeug:Press CTRL+C to quit
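Both the Flask route and the Gradio callback choose a backend with an `if`/`else`, so any unrecognized model name silently falls through to Stable Diffusion. A lookup table is one way to share the routing logic and reject unknown names. This is a sketch with stand-in generators; `make_dispatcher` and `demo_dispatch` are hypothetical names, and in the project the dictionary would map to `generate_image_dalle` and `generate_image_stable_diffusion`.

```python
def make_dispatcher(generators):
    """Return a function that routes a prompt to the generator named by `model`."""
    def dispatch(prompt, model):
        try:
            generate = generators[model]
        except KeyError:
            raise ValueError(
                f"unknown model {model!r}; expected one of {sorted(generators)}"
            ) from None
        return generate(prompt)
    return dispatch


# Stand-in generators so the sketch runs without a GPU or an API key.
demo_dispatch = make_dispatcher({
    "dalle": lambda prompt: f"dalle:{prompt}",
    "stable_diffusion": lambda prompt: f"sd:{prompt}",
})
```

With a dispatcher like this, the Flask route could turn the `ValueError` into a 400 response rather than running the slower local model by accident.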