# Project: Text to AI Image Generator

**AI Image Generator with Stable Diffusion**


This is a simple AI image generation web app built with Hugging Face's `diffusers` library, served through a Gradio interface, and deployable on Hugging Face Spaces.

It uses **Stable Diffusion v1.5** to generate *high-quality images* from text prompts given by the user.

-------------------------------------------------------------------------

**Features**

* Text-to-image generation using Stable Diffusion
* Clean and simple web interface via Gradio
* Fast image generation with GPU support (CUDA)
* Fully deployable and shareable via Hugging Face Spaces

 ------------------------------------------------------------------------

**Demo**

Try it here: `https://<your-space-name>.huggingface.space`
(Replace with your actual Hugging Face Space URL.)

**How It Works**

1. The user enters a text prompt (e.g., "a cat sitting on the moon").
2. The backend runs Stable Diffusion to convert the text into an image.
3. The output image is displayed in the web interface, typically within seconds when a GPU is available.

**Requirements**

These packages are required (also listed in `requirements.txt`):

1. diffusers
2. transformers
3. accelerate
4. safetensors
5. torch
6. gradio
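
To confirm that the packages listed above are actually present in the current environment, a small check like the following can help. It is a minimal sketch using only the standard library; the names `REQUIRED`, `installed_versions`, and `report` are illustrative and not part of the original notebook.

```python
from importlib import metadata

# Package names as they would appear in requirements.txt.
REQUIRED = ["diffusers", "transformers", "accelerate", "safetensors", "torch", "gradio"]


def installed_versions(packages):
    """Map each package name to its installed version, or None if it is missing."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


def report(packages=REQUIRED):
    """Print one line per package so version conflicts are easy to spot."""
    for name, version in installed_versions(packages).items():
        print(f"{name:<14}{version or 'NOT INSTALLED'}")
```

Calling `report()` at the top of the notebook gives a quick record of the versions a given run used.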




-------------------------------------------------------------------------



**Step 1:** Install and Upgrade the Key Hugging Face Libraries

Installed and upgraded the following libraries to enable text generation, text-to-image generation, and fast model execution with secure and efficient loading of model weights:

1.  Diffusers
2.  Transformers
3.  Accelerate
4.  Safetensors

In [None]:

!pip install --upgrade diffusers transformers accelerate
!pip install safetensors

Collecting transformers
  Downloading transformers-4.54.1-py3-none-any.whl.metadata (41 kB)
     41.7/41.7 kB 2.3 MB/s eta 0:00:00
Collecting nvidia-cuda-nvrtc-cu12==12.4.127 (from torch>=2.0.0->accelerate)
  Downloading nvidia_cuda_nvrtc_cu12-12.4.127-py3-none-manylinux2014_x86_64.whl.metadata (1.5 kB)
Collecting nvidia-cuda-runtime-cu12==12.4.127 (from torch>=2.0.0->accelerate)
  Downloading nvidia_cuda_runtime_cu12-12.4.127-py3-none-manylinux2014_x86_64.whl.metadata (1.5 kB)
Collecting nvidia-cuda-cupti-cu12==12.4.127 (from torch>=2.0.0->accelerate)
  Downloading nvidia_cuda_cupti_cu12-12.4.127-py3-none-manylinux2014_x86_64.whl.metadata (1.6 kB)
Collecting nvidia-cudnn-cu12==9.1.0.70 (from torch>=2.0.0->accelerate)
  Downloading nvidia_cudnn_cu12-9.1.0.70-py3-none-manylinux2014_x86_64.whl.metadata (1.6 kB)
Collecting nvidia-cublas-cu12==12.4.5.8 (from torch>=2.0.0->accelerate)
  Downloading nvidia_cubla

----------------------------------------------------------------------------------------------------

**Step 2:** Import the Libraries

*  Imported PyTorch, Gradio, the Stable Diffusion pipeline class, and the Hugging Face `login` helper:




In [None]:
import torch
import gradio as gr
from diffusers import StableDiffusionPipeline
from huggingface_hub import login

-----------------------------------------------------------------------------------------

**Step 3:** Log In to Hugging Face

*   Used an access token to authenticate access to models:





In [None]:
# Step 3: Authenticate with Hugging Face
# (the real token has been redacted; never commit a live token to a notebook)
login("<your-hf-access-token>")
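
One way to keep the token out of the notebook is to read it from an environment variable at runtime. The sketch below assumes the token has been stored in a variable named `HF_TOKEN` (for example as a Colab secret or a Space secret); the helper names `read_hf_token` and `authenticate` are hypothetical.

```python
import os


def read_hf_token(env_var="HF_TOKEN"):
    """Return the access token stored in the given environment variable."""
    token = os.environ.get(env_var, "").strip()
    if not token:
        raise RuntimeError(
            f"No token found: set the {env_var} environment variable first."
        )
    return token


def authenticate(env_var="HF_TOKEN"):
    """Log in to the Hugging Face Hub without hardcoding the token."""
    # Imported here so read_hf_token() stays usable without huggingface_hub.
    from huggingface_hub import login

    login(token=read_hf_token(env_var))
```

With this in place, the login cell reduces to `authenticate()`, and the notebook can be shared without exposing the credential.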

-----------------------------------------------------------------------------------------

**Step 4:** Load the Pretrained Model

*   Used `StableDiffusionPipeline` from `diffusers`, loaded in half precision and moved to the GPU:





In [None]:
#  Step 4: Load the pipeline

pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5",
    torch_dtype=torch.float16,
    use_safetensors=True
)
pipe = pipe.to("cuda")
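
The cell above assumes a CUDA GPU is present. On a runtime without one, `pipe.to("cuda")` fails, so a fallback can be useful. The following is a sketch under that assumption; `choose_runtime` and `load_pipeline` are illustrative names, and running on CPU in `float32` will be much slower than on a GPU.

```python
def choose_runtime(cuda_available):
    """Pick a device and dtype name: half precision on GPU, full precision on CPU."""
    if cuda_available:
        return {"device": "cuda", "dtype": "float16"}
    return {"device": "cpu", "dtype": "float32"}


def load_pipeline(model_id="runwayml/stable-diffusion-v1-5"):
    """Load the pipeline on whichever device is available."""
    # Heavy imports are kept local so choose_runtime() can be used on its own.
    import torch
    from diffusers import StableDiffusionPipeline

    runtime = choose_runtime(torch.cuda.is_available())
    pipe = StableDiffusionPipeline.from_pretrained(
        model_id,
        torch_dtype=getattr(torch, runtime["dtype"]),  # e.g. torch.float16
        use_safetensors=True,
    )
    return pipe.to(runtime["device"])
```

Replacing the cell with `pipe = load_pipeline()` keeps the rest of the notebook unchanged.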


-----------------------------------------------------------------------------------------

**Step 5:** Generate an Image from the Prompt

*   The user's text prompt is passed to the pipeline, which returns the generated image for display:





In [None]:
def generate_image(prompt):
    print(" Prompt received:", prompt)
    try:
        image = pipe(prompt).images[0]
        print("Image generated successfully")
        return image
    except Exception as e:
        print(" Error in generating image:", str(e))  # This is the most important line
        return None

-----------------------------------------------------------------------------------------

**Step 6:** Build the Web Interface

Why Gradio?

*    Gradio provides a quick and easy way to create web interfaces for machine learning models.



In [None]:
# Gradio UI
demo = gr.Interface(
    fn=generate_image,
    inputs=gr.Textbox(label="Enter your image prompt"),
    outputs=gr.Image(type="pil"),
    title="Stable Diffusion Generator",
    description="Enter any creative prompt and get an AI-generated image!"
)

# Launch UI
demo.launch(share=True)
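
A public `share=True` link is mainly useful in Colab; on Hugging Face Spaces the app is already served at the Space URL. The sketch below switches the option automatically, assuming the `SPACE_ID` environment variable is set inside a Space. `launch_options` is a hypothetical helper, not part of Gradio.

```python
import os


def launch_options(environ=None):
    """Return keyword arguments for demo.launch() suited to the current host."""
    env = os.environ if environ is None else environ
    # Assumption: Hugging Face Spaces sets SPACE_ID; Colab and local runs do not.
    on_spaces = bool(env.get("SPACE_ID"))
    return {"share": not on_spaces}
```

The launch line then becomes `demo.launch(**launch_options())`, which behaves the same as before in Colab.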



## **Possible Challenges You May Have Faced**

### 1. **Environment and Dependency Issues**

* Installing heavy libraries like `diffusers`, `transformers`, `torch`, and `safetensors` takes time and can cause version conflicts.
* Frequent updates to the Hugging Face libraries can break compatibility with older code.
* CUDA compatibility issues with PyTorch or Stable Diffusion (e.g., a `torch_dtype` the device does not support, or a PyTorch build that does not match the installed CUDA version).
* Long installation times or out-of-memory errors due to Colab RAM/VRAM limits.

---

### 2. **Hugging Face Authentication Problems**

* Token errors due to:

  * Expired token
  * Wrong token scope (read vs write)
  * Calling `login()` only after a model download has already been attempted, or without internet access

---

### 3. **Model Loading Errors**

* Issues with `StableDiffusionPipeline`:

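  * `use_safetensors=True` is set but the model repository has no safetensors weights
  * `torch_dtype=torch.float16` is poorly supported on CPU, where some operations fail or run very slowly in half precision
  * The Colab free tier may not have a GPU available, in which case `pipe.to("cuda")` fails and CPU inference is very slow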

---

### 4. **Image Generation Failures**

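* Prompts longer than the CLIP text encoder's 77-token limit are truncated, so the end of a long prompt is ignored; empty or very short prompts give unpredictable results from `pipe(prompt).images[0]`
* Timeouts due to GPU overload or slow model downloads from Hugging Face
* Output not returned as expected, resulting in corrupted or blank images (for example, a black image when the safety checker flags a result)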

---

### 5. **Debugging and Exception Handling**

* Lack of detailed error messages makes debugging harder
* Your `try/except` block is good, but adding full tracebacks could improve diagnostics
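
A version of the generation function that records full tracebacks might look like the sketch below. It uses the standard `logging` module; the function name `generate_image_logged` and the explicit `pipe` argument are choices made for this example rather than the notebook's own code.

```python
import logging

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger("image_generator")


def generate_image_logged(pipe, prompt):
    """Run the pipeline and log a full stack trace if anything goes wrong."""
    logger.info("Prompt received: %r", prompt)
    try:
        image = pipe(prompt).images[0]
        logger.info("Image generated successfully")
        return image
    except Exception:
        # logger.exception writes the message plus the complete traceback.
        logger.exception("Image generation failed for prompt %r", prompt)
        return None
```

To use it with Gradio's single-argument `fn`, the pipeline can be bound first, for instance with `functools.partial(generate_image_logged, pipe)`.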

---

### 6. **Gradio Deployment Challenges**

* Internet issues during `demo.launch(share=True)` can cause interface failures
* Gradio UI may lag in Colab environments
* Colab runtime disconnects or times out frequently
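* Deploying to Hugging Face Spaces requires additional setup: a Space repository containing `app.py`, `requirements.txt`, and a `README.md` with the Space configuration, plus suitable hardware for the model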

---

### 7. **Hardware Limitations (Colab)**

* "GPU quota exceeded" messages on the free tier
* Out-of-memory errors during image generation at high output resolutions
* Colab disconnects the runtime after a period of inactivity, so the model must be reloaded if testing takes too long
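
Two common ways to reduce out-of-memory errors are to cap the output resolution and to enable attention slicing. The sketch below assumes a loaded `diffusers` pipeline is passed in as `pipe`; `fit_resolution` and `reduce_memory_use` are hypothetical helpers, and the 512-pixel cap is an illustrative default.

```python
def fit_resolution(width, height, max_side=512, multiple=8):
    """Scale the size down so the longer side is at most max_side.

    Both dimensions are rounded down to a multiple of 8, which
    Stable Diffusion requires for its width and height arguments.
    """
    scale = min(1.0, max_side / max(width, height))

    def snap(value):
        return max(multiple, int(value * scale) // multiple * multiple)

    return snap(width), snap(height)


def reduce_memory_use(pipe):
    """Trade a little speed for a lower peak VRAM footprint."""
    # Computes attention in slices instead of all at once.
    pipe.enable_attention_slicing()
    return pipe
```

A call such as `w, h = fit_resolution(1024, 768)` followed by `pipe(prompt, width=w, height=h)` keeps requests within a size the free-tier GPU is more likely to handle.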

---

### 8. **Performance and Usability Gaps**

* No visual feedback (like a progress bar) during image generation
* No input validation: a blank or malformed prompt may cause errors or meaningless output
* Output image display lacks metadata, download options, or customization
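
Input validation can be added with a small function that runs before the pipeline is called. This is a minimal sketch; `validate_prompt` and its `max_chars` limit of 500 are assumptions chosen for illustration, not values taken from the project.

```python
def validate_prompt(prompt, max_chars=500):
    """Return a cleaned prompt, or raise ValueError if it is unusable."""
    if prompt is None:
        raise ValueError("Prompt is missing.")
    # Collapse runs of whitespace and strip the ends.
    cleaned = " ".join(str(prompt).split())
    if not cleaned:
        raise ValueError("Prompt is empty. Please describe the image you want.")
    if len(cleaned) > max_chars:
        raise ValueError(f"Prompt is too long ({len(cleaned)} > {max_chars} characters).")
    return cleaned
```

Inside the Gradio callback, the `ValueError` can be caught and re-raised as `gr.Error(str(e))` so that the message appears in the interface instead of the console.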

---

### 9. **Security and Safety Concerns**

* Hardcoding a Hugging Face access token in the notebook exposes it to anyone who can read the file, which is a security vulnerability
* The app allows open-ended text input, which could be misused to generate inappropriate content
* If `safety_checker=None` is used, NSFW content may be generated
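
When the safety checker is left enabled, the pipeline output carries a per-image flag in `nsfw_content_detected`. The sketch below drops flagged images before they reach the interface. `unflagged_images` is a hypothetical helper, and it assumes `result` is the object returned by calling the pipeline.

```python
def unflagged_images(result):
    """Return only the images that the safety checker did not flag."""
    images = list(result.images)
    flags = getattr(result, "nsfw_content_detected", None)
    if flags is None:
        # No flags are reported when the safety checker is disabled,
        # so nothing can be filtered here.
        return images
    return [image for image, flagged in zip(images, flags) if not flagged]
```

If the returned list is empty, the app can show a message asking the user to rephrase the prompt rather than displaying a blank image.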

---

### 10. **Missing Functional Enhancements**

* No prompt history or way to store previously generated images
* No control over resolution, number of images, or visual style
* No download/save option for the generated image
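
Saving images and keeping a prompt history can be handled with a few lines of standard-library code. The sketch below assumes the images are PIL images (anything with a `.save(path)` method works); the `outputs` folder, the `history.jsonl` file name, and both function names are illustrative choices.

```python
import json
import re
import time
from pathlib import Path


def slugify(text, max_len=40):
    """Turn a prompt into a short, filesystem-safe fragment."""
    slug = re.sub(r"[^a-z0-9]+", "-", text.lower()).strip("-")
    return slug[:max_len] or "image"


def save_with_history(images, prompt, out_dir="outputs"):
    """Save each image as a PNG and append the prompt to a JSON Lines log."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    stamp = time.strftime("%Y%m%d-%H%M%S")
    paths = []
    for index, image in enumerate(images):
        path = out / f"{stamp}-{slugify(prompt)}-{index}.png"
        image.save(path)
        paths.append(str(path))
    with open(out / "history.jsonl", "a", encoding="utf-8") as log:
        log.write(json.dumps({"time": stamp, "prompt": prompt, "files": paths}) + "\n")
    return paths

# Example call with explicit generation controls (requires a loaded `pipe`):
# result = pipe(prompt, height=512, width=512, num_inference_steps=30,
#               guidance_scale=7.5, num_images_per_prompt=2)
# save_with_history(result.images, prompt)
```

The commented call shows the pipeline arguments that control resolution, quality, and the number of images per prompt; exposing them as Gradio sliders would give users the same control from the interface.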

---

