# Project: Text to AI Image Generator

**AI Image Generator with Stable Diffusion**


This is a simple AI image generation web app built with Hugging Face's `diffusers` library, served through a Gradio interface, and deployable on Hugging Face Spaces.

It uses **Stable Diffusion v1.5** to generate *high-quality images* from text prompts given by the user.

-------------------------------------------------------------------------

**Features**

* Text-to-image generation using Stable Diffusion
* Clean and simple web interface via Gradio
* Fast image generation with GPU support (CUDA)
* Fully deployable and shareable via Hugging Face Spaces

 ------------------------------------------------------------------------

 **Demo**

 Try it here: https://<your-space-name>.huggingface.space
(Replace with your actual Hugging Face Space URL)

**How It Works**

1. The user enters a text prompt (e.g., "a cat sitting on the moon").
2. The backend runs Stable Diffusion to convert the text into an image.
3. The output image is displayed in the web interface, typically within seconds when a GPU is available.

**Requirements**

These packages are required (also saved in requirements.txt):

1.   diffusers
2.   transformers
3.   accelerate
4.   safetensors
5.   torch
6.   gradio
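
Before running the notebook, it can help to confirm that the packages listed above are actually installed. The snippet below is a minimal sketch using only the standard library; the helper name `installed_versions` and the `REQUIRED` list are illustrative names, not part of the project.

```python
from importlib import metadata

# Package names taken from the requirements list above.
REQUIRED = ["diffusers", "transformers", "accelerate", "safetensors", "torch", "gradio"]


def installed_versions(packages):
    """Map each package name to its installed version, or None if it is missing."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


if __name__ == "__main__":
    for name, version in installed_versions(REQUIRED).items():
        print(f"{name:<13}{version or 'NOT INSTALLED'}")
```

Printing the versions at the top of the notebook also makes it easier to reproduce an environment later if a library update breaks something.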




-------------------------------------------------------------------------



# **Step 1:** Install and upgrade key Hugging Face libraries


1.  Diffusers
2.  Transformers
3.  Accelerate
4.  Safetensors

Together these libraries enable text generation, text-to-image generation, fast model execution, and secure, efficient loading of model weights.

In [None]:
# Install (or upgrade to) the latest releases of the required libraries
!pip install --upgrade --quiet diffusers transformers accelerate safetensors gradio

----------------------------------------------------------------------------------------------------


# **Step 2:** Load the Pretrained Model

*   Import `StableDiffusionPipeline` from `diffusers`, together with `torch` and `matplotlib`, before loading the model:



In [None]:
import torch
from diffusers import StableDiffusionPipeline
import matplotlib.pyplot as plt

-----------------------------------------------------------------------------------------


# **Step 3:** Login to Hugging Face

*  Use an access token to authenticate with the Hugging Face Hub so the model can be downloaded:






In [None]:
from huggingface_hub import login
login("YOUR TOKEN HERE")  # Paste your token here
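
Pasting a token directly into a cell makes it easy to leak when the notebook is shared. As a hedged alternative, the sketch below reads the token from an environment variable instead. The variable name `HF_TOKEN` and the helper names `read_hf_token` and `login_from_env` are assumptions made for illustration; only `huggingface_hub.login(token=...)` comes from the library itself.

```python
import os


def read_hf_token(env_var="HF_TOKEN"):
    """Return the token stored in the given environment variable.

    Raises RuntimeError if the variable is unset or blank, so a missing
    token is noticed before any model download starts.
    """
    token = os.environ.get(env_var, "").strip()
    if not token:
        raise RuntimeError(
            f"Set the {env_var} environment variable to your Hugging Face token."
        )
    return token


def login_from_env(env_var="HF_TOKEN"):
    """Log in to the Hugging Face Hub without hardcoding the token."""
    # Imported here so read_hf_token() works even without huggingface_hub installed.
    from huggingface_hub import login

    login(token=read_hf_token(env_var))
```

Calling `login_from_env()` in place of the `login("YOUR TOKEN HERE")` line keeps the secret out of the notebook file.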

-----------------------------------------------------------------------------------------

# **Step 4**: Generate Image from Prompt

*   Load the Stable Diffusion v1.5 pipeline and move it to the GPU when one is available; the pipeline then generates an image from the user's text prompt





In [None]:
# Use the GPU when available; float16 only works there, so fall back to float32 on CPU
device = "cuda" if torch.cuda.is_available() else "cpu"
dtype = torch.float16 if device == "cuda" else torch.float32

pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5",
    torch_dtype=dtype,
    use_safetensors=True
)
pipe = pipe.to(device)


Note: loading with `torch_dtype=torch.float16` and then moving the pipeline to `cpu` makes diffusers emit this warning:

Pipelines loaded with `dtype=torch.float16` cannot run with `cpu` device. It is not recommended to move them to `cpu` as running them will fail. Please make sure to use an accelerator to run the pipeline in inference, due to the lack of support for`float16` operations on this device in PyTorch. Please, remove the `torch_dtype=torch.float16` argument, or use another device for inference.
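
This step loads `pipe` but does not yet call it. The following is a minimal sketch of one way to generate and save an image. The function name `generate_image_file`, the default file name, and the default step count and guidance value are illustrative assumptions; `num_inference_steps` and `guidance_scale` are standard arguments of the Stable Diffusion pipeline call.

```python
def generate_image_file(pipe, prompt, path="output.png", steps=30, guidance_scale=7.5):
    """Run the pipeline once and save the first image to `path`.

    `pipe` is expected to behave like the StableDiffusionPipeline loaded in
    this step: calling it returns an object with an `.images` list of PIL images.
    """
    result = pipe(
        prompt,
        num_inference_steps=steps,      # more steps: slower, often finer detail
        guidance_scale=guidance_scale,  # higher values follow the prompt more closely
    )
    image = result.images[0]
    image.save(path)
    return image


# Usage with the pipeline from this step (needs the model and, ideally, a GPU):
# image = generate_image_file(pipe, "a cat sitting on the moon")
# plt.imshow(image); plt.axis("off"); plt.show()
```

Passing `pipe` in as an argument, rather than relying on a global, makes the function easy to try with a stand-in object before the real model is downloaded.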

-----------------------------------------------------------------------------------------

# **Step 6:** Deploying an Interactive Web Interface Using Gradio
In this step, we utilize Gradio to create an interactive web-based interface for the text-to-image generation model.

* Users can input a descriptive text prompt into the interface.

* The pre-trained Stable Diffusion pipeline (`pipe`) processes the prompt and generates a corresponding image.

* The generated image is then displayed directly within the interface.

* Calling `launch(share=True)` creates a temporary public URL that can be shared for demonstrations.





In [6]:
import gradio as gr

def generate_image(prompt):
    image = pipe(prompt).images[0]
    return image

-----------------------------------------------------------------------------------------

# **Step 7:** Building the Web Interface
Why Gradio?

*    Gradio provides a quick and easy way to create web interfaces for machine learning models.



In [8]:
# Gradio UI
gr.Interface(
    fn=generate_image,
    inputs=gr.Textbox(
        label="📝 Enter your image prompt",
        placeholder="e.g. A surreal landscape with floating islands"
    ),
    outputs=gr.Image(type="pil", label="🖼️ Generated Image"),
    title="🎨 Text-to-Image Generator (Stable Diffusion)",
    description="Enter a creative prompt to generate AI images using Stable Diffusion!",
    theme="default"
).launch(share=True)


* Running on local URL:  http://127.0.0.1:7861
* Running on public URL: https://a6543e47de1fefc0c0.gradio.live

This share link expires in 1 week. For free permanent hosting and GPU upgrades, run `gradio deploy` from the terminal in the working directory to deploy to Hugging Face Spaces (https://huggingface.co/spaces)






---





## **Possible Challenges**

### 1. **Environment and Dependency Issues**

* Installing heavy libraries like `diffusers`, `transformers`, `torch`, and `safetensors` takes time and can cause version conflicts.
* Frequent updates to Hugging Face libraries can break compatibility.
* CUDA compatibility issues can arise when using PyTorch or Stable Diffusion (e.g., wrong `torch_dtype` or incompatible CUDA versions).
* Long installation times or out-of-memory errors can occur because of Colab RAM/VRAM limits.

---

### 2. **Hugging Face Authentication Problems**

* Token errors due to:

  * Expired token
  * Wrong token scope (read vs write)
  * Using `login()` in the wrong order or without internet access

---

### 3. **Model Loading Errors**

* Issues with `StableDiffusionPipeline`:

  * `use_safetensors=True` is set but the model repository has no safetensors weights
  * `torch_dtype=torch.float16` is not supported on CPU
  * The Colab free tier may not have a GPU available, which can cause the pipeline to crash or run slowly

---

### 4. **Image Generation Failures**

* Prompts longer than the CLIP text encoder's 77-token limit are truncated, so the end of a long prompt is ignored; empty or very short prompts give unpredictable images
* Timeout due to GPU overload or a slow Hugging Face response
* Output not returned as expected, resulting in corrupted or blank images

---

### 5. **Debugging and Exception Handling**

* Lack of detailed error messages makes debugging harder
* A `try/except` block around the generation call helps, and logging full tracebacks would improve diagnostics further
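
One way to capture full tracebacks is sketched below. The wrapper name `safe_generate` and the logger name are hypothetical; `generate_fn` stands for any function that takes a prompt and returns an image, such as the notebook's `generate_image`.

```python
import logging
import traceback

logger = logging.getLogger("image_generator")


def safe_generate(generate_fn, prompt):
    """Call generate_fn(prompt); return (image, None) or (None, error_text).

    On failure the full traceback is logged, and a short message is returned
    so the UI can show something more useful than a blank image.
    """
    try:
        return generate_fn(prompt), None
    except Exception as exc:  # broad on purpose: this is the top-level UI boundary
        logger.error(
            "Generation failed for prompt %r\n%s", prompt, traceback.format_exc()
        )
        return None, f"{type(exc).__name__}: {exc}"
```

The returned error text is deliberately short; the detailed stack trace goes to the log, where it does not clutter the interface.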

---

### 6. **Gradio Deployment Challenges**

* Internet issues during `launch(share=True)` can cause interface failures
* Gradio UI may lag in Colab environments
* Colab runtime disconnects or times out frequently
* Deploying to Hugging Face Spaces requires additional setup, such as an `app.py` entry point, a `requirements.txt` file, and the Space's README metadata

---

### 7. **Hardware Limitations (Colab)**

* Free GPU quota exceeded message
* Out-of-memory errors during image generation at high output resolutions
* Colab shuts down runtime due to inactivity if testing takes too long
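
Memory use grows with output resolution, and the Stable Diffusion pipeline requires the width and height to be divisible by 8. The helper below is a small sketch for keeping a requested size within a budget. The name `fit_resolution` and the default `max_side=512` (the resolution Stable Diffusion v1.5 was trained at) are illustrative choices, not settings taken from this project.

```python
def fit_resolution(width, height, max_side=512):
    """Shrink (width, height) to fit within max_side, rounded down to multiples of 8."""
    longest = max(width, height)
    if longest > max_side:
        # Integer arithmetic keeps the aspect ratio without float rounding surprises.
        width = width * max_side // longest
        height = height * max_side // longest
    return max(8, width // 8 * 8), max(8, height // 8 * 8)


# Usage with the notebook's pipeline:
# w, h = fit_resolution(1024, 768)
# image = pipe(prompt, width=w, height=h).images[0]
```

If memory is still tight, calling `pipe.enable_attention_slicing()` before generating is another option in diffusers, trading some speed for a lower peak VRAM requirement.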

---

### 8. **Performance and Usability Gaps**

* No visual feedback (like a progress bar) during image generation
* No input validation, so a blank or malformed prompt may crash the pipeline
* Output image display lacks metadata, download options, or customization
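
A minimal sketch of prompt validation is shown below. The function name `validate_prompt` and the 500-character cap are assumptions chosen for illustration; the cap is not a Stable Diffusion constant.

```python
MAX_PROMPT_CHARS = 500  # illustrative limit, not a Stable Diffusion constant


def validate_prompt(prompt):
    """Return a cleaned prompt, or raise ValueError if it is unusable."""
    if not isinstance(prompt, str):
        raise ValueError("Prompt must be text.")
    cleaned = " ".join(prompt.split())  # trim and collapse runs of whitespace
    if not cleaned:
        raise ValueError("Please enter a prompt.")
    if len(cleaned) > MAX_PROMPT_CHARS:
        raise ValueError(f"Prompt is longer than {MAX_PROMPT_CHARS} characters.")
    return cleaned
```

Inside `generate_image`, the prompt could be passed through `validate_prompt` first. In a Gradio app, re-raising the message as `gr.Error` is one way to show it to the user instead of a stack trace.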

---

### 9. **Security and Safety Concerns**

* A hardcoded Hugging Face access token visible in the notebook is a security vulnerability
* The app allows open-ended text input, which could be misused to generate inappropriate content
* If `safety_checker=None` is used, NSFW content may be generated

---

### 10. **Missing Functional Enhancements**

* No prompt history or way to store previously generated images
* No control over resolution, number of images, or visual style
* No download/save option for the generated image
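
As a rough sketch of how saving and prompt history could be added, the helper below writes each image to a folder and appends the prompt to a JSON Lines file. The function name `save_with_history`, the `outputs` folder, and the `history.jsonl` file name are all hypothetical choices.

```python
import json
import time
from pathlib import Path


def save_with_history(image, prompt, out_dir="outputs"):
    """Save `image` under out_dir and append the prompt to history.jsonl.

    `image` only needs a PIL-style .save(path) method.
    """
    folder = Path(out_dir)
    folder.mkdir(parents=True, exist_ok=True)
    # Timestamped name; the counter avoids overwriting within the same second.
    stamp = time.strftime("%Y%m%d-%H%M%S")
    count = len(list(folder.glob("*.png")))
    path = folder / f"{stamp}-{count:04d}.png"
    image.save(path)
    with open(folder / "history.jsonl", "a", encoding="utf-8") as f:
        f.write(json.dumps({"file": path.name, "prompt": prompt}) + "\n")
    return path
```

Calling this at the end of `generate_image` would leave a record of every prompt next to its image, which could later feed a gallery or a download button.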

---



Thank you!