# Project: Text To AI Image Generator


# **AI-Powered Image Creation with Stable Diffusion**
This project is a clean and efficient AI image generation web app developed using Hugging Face's diffusers library. The app utilizes Stable Diffusion v1.5 to generate high-quality images from natural language prompts provided by users.

It was initially deployed using Gradio and hosted on Hugging Face Spaces, offering a quick and interactive interface for AI image generation.

# **Key Features**
*   Text-to-Image Generation powered by Stable Diffusion
*   Minimalist and User-Friendly Interface built with Gradio
*   Fast Processing with GPU Support (CUDA)
*   Easily Shareable & Deployable via Hugging Face Spaces

# **Live Demo**
Try the app here: https://your-space-name.huggingface.space
(Replace this with your actual Hugging Face Space URL)

# **How It Works**
1.  The user submits a text prompt (e.g., "a cat sitting on the moon").
2.  The backend uses Stable Diffusion to convert the text into a corresponding image.
3.  The generated image is displayed in the web interface, typically within seconds when a GPU is available.

# **Requirements**
The following Python packages are required (also listed in requirements.txt):

*   diffusers
*   transformers
*   accelerate
*   safetensors
*   torch
*   gradio
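A quick way to confirm that these packages are available before running the notebook is to query their installed versions. The sketch below uses only the Python standard library; the helper name `installed_versions` is illustrative and not part of the project.

```python
from importlib.metadata import PackageNotFoundError, version

# Package names taken from the requirements list above.
REQUIRED_PACKAGES = ["diffusers", "transformers", "accelerate", "safetensors", "torch", "gradio"]


def installed_versions(packages):
    """Return {package: version string, or None if the package is not installed}."""
    found = {}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None
    return found


if __name__ == "__main__":
    for name, ver in installed_versions(REQUIRED_PACKAGES).items():
        print(f"{name}: {ver or 'MISSING'}")
```

Any package reported as MISSING needs to be installed before continuing with Step 1.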



-------------------------------------------------------------------------



# **Step 1:** Install Libraries


The project relies on the following Hugging Face libraries:

1.  Diffusers
2.  Transformers
3.  Accelerate
4.  Safetensors

They enable text processing, text-to-image generation, and fast model execution with secure and efficient loading of model weights.

In [1]:
# Install (or upgrade to) the latest releases of all required packages in one step
!pip install --upgrade diffusers transformers accelerate safetensors gradio


Collecting diffusers
  Downloading diffusers-0.34.0-py3-none-any.whl.metadata (20 kB)
Downloading diffusers-0.34.0-py3-none-any.whl (3.8 MB)
Installing collected packages: diffusers
  Attempting uninstall: diffusers
    Found existing installation: diffusers 0.25.0
    Uninstalling diffusers-0.25.0:
      Successfully uninstalled diffusers-0.25.0
Successfully installed diffusers-0.34.0

----------------------------------------------------------------------------------------------------


# **Step 2:** Load the Pretrained Model

*   Import StableDiffusionPipeline from diffusers (the model weights themselves are loaded in Step 4):



In [2]:
import torch
from diffusers import StableDiffusionPipeline
import matplotlib.pyplot as plt



-----------------------------------------------------------------------------------------


# **Step 3:** Login to Hugging Face

*  Authenticate with a Hugging Face access token to gain access to the models:






In [None]:
from huggingface_hub import login
login("enter your API key")  # Replace this placeholder with your Hugging Face access token
# To run the project, you must paste your own access token above
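Pasting the token directly into a notebook cell makes it easy to leak when the notebook is shared. One hedged alternative is to read it from an environment variable. In the sketch below, the helper names `read_hf_token` and `login_from_env` are hypothetical, and using `HF_TOKEN` as the variable name is an assumption about how the environment is set up.

```python
import os


def read_hf_token(env_var="HF_TOKEN"):
    """Return the token stored in the given environment variable, or raise a clear error."""
    token = os.environ.get(env_var, "").strip()
    if not token:
        raise RuntimeError(f"Set the {env_var} environment variable to your Hugging Face token.")
    return token


def login_from_env(env_var="HF_TOKEN"):
    """Log in to the Hugging Face Hub with a token taken from the environment."""
    # Imported here so read_hf_token stays usable without huggingface_hub installed.
    from huggingface_hub import login

    login(token=read_hf_token(env_var))
```

Calling `login_from_env()` then replaces the hardcoded `login("...")` call, and the token never appears in the notebook itself.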

-----------------------------------------------------------------------------------------

# **Step 4:** Generate Image from Prompt

*   Load the Stable Diffusion v1.5 pipeline; the user's text prompt is then passed to it to generate an image





In [4]:
# Choose the device first so the dtype can match it:
# float16 only works on a GPU, so fall back to float32 on CPU.
device = "cuda" if torch.cuda.is_available() else "cpu"
dtype = torch.float16 if device == "cuda" else torch.float32

pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5",
    torch_dtype=dtype,
    use_safetensors=True
)
pipe = pipe.to(device)


Fetching 15 files: 100%|██████████| 15/15 [03:42<00:00, 14.84s/it]
Loading pipeline components...: 100%|██████████| 7/7 [00:41<00:00,  5.99s/it]

Loading with torch_dtype=torch.float16 unconditionally and then moving the pipeline to cpu makes diffusers print this warning:

Pipelines loaded with `dtype=torch.float16` cannot run with `cpu` device. It is not recommended to move them to `cpu` as running them will fail. Please make sure to use an accelerator to run the pipeline in inference, due to the lack of support for`float16` operations on this device in PyTorch. Please, remove the `torch_dtype=torch.float16` argument, or use another device for inference.
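The cell above only loads the pipeline. A minimal sketch of the generation step itself might look like the following. The helper name `generate_to_file`, the output filename, and the default parameter values are illustrative assumptions; `num_inference_steps` and `guidance_scale` are arguments accepted by the Stable Diffusion pipeline call.

```python
def generate_to_file(pipe, prompt, path="output.png", steps=30, guidance_scale=7.5):
    """Run the pipeline once, save the first image to `path`, and return it."""
    result = pipe(prompt, num_inference_steps=steps, guidance_scale=guidance_scale)
    image = result.images[0]  # a PIL image with the pipeline's default output type
    image.save(path)
    return image


# Example use in the notebook (assumes `pipe` from the cell above and
# `plt` as imported in Step 2):
# image = generate_to_file(pipe, "a cat sitting on the moon")
# plt.imshow(image); plt.axis("off"); plt.show()
```

Fewer inference steps make generation faster at some cost in image quality, which can help on slower hardware.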

-----------------------------------------------------------------------------------------

# **Step 6:** Deploying an Interactive Web Interface Using Gradio
In this step, we use Gradio to create an interactive web interface for the text-to-image generation model.

* Users enter a descriptive text prompt in the interface.

* The pre-trained Stable Diffusion pipeline (pipe) processes the prompt and generates a corresponding image.

* The generated image is then displayed directly within the interface.

* Passing share=True to launch() creates a temporary, publicly shareable URL for demonstration purposes.





In [5]:
import gradio as gr

def generate_image(prompt):
    image = pipe(prompt).images[0]
    return image

-----------------------------------------------------------------------------------------

# **Step 7:** Building the Web Interface
Why Gradio?

*    Gradio provides a quick and easy way to create web interfaces for machine learning models.



In [None]:
# Gradio UI
demo = gr.Interface(
    fn=generate_image,
    inputs=gr.Textbox(
        label="Enter your image prompt",
        placeholder="e.g. A surreal landscape with floating islands"
    ),
    outputs=gr.Image(type="pil", label="Generated Image"),
    title="Text-to-Image Generator (Stable Diffusion)",
    description="Enter a creative prompt to generate AI images using Stable Diffusion!",
    theme="default"
)
demo.launch(share=True)  # share=True also creates a temporary public URL


* Running on local URL:  http://127.0.0.1:7860
* Running on public URL: https://12ca68fc827ea72ce9.gradio.live

This share link expires in 1 week. For free permanent hosting and GPU upgrades, run `gradio deploy` from the terminal in the working directory to deploy to Hugging Face Spaces (https://huggingface.co/spaces)







---



# **Common Challenges and Limitations Faced During Development**

## **1. Environment & Dependency Issues**
* Installing large libraries like diffusers, transformers, torch, and safetensors is time-consuming and can lead to version conflicts.

* Frequent updates to Hugging Face packages often cause compatibility issues.

* CUDA-related errors may occur if the specified torch_dtype isn't supported on the user's hardware or if CUDA versions are mismatched.

* Colab limitations such as long install times and memory/VRAM overflows occasionally disrupted the workflow.

## **2. Hugging Face Authentication Errors**
Access token problems are common due to:

* Expired or invalid tokens.

* Insufficient token scopes (for example, a read-only token where write access is needed).

* Calling login() without an active internet connection, or in the wrong order (for example, only after the model download has already been attempted).

## **3. Model Loading Failures**
Errors while loading StableDiffusionPipeline, such as:

* Using use_safetensors=True when the model repository does not provide safetensors weights.

* Selecting torch.float16 on CPU-only systems, which makes inference fail.

* Running without a GPU (for example, when no GPU runtime is available in the Colab free tier), which leads to very slow or failed inference.

## **4. Image Generation Issues**
Prompt-related problems (prompts that are too long or too short) and runtime conditions can cause failures such as:

* pipe(prompt).images[0] returning None or crashing.

* Slow Hugging Face responses or GPU overload resulting in timeouts.

* Occasional blank or corrupted output images.
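One likely source of blank images is the pipeline's safety checker, which replaces flagged outputs with black images and reports them in the `nsfw_content_detected` field of the result. The sketch below shows one way to detect that case instead of silently displaying a black image; the helper name `generate_checked` is hypothetical.

```python
def generate_checked(pipe, prompt):
    """Return the first generated image, or None if the safety checker flagged it."""
    result = pipe(prompt)
    # nsfw_content_detected is a list of booleans, or None when no checker is active.
    flags = getattr(result, "nsfw_content_detected", None)
    if flags and flags[0]:
        return None
    return result.images[0]
```

A caller can then show a message such as "please try a different prompt" when the function returns None.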

## **5. Debugging & Error Handling**
* A lack of detailed error logs made debugging more difficult.

* Although a try/except block was implemented, adding full traceback logs would improve diagnostics and issue tracking.
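A minimal sketch of traceback logging, assuming the standard `logging` module is acceptable for this project. The logger name `text2image` and the helper name `generate_with_logging` are illustrative.

```python
import logging

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger("text2image")


def generate_with_logging(pipe, prompt):
    """Return the generated image, or None after logging the full traceback."""
    try:
        return pipe(prompt).images[0]
    except Exception:
        # logger.exception records the message together with the complete traceback.
        logger.exception("Image generation failed for prompt %r", prompt)
        return None
```

Returning None keeps the interface responsive, while the log retains the details needed to diagnose the failure.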

## **6. Gradio Deployment Challenges**
* Internet issues during demo.launch(share=True) often caused Gradio UI failures.

* Colab environments sometimes lag or disconnect, especially when running the UI.

* Deploying to Hugging Face Spaces required additional steps like GitHub repo linking, model card setup, and token integration.

## **7. Hardware Constraints (Colab-Specific)**
Limited GPU availability in Colab often showed up as:

* "Free GPU quota exceeded" warnings.

* Out-of-memory errors during generation of high-resolution images.

Colab runtimes also shut down automatically during idle periods, which interrupted the development process.
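Two common ways to reduce memory pressure are attention slicing, which diffusers exposes as `enable_attention_slicing()`, and generating at a smaller resolution. The sketch below combines both. The helper names and the 256 to 768 pixel bounds are assumptions; the pipeline requires width and height to be divisible by 8.

```python
def snap_to_multiple_of_8(value, minimum=256, maximum=768):
    """Clamp a requested size to [minimum, maximum] and round down to a multiple of 8."""
    clamped = max(minimum, min(maximum, int(value)))
    return clamped - (clamped % 8)


def generate_low_memory(pipe, prompt, width=512, height=512):
    """Generate one image with attention slicing enabled and a bounded resolution."""
    pipe.enable_attention_slicing()  # trades some speed for lower peak VRAM use
    result = pipe(
        prompt,
        width=snap_to_multiple_of_8(width),
        height=snap_to_multiple_of_8(height),
    )
    return result.images[0]
```

Lowering the resolution is usually the most effective lever, since memory use grows quickly with image size.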

## **8. Performance & Usability Gaps**
* No visual loading indicator (e.g., progress bar) during image generation.

* No input validation: blank or invalid prompts could cause errors.

* Output images lacked metadata, download buttons, and format control.
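Input validation can be added with a small helper that runs before the pipeline is called. This is a sketch under stated assumptions: the helper name `validate_prompt` and the 500-character limit are illustrative choices, not values taken from the project.

```python
MAX_PROMPT_CHARS = 500  # illustrative limit, not a Stable Diffusion requirement


def validate_prompt(prompt):
    """Return a cleaned prompt, or raise ValueError if it is blank or too long."""
    if prompt is None or not str(prompt).strip():
        raise ValueError("Please enter a prompt.")
    cleaned = " ".join(str(prompt).split())  # collapse repeated whitespace
    if len(cleaned) > MAX_PROMPT_CHARS:
        raise ValueError(f"Prompt is longer than {MAX_PROMPT_CHARS} characters.")
    return cleaned
```

Inside generate_image, the ValueError can be caught and re-raised as `gr.Error(str(exc))` so that Gradio shows the message in the interface.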

## **9. Security & Safety Concerns**
* A Hugging Face access token hardcoded in the source is a major security risk.

* Open-ended user input raises the risk of NSFW or unsafe content generation.

* If the pipeline is loaded with safety_checker=None, inappropriate outputs bypass the built-in filter.

## **10. Missing Functional Enhancements**
* No prompt history or way to review past generations.

* Lack of control settings like:

    * Image resolution
    * Style customization
    * Number of output images

* No download/save button for the generated images.
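Some of these controls could be added with extra Gradio inputs. The sketch below adds sliders for resolution and number of images, and shows the results in a gallery. The function names, slider ranges, and labels are assumptions; `width`, `height`, and `num_images_per_prompt` are arguments of the Stable Diffusion pipeline call.

```python
def generate_images(pipe, prompt, width, height, num_images):
    """Run the pipeline with user-selected settings and return the list of images."""
    result = pipe(
        prompt,
        width=int(width),
        height=int(height),
        num_images_per_prompt=int(num_images),
    )
    return result.images


def build_demo(pipe):
    """Build (but do not launch) a Gradio interface with the extra controls."""
    import gradio as gr  # imported here so generate_images works without gradio

    def _run(prompt, width, height, num_images):
        return generate_images(pipe, prompt, width, height, num_images)

    return gr.Interface(
        fn=_run,
        inputs=[
            gr.Textbox(label="Enter your image prompt"),
            gr.Slider(minimum=256, maximum=768, step=64, value=512, label="Width"),
            gr.Slider(minimum=256, maximum=768, step=64, value=512, label="Height"),
            gr.Slider(minimum=1, maximum=4, step=1, value=1, label="Number of images"),
        ],
        outputs=gr.Gallery(label="Generated Images"),
        title="Text-to-Image Generator (Stable Diffusion)",
    )


# Example use (assumes `pipe` is the loaded pipeline):
# build_demo(pipe).launch(share=True)
```

The slider step of 64 keeps every selectable size divisible by 8, which the pipeline requires.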



Thank you!!!!!