# Project: Text To AI Image Generator

**AI Image Generator with Stable Diffusion**


This is a simple AI image generation web app built with Hugging Face's diffusers library and Gradio, and deployable on Hugging Face Spaces.

It uses **Stable Diffusion v1.5** to generate *high-quality images* from text prompts given by the user.

-------------------------------------------------------------------------

**Features**

*   Text-to-image generation using Stable Diffusion
*   Clean and simple web interface via Gradio
*   Fast image generation with GPU support (CUDA)
*   Fully deployable and shareable via Hugging Face Spaces

 ------------------------------------------------------------------------

 **Demo**

Try it here: https://huggingface.co/spaces/<your-username>/<your-space-name>
(Replace with your actual Hugging Face Space URL)

**How It Works**

1.   The user enters a text prompt (e.g., "a cat sitting on the moon").
2.   The backend runs Stable Diffusion to convert the text into an image.
3.   The output image is displayed in the web interface, typically within seconds on a GPU.

**Requirements**

These packages are required (also saved in requirements.txt):

1.   diffusers
2.   transformers
3.   accelerate
4.   safetensors
5.   torch
6.   gradio
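
Because version mismatches between these packages are a common source of errors, it can help to check what is actually installed before running the notebook. The following is a minimal sketch using only the standard library; the helper name `installed_versions` is hypothetical and not part of the project.

```python
from importlib import metadata

# Package names taken from the requirements list above.
REQUIRED = ["diffusers", "transformers", "accelerate", "safetensors", "torch", "gradio"]


def installed_versions(packages):
    """Map each package name to its installed version, or None if it is missing."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


# Print a short report; missing packages are shown explicitly.
for name, version in installed_versions(REQUIRED).items():
    print(f"{name}: {version or 'NOT INSTALLED'}")
```

Any package reported as not installed can then be added with pip before moving on to Step 1.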




-------------------------------------------------------------------------



# Step 1: Installed and upgraded essential Hugging Face libraries

*   Diffusers
*   Transformers
*   Accelerate
*   Safetensors



These core libraries were upgraded to their latest stable versions to support text-to-image synthesis and optimized model execution. Gradio is installed in the same cell for the web interface. Safetensors is a weight format that loads quickly and, unlike pickle-based checkpoints, does not execute arbitrary code when loaded.

In [10]:

!pip install --upgrade --quiet diffusers transformers accelerate safetensors gradio

----------------------------------------------------------------------------------------------------

# Step 2: Importing Important Libraries

Imported torch, Gradio, the StableDiffusionPipeline class, and the login helper from huggingface_hub, which is used in the next step to authenticate with Hugging Face.




In [2]:
import torch
import gradio as gr
from diffusers import StableDiffusionPipeline
from huggingface_hub import login



-----------------------------------------------------------------------------------------

# Step 3: Authenticate with Hugging Face

*  Logged in with a Hugging Face access token to enable secure access to Hugging Face-hosted models, so that the Stable Diffusion v1.5 model can be loaded in later steps.




In [3]:
# Step 3: Authenticate with Hugging Face
import os
# Read the token from the HF_TOKEN environment variable instead of hardcoding it in the notebook
login(token=os.environ["HF_TOKEN"])
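
Keeping the token out of the notebook also makes it possible to fail early with a readable message when it is missing. The sketch below is one way to do that; `get_hf_token` is a hypothetical helper, and the environment variable name `HF_TOKEN` is an assumption about how the token is stored.

```python
import os


def get_hf_token(env_var="HF_TOKEN"):
    """Return the Hugging Face token stored in an environment variable.

    Raises RuntimeError with a readable message when the variable is unset
    or blank, so the failure happens before any model download starts.
    """
    token = os.environ.get(env_var, "").strip()
    if not token:
        raise RuntimeError(
            f"Environment variable {env_var} is not set; "
            "create a token in your Hugging Face account settings and export it."
        )
    return token


# Usage with the login helper imported in Step 2 (left commented out so that
# this snippet runs without network access):
# login(token=get_hf_token())
```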

-----------------------------------------------------------------------------------------


# Step 4: Load the Stable Diffusion Pipeline
*  Used StableDiffusionPipeline from the diffusers library to load the pretrained Stable Diffusion v1.5 model and move it to the GPU when one is available.







In [4]:
#  Step 4: Load the pipeline

# Choose the device first: the pipeline cannot run in float16 on CPU, so fall back to float32 there
device = "cuda" if torch.cuda.is_available() else "cpu"
dtype = torch.float16 if device == "cuda" else torch.float32

pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5",
    torch_dtype=dtype,
    use_safetensors=True
)
pipe = pipe.to(device)


Loading pipeline components...: 100%|██████████| 7/7 [00:32<00:00,  4.62s/it]

Loading with `torch_dtype=torch.float16` unconditionally and then moving the pipeline to `cpu` produces the following diffusers warning:

Pipelines loaded with `dtype=torch.float16` cannot run with `cpu` device. It is not recommended to move them to `cpu` as running them will fail. Please make sure to use an accelerator to run the pipeline in inference, due to the lack of support for`float16` operations on this device in PyTorch. Please, remove the `torch_dtype=torch.float16` argument, or use another device for inference.

-----------------------------------------------------------------------------------------

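# Step 5: Define the Image Generation Function

*   Defined generate_image, which runs the pipeline on the user's prompt and returns the first generated image for Gradio to display, or None if generation fails: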





In [5]:
def generate_image(prompt):
    print("Prompt received:", prompt)
    try:
        image = pipe(prompt).images[0]
        print("Image generated successfully")
        return image
    except Exception as e:
        # This is the most important line: it shows why generation failed
        print("Error in generating image:", str(e))
        return None

-----------------------------------------------------------------------------------------

# Step 6: Building the Web Interface
Why Gradio?

* Gradio makes it possible to build an interactive web interface for a machine learning model in a few lines of Python, with no separate frontend development.


In [9]:
# Gradio UI
demo = gr.Interface(
    fn=generate_image,
    inputs=gr.Textbox(label="Enter your image prompt"),
    outputs=gr.Image(type="pil"),
    title="Stable Diffusion Generator",
    description="Enter any creative prompt and get an AI-generated image!"
)

# Launch UI
demo.launch(share=True)

* Running on local URL:  http://127.0.0.1:7863

Could not create share link. Missing file: /home/codespace/.cache/huggingface/gradio/frpc/frpc_linux_amd64_v0.3. 

Please check your internet connection. This can happen if your antivirus software blocks the download of this file. You can install manually by following these steps: 

1. Download this file: https://cdn-media.huggingface.co/frpc-gradio-0.3/frpc_linux_amd64
2. Rename the downloaded file to: frpc_linux_amd64_v0.3
3. Move the file to this location: /home/codespace/.cache/huggingface/gradio/frpc




# Possible Challenges I Encountered
# 1. Environment and Dependency Conflicts

*   Installing large libraries such as diffusers, transformers, torch, and safetensors is time-consuming and can lead to version incompatibilities.
*   Frequent updates to the Hugging Face libraries can introduce breaking changes or deprecations.
*   PyTorch and CUDA can be incompatible (e.g., a torch build that does not match the installed CUDA version, or a torch_dtype the device does not support).
*   Installation can take a long time and hit memory-related errors, especially on platforms like Google Colab with limited RAM/VRAM.



--------------------------------------------------------------------------------


# 2. Authentication with Hugging Face

Issues related to token usage, such as:

*   Expired or invalid tokens
*   Incorrect token scope (e.g., requiring write access but using a read-only token)
*   Calling login() without an active internet connection, or only after the code that needs the token has already run


--------------------------------------------------------------------------------

# 3. Errors During Model Loading

Common problems while initializing StableDiffusionPipeline:

*   Setting use_safetensors=True when the model repository does not provide safetensors weights
*   Using torch_dtype=torch.float16 on a CPU, where the pipeline cannot run in half precision; float32 is needed there
*   Running the pipeline without GPU access (e.g., on Colab's free tier), resulting in performance bottlenecks or crashes
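
For the first problem, one possible approach is to try safetensors weights first and retry without requiring them. This is only a sketch: `load_with_safetensors_fallback` is a hypothetical helper, and the exception types it catches are an assumption about how a missing-safetensors failure surfaces.

```python
def load_with_safetensors_fallback(loader, model_id, **kwargs):
    """Try safetensors weights first, then retry without requiring them.

    `loader` is any callable shaped like StableDiffusionPipeline.from_pretrained.
    The caught exception types are an assumption, not documented behavior.
    """
    try:
        return loader(model_id, use_safetensors=True, **kwargs)
    except (OSError, ValueError) as err:
        print(f"safetensors load failed ({err}); retrying with use_safetensors=False")
        return loader(model_id, use_safetensors=False, **kwargs)


# Usage with the class imported in Step 2 (commented out: it downloads the model):
# pipe = load_with_safetensors_fallback(
#     StableDiffusionPipeline.from_pretrained,
#     "runwayml/stable-diffusion-v1-5",
#     torch_dtype=torch.float32,
# )
```

The fallback loads pickle-based weights, which can execute code on load, so it is only appropriate for repositories you trust.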


--------------------------------------------------------------------------------


# 4. Failures in Image Generation

*   Prompts longer than the CLIP text encoder's 77-token limit are truncated, so the tail of a long prompt is ignored; empty or malformed prompts can produce unusable results or cause .images[0] to fail
*   Delays or timeouts caused by limited GPU resources or slow model downloads from the Hugging Face Hub
*   Output not rendered as expected: occasionally blank, distorted, or corrupt images

--------------------------------------------------------------------------------


# 5. Challenges in Debugging and Exception Handling

*   Lack of descriptive error messages can hinder effective troubleshooting
*   Although try/except blocks are helpful, capturing and printing full tracebacks would offer better insights during development
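
A minimal sketch of the traceback idea, using the standard logging module. The wrapper name `call_with_traceback` and the logger name are hypothetical; the wrapper keeps the same "return None on failure" behavior as generate_image.

```python
import logging

logger = logging.getLogger("image_generator")


def call_with_traceback(fn, *args, **kwargs):
    """Call fn and, on failure, log the full traceback before returning None."""
    try:
        return fn(*args, **kwargs)
    except Exception:
        # logger.exception logs at ERROR level and appends the current traceback.
        logger.exception("Call to %s failed", getattr(fn, "__name__", repr(fn)))
        return None


# Usage with the pipeline from Step 4 (commented out: it needs the loaded model):
# result = call_with_traceback(pipe, "a cat sitting on the moon")
```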

--------------------------------------------------------------------------------

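# 6. Difficulties in Gradio Deployment

*   Connectivity issues during demo.launch(share=True) can prevent the public share link from being created (as in the Step 6 output, where the frpc binary could not be downloaded), leaving only the local URL
*   Gradio UI responsiveness can degrade in Colab environments
*   Frequent disconnections or timeouts in Colab sessions interrupt testing and development
*   Deploying to Hugging Face Spaces requires additional setup, such as a Space repository containing the app script, a requirements.txt, and a README.md with the Space's configuration metadata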
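
A share link is only useful outside of Spaces, so the launch arguments can be chosen per environment. The sketch below assumes that Hugging Face Spaces sets a `SPACE_ID` environment variable; `launch_options` is a hypothetical helper.

```python
import os


def launch_options(environ=None):
    """Choose demo.launch() keyword arguments for the current environment.

    Assumption: Hugging Face Spaces sets a SPACE_ID environment variable.
    On Spaces the app is already public, so no share link is requested;
    elsewhere (Colab, Codespaces, local) a share link is requested.
    """
    environ = os.environ if environ is None else environ
    on_spaces = bool(environ.get("SPACE_ID"))
    return {"share": not on_spaces}


# Usage with the Interface built in Step 6:
# demo.launch(**launch_options())
```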

--------------------------------------------------------------------------------

# 7. Hardware Constraints (Google Colab)

*   Hitting the free GPU quota limit restricts further usage
*   Out-of-memory (OOM) errors when using high-resolution or multiple prompts
*   Colab runtime disconnects after periods of inactivity, interrupting ongoing processes
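
Two common ways to reduce memory pressure are to cap the requested resolution and to enable attention slicing on the pipeline. The sketch below illustrates both; the helper names and the 512-pixel cap are assumptions chosen for illustration (Stable Diffusion expects width and height to be divisible by 8).

```python
def snap_size(width, height, max_side=512, multiple=8):
    """Clamp each side to max_side and round it down to a multiple of `multiple`."""
    def snap(value):
        value = max(multiple, min(int(value), max_side))
        return value - (value % multiple)
    return snap(width), snap(height)


def enable_memory_savers(pipe):
    """Turn on attention slicing when the pipeline object offers it."""
    if hasattr(pipe, "enable_attention_slicing"):
        pipe.enable_attention_slicing()
    return pipe


# Usage with the pipeline from Step 4 (commented out: it needs the loaded model):
# pipe = enable_memory_savers(pipe)
# w, h = snap_size(1024, 500)
# image = pipe("a cat sitting on the moon", width=w, height=h).images[0]
```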

--------------------------------------------------------------------------------


# 8. Gaps in Performance and User Experience

*   No progress indicator during image generation, leaving the user uncertain about runtime status
*   Lack of input validation: empty or incorrect prompts can break the generation pipeline
*   Output image display is basic, without metadata, download options, or customization features
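
Input validation can be added in front of the pipeline call. The following is a sketch only: `clean_prompt` is a hypothetical helper and its `max_chars` limit is an arbitrary illustrative value, not a model requirement.

```python
def clean_prompt(prompt, max_chars=500):
    """Return a trimmed prompt, or raise ValueError if it is unusable."""
    if not isinstance(prompt, str):
        raise ValueError("Prompt must be text.")
    text = " ".join(prompt.split())  # collapse runs of whitespace
    if not text:
        raise ValueError("Prompt is empty.")
    if len(text) > max_chars:
        raise ValueError(f"Prompt is longer than {max_chars} characters.")
    return text


print(clean_prompt("  a cat   sitting on the moon "))  # prints: a cat sitting on the moon
```

Inside generate_image, the ValueError could be caught and re-raised as gr.Error so that the message appears in the Gradio interface instead of an empty output.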

--------------------------------------------------------------------------------


# 9. Security and Content Moderation Issues

*   Hardcoded access tokens in notebooks pose serious security risks
*   Open-ended text input enables misuse for generating inappropriate or unsafe content
*   Disabling the safety_checker may result in NSFW or offensive outputs without filtering
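
When the safety checker is left enabled, the pipeline output carries a per-image flag list that the app can inspect before showing anything. The sketch below uses a stand-in object in place of a real pipeline result; `first_unflagged_image` is a hypothetical helper.

```python
from types import SimpleNamespace


def first_unflagged_image(result):
    """Return the first image not flagged by the safety checker, else None.

    `result` is expected to look like the pipeline output: an `images` list
    and an optional `nsfw_content_detected` list of booleans.
    """
    images = list(result.images)
    flags = getattr(result, "nsfw_content_detected", None)
    if flags is None:
        # No checker ran, so there is nothing to filter on.
        flags = [False] * len(images)
    for image, flagged in zip(images, flags):
        if not flagged:
            return image
    return None


# Stand-in for demonstration; a real call would be pipe(prompt).
demo_result = SimpleNamespace(images=["img_a", "img_b"], nsfw_content_detected=[True, False])
print(first_unflagged_image(demo_result))  # prints: img_b
```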

--------------------------------------------------------------------------------

# 10. Missing Functional Enhancements

*   No built-in prompt history or image storage mechanism
*   Lack of configurable options such as image resolution, style, or batch generation
*   No functionality for users to download or save generated images within the interface
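
Image storage and prompt history could be added with a small helper that saves each result and appends a record to a log file. This is a sketch under assumptions: the function name, the `outputs` folder, and the `history.jsonl` file name are all hypothetical.

```python
import json
import time
from pathlib import Path


def save_with_history(image, prompt, out_dir="outputs"):
    """Save an image and append its prompt to a JSON Lines history file.

    `image` only needs a save(path) method, which PIL images provide.
    """
    folder = Path(out_dir)
    folder.mkdir(parents=True, exist_ok=True)
    stamp = time.strftime("%Y%m%d-%H%M%S")
    # Number files by existing count so two saves in one second do not collide.
    index = len(list(folder.glob("*.png")))
    path = folder / f"{stamp}-{index:04d}.png"
    image.save(path)
    with open(folder / "history.jsonl", "a", encoding="utf-8") as log:
        log.write(json.dumps({"file": path.name, "prompt": prompt}) + "\n")
    return path


# Usage inside generate_image (commented out: it needs a generated image):
# save_with_history(image, prompt)
```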