BrunooCS/Text-To-Image-Diffusion-Model
Diffusion Model

Forward Diffusion Process

A powerful conditional diffusion model for image generation from text descriptions.

Python PyTorch


Overview

This project implements a conditional diffusion model that generates images from text descriptions. The model learns to gradually denoise random Gaussian noise into coherent images, guided by text prompts.

Generation Process

Key Features

  • Text-to-Image Generation: Convert textual descriptions into high-quality images
  • Web Interface: User-friendly interface for image generation
  • Conditional Generation: Fine-grained control over generated content
  • Interactive Process Visualization: Watch the denoising process in real-time
  • Multi-Class Support: Generate images across various categories

Architecture

The model consists of several key components:

├── models/
│   ├── modules.py        # Neural network building blocks
│   ├── text_encoder.py   # Text embedding module
│   ├── time_encoder.py   # Timestep encoding module
│   └── unet.py           # Conditional UNet architecture
│
├── output_ImageNet/      # Generated outputs and visualizations
│
├── diffusion.py          # Core diffusion model implementation
├── plot_func.py          # Visualization utilities
├── preprocess.py         # Data loading and preprocessing
└── web.py                # Web interface for image generation

Sample Generation

Web Interface

The project includes a sleek web interface for easy interaction with the model:

Web Interface

Features:

  • Text prompt input
  • Multiple image generation
  • Process visualization
  • Real-time generation progress

Model Components

1. Diffusion Process

  • Forward diffusion adds noise gradually
  • Reverse diffusion learns to remove noise
  • Conditional generation guided by text embeddings
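The forward process above has a convenient closed form: instead of adding noise step by step, `x_t` can be sampled directly from `x_0`. A minimal PyTorch sketch of this (the linear beta schedule values are common DDPM defaults and an assumption here, not necessarily what `diffusion.py` uses):

```python
import torch

def make_schedule(T=1000, beta_start=1e-4, beta_end=0.02):
    """Linear beta schedule and the cumulative alpha products used below."""
    betas = torch.linspace(beta_start, beta_end, T)
    alphas = 1.0 - betas
    alpha_bar = torch.cumprod(alphas, dim=0)  # \bar{alpha}_t
    return betas, alpha_bar

def q_sample(x0, t, alpha_bar):
    """Sample x_t ~ q(x_t | x_0) in one shot:
    x_t = sqrt(abar_t) * x0 + sqrt(1 - abar_t) * eps."""
    eps = torch.randn_like(x0)
    ab = alpha_bar[t].view(-1, 1, 1, 1)  # broadcast over C, H, W
    xt = ab.sqrt() * x0 + (1 - ab).sqrt() * eps
    return xt, eps
```

The returned `eps` is kept around because the reverse model is trained to predict exactly this noise.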

2. Architecture Details

  • UNet backbone with skip connections
  • Text conditioning through cross-attention
  • Time embedding using sinusoidal positions
  • Batch normalization for stable training
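The sinusoidal time embedding mentioned above can be sketched as follows; this is the standard Transformer-style formulation, and the actual interface in `time_encoder.py` may differ:

```python
import math
import torch

def sinusoidal_embedding(t, dim):
    """Map integer timesteps t (shape [B]) to sin/cos features (shape [B, dim]).

    Each of the dim/2 frequency channels oscillates at a different rate,
    giving the UNet a smooth, unique code for every timestep."""
    half = dim // 2
    freqs = torch.exp(-math.log(10000.0) * torch.arange(half) / half)
    args = t.float()[:, None] * freqs[None, :]          # [B, half]
    return torch.cat([torch.sin(args), torch.cos(args)], dim=-1)
```

In practice this embedding is usually passed through a small MLP before being added into the UNet's residual blocks.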

3. Training Process

  • Dataset: Tiny ImageNet subset
  • Text embeddings: SentenceTransformer
  • Loss: MSE between predicted and actual noise
  • Optimizer: AdamW with gradient scaling

Sample Generations

Generate images from text descriptions like:

  • "red apple"
  • "golden retriever"
  • "sunset over mountains"
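At inference time a prompt like these is embedded and fed to the reverse process. A minimal DDPM ancestral-sampling sketch (again assuming a `model(x, t, text_emb)` call; `diffusion.py` may organize this loop differently):

```python
import torch

@torch.no_grad()
def sample(model, text_emb, betas, shape, device="cpu"):
    """Start from pure Gaussian noise and denoise for T steps,
    injecting fresh noise at every step except the last."""
    T = betas.shape[0]
    alphas = 1.0 - betas
    alpha_bar = torch.cumprod(alphas, dim=0)
    x = torch.randn(shape, device=device)
    for i in reversed(range(T)):
        t = torch.full((shape[0],), i, device=device, dtype=torch.long)
        eps_pred = model(x, t, text_emb)
        # Posterior mean: (x - (1 - a_t)/sqrt(1 - abar_t) * eps) / sqrt(a_t)
        mean = (x - (1 - alphas[i]) / (1 - alpha_bar[i]).sqrt() * eps_pred) \
               / alphas[i].sqrt()
        noise = torch.randn_like(x) if i > 0 else torch.zeros_like(x)
        x = mean + betas[i].sqrt() * noise
    return x
```

The README names SentenceTransformer for text embeddings, so `text_emb` here would come from encoding the prompt with that model before calling `sample`.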

License

This project is licensed under the MIT License - see the LICENSE file for details.


About

Text to image diffusion model from scratch with the Tiny ImageNet dataset, PyTorch, and a web interface for image generation.