A powerful conditional diffusion model for image generation from text descriptions.
This project implements a conditional diffusion model that generates images from text descriptions. Starting from random Gaussian noise, the model iteratively denoises it into a coherent image, guided at every step by the text prompt.
- Text-to-Image Generation: Convert textual descriptions into high-quality images
- Web Interface: User-friendly interface for image generation
- Conditional Generation: Fine-grained control over generated content
- Interactive Process Visualization: Watch the denoising process in real-time
- Multi-Class Support: Generate images across various categories
The model consists of several key components:
├── models/
│ ├── modules.py # Neural network building blocks
│ ├── text_encoder.py # Text embedding module
│ ├── time_encoder.py # Timestep encoding module
│ └── unet.py # Conditional UNet architecture
│
├── output_ImageNet/ # Generated outputs and visualizations
│
├── diffusion.py # Core diffusion model implementation
├── plot_func.py # Visualization utilities
├── preprocess.py # Data loading and preprocessing
└── web.py # Web interface for image generation
The project includes a sleek web interface for easy interaction with the model:
Features:
- Text prompt input
- Multiple image generation
- Process visualization
- Real-time generation progress
- Forward diffusion adds noise gradually
- Reverse diffusion learns to remove noise
- Conditional generation guided by text embeddings
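The forward process above has a convenient closed form: instead of adding noise one step at a time, `x_t` can be sampled directly from `x_0`. A minimal numpy sketch (the schedule values here are illustrative, not necessarily the ones used in `diffusion.py`):

```python
import numpy as np

# Closed-form forward diffusion: x_t = sqrt(abar_t)*x_0 + sqrt(1-abar_t)*eps
T = 1000
betas = np.linspace(1e-4, 0.02, T)   # linear noise schedule (assumed)
alphas = 1.0 - betas
alpha_bar = np.cumprod(alphas)       # cumulative product, abar_t

rng = np.random.default_rng(0)

def q_sample(x0, t):
    """Sample x_t from q(x_t | x_0) in a single step."""
    eps = rng.standard_normal(x0.shape)
    x_t = np.sqrt(alpha_bar[t]) * x0 + np.sqrt(1.0 - alpha_bar[t]) * eps
    return x_t, eps

x0 = np.zeros((8, 8))                # toy stand-in for an image
x_T, _ = q_sample(x0, T - 1)         # at t = T-1, x_t is nearly pure noise
```

Because `alpha_bar` decays toward zero, the signal is almost entirely replaced by Gaussian noise at the final timestep; the reverse process learns to undo this one step at a time.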
- UNet backbone with skip connections
- Text conditioning through cross-attention
- Time embedding using sinusoidal positions
- Batch normalization for stable training
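The sinusoidal time embedding mentioned above can be sketched as follows (Transformer-style positional encoding; the exact layout in `time_encoder.py` may differ):

```python
import math
import numpy as np

def timestep_embedding(t, dim):
    """Encode a scalar timestep t as a dim-length sinusoidal vector."""
    half = dim // 2
    # Geometric ladder of frequencies, from 1 down to 1/10000
    freqs = np.exp(-math.log(10000.0) * np.arange(half) / half)
    args = t * freqs
    return np.concatenate([np.sin(args), np.cos(args)])

emb = timestep_embedding(50, 128)    # a 128-dim embedding for t = 50
```

Nearby timesteps get similar vectors while distant ones stay distinguishable, which lets the UNet condition smoothly on how noisy its input is.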
- Dataset: Tiny ImageNet (a 200-class subset of ImageNet with 100,000 training images)
- Text embeddings: SentenceTransformer
- Loss: MSE between predicted and actual noise
- Optimizer: AdamW with gradient scaling
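The training objective listed above can be sketched in a few lines: mix noise into a clean image via the forward process, have the network predict that noise, and take the MSE. `eps_model` is a trivial placeholder for the conditional UNet, and the schedule is assumed, not taken from the project:

```python
import numpy as np

rng = np.random.default_rng(0)
alpha_bar = np.cumprod(1.0 - np.linspace(1e-4, 0.02, 1000))

def eps_model(x_t, t, text_emb):
    # Placeholder for the trained conditional UNet: it should return a
    # noise prediction with the same shape as x_t.
    return np.zeros_like(x_t)

def training_loss(x0, t, text_emb):
    eps = rng.standard_normal(x0.shape)                        # true noise
    x_t = np.sqrt(alpha_bar[t]) * x0 + np.sqrt(1 - alpha_bar[t]) * eps
    eps_hat = eps_model(x_t, t, text_emb)                      # predicted noise
    return np.mean((eps_hat - eps) ** 2)                       # MSE objective

loss = training_loss(np.zeros((8, 8)), t=500, text_emb=None)
```

In the real training loop this loss would be backpropagated through the UNet and stepped with AdamW; the gradient scaling mentioned above typically refers to mixed-precision training.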
Generate images from text descriptions like:
- "red apple"
- "golden retriever"
- "sunset over mountains"
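Under the hood, generation runs the reverse process from pure noise down to t = 0. A minimal sketch assuming DDPM-style ancestral sampling (the actual loop in `diffusion.py` may differ, and `eps_model` again stands in for the trained text-conditioned UNet):

```python
import numpy as np

rng = np.random.default_rng(0)
T = 1000
betas = np.linspace(1e-4, 0.02, T)   # assumed linear schedule
alphas = 1.0 - betas
alpha_bar = np.cumprod(alphas)

def eps_model(x, t):
    return np.zeros_like(x)          # placeholder noise predictor

x = rng.standard_normal((8, 8))      # start from pure Gaussian noise x_T
for t in range(T - 1, -1, -1):
    eps_hat = eps_model(x, t)
    # Posterior mean of x_{t-1} given the predicted noise
    mean = (x - betas[t] / np.sqrt(1 - alpha_bar[t]) * eps_hat) / np.sqrt(alphas[t])
    noise = rng.standard_normal(x.shape) if t > 0 else 0.0
    x = mean + np.sqrt(betas[t]) * noise
```

Each iteration removes a small amount of predicted noise and re-injects a controlled amount of fresh noise (none at the final step), which is what the web interface's process visualization shows frame by frame.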
This project is licensed under the MIT License - see the LICENSE file for details.