A toolkit for exploring and manipulating latent space representations in Stable Diffusion models, focusing on artistic applications through interpolation and visualization.
- VAE Encoding/Decoding: Encode images to SD 1.5 latent space and decode back to images
- Spherical Linear Interpolation (SLERP): Smooth transitions between latent representations
- Linear Interpolation (LERP): Alternative interpolation method for comparison
- Multi-way Interpolation: Blend multiple latent vectors with custom weights
- Streamlit Web UI: Interactive interface for real-time latent exploration
- Comprehensive Testing: Unit tests for core algorithms and edge cases
- Python 3.13 (recommended for best performance)
- macOS with Apple Silicon (MPS support) or x86_64 with CUDA/CPU
- Clone the repository:
git clone https://github.com/aadjones/diffusions
cd diffusions
- Set up development environment:
make setup # Create virtual environment
make install # Install all dependencies
Alternatively, manual setup:
python3 -m venv .venv
source .venv/bin/activate
pip install -r requirements.txt
Launch the Streamlit app for interactive exploration:
make run # Recommended
# or
streamlit run app.py
This opens a web interface where you can:
- Upload two images
- Adjust the interpolation slider to morph between them
- See real-time results in the latent space
from src.diffusion_art.models.vae import SD15VAE
from src.diffusion_art.core.interpolation import slerp, lerp
from PIL import Image
# Initialize VAE
vae = SD15VAE()
# Load and encode images
img1 = Image.open("image1.jpg")
img2 = Image.open("image2.jpg")
z1 = vae.encode(img1)
z2 = vae.encode(img2)
# Interpolate in latent space
interpolated = slerp(z1, z2, t=0.5) # Midpoint
# Decode back to image
result_image = vae.decode(interpolated)
result_image.save("interpolated.jpg")
from src.diffusion_art.core.interpolation import multi_slerp, create_interpolation_path
# Multi-way interpolation
# z3 and z4 are assumed to be encoded with vae.encode(), like z1 and z2 above
latents = [z1, z2, z3, z4]
weights = [0.3, 0.2, 0.3, 0.2]
blended = multi_slerp(latents, weights)
# Create interpolation path
path = create_interpolation_path(z1, z2, steps=10, method="slerp")
for i, latent in enumerate(path):
    frame = vae.decode(latent)
    frame.save(f"frame_{i:03d}.jpg")
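The frame loop above decodes one latent per step. As a rough illustration of how the interpolation parameters for such a path could be spaced, here is a minimal, dependency-free sketch. It assumes the path includes both endpoints; whether `create_interpolation_path` behaves this way is not confirmed here, and `path_parameters` and `build_path` are hypothetical names that are not part of the package.

```python
def path_parameters(steps):
    """Return `steps` evenly spaced values of t in [0, 1], endpoints included."""
    if steps < 2:
        raise ValueError("need at least two steps to include both endpoints")
    return [i / (steps - 1) for i in range(steps)]


def build_path(start, end, steps, interpolate):
    """Apply any interpolate(start, end, t) function at each t along the path.

    Hypothetical helper: `interpolate` could be a slerp or lerp function.
    """
    return [interpolate(start, end, t) for t in path_parameters(steps)]


if __name__ == "__main__":
    # Scalar linear interpolation stands in for a latent-space method here.
    scalar_lerp = lambda a, b, t: (1.0 - t) * a + t * b
    print(build_path(0.0, 10.0, 5, scalar_lerp))
```

With this spacing, the first and last frames reproduce the two input latents, which is convenient when chaining several paths into one animation.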
src/diffusion_art/
├── core/
│ └── interpolation.py # Interpolation algorithms
├── models/
│ └── vae.py # VAE wrapper for SD 1.5
└── ui/ # Future UI components
- SD15VAE: Wrapper around Stable Diffusion 1.5's VAE with proper scaling
- SLERP: Spherical linear interpolation preserving angular relationships
- LERP: Standard linear interpolation for baseline comparison
- Multi-SLERP: N-way interpolation with weighted blending
The project includes a comprehensive Makefile for development tasks:
# Development workflow
make format # Format code (black + isort + pre-commit hooks)
make lint # Lint code (flake8)
make typecheck # Type checking (mypy)
make test # Run tests
make clean # Clean up cache files
# Using Makefile (recommended)
make test
# Direct pytest usage
pytest
pytest --cov=src/diffusion_art # With coverage
pytest tests/unit/core/test_interpolation.py -v # Specific test
diffusions/
├── src/diffusion_art/ # Main package
├── tests/ # Test suite
├── app.py # Streamlit application
├── requirements.txt # Dependencies
├── Makefile # Development commands
├── pyproject.toml # Project configuration
├── .pre-commit-config.yaml # Pre-commit hooks
├── .flake8 # Flake8 configuration
├── mypy.ini # MyPy configuration
├── .github/workflows/ci.yml # CI/CD pipeline
└── CLAUDE.md # AI assistant instructions
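Stable Diffusion 1.5 uses a 4-channel latent space of size 64×64 to represent 512×512 pixel images, an 8× spatial downsampling per side. The VAE scale factor of 0.18215 is applied during encoding and decoding: by the usual SD 1.5 convention, latents are multiplied by it after encoding and divided by it before decoding.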
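The shape arithmetic in this paragraph can be checked with a small sketch. It uses only the figures quoted above; the function names are hypothetical, and the multiply-after-encode, divide-before-decode direction of the scale factor is the common SD 1.5 convention, assumed here rather than taken from this project's `SD15VAE` wrapper.

```python
SCALE_FACTOR = 0.18215   # VAE scale factor quoted above
LATENT_CHANNELS = 4      # channels in the SD 1.5 latent
DOWNSAMPLE = 8           # spatial reduction per side (512 / 64)


def latent_shape(height, width):
    """Return (channels, height, width) of the latent for an RGB image."""
    if height % DOWNSAMPLE or width % DOWNSAMPLE:
        raise ValueError("image sides must be multiples of 8")
    return (LATENT_CHANNELS, height // DOWNSAMPLE, width // DOWNSAMPLE)


def value_count_ratio(height, width):
    """Ratio of RGB pixel values to latent values for one image."""
    channels, h, w = latent_shape(height, width)
    return (3 * height * width) / (channels * h * w)


def scale_after_encode(raw_values):
    """Assumed convention: multiply raw VAE output by the scale factor."""
    return [v * SCALE_FACTOR for v in raw_values]


def unscale_before_decode(scaled_values):
    """Inverse of scale_after_encode, applied before decoding."""
    return [v / SCALE_FACTOR for v in scaled_values]


if __name__ == "__main__":
    print(latent_shape(512, 512))
    print(value_count_ratio(512, 512))
```

For a 512×512 RGB image this gives a (4, 64, 64) latent, so each side shrinks by 8× while the total number of stored values drops by 48×.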
- SLERP: Interpolates along the great-circle arc between the two latents, keeping the magnitude constant when both endpoints have the same norm. Tends to preserve semantic meaning better in high-dimensional spaces.
- LERP: Direct linear interpolation in Euclidean space. Simpler, but intermediate points can have a smaller magnitude than the endpoints, which may produce less semantically coherent results.
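To make the difference concrete, here is a minimal, dependency-free sketch of both methods on flat sequences of floats. It is not the project's implementation: the functions in `src/diffusion_art/core/interpolation.py` presumably operate on tensors, and the fallback to LERP for near-parallel or near-zero vectors is an assumption of this sketch.

```python
import math


def lerp(a, b, t):
    """Linear interpolation between two equal-length vectors."""
    return [(1.0 - t) * x + t * y for x, y in zip(a, b)]


def norm(v):
    """Euclidean length of a vector."""
    return math.sqrt(sum(x * x for x in v))


def slerp(a, b, t, eps=1e-6):
    """Spherical linear interpolation between two equal-length vectors."""
    norm_a, norm_b = norm(a), norm(b)
    if norm_a < eps or norm_b < eps:
        return lerp(a, b, t)  # angle is undefined for a zero vector
    # Angle between the two vectors, clamped against rounding error.
    cos_omega = sum(x * y for x, y in zip(a, b)) / (norm_a * norm_b)
    omega = math.acos(max(-1.0, min(1.0, cos_omega)))
    sin_omega = math.sin(omega)
    if sin_omega < eps:
        return lerp(a, b, t)  # vectors are (anti)parallel; fall back to LERP
    weight_a = math.sin((1.0 - t) * omega) / sin_omega
    weight_b = math.sin(t * omega) / sin_omega
    return [weight_a * x + weight_b * y for x, y in zip(a, b)]


if __name__ == "__main__":
    a, b = [1.0, 0.0], [0.0, 1.0]
    print(norm(slerp(a, b, 0.5)))  # stays on the unit circle
    print(norm(lerp(a, b, 0.5)))   # cuts through the interior
```

For these two orthogonal unit vectors, the SLERP midpoint keeps unit length, while the LERP midpoint shrinks to about 0.707 of it. That drop in magnitude is the usual argument for preferring SLERP when latents have similar norms.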
- Model loading is cached using Streamlit's @st.cache_resource
- Supports MPS (Apple Silicon) and CPU backends
- Includes watchdog for faster file watching and auto-reload
- Batch processing capabilities for generating sequences
- Currently supports SD 1.5 VAE only
- Requires significant VRAM/memory for larger models
- Image preprocessing fixes input size to 512×512
- Support for other diffusion model VAEs
- Advanced path planning (geodesics, loops)
- Real-time audio-driven interpolation
- Fisher information metric integration
- PCA/UMAP visualization of latent spaces
- Fork the repository
- Create a feature branch
- Add tests for new functionality
- Ensure all tests pass:
pytest
- Submit a pull request
This project is licensed under the MIT License - see the LICENSE file for details.
- Stable Diffusion by Stability AI
- Diffusers library by Hugging Face