Generate images from a text prompt using SANA Sprint, a fast 1.6B-parameter diffusion model.
Prompt: a mouse eating cheese
- Python 3.14+
- uv
- Apple Silicon Mac or Linux with CUDA GPU (device auto-detected: CUDA → MPS → CPU)
- Model weights downloaded locally (Efficient-Large-Model/Sana_Sprint_1.6B_1024px_diffusers)
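The CUDA → MPS → CPU fallback mentioned above could be implemented roughly as follows. This is a minimal sketch, not the actual code in main.py; the function names `pick_device` and `detect_device` are hypothetical.

```python
def pick_device(cuda_available: bool, mps_available: bool) -> str:
    """Return the preferred device name in CUDA -> MPS -> CPU order."""
    if cuda_available:
        return "cuda"
    if mps_available:
        return "mps"
    return "cpu"


def detect_device() -> str:
    """Ask PyTorch which backends are usable (requires torch to be installed)."""
    import torch  # imported lazily so pick_device works without torch

    return pick_device(
        torch.cuda.is_available(),
        torch.backends.mps.is_available(),
    )
```

Keeping the priority logic separate from the PyTorch queries makes the ordering easy to check without a GPU.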
On a new machine, download the model weights once:
uv run main.py --download
Then generate images (subsequent runs stay fully offline):
uv run main.py 'a mouse eating cheese' # defaults to square
uv run main.py 'a mouse eating cheese' -s # square (1024x1024)
uv run main.py 'a mouse eating cheese' -p # portrait (768x1024)
uv run main.py 'a mouse eating cheese' -l # landscape (1024x768)
uv run main.py 'a mouse eating cheese' -pl # portrait & landscape
uv run main.py 'a mouse eating cheese' -spl # all three sizes
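The combinable size flags shown above map naturally onto `argparse` boolean options. The sketch below is an assumption about how that could look, not the actual main.py; `parse_args` and `selected_sizes` are hypothetical names, and the `--download` option is omitted for brevity.

```python
import argparse

# Sizes as (width, height), taken from the usage examples above.
SIZES = {
    "square": (1024, 1024),
    "portrait": (768, 1024),
    "landscape": (1024, 768),
}


def parse_args(argv=None):
    parser = argparse.ArgumentParser(description="Generate images from a text prompt")
    parser.add_argument("prompt", help="text prompt to render")
    parser.add_argument("-s", dest="square", action="store_true", help="square output")
    parser.add_argument("-p", dest="portrait", action="store_true", help="portrait output")
    parser.add_argument("-l", dest="landscape", action="store_true", help="landscape output")
    return parser.parse_args(argv)


def selected_sizes(args):
    """Return the requested size names, falling back to square when none is given."""
    names = [name for name in SIZES if getattr(args, name)]
    return names or ["square"]
```

Because each flag is a plain `store_true` option, `argparse` accepts grouped forms such as `-pl` and `-spl` without extra handling.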
Output files are saved to output/ as output_<timestamp>_<size>.png.
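For illustration, the naming scheme could be produced like this. The timestamp format and the use of the size name (rather than its dimensions) for `<size>` are assumptions, and `output_path` is a hypothetical helper, not a function confirmed to exist in main.py.

```python
from datetime import datetime
from pathlib import Path
from typing import Optional


def output_path(size: str, now: Optional[datetime] = None, out_dir: str = "output") -> Path:
    """Build output/output_<timestamp>_<size>.png for one generated image."""
    now = now or datetime.now()
    timestamp = now.strftime("%Y%m%d_%H%M%S")  # assumed format
    return Path(out_dir) / f"output_{timestamp}_{size}.png"
```

A caller would typically run `path.parent.mkdir(exist_ok=True)` before saving the image to the returned path.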
| Package | Purpose |
|---|---|
| torch | Tensor operations and device support (CUDA/MPS/CPU) |
| diffusers | SANA Sprint pipeline |
| transformers | Text encoder |
| accelerate | Optimized model loading |
