This repository contains a diffusion model implemented from scratch and trained on images of Pokemon. The image size (32x32) is small enough that the model can be trained on a laptop.
| Training data | Model output |
|---|---|
- training time on M1 Pro MacBook with MPS enabled ≈ 4h
- training data: images of all Pokemon
- output image size 32x32
Installation:

```shell
pip install torch torchvision numpy tqdm matplotlib
```
Dataset: Run the notebook `dataset.ipynb`. This creates `images.npy` and `labels.npy` in the `data` directory.
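As a sketch of how those two files fit together (the array shapes and dtypes here are assumptions; the actual ones are defined in `dataset.ipynb`), they are plain NumPy arrays that can be written and read back with `np.save`/`np.load`:

```python
import numpy as np

# Stand-ins for the real dataset: one RGB 32x32 image per row,
# with one integer label per image (shapes are illustrative).
images = np.zeros((4, 32, 32, 3), dtype=np.uint8)  # like data/images.npy
labels = np.zeros((4,), dtype=np.int64)            # like data/labels.npy

np.save("images.npy", images)
np.save("labels.npy", labels)

# train.py can then load them back directly:
images = np.load("images.npy")
labels = np.load("labels.npy")
print(images.shape, labels.shape)  # (4, 32, 32, 3) (4,)
```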
Training: To train the model, run:

```shell
python train.py
```
The training time depends on whether you are using a GPU, MPS on macOS, or a CPU. On a MacBook with an M1 Pro chip it takes about 4 hours. The trained model is saved in the `weights` directory.
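The diffusion model is trained with the standard DDPM forward (noising) process, where x_t is sampled in closed form from x_0. A minimal NumPy sketch of that process (the schedule values are illustrative, not the repository's actual hyperparameters):

```python
import numpy as np

T = 1000
betas = np.linspace(1e-4, 0.02, T)    # linear noise schedule from the DDPM paper
alphas_bar = np.cumprod(1.0 - betas)  # cumulative product of (1 - beta_t)

def q_sample(x0, t):
    """Sample x_t ~ q(x_t | x_0): sqrt(abar_t)*x0 + sqrt(1 - abar_t)*eps."""
    eps = np.random.default_rng(0).standard_normal(x0.shape)
    return np.sqrt(alphas_bar[t]) * x0 + np.sqrt(1.0 - alphas_bar[t]) * eps

x0 = np.zeros((32, 32, 3))       # dummy 32x32 image scaled to [-1, 1]
x_noisy = q_sample(x0, t=T - 1)  # at the final step x_t is near-pure noise
```

The network's job during training is to predict the noise `eps` from `x_noisy` and `t`; sampling runs the process in reverse.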
Sampling: Running the notebook `sample.ipynb` creates outputs like this:
Additionally, you can specify the type of Pokemon you want to sample by including a type embedding in the notebook `sample_with_type.ipynb`. For example, the output for a Pokemon of type Dragon and Flying looks like this:
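One way to picture the type conditioning is as a multi-hot vector over the 18 Pokemon types, passed to the model alongside the noisy image. This is only a sketch of that idea (the type ordering and the `type_embedding` helper are assumptions; `sample_with_type.ipynb` defines the real embedding):

```python
import numpy as np

TYPES = ["Normal", "Fire", "Water", "Grass", "Electric", "Ice",
         "Fighting", "Poison", "Ground", "Flying", "Psychic", "Bug",
         "Rock", "Ghost", "Dragon", "Dark", "Steel", "Fairy"]

def type_embedding(*types: str) -> np.ndarray:
    """Hypothetical helper: multi-hot vector with a 1 for each requested type."""
    vec = np.zeros(len(TYPES), dtype=np.float32)
    for t in types:
        vec[TYPES.index(t)] = 1.0
    return vec

# A Dragon/Flying Pokemon, as in the example above:
emb = type_embedding("Dragon", "Flying")
```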
This code is modified from minDiffusion and DeepLearning.AI.
The diffusion model is based on Denoising Diffusion Probabilistic Models (DDPM) and Denoising Diffusion Implicit Models (DDIM).
The training data is taken from this Pokemon image dataset.