Official PyTorch implementation of "SinFusion - Training Diffusion Models on a Single Image or Video".
You can find training images, training video frames, and some sample checkpoints in the Dropbox Folder.
You can download the training images and video frames from the images
folder into a local ./images
folder.
Note that a video is represented as a directory with PNG files in the format <frame number>.png,
and should be placed within the ./images/video
folder.
For example:
some/path/sinfusion/images/video/my_video/
1.png
2.png
3.png
...
To download the checkpoints, download the checkpoints/lightning_logs
directory from the Dropbox to a local ./lightning_logs
directory. The structure should be, for example -
some/path/sinfusion/lightning_logs/
balloons.png/
image_balloons_sample_weights/checkpoints/last.ckpt
tornado/
video_tornado_sample_weights_predictor/checkpoints/last.ckpt
video_tornado_sample_weights_projector/checkpoints/last.ckpt
Following this you can directly sample the trained weights (using the default configuration in config.py
) with the generation command lines below (notice you need to use the relevant run_name
, for example - image_balloons_sample_weights
or video_tornado_sample_weights
).
Training a single-image DDPM:
python main.py --task='image' --image_name='balloons.png' --run_name='train_balloons_0'
Training a single-video DDPM (Predictor and Projector) for generation and extrapolation:
python main.py --task='video' --image_name='tornado' --run_name='tornado_video_model_0'
Training a single-video DDPM (Interpolator and Projector) for temporal upsampling:
python main.py --task='video_interp' --image_name='hula_hoop' --run_name='hula_hoop_interpolate_0'
To choose the image/video, place the image/video in the data folder (see Data)
and replace the image_name
argument with the name of the image/video.
Also, notice the run_name
argument. This is the name of the experiment, and will be used in the generation to
load the model. If this argument isn't specified, a default name will be used.
You may also change any configuration parameter via the command line. For a full list of configuration options, please see config.py
.
To run the training script asynchronously, you can run a line similar to the following -
PYTHONUNBUFFERED=1 nohup python main.py --task='video' --image_name='tornado' --run_name='tornado_video_model_0' > tornado_video_model_0.out &
Most arguments are the same for the generation process as the training process.
Notice that the run_name
argument should be the same as the one used in the training cmdline.
To generate a diverse set of images from a single-image DDPM:
python sample.py --task='image' --image_name='balloons.png' --run_name='train_balloons_0' [--sample_count=8] [--sample_size='(H, W)']
To generate a diverse video from a single-video DDPM:
python sample.py --task='video' --image_name='tornado' --output_video_len=100 --run_name=tornado_video_0
To extrapolate a video from a given frame using a single-video DDPM:
python sample.py --task='video' --image_name=tornado --output_video_len=100 --start_frame_index=33 --run_name=tornado_video_0
To temporally upsample a video by a specific factor (x2, x4, ...) -
python sample.py --task='video_interp' --image_name=hula_hoop --interpolation_rate=4 --run_name=hula_hoop_interpolate_0
By default (if output_dir
isn't specified, the generated images/videos will be saved in the ./outputs
folder.
For evaluating using NNFDIV metric (in Section 7.2), please check out the NNF Diversity Repository.
If you find our project useful for your work please cite:
@inproceedings{nikankin2022sinfusion,
title={SinFusion: Training Diffusion Models on a Single Image or Video},
author={Nikankin, Yaniv and Haim, Niv and Irani, Michal},
booktitle={International Conference on Machine Learning},
organization={PMLR},
year={2023}
}