Add pre-trained model files
samb-t committed Jan 5, 2022
1 parent 8e83604 commit 1f1410f
Showing 1 changed file with 12 additions and 8 deletions.
20 changes: 12 additions & 8 deletions README.md
@@ -26,6 +26,7 @@ This is the repository containing code used for the [Unleashing Transformers pap
- [Setup](#setup)
- [Set up conda environment](#set-up-conda-environment)
- [Dataset Set Up](#dataset-setup)
- [Pre-Trained Models](#pre-trained-models)
- [Commands](#commands)
- [Set up visdom server](#set-up-visdom-server)
- [Train a Vector-Quantized autoencoder on LSUN Churches](#train-a-vector-quantized-autoencoder-on-lsun-churches)
@@ -61,6 +62,9 @@ To configure the default paths for datasets used for training the models in this
| FFHQ | [Official FFHQ](https://github.com/NVlabs/ffhq-dataset) | [Academic Torrents FFHQ](https://academictorrents.com/details/1c1e60f484e911b564de6b4d8b643e19154d5809) |
| LSUN | [Official LSUN](https://github.com/fyu/lsun) | [Academic Torrents LSUN](https://academictorrents.com/details/c53c374bd6de76da7fe76ed5c9e3c7c6c691c489) |

### Pre-Trained Models
Pre-trained models can be found [here](https://drive.google.com/drive/folders/1pjTYcm-2NNzuAiNEO24gSt9dXu_kGQ6b?usp=sharing). To obtain all models, download the logs folder to the root directory of this repo.
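
After downloading, it can help to confirm that the `logs` folder sits in the repository root. The snippet below is a minimal sketch, not part of this repo: `list_run_dirs` is a hypothetical helper that lists the sub-directories of `logs/`, and it makes no assumption about which run names the download contains.

```python
from pathlib import Path


def list_run_dirs(repo_root="."):
    """Return the sorted names of the directories found under <repo_root>/logs."""
    logs = Path(repo_root) / "logs"
    if not logs.is_dir():
        # Nothing downloaded yet, or the folder was placed somewhere else.
        return []
    return sorted(p.name for p in logs.iterdir() if p.is_dir())


if __name__ == "__main__":
    runs = list_run_dirs(".")
    if runs:
        print("Found run directories:", ", ".join(runs))
    else:
        print("No run directories found under ./logs")
```

Run it from the root of the repo; the directory names it prints are the values that can be passed to flags such as `--ae_load_dir` and `--load_dir`.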

## Commands
This section contains details on the basic commands for training and calculating metrics on the Absorbing Diffusion models. All training was completed on a single NVIDIA RTX 2080 Ti and these commands presume the same level of hardware. If your GPU has less VRAM than a 2080 Ti then you may need to train using smaller batch sizes and/or smaller models than the defaults.

@@ -84,7 +88,7 @@ To specify a different port when training any models, use the `--visdom_port` fl

The following command starts the training for a VQGAN on LSUN Churches:
```
python3 train_vqgan.py --dataset churches --log_dir vqae_churches --amp --ema --batch_size 4
python3 train_vqgan.py --dataset churches --log_dir vqgan_churches --amp --ema --batch_size 4
```

As specified with the `--log_dir` flag, results will be saved to the directory `logs/vqgan_churches`. This includes all logs, model checkpoints and saved outputs. The `--amp` flag enables mixed-precision training, necessary for training using a batch size of 4 (the default) on a single 2080 Ti.
@@ -94,7 +98,7 @@ As specified with the `--log_dir` flag, results will be saved to the directory `
After training the VQ model using the previous command, you'll be able to run the following commands to train a discrete diffusion prior on the latent space of the Vector-Quantized model:

```
python3 train_sampler.py --sampler absorbing --dataset churches --log_dir absorbing_churches --ae_load_dir vqae_churches --ae_load_step 2200000 --amp --ema
python3 train_sampler.py --sampler absorbing --dataset churches --log_dir absorbing_churches --ae_load_dir vqgan_churches --ae_load_step 2200000 --amp --ema
```

The sampler needs to load the trained Vector-Quantized autoencoder in order to generate the latents it will use for training (and validation). Latents are cached after the first time this is run to speed up training.
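
The caching behaviour described above follows a common compute-once pattern. The snippet below is a generic sketch of that pattern only, not this repo's implementation: `load_or_compute`, the pickle format and the cache file name are all assumptions made for illustration.

```python
import os
import pickle


def load_or_compute(cache_path, compute_fn):
    """Return the cached result if cache_path exists; otherwise compute, save and return it."""
    if os.path.exists(cache_path):
        # Later runs: skip the expensive step and read the result from disk.
        with open(cache_path, "rb") as f:
            return pickle.load(f)
    # First run: do the expensive work once, then store it for next time.
    data = compute_fn()
    with open(cache_path, "wb") as f:
        pickle.dump(data, f)
    return data


if __name__ == "__main__":
    # Hypothetical stand-in for encoding a dataset with the trained autoencoder.
    def encode_dataset():
        return [[1, 2], [3, 4]]

    latents = load_or_compute("latents_cache.pkl", encode_dataset)
    print(len(latents), "cached latent entries")
```

With a cache like this, deleting the cache file forces the latents to be regenerated, which is needed whenever the autoencoder checkpoint changes.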
@@ -106,13 +110,13 @@ This section contains simple template commands for calculating metrics and other
**Calculate FID**

```
python experiments/calc_FID.py --sampler absorbing --dataset churches --log_dir FID_log --ae_load_dir vqae_churches --ae_load_step 2200000 --load_dir absorbing_churches --load_step 2000000 --ema --n_samples 50000
python experiments/calc_FID.py --sampler absorbing --dataset churches --log_dir FID_log --ae_load_dir vqgan_churches --ae_load_step 2200000 --load_dir absorbing_churches --load_step 2000000 --ema --n_samples 50000 --temp 0.9
```

**Calculate PRDC Scores**

```
python experiments/calc_PRDC.py --sampler absorbing --dataset churches --log_dir PRDC_log --ae_load_dir vqae_churches --ae_load_step 2200000 --load_dir absorbing_churches --load_step 2000000 --ema --n_samples 50000
python experiments/calc_PRDC.py --sampler absorbing --dataset churches --log_dir PRDC_log --ae_load_dir vqgan_churches --ae_load_step 2200000 --load_dir absorbing_churches --load_step 2000000 --ema --n_samples 50000
```


@@ -121,7 +125,7 @@ python experiments/calc_PRDC.py --sampler absorbing --dataset churches --log_dir
The following command fine-tunes a Vector-Quantized autoencoder to compute reconstruction likelihood, and then evaluates the ELBO of the overall model.

```
python experiments/calc_approximate_ELBO.py --sampler absorbing --dataset ffhq --log_dir nll_churches --ae_load_dir vqae_churches --ae_load_step 2200000 --load_dir absorbing_churches --load_step 2000000 --ema --steps_per_eval 5000 --train_steps 10000
python experiments/calc_approximate_ELBO.py --sampler absorbing --dataset ffhq --log_dir nll_churches --ae_load_dir vqgan_churches --ae_load_step 2200000 --load_dir absorbing_churches --load_step 2000000 --ema --steps_per_eval 5000 --train_steps 10000
```

NOTE: the `--steps_per_eval` flag is required for this script, as a validation dataset is used.
@@ -132,15 +136,15 @@ NOTE: the `--steps_per_eval` flag is required for this script, as a validation d
Produces a random batch of samples and finds the nearest neighbour images in the training set based on LPIPS distance.

```
python experiments/calc_nearest_neighbours.py --sampler absorbing --dataset churches --log_dir nearest_neighbours_churches --ae_load_dir vqae_churches --ae_load_step 2200000 --load_dir absorbing_churches --load_step 2000000 --ema
python experiments/calc_nearest_neighbours.py --sampler absorbing --dataset churches --log_dir nearest_neighbours_churches --ae_load_dir vqgan_churches --ae_load_step 2200000 --load_dir absorbing_churches --load_step 2000000 --ema
```

**Generate Higher Resolution Samples**

By applying the absorbing diffusion model to various locations at once and aggregating denoising probabilities, samples larger than those seen during training can be generated (see Figures 4 and 11).

```
python experiments/generate_big_samples.py --sampler absorbing --dataset churches --log_dir big_samples_churches --ae_load_dir vqae_churches --ae_load_step 2200000 --load_dir absorbing_churches --load_step 2000000 --ema --shape 32 16
python experiments/generate_big_samples.py --sampler absorbing --dataset churches --log_dir big_samples_churches --ae_load_dir vqgan_churches --ae_load_step 2200000 --load_dir absorbing_churches --load_step 2000000 --ema --shape 32 16
```

Use the `--shape` flag to specify the dimensions of the latents to generate.
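
To illustrate the idea of aggregating denoising probabilities over several locations, here is a rough pure-Python sketch. It is not the code in `generate_big_samples.py`: the function names, the square window, the stride and the use of a plain average are all assumptions, and the actual script may aggregate differently.

```python
def window_starts(size, window, stride):
    """Start offsets such that windows of length `window` cover range(size)."""
    if window > size or stride > window or stride < 1:
        raise ValueError("need 1 <= stride <= window <= size")
    starts = list(range(0, size - window + 1, stride))
    if starts[-1] != size - window:
        starts.append(size - window)  # extra window so the far edge is covered
    return starts


def aggregate_window_probs(height, width, window, stride, predict_fn):
    """Average per-position code probabilities over overlapping square windows.

    predict_fn(top, left) is a hypothetical callable returning a
    window x window grid (list of rows) of probability vectors for the
    window whose top-left corner is at (top, left).
    """
    sums = [[None] * width for _ in range(height)]
    counts = [[0] * width for _ in range(height)]
    for top in window_starts(height, window, stride):
        for left in window_starts(width, window, stride):
            probs = predict_fn(top, left)
            for i in range(window):
                for j in range(window):
                    r, c = top + i, left + j
                    if sums[r][c] is None:
                        sums[r][c] = list(probs[i][j])
                    else:
                        sums[r][c] = [a + b for a, b in zip(sums[r][c], probs[i][j])]
                    counts[r][c] += 1
    # Divide each accumulated vector by the number of windows that covered it.
    return [[[v / counts[r][c] for v in sums[r][c]] for c in range(width)]
            for r in range(height)]
```

Each latent position ends up with one probability vector that blends the predictions of every window covering it, so a model trained on a fixed latent size can be applied to a larger grid.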
@@ -162,4 +166,4 @@ The following papers were particularly helpful when developing this work:
journal = {arXiv preprint arXiv:2111.12701},
year = {2021}
}
```
```
