
Latent-Space-Anchoring

PyTorch implementation of Domain-Scalable Unpaired Image Translation via Latent Space Anchoring

Siyu Huang* (Harvard), Jie An* (Rochester), Donglai Wei (BC), Zudi Lin (Amazon Alexa), Jiebo Luo (Rochester), Hanspeter Pfister (Harvard)
IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI)

[Paper]

Given an unpaired image-to-image translation (UNIT) model trained on certain domains, it is challenging to incorporate new domains. This work presents a domain-scalable UNIT method, termed latent space anchoring, which anchors images of different domains to the same latent space of a frozen GAN by learning lightweight encoder and regressor models that reconstruct single-domain images. At inference time, the learned encoders and regressors of different domains can be arbitrarily combined to translate images between any two domains without fine-tuning.
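The mix-and-match inference described above can be sketched roughly as follows. The module and class names here are hypothetical, for illustration only; the repository's actual classes and checkpoint formats differ.

```python
import torch

# Illustrative sketch: each domain has a lightweight encoder and regressor,
# both anchored to the same frozen GAN generator shared by all domains.
class Translator(torch.nn.Module):
    def __init__(self, encoder_src, generator, regressor_tgt):
        super().__init__()
        self.encoder_src = encoder_src      # source-domain image -> GAN latent code
        self.generator = generator          # frozen GAN, shared by all domains
        self.regressor_tgt = regressor_tgt  # GAN features -> target-domain image

    @torch.no_grad()
    def forward(self, x_src):
        w = self.encoder_src(x_src)       # anchor the source image in the latent space
        feats = self.generator(w)         # features from the frozen generator
        return self.regressor_tgt(feats)  # decode into the target domain
```

Because the latent space is shared, any source-domain encoder can be paired with any target-domain regressor, which is what makes new domains cheap to add.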

Installation

We recommend installing via Anaconda. All dependencies are provided in env.yaml.

conda env create -f env.yaml
conda activate lsa
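After activating the environment, an optional sanity check confirms that GPU-enabled PyTorch is available, which training and testing rely on:

```python
# Optional sanity check for the activated environment.
import torch

print("PyTorch:", torch.__version__)
print("CUDA available:", torch.cuda.is_available())
```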

Pretrained Models

Please download the pre-trained models from the following links.

Name             Enc/Dec Domain                             Generator Backbone
seg2ffhq.pt      facial segmentation mask (CelebAMask-HQ)   StyleGAN2 trained on FFHQ face
sketch2ffhq.pt   facial sketch (CUFSF)                      StyleGAN2 trained on FFHQ face
cat2dog.pt       cat face (AFHQ-cat)                        StyleGAN2 trained on AFHQ-dog

In addition, we provide the auxiliary pre-trained models used for training our models.

Name                         Description
stylegan2-ffhq-config-f.pt   StyleGAN2 generator trained on FFHQ face.
psp_ffhq_encode.pt           pSp encoder for StyleGAN2-FFHQ inversion.
model_ir_se50.pth            IR-SE50 model used for encoder weight initialization.

Testing

CelebAMask-to-FFHQ


Unpaired image translation from CelebAMask-HQ mask to FFHQ image. Figure: input, regressor output, generator output.

Please download the CelebAMask-HQ dataset and put it in ./data/. Download the pre-trained model seg2ffhq.pt and put it in ./pretrained_models. The folder structure is:

Latent-Space-Anchoring
├── data
│   ├── CelebAMask-HQ
│   │   ├── face_parsing
│   │   │   ├── Data_preprocessing
│   │   │   │   ├── train_img
│   │   │   │   ├── train_label
│   │   │   │   ├── test_img
│   │   │   │   ├── test_label
├── pretrained_models
│   ├── seg2ffhq.pt
├── commands
│   ├── test_seg2ffhq.sh

Run:

bash commands/test_seg2ffhq.sh

Sketch-to-FFHQ


Unpaired image translation from CUFSF facial sketch to FFHQ image. Figure: input, regressor output, generator output.

Please download the CUFSF dataset and put it in ./data/. Manually split the dataset into training and test sets (we use the first 1,000 images for training and the rest for testing). Download the pre-trained model sketch2ffhq.pt and put it in ./pretrained_models. The folder structure is:

Latent-Space-Anchoring
├── data
│   ├── CUFSF
│   │   ├── train
│   │   ├── test
├── pretrained_models
│   ├── sketch2ffhq.pt
├── commands
│   ├── test_sketch2ffhq.sh

Run:

bash commands/test_sketch2ffhq.sh
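The manual train/test split of CUFSF mentioned above can be scripted. The sketch below is illustrative: the raw-image directory name is an assumption, and "first 1,000 images" is taken to mean the first 1,000 filenames in sorted order.

```python
import shutil
from pathlib import Path

def split_dataset(src_dir, train_dir, test_dir, n_train=1000):
    """Copy the first n_train images (sorted by filename) into train_dir,
    and the remaining images into test_dir."""
    src, train, test = Path(src_dir), Path(train_dir), Path(test_dir)
    train.mkdir(parents=True, exist_ok=True)
    test.mkdir(parents=True, exist_ok=True)
    for i, img in enumerate(sorted(src.glob("*"))):
        shutil.copy(img, (train if i < n_train else test) / img.name)

if __name__ == "__main__":
    # "data/CUFSF/raw" is an assumed location for the unsplit CUFSF images.
    split_dataset("data/CUFSF/raw", "data/CUFSF/train", "data/CUFSF/test")
```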

FFHQ-to-CelebAMask


Unpaired image translation from FFHQ image to CelebAMask-HQ mask. Figure: input, regressor output, generator output.

Please download the FFHQ dataset and put it in ./data/. Download the pre-trained models seg2ffhq.pt and psp_ffhq_encode.pt and put them in ./pretrained_models. The folder structure is:

Latent-Space-Anchoring
├── data
│   ├── ffhq
│   │   ├── images1024x1024
├── pretrained_models
│   ├── seg2ffhq.pt
│   ├── psp_ffhq_encode.pt
├── commands
│   ├── test_ffhq2seg.sh

Run:

bash commands/test_ffhq2seg.sh

FFHQ-to-Sketch


Unpaired image translation from FFHQ image to CUFSF facial sketch. Figure: input, regressor output, generator output.

Please download the FFHQ dataset and put it in ./data/. Download the pre-trained models sketch2ffhq.pt and psp_ffhq_encode.pt and put them in ./pretrained_models. The folder structure is:

Latent-Space-Anchoring
├── data
│   ├── ffhq
│   │   ├── images1024x1024
├── pretrained_models
│   ├── sketch2ffhq.pt
│   ├── psp_ffhq_encode.pt
├── commands
│   ├── test_ffhq2sketch.sh

Run:

bash commands/test_ffhq2sketch.sh

CelebAMask-to-Sketch


Unpaired image translation from CelebAMask-HQ mask to CUFSF facial sketch. Figure: input, regressor output, generator output.

Please download the CelebAMask-HQ dataset and put it in ./data/. Download the pre-trained models seg2ffhq.pt and sketch2ffhq.pt and put them in ./pretrained_models. The folder structure is:

Latent-Space-Anchoring
├── data
│   ├── CelebAMask-HQ
│   │   ├── face_parsing
│   │   │   ├── Data_preprocessing
│   │   │   │   ├── train_img
│   │   │   │   ├── train_label
│   │   │   │   ├── test_img
│   │   │   │   ├── test_label
├── pretrained_models
│   ├── seg2ffhq.pt
│   ├── sketch2ffhq.pt
├── commands
│   ├── test_seg2sketch.sh

Run:

bash commands/test_seg2sketch.sh

Cat-to-Dog


Unpaired image translation from AFHQ-cat to AFHQ-dog. Figure: input, regressor output, generator output.

Please download the AFHQ dataset and put it in ./data/. Download the pre-trained model cat2dog.pt and put it in ./pretrained_models. The folder structure is:

Latent-Space-Anchoring
├── data
│   ├── AFHQ
│   │   ├── afhq
│   │   │   ├── train
│   │   │   │   ├── cat
│   │   │   │   ├── dog
│   │   │   │   ├── wild
│   │   │   ├── test
│   │   │   │   ├── cat
│   │   │   │   ├── dog
│   │   │   │   ├── wild
├── pretrained_models
│   ├── cat2dog.pt
├── commands
│   ├── test_cat2dog.sh

Run:

bash commands/test_cat2dog.sh

Training

Training requires a single GPU with at least 16 GB of memory. Training with less GPU memory and a smaller batch size is potentially feasible, although we have not tested it.

CelebAMask-to-FFHQ

Train an encoder and a regressor for the CelebAMask-HQ mask domain, using StyleGAN2-FFHQ as the generator backbone.
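At a high level, the training objective above (reconstructing single-domain images through the frozen generator, with a reconstruction loss) can be sketched as follows. This is a simplified illustration, not the repository's actual training code: the function names are hypothetical and the real objective includes additional loss terms.

```python
import torch

def train_step(encoder, regressor, generator, x, optimizer):
    """One simplified training step: only the encoder and regressor are
    updated; the generator stays frozen (its parameters require no grad)."""
    w = encoder(x)            # anchor the mask/sketch in the GAN latent space
    feats = generator(w)      # features from the frozen generator
    x_rec = regressor(feats)  # regress back to the input domain
    loss = torch.nn.functional.mse_loss(x_rec, x)  # reconstruction loss (simplified)
    optimizer.zero_grad()
    loss.backward()           # gradients flow through the frozen generator
    optimizer.step()
    return loss.item()
```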

Please download the CelebAMask-HQ and FFHQ datasets and put them in ./data/. Download the pre-trained models stylegan2-ffhq-config-f.pt and model_ir_se50.pth and put them in ./pretrained_models. The folder structure is:

Latent-Space-Anchoring
├── data
│   ├── CelebAMask-HQ
│   │   ├── face_parsing
│   │   │   ├── Data_preprocessing
│   │   │   │   ├── train_img
│   │   │   │   ├── train_label
│   │   │   │   ├── test_img
│   │   │   │   ├── test_label
│   ├── ffhq
│   │   ├── images1024x1024
├── pretrained_models
│   ├── stylegan2-ffhq-config-f.pt
│   ├── model_ir_se50.pth
├── commands
│   ├── train_seg2ffhq.sh

Run:

bash commands/train_seg2ffhq.sh

The training results and model checkpoints will be saved in ./logs/seg2ffhq.

Sketch-to-FFHQ

Train an encoder and a regressor for the CUFSF facial sketch domain, using StyleGAN2-FFHQ as the generator backbone.

Please download the CUFSF and FFHQ datasets and put them in ./data/. Download the pre-trained models stylegan2-ffhq-config-f.pt and model_ir_se50.pth and put them in ./pretrained_models. The folder structure is:

Latent-Space-Anchoring
├── data
│   ├── CUFSF
│   │   ├── train
│   │   ├── test
│   ├── ffhq
│   │   ├── images1024x1024
├── pretrained_models
│   ├── stylegan2-ffhq-config-f.pt
│   ├── model_ir_se50.pth
├── commands
│   ├── train_sketch2ffhq.sh

Run:

bash commands/train_sketch2ffhq.sh

The training results and model checkpoints will be saved in ./logs/sketch2ffhq.

Diverse Generations

We support generating diverse outputs from a single input.

Please follow Testing: CelebAMask-to-FFHQ to prepare the dataset and pre-trained models. Run:

bash commands/inference_seg2ffhq.sh

High-Resolution Sampling

We support sampling high-resolution (1024×1024) masks and images from random noise.

Please follow Testing: CelebAMask-to-FFHQ to prepare the dataset and pre-trained models. Run:

bash commands/sampling_seg2ffhq.sh

Acknowledgements

This implementation is built upon StyleGAN2 and pixel2style2pixel.

Citation

@article{huang2023domain,
  author={Huang, Siyu and An, Jie and Wei, Donglai and Lin, Zudi and Luo, Jiebo and Pfister, Hanspeter},
  title={Domain-Scalable Unpaired Image Translation Via Latent Space Anchoring},
  journal={IEEE Transactions on Pattern Analysis and Machine Intelligence}, 
  year={2023},
}

Contact

Siyu Huang (huangsiyutc@gmail.com)
