This repo contains training and synthesis code for domain-expanded models as well as pre-trained weights.
Paper | Project Page | Demo
Domain Expansion of Image Generators
Yotam Nitzan, Michaël Gharbi, Richard Zhang, Taesung Park, Jun-Yan Zhu, Daniel Cohen-Or, Eli Shechtman
Tel-Aviv University, Adobe Research, CMU
Abstract: Can one inject new concepts into an already trained generative model, while respecting its existing structure and knowledge? We propose a new task - domain expansion - to address this. Given a pretrained generator and novel (but related) domains, we expand the generator to jointly model all domains, old and new, harmoniously. First, we note the generator contains a meaningful, pretrained latent space. Is it possible to minimally perturb this hard-earned representation, while maximally representing the new domains? Interestingly, we find that the latent space offers unused, dormant directions, which do not affect the output. This provides an opportunity: By repurposing these directions, we can represent new domains without perturbing the original representation. In fact, we find that pretrained generators have the capacity to add several - even hundreds - of new domains! Using our expansion method, one expanded model can supersede numerous domain-specific models, without expanding the model size. Additionally, a single expanded generator natively supports smooth transitions between domains, as well as composition of domains.
The code was tested with Python 3.8.13, PyTorch 1.7.1, and CUDA 11.3 on Ubuntu 20.04.
This repository is built on top of stylegan2-ada-pytorch. You can follow their setup instructions and install our additional dependencies with:
pip install git+https://github.com/openai/CLIP.git
pip install wandb lpips
Alternatively, we provide an environment.yml file that can be used to create a Conda environment from scratch:
conda env create -f environment.yml
You can generate aligned images - i.e., the same latent code projected to various subspaces - using generate_aligned.py.
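For example, an invocation might look like the following sketch; the flag names are assumptions modeled on stylegan2-ada-pytorch's generate.py, so check `python generate_aligned.py --help` for the actual interface:

```
python generate_aligned.py --network=expanded_ffhq.pkl --seeds=0-3 --outdir=out
```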
MyStyle operates slightly differently: since the effect of training is local, a latent code is often meaningless in other subspaces. To generate from a MyStyle-repurposed subspace, use generate_mystyle.py.
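Again as a rough sketch (the flags here are assumptions as well; consult the script's --help):

```
python generate_mystyle.py --network=expanded_mystyle.pkl --outdir=out
```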
For convenience, we provide a couple of pretrained NADA-expanded models:
| Parent Model | Number of new domains | Model |
|---|---|---|
| StyleGAN2 FFHQ | 100 | Model |
| StyleGAN2-ADA AFHQ Dog | 50 | Model |
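Since the checkpoints follow the stylegan2-ada-pytorch pickle format this repo builds on, an expanded model can be loaded and sampled like any StyleGAN2 network. A minimal sketch (the path is a placeholder for one of the checkpoints above):

```python
import torch
import dnnlib
import legacy  # both modules ship with stylegan2-ada-pytorch

device = torch.device('cuda')
# Placeholder path; substitute a downloaded expanded checkpoint.
with dnnlib.util.open_url('expanded_ffhq.pkl') as f:
    G = legacy.load_network_pkl(f)['G_ema'].to(device)

z = torch.randn([1, G.z_dim], device=device)
label = torch.zeros([1, G.c_dim], device=device)  # unconditional models
img = G(z, label, noise_mode='const')  # [1, 3, H, W], values in [-1, 1]
```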
The training interface is similar to that of stylegan2-ada-pytorch, with a few additional arguments. An example training command is given here. The parameter --expansion_cfg_file points to a JSON configuration file specifying the domain expansions to perform. Two examples, applying NADA and MyStyle, are provided in the config_examples directory.
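For illustration, a hypothetical training command; every flag except --expansion_cfg_file follows stylegan2-ada-pytorch's train.py, and the dataset, checkpoint, and config paths are placeholders:

```
python train.py --outdir=training-runs --gpus=1 \
    --data=datasets/ffhq.zip --resume=ffhq.pkl \
    --expansion_cfg_file=config_examples/nada.json
```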
Here's NADA's example:
```json
{
    "tasks": [
        {"type": "NADA", "dimension": 510, "args": {"source_text": "photo", "target_text": "sketch"}},
        {"type": "NADA", "args": {"source_text": "person", "target_text": "tolkein elf"}}
    ],
    "tasks_losses": {
        "NADA": {
            "clip_models": ["ViT-B/32", "ViT-B/16"],
            "clip_model_weights": [1.0, 1.0]
        }
    }
}
```
The first key, "tasks", defines the training tasks to perform on specific latent directions. If a dimension is not specified, we use the "most dormant" direction that has not already been assigned. In the example above, the elf task would repurpose dimension 511. Different tasks may use the same adaptation method, so shared arguments are specified separately under "tasks_losses".
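To make the assignment rule concrete, here is a minimal, self-contained sketch of that logic; the dormancy ranking and task format are illustrative, not this repo's internals:

```python
def assign_dimensions(tasks, dormancy_ranking):
    """Assign each task a latent direction.

    tasks: list of dicts, optionally containing an explicit "dimension".
    dormancy_ranking: dimension indices sorted from most to least dormant.
    """
    used = {t["dimension"] for t in tasks if "dimension" in t}
    # Most-dormant directions that are still free, in ranked order.
    free = (d for d in dormancy_ranking if d not in used)
    for t in tasks:
        if "dimension" not in t:
            t["dimension"] = next(free)
    return tasks

# With dimension 510 taken by the sketch task and 511 ranked most dormant,
# the elf task receives dimension 511, matching the example above.
tasks = [
    {"type": "NADA", "dimension": 510, "args": {"target_text": "sketch"}},
    {"type": "NADA", "args": {"target_text": "tolkein elf"}},
]
print(assign_dimensions(tasks, dormancy_ranking=[511, 510, 509]))
```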
Since adaptation methods might require a different number of steps, we recommend expanding the domain gradually. For example, first repurpose several subspaces with MyStyle. When the results are satisfactory, repurpose more subspaces with NADA.
To extend this repository and support additional domain adaptation tasks, you only need to define a new class inheriting from BaseTask. Please consider sending us a pull request if you do!
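For instance, a new task could look roughly like the sketch below. The import path, constructor arguments, and loss hook shown here are guesses for illustration; consult the actual BaseTask for the true interface:

```python
from training.tasks import BaseTask  # hypothetical import path


class IdentityTask(BaseTask):
    """Illustrative task that keeps its subspace close to the parent model."""

    def __init__(self, dimension, args):
        super().__init__(dimension, args)
        self.weight = args.get("weight", 1.0)

    def loss(self, parent_images, expanded_images):
        # Hypothetical hook: penalize deviation from the frozen parent
        # generator on this task's repurposed latent direction.
        return self.weight * (expanded_images - parent_images).square().mean()
```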
@inproceedings{nitzan2023domain,
  title={Domain Expansion of Image Generators},
  author={Nitzan, Yotam and Gharbi, Micha{\"e}l and Zhang, Richard and Park, Taesung and Zhu, Jun-Yan and Cohen-Or, Daniel and Shechtman, Eli},
  booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
  year={2023}
}
Our code is built on top of stylegan2-ada-pytorch and borrows from StyleGAN-NADA and MyStyle.
Thanks to alvanlii for creating the HuggingFace Demo!