This is the PyTorch implementation of Retrieve in Style: Unsupervised Facial Feature Transfer and Retrieval.
Abstract:
We present Retrieve in Style (RIS), an unsupervised framework for fine-grained facial feature transfer and retrieval on real images. Recent work shows that it is possible to learn a catalog that allows local semantic transfers of facial features on generated images by capitalizing on the disentanglement property of the StyleGAN latent space. RIS improves existing art on:
- feature disentanglement, allowing for challenging transfers (i.e., hair and pose) that were not shown possible in SoTA methods.
- eliminating the need for per-image hyperparameter tuning and for computing a catalog over a large batch of images.
- enabling face retrieval using the proposed facial features (e.g., eyes); to the best of our knowledge, this is the first work to retrieve face images at the fine-grained level.
- robustness and natural application to real images.

Our qualitative and quantitative analyses show RIS achieves both high-fidelity feature transfers and accurate fine-grained retrievals on real images. We discuss the responsible application of RIS.
Our codebase is based on stylegan2-pytorch by rosinality.
Install the dependencies, replacing `<CUDA_VERSION>` with the CUDA version on your system:

```bash
conda install --yes -c pytorch pytorch=1.7.1 torchvision cudatoolkit=<CUDA_VERSION>
pip install tqdm gdown scikit-learn scipy lpips dlib opencv-python
```
Everything needed to get started is in the Colab notebook.
If you use this code or ideas from our paper, please cite our paper:
```bibtex
@InProceedings{Chong_2021_ICCV,
    author    = {Chong, Min Jin and Chu, Wen-Sheng and Kumar, Abhishek and Forsyth, David},
    title     = {Retrieve in Style: Unsupervised Facial Feature Transfer and Retrieval},
    booktitle = {Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV)},
    month     = {October},
    year      = {2021},
    pages     = {3887-3896}
}
```
This code borrows from StyleGAN2 by rosinality, Editing in Style, StyleCLIP, and PTI. The encoder used is borrowed directly from encoder4editing.