A list of things I've used myself and found to be robust and useful.

kingsj0405/awesome-face-related-list

Awesome Face Related List

This repository collects papers and implementations on face modeling and its applications. It is a list of things I've used myself and found to be robust and useful. Some general computer vision basics are also included.

Dataset

  • [VoxCeleb2] Chung, J. S., Nagrani, A., & Zisserman, A. (2018). Voxceleb2: Deep speaker recognition. arXiv preprint arXiv:1806.05622. Homepage. pdf. kingsj0405/video-preprocessing.
  • [WFLW] Wu, W., Qian, C., Yang, S., Wang, Q., Cai, Y., & Zhou, Q. (2018). Look at boundary: A boundary-aware face alignment algorithm. CVPR. Homepage.
  • [LRW] Chung, J. S., & Zisserman, A. (2017). Lip reading in the wild. ACCV. Homepage. pdf.
  • [CelebVHQ] Zhu, H., Wu, W., Zhu, W., Jiang, L., Tang, S., Zhang, L., ... & Loy, C. C. (2022, November). CelebV-HQ: A large-scale video facial attributes dataset. ECCV. Project Page. Code. arXiv.
  • [VFHQ] Xie, L., Wang, X., Zhang, H., Dong, C., & Shan, Y. (2022). VFHQ: A High-Quality Dataset and Benchmark for Video Face Super-Resolution. CVPR Workshop. pdf.

Face modeling

3D Morphable Face Model (3DMM)

  • [survey] Egger, B., Smith, W. A., Tewari, A., Wuhrer, S., Zollhoefer, M., Beeler, T., ... & Vetter, T. (2020). 3d morphable face models—past, present, and future. ACM Transactions on Graphics (TOG), 39(5), 1-38. arXiv.
  • [3DDFA_V2] Guo, J., Zhu, X., Yang, Y., Yang, F., Lei, Z., & Li, S. Z. (2020, November). Towards fast, accurate and stable 3d dense face alignment. ECCV. code. arXiv.

Face Perception

Face Detection & Tracking

  • [dlib] http://dlib.net/face_detector.py.html. code.
  • [ArcFace] Deng, J., Guo, J., Xue, N., & Zafeiriou, S. (2019). Arcface: Additive angular margin loss for deep face recognition. CVPR. pdf. code.
  • [RetinaFace] Deng, J., Guo, J., Ververas, E., Kotsia, I., & Zafeiriou, S. (2020). Retinaface: Single-shot multi-level face localisation in the wild. CVPR. arXiv. code. code2.

Facial Landmark

Face Tracking

  • [SORT] Bewley, A., Ge, Z., Ott, L., Ramos, F., & Upcroft, B. (2016, September). Simple online and realtime tracking. In 2016 IEEE international conference on image processing (ICIP) (pp. 3464-3468). IEEE. code. arXiv.

Face Manipulation

Face Reenactment with driving video

  • [Cross-Identity] Jeon, S., Nam, S., Oh, S. W., & Kim, S. J. (2020). Cross-identity motion transfer for arbitrary objects through pose-attentive video reassembling. ECCV. paper.
  • [LIA] Wang, Y., Yang, D., Bremond, F., & Dantcheva, A. (2022). Latent image animator: Learning to animate images via latent space navigation. ICLR. arXiv. code. project page.

Lip-sync with speech

  • [Wav2Lip] Prajwal, K. R., Mukhopadhyay, R., Namboodiri, V. P., & Jawahar, C. V. (2020, October). A lip sync expert is all you need for speech to lip generation in the wild. ACM Multimedia. arXiv. code. project page.
  • [PC-AVS] Zhou, H., Sun, Y., Wu, W., Loy, C. C., Wang, X., & Liu, Z. (2021). Pose-controllable talking face generation by implicitly modularized audio-visual representation. CVPR. arXiv. code. project page.
  • [StyleSync] Guan, J., Zhang, Z., Zhou, H., Hu, T., Wang, K., He, D., ... & Wang, J. (2023). StyleSync: High-Fidelity Generalized and Personalized Lip Sync in Style-based Generator. CVPR. arXiv. code.

Face Reenactment with driving audio

  • [MakeItTalk] Zhou, Y., Han, X., Shechtman, E., Echevarria, J., Kalogerakis, E., & Li, D. (2020). MakeItTalk: speaker-aware talking-head animation. ACM Transactions On Graphics (TOG). arXiv. code.
  • [AD-NeRF] Guo, Y., Chen, K., Liang, S., Liu, Y. J., Bao, H., & Zhang, J. (2021). Ad-nerf: Audio driven neural radiance fields for talking head synthesis. ICCV. arXiv. code. project page.
  • [SSP-NeRF] Liu, X., Xu, Y., Wu, Q., Zhou, H., Wu, W., & Zhou, B. (2022, October). Semantic-aware implicit neural audio-driven video portrait generation. ECCV. arXiv. code. project page.
  • [GeneFace] Ye, Z., Jiang, Z., Ren, Y., Liu, J., He, J., & Zhao, Z. (2023). Geneface: Generalized and high-fidelity audio-driven 3d talking face synthesis. ICLR. arXiv. code. project page.
  • [SadTalker] Zhang, W., Cun, X., Wang, X., Zhang, Y., Shen, X., Guo, Y., ... & Wang, F. (2023). SadTalker: Learning Realistic 3D Motion Coefficients for Stylized Audio-Driven Single Image Talking Face Animation. CVPR. arXiv. code. project page.
  • [IP_LAP] Zhong, W., Fang, C., Cai, Y., Wei, P., Zhao, G., Lin, L., & Li, G. (2023). Identity-Preserving Talking Face Generation with Landmark and Appearance Priors. CVPR. arXiv. code.

Face IQA

  • [HyperIQA] Su, S., Yan, Q., Zhu, Y., Zhang, C., Ge, X., Sun, J., & Zhang, Y. (2020). Blindly assess image quality in the wild guided by a self-adaptive hyper network. CVPR. code. pdf.

Non-human Face

Face Manipulation

  • [Pareidolia Face Reenactment] Song, L., Wu, W., Fu, C., Qian, C., Loy, C. C., & He, R. (2021). Pareidolia Face Reenactment. CVPR. paper. code. project page.

Perception

  • [DIFE] Yang, S., Jeon, S., Nam, S., & Kim, S. J. (2022). Dense Interspecies Face Embedding. NeurIPS. paper. code. project page.

Generative Model

Variational Inference (e.g. VAE, Flow-based Model)

  • [survey] Kingma, D. P., & Welling, M. (2019). An introduction to variational autoencoders. Foundations and Trends® in Machine Learning, 12(4), 307-392. arXiv.
  • [thesis] Kingma, D. P. (2017). Variational inference & deep learning: A new synthesis. pdf.
  • [VAE] Kingma, D. P., & Welling, M. (2014). Auto-encoding variational bayes. ICLR. arXiv.
  • [IAF] Kingma, D. P., Salimans, T., Jozefowicz, R., Chen, X., Sutskever, I., & Welling, M. (2016). Improved variational inference with inverse autoregressive flow. NeurIPS. arXiv.
  • [Glow] Kingma, D. P., & Dhariwal, P. (2018). Glow: Generative flow with invertible 1x1 convolutions. NeurIPS. arXiv. code.
  • [Flow++] Ho, J., Chen, X., Srinivas, A., Duan, Y., & Abbeel, P. (2019, May). Flow++: Improving flow-based generative models with variational dequantization and architecture design. In International Conference on Machine Learning (pp. 2722-2730). PMLR. code. arXiv.

Diffusion Model

  • [DDPM] Ho, J., Jain, A., & Abbeel, P. (2020). Denoising diffusion probabilistic models. NeurIPS. arXiv. code.
  • [ImprovedDDPM] Nichol, A. Q., & Dhariwal, P. (2021, July). Improved denoising diffusion probabilistic models. In International Conference on Machine Learning (pp. 8162-8171). PMLR. code. arXiv.
  • [GuidedDiffusion] Dhariwal, P., & Nichol, A. (2021). Diffusion models beat GANs on image synthesis. NeurIPS. arXiv. code.

Generative Adversarial Network (GAN)

  • [StyleGAN] Karras, T., Laine, S., & Aila, T. (2019). A style-based generator architecture for generative adversarial networks. CVPR. code. arXiv.
  • [StyleGAN2] Karras, T., Laine, S., Aittala, M., Hellsten, J., Lehtinen, J., & Aila, T. (2020). Analyzing and improving the image quality of StyleGAN. CVPR. code. arXiv.
  • [StyleGAN2-ADA] Karras, T., Aittala, M., Hellsten, J., Laine, S., Lehtinen, J., & Aila, T. (2020). Training generative adversarial networks with limited data. NeurIPS. code. arXiv.
  • [StyleGAN3] Karras, T., Aittala, M., Laine, S., Härkönen, E., Hellsten, J., Lehtinen, J., & Aila, T. (2021). Alias-free generative adversarial networks. NeurIPS. code. arXiv.
  • [MoStGAN-V] Shen, X., Li, X., & Elhoseiny, M. (2023). MoStGAN-V: Video Generation with Temporal Motion Styles. CVPR. code. project page. arXiv.
