On Learning Associations of Faces and Voices

This repository contains a single-file reference implementation of the following publication:

On Learning Associations of Faces and Voices
Changil Kim, Hijung Valentina Shin, Tae-Hyun Oh, Alexandre Kaspar, Mohamed Elgharib, Wojciech Matusik
ACCV 2018
Paper | ArXiv | Project Website

Please cite the above paper if you use this software. See the project website for more information about the paper.

Requirements

The software runs with Python 2 or 3 and TensorFlow r1.4 or later. It also requires the NumPy, SciPy, and scikit-image packages.
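As a quick way to confirm that the packages listed above are installed, the following sketch reports which ones cannot be found. It is not part of facevoice.py; the helper name `missing_packages` is hypothetical, and the sketch assumes Python 3 and the usual import names (`tensorflow`, `numpy`, `scipy`, `skimage`). It does not check package versions.

```python
import importlib.util

# Display name -> import name for the packages listed in the requirements.
REQUIRED = {
    "TensorFlow": "tensorflow",
    "NumPy": "numpy",
    "SciPy": "scipy",
    "scikit-image": "skimage",
}


def missing_packages(modules=REQUIRED):
    """Return the display names of packages that cannot be located."""
    return [
        name
        for name, module in modules.items()
        if importlib.util.find_spec(module) is None
    ]


if __name__ == "__main__":
    missing = missing_packages()
    if missing:
        print("Missing packages: " + ", ".join(missing))
    else:
        print("All required packages were found.")
```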

Pre-trained models

Two pre-trained models are provided as TensorFlow checkpoints.

Usage

Download the pre-trained models and unzip them. Prepare the input facial images and voice files: facial images must be JPEG or PNG color images, and audio files must be WAV files sampled at 22,050 Hz.
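The input requirements above can be checked before running the script. The following is a minimal sketch, not part of facevoice.py: the helper names are hypothetical, it assumes Python 3, and it relies on the standard-library `wave` module, which reads uncompressed PCM WAV files. The image check only looks at the file extension and does not verify that the image is a color image.

```python
import os
import wave

# Values taken from the usage text: WAV audio sampled at 22,050 Hz,
# and JPEG or PNG images.
EXPECTED_RATE_HZ = 22050
IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png"}


def wav_sample_rate(path):
    """Return the sampling rate of a WAV file in Hz."""
    with wave.open(path, "rb") as wav_file:
        return wav_file.getframerate()


def is_supported_voice(path):
    """True if the WAV file is sampled at 22,050 Hz."""
    return wav_sample_rate(path) == EXPECTED_RATE_HZ


def is_supported_face(path):
    """True if the file extension suggests a JPEG or PNG image."""
    extension = os.path.splitext(path)[1].lower()
    return extension in IMAGE_EXTENSIONS
```

Audio recorded at a different rate would need to be resampled to 22,050 Hz with a tool of your choice before use.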

Depending on the reference modality, run one of the following two commands. Make sure to specify the checkpoint that matches the reference modality.

Given a voice, find the matching face from two candidates (v2f):

facevoice.py v2f -c CHECKPOINTDIR --voice VOICEFILE --face0 FACEFILE --face1 FACEFILE

Given a face, find the matching voice from two candidates (f2v):

facevoice.py f2v -c CHECKPOINTDIR --face FACEFILE --voice0 VOICEFILE --voice1 VOICEFILE
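When running many comparisons, the two commands above can be assembled programmatically. The sketch below only builds the argument list using the flags shown above; the function name `build_command` and its parameters are hypothetical, and launching the script through the current Python interpreter (`sys.executable`) is an assumption rather than something the commands above require.

```python
import sys


def build_command(mode, checkpoint_dir, reference, candidate0, candidate1,
                  script="facevoice.py"):
    """Build the argument list for a v2f or f2v run.

    reference is the voice file for v2f and the face file for f2v;
    candidate0 and candidate1 are the two files to choose between.
    """
    if mode == "v2f":
        flags = ["--voice", reference,
                 "--face0", candidate0, "--face1", candidate1]
    elif mode == "f2v":
        flags = ["--face", reference,
                 "--voice0", candidate0, "--voice1", candidate1]
    else:
        raise ValueError("mode must be 'v2f' or 'f2v'")
    # Assumption: the script is run with the current Python interpreter.
    return [sys.executable, script, mode, "-c", checkpoint_dir] + flags
```

The resulting list can be passed to `subprocess.run(command, check=True)`. As with the commands above, the checkpoint directory must match the chosen mode.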
