nghiakvnvsd/wav2lip384
Wav2Lip - a modified wav2lip 384 version

About this Repo

This repo is a direct modification of the original Wav2Lip repo, and it requires a large amount of high-resolution data to train, so training it well from scratch is difficult. However, I currently have a strategy for training on a target person that requires only about 40 hours of data. Please contact me if your company needs a full-time data scientist to deploy this model and wants to research better models. This code was committed only as part of my job search, and I will remove it when I get a job. If your company wants to offer a full-time position for a lip-sync project, please get in touch: I will show my results, and once I receive an offer I will deploy my full strategy for your company. I also plan to deploy other models, such as GeneFace and MemFace, in the future, but I lack the resources to research them further.

Lip-syncing videos using the pre-trained models (Inference)

You can lip-sync any video to any audio:

python inference.py --checkpoint_path <ckpt> --face <video.mp4> --audio <an-audio-source> 

The result is saved (by default) in results/result_voice.mp4. The output path can be specified as an argument, as can several other options. The audio source can be any file supported by FFmpeg that contains audio data: *.wav, *.mp3, or even a video file, from which the code will automatically extract the audio.
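Since the audio source may be a video file, the script has to pull the audio track out first. A minimal sketch of how such an extraction step could look, using Python's subprocess module and the ffmpeg CLI; the helper name build_extract_cmd and the temp/extracted.wav output path are assumptions for illustration, not names from this repo.

```python
import subprocess

def build_extract_cmd(src, dst="temp/extracted.wav", sample_rate=16000):
    """Build an ffmpeg command that strips the audio track from `src`
    into a mono WAV (hypothetical helper, not part of this repo)."""
    return [
        "ffmpeg", "-y",           # overwrite the output without asking
        "-i", src,                # input video or audio file
        "-vn",                    # drop any video stream
        "-ac", "1",               # downmix to mono
        "-ar", str(sample_rate),  # resample, e.g. for a mel pipeline
        dst,
    ]

# subprocess.run(build_extract_cmd("input.mp4"), check=True)
```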

Train!

There are two major steps: (i) Train the expert lip-sync discriminator, (ii) Train the Wav2Lip model(s).

Training the expert discriminator

You can use your own data (at a resolution of 384x384):

python parallel_syncnet_tanh.py
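Because the training scripts expect 384x384 face crops, it can save time to check frame sizes before launching a run. A small pre-flight sketch; the helper validate_frames is hypothetical and not part of this repo.

```python
def validate_frames(shapes, expected=(384, 384)):
    """Return indices of frames whose (height, width) differs from the
    expected crop size (hypothetical pre-flight check, not in this repo)."""
    return [i for i, hw in enumerate(shapes) if tuple(hw) != expected]

# Example: the 256x256 frame at index 1 would be flagged for re-cropping.
bad = validate_frames([(384, 384), (256, 256), (384, 384)])
# bad == [1]
```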

Training the Wav2Lip models

You can either train the model without the additional visual quality discriminator (< 1 day of training) or with the discriminator (~2 days). For the former, run:

python parallel_wav2lip_margin.py
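In the original Wav2Lip formulation, the expert discriminator scores lip sync by comparing an audio embedding and a face embedding with cosine similarity, trained under a binary cross-entropy loss (1 for matching pairs, 0 for shuffled ones); the generator is then penalized by the same sync score. A hedged pure-Python re-implementation of that idea, for illustration only; the clamping and function names are assumptions, not code from this repo.

```python
import math

def cosine_sim(a, v, eps=1e-8):
    """Cosine similarity between an audio embedding and a video embedding."""
    dot = sum(x * y for x, y in zip(a, v))
    na = math.sqrt(sum(x * x for x in a))
    nv = math.sqrt(sum(x * x for x in v))
    return dot / (na * nv + eps)

def sync_bce_loss(a, v, is_synced, eps=1e-8):
    """Binary cross-entropy on the cosine similarity, SyncNet-expert style.
    `is_synced` is 1.0 for a matching audio/face pair, 0.0 for a shuffled one.
    Hypothetical re-implementation for illustration only."""
    p = min(max(cosine_sim(a, v), eps), 1.0 - eps)  # clamp into (0, 1) for BCE
    return -(is_synced * math.log(p) + (1.0 - is_synced) * math.log(1.0 - p))
```

Identical embeddings with label 1.0 give a near-zero loss, while a high-similarity pair labeled 0.0 (a shuffled negative) is penalized heavily, which is what pushes the generator toward synced mouth shapes.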
