# Wav2Lip Lip-Sync Video Assignment

Connect to Google Drive, where the models and input files are stored.

In [1]:
from google.colab import drive
drive.mount('/content/gdrive')

Drive already mounted at /content/gdrive; to attempt to forcibly remount, call drive.mount("/content/gdrive", force_remount=True).


Clone the Wav2Lip repository.

In [3]:
!git clone https://github.com/Rudrabha/Wav2Lip.git

Cloning into 'Wav2Lip'...
remote: Enumerating objects: 403, done.
remote: Counting objects: 100% (18/18), done.
remote: Compressing objects: 100% (17/17), done.
remote: Total 403 (delta 14), reused 1 (delta 1), pack-reused 385 (from 4)
Receiving objects: 100% (403/403), 540.95 KiB | 12.29 MiB/s, done.
Resolving deltas: 100% (224/224), done.


In [4]:
!ls /content/gdrive/MyDrive/Wav2Lip

audio.mp3      audiotry2.mp3  face.png	  video.mp4	   wav2lip.pth
audiotry1.mp3  face.gif       video2.mp4  wav2lip_gan.pth


## Copy the Wav2Lip models into the `checkpoints` folder

In [5]:
!cp -ri "/content/gdrive/MyDrive/Wav2Lip/wav2lip.pth" "/content/gdrive/MyDrive/Wav2Lip/wav2lip_gan.pth" /content/Wav2Lip/checkpoints/
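
Before moving on, it can help to confirm that both checkpoint files actually landed in `checkpoints/`. The following is a minimal sketch that assumes the paths used in the cell above. `missing_files` is a hypothetical helper written for this notebook, not part of Wav2Lip.

```python
from pathlib import Path


def missing_files(directory, names):
    """Return the names from `names` that do not exist as files in `directory`."""
    base = Path(directory)
    return [name for name in names if not (base / name).is_file()]


# Directory and file names are taken from the copy command above.
missing = missing_files("/content/Wav2Lip/checkpoints", ["wav2lip.pth", "wav2lip_gan.pth"])
print("Missing checkpoints:", missing or "none")
```

An empty result means both models are in place. Any name listed still needs to be copied.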

## Install the prerequisites

In [6]:
!pip uninstall tensorflow tensorflow-gpu

Found existing installation: tensorflow 2.18.0
Uninstalling tensorflow-2.18.0:
  Would remove:
    /usr/local/bin/import_pb_to_tensorboard
    /usr/local/bin/saved_model_cli
    /usr/local/bin/tensorboard
    /usr/local/bin/tf_upgrade_v2
    /usr/local/bin/tflite_convert
    /usr/local/bin/toco
    /usr/local/bin/toco_from_protos
    /usr/local/lib/python3.11/dist-packages/tensorflow-2.18.0.dist-info/*
    /usr/local/lib/python3.11/dist-packages/tensorflow/*
Proceed (Y/n)? y
  Successfully uninstalled tensorflow-2.18.0

In [7]:
# Optional: pip replaces the preinstalled librosa anyway when it installs the pinned version
!pip uninstall --yes librosa

Found existing installation: librosa 0.10.2.post1
Uninstalling librosa-0.10.2.post1:
  Successfully uninstalled librosa-0.10.2.post1


## Replace the contents of `Wav2Lip/requirements.txt` with this:
```
librosa==0.9.1
numpy
torch
torchvision
tqdm
numba
```
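
Rather than editing the file by hand, it could also be rewritten from a notebook cell. This is a sketch only: `write_requirements` is a hypothetical helper, and the target path assumes the repository was cloned into `/content/Wav2Lip` as above.

```python
from pathlib import Path

# Package list copied from the requirements shown above.
REQUIREMENTS = ["librosa==0.9.1", "numpy", "torch", "torchvision", "tqdm", "numba"]


def write_requirements(path, packages=REQUIREMENTS):
    """Overwrite `path` with one package specifier per line."""
    Path(path).write_text("\n".join(packages) + "\n")


# In the notebook, the target would be the cloned repository's file:
# write_requirements("/content/Wav2Lip/requirements.txt")
```

The call is left commented out so the cell does nothing until the path is confirmed.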

In [None]:
!cd Wav2Lip && pip install -r requirements.txt

## Get the pre-trained face detection model

In [9]:
!wget "https://www.adrianbulat.com/downloads/python-fan/s3fd-619a316812.pth" -O "Wav2Lip/face_detection/detection/sfd/s3fd.pth"

--2025-02-26 09:08:34--  https://www.adrianbulat.com/downloads/python-fan/s3fd-619a316812.pth
Resolving www.adrianbulat.com (www.adrianbulat.com)... 45.136.29.207
Connecting to www.adrianbulat.com (www.adrianbulat.com)|45.136.29.207|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 89843225 (86M) [application/octet-stream]
Saving to: ‘Wav2Lip/face_detection/detection/sfd/s3fd.pth’


2025-02-26 09:08:41 (15.9 MB/s) - ‘Wav2Lip/face_detection/detection/sfd/s3fd.pth’ saved [89843225/89843225]
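
A truncated download only shows up later as a confusing error when the weights are loaded, so a quick size check can be useful. The sketch below compares the file on disk with the byte count that `wget` reported in the log above. `has_expected_size` is a hypothetical helper, and the check does not verify the file contents.

```python
import os

# Byte count reported by wget in the log above.
EXPECTED_BYTES = 89843225


def has_expected_size(path, expected_bytes=EXPECTED_BYTES):
    """Return True if `path` exists and is exactly `expected_bytes` long."""
    return os.path.isfile(path) and os.path.getsize(path) == expected_bytes


print(has_expected_size("Wav2Lip/face_detection/detection/sfd/s3fd.pth"))
```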



## Generate the audio from a script

The following code generates an audio file from the given script using the Google Text-to-Speech (gTTS) library and saves it in the `sample_data` folder.

In [None]:
!pip install -q gtts

from gtts import gTTS
import os
import IPython.display as ipd

script = """
Namaste Mathangi! My name is Anika, and I'm here to guide you through managing your credit card dues.
Mathangi, as of today, your credit card bill shows an amount due of INR 5,053 which needs to be paid by 31st December 2024
Missing this payment could lead to two significant consequences:
First, a late fee will be added to your outstanding balance.
Second, your credit score will be negatively impacted, which may affect your future borrowing ability.
Make your payment by clicking the link here... Pay through UPI or bank transfer.
Thank you!
"""

audio_output_path = "sample_data/audio2.mp3"  # We can change the path or the audio filename here.

print("Generating TTS audio with Indian accent...")
tts = gTTS(text=script, lang='en', tld='co.in', slow=False)

tts.save(audio_output_path)
print(f"Audio saved to: {audio_output_path}")

print("Playing the generated audio:")
ipd.display(ipd.Audio(audio_output_path))

In [11]:
!find /content -name "audio2.mp3" # Confirm where the audio file was saved

/content/sample_data/audio2.mp3


## We need a video as an input; below we copy it from Google Drive to the `sample_data` folder

In [13]:
!cp "/content/gdrive/My Drive/Wav2Lip/video.mp4" "/content/gdrive/My Drive/Wav2Lip/audio.mp3" sample_data/
!ls sample_data/

anscombe.json  california_housing_test.csv   mnist_train_small.csv
audio2.mp3     california_housing_train.csv  README.md
audio.mp3      mnist_test.csv		     video.mp4


## Running and generating the video

### Using the Wav2Lip model

In [None]:
!cd Wav2Lip && python inference.py --checkpoint_path checkpoints/wav2lip.pth --face "../sample_data/video.mp4" --audio "../sample_data/audio2.mp3"

### Using the Wav2Lip + GAN model

In [None]:
!cd Wav2Lip && python inference.py --checkpoint_path checkpoints/wav2lip_gan.pth --face "../sample_data/video.mp4" --audio "../sample_data/audio2.mp3"
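
Neither command above names an output file, so both runs write to the script's default location and the second result replaces the first. The sketch below gives each checkpoint its own output file. It assumes `inference.py` accepts an `--outfile` argument, as the upstream repository did when last checked. The output file names and both helper functions are hypothetical.

```python
import subprocess


def build_inference_command(checkpoint, face, audio, outfile):
    """Build the argument list for Wav2Lip's inference.py (run from inside Wav2Lip/)."""
    return [
        "python", "inference.py",
        "--checkpoint_path", checkpoint,
        "--face", face,
        "--audio", audio,
        "--outfile", outfile,  # assumed flag; keeps the two results in separate files
    ]


# Checkpoint paths come from the cells above; output names are hypothetical.
RUNS = {
    "checkpoints/wav2lip.pth": "results/result_wav2lip.mp4",
    "checkpoints/wav2lip_gan.pth": "results/result_wav2lip_gan.mp4",
}


def run_all(cwd="Wav2Lip"):
    """Run inference once per checkpoint, each with its own output file."""
    for checkpoint, outfile in RUNS.items():
        cmd = build_inference_command(
            checkpoint, "../sample_data/video.mp4", "../sample_data/audio2.mp3", outfile
        )
        subprocess.run(cmd, cwd=cwd, check=True)
```

Calling `run_all()` from a cell in `/content` would then produce both videos for a side-by-side comparison.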

## Save the project to Google Drive before pushing to GitHub

In [19]:
# Delete the repository folder
!rm -rf /content/Wav2Lipsync

In [21]:
from google.colab import drive
drive.mount('/content/drive')

# Create a project directory in your Drive
!mkdir -p "/content/drive/MyDrive/Wav2Lip_Project"

# Copy your important files to Drive
!cp -r /content/Wav2Lip/ "/content/drive/MyDrive/Wav2Lip_Project/" 2>/dev/null || echo "Could not copy the Wav2Lip folder"
!cp -r /content/sample_data/ "/content/drive/MyDrive/Wav2Lip_Project/" 2>/dev/null || echo "Could not copy the sample_data folder"

# Create a README file
# printf interprets \n in every shell, whereas echo may write it literally
!printf "# Wav2Lip Implementation\n\nThis repository contains my implementation of the Wav2Lip model for lip-syncing.\n" > "/content/drive/MyDrive/Wav2Lip_Project/README.md"

Mounted at /content/drive
