**Prepared and Maintained by [justinjohn-03](https://github.com/justinjohn0306/)**

In [1]:
#@markdown ### **🚀 Clone Repository and Install Requirements**
from IPython.display import clear_output, display, HTML

display(HTML('<font color="red">Cloning video-retalking repository:</font>'))
!git clone https://github.com/justinjohn0306/video-retalking
%cd /content/video-retalking
clear_output()

display(HTML('<font color="red">Uninstalling existing gdown and reinstalling from source to avoid Google Drive download quota issues:</font>'))
!pip uninstall gdown -y
!pip install git+https://github.com/wkentaro/gdown.git
clear_output()

display(HTML('<font color="red">Installing other project requirements:</font>'))
!pip install -r requirements_colab.txt
clear_output()

display(HTML('<font color="red">Now we are set up and ready to proceed!</font>'))
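
The inference log in this notebook reports `Using cuda for inference`, so it can be worth confirming that a GPU runtime is attached before downloading the models. The following is a minimal sketch, assuming PyTorch is installed by the requirements file; `describe_device` is a hypothetical helper name, not part of the repository.

```python
def describe_device():
    """Return 'cuda', 'cpu', or 'torch-missing' for the current runtime."""
    try:
        import torch  # assumed to be provided by the project requirements
    except ImportError:
        return "torch-missing"
    # torch.cuda.is_available() is True only when a usable GPU is attached
    return "cuda" if torch.cuda.is_available() else "cpu"


print(f"Detected device: {describe_device()}")
```

If this prints `cpu` in Colab, switching the runtime type to a GPU before continuing will likely make inference much faster.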

In [2]:
#@markdown ### **📥 Download the Pretrained Models**
#@markdown The following code will download and unzip the pretrained models for this project:
from IPython.display import clear_output, display, HTML
%cd /content/video-retalking

import gdown

gdown.download("https://drive.google.com/uc?id=1Qtg-GVUKZ7aXtz-4O9Mm4ncXjEYRB8-p", "/content/checkpoints.zip", quiet=False)
!unzip -o /content/checkpoints.zip -d /content/video-retalking/
!rm /content/checkpoints.zip
clear_output()

display(HTML('<font color="red">Now you are ready to run the inference!</font>'))
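
Because `clear_output()` hides any download or unzip errors, a quick check that the checkpoints landed where inference expects them can save a failed run. This is a sketch under assumptions: the file names come from the `Load checkpoint from` lines in the inference log, the archive may contain other files as well, and `missing_checkpoints` is a hypothetical helper.

```python
from pathlib import Path

# Names taken from the "Load checkpoint from" lines of the inference log;
# this list is illustrative and may not cover every file the project needs.
EXPECTED = ["DNet.pt", "LNet.pth", "ENet.pth"]


def missing_checkpoints(checkpoint_dir, names=EXPECTED):
    """Return the expected checkpoint names that are absent from checkpoint_dir."""
    root = Path(checkpoint_dir)
    return [name for name in names if not (root / name).is_file()]


missing = missing_checkpoints("/content/video-retalking/checkpoints")
print("Missing checkpoints:", missing or "none")
```

An empty result suggests the download and unzip steps completed; any listed name points to a file worth re-downloading.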

In [None]:
import os
from IPython.display import clear_output, display, HTML

#@markdown ### **🚀 Set up Files for Inference**

#@markdown Enter the path to your face video file:
face_video = '/content/input_video.mp4'  #@param {type:"string"}

#@markdown Enter the path to your audio input file:
audio_input = '/content/input_audio.wav'  #@param {type:"string"}

#@markdown Enter the path where you want the output file to be saved:
output_file = 'results/1_1.mp4'  #@param {type:"string"}

assert os.path.exists(face_video), f"Face video file not found: {face_video}"
assert os.path.exists(audio_input), f"Audio input file not found: {audio_input}"

#@markdown Once your files are set, you can run the inference:
display(HTML('<font color="red">Running the inference...</font>'))
# Quote the paths so that file names containing spaces reach inference.py intact
!python3 inference.py --face "$face_video" --audio "$audio_input" --outfile "$output_file"


[Info] Using cuda for inference.
[Step 0] Number of frames available for inference: 361
[Step 1] Landmarks Extraction in Video.
Downloading: "https://www.adrianbulat.com/downloads/python-fan/s3fd-619a316812.pth" to /root/.cache/torch/hub/checkpoints/s3fd-619a316812.pth
100% 85.7M/85.7M [00:04<00:00, 21.0MB/s]
Downloading: "https://www.adrianbulat.com/downloads/python-fan/2DFAN4-cd938726ad.zip" to /root/.cache/torch/hub/checkpoints/2DFAN4-cd938726ad.zip
100% 91.9M/91.9M [00:05<00:00, 17.7MB/s]
landmark Det:: 100% 361/361 [00:36<00:00,  9.90it/s]
[Step 2] 3DMM Extraction In Video:: 100% 361/361 [00:04<00:00, 77.46it/s]
using expression center
Load checkpoint from: checkpoints/DNet.pt
Load checkpoint from: checkpoints/LNet.pth
Load checkpoint from: checkpoints/ENet.pth
[Step 3] Stablize the expression In Video:: 100% 361/361 [00:36<00:00,  9.92it/s]
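
Paths that contain spaces or shell metacharacters need quoting before they reach the shell command in the inference cell. One way to handle that from Python is to build the command with `shlex`, as sketched below. The `--face`, `--audio`, and `--outfile` flags are the ones used in this notebook; `build_inference_command` and the example file name `my clip.mp4` are hypothetical.

```python
import shlex


def build_inference_command(face, audio, outfile):
    """Build a shell-safe command line for inference.py."""
    args = [
        "python3", "inference.py",
        "--face", face,
        "--audio", audio,
        "--outfile", outfile,
    ]
    # shlex.join quotes only the arguments that need it (Python 3.8+)
    return shlex.join(args)


cmd = build_inference_command(
    "/content/my clip.mp4",  # hypothetical path with a space
    "/content/input_audio.wav",
    "results/1_1.mp4",
)
print(cmd)
# python3 inference.py --face '/content/my clip.mp4' --audio /content/input_audio.wav --outfile results/1_1.mp4
```

In a Colab cell, the resulting string can then be run with `!{cmd}`.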


In [None]:

#@markdown ### **🎞️ View the Result Video**
#@markdown After running the inference, you can view the output video directly in your browser:

from IPython.display import HTML
from base64 import b64encode

# Read the video with a context manager so the file handle is closed promptly
with open(output_file, 'rb') as f:
    mp4 = f.read()
data_url = "data:video/mp4;base64," + b64encode(mp4).decode()

HTML("""
<video width=400 controls>
      <source src="%s" type="video/mp4">
</video>
""" % data_url)
