# Video Quality Enhancement with GFPGAN

## In this notebook, video enhancement is carried out on the video of the Wav2Lip model using the GFPGAN generative model V1.4.
- The videos generated by the Wav2lip.pth model undergo scaling processes that significantly impact facial quality, negatively affecting lip synchronization. Therefore, these videos require further enhancement through quality improvement methods.


# GFPGAN
GFPGAN (Generative Facial Prior GAN) is a powerful machine learning model designed to restore low-quality facial images to high-definition. It's a significant advancement in the field of image restoration, capable of handling various image degradations like low resolution, noise, blur, and compression artifacts.

### How GFPGAN Works
GFPGAN leverages the power of a pre-trained face GAN (like StyleGAN2) to provide rich and diverse facial priors. This allows it to restore faces effectively even with minimal input information. The model works by:

- Detecting the face: Using RetinaFace, it locates the face in the image.
- Facial landmark detection: It identifies key points on the face for precise restoration.
- Image restoration: Applying the generative facial prior, GFPGAN reconstructs the facial details, removing imperfections and enhancing clarity.


### GFPGAN v1.4: Improved Performance
#### The latest version, GFPGAN v1.4, builds upon the strengths of its predecessors while addressing certain limitations. It offers:

- Enhanced detail: Produces images with even finer details compared to previous versions.
- Better identity preservation: Maintains facial features and identity more accurately.
These improvements make GFPGAN v1.4 an even more effective tool for restoring old photos, enhancing low-resolution images, and improving overall image quality.

### Applications of GFPGAN
##### GFPGAN has a wide range of applications, including:

- Photo restoration: Reviving old and damaged photos.
- Video enhancement: Improving the quality of low-resolution videos.
- Social media: Enhancing profile pictures and other online images.
- Film restoration: Restoring old movies and TV shows.


In [1]:
# clone the GFPGAN model repo
!git clone https://github.com/TencentARC/GFPGAN.git

%cd GFPGAN

# install GFPGAN dependencies
!pip install -r requirements.txt

%cd /content/

# Install basicsr - https://github.com/xinntao/BasicSR
# We use BasicSR for both training and inference
!pip install basicsr

# Install facexlib - https://github.com/xinntao/facexlib
# We use face detection and face restoration helper in the facexlib package
!pip install facexlib

!pip install -r requirements.txt
!python setup.py develop

# If you want to enhance the background (non-face) regions with Real-ESRGAN,
# you also need to install the realesrgan package
!pip install realesrgan

# unstall torch to download the compatible versions
!pip uninstall -y torch torchvision

# install the compatible versions
# Install torch 2.0.1+cu117
!pip install torch==2.0.1+cu117 -f https://download.pytorch.org/whl/cu117/torch_stable.html

# Install torchvision 0.15.2+cu117
!pip install torchvision==0.15.2+cu117 -f https://download.pytorch.org/whl/cu117/torch_stable.html


%cd GFPGAN

!pip install ffmpeg
!pip install gdown



Cloning into 'GFPGAN'...
remote: Enumerating objects: 527, done.[K
remote: Counting objects: 100% (213/213), done.[K
remote: Compressing objects: 100% (59/59), done.[K
remote: Total 527 (delta 170), reused 155 (delta 154), pack-reused 314[K
Receiving objects: 100% (527/527), 5.38 MiB | 20.62 MiB/s, done.
Resolving deltas: 100% (281/281), done.
[31mERROR: Could not open requirements file: [Errno 2] No such file or directory: 'requirements.txt'[0m[31m
[0mCollecting basicsr
  Downloading basicsr-1.4.2.tar.gz (172 kB)
[2K     [90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [32m172.5/172.5 kB[0m [31m13.7 MB/s[0m eta [36m0:00:00[0m
[?25h  Preparing metadata (setup.py) ... [?25l[?25hdone
Collecting addict (from basicsr)
  Downloading addict-2.4.0-py3-none-any.whl.metadata (1.0 kB)
Collecting lmdb (from basicsr)
  Downloading lmdb-1.5.1-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (1.1 kB)
Collecting tb-nightly (from basicsr)
  Downloading tb_nightly-

In [None]:
# import the necessary libraries
import gdown
import os
import ffmpeg
import subprocess
import shutil

In [None]:
# download the videos to be edited
# run this cell to download the required video to enhance

# Download the modified inference_gfpgan.py

gdown.download(f"https://drive.google.com/uc?id=***********************", "/content/GFPGAN/inference_gfpgan.py", quiet=False)


In [None]:
# the videos links

# English
# file_id = "*************************"

# #korean
file_id = "********************************"

#spanish
# file_id = "********************************"

# #Arabic
# file_id = "********************************"

# Download the file
gdown.download(f"https://drive.google.com/uc?id={file_id}", "/content/input_video.mp4", quiet=False)

In [3]:

# create directories for frames and audio

def clear_directory(directory):
    if os.path.exists(directory) and os.listdir(directory):
        shutil.rmtree(directory)
    os.makedirs(directory, exist_ok=True)

# Define the directories
directories = [
    "/content/frames",
    "/content/sound",
    "/content/enhanced_frames"
]

# Clear and recreate the directories
for directory in directories:
    clear_directory(directory)


# Extract frames with highest quality
subprocess.run(['ffmpeg', '-i', '/content/input_video.mp4', '-vf', 'fps=25', '-q:v', '1', '/content/frames/frame_%06d.png'])

# Extract audio with highest quality
subprocess.run(['ffmpeg', '-i', '/content/input_video.mp4', '-vn', '-acodec', 'pcm_s16le', '-ar', '44100', '-ac', '2', '/content/sound/audio.wav'])

CompletedProcess(args=['ffmpeg', '-i', '/content/input_video.mp4', '-vn', '-acodec', 'pcm_s16le', '-ar', '44100', '-ac', '2', '/content/sound/audio.wav'], returncode=0)

In [4]:
# count the frames of the original video before the enhancement

num_frames = len([f for f in os.listdir("/content/frames") if os.path.isfile(os.path.join("/content/frames", f))])
print(f"Number of frames: {num_frames}")

Number of frames: 1049


In [5]:
# run GFPGAN to enhance the frames
!python inference_gfpgan.py -i /content/frames -o /content/enhanced_frames -v 1.4 -s 1



[1;30;43mStreaming output truncated to the last 5000 lines.[0m
	Tile 9/15
	Tile 10/15
	Tile 11/15
	Tile 12/15
	Tile 13/15
	Tile 14/15
	Tile 15/15
Processing frame_000738.png ...
	Tile 1/15
	Tile 2/15
	Tile 3/15
	Tile 4/15
	Tile 5/15
	Tile 6/15
	Tile 7/15
	Tile 8/15
	Tile 9/15
	Tile 10/15
	Tile 11/15
	Tile 12/15
	Tile 13/15
	Tile 14/15
	Tile 15/15
Processing frame_000739.png ...
	Tile 1/15
	Tile 2/15
	Tile 3/15
	Tile 4/15
	Tile 5/15
	Tile 6/15
	Tile 7/15
	Tile 8/15
	Tile 9/15
	Tile 10/15
	Tile 11/15
	Tile 12/15
	Tile 13/15
	Tile 14/15
	Tile 15/15
Processing frame_000740.png ...
	Tile 1/15
	Tile 2/15
	Tile 3/15
	Tile 4/15
	Tile 5/15
	Tile 6/15
	Tile 7/15
	Tile 8/15
	Tile 9/15
	Tile 10/15
	Tile 11/15
	Tile 12/15
	Tile 13/15
	Tile 14/15
	Tile 15/15
Processing frame_000741.png ...
	Tile 1/15
	Tile 2/15
	Tile 3/15
	Tile 4/15
	Tile 5/15
	Tile 6/15
	Tile 7/15
	Tile 8/15
	Tile 9/15
	Tile 10/15
	Tile 11/15
	Tile 12/15
	Tile 13/15
	Tile 14/15
	Tile 15/15
Processing frame_000742.png ...
	Tile 1/

In [6]:
# count the frames after enhancement
# Path to the directory containing the frames
frames_dir = "/content/enhanced_frames/restored_imgs"

# count the number of files in the directory
num_frames = len([f for f in os.listdir(frames_dir) if os.path.isfile(os.path.join(frames_dir, f))])

print(f"Number of frames: {num_frames}")


Number of frames: 1049


In [7]:

# combine enhanced frames and audio to create the output video with the highest quality
subprocess.run([
    'ffmpeg', '-framerate', '25', '-i', '/content/enhanced_frames/restored_imgs/frame_%06d.png',
    '-i', '/content/sound/audio.wav', '-c:v', 'libx264', '-crf', '18', '-pix_fmt', 'yuv420p', '/content/korean_output_video.mp4'
])


CompletedProcess(args=['ffmpeg', '-framerate', '25', '-i', '/content/enhanced_frames/restored_imgs/frame_%06d.png', '-i', '/content/sound/audio.wav', '-c:v', 'libx264', '-crf', '18', '-pix_fmt', 'yuv420p', '/content/korean_output_video.mp4'], returncode=0)