# Create 7777 video - 'everyday' alignment 2nd attempt

This notebook shows the full code base needed to align all of Noah's images from the 'everyday' project and create the video [7777](https://www.youtube.com/watch?v=DC1KHAxE7mo).

In [None]:
from IPython.lib.display import YouTubeVideo
YouTubeVideo('Tc2WPoR-zlw')

In short, we use the [dlib](http://dlib.net/) toolbox to detect, extract and align faces from all the images. This is done in two steps:

1. We use the `hog_detector` to find the faces in all images. This detector is only moderately accurate but runs very quickly.
2. For all images where it wasn't possible to detect a face, we use the `cnn_face_detection_model_v1` routine. This routine is slower but more accurate.
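
The control flow of these two steps can be sketched independently of dlib. In the minimal sketch below, the two detectors are hypothetical stand-ins (plain functions that return a list of boxes), not dlib's API; only the "fast pass first, slow pass on the misses" logic mirrors the steps above.

```python
def detect_with_fallback(items, fast_detector, slow_detector):
    """Run the fast detector on every item and the slow one only on the misses."""
    found, missed = {}, []
    for name in items:
        boxes = fast_detector(name)
        if boxes:
            found[name] = boxes
        else:
            missed.append(name)

    # Second pass: only the items the fast detector could not handle
    still_missed = []
    for name in missed:
        boxes = slow_detector(name)
        if boxes:
            found[name] = boxes
        else:
            still_missed.append(name)
    return found, still_missed


# Hypothetical stand-ins: the fast detector only finds 'a', the slow one finds 'a' and 'b'
fast = lambda name: [(0, 0, 10, 10)] if name == 'a' else []
slow = lambda name: [(1, 1, 9, 9)] if name in ('a', 'b') else []

found, still_missed = detect_with_fallback(['a', 'b', 'c'], fast, slow)
print(sorted(found), still_missed)
```

Keeping the list of remaining misses makes it easy to inspect by hand the few images that neither detector could handle.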

Dlib's face detection is usually used to extract small 'chips' (patches of pixels) that contain only the face. In this case, however, we decided to keep the full image and only use dlib's routine to align the faces according to the 5 landmarks of dlib's 5-point model (the corners of both eyes and the bottom of the nose). During this procedure, images are also upscaled; the code below uses a chip size of (1920, 1080), i.e. Full HD.
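
To make the alignment idea concrete: aligning by landmarks amounts to finding a rotation, scale and shift that moves the detected landmarks onto fixed target positions. The sketch below is not dlib's implementation; it fits such a similarity transform with a plain NumPy least-squares solve, and all landmark coordinates are made up for illustration.

```python
import numpy as np

def fit_similarity(src, dst):
    """Least-squares similarity transform (rotation, scale, shift) mapping src to dst.

    Points are (N, 2) arrays. Written with complex numbers, the transform is z -> a*z + b.
    """
    z_src = src[:, 0] + 1j * src[:, 1]
    z_dst = dst[:, 0] + 1j * dst[:, 1]
    A = np.column_stack([z_src, np.ones_like(z_src)])
    (a, b), *_ = np.linalg.lstsq(A, z_dst, rcond=None)
    return a, b

def apply_similarity(a, b, pts):
    """Apply z -> a*z + b to an (N, 2) array of points."""
    z = a * (pts[:, 0] + 1j * pts[:, 1]) + b
    return np.column_stack([z.real, z.imag])

# Made-up target positions for 5 landmarks
target = np.array([[20., 40.], [40., 40.], [60., 40.], [80., 40.], [50., 65.]])

# Simulate a detected face: rotated by 10 degrees, scaled by 1.5 and shifted
a_true = 1.5 * np.exp(1j * np.deg2rad(10))
detected = apply_similarity(a_true, 20 + 5j, target)

# Fit the transform that moves the detected landmarks back onto the targets
a, b = fit_similarity(detected, target)
aligned = apply_similarity(a, b, detected)
print('scale: %.3f, rotation: %.1f deg' % (abs(a), np.degrees(np.angle(a))))
```

Applying the same transform to every pixel instead of only the landmarks is what brings each photo into a common frame.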

## Step 1: Data Acquisition

First things first, let's download the video that contains all photos with `pytube`. If the package is not yet installed on your machine, you can install it with `pip install pytube`.

In [None]:
# Let's install pytube and a few other python packages
!pip install -qU pytube opencv-python tqdm ipywidgets scikit-learn scikit-image tensorflow

### A. Download the video

In [None]:
from pytube import YouTube

In [None]:
# Download video
video_url = 'https://www.youtube.com/watch?v=Tc2WPoR-zlw'
out_folder = 'video'
filename = 'boy'
yt_streams = YouTube(video_url).streams.filter(progressive=True, type='video', res='720p')
yt_streams.first().download(out_folder, filename)

### B. Extract all individual images

*First*, we will use the package OpenCV to load all images from the video. *Second*, we will go through all the images and only keep unique images.
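
The "keep only unique images" step relies on the fact that each photo is shown for several consecutive video frames, so a frame that correlates almost perfectly with the last kept frame can be dropped. A minimal sketch of that filter on synthetic data (the helper name `keep_unique_frames` and the random test frames are illustrative assumptions):

```python
import numpy as np

def keep_unique_frames(frames, threshold=0.99):
    """Keep a frame only if its correlation with the last kept frame is <= threshold."""
    unique = []
    for frame in frames:
        if unique:
            r = np.corrcoef(frame.ravel(), unique[-1].ravel())[0, 1]
            if r > threshold:
                continue
        unique.append(frame)
    return unique

rng = np.random.default_rng(0)
photo_a = rng.integers(0, 256, size=(8, 8, 3)).astype(float)
photo_b = rng.integers(0, 256, size=(8, 8, 3)).astype(float)

# Each photo appears in several frames, with slight compression-like noise
frames = [photo_a, photo_a + rng.normal(0, 1, photo_a.shape),
          photo_b, photo_b + rng.normal(0, 1, photo_b.shape), photo_b]

unique = keep_unique_frames(frames)
print(len(frames), '->', len(unique))
```

Two genuinely different photos that happen to look very similar would also be merged by this filter, so the threshold is a trade-off.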

In [None]:
import os
import cv2
import numpy as np
import matplotlib.pyplot as plt
import shutil
import pandas as pd
from skimage import io
from glob import glob
from matplotlib import patches

from tqdm.notebook import tqdm

In [None]:
# Load video with OpenCV and extract relevant parameters
video = cv2.VideoCapture(os.path.join(out_folder, filename+'.mp4'))
video_frame_count = int(video.get(cv2.CAP_PROP_FRAME_COUNT))
print('Video has %d frames.' % video_frame_count)

In [None]:
video_frame_count

In [None]:
# Stack to keep unique images
imgs = []

# Looping through all images, and only adding them to the stack if they are new
for idx in tqdm(np.arange(video_frame_count)):

    # Read frame
    video.set(cv2.CAP_PROP_POS_FRAMES, idx)
    frame_retrieved, frame = video.read()
    
    # Check if image is new by correlating it to previous frame
    if frame_retrieved:
        if len(imgs)>0 and np.corrcoef(frame.ravel(), imgs[-1].ravel())[0, 1]>0.99:
            continue
        imgs.append(frame)

# Transform stack into numpy array and switch colorcode from BGR to RGB
imgs = np.array(imgs)[..., ::-1]

print('Dataset has shape of', imgs.shape)

In [None]:
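# Store all images on disk (create the folder first, otherwise imsave fails)
os.makedirs('img_orig', exist_ok=True)
for idx in tqdm(range(len(imgs))):
    # Pad to 4 digits so that sorted(glob(...)) keeps the frame order beyond 999 images
    plt.imsave(f'img_orig/plot_{idx:04d}.png', imgs[idx])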

# Prepare everything

In [None]:
# Collect all file names
filenames = sorted(glob('img_orig/*'))
filenames[:5] + filenames[-5:]
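
One caveat worth checking at this point, sketched with made-up file names: `sorted` orders strings lexicographically, so the result only matches the frame order if the index in the file name is zero-padded to a width that covers the largest index.

```python
# The same indices written with 3-digit and with 4-digit zero padding
indices = [5, 50, 999, 1000, 1500]
names_3 = ['plot_%03d.png' % i for i in indices]
names_4 = ['plot_%04d.png' % i for i in indices]

# With 3 digits, 'plot_1000.png' sorts before 'plot_999.png'
print(sorted(names_3))

# With 4 digits, the sorted order equals the numeric order
print(sorted(names_4))
```

Since the later steps build the video from the sorted file list, a wrong order here would silently shuffle the timeline.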

In [None]:
len(filenames)

# Correct and align images with `skimage` and `dlib`

In [None]:
# Show last image in the dataset
last_img = io.imread(filenames[-1])
plt.title(last_img.shape)
plt.imshow(last_img);

In [None]:
# Create output folder for aligned images
out_dir = 'img_aligned'
if not os.path.exists(out_dir):
    os.makedirs(out_dir)

In [None]:
import dlib

# Additional dlib models for face recognition
shape_predictor = dlib.shape_predictor('dlib/shape_predictor_5_face_landmarks.dat') # Faces landmarks (points)

# Which face detector to use
hog_detector = dlib.get_frontal_face_detector()
cnn_detector = dlib.cnn_face_detection_model_v1('dlib/mmod_human_face_detector.dat')

In [None]:
face_chip_size = (1920, 1080) # full hd

In [None]:
def crop_img(img, dim=(1920, 1080), ratio=7.):
    offset = int(dim[0]/ratio)
    return img[offset:offset+dim[1], ...]
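
As a quick sanity check of the crop geometry, the sketch below uses a blank array in place of a real face chip. It assumes the chip is square with a side of 1920 pixels (the `size` value passed to `dlib.get_face_chip` further down) and applies `crop_img` followed by the fixed slice `[80:-80, 500:-500]` that the alignment loop uses.

```python
import numpy as np

def crop_img(img, dim=(1920, 1080), ratio=7.):
    # Skip the top `dim[0] / ratio` rows, then keep `dim[1]` rows
    offset = int(dim[0] / ratio)
    return img[offset:offset + dim[1], ...]

# Blank stand-in for a square 1920 x 1920 face chip
chip = np.zeros((1920, 1920, 3), dtype='uint8')

cropped = crop_img(chip, dim=(1920, 1080), ratio=7)
final = cropped[80:-80, 500:-500]
print(chip.shape, '->', cropped.shape, '->', final.shape)
```

Under these assumptions the saved images end up square, which is worth keeping in mind when choosing the output resolution of the video.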

### Go through with hog_detector

In [None]:
# CNN is more advanced but takes longer; hog misses ~100 faces in total
face_detector = hog_detector

In [None]:
align_files = True

In [None]:
from skimage.exposure import rescale_intensity

padding = np.divide(*face_chip_size)

issues = []

if align_files:

    for f in tqdm(filenames):

        # Specify new filename
        new_filename = os.path.join('img_aligned', os.path.basename(f))
        if os.path.exists(new_filename):
            continue

        # Load image
        im = io.imread(f)[..., :3]

        # Get information about image size (shape is rows x columns)
        n_rows, n_cols = im.shape[:2]
        offset = (n_cols - n_rows)//2

        # Optional intensity correction (disabled)
        # plow, phigh = np.percentile(im, (0, 99))
        # im_corrected = rescale_intensity(im, in_range=(plow, phigh))

        # Center image vertically in a square canvas (assumes width >= height)
        canvas = np.zeros((n_cols, n_cols, 3), dtype='uint8')
        canvas[offset:offset+n_rows] = im

        # Detect faces and align image
        rectangles = [x if isinstance(x, dlib.rectangle) else x.rect for x in face_detector(canvas, 1)]
        if len(rectangles):
            landmarks = [shape_predictor(canvas, r) for r in rectangles]
            face_chips = [dlib.get_face_chip(canvas, l, size=face_chip_size[0],
                                             padding=padding) for l in landmarks]

            # Crop image to right ratio
            img_final = crop_img(face_chips[0], dim=face_chip_size, ratio=7)
            img_final = img_final[80:-80, 500:-500]

            # Save aligned image
            io.imsave(new_filename, img_final)

        else:
            print('new issue found:', f)
            issues.append(f)
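
The canvas step of the loop can be tested in isolation. The helper below is a hypothetical sketch (the name `pad_to_square` does not appear in the notebook) that centers an image on a black square canvas, for both landscape and portrait input.

```python
import numpy as np

def pad_to_square(im):
    """Center an (H, W, 3) image on a black square canvas with side max(H, W)."""
    n_rows, n_cols = im.shape[:2]
    side = max(n_rows, n_cols)
    top = (side - n_rows) // 2
    left = (side - n_cols) // 2
    canvas = np.zeros((side, side, 3), dtype=im.dtype)
    canvas[top:top + n_rows, left:left + n_cols] = im
    return canvas

# White landscape test frame with the size of a 720p video frame
frame = np.full((720, 1280, 3), 255, dtype='uint8')
canvas = pad_to_square(frame)
print(canvas.shape, canvas[:280].max(), canvas[280:1000].min())
```

Note that NumPy reports image shapes as (rows, columns), i.e. (height, width), which is easy to mix up when computing the offset.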

In [None]:
print(len(issues))
issues

### Go through issue images with cnn_detector

In [None]:
# CNN is more advanced but takes longer; hog misses ~100 faces in total
face_detector = cnn_detector

In [None]:
from skimage.exposure import rescale_intensity

padding = np.divide(*face_chip_size)

issues_still = []

for f in tqdm(issues):
    
    # Load image
    im = io.imread(f)[..., :3]
    
    # Get information about image size (shape is rows x columns)
    n_rows, n_cols = im.shape[:2]
    offset = (n_cols - n_rows)//2

    # Optional intensity correction (disabled)
    # plow, phigh = np.percentile(im, (0, 99))
    # im_corrected = rescale_intensity(im, in_range=(plow, phigh))

    # Center image vertically in a square canvas (assumes width >= height)
    canvas = np.zeros((n_cols, n_cols, 3), dtype='uint8')
    canvas[offset:offset+n_rows] = im

    # Detect faces and align image
    rectangles = [x if isinstance(x, dlib.rectangle) else x.rect for x in face_detector(canvas, 1)]
    if len(rectangles):
        landmarks = [shape_predictor(canvas, r) for r in rectangles]
        face_chips = [dlib.get_face_chip(canvas, l, size=face_chip_size[0],
                                         padding=padding) for l in landmarks]
        
        # Crop image to right ratio
        img_final = crop_img(face_chips[0], dim=face_chip_size, ratio=7)
        img_final = img_final[80:-80, 500:-500]

        # Save aligned image
        io.imsave(os.path.join('img_aligned', os.path.basename(f)), img_final)
        
    else:
        print('new issue found:', f)
        issues_still.append(f)

In [None]:
print(len(issues_still))
issues_still

In [None]:
# Number of images
print(len(filenames), len(glob('img_aligned/plot*')))

# Set up video parameters

In [None]:
# Get all filenames
imgs = sorted(glob('img_aligned/plot*'))

# Extract number of images
N_total = len(imgs)
N_total

In [None]:
# Specify frames per second
fps = 24

print('Video length: %.2f seconds.' % (N_total/fps))

# Create aligned video

In [None]:
# Save images to disk
out_dir = 'img_video_aligned'
if not os.path.exists(out_dir):
    os.makedirs(out_dir)

for i in tqdm(np.arange(len(imgs))):
    
    im = io.imread(imgs[i])
    
    # Create out_filename
    out_filename = os.path.join(out_dir, '%04d.jpg' % (i + 1))
    
    # Save image as JPEG
    io.imsave(out_filename, im.astype('uint8'))

In [None]:
# Pipe all JPEGs into ffmpeg to create the video
!cat img_video_aligned/*jpg | ffmpeg -f image2pipe -r $fps -vcodec mjpeg -i - -vcodec libx264 video_aligned.mp4

# Create averaged images

In [None]:
# How many images to smooth at once
smooth = 30

In [None]:
# How many days to jump at every image
step_size = 1

In [None]:
# Get start indices for images
ids = [i*step_size for i in range((N_total+smooth)//step_size+1)]
len(ids)

In [None]:
# Save images to disk
out_dir = 'img_video_%ddays_mean' % (smooth)
if not os.path.exists(out_dir):
    os.makedirs(out_dir)

# To keep track what was already loaded
already_loaded = []

for i in tqdm(ids):
    
    # Collect indices of images
    imgs_idx = np.arange(np.clip(i-smooth, 0, N_total-1), np.clip(i, 0, N_total-1)+1)

    # Collect images relevant for the group
    group_names = np.array(imgs)[imgs_idx]
    
    # Detect which one is new to load
    new_to_load = np.setdiff1d(group_names, already_loaded)
    
    if len(new_to_load)==0:
        pass
    elif i==0:
        imgs_group = np.array([io.imread(f) for f in new_to_load])
    else:
        img_new = np.array([io.imread(f) for f in new_to_load])
        imgs_group = np.vstack((imgs_group, img_new))
        
    # Cut imgs_group to right size
    n_offset = (i - N_total)
    if n_offset <= 0:
        n_offset = 0
    elif n_offset%2==0:
        n_offset -= 1
    imgs_group = imgs_group[-smooth+n_offset:]
    
    # Create composition image
    img_comp = np.mean(imgs_group, axis=0).astype('int')
    
    # Create out_filename
    out_filename = os.path.join(out_dir, '%04d.jpg' % (i + 1))
    
    # Save composition image
    io.imsave(out_filename, img_comp.astype('uint8'))

    # Keep track of what has already been loaded
    already_loaded = group_names
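
For reference, the core of this loop is a trailing average over at most `smooth` consecutive images. The minimal sketch below shows that idea on synthetic data; it deliberately ignores the incremental loading and the shrinking window at the end of the sequence that the loop above also handles, and the helper name `trailing_mean` is an assumption.

```python
import numpy as np

def trailing_mean(stack, window):
    """Average each frame with up to `window - 1` frames before it."""
    out = np.empty(stack.shape, dtype=float)
    for i in range(len(stack)):
        start = max(0, i - window + 1)
        out[i] = stack[start:i + 1].mean(axis=0)
    return out

# Five tiny 'images' whose pixel values equal their index
stack = np.stack([np.full((2, 2, 3), v, dtype=float) for v in range(5)])

smoothed = trailing_mean(stack, window=3)
print(smoothed[:, 0, 0, 0])
```

Loading only the newly needed images per step, as the loop above does, avoids reading every file `smooth` times from disk.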

In [None]:
# Use either command (whichever works) to create the video
!cat img_video_30days_mean/*jpg | ffmpeg -f image2pipe -r $fps -vcodec mjpeg -i - -vcodec libx264 video_30days_mean.mp4
#!ffmpeg -r 30 -f image2 -pattern_type glob -i 'img_video_30days_mean/*.jpg' -c:v libx264 -profile:v high video_30days_mean.mp4