# Optical Flow

1) Create three slow-motion versions of a video using frame interpolation. The three versions are:

Repeat: frames interpolated by repetition of the original frame\
$fr_{new} = fr_{prev}$

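Linear: frames interpolated linearly\
$fr_{new} = (1 - t) \cdot fr_{prev} + t \cdot fr_{next}$

Optical Flow: frames interpolated using optical flow\
$fr_{new} = (1 - t) \cdot fr_{flow} + t \cdot fr_{next}$

Here $t \in (0, 1)$ is the relative position of the new frame between $fr_{prev}$ and $fr_{next}$, and $fr_{flow}$ is $fr_{prev}$ warped along a fraction $t$ of the optical flow.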

The optical flow can be calculated using the Horn-Schunck method or the Farneback method.
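
As a rough illustration of the first two formulas, the sketch below applies them to tiny constant-valued frames using NumPy only. The helper names `interp_repeat` and `interp_linear` and the choice of $t = i / N$ are assumptions made for this example.

```python
import numpy as np

def interp_repeat(fr_prev, fr_next, t):
    """Repetition: ignore t and return a copy of the previous frame."""
    return fr_prev.copy()

def interp_linear(fr_prev, fr_next, t):
    """Cross-fade: weight (1 - t) on the previous frame and t on the next."""
    blend = (1.0 - t) * fr_prev.astype(np.float32) + t * fr_next.astype(np.float32)
    return np.clip(np.rint(blend), 0, 255).astype(np.uint8)

# two tiny single-channel "frames": all black, then uniform gray level 200
fr_prev = np.full((2, 2), 0, dtype=np.uint8)
fr_next = np.full((2, 2), 200, dtype=np.uint8)

N = 4  # hypothetical slow-down factor: 3 in-between frames per pair
values = [int(interp_linear(fr_prev, fr_next, i / N)[0, 0]) for i in range(1, N)]
print(values)  # [50, 100, 150]
```

The repeated frame stays at 0 for every `t`, while the linear blend moves evenly from the previous value toward the next one.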

In [1]:
import cv2
import numpy as np

In [2]:
video_path = 'assets/onca.mp4'

# slow-down factor: each original frame is followed by (fator - 1) interpolated frames,
# so every pair of consecutive frames yields 8 output frames
fator = 8

cap = cv2.VideoCapture(video_path)
fps = cap.get(cv2.CAP_PROP_FPS)
width = int(cap.get(cv2.CAP_PROP_FRAME_WIDTH))
height = int(cap.get(cv2.CAP_PROP_FRAME_HEIGHT))
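
The effect of `fator` on the output length can be estimated with a little arithmetic. The sketch below assumes, as the processing loop later in this notebook appears to do, that each pair of consecutive frames contributes the earlier frame plus `fator - 1` in-between frames and that the final frame is not written on its own. `output_frame_count` is a hypothetical helper, not part of the notebook.

```python
def output_frame_count(total_frames, fator):
    """Frames written when each consecutive pair yields `fator` output frames."""
    pairs = max(total_frames - 1, 0)
    return pairs * fator

print(output_frame_count(100, 8))  # 792
```

Because the output is written at the same `fps` as the input, the slow-motion video lasts roughly `fator` times as long as the original.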

In [3]:
# create output video

# interpolation by repetition
outrep_width = width
outrep_height = height

# linear interpolation
outlin_width = width
outlin_height = height

# opt flow interpolation
outopt_width = width
outopt_height = height

# video with all three for comparison
outcomb_width = 3*width 
outcomb_height = height

# Create object VideoWriter to save the videos
fourcc = cv2.VideoWriter_fourcc(*'mp4v')
outrep_path = 'output/out_rep.mp4'
outlin_path = 'output/out_lin.mp4'
outopt_path = 'output/out_opt.mp4'
outcomb_path = 'output/out_comb.mp4'

outrep = cv2.VideoWriter(outrep_path, fourcc, fps, (outrep_width, outrep_height))
outlin = cv2.VideoWriter(outlin_path, fourcc, fps, (outlin_width, outlin_height))
outopt = cv2.VideoWriter(outopt_path, fourcc, fps, (outopt_width, outopt_height))
outcomb = cv2.VideoWriter(outcomb_path, fourcc, fps, (outcomb_width, outcomb_height))

In [4]:
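# Auxiliary function to place frames side by side in a single comparison frame
# frames: list of one-element lists, each holding an H x W x C uint8 frame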
def combine_frames(frames):
    height = frames[0][0].shape[0]
    width = frames[0][0].shape[1]
    channels = frames[0][0].shape[2]
    combined_frame = np.zeros((height, width * len(frames), channels), dtype=np.uint8)

    for i, frame in enumerate(frames):
        combined_frame[:, i * width : (i + 1) * width, :] = frame[0]

    return combined_frame
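
For frames that already share height, width, and dtype, the same side-by-side layout can likely be obtained with `np.hstack`. The sketch below shows that alternative, not the notebook's implementation. `combine_frames_hstack` is a hypothetical name, and unlike the function above it takes a flat list of frames rather than a list of one-element lists.

```python
import numpy as np

def combine_frames_hstack(frames):
    """Concatenate equally sized H x W x C frames from left to right."""
    return np.hstack(frames)

# three small 2 x 3 BGR frames: black, gray, white
a = np.zeros((2, 3, 3), dtype=np.uint8)
b = np.full((2, 3, 3), 128, dtype=np.uint8)
c = np.full((2, 3, 3), 255, dtype=np.uint8)

combined = combine_frames_hstack([a, b, c])
print(combined.shape)  # (2, 9, 3)
```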

In [5]:
# coordinate map for optical flow
coord_x, coord_y = np.meshgrid(np.arange(width), np.arange(height))

frames_count = 0
total_frames = int(cap.get(cv2.CAP_PROP_FRAME_COUNT))
# report progress every 10%; at least 1 to avoid a modulo by zero on very short videos
block = max(1, total_frames // 10)

# bring the video back to the beginning
cap.set(cv2.CAP_PROP_POS_FRAMES, 0)
ret, prev_frame = cap.read()
while cap.isOpened():
    frames_count += 1

    # print progress
    if frames_count % block == 0:
        print('Processing: ', int(frames_count/block)*10, '%')

    ret, frame = cap.read()

    if not ret:
        break

    # start with the previous frame
    frame_repeat = cv2.resize(prev_frame, (outrep_width, outrep_height))
    frame_linear = cv2.resize(prev_frame, (outlin_width, outlin_height))
    frame_optflow = cv2.resize(prev_frame, (outopt_width, outopt_height))

    frame_combinado = combine_frames([[frame_repeat], [frame_linear], [frame_optflow]])
    frame_combinado = cv2.resize(frame_combinado, (outcomb_width, outcomb_height))

    # write each frame to the corresponding output video
    outrep.write(frame_repeat)
    outlin.write(frame_linear)
    outopt.write(frame_optflow)
    outcomb.write(frame_combinado)
    
    # optical flow
    prev_gray = cv2.cvtColor(prev_frame, cv2.COLOR_BGR2GRAY)
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)

    # calculate optical flow using the Farneback method already implemented by OpenCV
    # the method parameters are:
    # prev_gray: previous frame in grayscale
    # gray: current frame in grayscale
    # None: no initial flow estimate (OpenCV allocates the output)
    # 0.5: pyr_scale, scale factor between pyramid levels
    # 3: levels, number of pyramid levels
    # 15: winsize, size of the averaging window
    # 3: iterations, number of iterations at each pyramid level
    # 5: poly_n, size of the pixel neighborhood used for the polynomial expansion
    # 1.2: poly_sigma, standard deviation of the Gaussian used in the polynomial expansion
    # 0: flags
    flow = cv2.calcOpticalFlowFarneback(prev_gray, gray, None, 0.5, 3, 15, 3, 5, 1.2, 0)

    # insert intermediate frames
    for i in range(1, fator):
        # interpolation by repetition
        frame_repeat = prev_frame

        # linear interpolation
        frame_linear = cv2.addWeighted(prev_frame, (fator - i) / fator, frame, i / fator, 0)

        # optical flow interpolation
        # cv2.remap is a backward mapping: output pixel (x, y) is sampled from
        # prev_frame at (map_x[y, x], map_y[y, x])
        # shifting the original coordinate maps (coord_x, coord_y) against the flow by
        # the fraction i / fator moves the content of prev_frame forward along the flow
        # (an approximation: the flow is read at the destination pixel, not at the source)
        map_x = coord_x - flow[:, :, 0] * i / fator
        map_y = coord_y - flow[:, :, 1] * i / fator
        frame_optflow = cv2.remap(prev_frame, map_x.astype(np.float32), map_y.astype(np.float32), interpolation=cv2.INTER_LINEAR)
        
        # finally, interpolate linearly between the next frame and the transformed one
        peso_fluxo_otico = 1 - i / fator
        peso_proximo_frame = i / fator
        frame_optflow = cv2.addWeighted(frame_optflow, peso_fluxo_otico, frame, peso_proximo_frame, 0)

        frame_combinado = combine_frames([[frame_repeat], [frame_linear], [frame_optflow]])
        
        outrep.write(frame_repeat)
        outlin.write(frame_linear)
        outopt.write(frame_optflow)
        outcomb.write(frame_combinado)

    prev_frame = frame

cap.release()
outrep.release()
outlin.release()
outopt.release()
outcomb.release()

Processing:  10 %
Processing:  20 %
Processing:  30 %
Processing:  40 %
Processing:  50 %
Processing:  60 %
Processing:  70 %
Processing:  80 %
Processing:  90 %
Processing:  100 %
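
The warping step depends on the coordinate maps being read as a backward mapping: each output pixel is sampled from the source at the coordinates stored in the maps. The sketch below imitates this with NumPy on a one-row frame. It assumes a uniform flow of 4 pixels to the right instead of an estimated one, uses nearest-neighbor sampling, and clamps out-of-range coordinates, which may differ from OpenCV's border handling. `remap_nearest` is a hypothetical stand-in for `cv2.remap`.

```python
import numpy as np

def remap_nearest(src, map_x, map_y):
    """Backward mapping: dst[y, x] = src[map_y[y, x], map_x[y, x]], nearest pixel, clamped."""
    h, w = src.shape[:2]
    xs = np.clip(np.rint(map_x).astype(int), 0, w - 1)
    ys = np.clip(np.rint(map_y).astype(int), 0, h - 1)
    return src[ys, xs]

height, width = 1, 8
coord_x, coord_y = np.meshgrid(np.arange(width), np.arange(height))

# previous frame: a single bright pixel at x = 2
prev_frame = np.zeros((height, width), dtype=np.uint8)
prev_frame[0, 2] = 255

# assumed flow: every pixel moves 4 pixels to the right between the two frames
flow = np.zeros((height, width, 2), dtype=np.float32)
flow[:, :, 0] = 4.0

t = 0.5  # halfway between the two frames
map_x = coord_x - flow[:, :, 0] * t
map_y = coord_y - flow[:, :, 1] * t
warped = remap_nearest(prev_frame, map_x, map_y)
print(int(np.argmax(warped[0])))  # 4
```

Subtracting the scaled flow from the coordinate maps moves the bright pixel from x = 2 to x = 4, halfway to its position in the next frame.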



2) Describe briefly the results you found in item (1). Which method is better? Why?


Linear interpolation applies a simple weighted average of the two neighboring frames. Optical flow instead estimates the apparent per-pixel motion between the frames and moves the image content along that motion to build the intermediate frames. This makes the result smoother and closer to what the original video would look like in slow motion.\
\
Perceptually, linear interpolation gives a slower, dragging video with a "ghost" effect on moving objects, because each object is visible at its old and new positions at the same time, almost as if the viewer were drunk.\
Optical flow, on the other hand, gives a smoother video with more natural movement and a much weaker "ghost" effect on moving objects.\
Finally, repeating the frames only makes the video slower, since each frame is simply held longer, without any interesting perceptual effect.
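
The "ghost" effect can be reproduced on a toy one-dimensional example. The sketch below is only an illustration: it uses NumPy, and a known shift of 4 pixels stands in for an estimated optical flow.

```python
import numpy as np

prev_row = np.zeros(8, dtype=np.float32)
next_row = np.zeros(8, dtype=np.float32)
prev_row[2] = 255.0  # object position in the previous frame
next_row[6] = 255.0  # the same object, 4 pixels to the right, in the next frame

t = 0.5  # halfway between the two frames

# linear blend: the object shows up at both positions with half intensity
linear = (1 - t) * prev_row + t * next_row

# motion-compensated frame: move the object by a fraction t of the known shift
shifted = np.roll(prev_row, int(round(4 * t)))

print(np.flatnonzero(linear).tolist())   # [2, 6]
print(np.flatnonzero(shifted).tolist())  # [4]
```

The blended row contains two faint copies of the object, which is the ghosting described above. The motion-compensated row contains a single copy at the intermediate position.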