# Introduction to Video Processing 

## Learning Objectives:
- Create a sequence of video frames by applying signal processing techniques learned in the course.
- Solve video streaming optimization problems by applying different tools and filters.

## Overview
So far, we have dealt with image processing, where images are treated as 2D signals that we manipulate using different filters. A video is essentially a sequence of images displayed at a certain frame rate, which makes image processing the cornerstone of video processing. We therefore begin this lab by exploring how image processing techniques extend to videos.

In this lab, we take a closer look at video signals. A key idea is that videos dominated by low-frequency content, where consecutive frames differ very little, can be compressed effectively without substantial loss of quality.

We'll utilize the Fourier Transform, a powerful mathematical tool, to analyze the frequency content of video signals. This invaluable information will help inform your decisions on video compression strategies.

In [19]:
import matplotlib.pyplot as plt
import cv2
import numpy as np
import skimage
from skimage import io, color, filters  # importing filters makes skimage.filters available below

### Basics of Video Processing
The first part of our journey into video processing begins with extraction of frames from a video. Videos, at their core, are a sequence of images, or frames, that are displayed at a rapid rate. When we watch a video, our brains interpret this rapid sequence of slightly differing images as motion. The process of extracting frames from a video is simply the act of capturing these individual images at each step in the sequence. 

When we're done processing, we'll need to convert these frames back into a video format. This is essentially the reverse of the extraction process, where we stitch the sequence of images back together into a continuous stream. Let us write functions for both the extraction of the frames and the reverse.

In [20]:
def extract_frames(video_path):
    """
    Extract frames from a video file and return them as a list of images.

    This function uses OpenCV's VideoCapture class to open the video file at 
    the provided path. It then enters a while loop that continues for as long 
    as the video file is open.

    Within the loop, it reads each frame in the video one at a time. If a frame 
    is successfully read (indicated by `ret` being True), it is added to the 
    list of frames.

    If a frame cannot be read (which usually means that the end of the video has 
    been reached), `ret` will be False, and the loop will exit.

    Finally, the VideoCapture object is released to free up system resources, 
    and the list of frames is returned.

    Args:
        video_path (str): Path to the video file.

    Returns:
        frames (list of ndarray): A list where each element is a numpy array 
            representing an image.
    """
    cap = cv2.VideoCapture(video_path)
    frames = []

    while cap.isOpened():
        ret, frame = cap.read()
        if ret:
            frames.append(frame)
        else:
            break

    cap.release()
    return frames


In [21]:
def frames_to_video(frames, output_video_file, fps=30):
    '''
    Convert a list of frames into a video

    Parameters:
    frames (list): List of frames. Each frame should be a numpy array
    output_video_file (str): Output video file name
    fps (int): Frames per second of the output video

    Returns:
    None
    '''
    # Get size (width, height) from the first frame
    height, width = frames[0].shape[:2]
    size = (width, height)

    # Initialize the video writer
    out = cv2.VideoWriter(output_video_file, cv2.VideoWriter_fourcc(*'DIVX'), fps, size)

    # Write frames to the video writer
    for frame in frames:
        out.write(frame)

    # Release the video writer
    out.release()

Next, we turn to an important concept in video processing: the frame difference. The **frame difference** quantifies the change between two consecutive frames in a video. It helps us detect motion by highlighting the parts of the frame that have changed. If two consecutive frames are identical, as in a static scene, their difference is zero everywhere. In high-action segments, by contrast, the frame difference can reveal interesting patterns.

Let's try to write a function that calculates the frame difference. We already have a function that extracts all frames from a video, so we can utilize that. What we need to do next is to iterate over these frames, and for each frame, calculate the difference with the next one. There are multiple ways to calculate this difference, but a simple way is to subtract the pixel values of one frame from the corresponding pixel values of the next frame. This will give us a new image that represents the difference between the two frames.
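One pitfall with plain subtraction is worth seeing on a toy example. Video frames are usually stored as unsigned 8-bit integers, so a negative result wraps around instead of going below zero. The sketch below uses two made-up 2x2 arrays (not frames from the lab videos) and NumPy only; it is an illustration of the issue, not the intended solution to the exercise.

```python
import numpy as np

# Two tiny synthetic 8-bit "frames" (made-up values, for illustration only).
frame_a = np.array([[10, 200], [50, 50]], dtype=np.uint8)
frame_b = np.array([[20, 190], [50, 60]], dtype=np.uint8)

# Naive subtraction stays in uint8, so negative results wrap around modulo 256.
naive = frame_b - frame_a

# Widening to a signed type first gives the true absolute difference.
safe = np.abs(frame_b.astype(np.int16) - frame_a.astype(np.int16)).astype(np.uint8)

print(naive)  # the entry for 190 - 200 wraps to 246
print(safe)   # the same entry is 10
```

A dedicated absolute-difference routine avoids this wraparound, which is one reason to prefer it over a bare `-` on 8-bit frames.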

Here's a skeleton for the function. Your task is to complete it using what you've learned so far. You may find the `absdiff` function from `cv2` helpful; see the [documentation here](https://docs.opencv.org/3.4/d2/de8/group__core__array.html#ga6fef31bc8c4071cbc114a758a2b79c14):

In [22]:
def compute_frame_differences(frames):
    """
    Calculate the difference between consecutive frames in a list.

    This function takes a list of frames and computes the absolute difference 
    between each pair of consecutive frames, creating a new "difference frame" 
    that highlights the regions of the frame that have changed.

    Args:
        frames (list of ndarray): A list of numpy arrays where each array 
            represents a frame in the video.

    Returns:
        frame_diffs (list of ndarray): A list of numpy arrays where each array 
            represents the absolute difference between two consecutive frames.
    """
    frame_diffs = []
    for i in range(...):
        diff = ...
        frame_diffs.append(diff)
    return frame_diffs


### Coming back to Image Processing
Before doing anything significant in video processing, we would need to revisit some tools that you have worked with in Image Processing. Load the video `goal.mp4` and extract its frames and frame differences.

In [23]:
frames = ...
frame_diffs = ...

## Part 1: Frequency Analysis
Frequency analysis plays a crucial role in image and video processing, offering a different perspective from the spatial domain representation, which we're more intuitively familiar with.

In the frequency domain, an image is represented by the frequencies of the signals that make up the image. High frequencies correspond to rapid changes in pixel values, such as edges or fine details, while low frequencies correspond to slow changes, such as smooth gradients or larger homogeneous areas.
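To make the low/high frequency distinction concrete, here is a minimal sketch on two synthetic images: a smooth horizontal gradient and a checkerboard that alternates every pixel. The helper `low_frequency_energy_fraction` and the radius of 4 bins are assumptions chosen for illustration; they are not part of the lab's required functions.

```python
import numpy as np

def low_frequency_energy_fraction(image, radius):
    """Fraction of spectral energy within `radius` bins of the zero frequency.

    The mean is removed first so that overall brightness (the DC term)
    does not dominate the result.
    """
    spectrum = np.fft.fftshift(np.fft.fft2(image - image.mean()))
    energy = np.abs(spectrum) ** 2
    rows, cols = image.shape
    y, x = np.ogrid[:rows, :cols]
    near_dc = np.hypot(y - rows // 2, x - cols // 2) <= radius
    return energy[near_dc].sum() / energy.sum()

size = 64
ramp = np.tile(np.linspace(0.0, 1.0, size), (size, 1))  # smooth gradient
yy, xx = np.indices((size, size))
checkerboard = ((yy + xx) % 2).astype(float)             # alternates every pixel

smooth_fraction = low_frequency_energy_fraction(ramp, radius=4)
busy_fraction = low_frequency_energy_fraction(checkerboard, radius=4)
print(f"gradient: {smooth_fraction:.2f}, checkerboard: {busy_fraction:.2f}")
```

The gradient keeps most of its energy near the centre of the shifted spectrum, while the checkerboard's energy sits at the highest representable frequency, far from the centre.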

**Question 1.1:** Write a function that converts an image to the frequency domain. Keep in mind that we write this function for video data, so it takes a list of difference frames rather than a single image. Run the cell below to check your implementation.

In [24]:
def analyze_frequency(frame_diffs):
    """
    Analyzes and visualizes the frequency components of the first difference frame.

    This function takes as input a list of difference frames (typically derived from a sequence 
    of video frames). It then selects the first difference frame, converts it to grayscale, and 
    computes its 2D Fourier Transform. The Fourier Transform is then shifted to center the 
    zero-frequency components. The magnitude spectrum, which represents the distribution of 
    frequencies in the image, is computed and both the original grayscale difference frame and 
    its magnitude spectrum are displayed side by side.

    Parameters:
    - frame_diffs (list of numpy.ndarray): A list of difference frames, where each frame is 
      a 3-channel BGR image (it is converted to grayscale with cv2.COLOR_BGR2GRAY).

    Note:
    - This function uses OpenCV for image processing and matplotlib for visualization.
    - Only the first frame in `frame_diffs` is processed and visualized in this function.
    - The magnitude spectrum is displayed in logarithmic scale for better visibility of details.

    """
    # Select appropriate frame
    example_diff = ...
    gray_diff = cv2.cvtColor(example_diff, cv2.COLOR_BGR2GRAY)
    # Compute Fourier Transform and then perform a shift on it
    f = np.fft.fft2(...)
    fshift = ...
    # Compute associated magnitude of shifted frequency in dB 
    magnitude_spectrum = ...

    plt.subplot(1,2,1), plt.imshow(gray_diff, cmap = 'gray')
    plt.title('Input Image'), plt.xticks([]), plt.yticks([])
    plt.subplot(1,2,2), plt.imshow(magnitude_spectrum, cmap = 'gray')
    plt.title('Magnitude Spectrum'), plt.xticks([]), plt.yticks([])
    plt.show()

In [None]:
analyze_frequency(...) 

After analyzing the frequency content of the frame differences, you may start to notice a correlation between the frequency spectrum and the pace of the video. In a fast-paced video, such as an action game, changes between frames are generally more dramatic, so we expect more high-frequency content. In a slower-paced video, such as a walking simulator, changes between frames are usually subtler, so we anticipate a concentration of lower frequencies.
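As a rough illustration of how pace affects the frame difference, the sketch below moves a bright square across a dark synthetic frame and counts how many pixels change between consecutive frames. The frame size, the square, and the helper `changed_pixels` are all made up for this example.

```python
import numpy as np

def changed_pixels(frame, shift):
    """Count pixels that differ after moving the scene `shift` pixels to the right."""
    next_frame = np.roll(frame, shift, axis=1)
    return int(np.count_nonzero(frame != next_frame))

frame = np.zeros((32, 32), dtype=np.uint8)
frame[12:20, 4:12] = 255  # an 8x8 bright square on a dark background

slow = changed_pixels(frame, shift=1)  # slow motion: 1 pixel per frame
fast = changed_pixels(frame, shift=8)  # fast motion: 8 pixels per frame
print(slow, fast)  # 16 128
```

With slow motion only the leading and trailing edges of the square change; with fast motion the old and new positions no longer overlap, so the difference frame carries far more content.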

**Question 1.2:** We have two games, Game A and Game B. Game A is a fast-paced game with rapid transitions and high-speed movements. Game B is a slow-paced game with slow transitions and minimal movements. Of the two images below, which corresponds to Game A and which to Game B?

In [None]:
img1 = io.imread('img1.png')
plt.imshow(img1)
plt.title("Image 1")

In [None]:
img2 = io.imread('img2.png')
plt.imshow(img2)
plt.title("Image 2")

### Miscellaneous: Switching Phases

For fun, let us swap the phases between the fast-paced game and the slow-paced game and see the results. This involves extracting the frames, writing a function that swaps the phases between two frames, and applying it throughout the video. You may need the following function, which converts polar coordinates to Cartesian form and returns the result as a complex number.

In [28]:
def pol2cart(rho, phi):
    x = rho * np.cos(phi)
    y = rho * np.sin(phi)
    z = x + y*1j
    return z
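As a quick sanity check of the polar convention, the sketch below decomposes a complex number into magnitude and phase with `np.abs` and `np.angle`, then rebuilds it. It repeats the conversion so that it runs on its own, and compares against the standard library's `cmath.rect`. The test value `3 + 4j` is arbitrary.

```python
import cmath
import numpy as np

def pol2cart(rho, phi):
    # Same conversion as above, repeated so this snippet runs on its own.
    return rho * np.cos(phi) + 1j * rho * np.sin(phi)

z = 3 + 4j
rho = np.abs(z)      # magnitude
phi = np.angle(z)    # phase in radians
z_back = pol2cart(rho, phi)

print(rho, phi)
print(z_back)                  # numerically equal to z
print(cmath.rect(rho, phi))    # the standard library gives the same result
```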

**Question 1.3:** Write a function `swap_phase` that takes as argument two complex numbers $z_1$ and $z_2$ and returns complex numbers $z_3$ and $z_4$, such that 

$$
|z_3| = |z_1|, \quad \arg(z_3) = \arg(z_2),
$$
and
$$
|z_4| = |z_2|, \quad \arg(z_4) = \arg(z_1).
$$

In [33]:
def swap_phase(z1, z2):
    """
    Swaps the phase angles of two complex numbers while preserving their magnitudes.

    Parameters:
    - z1 (complex): The first complex number.
    - z2 (complex): The second complex number.

    Returns:
    - tuple: A tuple containing two complex numbers:
        - The first complex number has the magnitude of z1 and the phase angle of z2.
        - The second complex number has the magnitude of z2 and the phase angle of z1.
    """
    z3 = ...
    z4 = ...
    return (z3, z4)


**Question 1.4:** Now, let us extend this function to images (or frames): we swap the phase between two images by taking their Fourier transforms and applying the `swap_phase` function. We then apply this new function to the first frame of each video.

In [34]:
def swap_phase_img(img1, img2):
    """
    Swaps the phase spectrum of the Fourier Transforms of two images.

    This function performs a Fourier transform on both input images, swaps the phase
    spectra while keeping the magnitude spectra unchanged, and then inversely transforms
    back to the spatial domain.

    Parameters:
    - img1 (numpy.ndarray): The first 2D image array.
    - img2 (numpy.ndarray): The second 2D image array.

    Returns:
    - tuple: A tuple containing two 2D image arrays:
        - The first image has the magnitude spectrum of img1 and the phase spectrum of img2.
        - The second image has the magnitude spectrum of img2 and the phase spectrum of img1.

    Note:
    - The output images will be in real domain, with any imaginary components discarded.
    """    
    f1 = ...
    f2 = ...
    swapped1, swapped2 = ...
    img1_swapped = ...
    img2_swapped = ...

    return img1_swapped, img2_swapped
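Two details of NumPy's FFT are easy to trip over here, and the sketch below shows both on random stand-in arrays (not real frames). First, the inverse transform always returns a complex array, even when the imaginary part is only rounding noise, which is why the docstring mentions discarding it. Second, `np.fft.fft2` transforms the last two axes by default, so for a colour frame of shape (rows, cols, channels) you would either convert to grayscale first or pass `axes=(0, 1)`.

```python
import numpy as np

rng = np.random.default_rng(0)
gray = rng.random((4, 6))        # stand-in for a grayscale frame (rows, cols)
colour = rng.random((4, 6, 3))   # stand-in for a colour frame (rows, cols, channels)

# Round trip through the 2D FFT: the result is complex with a negligible imaginary part.
round_trip = np.fft.ifft2(np.fft.fft2(gray))
max_imag = np.abs(round_trip.imag).max()
print(round_trip.dtype, max_imag)

# By default fft2 works on the LAST two axes, which here are (cols, channels).
default_axes = np.fft.fft2(colour)
# Naming the spatial axes transforms rows and cols separately for each channel.
spatial_axes = np.fft.fft2(colour, axes=(0, 1))
print(np.allclose(default_axes, spatial_axes))  # the two results differ
```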

Complete the code below, which takes the first frame from each of the two video clips `game_A.mp4` and `game_B.mp4` and swaps their phases.

In [None]:
#Extract the frames from game_A.mp4 and game_B.mp4 and select the first frame
video_path1 = 'game_A.mp4' 
frames1 = ...

video_path2 = 'game_B.mp4' 
frames2 = ...

example_frame1 = ...
example_frame2 = ... 

#Swap the phases
phase_switched_frame1, phase_switched_frame2 = swap_phase_img(...)

plt.figure()
plt.imshow(phase_switched_frame1.real, cmap='gray')
plt.title('Phase Switched Frame 1')
plt.show()

plt.figure()
plt.imshow(phase_switched_frame2.real, cmap='gray')
plt.title('Phase Switched Frame 2')
plt.show()

If the output shows little recognizable structure, that is fine.

**Question 1.5:** Let us now apply the swapped phase to the whole video and see what happens.

In [None]:
phase_switched_frames_1 = []
phase_switched_frames_2 = []

#Apply the swap_phase_img function to each pair of frames and append the results to the two lists
#The number of iterations depends on the video with fewer frames (Why?)
for i in range(min(len(frames1), len(frames2))): 
    frame1, frame2 = ...
    #Write instructions to append new frames onto the two empty arrays
    ...
    ...

phase_switched_frames_1 = [np.real(frame) for frame in phase_switched_frames_1]
phase_switched_frames_2 = [np.real(frame) for frame in phase_switched_frames_2]

#Apply the frames to video functions for the two new arrays
...
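The real parts produced above are floating-point arrays and may fall outside the 0 to 255 range. Assuming the video writer expects 8-bit frames, which is the common case for OpenCV backends, a conversion along these lines may be needed before writing. The helper name `to_uint8` is hypothetical, and clipping is only one possible choice (rescaling is another).

```python
import numpy as np

def to_uint8(frame):
    """Round a real-valued frame, clip it to [0, 255], and convert it to 8-bit."""
    return np.clip(np.rint(frame), 0, 255).astype(np.uint8)

# Made-up values that include out-of-range entries.
frame = np.array([[-12.4, 0.4], [127.6, 300.0]])
converted = to_uint8(frame)
print(converted)
print(converted.dtype)
```

Clipping before the cast matters: casting out-of-range floats straight to `uint8` does not saturate and can produce arbitrary values.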


## Part 2: Filters
Digital filters are a key concept in signal processing and play a significant role in this lab. We've learned in our coursework how filters can modify signals in useful ways. Images and video frames are just two-dimensional signals (or three-dimensional, if we count the color channels), and filters can help us enhance features, remove noise, and extract useful information from the raw signal. There are many types of digital filters, but we'll focus on four: the **Low-Pass Filter**, the **High-Pass Filter**, the **Edge Detection Filter**, and the **Image Cropping Filter**.
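For intuition about what a low-pass or high-pass filter does in the frequency domain, the sketch below builds the textbook squared-magnitude Butterworth gain on a centred frequency grid. This is an illustration under stated assumptions: the function name, the 64x64 grid, and the cutoff of 0.125 cycles per pixel are made up, and the formula is the standard textbook form, not necessarily identical to what any particular library implements.

```python
import numpy as np

def butterworth_lowpass_gain(shape, cutoff_ratio, order=1):
    """Textbook squared-magnitude Butterworth low-pass gain on a centred grid."""
    rows, cols = shape
    # Frequencies in cycles per pixel, shifted so zero frequency is in the middle.
    fy = np.fft.fftshift(np.fft.fftfreq(rows))[:, None]
    fx = np.fft.fftshift(np.fft.fftfreq(cols))[None, :]
    radius = np.hypot(fy, fx)
    return 1.0 / (1.0 + (radius / cutoff_ratio) ** (2 * order))

lowpass = butterworth_lowpass_gain((64, 64), cutoff_ratio=0.125, order=2)
highpass = 1.0 - lowpass  # complementary high-pass gain

print(lowpass[32, 32], highpass[32, 32])  # zero frequency: passed by low-pass, blocked by high-pass
print(lowpass[32, 40])                    # gain exactly at the cutoff frequency
print(lowpass[0, 0])                      # gain at the highest frequencies
```

A higher `order` makes the transition around the cutoff steeper, while the gain at the cutoff itself stays at one half in this form.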

In [38]:
img = io.imread('fgp.jpg')

**Question 2.1:** We have "black-boxed" the low-pass filter and the edge-detection filter. Can you implement the remaining functions? Furthermore, apply all the filters to the image loaded above. (**Hint:** The image-cropping filter can be implemented in a single line.)

In [1]:
def lowpass_filter(image, cutoff_frequency_ratio=0.01, high_pass=False, order=1):
    """Apply a low-pass Butterworth filter to a colour image.

    This function wraps the skimage.filters.butterworth function
    so that it is applied to each channel of the image individually, and
    then the reconstructed image is returned.

    Args:
        image (array): Array representing a multi-channel image.
        cutoff_frequency_ratio (float): Position of the frequency cutoff as
            a fraction of the length of the spectrum (between 0 and 0.5).
        high_pass (bool): If True, will act as a highpass filter. Otherwise, it
            will be a lowpass filter.
        order (float): The order of the filter; how quickly the frequency
            response drops to 0.

    Returns:
        array: A multi-channel image that has had the Butterworth filter
        applied to each channel independently.
    """
    # Set up a NumPy array of the same size as the image
    filtered_image = np.zeros_like(image)

    # Loop over the channels; apply the filter one channel at a time
    # and store the results in the filtered_image
    for idx in range(filtered_image.shape[2]):
        filtered_image[:,:,idx] = skimage.filters.butterworth(
            image[:,:,idx], 
            cutoff_frequency_ratio=cutoff_frequency_ratio, 
            high_pass=high_pass, 
            order=order
        )
        
    return filtered_image


In [None]:
lpf_img = ...
plt.imshow(lpf_img)

In [41]:
def edge_detector(image):
    """Apply an edge detection filter to a colour image.

    This function wraps the skimage.filters.sobel function
    so that it is applied to each channel of the image individually, and
    then the reconstructed image is returned.

    Add keyword arguments to this function as needed.

    Args:
        image (array): Array representing a multi-channel image.

    Returns:
        array: A multi-channel float image that has had the Sobel filter
        applied to each channel independently.
    """
    filtered_image = np.zeros_like(image, dtype=float)
    
    for idx in range(filtered_image.shape[2]):
        filtered_image[:,:,idx] = skimage.filters.sobel(image[:,:,idx])
        
    return filtered_image

In [None]:
edge_detected_image = ...
plt.imshow(edge_detected_image)

In [43]:
def highpass_filter(image, cutoff_frequency_ratio=0.35, high_pass=True, order=1):
    """Apply a High-Pass Butterworth filter to a colour image. 
    
    This function should wrap the skimage.filters.butterworth function 
    so that it is applied to each channel of the image individually, and
    then the reconstructed image returned.
    
    Args:
        image (array): Array representing a multi-channel image.
        cutoff_frequency_ratio (float): Position of the frequency cutoff as
            a fraction of the length of the spectrum (between 0 and 0.5).
        high_pass (bool): If True, will act as a highpass filter. Otherwise, it
            will be a lowpass filter.
        order (float): The order of the filter; how quickly the frequency
            response drops to 0.

    Returns:
        array: A multi-channel image that has had the Butterworth filter
        applied to each channel independently.
    """
    ...


In [None]:
hpf_img = ...
plt.imshow(hpf_img, cmap='gray')

In [45]:
def crop_image(img, x, y, width, height):
    """
    This function crops an image to the desired size.
    
    Args:
    img (np.array): Input image.
    x (int): x-coordinate of the top left corner of the desired crop.
    y (int): y-coordinate of the top left corner of the desired crop.
    width (int): Desired width of the crop.
    height (int): Desired height of the crop.

    Returns:
    np.array: Cropped image.
    """
    return ...

In [None]:
cropped_image = ...
plt.imshow(cropped_image)
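When working with crop coordinates, it helps to remember how NumPy orders image axes. The sketch below uses a small synthetic array (the sizes are arbitrary) to show that the shape is (height, width, channels), so rows (y) are indexed before columns (x). It also shows that a slice is a view of the original array rather than a copy.

```python
import numpy as np

# A synthetic image that is 4 pixels tall and 6 pixels wide, with 3 colour channels.
img = np.arange(4 * 6 * 3, dtype=np.uint8).reshape(4, 6, 3)
print(img.shape)         # (4, 6, 3): (height, width, channels)

# Rows (y) come first, columns (x) second.
region = img[1:3, 2:5]   # rows 1-2 and columns 2-4
print(region.shape)      # (2, 3, 3)

# Slicing returns a view; call .copy() if the original must stay untouched
# when the region is modified later.
shares_memory = np.shares_memory(region, img)
independent = region.copy()
print(shares_memory, np.shares_memory(independent, img))
```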

## Part 3: Video Compression

Given a short video clip from a game of FIFA, you are required to experiment with different parameters of a low-pass Butterworth filter to reduce the high-frequency content (noise) in the video. You should aim for the highest level of compression (reduced data) without significantly compromising the quality of the video. As part of this exercise, you should also inspect the video frame-by-frame and examine the frequency spectrum of individual frames.
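The link between removing high-frequency content and reducing data can be checked on a toy example. The sketch below is a rough proxy only: it uses a synthetic noisy frame, an ideal (hard cutoff) low-pass mask in place of the Butterworth filter, and `zlib` in place of a real video codec. The helper names and the radius of 4 bins are assumptions made for this illustration.

```python
import zlib
import numpy as np

def ideal_lowpass(image, keep_radius):
    """Zero every frequency bin farther than `keep_radius` from the zero frequency."""
    rows, cols = image.shape
    spectrum = np.fft.fftshift(np.fft.fft2(image))
    y, x = np.ogrid[:rows, :cols]
    distance = np.hypot(y - rows // 2, x - cols // 2)
    spectrum[distance > keep_radius] = 0
    return np.real(np.fft.ifft2(np.fft.ifftshift(spectrum)))

def compressed_size(image):
    """Bytes needed by zlib for the 8-bit version of the image."""
    as_bytes = np.clip(np.rint(image), 0, 255).astype(np.uint8).tobytes()
    return len(zlib.compress(as_bytes, 9))

rng = np.random.default_rng(0)
# Synthetic frame: mid-grey background plus Gaussian noise (made-up data).
noisy = np.clip(128 + rng.normal(0, 20, size=(64, 64)), 0, 255)
smooth = ideal_lowpass(noisy, keep_radius=4)

noisy_size = compressed_size(noisy)
smooth_size = compressed_size(smooth)
print(noisy_size, smooth_size)
```

The filtered frame compresses to fewer bytes because neighbouring pixels become more alike, which is the effect the exercise asks you to trade off against visual quality.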

### Step 1: Extract Frames from Video

The first step will be to extract the frames from the video.

In [47]:
video_path = 'goal.mp4' 
frames = ...

### Step 2: Analyze a Single Frame

Before applying the filter to the entire video, it's a good idea to start with a single frame. This will allow you to get a sense of how the filter affects the image and what parameter values might be appropriate. You can display the frame before and after filtering and also look at the frequency spectrum.
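Alongside a visual comparison, a numeric measure can help when tuning parameters. One common choice is the peak signal-to-noise ratio (PSNR), sketched below for 8-bit frames. This metric is not prescribed by the lab; the function name and the tiny test arrays are assumptions for illustration.

```python
import numpy as np

def psnr(original, processed, max_value=255.0):
    """Peak signal-to-noise ratio in dB; higher means closer to the original."""
    diff = original.astype(np.float64) - processed.astype(np.float64)
    mse = np.mean(diff ** 2)
    if mse == 0:
        return float("inf")  # identical images
    return 10.0 * np.log10(max_value ** 2 / mse)

original = np.zeros((4, 4), dtype=np.uint8)
shifted = original + 5   # every pixel off by 5, so the mean squared error is 25

print(psnr(original, original))
print(psnr(original, shifted))
```

Casting to float before subtracting avoids the 8-bit wraparound that would otherwise distort the error.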

In [None]:
example_frame = ... # take the first frame as an example
filtered_frame = ... # Apply the low-pass filter 

# Display the original and filtered frames for comparison
plt.figure(figsize=(10,5))
plt.subplot(1,2,1), plt.imshow(cv2.cvtColor(example_frame, cv2.COLOR_BGR2RGB)), plt.title('Original')
plt.subplot(1,2,2), plt.imshow(cv2.cvtColor(filtered_frame, cv2.COLOR_BGR2RGB)), plt.title('Filtered')
plt.show()

# Analyze the frequency spectrum of the original and filtered frames
analyze_frequency([example_frame])
analyze_frequency([filtered_frame])

### Step 3: Apply Filter to Entire Video

Once you're satisfied with the result on the example frame, you can apply the filter to the entire video. 

In [50]:
filtered_frames = []
... # Write your code to apply the filter to all the frames

### Step 4: Create Compressed Video

Now that you have the filtered frames, you can convert them back into a video. 

In [51]:
...

## Extra: Using Different Filters
After completing the last exercise, can you think of a way to apply the other filters you made to the same video? Write your code below. For simplicity, when working with the edge detection filter, add the following line to your code before appending each frame:

`normalized_edge_frame = cv2.normalize(edge_frame, None, alpha=0, beta=255, norm_type=cv2.NORM_MINMAX, dtype=cv2.CV_8U)`
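The normalization line above rescales the floating-point edge frame so that its smallest value maps to 0 and its largest to 255, then stores it as 8-bit. A roughly equivalent NumPy sketch is shown below to make the arithmetic visible. The helper `minmax_to_uint8` is hypothetical, and returning zeros for a flat image is a choice made here, not a statement about how OpenCV handles that case.

```python
import numpy as np

def minmax_to_uint8(image):
    """Rescale so the minimum maps to 0 and the maximum to 255, then convert to 8-bit."""
    image = image.astype(np.float64)
    low, high = image.min(), image.max()
    if high == low:  # flat image: avoid dividing by zero
        return np.zeros(image.shape, dtype=np.uint8)
    scaled = (image - low) / (high - low) * 255.0
    return np.rint(scaled).astype(np.uint8)

# Made-up edge strengths in the range a float edge filter might produce.
edges = np.array([[0.0, 0.1], [0.3, 0.4]])
normalized = minmax_to_uint8(edges)
print(normalized)
print(normalized.dtype)
```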

In [55]:
...