## **VidPCA: PCA for Video Compression**
### Introduction

In this tutorial, we apply Principal Component Analysis (PCA), computed via the Singular Value Decomposition (SVD), to video compression, using Octave as our tool. Our example is a historical piece: 'Sallie Gardner at a Gallop', a video dating back to 1878.

In [43]:
%%html
<img src="original_movie.gif" style = 'width:100%;'>

We first create an Octave function that turns a directory of JPEG files into a GIF:

In [1]:
% Directory containing the frames, file-name prefix, and number of frames
image_directory = "Frames";
image_prefix = "ezgif-frame-";
num_frames = 14;

function createGIF(image_directory, image_prefix, output_name, num_frames)
    % Create an animated GIF from a series of numbered JPEG images
    %
    % Arguments:
    %   image_directory - Directory containing the images
    %   image_prefix - Prefix for the image files
    %   output_name - File name of the GIF to write
    %   num_frames - Number of frames in the animation

    % Loop through each frame and add it to the GIF
    for i = 1:num_frames
        image_path = [image_directory, "/", image_prefix, num2str(i), ".jpg"];
        img = imread(image_path);
        
        % For the first frame, create the GIF
        % For subsequent frames, append to the GIF
        if i == 1
            imwrite(img, output_name, 'gif', 'LoopCount', Inf, 'DelayTime', 0.1);
        else
            imwrite(img, output_name, 'gif', 'WriteMode', 'append', 'DelayTime', 0.1);
        end
    end
end

createGIF(image_directory, image_prefix, 'original_movie.gif', num_frames)

### The tensor of frames

Let's store our frames in a tensor (a 3-D array) that we can easily manipulate:

`grayscale_images(:, :, i)` will contain the matrix of pixel intensities of the i-th frame. The values stay on the 0 to 255 scale of 8-bit JPEG files, since the code below does not rescale them.

In [3]:
% Load Images and Initialize Arrays
first_image = imread([image_directory, "/", image_prefix, "1.jpg"]);
image_size = size(first_image);
grayscale_images = zeros(image_size(1), image_size(2), num_frames); % Initialize array for grayscale images

for i = 1:num_frames % Create a tensor containing the frames
    image = imread([image_directory, "/", image_prefix, num2str(i), ".jpg"]);
    if size(image, 3) == 3 % Check if the image is RGB
        grayscale_images(:, :, i) = rgb2gray(image); % Convert to grayscale
    else
        grayscale_images(:, :, i) = image; % Use grayscale image directly
    end
end
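
A note on value ranges: converting image data to `double` changes its type, not its scale. The sketch below uses a made-up 2x2 frame (`tiny_frame` is a hypothetical stand-in for the `uint8` output that `imread` typically gives for 8-bit JPEG files) to show the difference between the raw scale and a rescaled [0, 1] version.

```octave
% Hypothetical 2x2 "frame" standing in for the uint8 output of imread
tiny_frame = uint8([0 64; 128 255]);

% Converting the type alone keeps the 0..255 scale
raw = double(tiny_frame);

% Dividing by 255 maps the values into [0, 1]
scaled = raw / 255;

printf("raw range:    [%g, %g]\n", min(raw(:)), max(raw(:)));
printf("scaled range: [%g, %g]\n", min(scaled(:)), max(scaled(:)));
```

Rescaling only multiplies every singular value by the same constant, so it should not change which components matter.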


To apply the Singular Value Decomposition, we flatten each frame of the tensor into a column vector, so that the whole movie becomes a matrix with one column per frame.


In [4]:
flattened_images = zeros(image_size(1) * image_size(2), num_frames); % Initialize array for reshaped grayscale images

for i = 1:num_frames % Flatten the tensor into a matrix
    flattened_images(:, i) = grayscale_images(:, :, i)(:);
end
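
Since Octave stores arrays in column-major order, the same flattening can also be written as a single `reshape`. A minimal sketch on a small random tensor (all names here are hypothetical and chosen so they do not clash with the tutorial's variables):

```octave
% Hypothetical small tensor: 4x3 pixels, 5 frames
rows = 4; cols = 3; n = 5;
demo_tensor = rand(rows, cols, n);

% Loop version: one column per frame
loop_flat = zeros(rows * cols, n);
for i = 1:n
    frame = demo_tensor(:, :, i);
    loop_flat(:, i) = frame(:);
end

% Single reshape: [] lets Octave infer the number of rows
reshape_flat = reshape(demo_tensor, [], n);

disp(isequal(loop_flat, reshape_flat));
```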

Now we can apply the SVD to obtain the principal components (this should take a few seconds):

In [5]:
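% Perform the economy-size Singular Value Decomposition (SVD).
% The full U would be (number of pixels) x (number of pixels), which is far
% larger than needed: only its first num_frames columns are used below.
[U, S, V] = svd(flattened_images, "econ");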
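
`svd` can return either the full or the economy-size factorization. For a matrix with many more rows than columns, the difference in the size of `U` is large. The following sketch compares the two on a small hypothetical matrix (6 "pixels" by 3 "frames"); the variable names are illustrative only.

```octave
% Hypothetical tall matrix: 6 "pixels" by 3 "frames"
A_small = rand(6, 3);

[U_full, S_full, V_full] = svd(A_small);          % U_full is 6x6, S_full is 6x3
[U_econ, S_econ, V_econ] = svd(A_small, "econ");  % U_econ is 6x3, S_econ is 3x3

printf("full: U %dx%d, S %dx%d\n", size(U_full), size(S_full));
printf("econ: U %dx%d, S %dx%d\n", size(U_econ), size(S_econ));

% Both factorizations reproduce the matrix up to rounding error
err_full = norm(U_full * S_full * V_full' - A_small, "fro");
err_econ = norm(U_econ * S_econ * V_econ' - A_small, "fro");

% The columns of U are orthonormal in the economy-size version too
orth_err = norm(U_econ' * U_econ - eye(3), "fro");
```

The singular values are the same in both cases, so either form can be used for the truncation that follows.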

Recall that the columns of `U` and `V` are orthonormal, whereas `S` is diagonal.
The diagonal entries of `S`, the singular values, measure how much each principal component contributes to the movie. Let's have a look:

In [6]:
diag(S)

ans =

   91283.60040
    8885.60764
    6836.43142
    5497.15298
    4494.28975
    4310.73219
    3834.10391
    3636.69352
    3458.88776
    3184.77091
    2998.70521
    2883.37220
    2843.93734
    2472.01146
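
One common way to quantify how fast these values decay is the cumulative "energy": the fraction of the sum of squared singular values captured by the first k components. The sketch below copies the values printed above into a vector; `s`, `energy`, and the chosen values of `k` are illustrative names and choices, not part of the original notebook.

```octave
% Singular values copied from the output above
s = [91283.60040; 8885.60764; 6836.43142; 5497.15298; 4494.28975; ...
     4310.73219; 3834.10391; 3636.69352; 3458.88776; 3184.77091; ...
     2998.70521; 2883.37220; 2843.93734; 2472.01146];

% Fraction of the squared Frobenius norm captured by the first k components
energy = cumsum(s .^ 2) / sum(s .^ 2);

for k = [1 2 5 14]
    printf("k = %2d: %.4f\n", k, energy(k));
end
```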



As you can see, the singular values decay quickly: the first is about ten times larger than the second, and the later ones are smaller still. This indicates that some compression is viable. Let's keep only the first 5 singular values and set the rest to zero:

In [70]:
% Number of principal components to retain
num_components = 5;

% Compress the video by retaining only a subset of the principal components
compressed_flattened_images = U(:, 1:num_components) * S(1:num_components, 1:num_components) * V(:, 1:num_components)';
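
By the Eckart-Young theorem, the Frobenius-norm error of a truncated SVD equals the square root of the sum of the squared discarded singular values. The sketch below checks this on a random stand-in matrix; the `_demo` names and the sizes are hypothetical and chosen so that the notebook's own `U`, `S`, and `V` are not overwritten.

```octave
% Hypothetical stand-in for the frame matrix: 50 "pixels" by 8 "frames"
A_demo = rand(50, 8);
[U_demo, S_demo, V_demo] = svd(A_demo, "econ");
s_demo = diag(S_demo);

k_demo = 3;  % number of components to keep
A_k = U_demo(:, 1:k_demo) * S_demo(1:k_demo, 1:k_demo) * V_demo(:, 1:k_demo)';

% Measured error of the rank-k approximation
measured = norm(A_demo - A_k, "fro");

% Error predicted from the discarded singular values
predicted = sqrt(sum(s_demo(k_demo+1:end) .^ 2));

printf("measured %.6f, predicted %.6f\n", measured, predicted);
```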

The matrix `compressed_flattened_images` is now a rank-5 approximation of the matrix `flattened_images`. Let's reconstruct our compressed movie:

In [71]:
% Reconstruct the compressed video tensor
compressed_images = zeros(image_size(1), image_size(2), num_frames);
for i = 1:num_frames
    current_matrix = reshape(compressed_flattened_images(:,i), image_size);
    % Normalize the values of the matrix to the range [0,1]
    min_val = min(current_matrix(:));
    max_val = max(current_matrix(:));
    compressed_images(:, :, i) = (current_matrix - min_val) / (max_val - min_val);
end
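
Min-max rescaling stretches the contrast of each frame independently, so overall brightness may differ slightly from frame to frame. One possible alternative, sketched here under the assumption that the original intensities lie on the 0 to 255 scale, is to clip to that range and divide by 255. The matrix `recon` is a made-up example, not data from the movie.

```octave
% Hypothetical reconstructed frame with slight overshoot outside 0..255
recon = [-3.2 40.0; 130.5 259.1];

% Per-frame min-max rescaling
minmax = (recon - min(recon(:))) / (max(recon(:)) - min(recon(:)));

% Fixed-scale alternative: clip to the original range, then divide by 255
fixed = min(max(recon, 0), 255) / 255;

disp(minmax);
disp(fixed);
```

Which variant looks better depends on the footage; the fixed scale keeps the same mapping for every frame.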

We will need an Octave function that turns our tensor back into a directory of JPEG files:

In [72]:
function saveImagesFromTensor(images_tensor, output_directory, file_prefix)
    % Check if the output directory exists, if not create it
    if ~exist(output_directory, 'dir')
        mkdir(output_directory);
    end

    % Get the number of frames/images in the tensor
    num_frames = size(images_tensor, 3);

    % Iterate over each frame/image in the tensor
    for i = 1:num_frames
        % Extract the ith grayscale image
        image = images_tensor(:, :, i);
        
        % Construct the filename for the image
        filename = fullfile(output_directory, [file_prefix, num2str(i), '.jpg']);

        % Save the image
        imwrite(image, filename);
    end
end

saveImagesFromTensor(compressed_images, 'CompressedFrames', 'frame_')

As before, let's turn our movie into a GIF:

In [73]:
createGIF('CompressedFrames', 'frame_', 'compressedmovie.gif', num_frames)

In [74]:
%%html
<img src="compressedmovie.gif" style = 'width:100%;'>

Notice: even though we keep only 5 of the 14 components, roughly **35% of the original data**, we can still see the movement clearly. Let's keep analyzing the PCA.
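
The percentage can be checked by counting stored numbers: a rank-k approximation needs k columns of `U`, k singular values, and k columns of `V`. The sketch below uses a hypothetical frame size of 300x200 pixels (the real size is whatever `image_size` holds) and counts matrix entries only; it says nothing about the size of the resulting GIF file.

```octave
% Hypothetical frame size; the real value comes from image_size
n_pixels = 300 * 200;
n_frames = 14;
k_kept = 5;

% Entries of the original pixels-by-frames matrix
original_count = n_pixels * n_frames;

% Rank-k storage: k columns of U, k singular values, k columns of V
compressed_count = k_kept * (n_pixels + 1 + n_frames);

ratio = compressed_count / original_count;
printf("stored fraction: %.4f\n", ratio);
```

For frames with many pixels the ratio is close to `k_kept / n_frames`.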

### Principal components

The components of the PCA correspond to the columns of the matrix `U`. Reshaping each column back to the frame size shows what the corresponding component looks like as an image.

In [75]:
% Construct an empty tensor for the PCA components
PCA_images = zeros(image_size(1), image_size(2), num_frames);
for i = 1:num_frames
    current_matrix = reshape(U(:,i), image_size);
    % Normalize the values of the matrix to the range [0,1]
    min_val = min(current_matrix(:));
    max_val = max(current_matrix(:));
    % Store the result in PCA_images, leaving compressed_images untouched
    PCA_images(:, :, i) = (current_matrix - min_val) / (max_val - min_val);
end

saveImagesFromTensor(PCA_images, 'PCAframes', 'PCA_')

In [76]:
%%html
<img src="PCAframes/PCA_1.jpg" style = "width:60%">

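We can interpret this as the background, which remains roughly constant throughout the movie. Since the frames were not mean-centered before the SVD, this first component is expected to be close to the average frame.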
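
The link between the first component and the static part of the scene can be illustrated on synthetic data. In this sketch, every frame is a fixed hypothetical "background" plus a small random perturbation, and the first left singular vector is compared with the mean frame. All names and sizes are made up for the illustration.

```octave
% Synthetic data: a fixed "background" plus small per-frame changes
n_pix = 200;
n_fr = 10;
background = linspace(0.5, 1, n_pix)';        % same in every frame
motion = 0.05 * randn(n_pix, n_fr);           % small frame-to-frame variation
frames_demo = background * ones(1, n_fr) + motion;

[U_bg, S_bg, V_bg] = svd(frames_demo, "econ");

% Cosine of the angle between the first component and the mean frame
% (U_bg(:, 1) has unit norm; the sign of a singular vector is arbitrary)
mean_frame = mean(frames_demo, 2);
cosine = abs(U_bg(:, 1)' * mean_frame) / norm(mean_frame);
printf("|cos| between U(:,1) and mean frame: %.4f\n", cosine);
```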

In [33]:
%%html
<img src="PCAframes/PCA_2.jpg" style = "width:60%">

This one is also easy to interpret. The motion has two "orthogonal" states: the horse is either contracted (area shaded in black) or extended (area shaded in white).
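
Each component image is shared by all frames. How strongly it appears in frame i, and with which sign, is given by the product of the singular value and the matching entry of `V`. Under that reading, the sign of the weight would select between the black and white states. A minimal sketch on a random stand-in matrix (the `_w` names are hypothetical):

```octave
% Hypothetical data: 30 "pixels" by 6 "frames"
A_w = rand(30, 6);
[U_w, S_w, V_w] = svd(A_w, "econ");

% Row i of weights holds the coefficients of frame i on each component
weights = V_w * S_w;

% Rebuild frame 4 as a weighted sum of the component images U_w(:, k)
frame_index = 4;
frame_rebuilt = U_w * weights(frame_index, :)';

printf("reconstruction error: %.2e\n", norm(frame_rebuilt - A_w(:, frame_index)));
```

Plotting a column of `weights` against the frame number would show how one component evolves over time.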

In [34]:
%%html
<img src="PCAframes/PCA_3.jpg" style = "width:60%">

This third component can be interpreted in terms of two states as well:
- the legs of the horse and the head of the man are in front (in black), or
- they are at the back (in white).

Later components are harder to interpret.

In [36]:
%%html
<img src="PCAframes/PCA_4.jpg" style = "width:60%">

This fourth component seems to be encoding the relative height of the neck and tail of the horse, as well as some of the leg movement.

The least important contributions are the last components. Let's see the last one:

In [42]:
%%html
<img src="PCAframes/PCA_14.jpg" style = "width:60%">

Even when the rank of the compressed matrix is just 2 (that is, with `num_components = 2`), one can get a decent idea of the movement of the horse:

In [79]:
%%html
<img src="compressedrank2.gif" style = "width:100%">

### Conclusions

By now, you've seen how Principal Component Analysis (PCA) via Singular Value Decomposition (SVD) can effectively compress video data. The 'Sallie Gardner at a Gallop' example illustrates PCA's capability to retain crucial visual information while keeping only a fraction of the components. This method is particularly beneficial for situations with limited storage or bandwidth.

### Further exploration

Feel free to apply these techniques to your own sets of frames. To get started, you can find a variety of frame files in the repository, in the folder `AlternativeFrameSets`, suitable for experimenting with PCA compression.