# Classification and Feature Selection Using Regression
<hr>

In this notebook, we use linear regression with L1 and L2 loss functions, as well as Orthogonal Matching Pursuit (OMP), to extract the most important samples in our dataset and to classify our data.

Classification via regression is clearly not an efficient method; however, for teaching purposes it is a fun exercise to see how regression can be used to classify data.

The main obstacle is that the output of a regression model is continuous, whereas classification requires discrete values such as 0, 1, 2, etc.; therefore, we need to convert one into the other. Regression models can also represent the relationship between data points (AKA samples): samples with larger weights ($|| W ||$) tend to have more influence on the original dataset and are therefore considered more important. This idea can be used for both classification and summarization.
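
As a minimal sketch of the conversion from continuous outputs to discrete labels, assuming integer class labels 0, 1, ..., K-1, a regression output can be rounded to the nearest label and clipped to the valid range. The helper name `to_class_labels` and the sample values are illustrative assumptions and are not part of this notebook.

```python
import numpy as np

def to_class_labels(y_pred, n_classes):
    """Map continuous regression outputs to integer class labels.

    Hypothetical helper: rounds to the nearest integer and clips the
    result to the valid label range [0, n_classes - 1].
    """
    rounded = np.rint(np.asarray(y_pred, dtype=float))
    return np.clip(rounded, 0, n_classes - 1).astype(int)

# Illustrative regression outputs for a 3-class problem
y_pred = [0.2, 0.9, 1.6, 2.4, -0.3]
labels = to_class_labels(y_pred, n_classes=3)
print(labels)
```

Rounding is only one possible rule; other schemes (for example one regression output per class followed by an argmax) fit the same idea.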
<hr> 
    
Everything required for this exercise is available at : 
   
 
    
   
***GitHub***  : <a href = "https://github.com/A-M-Kharazi/Machine-Learning-TMU.git" > Main (class) repo </a> 
    &nbsp;&nbsp;&nbsp;
    <a href = "https://github.com/A-M-Kharazi/Machine-Learning-TMU/tree/main/Questions%20and%20Homeworks/HW-Series2" > This Document page</a>
    
    
***GoogleDrive*** : <a href = "" > Not available ATM </a>

#  Import libraries
    
<hr>
 
    
These libraries must be loaded first so that the rest of the notebook can run.
    
-  numpy, pandas, and matplotlib are used for basic tasks on our data such as reading, creating dataframes, visualization, and mathematical operations.
   

- OpenCV (cv2) and imageio are used to read and write GIF/video data (for the video summarization task)

    
-  PyTorch (torch) is used to build linear regression with an L1 or L2 loss function; sklearn LinearRegression (ordinary least squares, i.e. linear regression with the L2 loss function) is imported later, where it is needed

    
-  ....
    
 
- sklearn OrthogonalMatchingPursuit is the OMP implementation


- sklearn datasets is used to fetch the ORL (Olivetti faces) dataset.
    
If any of these are not installed, please <code> pip install </code> them.


In [None]:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import cv2
import imageio

import torch # Linear Regression using L1, L2 Loss Function
from torch.autograd import Variable
from sklearn.linear_model import OrthogonalMatchingPursuit as OMP

import sklearn.datasets
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_squared_error, f1_score, accuracy_score


#  Import data
    
<hr>
    
The primary data path is a directory called "**Data**", located in the main directory. Inside Data there is another directory called H2, which contains the data used in the video part of this exercise.


- Video Data : These data usually come in .gif format. To obtain a 4D video matrix we propose the video_matrix function.
    This function reads the data using OpenCV (cv2) and extracts its frames. Each frame is an image of size X by Y, and each pixel has an intensity C (which can be grayscale, RGB, etc.)
    
    Therefore video_matrix returns a 4D array of the form [F, X, Y, C], where F is the frame index, X and Y are the pixel coordinates within the image, and C is the color/intensity of pixel (x, y), which can itself be an array (as in RGB).
    
    
- ORL Data : The complete ORL description is given in a later section. To obtain this data we use the fetch method from sklearn.datasets. 


## Video Operation Functions

These functions are used for processing and visualizing our video data. 

### Video_matrix Function

This function returns a video in matrix format, that is, a 4D matrix containing the frames, the pixel locations, and their intensities.

In [None]:
# This function converts the video to a 4D matrix 
# First Dimension is for frames
# Second is the image X 
# Third is the image Y
# Last Dimension is the Color

def video_matrix(video_path):
    
    # Read the Video 
    
    video = cv2.VideoCapture(video_path)
    
    matrix = []
    # Iterate through frames
    while True:
        # Capture a frame
        flag, frame = video.read()
        if not flag:
            break
        
        matrix.append(frame)
    video.release()
    return matrix   

### show_video Function

This function plays our video data. It takes a video matrix (which can be obtained via the previous function), a waitkey (default = 0), a closekey (default = q), and a window title.

To properly play the video, you need to **HOLD** a key on your keyboard, because with waitkey = 0 each frame waits for a key press. Normally the video closes after it iterates through all its frames; however, if you wish to close it before that happens, press the closekey button.


Showing such videos can be tricky and, if you are not careful, your notebook might crash.

In [None]:
# This function shows a video using the cv2 library
# It can cause the notebook to crash if you are not careful
# Input :   
# matrix # is the 4D matrix retrieved from the video_matrix function 
# waitkey # is the delay passed to cv2.waitKey; with 0 each frame waits for a key press
# To properly observe the video, you need to HOLD any key other than closekey
# closekey # is the key to close the video. Normally the video will close after
# one whole iteration through its frames, however if you wish to close it before
# that happens, you can use this key. 

def show_video(matrix, waitkey = 0 , closekey = 'q', title = 'video'):
    N = np.shape(matrix)[0]
    print(f'Press {closekey} to exit and any other key to proceed through video')
    for i in range(N):
        cv2.imshow(title,matrix[i])
        close = cv2.waitKey(waitkey) & 0xFF  # keep only the low byte (the key code)
        if close == ord(closekey):
            break
    
    cv2.destroyAllWindows()    

### Video_matrix_to2D Function

This function converts each frame to a 1D array, so the 4D video matrix becomes a 2D video matrix. The conversion is done with NumPy's flatten() method. With this representation each sample is a vector that we can use to fit our regression models.

Considering original video matrix $\in \mathbb{R}^{F\times X \times Y \times C}$, the new video matrix 
$\in \mathbb{R}^{F\times XYC}$.
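
As a small check of this shape change, the sketch below builds a synthetic array in place of a real video (the sizes are arbitrary assumptions) and confirms that flattening frame by frame gives the same $F \times XYC$ matrix as a single `reshape`.

```python
import numpy as np

# Synthetic stand-in for a video: 5 frames of 4 x 6 pixels with 3 channels
F, X, Y, C = 5, 4, 6, 3
video = np.arange(F * X * Y * C).reshape(F, X, Y, C)

# Flatten frame by frame, one vector per frame
per_frame = np.array([frame.flatten() for frame in video])

# Equivalent single reshape: keep the frame axis, merge the other three
reshaped = video.reshape(F, -1)

print(per_frame.shape)
print(np.array_equal(per_frame, reshaped))
```

The single `reshape` avoids the Python loop, which may matter for long videos; it assumes the frames are already stacked into one NumPy array.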

In [None]:
# This function flattens each frame so that each sample is a vector

def video_matrix_to2D(video_matrix):
    new_matrix = []
    for frame in video_matrix:
        # use flatten() to 1D the 3D data
        sample = frame.flatten()
        new_matrix.append(sample)
    return new_matrix   

### Video_save Function

This function takes the video_matrix and saves it in the given folder as a .gif file. Since the images should be stored as RGB, each frame is converted before saving.

OpenCV uses the BGR channel order, whereas imageio uses RGB.
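
To illustrate what the channel-order conversion does, here is a NumPy-only sketch on a synthetic frame (an assumption for illustration; the notebook itself uses `cv2.cvtColor`). For a 3-channel frame, reversing the last axis swaps the blue and red channels.

```python
import numpy as np

# One synthetic 2 x 2 frame in BGR order: pure blue everywhere
bgr_frame = np.zeros((2, 2, 3), dtype=np.uint8)
bgr_frame[..., 0] = 255  # channel 0 is blue in BGR order

# Reversing the last axis swaps B and R, giving RGB order
rgb_frame = bgr_frame[..., ::-1]

print(rgb_frame[0, 0])
```

In RGB order the blue value now sits in the last channel, which is the order imageio writes.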

In [None]:
# This function saves the video matrix as a .gif image

def video_save(video_matrix, name, path = '../Data/Summary/'):
    
    # convert frames to RGB
    
    new_frames = []
    for frame in video_matrix:
        rgb_frame = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)
        new_frames.append(rgb_frame)
        
    imageio.mimsave(path + name, new_frames)
    
    
    

## Obtain Data

The methods for obtaining the data were discussed in the previous sections. The ORL dataset is loaded with sklearn.datasets, and each video is the video matrix returned by the video_matrix function.


**Download the videos from the link mentioned in the 'Data' folder.**

In [None]:
# Videos

path = '../Data/H2'

# Video 1 

video1 = video_matrix(path + '/Data1.gif')

# Video 2 

video2 = video_matrix(path + '/Data2.gif')

# Video 3 

video3 = video_matrix(path + '/Data3.gif')

# Video 4 

video4 = video_matrix(path + '/Data4.webm')

# Videos

Video = {1: video1, 2: video2, 3: video3, 4:video4}

# ORL

ORL = sklearn.datasets.fetch_olivetti_faces()
ORL_images = ORL['images']
ORL_target = ORL['target']
ORL_data = ORL['data'] # Is the flatten() version of each image
ORL_desc = ORL['DESCR']


# Video Dataset

<hr>

In this section we showcase and describe our dataset 

## Video Description

In [None]:
# Print the frame count, image size, and channel count of every video

for key in Video:
    shape = np.shape(Video[key])
    print('Information about Video ', key)
    print('#################')
    print('It has ( ', shape[0],' ) frames')
    print('Each image has ( ', shape[1],' by', shape[2] ,' ) pixels')
    print('Each pixel has ( ', shape[3] ,' ) Colors')
    print('\n\n')


## Video showcase (Press 'q' to exit | Hold any other key to play)

In [None]:
# Video 1

show_video(Video[1], title = "Data1")

# Video 2

show_video(Video[2], title = "Data2")

# Video 3

show_video(Video[3], title = "Data3")


# Video 4

show_video(Video[4], title = "Data4")




# ORL Dataset

<hr>

In this section we showcase and describe our dataset 

## ORL Description

In [None]:
print(ORL_desc)

## ORL Visualization (Ground Truth)

In [None]:
# Ground Truth

fig = plt.figure(figsize=(20,20))
indx = 1
for img in ORL_images:
    ax = fig.add_subplot(20,20,indx)
    ax.imshow(img, cmap = 'gray')
    ax.grid(False)
    ax.axis('off')
    # add label aka target aka class 
    ax.text(img.shape[0]/2, img.shape[1]/2, str(ORL_target[indx-1]), fontsize = 20, color = 'red', fontweight='bold')
    indx+=1
fig.tight_layout()
plt.show()

# 0. Linear Regression (Using PyTorch)

<hr>

Since we are going to use OMP and Linear Regression (with L1 and L2 loss functions) in this notebook, we need to define these methods. 

Sklearn provides OMP and Linear Regression (with the L2 loss function). Linear Regression with the L1 loss function has to be built from scratch. 

Therefore, we use PyTorch to build our Linear Regression method (with either the L1 or the L2 loss function). 


  $\bullet$ LinearRegressionModel is a Linear Regression model that can use either the L1 or the L2 loss function.
  
First define the data dimensions; then set criterion_mode to 'l1' to optimize the L1 loss (the default is 'l2'). max_iter is the number of epochs and defaults to 500. The optimizer learning rate defaults to 0.01.

The model also has a fit function, which fits the model using the chosen criterion (L1 or L2 loss function) and the SGD optimizer. 

If log is set to True, it prints the loss at every epoch. 
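
To make the difference between the two criteria concrete, here is a minimal NumPy sketch of full-batch gradient descent with either loss. It is a stand-in for illustration, not the PyTorch class used in this notebook; the function name `fit_linear` and the toy data are assumptions.

```python
import numpy as np

def fit_linear(X, y, loss="l2", lr=0.1, max_iter=500):
    """Full-batch gradient descent for y ~ X @ w (no bias term).

    Hypothetical NumPy stand-in: 'l2' minimizes the mean squared error,
    'l1' minimizes the mean absolute error.
    """
    n = X.shape[0]
    w = np.zeros(X.shape[1])
    for _ in range(max_iter):
        residual = X @ w - y
        if loss == "l1":
            grad = X.T @ np.sign(residual) / n   # subgradient of the MAE
        else:
            grad = 2.0 * X.T @ residual / n      # gradient of the MSE
        w -= lr * grad
    return w

# Small noiseless problem with known weights [2, -3]
X = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [1.0, -1.0]])
w_true = np.array([2.0, -3.0])
y = X @ w_true

w_l2 = fit_linear(X, y, loss="l2", lr=0.1, max_iter=500)
w_l1 = fit_linear(X, y, loss="l1", lr=0.01, max_iter=2000)
print(w_l2, w_l1)
```

With a fixed step size the L1 version only settles into a small neighborhood of the solution, because its subgradient does not shrink near the optimum; the L2 gradient does shrink, so that version converges smoothly.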



In [None]:
class LinearRegressionModel(torch.nn.Module):

    def __init__(self, input_dim, output_dim, criterion_mode = 'l2', max_iter = 500, optimizer_lr = 0.01, _bias = False):
        # Initialize torch.nn.Module before setting any attributes
        super(LinearRegressionModel, self).__init__()
        self.optimizer_lr = optimizer_lr
        self.criterion_mode = criterion_mode
        self.max_iter = max_iter
        self.linear = torch.nn.Linear(input_dim, output_dim, bias = _bias)

    def forward(self, x):
        y_pred = self.linear(x)
        return y_pred
    
    def fit(self, X, Y, log = True):
        
        if  self.criterion_mode == 'l1':
            criterion = torch.nn.L1Loss()
        else:
            criterion = torch.nn.MSELoss()
            
        optimizer = torch.optim.SGD(self.parameters(), lr = self.optimizer_lr)
        
        
        for epoch in range(self.max_iter):
            
            # Forward pass: Compute predicted y by passing
            # x to the model
    
            Y_pred = self(X)
        
            # Compute and print loss
        
            loss = criterion(Y_pred, Y)
        
            # Zero gradients, perform a backward pass,
            # and update the weights.
            
        
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()
        
            # print loss per epoch (if log is set to True)
            if log :
                print('epoch {}, loss {}'.format(epoch, loss.item()))
        
        return loss.item()
    def coef(self):
        
        return self.linear.weight.detach().numpy().tolist()
    

# 1. Video Summarization 

<hr>

Our objective is to use Linear Regression (with L1 and L2 loss functions) and OMP to find the 20 percent most important frames within each video. 


For each frame, the goal is to reconstruct that frame from the remaining frames. Consider Video1 $\in \mathbb{R}^{188 \times 128 \times 220 \times 3}$. By flattening each frame, each sample is $\in \mathbb{R}^{84480}$ ($84480 = 128\times220\times 3$), therefore video1 $\in \mathbb{R}^{188\times 84480}$.


Assume we have $X_1, X_2, \dots, X_{188}$ as our samples. Our objective is to build $X_1$ using $X_2,\dots, X_{188}$. To do that, we can create $X_1$ as a linear combination of $X_2, \dots , X_{188}$. 

$$
\begin{aligned}
X_1 &= W_0 + W_1(0) + W_2 X_2 + \dots + W_{188} X_{188}\\
X_1 &= W^T  \Phi(X_1)
\end{aligned}
$$
where:
$$
W^T = \begin{bmatrix} W_0, W_1, \dots, W_{188}\end{bmatrix}\hspace{1cm}\text{and}\hspace{1cm}
\Phi(X_1) = \begin{bmatrix}
1\\0\\X_2\\\vdots\\X_{188}
\end{bmatrix}
$$

We can easily generalize this equation for $X_i$:

$$
\begin{aligned}
X_i &= W_0 + W_1X_1 + W_2X_2 + \dots + W_{i-1}X_{i-1} + W_i(0) + W_{i+1} X_{i+1}+\dots+ W_{188} X_{188}\\
X_i &= W^T  \Phi(X_i)
\end{aligned}
$$
where:
$$
W^T = \begin{bmatrix} W_0, W_1, \dots, W_{188}\end{bmatrix}\hspace{1cm}\text{and}\hspace{1cm}
\Phi(X_i) = \begin{bmatrix}
1\\X_1\\
\vdots
\\
X_{i-1}\\0\\X_{i+1}
\\
\vdots
\\X_{188}
\end{bmatrix}
$$


Video1 itself is written as $X$, that is $\begin{bmatrix}X_1\\X_2\\\vdots\\X_{188}\end{bmatrix}$; the reconstructed video1 is written as $\hat X$, that is $\begin{bmatrix}W^T\Phi(X_1)\\W^T\Phi(X_2)\\\vdots\\W^T\Phi(X_{188})\end{bmatrix}$.


To find the best model, we need to optimize $W$ so that $Loss(X, \hat X )$ is minimized.


Therefore, we can present a regression model (L1, L2 , or OMP) to find the best $W$.


$$
\hat X = \Phi W
$$

$$
\Phi = \begin{bmatrix}
\Phi(X_1)^T\\
\Phi(X_2)^T\\
\vdots\\
\Phi(X_{188})^T
\end{bmatrix}
= 
\begin{bmatrix}
1,0,X_2,\dots, X_{188}\\
1, X_1, 0 , \dots, X_{188}\\
\vdots\\
1, X_1, X_2, \dots, 0
\end{bmatrix}
$$

$$
W = \begin{bmatrix}
W_0\\
W_1\\
\vdots\\
W_{188}
\end{bmatrix}
$$

$$
\hat X = \begin{bmatrix}
W_0 + W_1(0) + W_2 X_2 + \dots + W_{188}X_{188}\\
W_0 + W_1 X_1 + W_2(0) + \dots + W_{188}X_{188}\\
\vdots\\
W_0 + W_1 X_1 + W_2 X_2 + \dots + W_{188}(0)\\
\end{bmatrix}
=
\begin{bmatrix}
\hat X_1\\
\hat X_2\\
\vdots\\
\hat X_{188}
\end{bmatrix}
$$


Considering $W$, if we ignore $W_0$, each $W_i$ represents the contribution of sample $i$ to the reconstruction of video1.
This means that if $|W_i| \geq |W_j|$, then sample $i$ is more important than sample $j$. 


Using this technique, we can find the most important samples within the data.  
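
The idea of rebuilding one sample from the others and ranking them by weight magnitude can be sketched on tiny synthetic data. This is an illustration under assumptions: it uses `np.linalg.lstsq` without the intercept $W_0$, and the helper name `leave_one_out_weights` and the data are hypothetical.

```python
import numpy as np

# Four synthetic "frames" (rows); frame 3 is exactly 2 * frame 1 - frame 2
frames = np.array([
    [1.0, 0.0, 0.0, 2.0, 1.0],
    [0.0, 1.0, 0.0, 1.0, 3.0],
    [0.0, 0.0, 1.0, 4.0, 2.0],
    [0.0, 0.0, 0.0, 0.0, 0.0],
])
frames[3] = 2.0 * frames[1] - frames[2]

def leave_one_out_weights(frames, i):
    """Weights that rebuild frame i from all other frames (least squares)."""
    others = np.delete(frames, i, axis=0)        # shape (N - 1, D)
    w, *_ = np.linalg.lstsq(others.T, frames[i], rcond=None)
    return np.insert(w, i, 0.0)                  # frame i itself gets weight 0

w = leave_one_out_weights(frames, 3)
ranking = np.argsort(-np.abs(w))                 # most important frame first
print(np.round(w, 6), ranking[:2])
```

Here frame 1 receives the largest weight in absolute value, so it ranks first, followed by frame 2.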


## 1.0 Problems with implementation

For this problem, creating and operating on $\Phi(X)$ is extremely slow and memory-consuming, so you may run into memory problems that stop this notebook from working properly.   

### Memory Problem

As mentioned before, memory problems (allocation, etc.) are bound to happen due to the extremely large size of $\Phi(X)$ for our video data. $\Phi(X)$ for data1 (video 1) contains 188 rows (one per frame), and each row contains 15,966,720 (almost 16 million) columns.
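
A quick back-of-the-envelope calculation shows the scale. The column count follows the figure quoted above (189 blocks of 84,480 values: the constant term plus 188 frame slots); the 8 bytes per value is an assumption (float64 storage).

```python
# Size of the full design matrix for video 1, assuming float64 storage
n_frames = 188
frame_size = 128 * 220 * 3                 # 84,480 values per flattened frame

# 189 blocks per row: the constant term plus 188 frame slots
columns = frame_size * (n_frames + 1)
entries = n_frames * columns
gib = entries * 8 / 2**30                  # 8 bytes per float64 value

print(columns, entries, round(gib, 1))
```

Under these assumptions the full matrix would need roughly 22 GiB, which is more than many workstations have available for a single array.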

### Solution

One solution is to create a separate model for each frame. That is, create model $f_1$ to reconstruct frame $X_1$ from frames $X_2, \dots , X_N$. This model finds the optimal $W$ for its own data. Let's call the $W$ retrieved from $f_1$ $W^{(1)}$. Similarly, we can produce the other $W^{(i)}$s.


$$
\Phi(X_1)W^{(1)} = \hat X_1 \hspace{1.5cm} \dots \hspace{1.5cm} \Phi(X_N)W^{(N)} = \hat X_N
$$

This means we have $N$ models, and we wish to find a relationship between these models. 


Our goal is to find the $20$ percent most important frames. With a single model, the magnitude $||W_i||_2$ of each $W_i \in W$ shows the importance of sample $i$ in the reconstruction: the larger $||W_i||$ is, the more important the sample.


How can we use the same idea to find important frames when we have multiple models?

**Idea :**


1. Generate $W^{(i)}$ for $i = 1, 2, \dots, N$ (corresponding to model $f_i$, which reconstructs frame $X_i$ from the other frames)

2. Sort the indices of each $W^{(i)}$ by their corresponding values (descending)

3. Generate a matrix $A$ with $N$ rows (one per sample $i$) and $N\times \frac{20}{100}$ columns (one per position among the $20$ percent most important frames.)

    - To understand this, each $W^{(i)}$ has $N$ elements. Once sorted, we keep only the top $20$ percent of positions.
    
4. For each $a_{i,j} \in A$, count how many times sample $i$ is placed at position $j$.

    - For example, $a_{1,1}$ is the number of times sample $1$ (or $W_1$) is placed at position $1$ in the sorted indices of the $W^{(i)}$s, over every $i$


5. Iterate through $A$, find the biggest counter (for example $a_{i,j}$) and save $i$ as one of the important samples. Then remove row $i$ from $A$ so that sample $i$ cannot be chosen again. Repeat this process until the required number of important samples has been reached.


6. Sort the important frames and play them.



**In Summary** :

Generate $W$ for each model and sort its indices by value. Take the top $20$ percent of each $W$ and build a matrix called $A$ in which row $i$ counts how many times sample $i$ has been observed at position $j$. For example, $a_{3,4}$ is the number of times sample $3$ has been at position $4$ over all $W$s. Once $A$ has been constructed, loop through $A$ and find the biggest counter; save its sample in another list (important_frames), and remove the row that had the biggest counter so it can't be chosen again. Once the important_frames list is complete (that is, it has reached the required length), sort it by frame order (not by value) and play the frames as a video. 

Sorting by frame order at the end of this process makes the summary video start with the important frames from the beginning of the original video and end with those from its end (the timeline of the video is preserved).
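
The counting and selection steps can be sketched on a toy example. This is an illustration under assumptions: the helper name `select_important` and the weights are hypothetical, selection goes column by column (one pick per position), and ties go to the lowest index.

```python
import numpy as np

def select_important(W_all, top_k):
    """Pick top_k frame indices from per-model weight vectors.

    W_all[m][i] is the weight of frame i in model m (hypothetical layout).
    """
    n = W_all.shape[1]
    A = np.zeros((n, top_k), dtype=int)
    for w in W_all:
        order = np.argsort(-np.abs(w), kind="stable")  # descending |w|
        for j in range(top_k):
            A[order[j], j] += 1              # frame order[j] seen at position j

    taken = np.zeros(n, dtype=bool)
    for j in range(top_k):
        column = A[:, j].copy()
        column[taken] = -1                   # skip frames already selected
        taken[np.argmax(column)] = True
    return A, [int(i) for i in np.flatnonzero(taken)]

W_all = np.array([
    [0.0, 0.9, 0.1, 0.5],
    [0.2, 0.0, 0.1, 0.8],
    [0.3, 0.7, 0.0, 0.1],
    [0.6, 0.9, 0.2, 0.0],
])
A, important = select_important(W_all, top_k=2)
print(A.tolist(), important)
```

The returned indices are already in frame order, so the summary keeps the original timeline.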


## 1.1 Data 1 

We use the solution described above to avoid the memory problem.

### 1.1.1 Linear Regression (L2 Loss Function)

Linear Regression using PyTorch may fail because of an unsuitable optimizer or learning rate; therefore, only for this case, we use the sklearn LinearRegression model.
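
One plausible explanation, offered here as an assumption rather than something this notebook verifies, is the scale of the inputs: gradient descent on the squared loss is stable only when the learning rate is below $2 / \lambda_{max}$ of the loss Hessian, and pixel values up to 255 make $\lambda_{max}$ very large. The toy matrices below are illustrative.

```python
import numpy as np

# Same tiny design matrix at two scales: unit scale and pixel scale (0..255)
X_unit = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [1.0, -1.0]])
w_true = np.array([2.0, -3.0])

def gd_error(X, lr, steps):
    """Run gradient descent on the MSE loss and return the final weight error."""
    y = X @ w_true
    w = np.zeros(2)
    for _ in range(steps):
        w -= lr * 2.0 * X.T @ (X @ w - y) / len(y)
    return np.linalg.norm(w - w_true)

def max_stable_lr(X):
    """Largest stable step size: 2 / lambda_max of the MSE Hessian."""
    hessian = 2.0 * X.T @ X / X.shape[0]
    return 2.0 / np.linalg.eigvalsh(hessian).max()

X_pixel = 255.0 * X_unit
print(max_stable_lr(X_unit), max_stable_lr(X_pixel))
print(gd_error(X_pixel, lr=0.01, steps=5))    # far above the bound: error grows
print(gd_error(X_pixel, lr=1e-5, steps=50))   # below the bound: error shrinks
```

A closed-form least-squares solver does not depend on a step size at all, which is consistent with switching to sklearn for the L2 case.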

**This will take some time, please be patient ...**

In [None]:
# Linear Regression using Pytorch may fail due to inappropriate optimizer and learning rate
# Therefore just for this case, we use sklearn LinearRegression model

from sklearn.linear_model import LinearRegression as temporary_l2model

video2d_1 = np.array(video_matrix_to2D(Video[1]))

end = round(np.shape(video2d_1)[0] * (20/100))

A_l2_data1 = np.zeros((video2d_1.shape[0], end))


for (index1, sample1) in enumerate(video2d_1):
    
    phi_xi = []
    
    # generate X and Y
    
    for (index2,sample2) in enumerate(video2d_1):
        
        if index2!= index1:
            phi_xi.append(sample2)
    
    x = np.array(phi_xi).T
    y = sample1.reshape(-1,1)
    
    # create the model
    
    model = temporary_l2model()

    # fit the model
    
    model.fit(x,y)
    
    # calculate W
    
    Wi = model.coef_.flatten()
    # +1 for W_i of sample i which will be set to zero later.
    indices = np.arange(Wi.shape[0]+1)
    
    # adjust Wi to include W_i for sample i with 0 value
    Wi = Wi.tolist()
    Wi.insert(index1, 0)
    Wi = np.array(Wi)
    
    # Prepare to sort (absolute value of W_i for each Wi)
    
    W_dataframe = pd.DataFrame({'index':indices, 'W_i': np.abs(Wi)})
    W_dataframe = W_dataframe.sort_values(by='W_i', ascending=False)
    
    # Update a_{ij} based on the sorted W_dataframe
    
    for j in range(end):
        i = np.array(W_dataframe['index'])[j]
        # Increase the counter by 1
        A_l2_data1[i][j] += 1
    

#### Find the important frames

After building the $A$ matrix, we can loop through it and find the $20$ percent most important frames.

In [None]:
important_frames_list1_l2 = []

# Find the 20 percent most important frames.

for column in range(A_l2_data1.shape[1]):
    temp_max = 0
    max_index = 0
    for row in range(A_l2_data1.shape[0]):
        if row not in important_frames_list1_l2:
            if A_l2_data1[row][column] >= temp_max:
                temp_max = A_l2_data1[row][column]
                max_index = row
                
    important_frames_list1_l2.append(max_index)

sorted_important_frames_list1_l2 = np.sort(important_frames_list1_l2)


#### Generate the new video and play it

In [None]:
video1_summary_frames_l2 = []

for frame_index in sorted_important_frames_list1_l2:
    video1_summary_frames_l2.append(Video[1][frame_index])
    
show_video(video1_summary_frames_l2, title= 'Data1 Summary by L2 ')
video_save(video1_summary_frames_l2, name='Data1Summary_l2.gif')


### 1.1.2 Linear Regression (L1 Loss Function)

**This will take some time, please be patient ...**

In [None]:
video2d_1 = np.array(video_matrix_to2D(Video[1]))
end = round(np.shape(video2d_1)[0] * (20/100))

A_l1_data1 = np.zeros((video2d_1.shape[0], end))


for (index1, sample1) in enumerate(video2d_1):
    
    phi_xi = []
    
    # generate X and Y
    
    for (index2,sample2) in enumerate(video2d_1):
        
        if index2!= index1:
            phi_xi.append(sample2)
    
    x = np.array(phi_xi).T
    y = sample1.reshape(-1,1)
    
    # create the model and variables
    
    x = Variable(torch.Tensor(x))
    y = Variable(torch.Tensor(y))
    
    model = LinearRegressionModel(x.shape[1],y.shape[1], criterion_mode='l1', optimizer_lr= 0.000001, max_iter = 100)
    
    # Print a test log <start>
    
    print(f'Currently building sample {index1}')
    
    # Print a test log <end>
    
    trash = model.fit(x,y, log=False)
    
    # calculate W
    
    Wi = np.array(model.coef()).flatten()
    # +1 for W_i of sample i which will be set to zero later.
    indices = np.arange(Wi.shape[0]+1)
    
    # adjust Wi to include W_i for sample i with 0 value
    Wi = Wi.tolist()
    Wi.insert(index1, 0)
    Wi = np.array(Wi)
    
    # Prepare to sort (absolute value of W_i for each Wi)
    
    W_dataframe = pd.DataFrame({'index':indices, 'W_i': np.abs(Wi)})
    W_dataframe = W_dataframe.sort_values(by='W_i', ascending=False)
    
    # Update a_{ij} based on the sorted W_dataframe
    
    for j in range(end):
        i = np.array(W_dataframe['index'])[j]
        # Increase the counter by 1
        A_l1_data1[i][j] += 1
    

#### Find the important frames

After building the $A$ matrix, we can loop through it and find the $20$ percent most important frames.

In [None]:
important_frames_list1_l1 = []

# Find the 20 percent most important frames.

for column in range(A_l1_data1.shape[1]):
    temp_max = 0
    max_index = 0
    for row in range(A_l1_data1.shape[0]):
        if row not in important_frames_list1_l1:
            if A_l1_data1[row][column] >= temp_max:
                temp_max = A_l1_data1[row][column]
                max_index = row
                
    important_frames_list1_l1.append(max_index)

sorted_important_frames_list1_l1 = np.sort(important_frames_list1_l1)


#### Generate the new video and play it

In [None]:
video1_summary_frames_l1 = []

for frame_index in sorted_important_frames_list1_l1:
    video1_summary_frames_l1.append(Video[1][frame_index])
    
show_video(video1_summary_frames_l1, title='Data1 Summary by L1 ')
video_save(video1_summary_frames_l1, name='Data1Summary_l1.gif')


### 1.1.3 OMP
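
For intuition, OMP builds a sparse weight vector greedily: at each step it picks the column most correlated with the current residual, then refits all selected columns jointly. The sketch below is a minimal NumPy illustration of that idea under assumptions (nonzero columns, `n_nonzero` of at least 1); it is not scikit-learn's implementation, and the name `omp_sketch` is hypothetical.

```python
import numpy as np

def omp_sketch(D, y, n_nonzero):
    """Greedy OMP sketch: select columns of D one at a time.

    Illustrative only; assumes every column of D is nonzero.
    """
    residual = y.astype(float).copy()
    support = []
    coef = np.zeros(0)
    norms = np.linalg.norm(D, axis=0)
    for _ in range(n_nonzero):
        # Score each column by its normalized correlation with the residual
        scores = np.abs(D.T @ residual) / norms
        if support:
            scores[support] = -np.inf        # never pick a column twice
        support.append(int(np.argmax(scores)))
        # Refit all selected columns jointly, then update the residual
        coef, *_ = np.linalg.lstsq(D[:, support], y, rcond=None)
        residual = y - D[:, support] @ coef
    w = np.zeros(D.shape[1])
    w[support] = coef
    return w

# Toy dictionary with orthonormal columns; y uses columns 0 and 2 only
D = np.eye(4)
y = np.array([3.0, 0.0, -2.0, 0.0])
w = omp_sketch(D, y, n_nonzero=2)
print(w)
```

In the video setting the columns are the other frames, so a limit of 5 nonzero coefficients means each frame is rebuilt from at most 5 other frames.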

**This will take some time, please be patient ...**

In [None]:
video2d_1 = np.array(video_matrix_to2D(Video[1]))
end = round(np.shape(video2d_1)[0] * (20/100))

A_omp_data1 = np.zeros((video2d_1.shape[0], end))

for (index1, sample1) in enumerate(video2d_1):
    
    phi_xi = []
    
    # generate X and Y
    
    for (index2,sample2) in enumerate(video2d_1):
        
        if index2!= index1:
            phi_xi.append(sample2)
    
    x = np.array(phi_xi).T
    y = sample1.reshape(-1,1)
    
    # create the model
    
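    # Newer scikit-learn releases no longer accept `normalize`; on older releases
    # that still accept it, pass normalize=False to keep the original behavior
    model = OMP(fit_intercept=False, n_nonzero_coefs= 5)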

    # fit the model
    
    model.fit(x,y)
    
    # calculate W
    
    Wi = model.coef_.flatten()
    # +1 for W_i of sample i which will be set to zero later.
    indices = np.arange(Wi.shape[0]+1)
    
    # adjust Wi to include W_i for sample i with 0 value
    Wi = Wi.tolist()
    Wi.insert(index1, 0)
    Wi = np.array(Wi)
    
    # Prepare to sort (absolute value of W_i for each Wi)
    
    W_dataframe = pd.DataFrame({'index':indices, 'W_i': np.abs(Wi)})
    W_dataframe = W_dataframe.sort_values(by='W_i', ascending=False)
    
    # Update a_{ij} based on the sorted W_dataframe
    
    for j in range(end):
        i = np.array(W_dataframe['index'])[j]
        # Increase the counter by 1
        A_omp_data1[i][j] += 1
    

#### Find the important frames

After building the $A$ matrix, we can loop through it and find the $20$ percent most important frames.

In [None]:
important_frames_list1_omp = []

# Find the 20 percent most important frames.

for column in range(A_omp_data1.shape[1]):
    temp_max = 0
    max_index = 0
    for row in range(A_omp_data1.shape[0]):
        if row not in important_frames_list1_omp:
            if A_omp_data1[row][column] >= temp_max:
                temp_max = A_omp_data1[row][column]
                max_index = row
                
    important_frames_list1_omp.append(max_index)

sorted_important_frames_list1_omp = np.sort(important_frames_list1_omp)


#### Generate the new video and play it

In [None]:
video1_summary_frames_omp = []

for frame_index in sorted_important_frames_list1_omp:
    video1_summary_frames_omp.append(Video[1][frame_index])
    
show_video(video1_summary_frames_omp, title= 'Data1 Summary by OMP ')
video_save(video1_summary_frames_omp, name='Data1Summary_omp.gif')


## 1.2 Data 2

We use the solution described above to avoid the memory problem.

### 1.2.1 Linear Regression (L2 Loss Function)

Linear Regression using PyTorch may fail because of an unsuitable optimizer or learning rate; therefore, only for this case, we use the sklearn LinearRegression model.

**This will take some time, please be patient ...**

In [None]:
# Linear Regression using Pytorch may fail due to inappropriate optimizer and learning rate
# Therefore just for this case, we use sklearn LinearRegression model

from sklearn.linear_model import LinearRegression as temporary_l2model

video2d_2 = np.array(video_matrix_to2D(Video[2]))

end = round(np.shape(video2d_2)[0] * (20/100))

A_l2_data2 = np.zeros((video2d_2.shape[0], end))

for (index1, sample1) in enumerate(video2d_2):
    
    phi_xi = []
    
    # generate X and Y
    
    for (index2,sample2) in enumerate(video2d_2):
        
        if index2!= index1:
            phi_xi.append(sample2)
    
    x = np.array(phi_xi).T
    y = sample1.reshape(-1,1)
    
    # create the model
    
    model = temporary_l2model()

    # fit the model
    
    model.fit(x,y)
    
    # calculate W
    
    Wi = model.coef_.flatten()
    # +1 for W_i of sample i which will be set to zero later.
    indices = np.arange(Wi.shape[0]+1)
    
    # adjust Wi to include W_i for sample i with 0 value
    Wi = Wi.tolist()
    Wi.insert(index1, 0)
    Wi = np.array(Wi)
    
    # Prepare to sort (absolute value of W_i for each Wi)
    
    W_dataframe = pd.DataFrame({'index':indices, 'W_i': np.abs(Wi)})
    W_dataframe = W_dataframe.sort_values(by='W_i', ascending=False)
    
    # Update a_{ij} based on the sorted W_dataframe
    
    for j in range(end):
        i = np.array(W_dataframe['index'])[j]
        # Increase the counter by 1
        A_l2_data2[i][j] += 1
        
    

#### Find the important frames

After building the $A$ matrix, we can loop through it and find the $20$ percent most important frames.

In [None]:
important_frames_list2_l2 = []

# Find the 20 percent most important frames.

for column in range(A_l2_data2.shape[1]):
    temp_max = 0
    max_index = 0
    for row in range(A_l2_data2.shape[0]):
        if row not in important_frames_list2_l2:
            if A_l2_data2[row][column] >= temp_max:
                temp_max = A_l2_data2[row][column]
                max_index = row
                
    important_frames_list2_l2.append(max_index)

sorted_important_frames_list2_l2 = np.sort(important_frames_list2_l2)


#### Generate the new video and play it

In [None]:
video2_summary_frames_l2 = []

for frame_index in sorted_important_frames_list2_l2:
    video2_summary_frames_l2.append(Video[2][frame_index])
    
show_video(video2_summary_frames_l2, title= 'Data2 Summary by L2 ')
video_save(video2_summary_frames_l2, name='Data2Summary_l2.gif')


### 1.2.2 Linear Regression (L1 Loss Function)


**This will take some time, please be patient ...**

In [None]:
video2d_2 = np.array(video_matrix_to2D(Video[2]))

end = round(np.shape(video2d_2)[0] * (20/100))

A_l1_data2 = np.zeros((video2d_2.shape[0], end))

for (index1, sample1) in enumerate(video2d_2):
    
    phi_xi = []
    
    # generate X and Y
    
    for (index2,sample2) in enumerate(video2d_2):
        
        if index2!= index1:
            phi_xi.append(sample2)
    
    x = np.array(phi_xi).T
    y = sample1.reshape(-1,1)
    
    # create the model and variables
    
    x = Variable(torch.Tensor(x))
    y = Variable(torch.Tensor(y))
    
    model = LinearRegressionModel(x.shape[1],y.shape[1], criterion_mode='l1', optimizer_lr= 0.000001, max_iter = 100)

    # fit the model
    
    # Print a test log <start>
    
    print(f'Currently building sample {index1}')
    
    # Print a test log <end>
    
    trash = model.fit(x,y, log = False)
    
    # calculate W
    
    Wi = np.array(model.coef()).flatten()
    # +1 for W_i of sample i which will be set to zero later.
    indices = np.arange(Wi.shape[0]+1)
    
    # adjust Wi to include W_i for sample i with 0 value
    Wi = Wi.tolist()
    Wi.insert(index1, 0)
    Wi = np.array(Wi)
    
    # Prepare to sort (absolute value of W_i for each Wi)
    
    W_dataframe = pd.DataFrame({'index':indices, 'W_i': np.abs(Wi)})
    W_dataframe = W_dataframe.sort_values(by='W_i', ascending=False)
    
    # Update a_{ij} based on the sorted W_dataframe
    
    for j in range(end):
        i = np.array(W_dataframe['index'])[j]
        # Increase the counter by 1
        A_l1_data2[i][j] += 1
    

#### Find the important frames

After building the $A$ matrix, we can loop through it and find the $20$ percent most important frames.

In [None]:
important_frames_list2_l1 = []

# Find the 20 percent most important frames.

for column in range(A_l1_data2.shape[1]):
    temp_max = 0
    max_index = 0
    for row in range(A_l1_data2.shape[0]):
        if row not in important_frames_list2_l1:
            if A_l1_data2[row][column] >= temp_max:
                temp_max = A_l1_data2[row][column]
                max_index = row
                
    important_frames_list2_l1.append(max_index)

sorted_important_frames_list2_l1 = np.sort(important_frames_list2_l1)


#### Generate the new video and play it

In [None]:
video2_summary_frames_l1 = []

for frame_index in sorted_important_frames_list2_l1:
    video2_summary_frames_l1.append(Video[2][frame_index])
    
show_video(video2_summary_frames_l1, title= 'Data2 Summary by L1 ')
video_save(video2_summary_frames_l1, name='Data2Summary_l1.gif')


### 1.2.3 OMP


**This will take some time, please be patient ...**

In [None]:
video2d_2 = np.array(video_matrix_to2D(Video[2]))

end = round(np.shape(video2d_2)[0] * (20/100))

A_omp_data2 = np.zeros((video2d_2.shape[0], end))

for (index1, sample1) in enumerate(video2d_2):
    
    phi_xi = []
    
    # generate X and Y
    
    for (index2,sample2) in enumerate(video2d_2):
        
        if index2!= index1:
            phi_xi.append(sample2)
    
    x = np.array(phi_xi).T
    y = sample1.reshape(-1,1)
    
    # create the model
    
    model = OMP(fit_intercept=False, normalize=False, n_nonzero_coefs= 5)

    # fit the model
    
    model.fit(x,y)
    
    # calculate W
    
    Wi = model.coef_.flatten()
    # +1 for W_i of sample i which will be set to zero later.
    indices = np.arange(Wi.shape[0]+1)
    
    # adjust Wi to include W_i for sample i with 0 value
    Wi = Wi.tolist()
    Wi.insert(index1, 0)
    Wi = np.array(Wi)
    
    # Prepare to sort (absolute value of W_i for each Wi)
    
    W_dataframe = pd.DataFrame({'index':indices, 'W_i': np.abs(Wi)})
    W_dataframe = W_dataframe.sort_values(by='W_i', ascending=False)
    
    # Update a_{ij} based on the sorted W_dataframe
    
    for j in range(end):
        i = np.array(W_dataframe['index'])[j]
        # Increase the counter by 1
        A_omp_data2[i][j] += 1
    

#### Find the important frames

After building the $A$ matrix, we can loop through it and find the $20$ percent most important frames.

In [None]:
important_frames_list2_omp = []

# Find the 20 percent most important frames.

for column in range(A_omp_data2.shape[1]):
    temp_max = 0
    max_index = 0
    for row in range(A_omp_data2.shape[0]):
        if row not in important_frames_list2_omp:
            if A_omp_data2[row][column] >= temp_max:
                temp_max = A_omp_data2[row][column]
                max_index = row
                
    important_frames_list2_omp.append(max_index)

sorted_important_frames_list2_omp = np.sort(important_frames_list2_omp)


#### Generate the new video and play it

In [None]:
video2_summary_frames_omp = []

for frame_index in sorted_important_frames_list2_omp:
    video2_summary_frames_omp.append(Video[2][frame_index])
    
show_video(video2_summary_frames_omp, title = 'Data2 Summary by OMP ')
video_save(video2_summary_frames_omp, name='Data2Summary_omp.gif')


## 1.3 Data 3

Here we again use the solution mentioned above to fix the memory problem.

### 1.3.1 Linear Regression (L2 Loss Function)

Linear Regression using PyTorch may fail here because of an unsuitable optimizer and learning rate. Therefore, for this case only, we use the sklearn LinearRegression model.

**This will take some time, please be patient ...**

In [None]:
# Linear Regression using PyTorch may fail here because of an unsuitable optimizer and learning rate.
# Therefore, for this case only, we use the sklearn LinearRegression model.

from sklearn.linear_model import LinearRegression as temporary_l2model

video2d_3 = np.array(video_matrix_to2D(Video[3]))

end = round(np.shape(video2d_3)[0] * (20/100))

A_l2_data3 = np.zeros((video2d_3.shape[0], end))

for (index1, sample1) in enumerate(video2d_3):
    
    phi_xi = []
    
    # generate X and Y
    
    for (index2,sample2) in enumerate(video2d_3):
        
        if index2!= index1:
            phi_xi.append(sample2)
    
    x = np.array(phi_xi).T
    y = sample1.reshape(-1,1)
    
    # create the model
    
    model = temporary_l2model()

    # fit the model
    
    model.fit(x,y)
    
    # calculate W
    
    Wi = model.coef_.flatten()
    # +1 for W_i of sample i which will be set to zero later.
    indices = np.arange(Wi.shape[0]+1)
    
    # adjust Wi to include W_i for sample i with 0 value
    Wi = Wi.tolist()
    Wi.insert(index1, 0)
    Wi = np.array(Wi)
    
    # Prepare to sort (absolute value of W_i for each Wi)
    
    W_dataframe = pd.DataFrame({'index':indices, 'W_i': np.abs(Wi)})
    W_dataframe = W_dataframe.sort_values(by='W_i', ascending=False)
    
    # Update a_{ij} based on the sorted W_dataframe
    
    for j in range(end):
        i = np.array(W_dataframe['index'])[j]
        # Increase the counter by 1
        A_l2_data3[i][j] += 1
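
The way the loop above fills the $A$ matrix can be reproduced on random toy data. This is only a sketch under stated assumptions: `np.linalg.lstsq` stands in for the regression model (it fits no intercept, unlike the default sklearn LinearRegression), and `frames`, `top_k` and `A_toy` are hypothetical names.

```python
import numpy as np

rng = np.random.default_rng(0)
frames = rng.normal(size=(10, 50))         # toy "video": 10 frames, 50 pixels each
top_k = round(frames.shape[0] * 20 / 100)  # 20 percent of the frames -> 2

A_toy = np.zeros((frames.shape[0], top_k))

for i, target in enumerate(frames):
    # every frame except frame i becomes a column of the design matrix
    others = np.delete(frames, i, axis=0).T            # shape (50, 9)
    w, *_ = np.linalg.lstsq(others, target, rcond=None)
    w = np.insert(w, i, 0.0)                           # frame i gets weight 0 for itself
    ranking = np.argsort(-np.abs(w))                   # frame indices by decreasing |w|
    for j in range(top_k):
        A_toy[ranking[j], j] += 1                      # one vote for the frame ranked j-th

print(A_toy.sum(axis=0))  # [10. 10.]
```

Each frame casts exactly one vote per rank position, so every column of the vote matrix sums to the number of frames.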
    

#### Find the important frames

After building the $A$ matrix, we can loop through it and find the $20$ percent most important frames.

In [None]:
important_frames_list3_l2 = []

# Find the 20 percent most important frames.

for column in range(A_l2_data3.shape[1]):
    temp_max = 0
    max_index = 0
    for row in range(A_l2_data3.shape[0]):
        if row not in important_frames_list3_l2:
            if A_l2_data3[row][column] >= temp_max:
                temp_max = A_l2_data3[row][column]
                max_index = row
                
    important_frames_list3_l2.append(max_index)

sorted_important_frames_list3_l2 = np.sort(important_frames_list3_l2)


#### Generate the new video and play it

In [None]:
video3_summary_frames_l2 = []

for frame_index in sorted_important_frames_list3_l2:
    video3_summary_frames_l2.append(Video[3][frame_index])
    
show_video(video3_summary_frames_l2, title = 'Data3 Summary by L2 ')
video_save(video3_summary_frames_l2, name='Data3Summary_l2.gif')


### 1.3.2 Linear Regression (L1 Loss Function)


**This will take some time, please be patient ...**

In [None]:
video2d_3 = np.array(video_matrix_to2D(Video[3]))

end = round(np.shape(video2d_3)[0] * (20/100))

A_l1_data3 = np.zeros((video2d_3.shape[0], end))

for (index1, sample1) in enumerate(video2d_3):
    
    phi_xi = []
    
    # generate X and Y
    
    for (index2,sample2) in enumerate(video2d_3):
        
        if index2!= index1:
            phi_xi.append(sample2)
    
    x = np.array(phi_xi).T
    y = sample1.reshape(-1,1)
    
    # create the model and variables
    
    x = Variable(torch.Tensor(x))
    y = Variable(torch.Tensor(y))
    
    model = LinearRegressionModel(x.shape[1],y.shape[1], criterion_mode='l1', optimizer_lr= 0.000001, max_iter = 100)

    # fit the model
    
    # Print a test log <start>
    
    print(f'Currently building sample {index1}')
    
    # Print a test log <end>
    
    trash = model.fit(x,y)
    
    # calculate W
    
    Wi = np.array(model.coef()).flatten()
    # +1 for W_i of sample i which will be set to zero later.
    indices = np.arange(Wi.shape[0]+1)
    
    # adjust Wi to include W_i for sample i with 0 value
    Wi = Wi.tolist()
    Wi.insert(index1, 0)
    Wi = np.array(Wi)
    
    # Prepare to sort (absolute value of W_i for each Wi)
    
    W_dataframe = pd.DataFrame({'index':indices, 'W_i': np.abs(Wi)})
    W_dataframe = W_dataframe.sort_values(by='W_i', ascending=False)
    
    # Update a_{ij} based on the sorted W_dataframe
    
    for j in range(end):
        i = np.array(W_dataframe['index'])[j]
        # Increase the counter by 1
        A_l1_data3[i][j] += 1
    

#### Find the important frames

After building the $A$ matrix, we can loop through it and find the $20$ percent most important frames.

In [None]:
important_frames_list3_l1 = []

# Find the 20 percent most important frames.

for column in range(A_l1_data3.shape[1]):
    temp_max = 0
    max_index = 0
    for row in range(A_l1_data3.shape[0]):
        if row not in important_frames_list3_l1:
            if A_l1_data3[row][column] >= temp_max:
                temp_max = A_l1_data3[row][column]
                max_index = row
                
    important_frames_list3_l1.append(max_index)

sorted_important_frames_list3_l1 = np.sort(important_frames_list3_l1)


#### Generate the new video and play it

In [None]:
video3_summary_frames_l1 = []

for frame_index in sorted_important_frames_list3_l1:
    video3_summary_frames_l1.append(Video[3][frame_index])
    
show_video(video3_summary_frames_l1, title = 'Data3 Summary by L1 ')
video_save(video3_summary_frames_l1, name='Data3Summary_l1.gif')


### 1.3.3 OMP


**This will take some time, please be patient ...**

In [None]:
video2d_3 = np.array(video_matrix_to2D(Video[3]))

end = round(np.shape(video2d_3)[0] * (20/100))

A_omp_data3 = np.zeros((video2d_3.shape[0], end))

for (index1, sample1) in enumerate(video2d_3):
    
    phi_xi = []
    
    # generate X and Y
    
    for (index2,sample2) in enumerate(video2d_3):
        
        if index2!= index1:
            phi_xi.append(sample2)
    
    x = np.array(phi_xi).T
    y = sample1.reshape(-1,1)
    
    # create the model
    
    model = OMP(fit_intercept=False, normalize=False, n_nonzero_coefs= 5)

    # fit the model
    
    model.fit(x,y)
    
    # calculate W
    
    Wi = model.coef_.flatten()
    # +1 for W_i of sample i which will be set to zero later.
    indices = np.arange(Wi.shape[0]+1)
    
    # adjust Wi to include W_i for sample i with 0 value
    Wi = Wi.tolist()
    Wi.insert(index1, 0)
    Wi = np.array(Wi)
    
    # Prepare to sort (absolute value of W_i for each Wi)
    
    W_dataframe = pd.DataFrame({'index':indices, 'W_i': np.abs(Wi)})
    W_dataframe = W_dataframe.sort_values(by='W_i', ascending=False)
    
    # Update a_{ij} based on the sorted W_dataframe
    
    for j in range(end):
        i = np.array(W_dataframe['index'])[j]
        # Increase the counter by 1
        A_omp_data3[i][j] += 1
    

#### Find the important frames

After building the $A$ matrix, we can loop through it and find the $20$ percent most important frames.

In [None]:
important_frames_list3_omp = []

# Find the 20 percent most important frames.

for column in range(A_omp_data3.shape[1]):
    temp_max = 0
    max_index = 0
    for row in range(A_omp_data3.shape[0]):
        if row not in important_frames_list3_omp:
            if A_omp_data3[row][column] >= temp_max:
                temp_max = A_omp_data3[row][column]
                max_index = row
                
    important_frames_list3_omp.append(max_index)

sorted_important_frames_list3_omp = np.sort(important_frames_list3_omp)


#### Generate the new video and play it

In [None]:
video3_summary_frames_omp = []

for frame_index in sorted_important_frames_list3_omp:
    video3_summary_frames_omp.append(Video[3][frame_index])
    
show_video(video3_summary_frames_omp, title = 'Data3 Summary by OMP ')
video_save(video3_summary_frames_omp, name='Data3Summary_omp.gif')


## 1.4 Data 4

Here we again use the solution mentioned above to fix the memory problem.


### 1.4.1 Linear Regression (L2 Loss Function)

Linear Regression using PyTorch may fail here because of an unsuitable optimizer and learning rate. Therefore, for this case only, we use the sklearn LinearRegression model.

**This will take some time, please be patient ...**

In [None]:
# Linear Regression using PyTorch may fail here because of an unsuitable optimizer and learning rate.
# Therefore, for this case only, we use the sklearn LinearRegression model.

from sklearn.linear_model import LinearRegression as temporary_l2model

video2d_4 = np.array(video_matrix_to2D(Video[4]))

end = round(np.shape(video2d_4)[0] * (20/100))

A_l2_data4 = np.zeros((video2d_4.shape[0], end))

for (index1, sample1) in enumerate(video2d_4):
    
    
    phi_xi = []
    
    # generate X and Y
    
    for (index2,sample2) in enumerate(video2d_4):
        
        if index2!= index1:
            phi_xi.append(sample2)
    
    x = np.array(phi_xi).T
    y = sample1.reshape(-1,1)
    
    # create the model
    
    model = temporary_l2model()

    # fit the model
    
    model.fit(x,y)
    
    # calculate W
    
    Wi = model.coef_.flatten()
    # +1 for W_i of sample i which will be set to zero later.
    indices = np.arange(Wi.shape[0]+1)
    
    # adjust Wi to include W_i for sample i with 0 value
    Wi = Wi.tolist()
    Wi.insert(index1, 0)
    Wi = np.array(Wi)
    
    # Prepare to sort (absolute value of W_i for each Wi)
    
    W_dataframe = pd.DataFrame({'index':indices, 'W_i': np.abs(Wi)})
    W_dataframe = W_dataframe.sort_values(by='W_i', ascending=False)
    
    # Update a_{ij} based on the sorted W_dataframe
    
    for j in range(end):
        i = np.array(W_dataframe['index'])[j]
        # Increase the counter by 1
        A_l2_data4[i][j] += 1
    

#### Find the important frames

After building the $A$ matrix, we can loop through it and find the $20$ percent most important frames.

In [None]:
important_frames_list4_l2 = []

# Find the 20 percent most important frames.

for column in range(A_l2_data4.shape[1]):
    temp_max = 0
    max_index = 0
    for row in range(A_l2_data4.shape[0]):
        if row not in important_frames_list4_l2:
            if A_l2_data4[row][column] >= temp_max:
                temp_max = A_l2_data4[row][column]
                max_index = row
                
    important_frames_list4_l2.append(max_index)

sorted_important_frames_list4_l2 = np.sort(important_frames_list4_l2)


#### Generate the new video and play it

In [None]:
video4_summary_frames_l2 = []

for frame_index in sorted_important_frames_list4_l2:
    video4_summary_frames_l2.append(Video[4][frame_index])
    
show_video(video4_summary_frames_l2, title = 'Data4 Summary by L2 ')
video_save(video4_summary_frames_l2, name='Data4Summary_l2.gif')


### 1.4.2 Linear Regression (L1 Loss Function)


**This will take some time, please be patient ...**

In [None]:
video2d_4 = np.array(video_matrix_to2D(Video[4]))

end = round(np.shape(video2d_4)[0] * (20/100))

A_l1_data4 = np.zeros((video2d_4.shape[0], end))

for (index1, sample1) in enumerate(video2d_4):
    
    phi_xi = []
    
    # generate X and Y
    
    for (index2,sample2) in enumerate(video2d_4):
        
        if index2!= index1:
            phi_xi.append(sample2)
    
    x = np.array(phi_xi).T
    y = sample1.reshape(-1,1)
    
    # create the model and variables
    
    x = Variable(torch.Tensor(x))
    y = Variable(torch.Tensor(y))
    
    model = LinearRegressionModel(x.shape[1],y.shape[1], criterion_mode='l1', optimizer_lr= 0.000001, max_iter = 100)

    # fit the model
    
    # Print a test log <start>
    
    print(f'Currently building sample {index1}')
    
    # Print a test log <end>
    
    trash = model.fit(x,y)
    
    # calculate W
    
    Wi = np.array(model.coef()).flatten()
    # +1 for W_i of sample i which will be set to zero later.
    indices = np.arange(Wi.shape[0]+1)
    
    # adjust Wi to include W_i for sample i with 0 value
    Wi = Wi.tolist()
    Wi.insert(index1, 0)
    Wi = np.array(Wi)
    
    # Prepare to sort (absolute value of W_i for each Wi)
    
    W_dataframe = pd.DataFrame({'index':indices, 'W_i': np.abs(Wi)})
    W_dataframe = W_dataframe.sort_values(by='W_i', ascending=False)
    
    # Update a_{ij} based on the sorted W_dataframe
    
    for j in range(end):
        i = np.array(W_dataframe['index'])[j]
        # Increase the counter by 1
        A_l1_data4[i][j] += 1
    

#### Find the important frames

After building the $A$ matrix, we can loop through it and find the $20$ percent most important frames.

In [None]:
important_frames_list4_l1 = []

# Find the 20 percent most important frames.

for column in range(A_l1_data4.shape[1]):
    temp_max = 0
    max_index = 0
    for row in range(A_l1_data4.shape[0]):
        if row not in important_frames_list4_l1:
            if A_l1_data4[row][column] >= temp_max:
                temp_max = A_l1_data4[row][column]
                max_index = row
                
    important_frames_list4_l1.append(max_index)

sorted_important_frames_list4_l1 = np.sort(important_frames_list4_l1)


#### Generate the new video and play it

In [None]:
video4_summary_frames_l1 = []

for frame_index in sorted_important_frames_list4_l1:
    video4_summary_frames_l1.append(Video[4][frame_index])
    
show_video(video4_summary_frames_l1, title = 'Data4 Summary by L1 ')
video_save(video4_summary_frames_l1, name='Data4Summary_l1.gif')


### 1.4.3 OMP


**This will take some time, please be patient ...**

In [None]:
video2d_4 = np.array(video_matrix_to2D(Video[4]))

end = round(np.shape(video2d_4)[0] * (20/100))

A_omp_data4 = np.zeros((video2d_4.shape[0], end))

for (index1, sample1) in enumerate(video2d_4):
    
    phi_xi = []
    
    # generate X and Y
    
    for (index2,sample2) in enumerate(video2d_4):
        
        if index2!= index1:
            phi_xi.append(sample2)
    
    x = np.array(phi_xi).T
    y = sample1.reshape(-1,1)
    
    # create the model
    
    model = OMP(fit_intercept=False, normalize=False, n_nonzero_coefs= 2)

    # fit the model
    
    model.fit(x,y)
    
    # calculate W
    
    Wi = model.coef_.flatten()
    # +1 for W_i of sample i which will be set to zero later.
    indices = np.arange(Wi.shape[0]+1)
    
    # adjust Wi to include W_i for sample i with 0 value
    Wi = Wi.tolist()
    Wi.insert(index1, 0)
    Wi = np.array(Wi)
    
    # Prepare to sort (absolute value of W_i for each Wi)
    
    W_dataframe = pd.DataFrame({'index':indices, 'W_i': np.abs(Wi)})
    W_dataframe = W_dataframe.sort_values(by='W_i', ascending=False)
    
    # Update a_{ij} based on the sorted W_dataframe
    
    for j in range(end):
        i = np.array(W_dataframe['index'])[j]
        # Increase the counter by 1
        A_omp_data4[i][j] += 1
    

#### Find the important frames

After building the $A$ matrix, we can loop through it and find the $20$ percent most important frames.

In [None]:
important_frames_list4_omp = []

# Find the 20 percent most important frames.

for column in range(A_omp_data4.shape[1]):
    temp_max = 0
    max_index = 0
    for row in range(A_omp_data4.shape[0]):
        if row not in important_frames_list4_omp:
            if A_omp_data4[row][column] >= temp_max:
                temp_max = A_omp_data4[row][column]
                max_index = row
                
    important_frames_list4_omp.append(max_index)

sorted_important_frames_list4_omp = np.sort(important_frames_list4_omp)


#### Generate the new video and play it

In [None]:
video4_summary_frames_omp = []

for frame_index in sorted_important_frames_list4_omp:
    video4_summary_frames_omp.append(Video[4][frame_index])
    
show_video(video4_summary_frames_omp, title = 'Data4 Summary by OMP ')
video_save(video4_summary_frames_omp, name='Data4Summary_omp.gif')


# 2. ORL Classification

<hr>

Our objective is to use Linear Regression (with L1 and L2 norm loss functions) and OMP to classify the ORL dataset.

## 2.1 Method 1

In this method, we use a one-hot encoding approach to classification. It takes three simple steps:

1. Convert the categorical output to a one-hot vector


2. Fit the model on X ($\in \mathbb{R}^{N\times d}$) and Y ($\in \mathbb{R}^{N\times D}$)


3. Recover the categorical $\hat Y$ from the vector $\hat Y$
    - that is, find the element with the maximum value and use its index as the categorical value
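
The three steps above can be sketched end to end on toy data. This is a minimal illustration, not the notebook's model: it assumes three well-separated 2-D clusters instead of ORL, and uses `np.linalg.lstsq` as a stand-in for the regression models built later. All names (`centers`, `X_b`, `W`, `toy_accuracy`) are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data (hypothetical, not ORL): 3 well-separated classes in 2-D
centers = np.array([[0.0, 0.0], [5.0, 0.0], [0.0, 5.0]])
labels = np.repeat(np.arange(3), 20)
X = centers[labels] + 0.3 * rng.normal(size=(60, 2))

# Step 1: one-hot encode the labels, so Y has shape (N, D) with D = 3
Y = np.eye(3)[labels]

# Step 2: fit a linear model X_b @ W ~ Y by least squares (bias column appended)
X_b = np.hstack([X, np.ones((X.shape[0], 1))])
W, *_ = np.linalg.lstsq(X_b, Y, rcond=None)

# Step 3: the predicted class is the index of the largest output
Y_hat = X_b @ W
predicted = np.argmax(Y_hat, axis=1)
toy_accuracy = float(np.mean(predicted == labels))
print(toy_accuracy)
```

On data this cleanly separated the argmax rule should recover nearly every label; on real images such as ORL the regression outputs are far noisier.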

### 2.1.1 Train Test Split

In [None]:
ORL_train_x, ORL_test_x, ORL_train_y, ORL_test_y= train_test_split(ORL_data, ORL_target, test_size= 0.2)

### 2.1.2 Vectorize labels  (one hot encoding)

Using this approach, we vectorize each label.

Approach:

1. $Y$ currently belongs to $\mathbb{R}^1$ and takes a single value from the set $\left\{0,1,2,\dots,39\right\}$;
    for example, $Y$ can be 0 or 10 for sample $X_i$
    
    
2. Vectorize $Y$ so that it belongs to $\mathbb{R}^K$ (in our case $K$, the number of classes, is 40)


3. For each $Y$, set the $i$th element of the vector to 1 and the rest to 0 ($i$ is the original value of $Y$ for sample $X_j$)


For example:

Consider $Y$ to be 3 for $X_i$; the vectorized $Y$ is then $[0,0,0,1,0,\dots, 0]$, where the element $Y[3]$ is set to 1. Since $Y$ is an array, its indices start at 0, so it is the fourth element that is set to 1.
 
 
If instead the labels come from $\left\{1,2,3\right\}$ and $Y_i$ for $X_i$ is 2, the vectorized version of $Y_i$ is $[0,1,0]$, where the second element is set to 1.
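
Both examples can be reproduced with a few lines of NumPy. The helper `one_hot` below is a hypothetical, vectorized alternative shown only for illustration; it assumes integer labels that start at 0, so labels that start at 1 are shifted first.

```python
import numpy as np

def one_hot(labels, num_classes):
    """Return an array of shape (len(labels), num_classes) with a single 1 per row."""
    labels = np.asarray(labels)
    encoded = np.zeros((labels.shape[0], num_classes))
    encoded[np.arange(labels.shape[0]), labels] = 1
    return encoded

# Labels starting at 0, as in the first example (K = 40, label 3)
vec = one_hot([3], 40)[0]

# Labels starting at 1, as in the second example: shift by one before encoding
vec_shifted = one_hot(np.array([2]) - 1, 3)[0]
print(vec_shifted)  # [0. 1. 0.]
```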

In [None]:
# Vectorize each target (one hot encoder)

K = np.shape(np.unique(ORL_target))[0]

def encoder(target, _K = K):
    target_vec = []
    for _target in target:
        vector = np.zeros(_K)
        vector[_target] = 1
        target_vec.append(vector)
    return target_vec

        
# Vectorize Train Y 

ORL_train_y_vec = encoder(ORL_train_y)

# Vectorize Test Y

ORL_test_y_vec = encoder(ORL_test_y)

### 2.1.3 Regression

Using Linear Regression (with L1 and L2 loss functions) and OMP, we create three models to estimate $Y$.


In [None]:
dimX = np.shape(ORL_train_x)[1]
dimY = np.shape(ORL_train_y_vec)[1]

trainX = Variable(torch.Tensor(ORL_train_x))
trainY = Variable(torch.Tensor(ORL_train_y_vec))
testX  = Variable(torch.Tensor(ORL_test_x))

#### 2.1.3.1 Linear Regression (L2 loss function)

In [None]:
# Create Model

ORL_model_l2 = LinearRegressionModel(dimX, dimY, criterion_mode= 'l2', max_iter= 1000, optimizer_lr= 0.01)
ORL_model_l2.fit(trainX,trainY, True)

# Generate Results

ORL_model_l2_estimate_train_vec = ORL_model_l2(trainX)
ORL_model_l2_estimate_test_vec  = ORL_model_l2(testX)

# Convert Tensor to array

ORL_model_l2_estimate_train_vec = ORL_model_l2_estimate_train_vec.detach().numpy().tolist()
ORL_model_l2_estimate_test_vec  = ORL_model_l2_estimate_test_vec.detach().numpy().tolist() 

#### 2.1.3.2 Linear Regression (L1 loss function)

In [None]:
# Create Model

ORL_model_l1 = LinearRegressionModel(dimX, dimY, criterion_mode= 'l1', max_iter= 1000, optimizer_lr= 0.01)
ORL_model_l1.fit(trainX,trainY, True)

# Generate Results

ORL_model_l1_estimate_train_vec = ORL_model_l1(trainX)
ORL_model_l1_estimate_test_vec  = ORL_model_l1(testX)

# Convert Tensor to array

ORL_model_l1_estimate_train_vec = ORL_model_l1_estimate_train_vec.detach().numpy().tolist()
ORL_model_l1_estimate_test_vec  = ORL_model_l1_estimate_test_vec.detach().numpy().tolist() 

#### 2.1.3.3 OMP

In [None]:
# Create Model

ORL_model_omp = OMP(n_nonzero_coefs=5,fit_intercept=False, normalize= False)
ORL_model_omp.fit(ORL_train_x,ORL_train_y_vec)

# Generate Results

ORL_model_omp_estimate_train_vec = ORL_model_omp.predict(ORL_train_x)
ORL_model_omp_estimate_test_vec  = ORL_model_omp.predict(ORL_test_x)


### 2.1.4 Classify by Regression results

Each regression model predicts a vector in $\mathbb{R}^{K}$ with $K=40$. The one-hot targets contained only 0s and a single 1 (at the $i$th element), but the regression outputs are arbitrary real numbers, so we no longer have that luxury.

Previously we defined that the $i$th element of each vector decides its class. Since 1 is bigger than 0, a reasonable first step towards classification is to pick the biggest element of the predicted vector.


Approach:

1. Obtain the predicted target vectors from the regression models


2. For each predicted target ($Y_i \in \mathbb{R}^{40}$), find the index $k$ of the biggest element, such that $\forall j\in \left\{0,1,\dots,39\right\} :  Y_i[k] \geq Y_i[j]$


3. Choose $k$ as the class of $Y_i$
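
The rule in step 2 is a row-wise argmax. The sketch below applies it to a made-up prediction matrix (`Y_hat` is hypothetical, with $K = 4$ to keep it readable). The second call shows how applying `np.abs` before the argmax changes the result when a prediction is strongly negative.

```python
import numpy as np

# Hypothetical regression outputs for 3 samples and K = 4 classes
Y_hat = np.array([[ 0.10, 0.85, -0.05, 0.10],
                  [ 0.60, 0.20,  0.15, 0.05],
                  [-0.90, 0.30,  0.40, 0.20]])

classes = np.argmax(Y_hat, axis=1)               # biggest element per row
classes_abs = np.argmax(np.abs(Y_hat), axis=1)   # biggest magnitude per row

print(classes)      # [1 0 2]
print(classes_abs)  # [1 0 0]
```

For the third sample the largest value is 0.40 (class 2), while the largest magnitude is 0.90 (class 0), so the two rules disagree.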



#### 2.1.4.1 Linear Regression (L2 loss function)

In [None]:
# Classify using predicted vectors: the class is the index of the largest
# predicted value (no absolute value, so a strongly negative output cannot win)

# Train 

ORL_model_l2_estimate_train = []
for target in ORL_model_l2_estimate_train_vec:
    k = np.argmax(target)
    ORL_model_l2_estimate_train.append(k)
    
# Test

ORL_model_l2_estimate_test = []
for target in ORL_model_l2_estimate_test_vec:
    k = np.argmax(target)
    ORL_model_l2_estimate_test.append(k)


#### 2.1.4.2 Linear Regression (L1 loss function)

In [None]:
# Classify using predicted vectors: the class is the index of the largest
# predicted value (no absolute value, so a strongly negative output cannot win)

# Train 

ORL_model_l1_estimate_train = []
for target in ORL_model_l1_estimate_train_vec:
    k = np.argmax(target)
    ORL_model_l1_estimate_train.append(k)
    
# Test

ORL_model_l1_estimate_test = []
for target in ORL_model_l1_estimate_test_vec:
    k = np.argmax(target)
    ORL_model_l1_estimate_test.append(k)


#### 2.1.4.3 OMP

In [None]:
# Classify using predicted vectors: the class is the index of the largest
# predicted value (no absolute value, so a strongly negative output cannot win)

# Train 

ORL_model_omp_estimate_train = []
for target in ORL_model_omp_estimate_train_vec:
    k = np.argmax(target)
    ORL_model_omp_estimate_train.append(k)
    
# Test

ORL_model_omp_estimate_test = []
for target in ORL_model_omp_estimate_test_vec:
    k = np.argmax(target)
    ORL_model_omp_estimate_test.append(k)


### 2.1.5 Visualize Results


Since this is a classification task, RMSE is not a suitable metric, so a different metric is needed to evaluate our classification models.

We use the F1 score and the accuracy score to evaluate our models. Both are numbers between 0 (worst) and 1 (best).
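
To make the two metrics concrete, here is a small hand-rolled sketch. It assumes the macro-averaged F1 (the unweighted mean of the per-class F1 scores), which is what `average='macro'` requests in the cells that follow; the functions `accuracy` and `macro_f1` and the toy labels are hypothetical and only meant for illustration.

```python
import numpy as np

def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label equals the true label."""
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
    return float(np.mean(y_true == y_pred))

def macro_f1(y_true, y_pred):
    """Unweighted mean of the per-class F1 scores, F1 = 2TP / (2TP + FP + FN)."""
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
    scores = []
    for c in np.union1d(y_true, y_pred):
        tp = np.sum((y_pred == c) & (y_true == c))
        fp = np.sum((y_pred == c) & (y_true != c))
        fn = np.sum((y_pred != c) & (y_true == c))
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return float(np.mean(scores))

y_true = [0, 0, 1, 1, 2, 2]
y_pred = [0, 1, 1, 1, 2, 0]

toy_acc = accuracy(y_true, y_pred)   # 4 of 6 correct
toy_f1 = macro_f1(y_true, y_pred)    # mean of 0.5, 0.8 and 2/3
print(round(toy_acc, 4), round(toy_f1, 4))  # 0.6667 0.6556
```

Accuracy only counts correct predictions, while the macro F1 gives every class the same weight regardless of how many samples it has.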

#### 2.1.5.1 Linear Regression (L2 loss function)

In [None]:
# F1 score on train and test 

ORL_model_l2_f1_train = f1_score(ORL_train_y, ORL_model_l2_estimate_train, average='macro')
ORL_model_l2_f1_test  = f1_score(ORL_test_y, ORL_model_l2_estimate_test, average='macro')

# Accuracy score on train and test

ORL_model_l2_accuracy_train = accuracy_score(ORL_train_y, ORL_model_l2_estimate_train)
ORL_model_l2_accuracy_test  = accuracy_score(ORL_test_y, ORL_model_l2_estimate_test)

# Print Result

print(f'f1 score for training dataset using Linear Regression and L2 loss function : {ORL_model_l2_f1_train}')
print(f'f1 score for testing dataset using Linear Regression and L2 loss function : {ORL_model_l2_f1_test}')
print(f'accuracy score for training dataset using Linear Regression and L2 loss function : {ORL_model_l2_accuracy_train}')
print(f'accuracy score for testing dataset using Linear Regression and L2 loss function : {ORL_model_l2_accuracy_test}')


<font color = 'blue'> Blue </font> is the true label, 
<font color = 'lime'> lime </font> is a correct prediction, and 
<font color = 'red'> red </font> is a misclassification.

In [None]:
# Train Dataset

fig = plt.figure(figsize=(20,20))
indx = 1
for img in ORL_train_x:
    img = img.reshape(64,64)
    ax = fig.add_subplot(20,20,indx)
    ax.imshow(img, cmap = 'gray')
    ax.grid(False)
    ax.axis('off')
    # add the true label aka target aka class (blue)
    ax.text(img.shape[0]/4, img.shape[1]/4, str(ORL_train_y[indx-1]), fontsize = 20, color = 'blue', fontweight='bold')
    # add predicted label (lime is correct and red is misclassified)
    if ORL_train_y[indx-1] != ORL_model_l2_estimate_train[indx-1]:
        ax.text(img.shape[0]/2, img.shape[1]/2, str(ORL_model_l2_estimate_train[indx-1]), fontsize = 20, color = 'red', fontweight='bold', bbox=dict(fill=False, edgecolor='red', linewidth=2))
    else:
        ax.text(img.shape[0]/2, img.shape[1]/2, str(ORL_model_l2_estimate_train[indx-1]), fontsize = 20, color = 'lime', fontweight='bold')
        
    indx+=1
fig.tight_layout()
plt.show()

In [None]:
# Test Dataset

fig = plt.figure(figsize=(20,20))
indx = 1
for img in ORL_test_x:
    img = img.reshape(64,64)
    ax = fig.add_subplot(20,20,indx)
    ax.imshow(img, cmap = 'gray')
    ax.grid(False)
    ax.axis('off')
    # add the true label aka target aka class (blue)
    ax.text(img.shape[0]/5, img.shape[1]/5, str(ORL_test_y[indx-1]), fontsize = 20, color = 'blue', fontweight='bold')
    # add predicted label (lime is correct and red is misclassified)
    if ORL_test_y[indx-1] != ORL_model_l2_estimate_test[indx-1]:
        ax.text(img.shape[0]/2, img.shape[1]/2, str(ORL_model_l2_estimate_test[indx-1]), fontsize = 20, color = 'red', fontweight='bold', bbox=dict(fill=False, edgecolor='red', linewidth=2))
    else:
        ax.text(img.shape[0]/2, img.shape[1]/2, str(ORL_model_l2_estimate_test[indx-1]), fontsize = 20, color = 'lime', fontweight='bold')
        
    indx+=1
fig.tight_layout()
plt.show()

#### 2.1.5.2 Linear Regression (L1 loss function)

In [None]:
# F1 score on train and test 

ORL_model_l1_f1_train = f1_score(ORL_train_y, ORL_model_l1_estimate_train, average='macro')
ORL_model_l1_f1_test  = f1_score(ORL_test_y, ORL_model_l1_estimate_test, average='macro')

# Accuracy score on train and test

ORL_model_l1_accuracy_train = accuracy_score(ORL_train_y, ORL_model_l1_estimate_train)
ORL_model_l1_accuracy_test  = accuracy_score(ORL_test_y, ORL_model_l1_estimate_test)

# Print Result

print(f'f1 score for training dataset using Linear Regression and L1 loss function : {ORL_model_l1_f1_train}')
print(f'f1 score for testing dataset using Linear Regression and L1 loss function : {ORL_model_l1_f1_test}')
print(f'accuracy score for training dataset using Linear Regression and L1 loss function : {ORL_model_l1_accuracy_train}')
print(f'accuracy score for testing dataset using Linear Regression and L1 loss function : {ORL_model_l1_accuracy_test}')


<font color = 'blue'> Blue </font> is the true label, 
<font color = 'lime'> lime </font> is a correct prediction, and 
<font color = 'red'> red </font> is a misclassification.

In [None]:
# Train Dataset

fig = plt.figure(figsize=(20,20))
indx = 1
for img in ORL_train_x:
    img = img.reshape(64,64)
    ax = fig.add_subplot(20,20,indx)
    ax.imshow(img, cmap = 'gray')
    ax.grid(False)
    ax.axis('off')
    # add the true label aka target aka class (blue)
    ax.text(img.shape[0]/4, img.shape[1]/4, str(ORL_train_y[indx-1]), fontsize = 20, color = 'blue', fontweight='bold')
    # add predicted label (lime is correct and red is misclassified)
    if ORL_train_y[indx-1] != ORL_model_l1_estimate_train[indx-1]:
        ax.text(img.shape[0]/2, img.shape[1]/2, str(ORL_model_l1_estimate_train[indx-1]), fontsize = 20, color = 'red', fontweight='bold', bbox=dict(fill=False, edgecolor='red', linewidth=2))
    else:
        ax.text(img.shape[0]/2, img.shape[1]/2, str(ORL_model_l1_estimate_train[indx-1]), fontsize = 20, color = 'lime', fontweight='bold')
        
    indx+=1
fig.tight_layout()
plt.show()

In [None]:
# Test Dataset

fig = plt.figure(figsize=(20,20))
indx = 1
for img in ORL_test_x:
    img = img.reshape(64,64)
    ax = fig.add_subplot(20,20,indx)
    ax.imshow(img, cmap = 'gray')
    ax.grid(False)
    ax.axis('off')
    # add the true label aka target aka class (blue)
    ax.text(img.shape[0]/5, img.shape[1]/5, str(ORL_test_y[indx-1]), fontsize = 20, color = 'blue', fontweight='bold')
    # add predicted label (lime is correct and red is misclassified)
    if ORL_test_y[indx-1] != ORL_model_l1_estimate_test[indx-1]:
        ax.text(img.shape[0]/2, img.shape[1]/2, str(ORL_model_l1_estimate_test[indx-1]), fontsize = 20, color = 'red', fontweight='bold', bbox=dict(fill=False, edgecolor='red', linewidth=2))
    else:
        ax.text(img.shape[0]/2, img.shape[1]/2, str(ORL_model_l1_estimate_test[indx-1]), fontsize = 20, color = 'lime', fontweight='bold')
        
    indx+=1
fig.tight_layout()
plt.show()

#### 2.1.5.3 OMP

In [None]:
# F1 score on train and test 

ORL_model_omp_f1_train = f1_score(ORL_train_y, ORL_model_omp_estimate_train, average='macro')
ORL_model_omp_f1_test  = f1_score(ORL_test_y, ORL_model_omp_estimate_test, average='macro')

# Accuracy score on train and test

ORL_model_omp_accuracy_train = accuracy_score(ORL_train_y, ORL_model_omp_estimate_train)
ORL_model_omp_accuracy_test  = accuracy_score(ORL_test_y, ORL_model_omp_estimate_test)

# Print Result

print(f'f1 score for training dataset using OMP : {ORL_model_omp_f1_train}')
print(f'f1 score for testing dataset using OMP : {ORL_model_omp_f1_test}')
print(f'accuracy score for training dataset using OMP : {ORL_model_omp_accuracy_train}')
print(f'accuracy score for testing dataset using OMP : {ORL_model_omp_accuracy_test}')


<font color = 'blue'> Blue </font> is the true label, 
<font color = 'lime'> lime </font> is a correct prediction, and 
<font color = 'red'> red </font> is a misclassification.

In [None]:
# Train Dataset

fig = plt.figure(figsize=(20,20))
indx = 1
for img in ORL_train_x:
    img = img.reshape(64,64)
    ax = fig.add_subplot(20,20,indx)
    ax.imshow(img, cmap = 'gray')
    ax.grid(False)
    ax.axis('off')
    # add the true label aka target aka class (blue)
    ax.text(img.shape[0]/4, img.shape[1]/4, str(ORL_train_y[indx-1]), fontsize = 20, color = 'blue', fontweight='bold')
    # add predicted label (lime is correct and red is misclassified)
    if ORL_train_y[indx-1] != ORL_model_omp_estimate_train[indx-1]:
        ax.text(img.shape[0]/2, img.shape[1]/2, str(ORL_model_omp_estimate_train[indx-1]), fontsize = 20, color = 'red', fontweight='bold', bbox=dict(fill=False, edgecolor='red', linewidth=2))
    else:
        ax.text(img.shape[0]/2, img.shape[1]/2, str(ORL_model_omp_estimate_train[indx-1]), fontsize = 20, color = 'lime', fontweight='bold')
        
    indx+=1
fig.tight_layout()
plt.show()

In [None]:
# Test Dataset

fig = plt.figure(figsize=(20,20))
indx = 1
for img in ORL_test_x:
    img = img.reshape(64,64)
    ax = fig.add_subplot(20,20,indx)
    ax.imshow(img, cmap = 'gray')
    ax.grid(False)
    ax.axis('off')
    # add label aka target aka class  [blue]
    ax.text(img.shape[0]/5, img.shape[1]/5, str(ORL_test_y[indx-1]), fontsize = 20, color = 'blue', fontweight='bold')
    # add predicted label (lime is correct and red is misclassified)
    if ORL_test_y[indx-1] != ORL_model_omp_estimate_test[indx-1]:
        ax.text(img.shape[0]/2, img.shape[1]/2, str(ORL_model_omp_estimate_test[indx-1]), fontsize = 20, color = 'red', fontweight='bold', bbox=dict(fill=False, edgecolor='red', linewidth=2))
    else:
        ax.text(img.shape[0]/2, img.shape[1]/2, str(ORL_model_omp_estimate_test[indx-1]), fontsize = 20, color = 'lime', fontweight='bold')
        
    indx+=1
fig.tight_layout()
plt.show()

## 2.2 Method 2

In this method, we use the training data as a basis to reconstruct each test sample. The steps are:

1. For each class $C_j$, reconstruct the test sample $z_k$ from the training samples $x_i \in C_j$:

    - that is, write $z_k$ as a linear combination of the $x_i$: $\hat z_k = \sum_{x_i \in C_j}\alpha_i\times x_i$
    
2. Calculate $Loss_j(z_k, \hat z_k)$.

3. Find the minimum $Loss_j$ and its corresponding class $C_j$.

4. Assign the test sample $z_k$ to class $j$.
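
The four steps can be sketched in a few lines of NumPy. This is a minimal illustration on synthetic data, not the notebook's implementation: it assumes a closed-form least-squares fit via `np.linalg.lstsq` instead of a gradient-based model, and the names `classify_by_reconstruction` and `class_samples` are hypothetical.

```python
import numpy as np

def classify_by_reconstruction(class_samples, z):
    """Return the class whose samples reconstruct z with the smallest squared error.

    class_samples: dict mapping label -> array of shape (n_samples, n_features)
    z:             test sample of shape (n_features,)
    """
    losses = {}
    for label, X in class_samples.items():
        # Step 1: solve min_alpha ||X^T alpha - z||^2, so z_hat = sum_i alpha_i * x_i
        alpha, *_ = np.linalg.lstsq(X.T, z, rcond=None)
        z_hat = X.T @ alpha
        # Step 2: reconstruction loss for this class
        losses[label] = float(np.sum((z - z_hat) ** 2))
    # Steps 3 and 4: the class with the minimum loss is the prediction
    return min(losses, key=losses.get), losses

# Synthetic data (assumption): 3 classes, 4 samples each, 50 features
rng = np.random.default_rng(0)
centers = rng.normal(size=(3, 50))
class_samples = {c: centers[c] + 0.1 * rng.normal(size=(4, 50)) for c in range(3)}

z = centers[1] + 0.1 * rng.normal(size=50)   # a noisy sample drawn around class 1
predicted, losses = classify_by_reconstruction(class_samples, z)
print(predicted, losses)
```

Because each class contributes only a handful of samples in a high-dimensional space, a sample from another class is poorly represented in their span, which is what makes the loss informative.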


How do we build a linear combination?

Assume $X_1, \dots, X_{10}$ are in Class 1. Our objective is to create the model $f = W_1 X_1 + W_2 X_2 +\dots + W_{10} X_{10}$ such that $f = \hat z$, where $\hat z$ is the estimate of the test sample $z$.


$$
\hat z^T = W^TX\\
W^T = \begin{bmatrix} W_1,W_2,\dots, W_{10} \end{bmatrix} \hspace{2.2cm} X =\begin{bmatrix} X_1\\X_2\\\vdots\\X_{10}\end{bmatrix}
$$


Considering $Loss(z , \hat z)$ for each class, we can choose the class that best reconstructs the test sample $z$.


$\bullet$ First, we build $X$ for each class.

$\bullet$ Then, for each test sample, we fit a model on each class's $X$ and calculate its $Loss$.

$\bullet$ Finally, we predict the class of each sample.


In general:


$$
\hat z = X^T W 
$$
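
For the L2 loss, the weights have a closed form, which can serve as a reference for what an iterative solver should approach. The sketch below assumes that the rows of $X$ (the samples of one class) are linearly independent, so that $XX^T$ is invertible:

$$
\min_W \; \| z - X^T W \|_2^2
\quad\Longrightarrow\quad
X X^T W = X z
\quad\Longrightarrow\quad
W = (X X^T)^{-1} X z
$$

The middle equation comes from setting the gradient $-2X(z - X^T W)$ to zero.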

In [None]:
# Convert to a dataframe to split into different classes

train_dataframe = pd.DataFrame({'x':ORL_train_x.tolist(), 't': ORL_train_y})

# Sort by classes

train_dataframe = train_dataframe.sort_values(by='t')

# Split to different classes

ORL_split_train = {}
ORL_split_train_keys   = []

for c in np.unique(train_dataframe['t']):
    ORL_split_train[c] = np.array(train_dataframe[train_dataframe['t'] == c]['x'])
    ORL_split_train_keys.append(c)
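
As an aside, the same per-class split can be sketched without a DataFrame by applying one boolean mask per label, which keeps each class as a regular 2-D float array. The arrays below are synthetic stand-ins (assumptions) for `ORL_train_x` and `ORL_train_y`, and `split_by_class` is a hypothetical helper, not part of the notebook.

```python
import numpy as np

def split_by_class(X, y):
    """Map each label in y to the rows of X that carry that label."""
    return {label: X[y == label] for label in np.unique(y)}

# Synthetic stand-ins: 6 samples with 4 features each, 3 classes
X = np.arange(24, dtype=float).reshape(6, 4)
y = np.array([2, 0, 1, 0, 2, 1])

split = split_by_class(X, y)
for label, rows in split.items():
    print(label, rows.shape)
```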


### 2.2.1 Linear Regression (L2 Loss Function)

In [None]:
ORL_l2_prediction = []

for test_sample in ORL_test_x:
    
    loss_list = []
    loss_keys = []
    # key is the class
    for key in ORL_split_train_keys:
        
        # Variables (input and output)
        x = np.array([np.array(xi) for xi in ORL_split_train[key]]).T
        y = test_sample.reshape(-1,1)
        x = Variable(torch.Tensor(x))
        y = Variable(torch.Tensor(y))
        
        # create model
        model = LinearRegressionModel(x.shape[1], y.shape[1],criterion_mode='l2',max_iter=100, optimizer_lr=0.01)
        
        # calculate loss for reconstruction using class 'key'
        loss = model.fit(x,y, False)
        
        loss_list.append(loss)
        loss_keys.append(key)
  
    # Predict the label of test sample
    
    min_loss = np.argmin(loss_list)
    label_pred  = loss_keys[min_loss]
    
    ORL_l2_prediction.append(label_pred)
    
    

#### Results Visualization 

In [None]:
# F1 score on test 

ORL_model2_l2_f1_test  = f1_score(ORL_test_y, ORL_l2_prediction, average='macro')

# Accuracy score on test

ORL_model2_l2_accuracy_test  = accuracy_score(ORL_test_y, ORL_l2_prediction)

# Print Result

print(f'f1 score for testing dataset using Linear Regression and L2 loss function : {ORL_model2_l2_f1_test}')
print(f'accuracy score for testing dataset using Linear Regression and L2 loss function : {ORL_model2_l2_accuracy_test}')


<font color = 'blue'> Blue </font> is the real target, 
<font color = 'lime'> Lime </font> is a correctly predicted target, and 
<font color = 'red'> Red </font> is a misclassification.

In [None]:
# Test Dataset

fig = plt.figure(figsize=(20,20))
indx = 1
for img in ORL_test_x:
    img = img.reshape(64,64)
    ax = fig.add_subplot(20,20,indx)
    ax.imshow(img, cmap = 'gray')
    ax.grid(False)
    ax.axis('off')
    # add label aka target aka class  [blue]
    ax.text(img.shape[0]/5, img.shape[1]/5, str(ORL_test_y[indx-1]), fontsize = 20, color = 'blue', fontweight='bold')
    # add predicted label (lime is correct and red is misclassified)
    if ORL_test_y[indx-1] != ORL_l2_prediction[indx-1]:
        ax.text(img.shape[0]/2, img.shape[1]/2, str(ORL_l2_prediction[indx-1]), fontsize = 20, color = 'red', fontweight='bold', bbox=dict(fill=False, edgecolor='red', linewidth=2))
    else:
        ax.text(img.shape[0]/2, img.shape[1]/2, str(ORL_l2_prediction[indx-1]), fontsize = 20, color = 'lime', fontweight='bold')
        
    indx+=1
fig.tight_layout()
plt.show()

### 2.2.2 Linear Regression (L1 Loss Function)

In [None]:
ORL_l1_prediction = []

for test_sample in ORL_test_x:
    
    loss_list = []
    loss_keys = []
    # key is the class
    for key in ORL_split_train_keys:
        
        # Variables (input and output)
        x = np.array([np.array(xi) for xi in ORL_split_train[key]]).T
        y = test_sample.reshape(-1,1)
        x = Variable(torch.Tensor(x))
        y = Variable(torch.Tensor(y))
        
        # create model
        model = LinearRegressionModel(x.shape[1], y.shape[1],criterion_mode='l1',max_iter=100, optimizer_lr=0.01)
        
        # calculate loss for reconstruction using class 'key'
        loss = model.fit(x,y, False)
        
        loss_list.append(loss)
        loss_keys.append(key)
  
    # Predict the label of test sample
    
    min_loss = np.argmin(loss_list)
    label_pred  = loss_keys[min_loss]
    
    ORL_l1_prediction.append(label_pred)
    
    

#### Results Visualization 

In [None]:
# F1 score on test 

ORL_model2_l1_f1_test  = f1_score(ORL_test_y, ORL_l1_prediction, average='macro')

# Accuracy score on test

ORL_model2_l1_accuracy_test  = accuracy_score(ORL_test_y, ORL_l1_prediction)

# Print Result

print(f'f1 score for testing dataset using Linear Regression and L1 loss function : {ORL_model2_l1_f1_test}')
print(f'accuracy score for testing dataset using Linear Regression and L1 loss function : {ORL_model2_l1_accuracy_test}')


<font color = 'blue'> Blue </font> is the real target, 
<font color = 'lime'> Lime </font> is a correctly predicted target, and 
<font color = 'red'> Red </font> is a misclassification.

In [None]:
# Test Dataset

fig = plt.figure(figsize=(20,20))
indx = 1
for img in ORL_test_x:
    img = img.reshape(64,64)
    ax = fig.add_subplot(20,20,indx)
    ax.imshow(img, cmap = 'gray')
    ax.grid(False)
    ax.axis('off')
    # add label aka target aka class  [blue]
    ax.text(img.shape[0]/5, img.shape[1]/5, str(ORL_test_y[indx-1]), fontsize = 20, color = 'blue', fontweight='bold')
    # add predicted label (lime is correct and red is misclassified)
    if ORL_test_y[indx-1] != ORL_l1_prediction[indx-1]:
        ax.text(img.shape[0]/2, img.shape[1]/2, str(ORL_l1_prediction[indx-1]), fontsize = 20, color = 'red', fontweight='bold', bbox=dict(fill=False, edgecolor='red', linewidth=2))
    else:
        ax.text(img.shape[0]/2, img.shape[1]/2, str(ORL_l1_prediction[indx-1]), fontsize = 20, color = 'lime', fontweight='bold')
        
    indx+=1
fig.tight_layout()
plt.show()

### 2.2.3 OMP
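
As a reminder of what OMP does in this setting: it greedily picks the dictionary column most correlated with the current residual, then refits all picked columns jointly by least squares. In the cell below the columns of `x` are the training images of one class, so `n_nonzero_coefs=2` means each test image is rebuilt from at most two training images of that class. The following is a simplified NumPy sketch of the greedy procedure on synthetic data; the function `omp` is hypothetical and is not scikit-learn's implementation.

```python
import numpy as np

def omp(D, y, n_nonzero_coefs):
    """Greedy OMP sketch: approximate y with at most n_nonzero_coefs columns of D."""
    residual = y.astype(float).copy()
    support = []
    coef = np.zeros(D.shape[1])
    for _ in range(n_nonzero_coefs):
        # Pick the column most correlated with the current residual
        correlations = np.abs(D.T @ residual)
        if support:
            correlations[support] = -np.inf   # never pick a column twice
        support.append(int(np.argmax(correlations)))
        # Refit all selected columns jointly by least squares
        sol, *_ = np.linalg.lstsq(D[:, support], y, rcond=None)
        residual = y - D[:, support] @ sol
    coef[support] = sol
    return coef

# Synthetic example (assumption): y is an exact mix of columns 1 and 3
rng = np.random.default_rng(0)
D = rng.normal(size=(200, 5))
D /= np.linalg.norm(D, axis=0)                # unit-norm columns
y = 2.0 * D[:, 1] - 1.5 * D[:, 3]

coef = omp(D, y, n_nonzero_coefs=2)
print(np.nonzero(coef)[0], coef)
```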

In [None]:
ORL_omp_prediction = []

for test_sample in ORL_test_x:
    
    loss_list = []
    loss_keys = []
    # key is the class
    for key in ORL_split_train_keys:
        
        # Variables (input and output)
        x = np.array([np.array(xi) for xi in ORL_split_train[key]]).T
        y = test_sample.reshape(-1,1)
        
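        # create model (note: the `normalize` argument was removed in newer scikit-learn
        # releases; drop it if this call raises a TypeError)
        model = OMP(n_nonzero_coefs=2,fit_intercept=False, normalize= False)
        
        # calculate the reconstruction score (R^2, higher is better) using class 'key'
        loss = model.fit(x,y).score(x,y)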
        
        loss_list.append(loss)
        loss_keys.append(key)
  
    # Predict the label of test sample
    # since loss_list holds scores (R^2) here, the largest value marks the best class.
    
    best_idx = np.argmax(loss_list)
    label_pred  = loss_keys[best_idx]
    
    ORL_omp_prediction.append(label_pred)
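
The use of `argmax` here follows from how the score is defined: for scikit-learn regressors, `score` returns the coefficient of determination $R^2 = 1 - SS_{res}/SS_{tot}$, and $SS_{tot}$ depends only on the test sample. Ranking classes by the largest $R^2$ is therefore the same as ranking them by the smallest squared residual. A minimal NumPy check on made-up reconstructions follows; the data and the helper `r2` are illustrative assumptions.

```python
import numpy as np

def r2(y_true, y_pred):
    """Coefficient of determination, R^2 = 1 - SS_res / SS_tot."""
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    return 1.0 - ss_res / ss_tot

rng = np.random.default_rng(0)
y = rng.normal(size=100)                      # stands in for one test sample

# Three candidate reconstructions with different noise levels
candidates = [y + s * rng.normal(size=100) for s in (0.5, 0.1, 1.0)]

scores = [r2(y, c) for c in candidates]
residuals = [np.sum((y - c) ** 2) for c in candidates]

best_by_score = int(np.argmax(scores))        # largest R^2
best_by_residual = int(np.argmin(residuals))  # smallest squared error
print(best_by_score, best_by_residual)
```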
    
    

#### Results Visualization 

In [None]:
# F1 score on test 

ORL_model2_omp_f1_test  = f1_score(ORL_test_y, ORL_omp_prediction, average='macro')

# Accuracy score on test

ORL_model2_omp_accuracy_test  = accuracy_score(ORL_test_y, ORL_omp_prediction)

# Print Result

print(f'f1 score for testing dataset using OMP : {ORL_model2_omp_f1_test}')
print(f'accuracy score for testing dataset using OMP : {ORL_model2_omp_accuracy_test}')


<font color = 'blue'> Blue </font> is the real target, 
<font color = 'lime'> Lime </font> is a correctly predicted target, and 
<font color = 'red'> Red </font> is a misclassification.

In [None]:
# Test Dataset

fig = plt.figure(figsize=(20,20))
indx = 1
for img in ORL_test_x:
    img = img.reshape(64,64)
    ax = fig.add_subplot(20,20,indx)
    ax.imshow(img, cmap = 'gray')
    ax.grid(False)
    ax.axis('off')
    # add label aka target aka class  [blue]
    ax.text(img.shape[0]/5, img.shape[1]/5, str(ORL_test_y[indx-1]), fontsize = 20, color = 'blue', fontweight='bold')
    # add predicted label (lime is correct and red is misclassified)
    if ORL_test_y[indx-1] != ORL_omp_prediction[indx-1]:
        ax.text(img.shape[0]/2, img.shape[1]/2, str(ORL_omp_prediction[indx-1]), fontsize = 20, color = 'red', fontweight='bold', bbox=dict(fill=False, edgecolor='red', linewidth=2))
    else:
        ax.text(img.shape[0]/2, img.shape[1]/2, str(ORL_omp_prediction[indx-1]), fontsize = 20, color = 'lime', fontweight='bold')
        
    indx+=1
fig.tight_layout()
plt.show()