# Face recognition using k-Nearest Neighbor

---

### Contents
<ol>
    <li><a href="#image-loading" style="color: currentColor">Image loading</a></li>
    <li><a href="#data-preprocessing" style="color: currentColor">Data preprocessing</a></li>
    <li><a href="#pca" style="color: currentColor">Principal Component Analysis</a></li>
    <li><a href="#knn" style="color: currentColor">kNN-algorithm</a></li>
    <li><a href="#testing" style="color: currentColor">Model testing</a></li>
    <li><a href="#accuracy" style="color: currentColor">Accuracy evaluation</a></li>
    <li><a href="#further-analysis" style="color: currentColor">Further Analysis</a></li>
</ol>
<br>

---

### Libraries

In [18]:
#! C:\Users\fedbe\OneDrive\Dokumente\GitHub\topic01_team01\venv\Scripts\python.exe

import os
import numpy as np
import matplotlib.pyplot as plt
from PIL import Image
 

---

## <a id="image-loading"></a> 1. Image loading



In [19]:
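# path to the dataset folder, relative to the current working directory
img_folder = os.path.join(os.getcwd(), "../datasets")

# expected total number of images
N = 165

# two empty lists to hold the image data and the matching file names
image_arrays = []
image_names = []

# sort the file names: os.listdir returns them in arbitrary order,
# but the index-based train/test split below relies on a stable order
for img_file in sorted(os.listdir(img_folder)):
    # only load .gif images
    if img_file.endswith(".gif"):
        # get the full file path
        file_path = os.path.join(img_folder, img_file)

        # open the image and convert it to a NumPy array
        img = Image.open(file_path)
        img_array = np.array(img)

        # append the array and its file name to the lists
        image_arrays.append(img_array)
        image_names.append(img_file)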





FileNotFoundError: [Errno 2] No such file or directory

---

## <a id="data-preprocessing"></a> 2. Data preprocessing 

I would add the dataset split here, since it is part of preprocessing. **(To discuss: should code-heavy steps be moved into the functions folder to keep the markdown readable? This would apply to every part of the project.)**

In [None]:
# indices of the last 3 images of each subject (11 images per subject);
# a set gives fast membership tests in the comprehensions below
test_image_indices = {i for i in range(N) if (i % 11) in {8, 9, 10}}

# split the images into a training set and a test set
training_raw_images = [img for i, img in enumerate(image_arrays) if i not in test_image_indices]
test_raw_images = [img for i, img in enumerate(image_arrays) if i in test_image_indices]




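The index-based split above can also be wrapped in a small helper, in the spirit of moving code-heavy steps out of the notebook. The following is a minimal sketch, assuming the images are ordered by subject with 11 images each; the names `split_indices`, `images_per_subject` and `n_test` are hypothetical and not part of the project.

```python
import numpy as np


def split_indices(n_images, images_per_subject=11, n_test=3):
    """Return (train_idx, test_idx, labels) for a subject-ordered dataset.

    Assumption: image i belongs to subject i // images_per_subject and the
    last n_test images of each subject form the test set.
    """
    idx = np.arange(n_images)
    # position of each image within its subject block (0 .. images_per_subject - 1)
    position = idx % images_per_subject
    is_test = position >= images_per_subject - n_test
    # subject label derived from the block the image falls into
    labels = idx // images_per_subject
    return idx[~is_test], idx[is_test], labels


train_idx, test_idx, labels = split_indices(165)
print(len(train_idx), len(test_idx))  # 120 45
```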
### Next, the raw images are flattened, centered and normalized

In [None]:
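# flatten the images: 2D --> 1D, one row per image
X = np.array([img.flatten() for img in training_raw_images], dtype=np.float64)
X_testset = np.array([img.flatten() for img in test_raw_images], dtype=np.float64)

# compute the mean face of the training set for centering
# axis 0 --> mean over all images for each pixel
mean_face = np.mean(X, axis=0)

# center both sets with the training mean
X_centered = X - mean_face
X_testset_centered = X_testset - mean_face

# normalise by the Frobenius norm of the centered training matrix;
# the test set is divided by the same scalar so both sets share one scale
train_norm = np.linalg.norm(X_centered)
X_normalised = X_centered / train_norm
X_testset_normalised = X_testset_centered / train_norm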


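As a quick sanity check of the centering and normalising steps, the sketch below applies them to a small random matrix instead of the face images. The toy data and the name `preprocess` are assumptions for illustration only. Both sets are scaled with training statistics only, an assumption made here so that training and test vectors stay comparable.

```python
import numpy as np


def preprocess(train, test):
    """Center and scale both sets using statistics of the training set only."""
    mean_vec = train.mean(axis=0)        # per-feature mean of the training rows
    train_c = train - mean_vec
    test_c = test - mean_vec
    scale = np.linalg.norm(train_c)      # Frobenius norm (a single scalar)
    return train_c / scale, test_c / scale


rng = np.random.default_rng(0)
toy_train = rng.random((6, 4))   # 6 "images" with 4 "pixels" each
toy_test = rng.random((2, 4))

train_n, test_n = preprocess(toy_train, toy_test)
print(np.allclose(train_n.mean(axis=0), 0.0))     # True: columns are centered
print(np.isclose(np.linalg.norm(train_n), 1.0))   # True: unit Frobenius norm
```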
---

## <a id="pca"></a> 3. Principal component analysis

#### The dimensions are reduced via PCA, computed with SVD


In [None]:
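# reduce the dimension with PCA, computed via SVD
def reduce_dim(X, n_components):
    # perform the SVD of the centered data matrix (rows = images);
    # full_matrices=False returns the reduced (economy-size) matrices
    U, S, Vt = np.linalg.svd(X, full_matrices=False)

    # keep the first n_components singular vectors and values
    # note: U holds the left singular vectors (one row per image);
    # the eigenfaces in pixel space are the rows of V_reduced
    U_reduced = U[:, :n_components]
    S_reduced = S[:n_components]
    V_reduced = Vt[:n_components, :]

    # project the data onto the reduced space
    X_reduced = X @ V_reduced.T

    # eigenvalues of X.T @ X are the squared singular values
    eigenvalues = S_reduced ** 2

    # return the truncated factors, the projected data and the eigenvalues
    return U_reduced, S_reduced, V_reduced, X_reduced, eigenvalues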




reduce_dim(X_normalised, n_components=100)




    

(array([[-0.00855621,  0.07375859, -0.06631712, ...,  0.00315763,
          0.00188348,  0.01101   ],
        [-0.08725211,  0.03436491,  0.00134766, ...,  0.02774429,
          0.01895208, -0.01607961],
        [ 0.07890903,  0.03437058,  0.06142554, ...,  0.08216572,
          0.06308874,  0.10792335],
        ...,
        [-0.04916946,  0.08094005, -0.13116492, ...,  0.01863853,
         -0.01317564,  0.00837996],
        [ 0.03348046,  0.06064459,  0.12426734, ..., -0.07941685,
          0.25579863, -0.03105167],
        [ 0.03514964,  0.01754592, -0.02431475, ...,  0.08151375,
          0.03496107, -0.04207439]], shape=(120, 100)),
 array([0.55310218, 0.37001763, 0.31073541, 0.25808422, 0.21960009,
        0.21852628, 0.17173187, 0.14776376, 0.14237273, 0.12899068,
        0.12091529, 0.10816533, 0.1010261 , 0.10012002, 0.09609794,
        0.09344412, 0.08728925, 0.08253316, 0.07993395, 0.07513275,
        0.07204441, 0.06988282, 0.06675385, 0.06537095, 0.06192083,
        0.06124

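The relation between the SVD and PCA used in `reduce_dim` can be checked on toy data. The sketch below is an illustration under assumptions (random data, hypothetical variable names): it verifies that projecting onto the first `k` right singular vectors equals `U[:, :k] * S[:k]`, and that the squared singular values match the largest eigenvalues of `X.T @ X`.

```python
import numpy as np

rng = np.random.default_rng(1)
X_toy = rng.random((8, 5))
X_toy = X_toy - X_toy.mean(axis=0)   # PCA expects centered data

k = 3
U, S, Vt = np.linalg.svd(X_toy, full_matrices=False)

# projection onto the first k principal axes (rows of Vt)
projected = X_toy @ Vt[:k].T

# the same coordinates, read directly from the SVD factors
from_factors = U[:, :k] * S[:k]

# eigenvalues of the scatter matrix X^T X, sorted in descending order
eigvals = np.sort(np.linalg.eigvalsh(X_toy.T @ X_toy))[::-1]

print(np.allclose(projected, from_factors))   # True
print(np.allclose(S[:k] ** 2, eigvals[:k]))   # True
```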
---

## <a id="knn"></a> 4. kNN-algorithm
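A minimal sketch of how kNN classification could work on the PCA-reduced vectors, assuming Euclidean distance and a majority vote among the `k` nearest training samples. The function `knn_predict`, its parameters and the toy data are hypothetical and only illustrate the idea; ties are resolved in favour of the smallest label here, which is an arbitrary choice.

```python
import numpy as np


def knn_predict(train_X, train_y, test_X, k=3):
    """Predict a label for each row of test_X by majority vote of its k nearest neighbours."""
    predictions = []
    for x in test_X:
        # Euclidean distance from x to every training sample
        distances = np.linalg.norm(train_X - x, axis=1)
        # indices of the k closest training samples
        nearest = np.argsort(distances)[:k]
        # majority vote; np.unique sorts the labels, argmax picks the first maximum
        labels, counts = np.unique(train_y[nearest], return_counts=True)
        predictions.append(labels[np.argmax(counts)])
    return np.array(predictions)


# toy data: two well-separated clusters with labels 0 and 1
train_X = np.array([[0.0, 0.0], [0.1, 0.0], [0.0, 0.1],
                    [1.0, 1.0], [0.9, 1.0], [1.0, 0.9]])
train_y = np.array([0, 0, 0, 1, 1, 1])
test_X = np.array([[0.05, 0.05], [0.95, 0.95]])

print(knn_predict(train_X, train_y, test_X, k=3))  # [0 1]
```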

---

## <a id="testing"></a> 5. Model testing

---

## <a id="accuracy"></a> 6. Accuracy evaluation

---

## <a id="further-analysis"></a> 7. Further analysis