# KNN for Facial Emotion Recognition
====================================================================================================
### Introduction
k-Nearest Neighbors (kNN) is a fundamental machine learning algorithm used for classification and regression. It is a reasonable baseline for facial emotion recognition because it is simple to implement and needs no training beyond storing labeled examples. Here's how it works:

### 1. Feature Extraction
- **Process**: Extract features from facial images, focusing on key expressive points like mouth corners, eyebrows, etc.
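As a minimal sketch of landmark-based features, assume each face has already been reduced to a set of (x, y) landmark coordinates by some detector. The `landmarks` array and the `landmarks_to_features` helper are hypothetical and not part of this notebook. Centering and scaling the points makes the feature vector independent of where the face sits in the frame and how large it is:

```python
import numpy as np

def landmarks_to_features(landmarks):
    """Turn an (n_points, 2) array of landmark coordinates into a 1-D feature vector.

    Hypothetical helper: centers the points on their mean and divides by their
    overall spread, so translation and scale of the face are factored out.
    """
    pts = np.asarray(landmarks, dtype=np.float64)
    centered = pts - pts.mean(axis=0)          # remove translation
    scale = np.sqrt((centered ** 2).sum())     # overall size of the point set
    if scale == 0:
        return centered.ravel()                # degenerate case: all points identical
    return (centered / scale).ravel()          # remove scale, then flatten

# Made-up landmarks: two mouth corners and two inner eyebrow points
landmarks = np.array([[30.0, 60.0], [50.0, 60.0], [35.0, 30.0], [45.0, 30.0]])
features = landmarks_to_features(landmarks)
```

The code later in this notebook takes a simpler route and uses the flattened pixel values of each resized image as its feature vector.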

### 2. Training the Model
- **Data**: Use a dataset with facial images labeled with emotions (happy, sad, angry, etc.).
- **Method**: kNN stores the dataset rather than learning a model.

### 3. Choosing 'k' Value
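- **Importance**: The number of nearest neighbors ('k') controls how smooth the decision is: a small 'k' is sensitive to noisy or mislabeled samples, while a large 'k' blurs the boundaries between emotions.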
- **Method**: Determine 'k' through cross-validation based on the dataset.
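One way to run that cross-validation is sketched below using scikit-learn's `KNeighborsClassifier` and `cross_val_score`. The data is synthetic (generated with `make_classification`) and the candidate values of `k` are arbitrary, so treat both as stand-ins for the real feature matrix and search range:

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier

# Synthetic stand-in data (assumption): 300 samples, 20 features, 3 classes
X_demo, y_demo = make_classification(
    n_samples=300, n_features=20, n_informative=10,
    n_classes=3, random_state=0,
)

candidate_ks = [1, 3, 5, 7, 9, 11]
mean_scores = {}
for k in candidate_ks:
    clf = KNeighborsClassifier(n_neighbors=k)
    # 5-fold cross-validation; each score is the accuracy on one held-out fold
    scores = cross_val_score(clf, X_demo, y_demo, cv=5)
    mean_scores[k] = scores.mean()

# Keep the k with the highest mean validation accuracy
best_k = max(mean_scores, key=mean_scores.get)
print(f"Best k: {best_k} (mean CV accuracy {mean_scores[best_k]:.3f})")
```

Odd values of `k` are a common choice because they reduce the chance of tied votes, although ties remain possible with more than two classes.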

### 4. Classification of New Data
- **Operation**: For a new facial image, kNN identifies 'k' nearest neighbors using a distance metric (like Euclidean distance).
- **Outcome**: Assigns the most common emotion label among these neighbors to the new image.
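The two steps above can be traced on a toy example. The 2-D feature vectors and labels below are made up for illustration; real image features would have far more dimensions:

```python
import numpy as np
from collections import Counter

# Made-up 2-D feature vectors and emotion labels (illustrative only)
X_train_demo = np.array([[1.0, 1.0], [1.2, 0.8], [0.9, 1.1],
                         [5.0, 5.0], [5.2, 4.9]])
y_train_demo = np.array(["happy", "happy", "happy", "sad", "sad"])
query = np.array([1.1, 1.0])
k = 3

# Euclidean distance from the query to every training point
distances = np.linalg.norm(X_train_demo - query, axis=1)
# Indices of the k smallest distances
nearest = np.argsort(distances)[:k]
# Majority vote over the labels of those neighbors
predicted = Counter(y_train_demo[nearest].tolist()).most_common(1)[0][0]
print(predicted)  # happy
```

The query lies next to the three "happy" points, so all three of its nearest neighbors vote for that label.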

### 5. Real-Time Application
- **Use Cases**: Useful in emotion-based user interfaces, security systems, and psychological studies.
- **Requirement**: Needs continuous input data for real-time analysis.

**Note**: The performance of kNN in facial emotion recognition depends on the training dataset quality, the choice of 'k', and the extracted features. It can be computationally intensive in real-time applications due to the need for distance computations with many dataset points for each new image.
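A common way to reduce that cost is to compute all distances with a single matrix product instead of a Python loop, using the identity ||a - b||² = ||a||² + ||b||² - 2 a·b. The sketch below is one possible approach and not the one this notebook uses. The function name `pairwise_sq_dists` and the random arrays are assumptions made for illustration:

```python
import numpy as np

def pairwise_sq_dists(A, B):
    """Squared Euclidean distances between rows of A (m, d) and rows of B (n, d)."""
    A = np.asarray(A, dtype=np.float64)
    B = np.asarray(B, dtype=np.float64)
    a_sq = (A ** 2).sum(axis=1)[:, None]   # shape (m, 1)
    b_sq = (B ** 2).sum(axis=1)[None, :]   # shape (1, n)
    d2 = a_sq + b_sq - 2.0 * A @ B.T       # shape (m, n)
    return np.maximum(d2, 0.0)             # clip tiny negatives from rounding

rng = np.random.default_rng(0)
train = rng.random((500, 48 * 48))   # stand-in for flattened 48x48 images
test = rng.random((5, 48 * 48))

d2 = pairwise_sq_dists(test, train)
# Indices of the k nearest training rows per test row, without a full sort
k = 3
nearest = np.argpartition(d2, k, axis=1)[:, :k]
```

Squared distances rank neighbors in the same order as true distances, so the square root can be skipped. `np.argpartition` returns the `k` smallest entries in arbitrary order, which is enough for a majority vote.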



In [8]:
import os
import numpy as np
import matplotlib.pyplot as plt
import pandas as pd
from PIL import Image
from sklearn.model_selection import train_test_split

We will start by implementing the k Nearest Neighbors class and then use it to classify facial emotions.

In [9]:
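class KNearestNeighbor(object):
    def __init__(self, k=3):
        self.X_train = None
        self.y_train = None
        self.k = k

    def fit(self, X, y):
        """
        Train the classifier. For k-nearest neighbors this means
        memorizing the training data.

        X: training data, one sample per entry (each sample is flattened)
        y: labels, one per sample
        """
        # Cast to float so that subtracting uint8 pixel values cannot wrap around
        X = np.asarray(X, dtype=np.float64)
        self.X_train = X.reshape(len(X), -1)
        self.y_train = np.asarray(y)

    def predict(self, X):
        """
        Predict a label for each data point in X.
        Returns an array with one predicted label per sample.
        """
        X = np.asarray(X, dtype=np.float64)
        X = X.reshape(len(X), -1)
        predictions = []
        for x in X:
            # Euclidean distance from this sample to every training sample
            distances = np.linalg.norm(self.X_train - x, axis=1)
            # Indices of the k closest training samples
            k_idx = np.argsort(distances)[:self.k]
            k_labels = self.y_train[k_idx].tolist()
            # Majority vote among the k nearest labels
            predictions.append(max(set(k_labels), key=k_labels.count))
        return np.array(predictions)

    def accuracy(self, X, y):
        """
        Return the accuracy of the classifier on the data X and the labels y.
        """
        return np.mean(self.predict(X) == np.asarray(y))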

We will now prepare the dataset and set the values for the hyperparameters. Firstly, we need functions to load the data.

In [16]:
# Assuming you have a dataset named 'dataset' with features and labels

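def load_data(dataset_path, csv_file, image_size=(48, 48)):
    """
    Load the images and labels listed in a CSV file.

    The CSV is expected to have a 'path' column (image path relative to
    dataset_path) and a 'label' column. Returns flattened images and labels.
    """
    # Read the CSV file
    df = pd.read_csv(os.path.join(dataset_path, csv_file))
    images = []
    labels = []

    for _, row in df.iterrows():
        image_path = os.path.join(dataset_path, row['path'])
        # Open, resize, and flatten each image into a 1-D pixel vector
        with Image.open(image_path) as image:
            resized = image.resize(image_size)
            images.append(np.array(resized).flatten())
        labels.append(row['label'])

    return np.array(images), np.array(labels)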

In [17]:
# Assuming you have the dataset path and CSV file name
dataset_path = "./dataset"
csv_file = "data.csv"

# Load the dataset
X, y = load_data(dataset_path, csv_file)

# Split the dataset into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)


In [18]:
knn = KNearestNeighbor(k=3)
knn.fit(X_train, y_train)
predictions = knn.predict(X_test)
accuracy = knn.accuracy(X_test, y_test)

print(f"Accuracy: {accuracy}")


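Note: with a `predict` that collects the distances for every (test sample, training sample) pair into one flat list, this cell fails with `IndexError: index 17075477 is out of bounds for axis 0 with size 12362`. `np.argsort` over that flat list returns positions in a list of `len(X_test) * len(X_train)` entries, far beyond the 12362 training labels. Distances have to be ranked separately for each test sample so that every index refers to a training sample.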