
To create an image classifier that differentiates between normal faces and those of individuals with stroke or neurological disorders using MediaPipe for landmark extraction, follow this step-by-step guide. This approach classifies images from the facial landmarks extracted by MediaPipe, rather than training a model such as an SVM (Support Vector Machine) or a CNN (Convolutional Neural Network) directly on raw pixels. The landmark-based features are then passed to a lightweight classifier (a Random Forest in the training step of this guide).

# **Step 1: Setup Your Environment**
First, ensure you have all the required libraries installed. You'll need OpenCV, MediaPipe, scikit-learn, and libraries for data handling and processing (NumPy and Pandas). The code snippet you provided is a good starting point for setting up MediaPipe and capturing video frames. For image classification, however, we'll adapt it to process a dataset of images.

# **Step 2: Load Your Dataset**
Load the dataset of 600 images, ensuring that you have labels for each image indicating whether it's a normal face or a stroke/neurologically disordered face. Organize your images and labels in a manner that's easy to access and iterate over, possibly using a structured directory format or a CSV file to map images to their labels.

# **Step A: Organize Your Dataset**

First, ensure your dataset is well-organized. A common approach is to have separate folders for each category. For instance:

    dataset/
        normal/   (contains 300 images of normal faces)
        stroke/   (contains 300 images of faces with stroke or neurological disorders)
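
The folder layout above can also be turned into an explicit image-to-label mapping, which makes it easy to confirm the class counts before loading any pixels. The following is a minimal sketch and not part of the original snippet: the helper names `build_manifest` and `count_per_label`, the list of extensions, and the 0/1 label mapping are assumptions chosen to match the labels used later in this guide.

```python
from collections import Counter
from pathlib import Path

# Assumed set of image extensions; adjust to match your files
IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png", ".bmp"}

def build_manifest(root, class_to_label):
    """Return (image_path, label) pairs found under root, in a stable order.

    class_to_label maps a sub-folder name (e.g. 'normal') to an integer label.
    """
    manifest = []
    for class_name, label in class_to_label.items():
        class_dir = Path(root) / class_name
        if not class_dir.is_dir():
            raise FileNotFoundError(f"Missing class folder: {class_dir}")
        # Sort so the order does not depend on the file system
        for path in sorted(class_dir.iterdir()):
            if path.is_file() and path.suffix.lower() in IMAGE_EXTENSIONS:
                manifest.append((str(path), label))
    return manifest

def count_per_label(manifest):
    """Count how many images carry each label."""
    return Counter(label for _, label in manifest)
```

With the layout above, `build_manifest('dataset', {'normal': 0, 'stroke': 1})` followed by `count_per_label(...)` should report 300 images for each label if every file was picked up.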

# **Step B: Install Required Libraries**

Make sure you have the necessary Python libraries installed. You'll need opencv-python for image processing, numpy for handling arrays, mediapipe for landmark extraction, and scikit-learn and pandas for the later training steps. You can install them using pip:

In [None]:
%pip install opencv-python numpy mediapipe scikit-learn pandas

# **Step 3: Load and Label the Images**

You will write a function to load the images from each directory, convert them into a suitable format (e.g., a NumPy array), and label them appropriately. Here's an example script that does this:

In [None]:
import os
import cv2
import numpy as np

# Function to load images from a directory
def load_images_from_folder(folder, label):
    images = []
    labels = []
    for filename in sorted(os.listdir(folder)):  # sorted for a reproducible order
        img = cv2.imread(os.path.join(folder, filename))
        if img is not None:
            img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)  # Convert to RGB
            images.append(img)
            labels.append(label)
    return images, labels

# Load dataset
normal_images, normal_labels = load_images_from_folder('dataset/normal', 0)  # 0 for 'normal'
stroke_images, stroke_labels = load_images_from_folder('dataset/stroke', 1)  # 1 for 'stroke'

# Combine datasets
all_images = normal_images + stroke_images
all_labels = normal_labels + stroke_labels

# Keep all_images as a Python list for now: the original images may differ in
# size, so they cannot be stacked into one array until after resizing (Step 4)
all_labels = np.array(all_labels)


# **Step 4: Preprocess the Images**

Depending on your specific requirements (e.g., the input size of your classification model), you may need to resize the images to ensure they are of uniform size:

In [None]:
# Function to resize images
def resize_images(images, size=(224, 224)):
    resized_images = []
    for img in images:
        img = cv2.resize(img, size)
        resized_images.append(img)
    return np.array(resized_images)

# Resize all images
all_images_resized = resize_images(all_images)


# **Step 5: Split the Dataset**

It's essential to split your dataset into training and testing (and possibly validation) sets. This helps evaluate the performance of your model on unseen data. You can use train_test_split from sklearn.model_selection for this purpose:

In [None]:
from sklearn.model_selection import train_test_split

# Split dataset, stratifying so both classes keep the same proportion in each split
X_train, X_test, y_train, y_test = train_test_split(
    all_images_resized, all_labels, test_size=0.2, random_state=42, stratify=all_labels
)



Preprocessing the landmarks extracted from images and performing feature engineering are critical steps in preparing your data for training a classifier. This process transforms raw facial landmarks into meaningful features that a machine learning model can use to differentiate between normal faces and those affected by stroke or neurological disorders. Below, I'll guide you through the preprocessing and feature engineering stages using the landmarks extracted via MediaPipe.

# **Preprocessing Landmarks**
Preprocessing involves standardizing the format and scale of your landmark data to make it suitable for feature engineering and model training.

**Normalization**:
Normalize landmark coordinates to ensure they are scale-invariant. This can be particularly important if the images vary in size or if the faces are at different distances from the camera.

**Alignment**:
Optionally, you might align the faces based on specific landmarks (e.g., the eyes) to reduce variability due to head pose.

In [None]:
import numpy as np

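def normalize_landmarks(landmarks):
    # Assuming landmarks is a NumPy array of shape (num_landmarks, 3) for (x, y, z) coordinates
    landmarks = np.asarray(landmarks, dtype=float)
    mean = np.mean(landmarks, axis=0)
    std = np.std(landmarks, axis=0)
    # Avoid division by zero when an axis has no spread (e.g., a constant z)
    std = np.where(std == 0, 1.0, std)
    normalized_landmarks = (landmarks - mean) / std
    return normalized_landmarks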
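
The optional alignment step can be sketched as an in-plane rotation that makes the line between the two eyes horizontal. This is only an illustrative sketch: the function name `align_by_eyes` is hypothetical, and the default indices 33 and 263 are assumed to be the two outer eye corners in MediaPipe's face mesh, so verify them against the mesh topology you are using. It corrects head roll only, not yaw or pitch.

```python
import numpy as np

def align_by_eyes(landmarks, eye_a_idx=33, eye_b_idx=263):
    """Rotate landmarks in the x-y plane so the eye line is horizontal.

    landmarks: array of shape (num_landmarks, 3) holding (x, y, z).
    The eye indices are assumptions; pass your own if they differ.
    """
    landmarks = np.asarray(landmarks, dtype=float)
    eye_a = landmarks[eye_a_idx]
    eye_b = landmarks[eye_b_idx]

    # Angle of the eye line relative to the x-axis
    dx, dy = (eye_b - eye_a)[:2]
    angle = np.arctan2(dy, dx)

    # 2D rotation by -angle, applied about the midpoint between the eyes
    cos_a, sin_a = np.cos(-angle), np.sin(-angle)
    rotation = np.array([[cos_a, -sin_a],
                         [sin_a, cos_a]])
    center = (eye_a[:2] + eye_b[:2]) / 2.0

    aligned = landmarks.copy()
    aligned[:, :2] = (landmarks[:, :2] - center) @ rotation.T + center
    return aligned  # z is left unchanged
```

If the coordinates are MediaPipe's normalized values, it is safer to convert them to pixel units first (x times image width, y times image height), since rotating in normalized space distorts shapes when the image is not square.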


# **Feature Engineering**
Feature engineering involves creating meaningful features from the normalized landmarks that can help distinguish between normal and stroke faces. This might include metrics of asymmetry, distances between specific points, angles, or other statistical features derived from the landmark positions.

Asymmetry Features: Stroke often causes facial asymmetry. Calculate asymmetry by comparing the distances between corresponding landmarks on the left and right sides of the face.

Key Distances: Measure distances between key points, such as the width of the mouth, the height of the eyes, and the distance between eyebrows. Changes in these distances might indicate neurological disorders.

Statistical Features: Compute statistical measures like the mean, median, standard deviation, and variance of the landmark coordinates or the distances/angles between landmarks.

Here's an example of calculating a simple feature - the distance between two points (which could represent the eye corners, for example):

In [None]:
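import numpy as np

def calculate_distance(point1, point2):
    # Euclidean distance between two (x, y, z) points
    return np.linalg.norm(np.asarray(point1) - np.asarray(point2))

# Example usage
# `landmarks` is normally the (num_landmarks, 3) array for one face;
# a small placeholder array is used here so the cell runs on its own
landmarks = np.array([[0.0, 0.0, 0.0], [3.0, 4.0, 0.0]])
distance_example = calculate_distance(landmarks[0], landmarks[1])  # 5.0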
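
The asymmetry features described above can be sketched by reflecting each left-side landmark across an estimated facial midline and measuring how far the mirrored point lands from its right-side counterpart. This is a minimal sketch under stated assumptions: `asymmetry_features` is a hypothetical helper, the midline is approximated by a straight line through two landmarks of your choosing (for example a nose-bridge point and a chin point), and the index pairs must be supplied by you from the face mesh topology. None of these choices come from the original snippet.

```python
import numpy as np

def asymmetry_features(landmarks, pairs, midline_top_idx, midline_bottom_idx):
    """Compute one asymmetry score per left/right landmark pair.

    landmarks: array of shape (num_landmarks, 2 or 3); only x and y are used.
    pairs: list of (left_idx, right_idx) tuples of corresponding landmarks.
    midline_*_idx: two landmarks assumed to lie on the facial midline.
    Returns one value per pair: 0 means the mirrored left point coincides
    with the right point, larger values mean more asymmetry.
    """
    pts = np.asarray(landmarks, dtype=float)[:, :2]
    top, bottom = pts[midline_top_idx], pts[midline_bottom_idx]

    axis = bottom - top
    length = np.linalg.norm(axis)
    if length == 0:
        raise ValueError("Midline landmarks must be distinct points")
    axis = axis / length  # unit vector along the midline

    scores = []
    for left_idx, right_idx in pairs:
        # Reflect the left point across the midline
        rel = pts[left_idx] - top
        mirrored = top + 2.0 * np.dot(rel, axis) * axis - rel
        # Distance between the mirrored left point and the actual right point,
        # divided by the midline length so the score is scale-invariant
        scores.append(np.linalg.norm(mirrored - pts[right_idx]) / length)
    return np.array(scores)
```

Each score can be used directly as one column of the feature vector, for example one value for the mouth corners and one for the eye corners.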


# **Combining Features into a Dataset**
Once you've engineered your features, combine them into a structured format (like a NumPy array or Pandas DataFrame) along with the corresponding labels for each image. This dataset is what you'll use to train your model.

In [None]:
import pandas as pd

# Assuming you have a list of feature vectors and their corresponding labels
features = [feature_vector1, feature_vector2, ...]  # Your engineered features for each image
labels = [0, 1, ...]  # 0 for normal, 1 for stroke

# Convert to DataFrame for easy manipulation
df = pd.DataFrame(features)
df['label'] = labels
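
One way to produce those feature vectors is to run a single feature function over the landmark array of every image and stack the results, skipping images where no face was detected. The sketch below assumes a hypothetical `build_feature_matrix` helper and the convention that an undetected face is represented as `None`; neither appears in the original snippet.

```python
import numpy as np

def build_feature_matrix(landmark_sets, labels, feature_fn):
    """Stack per-image feature vectors into X and matching labels into y.

    landmark_sets: list with one (num_landmarks, 3) array per image,
                   or None where no face was detected (assumed convention).
    labels: list of integer labels aligned with landmark_sets.
    feature_fn: callable mapping a landmark array to a 1D feature vector.
    """
    if len(landmark_sets) != len(labels):
        raise ValueError("landmark_sets and labels must have the same length")

    rows, kept_labels = [], []
    for landmarks, label in zip(landmark_sets, labels):
        if landmarks is None:
            continue  # drop images without a detected face
        rows.append(np.asarray(feature_fn(landmarks), dtype=float).ravel())
        kept_labels.append(label)

    if not rows:
        raise ValueError("No usable landmark sets were provided")
    return np.vstack(rows), np.array(kept_labels)
```

The returned `X` has one row per usable image, so it can be wrapped in `pd.DataFrame(X)` and given a `label` column in the same way as the cell above.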


# **Tips for Effective Feature Engineering**
Exploration is Key: Experiment with different features and combinations thereof. The effectiveness of features can vary depending on the specifics of the dataset and the task.

Dimensionality Reduction: If you end up with a high number of features, consider using techniques like PCA (Principal Component Analysis) to reduce dimensionality while retaining the most informative aspects of your data.

Iterate and Validate: Continuously validate the effectiveness of your features by training models and evaluating their performance. Use this feedback to refine or develop new features.

The goal of these steps is to transform the raw facial landmarks into a set of features that effectively capture the differences between normal faces and those affected by stroke or neurological conditions, thereby enabling accurate classification.
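
The dimensionality-reduction and validation tips can be combined in a single scikit-learn pipeline, so that scaling and PCA are fitted only on the training folds of each cross-validation round. The following is a sketch on synthetic placeholder data, not on real facial features: the 95% explained-variance threshold, the five folds, and the random data are all assumptions for illustration.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Placeholder data standing in for engineered landmark features:
# 100 samples, 40 features, balanced binary labels
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(100, 40))
y_demo = np.array([0, 1] * 50)

# Scale, keep enough components to explain 95% of the variance, then classify
pipeline = make_pipeline(
    StandardScaler(),
    PCA(n_components=0.95),
    RandomForestClassifier(n_estimators=100, random_state=42),
)

# Stratified 5-fold cross-validation keeps class proportions in every fold
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=42)
scores = cross_val_score(pipeline, X_demo, y_demo, cv=cv, scoring="accuracy")

print(f"Mean accuracy: {scores.mean():.2f} (+/- {scores.std():.2f})")
```

Because the placeholder data is random, the score itself is meaningless; only the structure carries over once `X_demo` and `y_demo` are replaced with your own feature matrix and labels.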

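# **Step 6: Train Your Model**
Here's how you can train a Random Forest model using scikit-learn. Note that X_train and y_train here must come from a split of the engineered landmark features (a 2D matrix with one row per image), not from the raw image arrays split in Step 5, because scikit-learn estimators expect 2D input: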

In [None]:
from sklearn.ensemble import RandomForestClassifier

# Initialize the model
clf = RandomForestClassifier(n_estimators=100, random_state=42)

# Train the model
clf.fit(X_train, y_train)


# **Step 7: Evaluate the Model**
After training, evaluate the model on the test set:



In [None]:
from sklearn.metrics import accuracy_score, precision_score, recall_score, f1_score

# Predict on the test set
y_pred = clf.predict(X_test)

# Calculate metrics
accuracy = accuracy_score(y_test, y_pred)
precision = precision_score(y_test, y_pred)
recall = recall_score(y_test, y_pred)
f1 = f1_score(y_test, y_pred)

print(f"Accuracy: {accuracy:.2f}")
print(f"Precision: {precision:.2f}")
print(f"Recall: {recall:.2f}")
print(f"F1 Score: {f1:.2f}")
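
To make the four metrics concrete, the sketch below computes them by hand from confusion-matrix counts, treating label 1 (stroke) as the positive class. The counts are invented purely for illustration and are not results from any dataset; `metrics_from_counts` is a hypothetical helper.

```python
def metrics_from_counts(tp, fp, fn, tn):
    """Return accuracy, precision, recall, and F1 from confusion counts."""
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return accuracy, precision, recall, f1

# Illustrative counts only: 40 true positives, 10 false positives,
# 20 false negatives, 50 true negatives
accuracy, precision, recall, f1 = metrics_from_counts(tp=40, fp=10, fn=20, tn=50)
print(f"Accuracy: {accuracy:.2f}, Precision: {precision:.2f}, "
      f"Recall: {recall:.2f}, F1: {f1:.2f}")
# Accuracy: 0.75, Precision: 0.80, Recall: 0.67, F1: 0.73
```

In this example the model looks reasonable on accuracy yet misses a third of the positive cases, which is why it can help to read recall alongside accuracy rather than relying on a single number.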


# **Implementing Image Classification with MediaPipe and Training the Model**
To tie everything together for your specific use case:

Extract Landmarks: Use MediaPipe to extract facial landmarks from your images, as discussed earlier.

Preprocess and Feature Engineering: Preprocess these landmarks to normalize them, then engineer features that can discriminate between normal and stroke-affected faces.

Prepare Your Dataset: Combine your engineered features into a structured dataset with labels.

Split, Train, and Evaluate: Split your dataset, train a machine learning model on the engineered features, and evaluate its performance on the held-out test set.

In [None]:
# Assuming features and labels are prepared
from sklearn.model_selection import train_test_split
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score

# Split the dataset
X_train, X_test, y_train, y_test = train_test_split(features, labels, test_size=0.2, random_state=42)

# Initialize and train the classifier
clf = RandomForestClassifier(n_estimators=100, random_state=42)
clf.fit(X_train, y_train)

# Predict and evaluate
y_pred = clf.predict(X_test)
accuracy = accuracy_score(y_test, y_pred)

print(f"Model Accuracy: {accuracy:.2f}")
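
The "Extract Landmarks" step above can be sketched with MediaPipe's legacy `solutions.face_mesh` interface in static-image mode. This is a minimal sketch under assumptions: the helper names are hypothetical, each image is assumed to be an RGB NumPy array as produced by the loading step, only the first detected face is used, and newer MediaPipe releases favor the Tasks API, so check which interface your installed version provides.

```python
import numpy as np

def landmarks_to_array(face_landmarks):
    """Convert one MediaPipe face-landmark result into a (num_landmarks, 3) array."""
    return np.array(
        [[lm.x, lm.y, lm.z] for lm in face_landmarks.landmark],
        dtype=float,
    )

def extract_landmarks(rgb_images):
    """Return one landmark array per image, or None where no face is found."""
    # Imported here so the conversion helper above can be used without MediaPipe
    import mediapipe as mp

    results = []
    with mp.solutions.face_mesh.FaceMesh(
        static_image_mode=True,        # treat every image independently
        max_num_faces=1,
        min_detection_confidence=0.5,
    ) as face_mesh:
        for image in rgb_images:
            output = face_mesh.process(image)
            if output.multi_face_landmarks:
                results.append(landmarks_to_array(output.multi_face_landmarks[0]))
            else:
                results.append(None)  # no face detected in this image
    return results
```

The x and y values returned by MediaPipe are normalized by image width and height. The `None` entries mark images that should be dropped, together with their labels, before the features are built.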
