# **Face Recognition (CelebA dataset)**


> Performed Facial Recognition using CNN as well as VGG16









### **Using CNN:**


> Code Summary:

*   the code loads facial attribute data, prepares image and label data, builds a CNN model, trains it using the generated data, and evaluates its performance.

*   The model aims to classify whether a person is smiling or not based on the provided facial images.

*   The accuracy score and classification report provide insights into the model's performance in predicting the smiling attribute.



In [44]:
# Import libraries
import os
import numpy as np
import pandas as pd
import cv2
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import LabelEncoder
from keras.models import Sequential
from keras.layers import Conv2D, MaxPooling2D, Flatten, Dense, Dropout
from keras.preprocessing.image import ImageDataGenerator
from keras.utils import to_categorical
from sklearn.metrics import classification_report, accuracy_score

Here due to memory issue in my local system I have used a subset of the actual dataset with 100 records - which may affect accuracy;

> Described about it in detail in challenges and limitations at the end





In [66]:
# Path to the CelebA dataset
dataset_dir = '/content/sample_data/celeba/'

In [67]:
# Load the CelebA dataset annotations
annotations_df = pd.read_csv('/content/sample_data/list_attr_celeba.csv')
annotations_df = annotations_df.head(100)

In [47]:
annotations_df.columns

Index(['image_id', '5_o_Clock_Shadow', 'Arched_Eyebrows', 'Attractive',
       'Bags_Under_Eyes', 'Bald', 'Bangs', 'Big_Lips', 'Big_Nose',
       'Black_Hair', 'Blond_Hair', 'Blurry', 'Brown_Hair', 'Bushy_Eyebrows',
       'Chubby', 'Double_Chin', 'Eyeglasses', 'Goatee', 'Gray_Hair',
       'Heavy_Makeup', 'High_Cheekbones', 'Male', 'Mouth_Slightly_Open',
       'Mustache', 'Narrow_Eyes', 'No_Beard', 'Oval_Face', 'Pale_Skin',
       'Pointy_Nose', 'Receding_Hairline', 'Rosy_Cheeks', 'Sideburns',
       'Smiling', 'Straight_Hair', 'Wavy_Hair', 'Wearing_Earrings',
       'Wearing_Hat', 'Wearing_Lipstick', 'Wearing_Necklace',
       'Wearing_Necktie', 'Young'],
      dtype='object')

In [48]:
# Select relevant columns for facial recognition
selected_attributes = ['image_id', 'Smiling', 'Male', 'Young']
data = annotations_df[selected_attributes]

In [49]:
# Remove images with missing attributes
data = data.dropna()

In [50]:
# Define the target attribute
target_attribute = 'Smiling'

In [52]:
# Prepare the image data
image_dir = os.path.join(dataset_dir)
image_files = data['image_id'].values
images = []
for image_file in image_files:
    image_path = os.path.join(image_dir, image_file)
    image = cv2.imread(image_path)
    image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)  # Convert to RGB
    images.append(image)
images = np.array(images)

In [53]:
# Prepare the target labels
labels = data[target_attribute].values
le = LabelEncoder()
labels = le.fit_transform(labels)
labels = to_categorical(labels)

In [54]:
# Split the data into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(images, labels, test_size=0.2, random_state=42)

In [55]:
# Preprocess the image data
train_datagen = ImageDataGenerator(rescale=1./255)
test_datagen = ImageDataGenerator(rescale=1./255)

train_generator = train_datagen.flow(X_train, y_train, batch_size=32)
test_generator = test_datagen.flow(X_test, y_test, batch_size=32)

In [56]:
# Build the facial recognition model
model = Sequential()
model.add(Conv2D(32, (3, 3), activation='relu', input_shape=(218, 178, 3)))
model.add(MaxPooling2D((2, 2)))
model.add(Conv2D(64, (3, 3), activation='relu'))
model.add(MaxPooling2D((2, 2)))
model.add(Flatten())
model.add(Dense(64, activation='relu'))
model.add(Dropout(0.5))
model.add(Dense(2, activation='softmax'))

model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])

In [57]:
# Train the model
model.fit(train_generator, epochs=10, validation_data=test_generator)

Epoch 1/10
Epoch 2/10
Epoch 3/10
Epoch 4/10
Epoch 5/10
Epoch 6/10
Epoch 7/10
Epoch 8/10
Epoch 9/10
Epoch 10/10


<keras.callbacks.History at 0x7fbfc3b2d1e0>

In [58]:
# Evaluate the model
test_predictions = model.predict(test_generator)
test_predictions = np.argmax(test_predictions, axis=1)
test_labels = np.argmax(y_test, axis=1)
accuracy = accuracy_score(test_labels, test_predictions)
classification_report = classification_report(test_labels, test_predictions)



In [63]:
# Create a DataFrame with actual and predicted values
df = pd.DataFrame({'Actual': le.inverse_transform(test_labels), 'Predicted': le.inverse_transform(test_predictions)})

# Accuracy:
print("Accuracy: ", accuracy*100,"%")

Accuracy:  45.0 %


In [65]:
print("Classification Report:")
print(classification_report)

Classification Report:
              precision    recall  f1-score   support

           0       0.44      0.89      0.59         9
           1       0.50      0.09      0.15        11

    accuracy                           0.45        20
   macro avg       0.47      0.49      0.37        20
weighted avg       0.47      0.45      0.35        20



Accuracy and Classification Report:


> The accuracy for this model is 45.00 %


*   The model has higher precision and recall for class 0 (non-smiling)
compared to class 1 (smiling).
*   The F1-scores indicate that the model's performance is relatively better for class 0.
*   The low recall for class 1 suggests that the model struggles to identify actual instances of smiling correctly.
*   The accuracy of the model is 45%, indicating that it is not performing well in accurately classifying smiling and non-smiling attributes.



**Using VGG16**

> Code Summary:


*   The methodology involves training a deep learning model with pre-trained **VGG16** as the base model.

*  The model is trained using image data generators and the binary
cross-entropy loss function.

*   The results obtained include accuracy scores and confusion matrices showing for each attribute.

In [41]:
# Required Libraries:

import os
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score, confusion_matrix
from keras.preprocessing.image import ImageDataGenerator
from keras.applications import VGG16
from keras.models import Sequential
from keras.layers import Dense, Flatten
from keras.optimizers import Adam

In [2]:
# Set the path to the CelebA dataset
dataset_dir = '/content/sample_data/celeba'

In [3]:
# Load the CelebA annotations file
df_attributes = pd.read_csv('/content/sample_data/list_attr_celeba.csv',low_memory = False)
df_attributes = df_attributes.head(100)

In [21]:
df_attributes.shape

(100, 41)

In [4]:
df_attributes.columns

Index(['image_id', '5_o_Clock_Shadow', 'Arched_Eyebrows', 'Attractive',
       'Bags_Under_Eyes', 'Bald', 'Bangs', 'Big_Lips', 'Big_Nose',
       'Black_Hair', 'Blond_Hair', 'Blurry', 'Brown_Hair', 'Bushy_Eyebrows',
       'Chubby', 'Double_Chin', 'Eyeglasses', 'Goatee', 'Gray_Hair',
       'Heavy_Makeup', 'High_Cheekbones', 'Male', 'Mouth_Slightly_Open',
       'Mustache', 'Narrow_Eyes', 'No_Beard', 'Oval_Face', 'Pale_Skin',
       'Pointy_Nose', 'Receding_Hairline', 'Rosy_Cheeks', 'Sideburns',
       'Smiling', 'Straight_Hair', 'Wavy_Hair', 'Wearing_Earrings',
       'Wearing_Hat', 'Wearing_Lipstick', 'Wearing_Necklace',
       'Wearing_Necktie', 'Young'],
      dtype='object')

In [22]:
# Select relevant columns for facial recognition
selected_columns = ['image_id', 'Eyeglasses', 'Smiling', 'Male', 'Young']
df = df_attributes[selected_columns]

In [23]:
# Preprocess the dataset
train_df, test_df = train_test_split(df, test_size=0.2, random_state=42)

In [33]:
# Create data generators for training and testing
train_datagen = ImageDataGenerator(rescale=1./255)
test_datagen = ImageDataGenerator(rescale=1./255)

train_generator = train_datagen.flow_from_dataframe(
    dataframe=train_df,
    directory=dataset_dir,
    x_col='image_id',
    y_col=selected_columns[1:],
    class_mode='raw',
    target_size=(224, 224),
    batch_size=32
)

test_generator = test_datagen.flow_from_dataframe(
    dataframe=test_df,
    directory=dataset_dir,
    x_col='image_id',
    y_col=selected_columns[1:],
    class_mode='raw',
    target_size=(224, 224),
    batch_size=32
)

Found 80 validated image filenames.
Found 20 validated image filenames.


In [34]:
# Load the pre-trained VGG16 model
base_model = VGG16(weights='imagenet', include_top=False, input_shape=(224, 224, 3))

In [35]:
# Create the facial recognition model
model = Sequential()
model.add(base_model)
model.add(Flatten())
model.add(Dense(256, activation='relu'))
model.add(Dense(len(selected_columns[1:]), activation='sigmoid'))

In [36]:
# Compile the model
model.compile(optimizer=Adam(lr=0.001), loss='binary_crossentropy', metrics=['accuracy'])

  super().__init__(name, **kwargs)


In [37]:
# Train the model
history = model.fit(train_generator, epochs=10, validation_data=test_generator)

Epoch 1/10
Epoch 2/10
Epoch 3/10
Epoch 4/10
Epoch 5/10
Epoch 6/10
Epoch 7/10
Epoch 8/10
Epoch 9/10
Epoch 10/10


In [38]:
# Evaluate the model on test data
test_generator.reset()
pred_prob = model.predict(test_generator)
pred_labels = (pred_prob > 0.5).astype(int)



In [39]:
# Convert predictions and actual values to a DataFrame
columns = selected_columns[1:]
actual_values = pd.DataFrame(test_generator.labels, columns=columns)
predicted_values = pd.DataFrame(pred_labels, columns=columns)

In [42]:
# Calculate accuracy for each attribute
accuracy = pd.DataFrame([accuracy_score(actual_values[attr], predicted_values[attr]) for attr in columns])

# Combine actual and predicted values with accuracy
results = pd.concat([actual_values, predicted_values, accuracy], axis=1, keys=['Actual', 'Predicted', 'Accuracy'])
accuracy.mean()*100

0    22.5
dtype: float64

In [43]:
# Calculate confusion matrix for each attribute
confusion_matrices = {}
for attr in columns:
    cm = confusion_matrix(actual_values[attr], predicted_values[attr])
    confusion_matrices[attr] = cm
print('Confusion Matrice:')
for attr in columns:
    print(f'\nAttribute: {attr}')
    print(confusion_matrices[attr])

Confusion Matrice:

Attribute: Eyeglasses
[[ 0 20]
 [ 0  0]]

Attribute: Smiling
[[ 0  9  0]
 [ 0  0  0]
 [ 0 11  0]]

Attribute: Male
[[ 0 15  0]
 [ 0  0  0]
 [ 0  5  0]]

Attribute: Young
[[ 0  2]
 [ 0 18]]


### Accuracy and Confusion Matrices:



>  The accuracy of this model is 22.50 %


> These confusion matrices provide insights into the model's performance for each attribute and highlight the misclassifications made by the model.




*   For "Eyeglasses," all samples are predicted as the negative class (no eyeglasses), resulting in a false negative count of 20.

*   For "Smiling," there are no positive predictions for any class, resulting in all zeros.

*   For "Male," all samples are predicted as the negative class (not male), resulting in a false negative count of 15.

*   For "Young," all samples are predicted as the positive class (young), except for 2 false negatives.





### **Limitations and Challenges:**


Due to memory issue in my local system I have used a subset of the actual dataset with 100 records;
*  Due to it the model accuracy may get affected as the training data is too small or not diverse enough,
*  there may be imbalence of classes where certain attributes may have significantly more samples than others so the model may be biased towards the majority class and struggle to accurately predict the minority class.


> To improve accuracy


*   the actual dataset  or a larger subset can be used to increase the model training set
*   Furthermore, oversampling or undersampling can be employed to balance the classes so that model learns equally from all attribute categories.





*From Above we can infer that CNN has higher accuracy than VGG16*

*  As here I've took subset of actual dataset - CNN performed well than VGG16; whereas CNN is prone to overfitting on large datasets.
*  Also CelebA dataset contains images with variations in lighting, occlusion, and face angles that are better captured by the CNN architecture used, it can result in higher accuracy compared to VGG16.

Both models may perform differently based on the input dataset 
