<h1 style="text-align:center">Face Detection</h1>
<h3 style="text-align:center">By: <a href="https://github.com/Nancy-07">Nancy Galicia</a>, <a href="https://github.com/AlvaroVasquezAI">Álvaro García</a> and <a href="https://github.com/ConnorKenwayAC3">Omar Sanchez</a></h3>

<h3>Table of contents</h3>

<div style="margin-top: 20px">
    <ol>
        <li><a href="#introduction">Introduction</a></li>
        <li><a href="#libraries">Libraries</a></li>
        <li><a href="#data">Data</a></li>
        <li><a href="#model">Model</a></li>
        <li><a href="#trainingModel">Training Model</a></li>
        <li><a href="#evaluatingModel">Evaluating Model</a></li>
        <li><a href="#savingModel">Saving Model</a></li>
        <li><a href="#loadingModel">Loading Model</a></li>
        <li><a href="#testingModel">Testing Model</a></li>
        <li><a href="#conclusion">Conclusion</a></li>
    </ol>
</div>

<h2 id="introduction" style="text-align:center">Introduction</h2>
This notebook demonstrates the creation of a face detection model using transfer learning. The process begins with labeling images to identify those with and without faces, followed by creating bounding boxes around the faces. A pre-trained model, MobileNetV2, serves as the base model. On top of MobileNetV2, a custom model is built specifically for face detection, which includes additional layers for classification and regression tasks. The model is trained with the labeled dataset, leveraging the power of transfer learning to improve accuracy and reduce training time. Finally, the performance of the model is evaluated using a test set.

<h2 id="libraries" style="text-align:center">Libraries</h2>

In [None]:
import os
import cv2
import matplotlib.pyplot as plt
import pandas as pd
import json
import tensorflow as tf
from tensorflow.keras.layers import Input, Dense, GlobalMaxPooling2D
from tensorflow.keras.models import Model
from tensorflow.keras.applications import MobileNetV2
from tensorflow.keras.utils import Sequence
import numpy as np

<h2 id="data" style="text-align:center">Data</h2>

<h3 style="text-align:center">Datasets</h3>

<h4 style="text-align:center">LFW | Jack</h4>

<div style="justify-content:center; display:flex">
    <img src="Data/Test/Images/1_1.jpg" style="width: 250px; height: 250px"/>
    <img src="Data/Test/Images/104_0.jpg" style="width: 250px; height: 250px"/>
</div>

<h3 style="text-align:center">Labeled Faces in the Wild</h3>

The dataset "Labeled Faces in the Wild" (LFW) contains images of faces of famous people. This dataset is used for images that contain faces. The dataset contains 13,233 images of 5,749 people. 

We reduced the dataset to 11,917 images to improve its quality by removing the images that contain more than one face.

The size of each image is 250x250 pixels.

<div style="justify-content:center; display:flex">
    <img src="Data/Test/Images/1_1.jpg" style="width: 250px; height: 250px"/>
    <img src="Data/Test/Images/64_1.jpg" style="width: 250px; height: 250px"/>
    <img src="Data/Test/Images/71_1.jpg" style="width: 250px; height: 250px"/>
    <img src="Data/Test/Images/102_1.jpg" style="width: 250px; height: 250px"/>
</div>


<h3 style="text-align:center">Jack</h3>

The "Jack" dataset contains 3,795 random images and is used as the source of images without faces. Some of its images do contain faces, but these are unclear or badly positioned.

We resize the images to 250x250 pixels.

<div style="justify-content:center; display:flex">
    <img src="Data/Test/Images/140_0.jpg" style="width: 250px; height: 250px"/>
    <img src="Data/Test/Images/169_0.jpg" style="width: 250px; height: 250px"/>
    <img src="Data/Test/Images/152_0.jpg" style="width: 250px; height: 250px"/>
    <img src="Data/Test/Images/203_0.jpg" style="width: 250px; height: 250px"/>
</div>


<h3 style="text-align:center">Labeling data</h3>
<h4 style="text-align:center">Face | No face</h4>

<div style="justify-content:center; display:flex">
    <img src="Data/Test/Images/64_1.jpg" style="width: 250px; height: 250px"/>
    <img src="Data/Test/Images/203_0.jpg" style="width: 250px; height: 250px"/>
</div>

Each face is labeled with two points: the top-left and the bottom-right corners of its bounding box. The coordinates are normalized, so they lie between 0 and 1. The labels are stored in JSON files.

The JSON file has the following structure:

```json
{
    "image": "file_name.jpg",
    "bbox": [
        x1, 
        y1,
        x2,
        y2
    ],
    "class": 1|0
}
```
where:
- image: name of the image
- bbox: coordinates of the bounding box (x1, y1, x2, y2), where (x1, y1) is the top left coordinate and (x2, y2) is the bottom right coordinate
- class: 1 if the image contains a face, 0 if the image does not contain a face
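
A minimal sketch of how one such label record could be read and sanity-checked, assuming the field names shown above. The helper name `parse_label` and the sample values are hypothetical and not part of the notebook. How the bounding box is stored for images without a face is not stated here, so the sketch does not constrain it beyond the [0, 1] range.

```python
import json

def parse_label(text):
    """Parse one label record and check it against the structure described above."""
    record = json.loads(text)
    x1, y1, x2, y2 = record["bbox"]
    # Coordinates are normalized, so each value must lie in [0, 1].
    if not all(0.0 <= v <= 1.0 for v in (x1, y1, x2, y2)):
        raise ValueError(f"bbox values out of range: {record['bbox']}")
    # The class flag is binary: 1 = face, 0 = no face.
    if record["class"] not in (0, 1):
        raise ValueError(f"unexpected class: {record['class']}")
    # For a face, the top-left corner must precede the bottom-right corner.
    if record["class"] == 1 and not (x1 < x2 and y1 < y2):
        raise ValueError(f"degenerate bbox: {record['bbox']}")
    return record["image"], record["class"], (x1, y1, x2, y2)

# Hypothetical record; the values are made up for illustration.
sample = '{"image": "1_1.jpg", "bbox": [0.25, 0.2, 0.75, 0.8], "class": 1}'
name, cls, bbox = parse_label(sample)
```

A check of this kind can be run once over all label files before training to catch malformed records early.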




<h3 style="text-align:center">Split data</h3>

We split the data into training, validation and test sets:

<ul>
    <li>Training set: 80% of the data</li>
    <li>Validation set: 10% of the data</li>
    <li>Test set: 10% of the data</li>
</ul>

Data distribution:

- Training set: 9,588 images
- Validation set: 1,200 images
- Test set: 1,195 images
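
The notebook does not show how the split was produced. The following is one possible sketch of an 80/10/10 split, assuming a single seeded shuffle followed by slicing. The function name `split_dataset`, the seed and the file names are hypothetical, and the sketch is not expected to reproduce the exact counts listed above.

```python
import random

def split_dataset(items, train_frac=0.8, val_frac=0.1, seed=0):
    """Shuffle once with a fixed seed, then cut into train/validation/test."""
    items = list(items)
    random.Random(seed).shuffle(items)
    n_train = int(len(items) * train_frac)
    n_val = int(len(items) * val_frac)
    train = items[:n_train]
    val = items[n_train:n_train + n_val]
    test = items[n_train + n_val:]  # the remainder goes to the test set
    return train, val, test

# Hypothetical file names, only to show the resulting sizes.
files = [f"{i}_1.jpg" for i in range(100)]
train_files, val_files, test_files = split_dataset(files)
```

Fixing the seed keeps the three sets disjoint and reproducible across runs.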

<h4>Dataset structure</h4>

The structure of the dataset is the following:

```
Data
│
└───Train
│   │
│   └───Images
│   │   │   1_1.jpg
│   │   │   2_1.jpg
│   │   │   ...
│   │
│   └───Labels
│       │   1_1.json
│       │   2_1.json
│       │   ...
│
└───Validation
│   │
│   └───Images
│   │   │   1_1.jpg
│   │   │   2_1.jpg
│   │   │   ...
│   │
│   └───Labels
│       │   1_1.json
│       │   2_1.json
│       │   ...
│
└───Test
    │
    └───Images
    │   │   1_1.jpg
    │   │   2_1.jpg
    │   │   ...
    │
    └───Labels
        │   1_1.json
        │   2_1.json
        │   ...
```
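
Because each split keeps images and labels in sibling folders with matching file names, the label path can be derived from the image path. A small sketch of that mapping follows; the helper name `label_path_for` is hypothetical, and POSIX-style paths are assumed.

```python
from pathlib import PurePosixPath

def label_path_for(image_path):
    """Map Data/<split>/Images/<name>.jpg to Data/<split>/Labels/<name>.json."""
    path = PurePosixPath(image_path)
    split_dir = path.parent.parent  # e.g. Data/Test
    return str(split_dir / "Labels" / (path.stem + ".json"))

label_path = label_path_for("Data/Test/Images/1_1.jpg")
```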


<h3 style="text-align:center">Download data</h3>

To download the data, you can use the following link: <a href="#">Data.zip</a>

<h3 style="text-align:center">Load data</h3>

<h4 style="text-align:center">Are you using Google Colab?</h4>
<p style="text-align:center"> If you are using Google Colab, run the following code to load the data from Google Drive. </p>
    
```python  
from google.colab import drive
drive.mount('/content/drive')

import zipfile

zip_path = '/content/drive/My Drive/FaceDetection/Data.zip'
!cp "{zip_path}" .
!ls
with zipfile.ZipFile('Data.zip', 'r') as zip_ref:
    zip_ref.extractall('Dataset')
!ls Dataset

Data = 'Dataset/Data'
```

In [None]:
"""
from google.colab import drive

drive.mount('/content/drive')
"""

In [None]:
"""
import zipfile

zip_path = '/content/drive/My Drive/FaceDetection/Data.zip'
!cp "{zip_path}" .
!ls
with zipfile.ZipFile('Data.zip', 'r') as zip_ref:
    zip_ref.extractall('Dataset')
!ls Dataset

Data = 'Dataset/Data'
"""

<h4 style="text-align:center">Are you using a local environment?</h4>
<p style="text-align:center"> If you are using a local environment, just put the path where the data is located. </p>

```python 
Data = 'path_folder_data'
```


In [None]:
Data = "Data"

<h3 style="text-align:center">What does the data look like?</h3>

In [None]:
def plotDataset(path_image_with_face, path_image_without_face):
    image_with_face = cv2.imread(path_image_with_face)
    image_without_face = cv2.imread(path_image_without_face)
    fig, ax = plt.subplots(1, 2, figsize=(10, 5))

    path_label_with_face = path_image_with_face.replace("Images", "Labels").replace(".jpg", ".json")
    path_label_without_face = path_image_without_face.replace("Images", "Labels").replace(".jpg", ".json")

    with open(path_label_with_face) as json_file:
        label_with_face = json.load(json_file)
    with open(path_label_without_face) as json_file:
        label_without_face = json.load(json_file)

    name_with_face = label_with_face["image"]
    class_with_face = label_with_face["class"]
    coordinates_with_face = label_with_face["bbox"]
    x1_1 = coordinates_with_face[0] * image_with_face.shape[1]
    y1_1 = coordinates_with_face[1] * image_with_face.shape[0]
    x2_1 = coordinates_with_face[2] * image_with_face.shape[1]
    y2_1 = coordinates_with_face[3] * image_with_face.shape[0]

    name_without_face = label_without_face["image"]
    class_without_face = label_without_face["class"]
    coordinates_without_face = label_without_face["bbox"]
    x1_0 = coordinates_without_face[0] * image_without_face.shape[1]
    y1_0 = coordinates_without_face[1] * image_without_face.shape[0]
    x2_0 = coordinates_without_face[2] * image_without_face.shape[1]
    y2_0 = coordinates_without_face[3] * image_without_face.shape[0]

    cv2.rectangle(image_with_face, (int(x1_1), int(y1_1)), (int(x2_1), int(y2_1)), (0, 255, 0), 2)
    ax[0].imshow(cv2.cvtColor(image_with_face, cv2.COLOR_BGR2RGB))
    ax[0].set_title(f"Image: {name_with_face}\nClass: {class_with_face}\nCoordinates: {coordinates_with_face}")
    ax[0].axis("off")

    cv2.rectangle(image_without_face, (int(x1_0), int(y1_0)), (int(x2_0), int(y2_0)), (0, 255, 0), 2)
    ax[1].imshow(cv2.cvtColor(image_without_face, cv2.COLOR_BGR2RGB))
    ax[1].set_title(f"Image: {name_without_face}\nClass: {class_without_face}\nCoordinates: {coordinates_without_face}")
    ax[1].axis("off")

    plt.show()

plotDataset(f"{Data}/Test/Images/1_1.jpg", f"{Data}/Test/Images/8_0.jpg")


<h2 id="model" style="text-align:center">Model</h2>

<h3 style="text-align:center">MobileNetV2</h3>

We use MobileNetV2, pre-trained on the ImageNet dataset, as the feature extractor for face detection. The model has 155 layers and 3,504,872 parameters.

<div style="justify-content:center; display:flex">
    <img src="https://www.researchgate.net/publication/361260658/figure/fig1/AS:1179073011290112@1658124320675/The-architecture-of-MobileNetV2-DNN.png" style="width: 500px; height: 500px"/>
</div>

<br>

<h3 style="text-align:center">Generating data</h3>

We build tensors for the training, validation and test sets. They contain the images, the bounding box coordinates and the class of each image.

Images

In [None]:
def load_image(x): 
    byte_img = tf.io.read_file(x)
    img = tf.io.decode_jpeg(byte_img, channels=3)  # force 3 channels so grayscale JPEGs also match [224, 224, 3]
    return img

train_images = tf.data.Dataset.list_files(f"{Data}/Train/Images/*.jpg", shuffle=False)
train_images = train_images.map(load_image)
train_images = train_images.map(lambda x: tf.image.resize(x, (224,224)))
train_images = train_images.map(lambda x: x/255)

validation_images = tf.data.Dataset.list_files(f"{Data}/Validation/Images/*.jpg", shuffle=False)
validation_images = validation_images.map(load_image)
validation_images = validation_images.map(lambda x: tf.image.resize(x, (224,224)))
validation_images = validation_images.map(lambda x: x/255)

test_images = tf.data.Dataset.list_files(f"{Data}/Test/Images/*.jpg", shuffle=False)
test_images = test_images.map(load_image)
test_images = test_images.map(lambda x: tf.image.resize(x, (224,224)))
test_images = test_images.map(lambda x: x/255)

train_images = train_images.map(lambda x: tf.ensure_shape(x, [224, 224, 3]), num_parallel_calls=tf.data.AUTOTUNE)
validation_images = validation_images.map(lambda x: tf.ensure_shape(x, [224, 224, 3]), num_parallel_calls=tf.data.AUTOTUNE)
test_images = test_images.map(lambda x: tf.ensure_shape(x, [224, 224, 3]), num_parallel_calls=tf.data.AUTOTUNE)

Labels

In [None]:
def load_labels(json_file_name):
    with open(json_file_name.numpy(), 'r', encoding='utf-8') as json_file:
        data = json.load(json_file)

    return [data['class'], data['bbox']]

def map_labels(x):
    return tf.py_function(func=load_labels, inp=[x], Tout=[tf.uint8, tf.float32], name='map_labels')

train_labels = tf.data.Dataset.list_files(f"{Data}/Train/Labels/*.json", shuffle=False)
train_labels = train_labels.map(map_labels, num_parallel_calls=tf.data.AUTOTUNE)
train_labels = train_labels.map(lambda class_, bbox: (tf.ensure_shape(class_, []), tf.ensure_shape(bbox, [4])), num_parallel_calls=tf.data.AUTOTUNE)

validation_labels = tf.data.Dataset.list_files(f"{Data}/Validation/Labels/*.json", shuffle=False)
validation_labels = validation_labels.map(map_labels, num_parallel_calls=tf.data.AUTOTUNE)
validation_labels = validation_labels.map(lambda class_, bbox: (tf.ensure_shape(class_, []), tf.ensure_shape(bbox, [4])), num_parallel_calls=tf.data.AUTOTUNE)

test_labels = tf.data.Dataset.list_files(f"{Data}/Test/Labels/*.json", shuffle=False)
test_labels = test_labels.map(map_labels, num_parallel_calls=tf.data.AUTOTUNE)
test_labels = test_labels.map(lambda class_, bbox: (tf.ensure_shape(class_, []), tf.ensure_shape(bbox, [4])), num_parallel_calls=tf.data.AUTOTUNE)

Data (Images and labels together | Training, Validation and Test)

In [None]:
train = tf.data.Dataset.zip((train_images, train_labels))
train = train.shuffle(9600)
train = train.batch(8)
train = train.prefetch(tf.data.AUTOTUNE)

validation = tf.data.Dataset.zip((validation_images, validation_labels))
validation = validation.shuffle(1200)
validation = validation.batch(8)
validation = validation.prefetch(tf.data.AUTOTUNE)

test = tf.data.Dataset.zip((test_images, test_labels))
test = test.shuffle(1200)
test = test.batch(8)
test = test.prefetch(tf.data.AUTOTUNE)

Testing

In [None]:
data_sample = train.as_numpy_iterator()
res = next(data_sample)

for idx in range(4): 
    Image = res[0][idx]
    class_ = res[1][0][idx]
    coords = res[1][1][idx]

    x1 = int(coords[0] * 224)
    y1 = int(coords[1] * 224)
    x2 = int(coords[2] * 224)
    y2 = int(coords[3] * 224)

    Image = Image * 255
    Image = Image.astype(np.uint8)

    cv2.rectangle(Image, (x1, y1), (x2, y2), (0, 255, 0), 2)
    plt.title(class_)
    plt.imshow(Image)
    plt.show()

<h3 style="text-align:center">Building model</h3>

To create a custom face detection model, the pre-trained "MobileNetV2" is utilized as the base model. This approach leverages the features learned from the ImageNet dataset while adding custom layers to suit the specific task of face detection. The model consists of one input layer and two output layers, tailored for classification and regression.

Input layer:
1. Input: The input layer accepts images of the shape (224, 224, 3), which is the required input size for MobileNetV2. This layer handles the input images that are fed into the model.

Base model:
1. MobileNetV2: MobileNetV2 serves as the base model, leveraging pre-trained weights from ImageNet. By setting include_top=False, the fully connected layers at the top of MobileNetV2 are excluded, allowing the addition of custom layers that cater to the specific detection task.

Output layers:
1. Classification Output:
- Neurons: 1 neuron
- Function: This neuron predicts whether the input image contains a face or not.
- Activation: Sigmoid activation function, which outputs a probability between 0 and 1 indicating the presence of a face.

2. Regression Output:
- Neurons: 4 neurons
- Function: These neurons predict the coordinates of the bounding box surrounding the face.
- Activation: Sigmoid activation function, which normalizes the output to a range between 0 and 1, representing the bounding box coordinates relative to the input image size.
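
To illustrate why sigmoid suits both outputs, here is a plain-Python sketch of the function itself, written without TensorFlow so it can be run on its own. The helper `has_face` and its 0.5 threshold are assumptions for illustration; the notebook does not fix a decision threshold.

```python
import math

def sigmoid(z):
    """Logistic function: maps any real number into the open interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def has_face(probability, threshold=0.5):
    """Turn the classification output into a yes/no decision (threshold is an assumption)."""
    return probability >= threshold

# Hypothetical pre-activation values for the four bounding-box neurons.
logits = [-2.0, -1.0, 1.0, 2.0]
coords = [sigmoid(z) for z in logits]  # each value now lies strictly between 0 and 1
```

Since the labels store normalized coordinates, outputs bounded to (0, 1) can be compared with them directly.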


Function

In [None]:
def build_model():
    input_layer = Input(shape=(224, 224, 3))

    base_model = MobileNetV2(include_top=False, weights='imagenet', input_shape=(224, 224, 3))(input_layer)

    f1 = GlobalMaxPooling2D()(base_model)
    class1 = Dense(2048, activation='relu')(f1)
    class2 = Dense(1, activation='sigmoid', name='classification')(class1)

    f2 = GlobalMaxPooling2D()(base_model)
    regress1 = Dense(2048, activation='relu')(f2)
    regress2 = Dense(4, activation='sigmoid', name='bounding_box')(regress1)

    model = Model(inputs=input_layer, outputs=[class2, regress2])
    return model

Model

In [None]:
modelFaceDetection = build_model()
modelFaceDetection.summary()

<h3 style="text-align:center">Face Detection</h3>

<p>The face detection model uses MobileNetV2 as the base model, adding custom layers for classification and regression. This allows the model to detect faces and predict bounding box coordinates.</p>

#### Loss Functions
- Regression Loss: Sums the squared error of the top-left corner and the squared error of the box width and height between the true and predicted bounding boxes.
- Classification Loss: Uses binary cross-entropy to measure the error in face detection.
#### Optimizer
- An Adam optimizer with an exponential decay learning rate schedule.
#### Face Detection Class
- Defines a custom model for face detection with specific training and evaluation steps.
#### Model Initialization
- Initialize and compile the <code>FaceDetection</code> model with the custom optimizer and loss functions.
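
The regression loss and the weighting of the two losses can be worked through by hand. The sketch below is a plain-Python rendering of the same arithmetic as the notebook's `regression_loss` and of the `1.5` / `0.5` weights used in its training step. The names `regression_loss_py` and `total_loss` and the example boxes are hypothetical.

```python
def regression_loss_py(y_true, y_pred):
    """Plain-Python version of the localization loss for a batch of boxes.

    Each box is (x1, y1, x2, y2) in normalized coordinates.
    """
    total = 0.0
    for (tx1, ty1, tx2, ty2), (px1, py1, px2, py2) in zip(y_true, y_pred):
        # Squared error of the top-left corner.
        delta_coord = (tx1 - px1) ** 2 + (ty1 - py1) ** 2
        # Squared error of width and height.
        delta_size = ((tx2 - tx1) - (px2 - px1)) ** 2 + ((ty2 - ty1) - (py2 - py1)) ** 2
        total += delta_coord + delta_size
    return total

def total_loss(class_loss, regress_loss):
    """Weighted sum used in the training step: 1.5 * regression + 0.5 * classification."""
    return 1.5 * regress_loss + 0.5 * class_loss

true_boxes = [(0.25, 0.25, 0.75, 0.75)]
pred_boxes = [(0.5, 0.25, 0.75, 0.75)]
loss = regression_loss_py(true_boxes, pred_boxes)
```

In this example the predicted box starts 0.25 too far to the right and is 0.25 too narrow, so both the corner term and the size term contribute 0.0625.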

Functions & Class

In [None]:
@tf.keras.utils.register_keras_serializable(package="face_detection")
def regression_loss(y_true, yhat):
    delta_coord = tf.reduce_sum(tf.square(y_true[:,:2] - yhat[:,:2]))
    h_true = y_true[:,3] - y_true[:,1]
    w_true = y_true[:,2] - y_true[:,0]
    h_pred = yhat[:,3] - yhat[:,1]
    w_pred = yhat[:,2] - yhat[:,0]
    delta_size = tf.reduce_sum(tf.square(w_true - w_pred) + tf.square(h_true - h_pred))
    return delta_coord + delta_size

@tf.keras.utils.register_keras_serializable(package="face_detection")
def classification_loss():
    return tf.keras.losses.BinaryCrossentropy()

@tf.keras.utils.register_keras_serializable(package="face_detection")
def optimizerpro():
    initial_learning_rate = 0.001
    lr_schedule = tf.keras.optimizers.schedules.ExponentialDecay(
        initial_learning_rate,
        decay_steps=100000,
        decay_rate=0.96,
        staircase=True)
    opt = tf.keras.optimizers.Adam(learning_rate=lr_schedule)
    return opt

@tf.keras.utils.register_keras_serializable(package="face_detection")
class FaceDetection(Model):
    def __init__(self, model):
        super().__init__()
        self.model = model

    def compile(self, optimizer, classloss, regressloss):
        super().compile()
        self.opt = optimizer
        self.closs = classloss
        self.rloss = regressloss

    def train_step(self, batch):
        X, y = batch
        
        y_class = tf.reshape(y[0], (-1, 1))
        y_bbox = tf.cast(y[1], tf.float32)
        
        with tf.GradientTape() as tape:
            classes, coords = self.model(X, training=True)

            batch_classloss = self.closs(y_class, classes)
            batch_regressloss = self.rloss(y_bbox, coords)
            total_loss = 1.5 * batch_regressloss + 0.5 * batch_classloss
        # Compute gradients after leaving the tape context so the gradient ops are not recorded
        grad = tape.gradient(total_loss, self.model.trainable_variables)
        self.opt.apply_gradients(zip(grad, self.model.trainable_variables))
        return {"total_loss": total_loss, "class_loss": batch_classloss, "regress_loss": batch_regressloss}

    def test_step(self, batch):
        X, y = batch

        y_class = tf.reshape(y[0], (-1, 1))
        y_bbox = tf.cast(y[1], tf.float32)

        classes, coords = self.model(X, training=False)

        batch_classloss = self.closs(y_class, classes)
        batch_regressloss = self.rloss(y_bbox, coords)
        total_loss = 1.5 * batch_regressloss + 0.5 * batch_classloss
        return {"total_loss": total_loss, "class_loss": batch_classloss, "regress_loss": batch_regressloss}


    def call(self, X):
        return self.model(X)

    def get_config(self):
        return {
            "model": self.model.get_config(),
            "optimizer": tf.keras.utils.serialize_keras_object(self.opt),
            "classloss": tf.keras.utils.serialize_keras_object(self.closs),
            "regressloss": tf.keras.utils.serialize_keras_object(self.rloss),
        }

    @classmethod
    def from_config(cls, config, custom_objects=None):
        model = Model.from_config(config["model"], custom_objects=custom_objects)
        optimizer = tf.keras.utils.deserialize_keras_object(config["optimizer"], custom_objects=custom_objects)
        classloss = tf.keras.utils.deserialize_keras_object(config["classloss"], custom_objects=custom_objects)
        regressloss = tf.keras.utils.deserialize_keras_object(config["regressloss"], custom_objects=custom_objects)
        instance = cls(model)
        instance.compile(optimizer=optimizer, classloss=classloss, regressloss=regressloss)
        return instance

    def get_compile_config(self):
        return {
            "optimizer": self.opt,
            "classloss": self.closs,
            "regressloss": self.rloss,
        }

    @classmethod
    def compile_from_config(cls, config):
        optimizer = tf.keras.utils.deserialize_keras_object(config["optimizer"])
        classloss = tf.keras.utils.deserialize_keras_object(config["classloss"])
        regressloss = tf.keras.utils.deserialize_keras_object(config["regressloss"])
        return optimizer, classloss, regressloss

Face Detection Model 

In [None]:
Face_Detection = FaceDetection(modelFaceDetection)
Face_Detection.compile(optimizer=optimizerpro(), classloss=classification_loss(), regressloss=regression_loss)

<h2 id="trainingModel" style="text-align:center">Training Model</h2>

The <code>Face_Detection.fit</code> method trains the model on the training data (<code>train</code>) and validates it on the validation data (<code>validation</code>) for 10 epochs. The <code>tensorboard_callback</code> logs the metrics during training for later visualization in TensorBoard.
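
As a rough check on how long training runs and how the learning rate behaves, the sketch below counts optimizer steps from the numbers given in this notebook (9,588 training images, batch size 8, 10 epochs) and applies the staircase exponential decay rule with the schedule parameters used by the optimizer. The helper names are hypothetical, and the formula is written out in plain Python rather than taken from TensorFlow.

```python
import math

def steps_per_epoch(num_samples, batch_size):
    """Number of batches per epoch when the last partial batch is kept."""
    return math.ceil(num_samples / batch_size)

def staircase_lr(step, initial_lr=0.001, decay_steps=100000, decay_rate=0.96):
    """Exponential decay with staircase=True: the rate drops once every decay_steps."""
    return initial_lr * decay_rate ** (step // decay_steps)

steps = steps_per_epoch(9588, 8)  # training images, batch size
total_steps = steps * 10          # 10 epochs
final_lr = staircase_lr(total_steps)
```

Under these numbers the run ends after 11,990 steps, before the first decay at step 100,000, so the learning rate would stay at its initial value for the whole training.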


In [None]:
logdir = 'logs'
tensorboard_callback = tf.keras.callbacks.TensorBoard(log_dir=logdir)

In [None]:
history = Face_Detection.fit(train, validation_data=validation, epochs=10, callbacks=[tensorboard_callback])

<h2 id="evaluatingModel" style="text-align:center">Evaluating Model</h2>

In [None]:
fig, ax = plt.subplots(ncols=3, figsize=(20,5))

ax[0].plot(history.history['total_loss'], color='teal', label='loss')
ax[0].plot(history.history['val_total_loss'], color='orange', label='val loss')
ax[0].title.set_text('Loss')
ax[0].legend()

ax[1].plot(history.history['class_loss'], color='teal', label='class loss')
ax[1].plot(history.history['val_class_loss'], color='orange', label='val class loss')
ax[1].title.set_text('Classification Loss')
ax[1].legend()

ax[2].plot(history.history['regress_loss'], color='teal', label='regress loss')
ax[2].plot(history.history['val_regress_loss'], color='orange', label='val regress loss')
ax[2].title.set_text('Regression Loss')
ax[2].legend()

plt.show()

In [None]:
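# The custom test_step reports losses (not accuracy or MSE), so request them by name
results = Face_Detection.evaluate(test, return_dict=True)
print(f"Results: {results}")

if {"total_loss", "class_loss", "regress_loss"} <= set(results):
    print(f"Total loss: {results['total_loss']}, Classification loss: {results['class_loss']}, Regression loss: {results['regress_loss']}")
else:
    print("Unexpected results format:", results)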
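
The evaluation above reports only loss values. A common complementary measure for bounding boxes, which this notebook does not compute, is the intersection over union (IoU) between the true and the predicted box. A minimal sketch follows, assuming boxes in the same normalized `(x1, y1, x2, y2)` format as the labels; the function name `iou` and the example boxes are hypothetical.

```python
def iou(box_a, box_b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    # Corners of the overlapping rectangle, if any.
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

score = iou((0.0, 0.0, 0.5, 0.5), (0.25, 0.25, 0.75, 0.75))
```

An IoU of 1 means the boxes coincide and 0 means they do not overlap; the two example boxes overlap on one seventh of their combined area.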

<h2 id="savingModel" style="text-align:center">Saving Model</h2>

In [None]:
Face_Detection.save('FaceDetectionModel.keras')

<h2 id="loadingModel" style="text-align:center">Loading Model</h2>

In [None]:
model_loaded_FaceDetection = tf.keras.models.load_model('FaceDetectionModel.keras')

model_loaded_FaceDetection.summary()

<h2 id="testingModel" style="text-align:center">Testing Model</h2>
<h3 style="text-align:center">Using the Test Set</h3>

In [None]:
# Predictions
X, y = next(test.as_numpy_iterator())
predictions = model_loaded_FaceDetection.predict(X)

for idx in range(4):

    fig, ax = plt.subplots(ncols=2, figsize=(10,5))


    Image = X[idx]
    class_ = y[0][idx]
    coords = y[1][idx]

    x1 = int(coords[0] * 224)
    y1 = int(coords[1] * 224)
    x2 = int(coords[2] * 224)
    y2 = int(coords[3] * 224)

    print(f"Actual Class: {class_}, Actual Coordinates: {coords}")

    Image = Image * 255
    Image = Image.astype(np.uint8)
    ImagePred = Image.copy()

    cv2.rectangle(Image, (x1, y1), (x2, y2), (0, 255, 0), 2)
    ax[0].title.set_text(class_)
    ax[0].imshow(Image)

    class_pred = predictions[0][idx]
    coords_pred = predictions[1][idx]

    x1_pred = int(coords_pred[0] * 224)
    y1_pred = int(coords_pred[1] * 224)
    x2_pred = int(coords_pred[2] * 224)
    y2_pred = int(coords_pred[3] * 224)

    print(f"Predicted Class: {class_pred}, Predicted Coordinates: {coords_pred}")

    cv2.rectangle(ImagePred, (x1_pred, y1_pred), (x2_pred, y2_pred), (0, 255, 0), 2)
    ax[1].title.set_text(class_pred)
    ax[1].imshow(ImagePred)

    plt.show()

<h3 style="text-align:center">Using new images</h3>

In [None]:
import cv2
import numpy as np
import matplotlib.pyplot as plt

def load_and_preprocess_image(filepath):
    img = cv2.imread(filepath)
    img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
    
    img_resized = cv2.resize(img, (224, 224))

    img_normalized = img_resized / 255.0
    
    img_expanded = np.expand_dims(img_normalized, axis=0)
    
    return img_expanded, img_resized

image_paths = ["ImagesTest/face1.jpg","ImagesTest/Omar.png", "ImagesTest/Sarita.png"]
external_images = [load_and_preprocess_image(path) for path in image_paths]

for img_expanded, img_original in external_images:
    predictions = model_loaded_FaceDetection.predict(img_expanded)
    
    fig, ax = plt.subplots(ncols=2, figsize=(10, 5))
    
    Image = img_original.copy()
    ImagePred = img_original.copy()

    class_pred = predictions[0][0]
    coords_pred = predictions[1][0]

    x1_pred = int(coords_pred[0] * 224)
    y1_pred = int(coords_pred[1] * 224)
    x2_pred = int(coords_pred[2] * 224)
    y2_pred = int(coords_pred[3] * 224)

    print(f"Predicted Class: {class_pred}, Predicted Coordinates: {coords_pred}")

    cv2.rectangle(ImagePred, (x1_pred, y1_pred), (x2_pred, y2_pred), (0, 255, 0), 2)
    ax[0].title.set_text("Original Image")
    ax[0].imshow(Image)
    ax[1].title.set_text(f"Predicted Class: {class_pred[0]}")
    ax[1].imshow(ImagePred)

    plt.show()

<h3 style="text-align:center">Using the webcam in real time</h3>

In [None]:
def load_and_preprocess_image_camera(img):
    img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
    
    img_resized = cv2.resize(img, (224, 224))

    img_normalized = img_resized / 255.0
    
    img_expanded = np.expand_dims(img_normalized, axis=0)
    
    return img_expanded, img_resized

cap = cv2.VideoCapture(0)

while True:
    ret, frame = cap.read()
    if not ret:
        break

    img_expanded, img_original = load_and_preprocess_image_camera(frame)
    predictions = model_loaded_FaceDetection.predict(img_expanded)
    
    class_pred = predictions[0][0]
    coords_pred = predictions[1][0]

    x1_pred = int(coords_pred[0] * 224)
    y1_pred = int(coords_pred[1] * 224)
    x2_pred = int(coords_pred[2] * 224)
    y2_pred = int(coords_pred[3] * 224)

    cv2.rectangle(img_original, (x1_pred, y1_pred), (x2_pred, y2_pred), (0, 255, 0), 2)
    cv2.imshow("Face Detection", cv2.cvtColor(img_original, cv2.COLOR_RGB2BGR))

    if cv2.waitKey(1) & 0xFF == ord('q'):
        break

cap.release()

cv2.destroyAllWindows()
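
The loop above draws the predicted box on every frame, on the 224x224 resized copy, without consulting the classification output. One possible refinement, sketched here in plain Python, is to draw only when the predicted probability passes a threshold and to scale the normalized coordinates to the original frame size. The helper names, the 0.5 threshold and the 640x480 frame size are assumptions for illustration.

```python
def box_to_frame(coords, frame_width, frame_height):
    """Scale normalized (x1, y1, x2, y2) to integer pixel corners of the original frame."""
    x1, y1, x2, y2 = coords
    return (int(x1 * frame_width), int(y1 * frame_height),
            int(x2 * frame_width), int(y2 * frame_height))

def detection_for_frame(class_prob, coords, frame_width, frame_height, threshold=0.5):
    """Return pixel corners when a face is predicted, otherwise None."""
    if class_prob < threshold:
        return None
    return box_to_frame(coords, frame_width, frame_height)

# Hypothetical model outputs for a 640x480 webcam frame.
box = detection_for_frame(0.9, (0.25, 0.5, 0.75, 1.0), 640, 480)
no_box = detection_for_frame(0.2, (0.25, 0.5, 0.75, 1.0), 640, 480)
```

In the webcam loop, the returned corners could be passed to `cv2.rectangle` on the original `frame`, and the drawing step skipped when the result is `None`.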


<h2 id="conclusion" style="text-align:center">Conclusion</h2>

Face detection is a crucial task in computer vision, with applications in security, surveillance, and image processing. This notebook demonstrates how to build a face detection model with transfer learning, using the pre-trained MobileNetV2 as the base. With custom layers added for classification and regression, the model predicts whether an image contains a face and the coordinates of its bounding box. The model is trained on a labeled dataset and evaluated on a separate test set. It can serve as a starting point for applications such as facial recognition, emotion detection, and face tracking.