# Train a Face Detection model with YOLOv7: A Comprehensive Guide


This notebook will guide you through the process of building a face detection model using YOLOv7. Each code block will be explained to ensure you understand the steps involved.

# Importing Libraries
We start by importing the necessary libraries.

In [None]:
import matplotlib.pyplot as plt
import numpy as np
import tensorflow as tf
import pandas as pd
from datasets import load_dataset
from sklearn import preprocessing
import shutil
import cv2
from PIL import Image as PILImage  # aliased so it is not shadowed by IPython.display.Image below
import os
from sklearn.model_selection import train_test_split
import glob
from IPython.display import Image, display

These libraries cover data manipulation (`pandas`, `numpy`), dataset loading (`datasets`), machine learning utilities (`tensorflow`, `sklearn`), image processing (`cv2`, `PIL.Image`), file handling (`os`, `shutil`, `glob`), and visualization (`matplotlib`, `IPython.display`).

# Setting Up YOLOv7
In this step, we will download the YOLOv7 repository and install the necessary requirements.

In [None]:
%%capture
!git clone https://github.com/WongKinYiu/yolov7 # Downloading YOLOv7 repository
%cd yolov7
!pip3 install -qr requirements.txt # Installing requirements

The `%%capture` cell magic suppresses the cell's output. The `git clone` command downloads the YOLOv7 repository from GitHub, and `%cd` changes the notebook's working directory to the cloned repository, so relative paths in later cells are resolved inside the `yolov7` directory. Finally, `pip3 install -qr requirements.txt` quietly (`-q`) installs the packages listed in (`-r`) the `requirements.txt` file.


## Downloading Pretrained Weights
In this step, we download the pretrained weights for YOLOv7. Starting from weights that were already trained on a large dataset, rather than from a random initialization, typically speeds up convergence and improves the accuracy of the fine-tuned model.

In [None]:
!wget "https://github.com/WongKinYiu/yolov7/releases/download/v0.1/yolov7_training.pt"

The `wget` command downloads files from the web non-interactively. Here, it fetches the pretrained `yolov7_training.pt` checkpoint from the given URL into the current directory.

# Loading the Dataset
Next, we load the training split of the 'wider_face' dataset with `load_dataset` and convert it to a pandas dataframe.

In [None]:
ds = load_dataset('wider_face', split='train')
df = ds.to_pandas()

# Exploring the Dataset
Let's print the index of our dataframe and the bounding boxes of all faces annotated in the fourth image of our dataset (position 3, since indexing starts at 0).

In [None]:
print(df.index)
print(df.faces.iloc[3]['bbox'])

## Loading and Visualizing an Image
Next, we load an image from our dataset, convert its color from BGR to RGB (as OpenCV loads images in BGR format), and print its shape.

In [None]:
path=df.image.iloc[0]['path']
img = cv2.imread(path)
img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
print(img.shape)

## Drawing Bounding Boxes
In the following block, we draw the annotated bounding boxes on the faces in our image and display the result as part of our exploration.

In [None]:
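for bbox in df.faces.iloc[0]['bbox']:
  # Each box is [x_min, y_min, width, height]; cv2.rectangle needs two corner points
  x, y, w, h = bbox
  cv2.rectangle(img, (int(x), int(y)), (int(x + w), int(y + h)), (255, 0, 0), 2)
plt.imshow(img)

This block of code draws a rectangle around each annotated face in the image using the coordinates stored in the 'bbox' field of our dataframe. Each box is given as the top-left corner plus a width and height, so the bottom-right corner is obtained by adding the width and height to the top-left coordinates. Because the image was converted to RGB, the color (255, 0, 0) is red; the line thickness is 2 pixels.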
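The corner arithmetic needed by `cv2.rectangle` can be isolated in a small helper. The sketch below assumes each `bbox` entry is `[x_min, y_min, width, height]`, which is how the conversion function later in this notebook interprets the same field. The name `xywh_to_corners` and the example numbers are hypothetical.

```python
def xywh_to_corners(bbox):
    """Convert [x_min, y_min, width, height] to two integer corner points."""
    x_min, y_min, width, height = bbox
    top_left = (int(x_min), int(y_min))
    bottom_right = (int(x_min + width), int(y_min + height))
    return top_left, bottom_right


# Made-up example: a 30x40 box whose top-left corner is (10, 20)
top_left, bottom_right = xywh_to_corners([10.0, 20.0, 30.0, 40.0])
print(top_left, bottom_right)
```

The two tuples can be passed directly as the second and third arguments of `cv2.rectangle`.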

# Label Encoding
In this step, we define our labels. In this case, we only have one label: 'face'. We then use scikit-learn's `LabelEncoder` to map this class name to the integer class ID that is written into the label files.


In [None]:
labels=['face']
le = preprocessing.LabelEncoder()
le.fit(labels)
print(le.classes_)

The `fit` method fits the `LabelEncoder` to our labels, assigning an integer code to each distinct label. With a single label, 'face' is mapped to 0. The `classes_` attribute of the fitted encoder holds the classes it has learned, which we print out.
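To make the mapping concrete, here is a minimal pure-Python sketch of the idea behind label encoding: the distinct labels are sorted, and each one is assigned its position in that list as an integer code. `TinyLabelEncoder` is a hypothetical stand-in written for illustration, not scikit-learn's implementation.

```python
class TinyLabelEncoder:
    """Illustrative stand-in for a label encoder (hypothetical, not scikit-learn)."""

    def fit(self, labels):
        # Sorted unique labels; a label's position in this list is its integer code
        self.classes_ = sorted(set(labels))
        self._index = {label: code for code, label in enumerate(self.classes_)}
        return self

    def transform(self, labels):
        # Look up the integer code of each label
        return [self._index[label] for label in labels]


encoder = TinyLabelEncoder().fit(['face'])
print(encoder.classes_)
print(encoder.transform(['face']))
```

With only one class, the encoder is not strictly necessary, but it keeps the class-to-ID mapping in one place if more classes are added later.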


# Converting COCO Bounding Boxes to YOLO Format
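Next, we define a function to convert bounding boxes from COCO format to YOLO format. COCO boxes are given in pixels as `[x_min, y_min, width, height]`, whereas YOLO labels store the class ID followed by the box center, width, and height, each normalized by the image dimensions so that the values fall between 0 and 1.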

In [None]:
def coco_to_yolo(coco_boxes, img_width, img_height,le):
    yolo_boxes = []
    labelcode = le.transform(['face'])[0]
    for box in coco_boxes:
        x_min, y_min, width, height = box
        x_center = x_min + width / 2.0
        y_center = y_min + height / 2.0
        x_center_norm = x_center / img_width
        y_center_norm = y_center / img_height
        width_norm = width / img_width
        height_norm = height / img_height
        yolo_boxes.append([labelcode,x_center_norm, y_center_norm, width_norm, height_norm])
    return yolo_boxes

In this function, we first transform the label 'face' into a numerical code using the LabelEncoder we fitted earlier. Then, for each bounding box in COCO format, we calculate the center of the box and normalize the coordinates and dimensions by the image width and height. These normalized values are then appended to our list of YOLO formatted bounding boxes.
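A worked example may help check the arithmetic. The sketch below repeats the conversion for a single box without the class ID and adds the inverse conversion, which is handy for verifying a label against the original annotation. The function names and the numbers are made up for illustration.

```python
def coco_box_to_yolo(box, img_width, img_height):
    """Convert [x_min, y_min, width, height] in pixels to normalized YOLO values."""
    x_min, y_min, width, height = box
    x_center = (x_min + width / 2.0) / img_width
    y_center = (y_min + height / 2.0) / img_height
    return [x_center, y_center, width / img_width, height / img_height]


def yolo_box_to_coco(box, img_width, img_height):
    """Inverse conversion, useful for checking a label against the original box."""
    x_center, y_center, width, height = box
    box_width = width * img_width
    box_height = height * img_height
    x_min = x_center * img_width - box_width / 2.0
    y_min = y_center * img_height - box_height / 2.0
    return [x_min, y_min, box_width, box_height]


# Made-up example: a 200x100 box at (100, 50) in a 400x200 image
yolo = coco_box_to_yolo([100, 50, 200, 100], img_width=400, img_height=200)
print(yolo)  # [0.5, 0.5, 0.5, 0.5]
print(yolo_box_to_coco(yolo, img_width=400, img_height=200))  # [100.0, 50.0, 200.0, 100.0]
```

The box is centered in the image and covers half of its width and height, so all four normalized values are 0.5.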


# Saving the Dataset
Next, we define a function to save our dataset in a format suitable for training our YOLOv7 model.

In [None]:
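os.makedirs('/kaggle/working/images', exist_ok=True)  # make sure the output folder exists

def savedatasetfhs(row):
  a = row.name  # row index, used to build a unique file name
  if a < 6500:  # keep only the first 6,500 samples
    path = row['image']['path']
    img = cv2.imread(path)  # BGR array of shape (height, width, 3)
    # array_to_img treats the channels as RGB, so convert from BGR before saving
    image = tf.keras.preprocessing.image.array_to_img(cv2.cvtColor(img, cv2.COLOR_BGR2RGB))
    image.save('/kaggle/working/images/image'+str(a)+'.jpeg')
    np.savetxt('/kaggle/working/images/image'+str(a)+'.txt', coco_to_yolo(row['faces']['bbox'], img.shape[1], img.shape[0], le))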

This function processes only rows whose index is below 6500, which limits the working set to the first 6,500 images. For each such row, it reads the image from the path stored in the row, converts the array to an image object with TensorFlow's `array_to_img` function, and saves it as a JPEG file in the '/kaggle/working/images/' directory. It then converts the row's bounding boxes from COCO format to YOLO format with our previously defined `coco_to_yolo` function and saves them as a text file with the same base name in the same directory.

Finally, we apply this function to each row in our dataframe.

In [None]:
df.apply(savedatasetfhs,axis="columns")

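This line of code applies the `savedatasetfhs` function to each row of our dataframe (`axis="columns"` passes one row at a time to the function). Rows with an index of 6500 or higher are skipped, so the result is a subset of the dataset saved in a format suitable for training our YOLOv7 model.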
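Each label file holds one line per face: the class ID followed by the four normalized box values. With its default format (`fmt='%.18e'`), `np.savetxt` writes every number, including the class ID, in scientific notation. The sketch below shows a more readable way to lay out the same rows. `yolo_label_lines` is a hypothetical helper, not part of the notebook, and the boxes are made up.

```python
def yolo_label_lines(yolo_boxes):
    """Format [class_id, x_center, y_center, width, height] rows as label-file lines."""
    lines = []
    for class_id, x_center, y_center, width, height in yolo_boxes:
        lines.append(
            f"{int(class_id)} {x_center:.6f} {y_center:.6f} {width:.6f} {height:.6f}"
        )
    return lines


# Made-up boxes for a single image
boxes = [[0, 0.5, 0.5, 0.25, 0.25], [0, 0.1, 0.2, 0.05, 0.08]]
print("\n".join(yolo_label_lines(boxes)))
```

Joining the lines with `"\n"` and writing them to a `.txt` file gives one label file per image; an image without faces simply yields an empty file.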


## Splitting the Dataset
In this step, we define a function to split our dataset into training and validation sets. This is important for evaluating the performance of our model.

In [None]:
def split_data(folder_path, train_size=0.8):
    # Create the output directories
    os.makedirs(os.path.join('split', 'images', 'train'), exist_ok=True)
    os.makedirs(os.path.join('split', 'images', 'val'), exist_ok=True)
    os.makedirs(os.path.join('split', 'labels', 'train'), exist_ok=True)
    os.makedirs(os.path.join('split', 'labels', 'val'), exist_ok=True)

    # Get the list of image and label files in the folder
    image_files = [f for f in os.listdir(folder_path) if f.endswith('.jpeg')]
    label_files = [f for f in os.listdir(folder_path) if f.endswith('.txt')]

    # Sort the lists of files
    image_files.sort()
    label_files.sort()

    # Split the data into training and validation sets
    train_images, val_images, train_labels, val_labels = train_test_split(image_files, label_files, train_size=train_size)

    # Copy the training images and labels to the output directories
    for image_file, label_file in zip(train_images, train_labels):
        shutil.copy(os.path.join(folder_path, image_file), os.path.join('split', 'images', 'train', image_file))
        shutil.copy(os.path.join(folder_path, label_file), os.path.join('split', 'labels', 'train', label_file))

    # Copy the validation images and labels to the output directories
    for image_file, label_file in zip(val_images, val_labels):
        shutil.copy(os.path.join(folder_path, image_file), os.path.join('split', 'images', 'val', image_file))
        shutil.copy(os.path.join(folder_path, label_file), os.path.join('split', 'labels', 'val', label_file))

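This function creates separate directories for training and validation images and labels. Because the output paths are relative, the `split` directory is created inside the current working directory. The function then splits the data into training and validation sets and copies the corresponding images and labels into the appropriate directories. Sorting both file lists first keeps each image aligned with its label, since `train_test_split` shuffles the two lists with the same permutation.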

Finally, we call this function to split our dataset.

In [None]:
folder_path = '/kaggle/working/images'
split_data(folder_path)
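An alternative way to keep images and labels paired is to split the shared file stems (for example `image12`) and derive both file names from each stem afterwards. The sketch below illustrates this with the standard library's `random` module swapped in for `train_test_split`, and a fixed seed so that the split is reproducible. The function name and the seed value are assumptions made for this example.

```python
import random


def split_stems(stems, train_size=0.8, seed=42):
    """Shuffle file stems reproducibly and split them into train and validation lists."""
    shuffled = sorted(stems)  # fixed starting order, independent of directory listing order
    random.Random(seed).shuffle(shuffled)
    n_train = int(len(shuffled) * train_size)
    return shuffled[:n_train], shuffled[n_train:]


# Made-up stems following the notebook's naming scheme (image0, image1, ...)
stems = [f"image{i}" for i in range(10)]
train_stems, val_stems = split_stems(stems)
print(len(train_stems), len(val_stems))  # 8 2
```

For each stem, `stem + '.jpeg'` and `stem + '.txt'` would then be copied to the matching `images` and `labels` folders.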

## Verifying the Data Split
In this step, we verify that our data has been correctly split into training and validation sets.

In [None]:
train_img_path = "split/images/train"
train_label_path = "split/labels/train"

val_img_path = "split/images/val"
val_label_path = "split/labels/val"

# Get the list of image and label files in the train and validation folders
image_files_train=[f for f in os.listdir(train_img_path) if f.endswith('.jpeg')]
label_files_train=[f for f in os.listdir(train_label_path) if f.endswith('.txt')]
image_files_val=[f for f in os.listdir(val_img_path) if f.endswith('.jpeg')]
label_files_val=[f for f in os.listdir(val_label_path) if f.endswith('.txt')]

# Sort the lists of files
image_files_train.sort()
label_files_train.sort()
image_files_val.sort()
label_files_val.sort()

# Remove the file extensions
for f in range(len(image_files_train)):
  image_files_train[f]=f'{image_files_train[f].split(".")[0]}'
for f in range(len(label_files_train)):
  label_files_train[f]=f'{label_files_train[f].split(".")[0]}'
for f in range(len(image_files_val)):
  image_files_val[f]=f'{image_files_val[f].split(".")[0]}'
for f in range(len(label_files_val)):
  label_files_val[f]=f'{label_files_val[f].split(".")[0]}'

# Check if the number of images matches the number of labels and if the filenames match
if(len(image_files_train)==len(label_files_train) and image_files_train== label_files_train):
  print('Successful train split ')
  print('Number of train images: '+str(len(image_files_train)))
  print('Number of train labels: '+str(len(label_files_train)))
else:
  print('Unsuccessful train split')
if(len(image_files_val)==len(label_files_val) and image_files_val== label_files_val):
  print('Successful validation split ')
  print('Number of validation images: '+str(len(image_files_val)))
  print('Number of validation labels: '+str(len(label_files_val)))
else:
  print('Unsuccessful validation split')

This block of code verifies that the number of images matches the number of labels in both the training and validation sets, and that the filenames match. If the split was successful, it prints the number of images and labels in each set.
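When the check fails, it helps to know which files are unmatched. The following sketch compares file stems as sets and reports the differences in both directions. It uses `os.path.splitext`, which strips only the final extension. The helper name and the file names are made up for illustration.

```python
import os


def unmatched_files(image_names, label_names):
    """Return (images without a label, labels without an image), compared by stem."""
    image_stems = {os.path.splitext(name)[0] for name in image_names}
    label_stems = {os.path.splitext(name)[0] for name in label_names}
    return sorted(image_stems - label_stems), sorted(label_stems - image_stems)


# Made-up listing in which image2 has no label file
images = ["image0.jpeg", "image1.jpeg", "image2.jpeg"]
labels = ["image0.txt", "image1.txt"]
print(unmatched_files(images, labels))  # (['image2'], [])
```

Two empty lists mean that every image has a label and every label has an image.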


## Creating a YAML Configuration File
In this step, we create a YAML configuration file that specifies the paths to the training and validation sets, the number of classes, and the class names.

In [None]:
!echo -e "train: /kaggle/working/yolov7/split/images/train\nval: /kaggle/working/yolov7/split/images/val\n\nnc: 1\nnames: ['face']" > rbcdet.yaml
! cat 'rbcdet.yaml'
shutil.copyfile('/kaggle/working/yolov7/rbcdet.yaml', '/kaggle/working/yolov7/data/rbcdet.yaml')

The `echo` command, run as a shell command through the `!` prefix, creates the YAML file with the necessary configuration. The `-e` option enables interpretation of backslash escapes, so each `\n` becomes a line break. The `>` operator writes the output to the `rbcdet.yaml` file, overwriting any earlier copy so that re-running the cell does not duplicate the entries. The `cat` command displays the contents of the `rbcdet.yaml` file. Finally, the `shutil.copyfile` function copies the `rbcdet.yaml` file to the `/kaggle/working/yolov7/data/` directory.
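The same file can also be produced from Python, which avoids shell quoting and keeps `nc` consistent with the list of class names. This is a minimal sketch that builds the text by hand. `build_data_yaml` is a hypothetical helper, and the paths are the ones used in this notebook.

```python
def build_data_yaml(train_dir, val_dir, class_names):
    """Return the text of a dataset config with train/val paths, nc, and names."""
    quoted = ", ".join(f"'{name}'" for name in class_names)
    return (
        f"train: {train_dir}\n"
        f"val: {val_dir}\n"
        "\n"
        f"nc: {len(class_names)}\n"
        f"names: [{quoted}]\n"
    )


yaml_text = build_data_yaml(
    "/kaggle/working/yolov7/split/images/train",
    "/kaggle/working/yolov7/split/images/val",
    ["face"],
)
print(yaml_text)
```

Writing `yaml_text` to a file opened in `"w"` mode replaces any previous version of the file.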

## Updating the Configuration File
In this step, we update the Intersection over Union (IoU) threshold, stored under the `iou_t` key, in the hyperparameter configuration file.

In [None]:
!sed -i 's/iou_t: 0.2/iou_t: 0.6/'  /kaggle/working/yolov7/data/hyp.scratch.p5.yaml

The `sed` command is used here to edit the `hyp.scratch.p5.yaml` file in-place (`-i`). The `s/iou_t: 0.2/iou_t: 0.6/` argument tells `sed` to replace (`s`) the first occurrence of `iou_t: 0.2` with `iou_t: 0.6` in each line of the file. This updates the IoU threshold from 0.2 to 0.6. If the file does not contain the text `iou_t: 0.2`, `sed` leaves it unchanged without reporting an error, so it is worth checking the file afterwards.
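A comparable edit can be sketched in Python with a regular expression. Unlike the `sed` call, this version replaces whatever value the key currently has and keeps any trailing comment. The helper name `set_yaml_value` is hypothetical, and the sample text is a made-up excerpt in the style of a hyperparameter file.

```python
import re


def set_yaml_value(text, key, value):
    """Replace the value of a top-level `key: value` line, keeping trailing comments."""
    pattern = rf"^({re.escape(key)}:[ \t]*)\S+"
    return re.sub(pattern, lambda m: f"{m.group(1)}{value}", text, flags=re.MULTILINE)


# Made-up excerpt in the style of a hyperparameter file
sample = "lr0: 0.01\niou_t: 0.20  # IoU training threshold\n"
print(set_yaml_value(sample, "iou_t", 0.6))
```

To apply it to a real file, read the file's text, pass it through the function, and write the result back.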


## Training the Model with Updated Configuration
In this step, we train our YOLOv7 model using the updated configuration.

In [None]:
!python train.py --batch 16 --epochs 25 --data /kaggle/working/yolov7/data/rbcdet.yaml --weights '/kaggle/working/yolov7/yolov7_training.pt'   --img=640 --freeze 5 --hyp data/hyp.scratch.p5.yaml

The `train.py` script is used to train the YOLOv7 model. The `--batch 16` option sets the batch size to 16. The `--epochs 25` option sets the number of epochs to 25. The `--data /kaggle/working/yolov7/data/rbcdet.yaml` option specifies the path to the dataset YAML file created earlier. The `--weights '/kaggle/working/yolov7/yolov7_training.pt'` option specifies the pretrained weights from which training starts. The `--img=640` option sets the input image size to 640 pixels. The `--freeze 5` option freezes the first 5 layers of the model during training, so only the remaining layers are updated. The `--hyp data/hyp.scratch.p5.yaml` option specifies the path to the hyperparameter file that we edited in the previous step.
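When experimenting with different settings, it can be convenient to assemble the command from named parameters instead of editing one long line. The sketch below only builds and prints the argument list; it uses the same flags and values as the cell above. The function name `build_train_command` is an assumption made for this example.

```python
import shlex


def build_train_command(data_yaml, weights, hyp, batch=16, epochs=25, img=640, freeze=5):
    """Assemble the train.py invocation used in this notebook as an argument list."""
    return [
        "python", "train.py",
        "--batch", str(batch),
        "--epochs", str(epochs),
        "--data", data_yaml,
        "--weights", weights,
        "--img", str(img),
        "--freeze", str(freeze),
        "--hyp", hyp,
    ]


cmd = build_train_command(
    data_yaml="/kaggle/working/yolov7/data/rbcdet.yaml",
    weights="/kaggle/working/yolov7/yolov7_training.pt",
    hyp="data/hyp.scratch.p5.yaml",
)
print(shlex.join(cmd))
```

The resulting list could be passed to `subprocess.run(cmd, check=True)` from inside the `yolov7` directory, or the printed line pasted after a `!` in a notebook cell.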


## Running the Model on Test Images
In this step, we use the `detect.py` script provided in the YOLOv7 repository to run our trained model on the test images.

In [None]:
!python detect.py --weights /kaggle/working/yolov7/runs/train/exp4/weights/best.pt --img 256 --conf 0.40 --source /kaggle/input/testdetect/

The `detect.py` script is used to run the YOLOv7 model on images. The `--weights /kaggle/working/yolov7/runs/train/exp4/weights/best.pt` option specifies the path to the best weights saved by our training run. The `exp4` folder name depends on how many training runs have been executed, so adjust it to match the folder reported at the end of your own training. The `--img 256` option sets the inference image size to 256 pixels. The `--conf 0.40` option sets the confidence threshold for detections to 0.40, so detections with lower confidence are discarded. The `--source /kaggle/input/testdetect/` option specifies the directory containing the test images.
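Since the run folder name changes from run to run, one option is to look it up instead of hard-coding it. The sketch below picks the most recently modified folder whose name starts with `exp`. It assumes that run folders are named this way, as in the paths used in this notebook, and `latest_run_dir` is a hypothetical helper.

```python
from pathlib import Path


def latest_run_dir(root, prefix="exp"):
    """Return the most recently modified `<prefix>*` directory under root, or None."""
    root = Path(root)
    if not root.is_dir():
        return None
    candidates = [p for p in root.glob(f"{prefix}*") if p.is_dir()]
    if not candidates:
        return None
    return max(candidates, key=lambda p: p.stat().st_mtime)


# Prints None if the folder does not exist or holds no runs yet
print(latest_run_dir("runs/train"))
```

The weights path could then be built as `latest_run_dir("runs/train") / "weights" / "best.pt"`, after checking that the result is not `None`.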


## Displaying Inference on Test Images
In this step, we display the inference results on all test images. We set a limit to the maximum number of images to print.


In [None]:
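limit = 10000  # max images to display
# Adjust `exp8` to the output folder reported by detect.py for your run
for i, imageName in enumerate(glob.glob('/kaggle/working/yolov7/runs/detect/exp8/*.jpeg')):  # assuming .jpeg outputs
    if i >= limit:
        break  # stop once the limit is reached
    display(Image(filename=imageName))
    print("\n")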

This block of code loops through the JPEG images in the specified directory. For each image, if the number of images displayed so far is less than the limit, it displays the image and prints a newline. The `glob.glob` function is used to get the list of all files in the directory that end in `.jpeg`. The `display` function from the `IPython.display` module renders each image in the notebook, and the `Image` class from the same module loads the image from its file name.
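`glob.glob` returns files in no guaranteed order, and the detection outputs may not all share the `.jpeg` extension. The sketch below collects up to `limit` image paths in sorted order across a few common extensions. The helper name and the extension list are assumptions made for this example.

```python
import glob
import os
from itertools import islice


def first_images(directory, limit, extensions=(".jpg", ".jpeg", ".png")):
    """Return up to `limit` image paths from `directory`, in sorted order."""
    paths = sorted(
        path for path in glob.glob(os.path.join(directory, "*"))
        if path.lower().endswith(extensions)
    )
    return list(islice(paths, limit))


# Prints an empty list if the folder does not exist or contains no images
print(first_images("runs/detect/exp", limit=5))
```

Each returned path can be passed to `display(Image(filename=path))` as in the cell above.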
