__________________________
# <center>Count the number of faces in an Image</center>
__________________________

<img src="https://i.pinimg.com/originals/b2/13/7c/b2137cd75449417bdcb2eb05305d1a1e.png" height=500 width=600/>

## Introduction

Detecting faces in pictures is difficult because human faces vary widely in pose, expression, position and orientation, skin colour, and the presence of glasses or facial hair, and because images differ in camera gain, lighting conditions, and resolution.

- YOLOv5 is an object detection model. We will use it to detect, and then count, the objects in an image. For this dataset, the objects are human faces.

### <font color='red'>Note:</font> 

- If you are a beginner or are using YOLOv5 for the first time, I suggest you check this [Beginners Notebook On YOLOv5](https://www.kaggle.com/vin1234/gettingstarted-with-yolov5-global-wheat-detection).

## How do we proceed with this problem using YOLOv5?

YOLO (“You Only Look Once”) is one of the most popular algorithms among AI engineers and is often the first choice for real-time object detection.

YOLO models are typically pretrained on the [COCO dataset](https://cocodataset.org/), which has 80 object classes.

> So here we will use transfer learning for object (face) detection.

# Reading the data

In [None]:
import pandas as pd # data processing, CSV file I/O (e.g. pd.read_csv)
import numpy as np 


# import useful tools
from glob import glob
from PIL import Image
import cv2

# import data visualization
import matplotlib.pyplot as plt
import matplotlib.patches as patches
import seaborn as sns

from bokeh.plotting import figure
from bokeh.io import output_notebook, show, output_file
from bokeh.models import ColumnDataSource, HoverTool, Panel
from bokeh.models.widgets import Tabs

from tqdm.auto import tqdm
import shutil as sh

# import data augmentation
import albumentations as albu

# Face Count EDA

### About the dataset

In [None]:
!ls ../input/count-the-number-of-faces-present-in-an-image/train

In [None]:
# Setup the paths to train and test images
train=pd.read_csv('../input/count-the-number-of-faces-present-in-an-image/train/train.csv')
test=pd.read_csv('../input/count-the-number-of-faces-present-in-an-image/test.csv')

Images='../input/count-the-number-of-faces-present-in-an-image/train/image_data/'
# Glob the directories and get the lists of train and test images
img = glob(Images + '*')


In [None]:
# Count the number of images:
print('Total number of images is {}'.format(len(img)))

In [None]:
print('Number of images in train data is {}'.format(train.shape[0]))
train.head()

In [None]:
print('Number of images in test data is {}'.format(test.shape[0]))
test.head()

### What's in bbox_train.csv

In [None]:
bbox=pd.read_csv('../input/count-the-number-of-faces-present-in-an-image/train/bbox_train.csv')
bbox.head()

- These are the box dimensions around the faces.
- Let's merge the two datasets and then look at some sample images with their bounding boxes.
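A left merge keeps every row of `train`, so an image without any annotation would show up with NaN in the box columns. The sketch below illustrates this on made-up data; the file names and values are hypothetical, and only the column names follow the ones used in this notebook.

```python
import pandas as pd

# Toy stand-ins for train.csv and bbox_train.csv (all values are made up).
train_demo = pd.DataFrame({"Name": ["a.jpg", "b.jpg", "c.jpg"],
                           "HeadCount": [2, 1, 0]})
bbox_demo = pd.DataFrame({"Name": ["a.jpg", "a.jpg", "b.jpg"],
                          "xmin": [10, 60, 30], "ymin": [20, 25, 40],
                          "xmax": [50, 90, 80], "ymax": [70, 65, 95]})

# how='left' keeps all images; the result has one row per (image, box) pair.
merged_demo = train_demo.merge(bbox_demo, on="Name", how="left")

# Images without any box end up with NaN in the box columns.
images_without_boxes = merged_demo.loc[merged_demo["xmin"].isnull(), "Name"].tolist()
print(merged_demo.shape)
print(images_without_boxes)
```

This is why checking `isnull().sum()` right after the merge is a useful sanity check.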

In [None]:
# Merge all train images with the bounding boxes dataframe

train_images = train.merge(bbox, on='Name', how='left')

In [None]:
print(train_images.isnull().sum())
print(train_images.shape)
train_images

In [None]:
### Let's plot some image examples:

train_images.iloc[2].Name

In [None]:
# First we store all the box dimensions.
def get_all_bboxes(df, image_id):
    image_bboxes = df[df.Name == image_id]
    
    bboxes = []
    for _,row in image_bboxes.iterrows():
        bboxes.append((row.xmin, row.ymin, row.xmax, row.ymax))
        
    return bboxes

# function for box representation on the image.

def plot_image_with_box(df, rows=3, cols=4, title='Face count images'):
    fig, axs = plt.subplots(rows, cols, figsize=(20,15))
    for row in range(rows):
        for col in range(cols):
            idx = np.random.randint(len(df), size=1)[0]
            img_id = df.iloc[idx].Name
            
            img = Image.open(Images + img_id)
            axs[row, col].imshow(img)
            
            bboxes = get_all_bboxes(df, img_id)
            
            for bbox in bboxes:
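                # Rectangle expects (x, y), width, height, so convert from corner coordinates
                rect = patches.Rectangle((bbox[0], bbox[1]), bbox[2] - bbox[0], bbox[3] - bbox[1], linewidth=2, edgecolor='g', facecolor='none')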
                axs[row, col].add_patch(rect)
            
            axs[row, col].axis('off')
            
    plt.suptitle(title)

In [None]:
plot_image_with_box(train_images)

### Important points

- The boxes drawn above can cover more than just the face. Keep in mind that `patches.Rectangle` takes a width and a height rather than a bottom-right corner, so the way a box is drawn affects how large it appears. Either way, the __number of bounding boxes is equal__ to the number of __faces__ in the image.

- The images are taken under different lighting conditions, and the people in them show different facial expressions.
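The annotations store box corners (`xmin`, `ymin`, `xmax`, `ymax`), while `patches.Rectangle` takes an anchor point plus a width and a height. A small conversion helper makes the difference explicit. This is a minimal sketch; the helper name and the example numbers are made up for illustration.

```python
def corners_to_xywh(xmin, ymin, xmax, ymax):
    """Convert corner coordinates to (x, y, width, height)."""
    return xmin, ymin, xmax - xmin, ymax - ymin

# Hypothetical box given by its top-left and bottom-right corners.
x, y, w, h = corners_to_xywh(10, 20, 50, 80)
print(x, y, w, h)
```

The resulting values can then be passed as `patches.Rectangle((x, y), w, h, ...)`.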

## Count the number of faces or bounding boxes 

- That's what we need to predict for the test images.

- The head count is already given for the training images, but by detecting bounding boxes and counting them we can predict the number of faces in a new image.

In [None]:
train

In [None]:
# compute the number of bounding boxes per train image
# train_images['count'] = train_images.loc[:,train_images.columns !='HeadCount'].apply(lambda row: 1 if np.isfinite(row.width) else 0, axis=1)


# train_images_count = train_images.loc[:,train_images.columns !='HeadCount'].groupby('Name').sum().reset_index()

In [None]:
# train_images_count['HeadCount']=train['HeadCount']
# train_images_count.head()

In [None]:
# len(train_images_count.Name.unique())

Here we see that the number of bounding boxes per image is equal to `HeadCount`, which we can also call ```FACECOUNT```.
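One way to verify this is to count the annotation rows per image and compare the result with `HeadCount`. The sketch below does that on made-up data; the values are hypothetical and only the column names match this notebook.

```python
import pandas as pd

# Made-up data using the column names from this notebook.
train_demo = pd.DataFrame({"Name": ["a.jpg", "b.jpg"], "HeadCount": [2, 1]})
bbox_demo = pd.DataFrame({"Name": ["a.jpg", "a.jpg", "b.jpg"],
                          "xmin": [10, 60, 30], "ymin": [20, 25, 40],
                          "xmax": [50, 90, 80], "ymax": [70, 65, 95]})

# Number of annotated boxes for each image.
box_counts = bbox_demo.groupby("Name").size().rename("box_count").reset_index()

# Line the counts up with HeadCount; images without boxes get a count of 0.
check = train_demo.merge(box_counts, on="Name", how="left")
check["box_count"] = check["box_count"].fillna(0).astype(int)

# Rows where the two counts disagree.
mismatches = check[check["box_count"] != check["HeadCount"]]
print(len(mismatches))
```

An empty `mismatches` frame means every image has exactly as many boxes as its head count.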

In [None]:
# See this article on how to plot bar charts with Bokeh:
# https://towardsdatascience.com/interactive-histograms-with-bokeh-202b522265f3

def hist_hover(dataframe, column, colors=["#94c8d8", "#ea5e51"], bins=30, title=''):
    hist, edges = np.histogram(dataframe[column], bins = bins)
    
    hist_df = pd.DataFrame({column: hist,
                             "left": edges[:-1],
                             "right": edges[1:]})
    hist_df["interval"] = ["%d to %d" % (left, right) for left, 
                           right in zip(hist_df["left"], hist_df["right"])]

    src = ColumnDataSource(hist_df)
    plot = figure(plot_height = 400, plot_width = 600,
          title = title,
          x_axis_label = 'Faces in image',
          y_axis_label = "Count")    
    plot.quad(bottom = 0, top = column,left = "left", 
        right = "right", source = src, fill_color = colors[0], 
        line_color = "#35838d", fill_alpha = 0.7,
        hover_fill_alpha = 0.7, hover_fill_color = colors[1])
        
    hover = HoverTool(tooltips = [('Interval', '@interval'),
                              ('Count', str("@" + column))])
    plot.add_tools(hover)
    
    output_notebook()
    show(plot)

In [None]:
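# Use `train` (one row per image); `train_images` has one row per box and would over-weight crowded images
hist_hover(train, 'HeadCount', title='Number of faces per image')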

In [None]:
train_images.head()

In [None]:
df=train_images
df.head()

In [None]:
df['x_center'] = df['xmin'] + df['width']/2
df['y_center'] = df['ymin'] + df['height']/2
df['classes'] = 0


# regex=False so that '.' is treated as a literal dot, not as "any character"
df['image_id'] = df['Name'].str.replace('.jpg', '', regex=False)

df = df[['image_id','xmin', 'ymin', 'width', 'height','x_center','y_center','classes']]

In [None]:
df.head()

# Recreation of YOLOv5 model for Face Detection

## First and Foremost

### Data 

> (Remember to choose GPU in Runtime if not already selected. Runtime --> Change Runtime Type --> Hardware accelerator --> GPU)

In [None]:
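# Note: this rebinds the name `Image`, shadowing the PIL `Image` imported at the top of the notebook
from IPython.display import Image, clear_output  # to display images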

In [None]:
# import required dependencies

import numpy as np # linear algebra
import pandas as pd # data processing, CSV file I/O (e.g. pd.read_csv)
import os
from tqdm.auto import tqdm
import shutil as sh

import matplotlib.pyplot as plt

%matplotlib inline

> Clone the GitHub repo

1. 👌 In Settings, turn Internet on.

In [None]:
!git clone https://github.com/AIVenture0/yolov5.git

In [None]:
# check for the cloned repo
!ls -R

In [None]:
# move all the files of YOLOv5 to current working directory
!mv yolov5/* ./

In [None]:
# check for all the files in the current working directory
!ls

> Install Dependencies

In [None]:
!pip install -r requirements.txt

In [None]:
# # read the training data.


# df = pd.read_csv('../input/global-wheat-detection/train.csv')
# bboxs = np.stack(df['bbox'].apply(lambda x: np.fromstring(x[1:-1], sep=',')))
# for i, column in enumerate(['x', 'y', 'w', 'h']):
#     df[column] = bboxs[:,i]
# df.drop(columns=['bbox'], inplace=True)
# df['x_center'] = df['x'] + df['w']/2
# df['y_center'] = df['y'] + df['h']/2
# df['classes'] = 0
# from tqdm.auto import tqdm
# import shutil as sh
# df = df[['image_id','x', 'y', 'w', 'h','x_center','y_center','classes']]

In [None]:
# count
index = list(set(df.image_id))
len(index)


### Data Creation


- To work with YOLO, you need to arrange your data in a particular format, because that is what YOLO expects.

> Format

- convertor (main directory)
    - fold0
        - labels (contains all the box dimensions)
            - train2017
            - val2017
        - images (contains images)
            - train2017
            - val2017
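Each label file holds one line per box in the form `class x_center y_center width height`, with the four box values expressed as fractions of the image width and height. A minimal sketch of that conversion from pixel corner coordinates follows. The function name and the example numbers are illustrative and not part of the original notebook.

```python
def yolo_label_line(xmin, ymin, xmax, ymax, img_w, img_h, class_id=0):
    """Return one YOLO label line for a box given in pixel corner coordinates."""
    box_w = xmax - xmin
    box_h = ymax - ymin
    x_center = xmin + box_w / 2
    y_center = ymin + box_h / 2
    # Normalize by the size of the image the box belongs to.
    values = (x_center / img_w, y_center / img_h, box_w / img_w, box_h / img_h)
    return "{} {:.6f} {:.6f} {:.6f} {:.6f}".format(class_id, *values)

# Hypothetical 200x200 px box inside an 800x500 px image.
line = yolo_label_line(100, 50, 300, 250, img_w=800, img_h=500)
print(line)
```

Because the values are relative to each image's own size, images of different resolutions need their own width and height in this step.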

In [None]:
# Transform the dataset into the folder layout YOLOv5 expects.

source = 'train'
input_dir = '../input/count-the-number-of-faces-present-in-an-image'
for fold in [0]:
    # Hold out one fifth of the image ids as the validation split
    val_index = set(index[len(index)*fold//5:len(index)*(fold+1)//5])
    for name, mini in tqdm(df.groupby('image_id')):
        path2save = 'val2017' if name in val_index else 'train2017'
        label_dir = 'convertor/fold{}/labels/{}'.format(fold, path2save)
        image_dir = 'convertor/fold{}/images/{}'.format(fold, path2save)
        os.makedirs(label_dir, exist_ok=True)
        os.makedirs(image_dir, exist_ok=True)

        # One label file per image, one "class x_center y_center width height" line per box
        with open('{}/{}.txt'.format(label_dir, name), 'w') as f:
            row = mini[['classes','x_center','y_center','width','height']].astype(float).values
            # NOTE: dividing by 1024 assumes every image is 1024x1024 pixels;
            # YOLO labels should be normalized by each image's own width and height.
            row = row/1024
            row = row.astype(str)
            for j in range(len(row)):
                f.write(' '.join(row[j]))
                f.write("\n")
        # Copy the image next to its labels
        sh.copy('{}/{}/image_data/{}.jpg'.format(input_dir, source, name),
                '{}/{}.jpg'.format(image_dir, name))

In [None]:
print(os.listdir("../input/count-the-number-of-faces-present-in-an-image/train"))

In [None]:
# !ls ./convertor

!ls ./convertor/fold0/labels/train2017/12433.txt

> Training a Custom YOLOv5 Detector for Faces

Once again, if you want to understand the concepts behind YOLOv5 in more depth, check the [Beginners Notebook On YOLOv5](https://www.kaggle.com/vin1234/gettingstarted-with-yolov5-global-wheat-detection).
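The `--data` argument of `train.py` points to a small YAML file that describes the dataset. The `face_count.yaml` used in this notebook is not shown here, so the sketch below writes a plausible equivalent. It assumes the upstream YOLOv5 convention of `train`, `val`, `nc` and `names` keys and the `convertor/fold0` folders created earlier. The helper name and output file name are hypothetical.

```python
from pathlib import Path

def write_dataset_yaml(path, train_dir, val_dir, names):
    """Write a YOLOv5-style dataset description (assumed keys: train, val, nc, names)."""
    lines = [
        "train: {}".format(train_dir),
        "val: {}".format(val_dir),
        "nc: {}".format(len(names)),  # number of classes
        "names: [{}]".format(", ".join("'{}'".format(n) for n in names)),
    ]
    text = "\n".join(lines) + "\n"
    Path(path).write_text(text)
    return text

yaml_text = write_dataset_yaml(
    "face_count_demo.yaml",  # hypothetical file name
    "convertor/fold0/images/train2017/",
    "convertor/fold0/images/val2017/",
    ["face"],
)
print(yaml_text)
```

With a single class, `nc` is 1 and every label line starts with class id 0.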

In [None]:
# I am running this only as a trial (to save training time and GPU quota),
# so the training settings are kept small.

# Play with all the parameters and see how they affect performance.


# !python train.py --img 1024 --batch 20 --epochs 10 --data ../input/yaml-file-for-face-count-data-model/face_count.yaml --cfg ../input/yaml-file-for-face-count-data-model/yolov5x.yaml --name yolov5x_fold0_new


!python ./train.py --img 640 --batch 3 --epochs 20 --data ../input/yaml-file-for-face-count-data-model/face_count.yaml --cfg ../input/yaml-file-for-face-count-data-model/yolov5x.yaml --name yolov5x_fold0_new

### Run Inference With Trained Weights

In [None]:
# trained weights are saved by default in the weights folder
%ls weights/

In [None]:

!python ./detect.py --weights ./weights/last_yolov5x_fold0_new.pt --img 640 --conf 0.4 --source ./convertor/fold0/images/val2017

### Output will look something like this.

In [None]:
# This will work from your end when you edit this notebook and run it.
Image(filename='/kaggle/working/inference/output/16800.jpg', width=400)

In [None]:
Image(filename='/kaggle/working/inference/output/10185.jpg', width=400)

In [None]:
Image(filename='/kaggle/working/inference/output/10118.jpg', width=400)

The model's predictions are not very good yet.

- I have used up my weekly GPU quota by now.
- I leave it to you to practice and try out different parameters to achieve better results.
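One parameter that directly affects the predicted count is the confidence threshold passed as `--conf`: the face count for an image is the number of detections that score above it. The sketch below illustrates the effect with made-up confidence scores. It does not call YOLOv5, and the function name is hypothetical.

```python
def count_faces(confidences, conf_threshold=0.4):
    """Count detections whose confidence is above the threshold."""
    return sum(1 for c in confidences if c > conf_threshold)

# Hypothetical detection confidences for a single image.
scores = [0.91, 0.75, 0.42, 0.38, 0.12]

for threshold in (0.25, 0.4, 0.6):
    print(threshold, count_faces(scores, threshold))
```

A lower threshold counts more faces but lets in more false positives, so it is worth tuning against the known `HeadCount` values of the training images.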

-------------------Let me know in the comment section about your results-----------------------------------