Child Image Detection

This project was developed as an undergraduate thesis. Its goal was to combine Haar-Cascade with a CNN (Convolutional Neural Network) to detect children in images: Haar-Cascade captures the facial frames, and the CNN classifies each captured face as child or not.

See Deteccao_de_criancas_em_imagens.pdf to read the undergraduate thesis. It explains all related work and concepts involved in this project. Be aware that, aside from the abstract, it is written in Portuguese (PT_BR).

Getting Started

This project was developed on Ubuntu 20.04 LTS. Before running the Python scripts, it is recommended to read the installation documentation of each required package.

Install git if not present and clone this repository.

$ sudo apt install git
$ git clone https://github.com/phsmoura/child-image-detection

Prerequisites

This project requires these packages to be installed before running it:

  • Python 3:
$ sudo apt-get install python3
  • Pip3:
$ sudo apt install python3-pip
  • Virtualenv:

There are several ways to install it; see PyPI for the official documentation. If the Python version is 3.5+, it is safe to run the following command:

$ pipx install virtualenv

Configuring development environment

Set up a development environment with virtualenv:

$ virtualenv venv
$ source venv/bin/activate

Install the dependency packages by running:

$ pip3 install -r requirements.txt

Running the code

This repository has 4 subdirectories and 3 steps to get it running on real cases; one of the steps is optional. The subdirectories are:

  • dataset: the images used to train the CNN
  • step-1-cnn: scripts that train the CNN and generate a model for each training run
  • step-2-haarcascade: optional step, with the script that detects human faces using the Haar-Cascade algorithm
  • step-3-final: Haar-Cascade combined with the CNN; this is the script used in real cases

About the Dataset

The dataset in this repository has a total of 4,755 images for training the convolutional network: 2,587 adult face images and 2,168 child face images. These images were extracted from 2 different datasets that are available at:

After downloading the images and manually classifying them as children or non-children, all images were scaled to 150x150 pixels and converted to grayscale. Scaling is necessary because the convolutional network only accepts fixed-size inputs, in this case 22,500 (150 ∗ 150) entries. The conversion to grayscale further reduces the number of entries, since only a single luminance channel is provided to the network.
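The pre-processing described above can be sketched as follows. This is a minimal NumPy-only illustration, not the project's own script: the function names are hypothetical, the luminance weights are the common ITU-R BT.601 values, and nearest-neighbour sampling is an assumed stand-in for whatever resizing method was actually used. Keeping three colour channels would give 150 ∗ 150 ∗ 3 = 67,500 entries instead of 22,500.

```python
import numpy as np

TARGET_SIZE = 150  # side length in pixels, as stated in the text


def to_grayscale(rgb: np.ndarray) -> np.ndarray:
    """Collapse an (H, W, 3) RGB image into a single luminance channel."""
    weights = np.array([0.299, 0.587, 0.114])  # ITU-R BT.601 luma weights (assumed)
    return rgb.astype(np.float64) @ weights


def resize_nearest(gray: np.ndarray, size: int = TARGET_SIZE) -> np.ndarray:
    """Resize an (H, W) image to (size, size) with nearest-neighbour sampling."""
    height, width = gray.shape
    rows = np.arange(size) * height // size  # source row for each output row
    cols = np.arange(size) * width // size   # source column for each output column
    return gray[np.ix_(rows, cols)]


def preprocess(rgb: np.ndarray) -> np.ndarray:
    """Return the fixed-size network input as a flat vector."""
    return resize_nearest(to_grayscale(rgb)).reshape(-1)


# Synthetic 480x640 RGB image standing in for a real face photo.
example = np.random.default_rng(0).integers(0, 256, size=(480, 640, 3))
vector = preprocess(example)
print(vector.shape)  # (22500,)
```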

Execute CNN Training

After the images are pre-processed to build the dataset, the convolutional network can be trained. Training is supervised, using 90% of the images for training and the other 10% for cross-validation tests. Because the training and cross-validation sets are created randomly, the training is executed 100 times and produces 100 models, so the best model can be chosen by its success rate. Execute the following commands to run the training:

$ cd step-1-cnn
$ python main.py
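The random 90%/10% split mentioned above can be illustrated with a short standard-library sketch. The function name, the seed parameter, and the placeholder file names are assumptions made for this example; the actual logic lives in "main.py" and may differ.

```python
import random
from typing import List, Optional, Sequence, Tuple


def split_dataset(
    items: Sequence[str], train_percent: int = 90, seed: Optional[int] = None
) -> Tuple[List[str], List[str]]:
    """Shuffle the items and split them into training and validation lists."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    cut = len(shuffled) * train_percent // 100  # integer arithmetic, rounds down
    return shuffled[:cut], shuffled[cut:]


# 4,755 placeholder file names, matching the dataset size of this repository.
images = [f"img_{i:04d}.png" for i in range(4755)]
train, validation = split_dataset(images, seed=1)
print(len(train), len(validation))  # 4279 476
```

Each call with a different seed yields a different split, which is why repeating the training many times produces models with different success rates.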

The number of training runs can be changed at line 79 of the "main.py" file. The success rate of each run is automatically calculated and recorded in the file "tests-results.txt".

When training finishes, the models can be found in the "models" folder. Each numbered folder inside "models" corresponds to an entry in the test column of "tests-results.txt" and contains 2 files: "kids.json" and "weight_kids.h5".
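Choosing the best model then comes down to finding the run with the highest success rate. The sketch below assumes the rates have already been read into a dictionary keyed by run number; the layout of "tests-results.txt" is not described here, so parsing is left out, and the function name and the sample values are purely illustrative.

```python
from pathlib import Path
from typing import Dict


def best_model_dir(success_rates: Dict[int, float], models_root: str = "models") -> Path:
    """Return the folder of the training run with the highest success rate."""
    if not success_rates:
        raise ValueError("no training results to compare")
    best_run = max(success_rates, key=success_rates.get)
    return Path(models_root) / str(best_run)


# Made-up success rates keyed by run number, for illustration only.
rates = {1: 0.91, 2: 0.94, 3: 0.89}
print(best_model_dir(rates))
```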

Test Haar-Cascade facial detection

This step is optional and was created just to see how accurate Haar-Cascade is, since the Haar-Cascade ".xml" files come with the OpenCV installation. Empirical tests were performed with 3 frontal-face cascades (haarcascade_frontalface_default.xml, haarcascade_frontalface_alt.xml, haarcascade_frontalface_alt2.xml). Two of the cascades achieved 97% correct facial detection and one of them 98%.

For that study the following dataset was used:

To reproduce this study, download the dataset, move it into the "step-2-haarcascade" directory under the name "natural_images", and run the "main.py" of this step.

$ cd step-2-haarcascade
$ python main.py
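A test of this kind can be sketched with OpenCV's CascadeClassifier. This is an assumption-laden illustration rather than the repository's script: the function names, the "*.jpg" pattern, and the detectMultiScale parameters are hypothetical, and cv2.data.haarcascades is assumed to be available, as it is in the opencv-python package from PyPI.

```python
from pathlib import Path
from typing import Tuple

# The three frontal-face cascades named in the text.
CASCADES = (
    "haarcascade_frontalface_default.xml",
    "haarcascade_frontalface_alt.xml",
    "haarcascade_frontalface_alt2.xml",
)


def detection_rate(detected: int, total: int) -> float:
    """Percentage of images in which at least one face was found."""
    if total == 0:
        return 0.0
    return 100.0 * detected / total


def count_detections(image_dir: str, cascade_name: str) -> Tuple[int, int]:
    """Count the images in image_dir where the cascade finds a face."""
    # Imported lazily so detection_rate stays usable without OpenCV installed.
    import cv2

    cascade = cv2.CascadeClassifier(cv2.data.haarcascades + cascade_name)
    detected = total = 0
    for path in sorted(Path(image_dir).glob("*.jpg")):
        image = cv2.imread(str(path))
        if image is None:  # unreadable file, skip it
            continue
        gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
        faces = cascade.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5)
        total += 1
        if len(faces) > 0:
            detected += 1
    return detected, total
```

Calling count_detections once per entry in CASCADES on a folder of face images and passing the result to detection_rate gives one percentage per cascade, which is the kind of comparison reported above.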

CNN and Haar-Cascade combined in a real dataset

In this step, the combined algorithms are applied to images containing whole scenes, not just faces of people. That is why Haar-Cascade is used: it identifies the faces of people in an image, whether or not they belong to a child. With the faces identified by Haar-Cascade, the convolutional neural network (CNN) can classify each face as child or not. Identifying just one child is enough to separate the scene image into a directory containing only scenes with children; images without children are discarded. This directory can be found at "step-3-final/kids".
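The decision rule described above, where a single child face is enough to keep a scene, can be sketched independently of OpenCV and TensorFlow. In this illustration the face detector and the classifier are passed in as plain callables; all names are hypothetical, and the stub data replaces real images.

```python
from typing import Callable, Iterable, List, Sequence, TypeVar

Scene = TypeVar("Scene")
Face = TypeVar("Face")


def contains_child(
    scene: Scene,
    detect_faces: Callable[[Scene], Sequence[Face]],
    is_child: Callable[[Face], bool],
) -> bool:
    """True as soon as one detected face is classified as a child."""
    return any(is_child(face) for face in detect_faces(scene))


def select_child_scenes(
    scenes: Iterable[Scene],
    detect_faces: Callable[[Scene], Sequence[Face]],
    is_child: Callable[[Face], bool],
) -> List[Scene]:
    """Keep only the scenes with at least one child; the rest are discarded."""
    return [s for s in scenes if contains_child(s, detect_faces, is_child)]


# Stub data: each scene maps to a list of face labels instead of real pixels.
scenes = {
    "park.jpg": ["adult", "child"],
    "office.jpg": ["adult", "adult"],
    "landscape.jpg": [],
}
kept = select_child_scenes(
    scenes, detect_faces=scenes.get, is_child=lambda face: face == "child"
)
print(kept)  # ['park.jpg']
```

Because any() stops at the first match, the remaining faces in a scene are not classified once a child has been found.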

Before running the script, copy the best model from "step-1-cnn/models//" to "step-3-final":

$ cp -a step-1-cnn/models/<number of the model> step-3-final/model

Also create a directory called "base-full" and put in it all the images to be searched for children.

$ mkdir step-3-final/base-full

As in the previous steps, run the "main.py" script.

$ cd step-3-final
$ python main.py
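For readers curious how the copied model folder might be consumed, here is a hedged sketch. It assumes that "kids.json" holds a Keras architecture saved as JSON and that "weight_kids.h5" holds the matching weights; the file names suggest this, but it is not confirmed here. The function names are hypothetical.

```python
from pathlib import Path
from typing import Tuple


def model_files(model_dir: str = "model") -> Tuple[Path, Path]:
    """Paths of the architecture and weight files inside a model folder."""
    root = Path(model_dir)
    return root / "kids.json", root / "weight_kids.h5"


def load_model(model_dir: str = "model"):
    """Rebuild the network from its JSON architecture and load its weights."""
    # Imported lazily: TensorFlow is heavy and only needed for actual loading.
    from tensorflow.keras.models import model_from_json

    architecture_path, weights_path = model_files(model_dir)
    model = model_from_json(architecture_path.read_text())  # assumes Keras JSON
    model.load_weights(str(weights_path))
    return model
```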

Built With

  • OpenCV - Open source computer vision and machine learning software library.
  • TensorFlow - End-to-End open source platform for machine learning

Authors

License

This project is licensed under the MIT License - see the LICENSE.md file for details

Acknowledgments

This project was mentored by:
