A simple face recognition engine
The following list contains the names of the celebrities and the number of photos used to train each one, in descending order of photo count.
Celebrities list
George_W_Bush 530 Colin_Powell 236 Tony_Blair 144 Donald_Rumsfeld 121 Gerhard_Schroeder 109 Ariel_Sharon 77 Hugo_Chavez 71 Junichiro_Koizumi 60 Jean_Chretien 55 John_Ashcroft 53 Serena_Williams 52 Jacques_Chirac 52 Vladimir_Putin 49 Luiz_Inacio_Lula_da_Silva 48 Gloria_Macapagal_Arroyo 44 Jennifer_Capriati 42 Arnold_Schwarzenegger 42 Lleyton_Hewitt 41 Laura_Bush 41 Hans_Blix 39 Alejandro_Toledo 39 Nestor_Kirchner 37 Andre_Agassi 36 Alvaro_Uribe 35 Tom_Ridge 33 Silvio_Berlusconi 33 Megawati_Sukarnoputri 33 Vicente_Fox 32 Roh_Moo-hyun 32 Kofi_Annan 32 John_Negroponte 31 David_Beckham 31 Recep_Tayyip_Erdogan 30 Guillermo_Coria 30 Mahmoud_Abbas 29 Bill_Clinton 29 Juan_Carlos_Ferrero 28 Jack_Straw 28 Ricardo_Lagos 27 Rudolph_Giuliani 26 Gray_Davis 26 Tom_Daschle 25 Winona_Ryder 24 Jeremy_Greenstock 24 Atal_Bihari_Vajpayee 24 Tiger_Woods 23 Saddam_Hussein 23 Jose_Maria_Aznar 23 Pete_Sampras 22 Naomi_Watts 22 Lindsay_Davenport 22 Hamid_Karzai 22 George_Robertson 22 Jennifer_Lopez 21 Jennifer_Aniston 21 Carlos_Menem 21 Amelie_Mauresmo 21 Paul_Bremer 20 Michael_Bloomberg 20 Jiang_Zemin 20 Igor_Ivanov 20 Angelina_Jolie 20 Tim_Henman 19 Nicole_Kidman 19 Julianne_Moore 19 Joschka_Fischer 19 John_Howard 19 Carlos_Moya 19 Abdullah_Gul 19 Richard_Myers 18 Pervez_Musharraf 18 Michael_Schumacher 18 Lance_Armstrong 18 Fidel_Castro 18 Venus_Williams 17 Spencer_Abraham 17 Renee_Zellweger 17 John_Snow 17 John_Kerry 17 John_Bolton 17 Jean_Charest 17 Bill_Gates 17 Trent_Lott 16 Tommy_Franks 16 Halle_Berry 16 Taha_Yassin_Ramadan 15 Pierce_Brosnan 15 Norah_Jones 15 Nancy_Pelosi 15 Mohammed_Al-Douri 15 Meryl_Streep 15 Julie_Gerberding 15 Hu_Jintao 15 Dominique_de_Villepin 15 Bill_Simon 15 Andy_Roddick 15 Yoriko_Kawaguchi 14 Roger_Federer 14 Mahathir_Mohamad 14 Kim_Clijsters 14 James_Blake 14 Hillary_Clinton 14 Eduardo_Duhalde 14 Dick_Cheney 14 David_Nalbandian 14 Britney_Spears 14 Wen_Jiabao 13 Salma_Hayek 13 Queen_Elizabeth_II 13 Lucio_Gutierrez 13 Joe_Lieberman 13 Jackie_Chan 13 
Gordon_Brown 13 George_HW_Bush 13 Edmund_Stoiber 13 Charles_Moose 13 Ari_Fleischer 13 Rubens_Barrichello 12 Michael_Jackson 12 Keanu_Reeves 12 Jennifer_Garner 12 Jeb_Bush 12 Howard_Dean 12 Harrison_Ford 12 Gonzalo_Sanchez_de_Lozada 12 Anna_Kournikova 12 Adrien_Brody 12 Tang_Jiaxuan 11 Sergio_Vieira_De_Mello 11 Sergey_Lavrov 11 Richard_Gephardt 11 Paul_Burrell 11 Nicanor_Duarte_Frutos 11 Mike_Weir 11 Mark_Philippoussis 11 Kim_Ryong-sung 11 John_Paul_II 11 John_Allen_Muhammad 11 Jiri_Novak 11 James_Kelly 11 Condoleezza_Rice 11 Catherine_Zeta-Jones 11 Ann_Veneman 11 Walter_Mondale 10 Tommy_Thompson 10 Tom_Hanks 10 Tom_Cruise 10 Richard_Gere 10 Paul_Wolfowitz 10 Paradorn_Srichaphan 10 Muhammad_Ali 10 Mohammad_Khatami 10 Jean-David_Levitte 10 Javier_Solana 10 Jason_Kidd 10 Jacques_Rogge 10 Ian_Thorpe 10 Bill_McBride 10 Zhu_Rongji 9 Vaclav_Havel 9 Tung_Chee-hwa 9 Thomas_OBrien 9 Sylvester_Stallone 9 Richard_Armitage 9 Ray_Romano 9 Paul_ONeill 9 Li_Peng 9 Leonardo_DiCaprio 9 Kate_Hudson 9 Jose_Serra 9 John_Abizaid 9 Joan_Laporta 9 Jimmy_Carter 9 Jesse_Jackson 9 Jeong_Se-hyun 9 Hugh_Grant 9 Hosni_Mubarak 9 Heizo_Takenaka 9 George_Clooney 9 Fernando_Gonzalez 9 Colin_Farrell 9 Charles_Taylor 9 Bill_Graham 9 Bill_Frist 9 Yasser_Arafat 8 Yao_Ming 8 Shimon_Peres 8 Sheryl_Crow 8 Ron_Dittemore 8 Robert_Redford 8 Robert_Duvall 8 Robert_Blake 8 Richard_Virenque 8 Ralf_Schumacher 8 Paul_Martin 8 Naji_Sabri 8 Mohamed_ElBaradei 8 Michelle_Kwan 8 Michael_Chang 8 Maria_Shriver 8 Li_Zhaoxing 8 Kim_Dae-jung 8 Kevin_Costner 8 Justin_Timberlake 8 Juan_Pablo_Montoya 8 Jonathan_Edwards 8 John_Edwards 8 Jelena_Dokic 8 Gerry_Adams 8 Fernando_Henrique_Cardoso 8 Cesar_Gaviria 8 Celine_Dion 8 Bob_Hope 8 Antonio_Palocci 8 Ana_Palacio 8 Ali_Naimi 8 Al_Gore 8 Yashwant_Sinha 7 William_Ford_Jr 7 William_Donaldson 7 Vojislav_Kostunica 7 Vincent_Brooks 7 Steven_Spielberg 7 Sophia_Loren 7 Romano_Prodi 7 Robert_Zoellick 7 Pedro_Almodovar 7 Paul_McCartney 7 Oscar_De_La_Hoya 7 Norm_Coleman 7 Mike_Myers 7 
Mike_Martz 7 Matthew_Perry 7 Martin_Scorsese 7 Mariah_Carey 7 Liza_Minnelli 7 Larry_Brown 7 Justine_Pasek 7 Jon_Gruden 7 John_Travolta 7 John_McCain 7 John_Manley 7 Jean-Pierre_Raffarin 7 Holly_Hunter 7 Gunter_Pleuger 7 Goldie_Hawn 7 Geoff_Hoon 7 Elton_John 7 Dennis_Kucinich 7 David_Wells 7 Bob_Stoops 7 Binyamin_Ben-Eliezer 7 Ben_Affleck 7 Ana_Guevara 7 Amelia_Vega 7 Al_Sharpton 7 Zinedine_Zidane 6 Yoko_Ono 6 Valery_Giscard_dEstaing 6 Valentino_Rossi 6 Tony_Stewart 6 Tommy_Haas 6 Thaksin_Shinawatra 6 Tariq_Aziz 6 Susan_Sarandon 6 Steve_Lavin 6 Silvan_Shalom 6 Sarah_Jessica_Parker 6 Sarah_Hughes 6 Roy_Moore 6 Roman_Polanski 6 Rob_Marshall 6 Robert_De_Niro 6 Rick_Perry 6 Ricardo_Sanchez 6 Paula_Radcliffe 6 Natalie_Coughlin 6 Monica_Seles 6 Mike_Krzyzewski 6 Michael_Douglas 6 Marco_Antonio_Barrera 6 Luis_Horna 6 Luis_Ernesto_Derbez_Bautista 6 Leonid_Kuchma 6 Kamal_Kharrazi 6 Jose_Manuel_Durao_Barroso 6 JK_Rowling 6 Jim_Furyk 6 Jay_Garner 6 Jan_Ullrich 6 Gwyneth_Paltrow 6 Fujio_Cho 6 Elsa_Zylberstein 6 Edward_Lu 6 Diana_Krall 6 Dennis_Hastert 6 Costas_Simitis 6 Clint_Eastwood 6 Clay_Aiken 6 Christine_Todd_Whitman 6 Charlton_Heston 6 Carmen_Electra 6 Cameron_Diaz 6 Calista_Flockhart 6 Bulent_Ecevit 6 Boris_Becker 6 Bob_Graham 6 Billy_Crystal 6 Arminio_Fraga 6 Angela_Bassett 6 Albert_Costa 6
This project was developed to provide a plug-and-play facial recognition tool able to detect and recognize multiple faces in photos. It aims to be interoperable with other tools. For this purpose, it exposes a REST API to interact with the internal face-recognition engine (train/tune/predict) and returns the result of the prediction in JSON format.
It is written to be a code base/project template for future projects where a more complex face detection pipeline and neural network are needed. It is nonetheless a complete face recognition tool that can be deployed on Docker. Currently it uses a Multi Layer Perceptron (MLP) as the neural network to predict the given faces.
The tool is powered by Flask_MonitoringDashboard, which exposes some useful utilization/performance graphs at the /dashboard endpoint.
- face_recognition Extracts face points from images
- Flask The Python micro framework for building web applications
- Flask_MonitoringDashboard Automatically monitor the evolving performance of Flask/Python web services
- numpy The fundamental package for scientific computing with Python.
- olefile Parse, read and write Microsoft OLE2 files (deal with image)
- Pillow The friendly PIL fork (Python Imaging Library)
- py-bcrypt Python wrapper of OpenBSD's Blowfish password hashing code
- redis-py The Python interface to the Redis key-value store.
- scikit-learn Machine learning in Python
- tqdm A Fast, Extensible Progress Bar
- werkzeug The comprehensive WSGI web application library
NOTE: If you encounter an error during pip install -r requirements.txt, it's possible that you have not installed cmake. dlib needs cmake.
You can install cmake using:
apt install cmake -y (Debian/Ubuntu)
yum install cmake -y (CentOS/Fedora/RedHat)
The software is written in Python; the necessary dependencies are listed in the requirements.txt file.
Create a virtual environment with your favorite Python package manager:
# Create a new environment
conda create -n PyRecognizer python=3.7.4
# Activate the environment
conda activate PyRecognizer
# Install the necessary dependencies
pip install -r requirements.txt
At this point all the libraries needed to run the tool are installed, and you can run the software.
The example video shows how to interact with the tool for the following processes:
- Create dataset from images
- Predict image
- Train/Tune the neural network
Before you can train the neural network with the photos, you need to create an archive that contains the images of the faces of the people that you want to predict.
- Save a bunch of images of the people that you need to recognize.
- Copy the images of each person into a folder. The name of that folder is important, because it will be used during prediction as the label for the images it contains.
- Compress the folders into a zip file.
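The three steps above could be scripted roughly as follows. This is a minimal sketch using only the standard library; the function name build_dataset_zip and the paths passed to it are illustrative and not part of PyRecognizer. It assumes the archive should contain one top-level folder per person, as described above.

```python
import zipfile
from pathlib import Path


def build_dataset_zip(dataset_dir, zip_path):
    """Zip every <person>/<image> file found directly under dataset_dir.

    Returns the number of files written. The folder name is kept in the
    archive path because it acts as the label for the images it contains.
    """
    dataset_dir = Path(dataset_dir)
    written = 0
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as archive:
        for person_dir in sorted(p for p in dataset_dir.iterdir() if p.is_dir()):
            for image in sorted(f for f in person_dir.iterdir() if f.is_file()):
                # Store as "<person>/<file>" so the label survives extraction
                archive.write(image, arcname=f"{person_dir.name}/{image.name}")
                written += 1
    return written
```

Files that sit directly in the dataset folder (outside any person folder) are skipped, since they would have no label.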
Before training the neural network, you have to create a dataset with the images of the people that you want to recognize. If your dataset tree structure looks like the following directory tree, you can continue with the training phase.
├── bfegan
│   └── ...
├── chris
│   └── ...
├── dhawley
│   └── ...
├── graeme
│   └── ...
└── heather
    └── ...
In this case we have a dataset that contains the photos of 5 people (bfegan, chris, dhawley, graeme and heather).
Each directory contains the photos related to the "target".
You can find an example dataset at the following link: https://www.kaggle.com/jessicali9530/lfw-dataset
Some people in this dataset have only very few images.
We can create a new dataset using the following bash commands, which extract only the people that have more than 5 images:
# Extract only the people that have more than 5 photos (-gt 5)
for i in */; do a=$(ls "$i" | wc -l); if [ "$a" -gt 5 ]; then echo "${i%/}"; fi; done > people_ok
# Create a directory to store the images
mkdir -p /tmp/faces
# Copy the filtered directories into the new one
while IFS= read -r i; do cp -r "$i" /tmp/faces/; done < people_ok
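For readers who prefer Python to shell, the same filtering could be sketched as below. The function name and the min_photos parameter are hypothetical and only serve the illustration; the behaviour mirrors the "more than 5" rule of the shell loop.

```python
import shutil
from pathlib import Path


def copy_people_with_enough_photos(source_dir, target_dir, min_photos=5):
    """Copy each person folder holding more than min_photos files.

    Mirrors the `-gt 5` test of the shell loop: a folder with exactly
    min_photos files is skipped. Returns the copied folder names.
    """
    source_dir, target_dir = Path(source_dir), Path(target_dir)
    target_dir.mkdir(parents=True, exist_ok=True)
    copied = []
    for person_dir in sorted(p for p in source_dir.iterdir() if p.is_dir()):
        photo_count = sum(1 for f in person_dir.iterdir() if f.is_file())
        if photo_count > min_photos:
            destination = target_dir / person_dir.name
            if not destination.exists():
                # copytree creates the destination folder itself
                shutil.copytree(person_dir, destination)
            copied.append(person_dir.name)
    return copied
```

The target folder should live outside the source folder (for example /tmp/faces, as in the commands above), so that the copy does not end up scanning its own output.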
At this point the dataset is complete and you can continue with training/tuning.
Back up and remove the existing model (if present, inside the dataset/model directory); the tool will then understand that you want to train the model and will initialize a new MLP model. The model folder is named with the template %Y%m%d_%H%M%S, corresponding to the time at which it was generated.
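As an illustration of the %Y%m%d_%H%M%S naming template, the sketch below formats a timestamp the same way and picks the most recent name from a list of folder names. Both helper functions are hypothetical and written only for this example; they are not taken from the PyRecognizer code.

```python
from datetime import datetime

MODEL_NAME_FORMAT = "%Y%m%d_%H%M%S"  # template quoted in the text above


def model_folder_name(moment):
    """Format a datetime the way the model folders are named."""
    return moment.strftime(MODEL_NAME_FORMAT)


def newest_model(folder_names):
    """Return the most recent name, ignoring entries that do not match the template."""
    dated = []
    for name in folder_names:
        try:
            dated.append((datetime.strptime(name, MODEL_NAME_FORMAT), name))
        except ValueError:
            continue  # e.g. a README.md living next to the model folders
    return max(dated)[1] if dated else None
```

Because the template orders fields from year down to second, the names also sort chronologically as plain strings.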
Open your browser at the endpoint:port/train address specified in the configuration file (conf/test.json) and you will be redirected to the Administrator login page.
NOTE: You can switch SSL on or off; be sure to put https before the endpoint IP/hostname if it is enabled.
NOTE: In order to access the training/tuning page, you have to run the script utils/add_users.py to create an admin user, who is allowed to manage training/tuning of the neural network.
NOTE: An instance of redis has to be up and running if you want to train your custom neural network, because the login reads its data from redis.
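Before opening the training page, it can help to check that something is listening on the redis port. The sketch below is an assumption-laden convenience, not part of the tool: it only tests that a TCP connection can be opened (it does not speak the redis protocol), and the default redis port 6379 is taken from the docker command used in the test instructions.

```python
import socket


def is_port_reachable(host, port, timeout=1.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unresolved host names
        return False
```

For a local instance this would be called as is_port_reachable("127.0.0.1", 6379).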
At this point you can upload the dataset (the zip file created previously) and wait for the training of the neural network to finish.
You can tail the log in log/pyrecognizer.log to follow the status of the training (lnav is your friend).
Once completed, the browser page will be refreshed automatically and you can:
- predict new photos that the neural network hasn't seen before, related to the people in the dataset.
- reduce the threshold and see how similar you are to a celebrity!
NOTE: The same procedure can be applied to tune the neural network. In this way, you execute an exhaustive search over specified parameter values for the KNN classifier. This is, of course, more time consuming, but the neural network produced will be more precise. The endpoint is /tune instead of /train.
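To make "exhaustive search over specified parameter values" concrete, the sketch below enumerates every combination in a parameter grid and keeps the best-scoring one. It is a generic illustration in plain Python, not the tool's tuning code; the parameter names in the example grid and the score callback are placeholders, since the grid actually used by the tool is not documented here.

```python
from itertools import product


def exhaustive_search(param_grid, score):
    """Evaluate score(params) for every combination in param_grid.

    param_grid maps a parameter name to the list of values to try.
    Returns (best_params, best_score); ties keep the first combination seen.
    """
    names = list(param_grid)
    best_params, best_score = None, float("-inf")
    for values in product(*(param_grid[name] for name in names)):
        params = dict(zip(names, values))
        current = score(params)
        if current > best_score:
            best_params, best_score = params, current
    return best_params, best_score


# Hypothetical grid: 3 x 2 = 6 combinations would be evaluated.
grid = {"hidden_units": [64, 128, 256], "learning_rate": [0.01, 0.001]}
```

The number of evaluations is the product of the list lengths, which is why tuning takes noticeably longer than a single training run.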
After the train/tune phase, you have to modify the configuration file in order to use the new model. The model is saved in a new folder named with the related timestamp (modify classifier -> timestamp in the configuration file).
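The configuration change could also be scripted. The sketch below assumes the configuration is a JSON file in which the model timestamp lives under a "classifier" object with a "timestamp" key, which is how the "classifier -> timestamp" hint above reads; the real file may be laid out differently, and the function name is hypothetical.

```python
import json


def set_model_timestamp(conf_path, timestamp):
    """Point the configuration at the model folder named `timestamp`."""
    with open(conf_path, encoding="utf-8") as handle:
        conf = json.load(handle)
    # Assumed layout: {"classifier": {"timestamp": "..."}, ...}
    conf.setdefault("classifier", {})["timestamp"] = timestamp
    with open(conf_path, "w", encoding="utf-8") as handle:
        json.dump(conf, handle, indent=2)
    return conf
```

All other keys in the file are written back unchanged, although the indentation of the file is normalized to two spaces.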
tree
.
├── api
│ ├── Api.py # Code that contains the API endpoint logic
│ └── templates # Folder that contains the HTML template for tune/train/predict
│ ├── train.html
│ └── upload.html
├── conf # Configuration folder
│ ├── conf.json # Tool configuration file
│ ├── dashboard.ini # File related to the Dashboard configuration
│ ├── flask_monitoringdashboard.db # Dashboard database
│ ├── ssl # SSL Certificates folder
│ │ ├── localhost.crt
│ │ └── localhost.key
├── dataset # Model folder + test dataset
│ ├── face_training_dataset_little.zip # Model used for test train
│ ├── face_training_dataset.zip
│ └── model # Neural network model's folder
│ ├── 20191123_171821 # Folder for the NN model
│ │ ├── model.clf # Neural network dumped
│ │ ├── model.dat # Data used for train/tune
│ │ └── model.json # Hyperparameters of the NN
│ └── README.md
├── datastructure # Datastructure/Class used
│ ├── Administrator.py # Class that handle the admin of the NN, for train/tune
│ ├── Classifier.py # Class delegated to predict the photos
│ ├── Person.py # Class delegated to handle the "stuff" related to loading people data
│ └── Response.py # Class delegated to wrap the response
├── docker-compose.yml # docker-compose file to bring up PyRecognizer (predict + train/tune)
├── Dockerfile # Dockerfile related to the PyRecognizer only (only predict)
├── LICENSE # License file
├── log # Log folder
│ └── pyrecognizer.log
├── main.py # Main program to spawn the tool
├── README.md
├── requirements.txt # Dependencies file
├── uploads # Folder that contains the upload data
├── test # Test folder
│ ├── conf_test.json
│ ├── test_classifier.py # File with test cases
│ ├── test_images # Test data
│ │ ├── bush_test.jpg
│ │ ├── multi_face_test.jpg
│ │ └── unknown_face.jpg
│ ├── test_log # Log of the test
│ │ └── pyrecognizer.log
│ └── uploads
│ ├── predict
│ ├── training
│ ├── unknown
│ └── upload
│ ├── predict
│ ├── training
│ └── upload
├── utils
│ ├── add_users.py # Python file to add a new user for train/tune the network
│ └── util.py # Common methods
└── wsgi.py
- Missing the photo in request
{
"response": {
"data": null,
"date": "2020-01-12 15:12:14.762526",
"description": "You have sent a request without the photo to predict :/",
"error": "NO_FILE_IN_REQUEST",
"status": "KO"
}
}
- Missing threshold parameter in request
{
"response": {
"data": null,
"date": "2020-01-12 15:12:14.769286",
"description": "You have sent a request without the `threshold` parameter :/",
"error": "THRESHOLD_NOT_PROVIDED",
"status": "KO"
}
}
- Threshold provided is a number outside the valid range
{
"response": {
"data": null,
"date": "2020-01-12 15:12:14.776730",
"description": "Threshold have to be greater than 0 and lesser than 100!",
"error": "THRESHOLD_ERROR_VALUE",
"status": "KO"
}
}
- File in request is not a valid one
{
"response": {
"data": null,
"date": "2019-11-23 18:10:11.038329",
"description": "Seems that the file that you have tried to upload is not valid ...",
"error": "FILE_NOT_VALID",
"status": "KO"
}
}
- Error parsing the threshold parameter
{
"response": {
"data": null,
"date": "2020-01-12 15:12:14.784154",
"description": "Threshold is not an integer!",
"error": "UNABLE_CAST_INT",
"status": "KO"
}
}
- Dataset upload is not valid
{
"response": {
"data": null,
"date": "2019-11-23 18:10:11.038329",
"description": "Seems that the dataset is not valid",
"error": "ERROR DURING LOADING DAT",
"status": "KO"
}
}
- Unable to detect a face
{
"response": {
"data": null,
"date": "2019-11-23 18:10:11.038329",
"description": "Seems that in this images there is no face :/",
"error": "FACE_NOT_FOUND",
"status": "KO"
}
}
- Face not recognized
{
"response": {
"data": {},
"date": "2019-11-23 18:17:58.287413",
"description": "FACE_NOT_RECOGNIZED",
"error": null,
"status": "OK"
}
}
- Face recognized
{
"response": {
"data": {
"iroy": 0.5762745881923004 # Name of the person: confidence
},
"date": "2019-11-23 18:23:01.762757",
"description": "ijyibbvcgq.png", # Random string for view image prediction (visit /uploads/ijyibbvcgq.png)
"error": null,
"status": "OK"
}
}
- Missing model's classifier
{
"response": {
"data": null,
"date": "2019-11-23 18:27:55.761851",
"description": "CLASSIFIER_NOT_LOADED",
"error": null,
"status": "KO"
}
}
- Login not successful
{
"response": {
"data": null,
"date": "2019-11-23 18:27:55.761851",
"description": "The password inserted is not valid!",
"error": "PASSWORD_NOT_VALID",
"status": "KO"
}
}
- Unable to connect to redis
{
"response": {
"data": null,
"date": "2019-11-23 18:27:55.761851",
"description": "Seems that the DB is not reachable!",
"error": "UNABLE_CONNECT_REDIS_DB",
"status": "KO"
}
}
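A client consuming these responses might handle them along the following lines. This is a sketch based only on the sample payloads listed above; RecognizerError and parse_prediction are hypothetical names, and the "#" annotations shown in the "Face recognized" sample are explanatory comments that would not appear in real JSON.

```python
import json


class RecognizerError(Exception):
    """Raised when the response carries status "KO"."""

    def __init__(self, code, description):
        super().__init__(f"{code}: {description}")
        self.code = code
        self.description = description


def parse_prediction(raw_json):
    """Return {name: confidence} from a response body.

    An empty dict means a face was found but not recognized.
    """
    response = json.loads(raw_json)["response"]
    if response["status"] != "OK":
        # Some KO samples (e.g. CLASSIFIER_NOT_LOADED) leave "error" null
        # and carry the code in "description" instead
        code = response["error"] or response["description"]
        raise RecognizerError(code, response["description"])
    return response["data"] or {}
```

Raising on "KO" keeps the happy path simple: callers receive a plain dictionary and can branch on the error code only where they need to.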
- Feel free to open an issue to request new functionality;
- Feel free to open an issue if you discover a bug;
- New ideas/requests/concepts are very much appreciated!
In order to run the basic test cases, you need to:
- Spawn the PyRecognizer tool using python main.py
- Change directory into the test/ folder
- Run python -m unittest test_classifier.TestPredict
If you are the admin of the neural network, you can test the Admin related methods:
- Spawn the docker image of a redis-db: docker run -dt -p 6379:6379 redis
- Change directory into the test/ folder
- Run python -m unittest test_admin.TestAdmin
We use SemVer for versioning.
- Alessio Savi
This project is licensed under the MIT License - see the LICENSE file for details
Face data is sensitive information. To mitigate the risk of sensitive data being stolen, the tool can run in SSL mode to prevent packet sniffing, and it secures every request using a CSRF mitigation.