A TensorFlow implementation of FaceBoxes. Some changes have been made to the RDCL module to achieve better accuracy and faster inference:
- the input size is 512 (1024 in the paper), and the first conv has stride 2 with a 7x7x12 kernel
- the first max pool is replaced by a 3x3x24 conv with stride 2
- the second 5x5 stride-2 conv and max pool are replaced by two 3x3 stride-2 convs
- anchor-based sampling is used in data augmentation
The code looks like this:

```python
import tensorflow as tf
import tensorflow.contrib.slim as slim  # tf1.14

def RDCL(net_in):
    # rapidly shrink the input: 4 stride-2 convs give a total stride of 16
    with tf.name_scope('RDCL'):
        net = slim.conv2d(net_in, 12, [7, 7], stride=2, activation_fn=tf.nn.relu, scope='init_conv1')
        net = slim.conv2d(net, 24, [3, 3], stride=2, activation_fn=tf.nn.crelu, scope='init_conv2')
        net = slim.conv2d(net, 32, [3, 3], stride=2, activation_fn=tf.nn.relu, scope='conv1x1_before1')
        net = slim.conv2d(net, 64, [3, 3], stride=2, activation_fn=tf.nn.crelu, scope='conv1x1_before2')
        return net
```
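The anchor-based sampling used for augmentation is not shown here; below is a minimal sketch, assuming the common scheme of accepting a random square crop only when its IoU with at least one ground-truth face box passes a threshold (function names, thresholds, and the scale range are my assumptions, not the repo's actual code):

```python
import random

def iou(box_a, box_b):
    """Intersection-over-union of two (xmin, ymin, xmax, ymax) boxes."""
    ix1 = max(box_a[0], box_b[0]); iy1 = max(box_a[1], box_b[1])
    ix2 = min(box_a[2], box_b[2]); iy2 = min(box_a[3], box_b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

def sample_crop(img_w, img_h, face_boxes, min_iou=0.5, max_tries=50):
    """Sample a random square crop overlapping at least one face with
    IoU >= min_iou; fall back to the whole image if none is found."""
    for _ in range(max_tries):
        size = random.randint(int(0.3 * min(img_w, img_h)), min(img_w, img_h))
        x = random.randint(0, img_w - size)
        y = random.randint(0, img_h - size)
        crop = (x, y, x + size, y + size)
        if any(iou(crop, b) >= min_iou for b in face_boxes):
            return crop
    return (0, 0, img_w, img_h)
```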
I'd like to name it FaceBoxes++, if you don't mind.
Pretrained model can be downloaded from:
- baidu disk (code eb6b)
Evaluation result on FDDB:
| fddb |
|---|
| 0.961 |
Speed: it runs at over 70 FPS on CPU (i7-8700K), 30 FPS on CPU (i5-7200U), and 140 FPS on GPU (2080 Ti), with a fixed input size of 512, tf1.14, and multithreading. I think the input size, time cost, and accuracy are very suitable for real applications :)
Hope the code helps you. I am still struggling with the new TF API, so contact me at 2120140200@mail.nankai.edu.cn if you have any questions.
Requirements:
- tensorflow1.14
- tensorpack (for the data provider)
- opencv
- python 3.6
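For convenience, the dependencies above could be captured in a requirements.txt (the pins and package names below are my assumptions; the repo does not ship this file):

```
tensorflow==1.14.0
tensorpack
opencv-python
```

Install with `pip install -r requirements.txt` under Python 3.6.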
- download the WIDER FACE data from http://shuoyang1213.me/WIDERFACE/ and extract WIDER_train, WIDER_val, and wider_face_split into ./WIDER
- download FDDB, and extract FDDB-folds into ./FDDB and the 2002 and 2003 image folders into ./FDDB/img
- then run `python prepare_data.py`; it will produce train.txt and val.txt. (If you'd like to train on your own data, prepare it like this, one line per picture: `...../9_Press_Conference_Press_Conference_9_659.jpg| 483(xmin),195(ymin),735(xmax),543(ymax),1(class) ......` Caution: class labels should start from 1; 0 means background.)
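Assuming the `(xmin)`-style markers above are just explanatory, boxes are comma-separated, and multiple boxes on a line are space-separated (the exact delimiters are my assumptions), a parser for this annotation format could look like:

```python
def parse_annotation_line(line):
    """Parse one 'path| x1,y1,x2,y2,cls x1,y1,x2,y2,cls ...' line into
    (image_path, [(xmin, ymin, xmax, ymax, cls), ...])."""
    path, boxes_str = line.strip().split('|')
    boxes = []
    for token in boxes_str.split():
        xmin, ymin, xmax, ymax, cls = (int(v) for v in token.split(','))
        # class labels start from 1; 0 is reserved for background
        assert cls >= 1, 'class should start from 1, 0 means bg'
        boxes.append((xmin, ymin, xmax, ymax, cls))
    return path.strip(), boxes
```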
- then run `python train.py`; if you want to check the data during training, set vis to True in train_config.py
- after training, convert the model to a pb file: `python tools/auto_freeze.py`
- if you'd like to fine-tune on your own data (prepared in the format described above), set config.MODEL.pretrained_model='your pretrained model' and config.MODEL.continue_train=True in train_config.py, then run `python train.py`
Evaluate on FDDB:

```
python test/fddb.py [--model [TRAINED_MODEL]] [--data_dir [DATA_DIR]]
                    [--split_dir [SPLIT_DIR]] [--result [RESULT_DIR]]

--model      path of the saved model, default ./model/detector.pb
--data_dir   path of all the FDDB images
--split_dir  path of the FDDB folds
--result     path to save the FDDB results
```

Example: `python model_eval/fddb.py --model model/detector.pb --data_dir 'FDDB/img/' --split_dir FDDB/FDDB-folds/ --result 'result/'`
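For reference, the result files the official FDDB evaluation reads list, per image, the FDDB image name, the number of detections, and one `left top width height score` rectangle per line; a small writer sketch (the function name is mine, not from this repo):

```python
def write_fddb_results(detections, out_path):
    """detections: dict mapping an FDDB image name (e.g.
    '2002/07/19/big/img_130') to a list of (xmin, ymin, xmax, ymax, score)."""
    with open(out_path, 'w') as f:
        for name, boxes in detections.items():
            f.write(name + '\n')
            f.write('%d\n' % len(boxes))
            for xmin, ymin, xmax, ymax, score in boxes:
                # FDDB rectangles are left, top, width, height, score
                f.write('%d %d %d %d %.3f\n'
                        % (xmin, ymin, xmax - xmin, ymax - ymin, score))
```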
- if you have a trained model, run `python tools/auto_freeze.py`; it will read the checkpoint files in ./model and produce detector.pb
- run `python vis.py --img_dir 'your images dir'` (by default it detects pics with .jpg)
- or use a camera: `python vis.py --cam_id 0`

You can check the code in vis.py and adapt it; it's simple.

