
New layers of mobilenet-ssd #1

Open
ysh329 opened this issue Dec 17, 2018 · 2 comments

ysh329 commented Dec 17, 2018

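Porting MobileNet-SSD

I needed a Caffe model of MobileNet-SSD to port to our internal framework (which supports converting Caffe models into its own format), and after some searching I found this project: BeloborodovDS/MobilenetSSDFace. This note analyzes that project:

  1. Run the project example first
  2. New layers introduced by MobileNet-SSD
  3. Implementation of the new MobileNet-SSD layers

1. Run the project example first

I ran this inside a caffe-ssd Docker container. After cloning the BeloborodovDS/MobilenetSSDFace repository, I found scripts/test_on_examples.py and modified it. Running the following commands from the repository root detects the face keypoints in the images, draws them onto the images, and saves the results: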

mkdir images/output
python ./scripts/test_on_examples.py

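2. New layers introduced by MobileNet-SSD

Besides filtering the prototxt file by string, I also inspected the network structure with the netscope tool (see ssd_face_deploy_bn netscope). The network has three outputs: mbox_conf_flatten, mbox_loc, and mbox_priorbox. They correspond to the class probabilities of the object in each detection box, the box location information, and the prior boxes, respectively.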

prototxt_path = "./ssd_face_deploy_bn.prototxt"
with open(prototxt_path) as proto_handle:
    prototxt_lines = proto_handle.readlines()
type_list = filter(lambda line: "type" in line, prototxt_lines)
type_list = set(type_list)
{'      type: "constant"\n',
 '      type: "msra"\n',
 '    code_type: CENTER_SIZE\n',
 '  type: "BatchNorm"\n',
 '  type: "Concat"\n',
 '  type: "Convolution"\n',
 '  type: "DetectionOutput"\n',
 '  type: "Flatten"\n',
 '  type: "Permute"\n',
 '  type: "PriorBox"\n',
 '  type: "ReLU"\n',
 '  type: "Reshape"\n',
 '  type: "Scale"\n',
 '  type: "Softmax"\n'}
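The substring filter above also picks up filler types ("constant", "msra") and the code_type line. A stricter variant is sketched here; it assumes layer-level type lines are indented by exactly two spaces and quoted, as in the output shown, and `layer_types` is a hypothetical helper name, not part of the project.

```python
import re

# Matches only layer-level lines such as:   type: "Convolution"
# Filler types are indented deeper and code_type is unquoted,
# so neither of them matches this pattern.
LAYER_TYPE_RE = re.compile(r'^ {2}type: "(\w+)"\s*$')


def layer_types(prototxt_lines):
    """Return the set of layer type names found in the given prototxt lines."""
    found = set()
    for line in prototxt_lines:
        match = LAYER_TYPE_RE.match(line)
        if match:
            found.add(match.group(1))
    return found


sample = [
    '  type: "Convolution"\n',
    '      type: "msra"\n',
    '    code_type: CENTER_SIZE\n',
    '  type: "PriorBox"\n',
]
print(sorted(layer_types(sample)))
```

If the prototxt uses a different indentation, the pattern would need to be adjusted accordingly.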

New layers:

  • DetectionOutput
  • Permute
  • Flatten
  • PriorBox
  • Softmax (may differ slightly)
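One way Softmax could differ from a plain classification softmax: in SSD-style networks it is typically applied per prior box across the class scores. The sketch below assumes that layout (a flat list of scores grouped by prior); this has not been confirmed against this prototxt, and `softmax_per_prior` is a hypothetical name.

```python
import math


def softmax_per_prior(conf, num_classes):
    """Apply softmax to each consecutive group of num_classes scores."""
    assert len(conf) % num_classes == 0
    out = []
    for start in range(0, len(conf), num_classes):
        group = conf[start:start + num_classes]
        peak = max(group)  # subtract the max for numerical stability
        exps = [math.exp(value - peak) for value in group]
        total = sum(exps)
        out.extend(e / total for e in exps)
    return out


# Two priors, two classes each (background, face).
probs = softmax_per_prior([0.0, 0.0, 1.0, 3.0], num_classes=2)
print(probs)
```

Each group of `num_classes` values in the result sums to 1, independently of the other priors.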

3. Implementation of the new MobileNet-SSD layers


ysh329 commented Dec 17, 2018

#include <assert.h>
#include <stdio.h>
#include <string.h>
#include <stdlib.h>

permute

#define INPUT_SHAPE_NUM (4) // number of input axes

// Compute row-major strides for the input layout and for the permuted layout.
// input_shape_swap_index[i] is the input axis that becomes output axis i.
void permute_helper(const int *input_shape, const int *input_shape_swap_index, int *input_steps, int *permuted_input_steps)
{
    assert(input_shape && input_shape_swap_index && input_steps && permuted_input_steps);

    for(int axe1_idx = 0; axe1_idx < INPUT_SHAPE_NUM; ++axe1_idx)
    {
        int input_lens = 1;
        int permuted_input_lens = 1;
        // The stride of an axis is the product of the sizes of all later axes.
        for(int axe2_idx = axe1_idx + 1; axe2_idx < INPUT_SHAPE_NUM; ++axe2_idx)
        {
            input_lens *= input_shape[axe2_idx];
            int swap_axe_idx = input_shape_swap_index[axe2_idx];
            permuted_input_lens *= input_shape[swap_axe_idx];
        }
        input_steps[axe1_idx] = input_lens;
        permuted_input_steps[axe1_idx] = permuted_input_lens;
    }
    return;
}

// Permute the axes of `input` in place.
void permute(float *input, const int *input_shape, const int *input_shape_swap_index)
{
    assert(input && input_shape && input_shape_swap_index);

    int input_num = 1;
    for(int axe_idx = 0; axe_idx < INPUT_SHAPE_NUM; ++axe_idx)
    {
        input_num *= input_shape[axe_idx];
    }
    // Copy of the original data to read from while `input` is overwritten.
    float *input_copy = calloc(input_num, sizeof(float));
    assert(input_copy);
    memcpy(input_copy, input, input_num * sizeof(float));
    
    int input_steps[INPUT_SHAPE_NUM] = {0};
    int permuted_input_steps[INPUT_SHAPE_NUM] = {0};
    permute_helper(input_shape, input_shape_swap_index, input_steps, permuted_input_steps);
    
    for(int pidx = 0; pidx < input_num; ++pidx)
    {
        int input_idx = 0;
        int permuted_idx = pidx;
        // Split the output index into per-axis coordinates and map each
        // coordinate back to its offset in the original layout.
        for(int axe_idx = 0; axe_idx < INPUT_SHAPE_NUM; ++axe_idx)
        {
            int swap_axe_idx = input_shape_swap_index[axe_idx];
            input_idx += (permuted_idx / permuted_input_steps[axe_idx]) * input_steps[swap_axe_idx];
            permuted_idx %= permuted_input_steps[axe_idx];
        }
        input[pidx] = input_copy[input_idx];
    }
    
    free(input_copy);
    input_copy = NULL;
    return;
}
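To check a C permute against known values, a small pure-Python reference can help. The sketch below uses the same stride-based index mapping; the function names are hypothetical, and the axis order (0, 2, 3, 1) is the NCHW to NHWC order commonly used by Permute layers in SSD prototxts, assumed here rather than read from this model.

```python
def strides(shape):
    """Row-major strides: strides[i] is the product of shape[i + 1:]."""
    result = [1] * len(shape)
    for i in range(len(shape) - 2, -1, -1):
        result[i] = result[i + 1] * shape[i + 1]
    return result


def permute(data, shape, order):
    """Return (permuted flat data, permuted shape) for a flat row-major list."""
    out_shape = [shape[axis] for axis in order]
    in_steps = strides(shape)
    out_steps = strides(out_shape)
    out = [0] * len(data)
    for out_idx in range(len(data)):
        rest, in_idx = out_idx, 0
        # Output axis k corresponds to input axis order[k].
        for k, axis in enumerate(order):
            coord, rest = divmod(rest, out_steps[k])
            in_idx += coord * in_steps[axis]
        out[out_idx] = data[in_idx]
    return out, out_shape


# NCHW tensor of shape (1, 2, 2, 2) reordered to NHWC.
data = list(range(8))
permuted, new_shape = permute(data, [1, 2, 2, 2], [0, 2, 3, 1])
print(new_shape, permuted)  # [1, 2, 2, 2] [0, 4, 1, 5, 2, 6, 3, 7]
```

After the permute, the two channel values of each pixel sit next to each other, which is what makes the later flatten produce per-prior groups.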

flatten

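// Flatten only changes the logical shape; the data is copied unchanged.
void flatten(const float *input, const int *input_shape, float *output)
{
    assert(input && input_shape && output);

    int input_num = 1;
    for(int axe_idx = 0; axe_idx < INPUT_SHAPE_NUM; ++axe_idx)
    {
        input_num *= input_shape[axe_idx];
    }
    // memcpy takes the destination first, then the source.
    memcpy(output, input, input_num * sizeof(float));
    return;
}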
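Since Flatten leaves the data untouched, the only thing that changes is the shape. The sketch below computes the flattened shape following Caffe's Flatten defaults (axis = 1, end_axis = -1); the helper name and the example shape are illustrative assumptions, not values taken from this model.

```python
def flatten_shape(shape, axis=1, end_axis=-1):
    """Merge dimensions axis..end_axis (inclusive) into a single dimension."""
    if end_axis < 0:
        end_axis += len(shape)
    merged = 1
    for dim in shape[axis:end_axis + 1]:
        merged *= dim
    return list(shape[:axis]) + [merged] + list(shape[end_axis + 1:])


# A permuted NHWC blob of shape (1, 19, 19, 12) becomes (1, 4332).
print(flatten_shape([1, 19, 19, 12]))
```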

detection_output

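#define MAX_STR_LEN (100)
// Unfinished stub: so far it only copies the anchor file path into a local buffer.
// Box decoding and NMS are not implemented yet.
void detection_output(float *mbox_conf, int mbox_conf_num, float *mbox_loc, int mbox_loc_num, char *anchor_file_path)
{
    assert(mbox_conf && mbox_loc && anchor_file_path);

    char anchor_path[MAX_STR_LEN];
    strncpy(anchor_path, anchor_file_path, MAX_STR_LEN - 1);
    anchor_path[MAX_STR_LEN - 1] = '\0';
    return;
}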


ysh329 commented Dec 19, 2018

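The results of all three branches can now be obtained, and they are correct.

  1. For the PriorBox branch, I ran Caffe to obtain all the anchors and generated the file anchor.txt;
  2. For the mbox_loc and mbox_conf_flatten branches, the results are obtained through get_result.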

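The next steps are:

  1. Add the post-processing step, i.e. the DetectionOutput layer, which combines the mbox_loc and mbox_conf_flatten results with the anchors to produce the detection boxes;
  2. Integrate with the existing project to visualize the detection results from a camera.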
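The post-processing step could look roughly like the sketch below: decode each mbox_loc offset against its anchor using the CENTER_SIZE code type listed in the prototxt, then apply greedy non-maximum suppression. The variances (0.1, 0.1, 0.2, 0.2) and both thresholds are common SSD defaults assumed here, not values read from this model, and all function names are hypothetical.

```python
import math


def decode_center_size(prior, variance, loc):
    """Decode one box. prior is (xmin, ymin, xmax, ymax); loc is (dx, dy, dw, dh)."""
    p_w = prior[2] - prior[0]
    p_h = prior[3] - prior[1]
    p_cx = (prior[0] + prior[2]) / 2.0
    p_cy = (prior[1] + prior[3]) / 2.0
    cx = variance[0] * loc[0] * p_w + p_cx
    cy = variance[1] * loc[1] * p_h + p_cy
    w = math.exp(variance[2] * loc[2]) * p_w
    h = math.exp(variance[3] * loc[3]) * p_h
    return (cx - w / 2.0, cy - h / 2.0, cx + w / 2.0, cy + h / 2.0)


def iou(a, b):
    """Intersection over union of two (xmin, ymin, xmax, ymax) boxes."""
    ix = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    iy = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = ix * iy
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def detect(priors, locs, scores, variance=(0.1, 0.1, 0.2, 0.2),
           conf_thresh=0.5, nms_thresh=0.45):
    """Return [(score, box)] after score filtering and greedy NMS.

    variance, conf_thresh and nms_thresh are assumed defaults.
    """
    candidates = [
        (score, decode_center_size(prior, variance, loc))
        for prior, loc, score in zip(priors, locs, scores)
        if score >= conf_thresh
    ]
    candidates.sort(key=lambda item: item[0], reverse=True)
    kept = []
    for score, box in candidates:
        # Keep a box only if it does not overlap a higher-scoring kept box.
        if all(iou(box, other) <= nms_thresh for _, other in kept):
            kept.append((score, box))
    return kept


priors = [(0.4, 0.4, 0.6, 0.6), (0.4, 0.4, 0.6, 0.6), (0.0, 0.0, 0.1, 0.1)]
locs = [(0.0, 0.0, 0.0, 0.0)] * 3
kept = detect(priors, locs, [0.9, 0.8, 0.7])
print(kept)
```

In this usage example the two identical anchors overlap completely, so only the higher-scoring one survives NMS, while the third box is kept because it does not overlap the others. The real thresholds and variances should come from the DetectionOutput and PriorBox parameters in the prototxt.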
