Before reading this developer guide for bolt, we strongly recommend reading the code architecture document first. It gives you a deep understanding of the overall design of bolt, which helps you develop bolt more efficiently. If you want to verify your model quickly, you can use the out-of-the-box C API or Java API to run inference and check the result. If your model runs on time series data, you can use Flow to accelerate the inference. What's more, if you encounter unsupported operators during conversion or inference of your model, you can customize them step by step as described in detail in this document.

Contents


    Use out-of-the-box API to infer your model
        C API
        Java API
        Python API
    Accelerate time series model by Flow
    Customize models with unsupported operators step by step
        model conversion customization
        tensor computing customization
        inference engine customization
    How to contribute
        submit issue
        pull request

Use out-of-the-box API to infer your model


C API

Bolt provides a C API document generated by doxygen to help you use the C API, along with an image classification example and a Chinese input method example. You can compile bolt and link the libbolt.so library into your C/C++ project.
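
Condensed from the linked image classification example, the sketch below shows the typical lifecycle of a C API call sequence. The exact function signatures and the affinity constant are assumptions drawn from that example; verify them against the doxygen document and the C API header of your bolt build.

    // Minimal sketch of the C API lifecycle -- signatures are assumptions,
    // check the generated API document and the C API header before use.
    #include "bolt.h"

    void infer_once(const char *modelPath)
    {
        // Create a model handle from a converted .bolt file.
        ModelHandle handle = CreateModel(modelPath, CPU_HIGH_PERFORMANCE, NULL);

        // Bind input names/shapes, then allocate space for all outputs.
        // PrepareModel(handle, num_input, names, n, c, h, w, dts, dfs);
        ResultHandle result = AllocAllResultHandle(handle);

        // Feed the input buffers, run, and read the outputs back.
        // RunModel(handle, result, num_input, names, inputBuffers);
        // GetOutputDataFromResultHandle(result, num_output, outputBuffers);

        // Release resources.
        FreeResultHandle(result);
        DestroyModel(handle);
    }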

Java API

Bolt provides a Java API document generated by doxygen to help you use the Java API, together with a detailed example. You can compile bolt and load libBoltModel.so through the Java Native Interface (JNI) in your Java project.

Python API

Bolt provides an easy-to-use Python API for developers. Please check the usage of the Python API in Bolt.

Accelerate time series model by Flow


Flow provides an API document generated by doxygen to help you use the Flow C++ header, along with examples (tinybert, faceSR, ASR). You can also use the Java API, and there is a simple GSR test.

Here are the steps to use Flow:

  • Use the predefined Flow protobuf format to define a graph

    Here is an example graph file for the CV application faceSR: flow_facesr.prototxt. This graph has one input, one input node, one inference node and one output. Input nodes need to be marked as Input, and inference nodes need to be marked as Inference. Each node can have multiple input or output tensors, and each node type has its own typical fields.

  • Add an output tensor size inference function for each node, and register it with the Flow function manager (optional)

    facesr doesn't need to post-process the final tensor, so the node's output tensor can be used directly.

    If you need to post-process the final tensor, you can refer to flow_tinybert, which defines a post-processing function (tinybertInferOutputSize) and registers it with the flowRegisterFunction API.

  • Add an input tensor pre-processing function for each node, and register it with the Flow function manager (optional)

    (same as output tensor size infer function)

  • Add an output tensor post-processing function for each node, and register it with the Flow function manager (optional)

    (same as output tensor size infer function)

  • Define a Flow object and add task

    Declare a Flow object and set the CPU cores and GPU. Describe the task in the Task format and use the enqueue API to add the task to the Flow heterogeneous executor.

  • Get Flow process result

    Use the dequeue API to get the results in FIFO order. You can choose to make dequeue block, so that the results of all enqueued tasks are returned at the same time. The size function can be used to query the number of unfinished tasks. The sketch below puts these steps together.
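
Putting the steps together, here is a minimal C++ sketch of driving Flow, modeled on the linked tinybert example. The init arguments (graph paths, data type, CPU/GPU affinity, thread count) and the Task/dequeue usage are assumptions drawn from that example; consult the Flow API document for the authoritative signatures.

    // Minimal Flow usage sketch modeled on the flow_tinybert example;
    // exact signatures are assumptions -- verify against the API document.
    #include <map>
    #include <memory>
    #include <string>
    #include <vector>
    #include "flow.h"
    #include "task.h"

    int main()
    {
        std::string graphPath = "flow_facesr.prototxt";  // graph from step 1

        // Optional: register custom size-infer / pre- / post-processing
        // functions before init, e.g.
        // flowRegisterFunction("facesrPostProcess", facesrPostProcess);

        Flow flow;
        // graph files, data type, affinity, CPU cores, whether to use GPU
        flow.init({graphPath}, DT_F32, AFFINITY_CPU_HIGH_PERFORMANCE, 1, false);

        // Describe a task: map the graph's tensor names to Tensor objects.
        std::map<std::string, std::shared_ptr<Tensor>> data;  // fill with model I/O
        Task task(graphPath, data);
        flow.enqueue(task);

        // A blocking dequeue returns all finished tasks in FIFO order.
        std::vector<Task> results = flow.dequeue(true);
        return 0;
    }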

Customize models with unsupported operators step by step


model conversion customization

In model_tools, you can define any operator for model conversion.

  1. Switch to the code of the specific framework (caffe/onnx/tflite) you are working on;
  2. Determine whether the operator is a weight-op or a non-weight-op;
  3. Define the operator parameter format;
  4. Extract the meta information of the operator;
  5. Extract the weight data if the operator is a weight-op; otherwise skip this step.
  • Example: support pooling in caffe converter

    1. Switch to model_tools/src/caffe, which is the caffe converter for bolt;

    2. Judgment: pooling is a non-weight-op.

    3. Define pooling parameter format.

      3.1 Modify the OperatorType data structure in common/uni/include/operator_type.h

      typedef enum {
      ...
          OT_Pooling,    // Addition
      ...
      } OperatorType;

      3.2 Modify the inline const char* const* OperatorTypeName() function in common/uni/include/operator_type.h

      inline const char* const* OperatorTypeName() {
          static const char* const names[] = {
              ...
              "OT_Pooling",    // Addition; must correspond to the order of OperatorType
              ...
          };
          return names;
      }

      3.3 Add bolt's pooling parameter definition in common/uni/include/parameter_spec.h

      // Addition ======>
      typedef struct {
          unsigned int kernel_t;
          unsigned int kernel_h;
          unsigned int kernel_w;
          unsigned int stride_t;
          unsigned int stride_h;
          unsigned int stride_w;
          unsigned int pad_before;
          unsigned int pad_after;
          unsigned int pad_top;
          unsigned int pad_bottom;
          unsigned int pad_left;
          unsigned int pad_right;
          RoundMode round_mode;
          PoolingMode mode;
      } PoolingParamSpec;
      // <====== Addition 

      3.4 Modify the int get_operator_parameter_size(OperatorType operatorType) function in common/uni/include/parameter_spec.h

      std::map<OperatorType, int> operatorParameterSizeMap = {
          ...
          {OT_Pooling, sizeof(PoolingParamSpec)},    // Addition
      };
    4. Extract the meta information of the pooling operator in caffe.

      4.1 Modify the OperatorType convert_caffe_type(std::string inputType) function in model_tools/src/caffe/caffe_adaptee.h.

      Add the caffe type mapping code as follows:

      OperatorType convert_caffe_type(std::string inputType) {
          std::map<std::string, OperatorType> operatorMap = {
             // Addition ======>
             {"Pooling", OT_Pooling},
             // <====== Addition
          };
      }

      4.2 Register the abstract adapt_Pooling() function in class ModelAdaptee in model_tools/src/model_adaptee.h if it has not already been registered; otherwise, skip this step.

      virtual EE adapt_operator(OperatorType type, ParameterSpec *ps) {
          std::map<OperatorType, AdaptOperatorFunction> functions = {
              // Addition ======>
              {OT_Pooling, &ModelAdaptee::adapt_Pooling},
              // <====== Addition
          };
      }
      
      // Addition ======>
      REGISTER_EMPTY_ADAPT_OPERATOR(adapt_Pooling)
      // <====== Addition

      4.3 Extract the meta information of the pooling operator from the caffe model: add the ParameterSpec adapt_Pooling() override function in model_tools/src/caffe/caffe_adaptee.h.

      // Addition ======>
      ParameterSpec adapt_Pooling() override
      {
          ParameterSpec ps;
          PoolingParamSpec p;
          memset(&p, 0, sizeof(p));
          p.kernel_t = 1;
          p.stride_t = 1;
          p.pad_before = 0;
          p.pad_after = 0;
          auto cp = layer.pooling_param();
          if (cp.has_kernel_w() && cp.has_kernel_h()) {
              p.kernel_w = cp.kernel_w();
              p.kernel_h = cp.kernel_h();
          } else {
              p.kernel_h = cp.kernel_size();
              p.kernel_w = p.kernel_h;
          }
          if (cp.has_stride_w() && cp.has_stride_h()) {
              p.stride_w = cp.stride_w();
              p.stride_h = cp.stride_h();
          } else {
              p.stride_h = cp.stride();
              p.stride_w = p.stride_h;
          }
          bool global_pooling = cp.global_pooling();
          if (global_pooling) {
              p.kernel_h = 0;
              p.kernel_w = 0;
              p.stride_h = 1;
              p.stride_w = 1;
          } else {
              CHECK_REQUIREMENT(p.kernel_h > 0);
          }
          if (cp.has_pad_w() && cp.has_pad_h()) {
              p.pad_left = cp.pad_w();
              p.pad_right = p.pad_left;
              p.pad_top = cp.pad_h();
              p.pad_bottom = p.pad_top;
          } else {
              p.pad_top = cp.has_pad() ? cp.pad() : 0;
              p.pad_bottom = p.pad_top;
              p.pad_left = p.pad_top;
              p.pad_right = p.pad_top;
          }
      
          if (cp.has_round_mode() && cp.round_mode() == 1) {
              p.round_mode = ROUND_FLOOR;
          } else {
              p.round_mode = ROUND_CEIL;
          }
          auto op = cp.pool();
          switch (op) {
              case caffe::PoolingParameter_PoolMethod_MAX: {
                  p.mode = POOLING_MAX;
                  break;
              }
              case caffe::PoolingParameter_PoolMethod_AVE: {
                  p.mode = POOLING_MEAN;
                  break;
              }
              default: {
                  const google::protobuf::EnumDescriptor *descriptor =
                      caffe::PoolingParameter::PoolMethod_descriptor();
                  UNI_ERROR_LOG("can not map operator name:%s %s to Pooling.\n",
                      this->layer.name().c_str(), descriptor->FindValueByNumber(op)->name().c_str());
              }
          }
          ps.pooling_spec = p;
          return ps;
      }
      // <====== Addition
    5. Pooling is a non-weight op, so skip this step.

  • Example: support pooling in onnx converter

    1. Switch to model_tools/src/onnx, which is the onnx converter for bolt;

    2. Judgment: pooling is a non-weight-op;

    3. Define pooling parameter format.

      Note: the definition steps are the same as step 3 of the caffe converter example above; please refer to that content.

    4. Extract the meta information of the pooling operator in onnx.

      4.1 Modify the OperatorType convert_onnx_type(std::string inputType) function in model_tools/src/onnx/onnx_adaptee.h.

      Add the onnx type mapping code as follows:

      OperatorType convert_onnx_type(std::string inputType) {
          std::map<std::string, OperatorType> operatorMap = {
              // Addition ======>
              {"AveragePool", OT_Pooling},
              {"MaxPool", OT_Pooling},
              {"GlobalAveragePool", OT_Pooling},
              // <====== Addition
          };
      }

      4.2 Register the abstract adapt_Pooling() function in class ModelAdaptee in model_tools/src/model_adaptee.h if it has not already been registered; otherwise, skip this step.

      virtual EE adapt_operator(OperatorType type, ParameterSpec *ps) {
          std::map<OperatorType, AdaptOperatorFunction> functions = {
              // Addition ======>
              {OT_Pooling, &ModelAdaptee::adapt_Pooling},
              // <====== Addition
          };
      }
      
      // Addition ======>
      REGISTER_EMPTY_ADAPT_OPERATOR(adapt_Pooling)
      // <====== Addition

      4.3 Extract the meta information of the pooling operator from the onnx model: add the ParameterSpec adapt_Pooling() override function in model_tools/src/onnx/onnx_adaptee.h.

      // Addition ======>
      ParameterSpec adapt_Pooling() override
      {
          ParameterSpec ps;
          PoolingParamSpec p;
          memset(&p, 0, sizeof(p));
          std::string autoPad = get_string(this->onnxNode, "auto_pad");
          std::vector<int> kernels = get_ints(this->onnxNode, "kernel_shape");
          std::vector<int> strides = get_ints(this->onnxNode, "strides");
          std::vector<int> pads = get_ints(this->onnxNode, "pads");
          int ceil_mode = get_int(this->onnxNode, "ceil_mode", 0);
      
          const std::string &onnxNodeType = this->onnxNode.op_type();
          if (onnxNodeType == "AveragePool" || onnxNodeType == "ReduceMean" ||
              onnxNodeType == "GlobalAveragePool") {
              p.mode = POOLING_MEAN;
          } else {
              p.mode = POOLING_MAX;
          }
      
          if (ceil_mode) {
              p.round_mode = ROUND_CEIL;
          } else {
              p.round_mode = ROUND_FLOOR;
          }
      
          p.kernel_t = 0;
          p.kernel_h = 0;
          p.kernel_w = 0;
          if (kernels.size() == 3) {
              p.kernel_t = kernels[0];
              p.kernel_h = kernels[1];
              p.kernel_w = kernels[2];
          } else if (kernels.size() == 2) {
              p.kernel_t = 1;
              p.kernel_h = kernels[0];
              p.kernel_w = kernels[1];
          } else if (kernels.size() == 1) {
              p.kernel_t = 1;
              p.kernel_h = kernels[0];
              p.kernel_w = 1;
          }
      
          p.stride_t = 1;
          p.stride_h = 1;
          p.stride_w = 1;
          if (strides.size() == 3) {
              p.stride_t = strides[0];
              p.stride_h = strides[1];
              p.stride_w = strides[2];
          } else if (strides.size() == 2) {
              p.stride_h = strides[0];
              p.stride_w = strides[1];
          } else if (strides.size() == 1) {
              p.stride_h = strides[0];
          }
      
          p.pad_before = 0;
          p.pad_top = 0;
          p.pad_left = 0;
          p.pad_after = 0;
          p.pad_bottom = 0;
          p.pad_right = 0;
          if (pads.size() == 6) {
              p.pad_before = pads[0];
              p.pad_top = pads[1];
              p.pad_left = pads[2];
              p.pad_after = pads[3];
              p.pad_bottom = pads[4];
              p.pad_right = pads[5];
          } else if (pads.size() == 4) {
              p.pad_top = pads[0];
              p.pad_left = pads[1];
              p.pad_bottom = pads[2];
              p.pad_right = pads[3];
          } else if (pads.size() == 2) {
              p.pad_top = pads[0];
              p.pad_bottom = pads[1];
          } else if (autoPad == "SAME_UPPER") {
              p.pad_top = (p.kernel_h - 1) / 2;
              p.pad_bottom = (p.kernel_h - 1) - p.pad_top;
              p.pad_left = (p.kernel_w - 1) / 2;
              p.pad_right = (p.kernel_w - 1) - p.pad_left;
          }
          ps.pooling_spec = p;
          return ps;
      }
      // <====== Addition
    5. Pooling is a non-weight op, so skip this step.

  • Example: support pooling in tflite converter

    1. Switch to model_tools/src/tflite, which is the tflite converter for bolt;

    2. Judgment: pooling is a non-weight-op;

    3. Define pooling parameter format;

      Note: the definition steps are the same as step 3 of the caffe converter example above; please refer to that content.

    4. Extract the meta information of the pooling operator in tflite.

      4.1 Modify the OperatorType convert_tflite_type(tflite::BuiltinOperator tfliteType) function in model_tools/src/tflite/tflite_adaptee.h.

      Add the tflite type mapping code as follows:

      OperatorType convert_tflite_type(tflite::BuiltinOperator tfliteType) {
          std::map<tflite::BuiltinOperator, OperatorType> operatorMap = {
              // Addition ======>
              {tflite::BuiltinOperator_MAX_POOL_2D, OT_Pooling},
              {tflite::BuiltinOperator_AVERAGE_POOL_2D, OT_Pooling},
              // <====== Addition
          };
      }

      4.2 Register the abstract adapt_Pooling() function in class ModelAdaptee in model_tools/src/model_adaptee.h if it has not already been registered; otherwise, skip this step.

      virtual EE adapt_operator(OperatorType type, ParameterSpec *ps) {
          std::map<OperatorType, AdaptOperatorFunction> functions = {
              // Addition ======>
              {OT_Pooling, &ModelAdaptee::adapt_Pooling},
              // <====== Addition
          };
      }
      
      // Addition ======>
      REGISTER_EMPTY_ADAPT_OPERATOR(adapt_Pooling)
      // <====== Addition

      4.3 Extract the meta information of the pooling operator from the tflite model: add the ParameterSpec adapt_Pooling() override function in model_tools/src/tflite/tflite_adaptee.h.

      // Addition ======>
      ParameterSpec adapt_Pooling() override
      {
          ParameterSpec ps;
          PoolingParamSpec p;
          memset(&p, 0, sizeof(p));
          p.kernel_t = 1;
          p.stride_t = 1;
          p.pad_before = 0;
          p.pad_after = 0;
          p.pad_top = 0;
          p.pad_bottom = 0;
          p.pad_left = 0;
          p.pad_right = 0;
          p.round_mode = ROUND_CEIL;
      
          const auto &inputTensor =
              this->tfliteTensors[this->tfliteOperators[this->tfliteOperatorIndex]->inputs[0]];
          const auto &inputShape = inputTensor->shape;
          CHECK_REQUIREMENT(inputShape.size() == 4);
          if (opCode == tflite::BuiltinOperator_MEAN) {  // Interpret as global pooling
              const auto &axisTensor =
                  this->tfliteTensors[this->tfliteOperators[this->tfliteOperatorIndex]->inputs[1]];
              const auto &axisData = tfliteModelBuffer[axisTensor->buffer]->data;
              auto axisPtr = reinterpret_cast<const int32_t *>(axisData.data());
              CHECK_REQUIREMENT(1 == axisPtr[0] && 2 == axisPtr[1]);
              p.mode = POOLING_MEAN;
              p.kernel_h = 0;
              p.kernel_w = 0;
              p.stride_h = 1;
              p.stride_w = 1;
          } else {
              const auto &tflitePoolOption =
                  this->tfliteOperators[this->tfliteOperatorIndex]->builtin_options.AsPool2DOptions();
              p.kernel_h = tflitePoolOption->filter_height;
              p.kernel_w = tflitePoolOption->filter_width;
              p.stride_h = tflitePoolOption->stride_h;
              p.stride_w = tflitePoolOption->stride_w;
              int tfPaddingRoundMode = tflitePoolOption->padding;
              if (tfPaddingRoundMode == 0) {
                  p.round_mode = ROUND_TF_SAME;
      
                  int oLength = (inputShape[2] + p.stride_w - 1) / p.stride_w;
                  int padLength = UNI_MAX((oLength - 1) * p.stride_w + p.kernel_w - inputShape[2], 0);
                  p.pad_left = padLength / 2;
                  p.pad_right = padLength - p.pad_left;
      
                  oLength = (inputShape[1] + p.stride_h - 1) / p.stride_h;
                  padLength = UNI_MAX((oLength - 1) * p.stride_h + p.kernel_h - inputShape[1], 0);
                  p.pad_top = padLength / 2;
                  p.pad_bottom = padLength - p.pad_top;
              } else if (tfPaddingRoundMode == 1) {
                  p.round_mode = ROUND_TF_VALID;
              } else {
                  UNI_ERROR_LOG("can not process operator location:%d Pooling round mode.\n",
                      this->tfliteOperatorIndex);
              }
              if (opCode == tflite::BuiltinOperator_MAX_POOL_2D) {
                  p.mode = POOLING_MAX;
              } else if (opCode == tflite::BuiltinOperator_AVERAGE_POOL_2D) {
                  p.mode = POOLING_MEAN;
              }
              insertActivationOperator(
                  getActivationOperatorType(tflitePoolOption->fused_activation_function));
          }
          ps.pooling_spec = p;
          return ps;
      }
      // <====== Addition
    5. Pooling is a non-weight op, so skip this step.

tensor computing customization

In tensor, you can define any operator's computation.

  1. Create a new operator file in compute/tensor/src;
  2. The computing implementations on different backends (CPU, GPU) usually differ, so add the corresponding operator implementation to the backend-specific folder under compute/tensor/src for each target backend, as sketched after this list.
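
For orientation, a new operator file in compute/tensor/src typically exposes one entry point that checks the target backend and dispatches to a backend-specific kernel. The skeleton below is hypothetical: my_op, MyOpParamSpec and the commented-out kernel calls are illustrative names only, not bolt's actual API; use an existing file such as the pooling implementation as the real template.

    // Hypothetical skeleton for compute/tensor/src/my_op.cpp; all names are
    // illustrative -- follow an existing operator for the real conventions.
    #include "tensor_computing.h"

    EE my_op(Tensor inputTensor, MyOpParamSpec p, Tensor outputTensor, ArchInfo_t archInfo)
    {
        EE ret = NOT_SUPPORTED;
        if (IS_CPU(archInfo->arch)) {
            // CPU kernels live under compute/tensor/src/cpu
            // ret = my_op_cpu(inputTensor, p, outputTensor, archInfo->arch);
        } else if (IS_GPU(archInfo->arch)) {
            // GPU (OpenCL) kernels live under compute/tensor/src/gpu
            // ret = my_op_mali(inputTensor, p, outputTensor, archInfo);
        }
        return ret;
    }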

inference engine customization

In engine, you can define any operator for the inference of your model.

  1. Add the definition of the specific operator in inference/engine/include;
  2. If the operator's CPU implementation differs from its GPU implementation, split the implementation into CPU and GPU versions; if they are the same, skip this step. A hypothetical operator header is sketched after this list.
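
For illustration, an operator header in inference/engine/include usually declares a class that carries the operator's ParameterSpec and implements shape inference and execution. The skeleton below is hypothetical, modeled loosely on bolt's existing operator headers; the class, member and enum names are illustrative, so use an existing header as the real template.

    // Hypothetical skeleton for inference/engine/include/my_op.hpp; names
    // are illustrative -- follow an existing header such as pooling.hpp.
    #include "operator.hpp"

    class MyOp : public Operator {
    public:
        MyOp(DataType dt, MyOpParamSpec p)
        {
            this->dt = dt;
            this->p = p;
        }

        OperatorType get_type() override
        {
            return OT_MyOp;  // the type registered in operator_type.h
        }

        // Compute output tensor shapes from input shapes before allocation.
        EE infer_output_tensors_size(std::vector<Tensor *> inTensors,
            std::vector<Tensor *> outTensors) override
        {
            // ... fill outTensors' descriptors from inTensors and this->p ...
            return SUCCESS;
        }

        // Execute the operator, typically by calling into compute/tensor.
        void run() override
        {
            // ... call the tensor computing entry point (previous section) ...
        }

    protected:
        MyOpParamSpec p;
    };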

How to contribute


submit issue

  • question

    Submit any question you have encountered while using Bolt. You can give us feedback by creating issues: go to https://github.com/huawei-noah/bolt/issues, create a new issue and submit it. The issue can be a bug in Bolt, a suggestion for Bolt, or anything you don't understand about Bolt.

  • feature request

    Submit any feature that you want but that has not yet been implemented in Bolt. We have created a special issue for feature requests, and you can leave a comment under that issue. We will seriously consider the needs of all users and continue to enrich the functions of Bolt.

pull request

  • add MIT license

    For consistency, please add the MIT license header at the top of your source files, indicating that your code will be open to all.

  • provide an executable unit test

    Fork Bolt to your GitHub account. Modify your code and make sure it passes all test cases. Commit the code and open a pull request on GitHub.