Model Optimizer: Caffe Custom Layer Not Recognized? #191

Closed
jzhan299 opened this issue Jun 24, 2019 · 18 comments
Labels: category: MO (Model Optimizer), feature (New feature request)

Comments


jzhan299 commented Jun 24, 2019

I tried writing a custom layer for the Caffe Absolute Value layer. As per your IDZ post and the official documentation, I created an absval.py and an absval_ext.py:

EDIT: .\deployment_tools\model_optimizer\extensions\ops\absval.py:

import logging as log
import networkx as nx
import numpy as np

from mo.front.caffe.extractors.utils import get_canonical_axis_index
from mo.graph.graph import Node, Graph
from mo.ops.op import Op, PermuteAttrs


class AbsValOp(Op):
    op = 'AbsVal'

    def __init__(self, graph: Graph, attrs: dict):
        mandatory_props = {
            'type': __class__.op,
            'op': __class__.op,
            'infer': __class__.infer,
            'in_ports_count': 1,
            'out_ports_count': 1,
        }
        super().__init__(graph, mandatory_props, attrs)

    @staticmethod
    def infer(node: Node):
        assert len(node.in_nodes()) == 1
        assert len(node.out_nodes()) == 1
        input_node = node.in_node()
        assert input_node.has_valid('shape')
        node.out_node().shape = input_node.shape.copy()
        if input_node.has_valid('value'):
            node.out_node().value = np.abs(input_node.value)

EDIT: .\deployment_tools\model_optimizer\extensions\front\caffe\absval_ext.py

from mo.front.extractor import FrontExtractorOp
from mo.ops.op import Op

class AbsValExtractor(FrontExtractorOp):
    op = 'AbsVal'
    enabled = True

    @staticmethod
    def extract(node):
        proto_layer = node.pb
        Op.get_op_class_by_name('AbsVal').update_node_stat(node, {'operation': 'AbsVal'})
        return __class__.enabled

After writing this code, I was able to convert my Caffe landmark detection model into an IR representation using the Model Optimizer:

[ SUCCESS ] Generated IR model.
[ SUCCESS ] XML file: C:\Program Files (x86)\IntelSWTools\openvino\deployment_tools\model_optimizer\.\facemarks.xml
[ SUCCESS ] BIN file: C:\Program Files (x86)\IntelSWTools\openvino\deployment_tools\model_optimizer\.\facemarks.bin
[ SUCCESS ] Total execution time: 0.99 seconds.

But when I try to use the converted model in the interactive face demo, it errors out and says the primitive is not recognized.

The error log:
InferenceEngine:
API version ............ 1.6
Build .................. 23780
[ INFO ] Parsing input parameters
[ INFO ] Reading input
[ INFO ] Loading plugin CPU

    API version ............ 1.6
    Build .................. 23780
    Description ....... MKLDNNPlugin

[ INFO ] Loading network files for Face Detection
[ INFO ] Batch size is set to 1
[ INFO ] Checking Face Detection network inputs
[ INFO ] Checking Face Detection network outputs
[ INFO ] Loading Face Detection model to the CPU plugin
[ INFO ] Age/Gender DISABLED
[ INFO ] Loading network files for Head Pose Estimation network
[ INFO ] Batch size is set to 16 for Head Pose Estimation network
[ INFO ] Checking Head Pose Estimation network inputs
[ INFO ] Checking Head Pose Estimation network outputs
[ INFO ] Loading Head Pose Estimation model to the CPU plugin
[ INFO ] Emotions Recognition DISABLED
[ INFO ] Loading network files for Facial Landmarks Estimation
[ INFO ] Batch size is set to 16 for Facial Landmarks Estimation network
[ INFO ] Checking Facial Landmarks Estimation network inputs
[ INFO ] Checking Facial Landmarks Estimation network outputs
[ INFO ] Loading Facial Landmarks Estimation model to the CPU plugin
[ ERROR ] Unsupported primitive of type: AbsVal name: ActivationAbs1

Is there an error in my implementation, or is there another problem that must be solved for the Inference Engine to recognize this primitive?


shubha-ramani commented Jun 25, 2019

Dear @jzhan299
First of all, congratulations on creating a Model Optimizer custom layer extension! I am the one who wrote the IDZ post you referred to. Note that I also mention this example (it should be on the 2019 branch, but I wrote the post before we had a 2019 branch):
https://github.com/opencv/dldt/blob/2018/inference-engine/src/extension/ext_argmax.cpp

So now you must also create a CPU (and/or GPU) extension for AbsVal (your ActivationAbs1 node). That is the problem here. Once you create the CPU (and/or GPU) extension, you will get past the above error.

A CPU extension is C++ code. A GPU extension is a combination of OpenCL code and an *.xml configuration file (for GPU, take a look at https://github.com/opencv/dldt/tree/2019/inference-engine/src/cldnn_engine/cldnn_global_custom_kernels).

Please pay close attention to the layer type that shows up in your IR (*.xml file). The type string your CPU/GPU extension registers against must match it exactly, including case! This is a common mistake that all of us make when creating custom layer extensions, so please watch out for it.
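
For example, the layer entry in your generated IR will look something like this (attributes trimmed; the type and name here are taken from your error log, so treat the rest as illustrative):

<layer id="..." name="ActivationAbs1" type="AbsVal" precision="FP32">
    ...
</layer>

The type string ("AbsVal" here) is the one your CPU/GPU extension must register against.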

You're almost there! All you have to do now is create a CPU or GPU extension. Unfortunately, custom layers for MYRIAD are not yet supported.

Please post here if you have further questions.

Thanks,

Shubha


jzhan299 commented Jun 25, 2019

Hi Shubha:

Thanks for the quick response. I'm a bit confused about how to create a CPU extension; in particular, I'm not sure what addConfig and REG_FACTORY_FOR are for. Here's my attempt at creating the AbsVal extension:
.\deployment_tools\inference_engine\src\extension\ext_absval.cpp

// Copyright (C) 2018-2019 Intel Corporation
// SPDX-License-Identifier: Apache-2.0
//

#include "ext_list.hpp"
#include "ext_base.hpp"

#include <algorithm>
#include <string>
#include <vector>
#include <cmath>
#include <utility>
#include <functional>

namespace InferenceEngine {
namespace Extensions {
namespace Cpu {

class AbsValImpl: public ExtLayerBase {
public:
    explicit AbsValImpl(const CNNLayer* layer) {
        try {
            if (layer->insData.size() != 1  || layer->outData.empty())
                THROW_IE_EXCEPTION << "Incorrect number of input/output edges!";

            addConfig(layer, {DataConfigurator(ConfLayout::PLN)}, {DataConfigurator(ConfLayout::PLN)});
        } catch (InferenceEngine::details::InferenceEngineException &ex) {
            errorMsg = ex.what();
        }
    }

    StatusCode execute(std::vector<Blob::Ptr>& inputs, std::vector<Blob::Ptr>& outputs,
                       ResponseDesc *resp) noexcept override {
        const float* src_data = inputs[0]->buffer();
        float* dst_data = outputs[0]->buffer();
        for (size_t o = 0; o < outputs.size(); o++) {
            dst_data[o] = src_data[o]*2; // placeholder from the sample: just multiplies the input
        }
        return OK;
    }
};

REG_FACTORY_FOR(ImplFactory<AbsValImpl>, AbsVal);

}  // namespace Cpu
}  // namespace Extensions
}  // namespace InferenceEngine
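
(Side note: the execute body above still contains the placeholder multiply from the sample extension; what I actually want for AbsVal is roughly the following, iterating over the output blob's element count rather than over the number of output blobs:)

StatusCode execute(std::vector<Blob::Ptr>& inputs, std::vector<Blob::Ptr>& outputs,
                   ResponseDesc *resp) noexcept override {
    const float* src_data = inputs[0]->buffer();
    float* dst_data = outputs[0]->buffer();
    // Blob::size() returns the total number of elements in the blob
    for (size_t i = 0; i < outputs[0]->size(); i++) {
        dst_data[i] = std::fabs(src_data[i]);
    }
    return OK;
}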

But after writing this CPU extension, the CPU plugin still does not recognize the layer. Do I need to rebuild the extensions? If so, how? Otherwise, what other steps do I need to take?

Thanks,
James


shubha-ramani commented Jun 25, 2019

Dear @jzhan299

Your ext_absval.cpp looks OK to me. To be honest, the easiest way to create a CPU extension is to copy an existing one, for example ext_argmax.cpp, and then modify the code to suit your needs. The Inference Engine Kernels Extensibility doc is also helpful.

After you add your new CPU extension and rebuild the Inference Engine and samples, the cpu_extension.dll (or *.so) will be rebuilt with your new extension included.

Please see the classification_sample for how you include the extension library in your code:

        /** Loading default extensions **/
        if (FLAGS_d.find("CPU") != std::string::npos) {
            /**
             * cpu_extensions library is compiled from "extension" folder containing
             * custom MKLDNNPlugin layer implementations. These layers are not supported
             * by mkldnn, but they can be useful for inferring custom topologies.
            **/
            plugin.AddExtension(std::make_shared<Extensions::Cpu::CpuExtensions>());
        }

        if (!FLAGS_l.empty()) {
            // CPU(MKLDNN) extensions are loaded as a shared library and passed as a pointer to base extension
            auto extension_ptr = make_so_pointer<IExtension>(FLAGS_l);
            plugin.AddExtension(extension_ptr);
            slog::info << "CPU Extension loaded: " << FLAGS_l << slog::endl;
        }
        if (!FLAGS_c.empty()) {
            // clDNN Extensions are loaded from an .xml description and OpenCL kernel files
            plugin.SetConfig({{PluginConfigParams::KEY_CONFIG_FILE, FLAGS_c}});
            slog::info << "GPU Extension loaded: " << FLAGS_c << slog::endl;
        }

        /** Setting plugin parameter for collecting per layer metrics **/

You will have to do something similar.
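
For example, if your demo loads the library through an -l style flag (as the samples do), the invocation would look roughly like this; the flag names below are from memory of the 2019 demos, so please double-check them against the demo's -h output:

interactive_face_detection_demo.exe -i cam -m face-detection-model.xml -m_lm facemarks.xml -d CPU -d_lm CPU -l cpu_extension.dll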

Hope it helps,

Thanks,

Shubha


jzhan299 commented Jun 25, 2019

Hi Shubha:

Thanks for the update; I've added the extension loader to my code. I am currently using the prebuilt Inference Engine that comes with OpenVINO.

Is the only option to build the Inference Engine from this repository and then load the resulting cpu_extension library in my code? Or is there a way to build the new cpu_extension.so against OpenVINO's prebuilt Inference Engine?

-James


shubha-ramani commented Jun 25, 2019

Dear @jzhan299, when you add your extension code in the same location as the other extensions (next to ext_argmax.cpp) and rebuild your Inference Engine, your extension will get built into cpu_extension.dll. Please see this readme for how to rebuild the Inference Engine. For the non-dldt version, you simply rebuild the samples as normal, and your cpu_extension.dll will automatically get built as long as you put the code in the correct location (where ext_argmax.cpp lives).
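
Roughly, the Linux flow looks like this (a sketch only; the exact options and prerequisites are in the readme, and on Windows you would generate and build a Visual Studio solution with cmake instead):

cd dldt/inference-engine
mkdir build && cd build
cmake -DCMAKE_BUILD_TYPE=Release ..
make -j$(nproc)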


jzhan299 commented Jun 26, 2019

Dear Shubha:

Update: I have built Inference Engine, made a minor change to my ext_absval.cpp (changed output->size() to output.size(), see edit above), and now the CPU plugin recognizes my custom facemark detection model!

However, my program now crashes when calling the InferenceEngine::InferRequest::Infer method. It does not crash for any of the pretrained OpenVINO models. Do you have any guidance on how to debug this? Infer seems to just be a wrapper function, so I can't tell if or how the custom layer is crashing.

I can attach the code (the sample interactive_face_demo with modified FacialLandmarksDetection class) and my caffe model + IR representation if that helps.


shubha-ramani commented Jun 26, 2019

Dear @jzhan299
Wow! You've made fantastic progress. OK, I can tell you how to debug this crash for sure. You're now going to have to build a Debug version of the Inference Engine and step through your code. Pretty soon you will figure out exactly why your Infer is crashing, because the Inference Engine source code is fully available to you.

I have given very detailed steps on how to build a Debug IE here:
github answer 173

Make sure your PATH variable is set to the directory which contains your debug plugins.

I ran into this recently: sometimes the debug plugins are not there. If that is the case, you must open the solution file (generated when you build the Inference Engine) and build the specific plugin you require. I'm talking about Windows here; on Linux you would use cmake/make instead. For instance, your plugins are located here:
C:\Intel\dldt\inference-engine\bin\intel64\Debug

And since I was debugging on CPU, I had to manually build MKLDNNPlugind.dll because it was missing from the Debug folder.

But long story short, once you build a Debug version of the Inference Engine and get your paths set up correctly, you should be able to step into your Infer function and understand exactly what is wrong. I have done this several times; trust me, it works!

The only problem is that you now have to build IE in Debug, which takes a while.

Please report back on this forum. I'm very happy you've made it this far with custom layers, @jzhan299! Congrats!

Thanks,

Shubha


jzhan299 commented Jun 27, 2019

Dear Shubha:

I've built the debug IE and got the environment working (I did have to build tbb_debug.dll and some other DLLs manually). When debugging my program, stepping into the Infer function takes me from InferenceEngine::InferRequest::Infer() to MKLDNNPlugind.dll!InferenceEngine::InferRequestBase<MKLDNNPlugin::MKLDNNAsyncInferRequest>::Infer(InferenceEngine::ResponseDesc * resp):

    StatusCode Infer(ResponseDesc *resp) noexcept override {
        IE_PROFILING_AUTO_SCOPE(Infer);
        TO_STATUS(_impl->Infer());
    }

Stepping in further leads to void MKLDNNPlugin::MKLDNNAsyncInferRequest::Infer():

void MKLDNNPlugin::MKLDNNAsyncInferRequest::Infer() {
    _callbackManager.disableCallback();
    StartAsync();
    Wait(InferenceEngine::IInferRequest::WaitMode::RESULT_READY);
    _callbackManager.enableCallback();
}

Exceptions are then thrown in Wait(). But I am unable to tell which layer in my network caused the issue; there seems to be no way to step into the inference itself? Instead, I can only wait for the network to finish and look at the (erroring) output?


shubha-ramani commented Jun 27, 2019

Dear @jzhan299
Ok, so it looks like you can't step in because dldt doesn't give you access to the mkldnn source code. There are these files though:

C:\Intel\dldt\inference-engine\src\mkldnn_plugin\mkldnn_async_infer_request.cpp
C:\Intel\dldt\inference-engine\src\inference_engine\cpp_interfaces\impl\ie_infer_async_request_thread_safe_default.hpp

Could you possibly put breakpoints in those files to see if you hit them?
Also, what does your call stack look like?

Also you may have to include

#include <iostream>

in some files and sprinkle some debug statements using std::cout (this sucks because you'll have to recompile Inference Engine)
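
Another quick check that does not require rebuilding anything: wrap the Infer() call in the demo in a try/catch and print the exception text, since the C++ wrapper converts error statuses into exceptions. Roughly (here request stands for whatever InferRequest the demo uses):

try {
    request.Infer();
} catch (const std::exception &e) {
    std::cerr << "Infer failed: " << e.what() << std::endl;
}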

Let me ask an Inference Engine developer how to debug this and I will report back to you.

Thanks for your patience!

Shubha


jzhan299 commented Jun 27, 2019

Dear Shubha:

I believe I do have access to the mkldnn source code; as seen in my previous comment, I was able to step into mkldnn_async_infer_request. The problem is that this code seems to treat the neural network itself as a black box. There seems to be no way to step into the forward pass and find a more specific error than "the network errors out while processing the input."

As for my call stack, here it is at Line 215 in ie_infer_async_request_thread_safe_default.hpp (before taskCopy->checkException in the Wait function):

MKLDNNPlugind.dll!InferenceEngine::AsyncInferRequestThreadSafeDefault::Wait(__int64 millis_timeout) Line 215
MKLDNNPlugind.dll!MKLDNNPlugin::MKLDNNAsyncInferRequest::Infer() Line 21
MKLDNNPlugind.dll!InferenceEngine::InferRequestBase<MKLDNNPlugin::MKLDNNAsyncInferRequest>::Infer(InferenceEngine::ResponseDesc * resp) Line 32
interactive_face_detection_demo.exe!InferenceEngine::InferRequest::Infer() Line 101
interactive_face_detection_demo.exe!BaseDetection::submitRequest() Line 57
interactive_face_detection_demo.exe!FacialLandmarksDetection::submitRequest() Line 756
interactive_face_detection_demo.exe!main(int argc, char * * argv) Line 247
[External Code]	

After exiting wait and extracting exception:

interactive_face_detection_demo.exe!InferenceEngine::details::extract_exception(InferenceEngine::StatusCode status, char * msg) Line 71
interactive_face_detection_demo.exe!InferenceEngine::InferRequest::Infer() Line 102
interactive_face_detection_demo.exe!BaseDetection::submitRequest() Line 57
interactive_face_detection_demo.exe!FacialLandmarksDetection::submitRequest() Line 75
interactive_face_detection_demo.exe!main(int argc, char * * argv) Line 247
[External Code]	

Also, thank you for asking an Inference Engine developer! I hope that together we can solve this issue quickly.

-James


jzhan299 commented Jun 27, 2019

Update: I poked around in the mo folder. Looking at the implementation of Tanh and the other activation layers, I think my implementation of the Absolute Value layer is incorrect and is thus causing the crash. While I was able to adapt their ops and _ext implementations for Absolute Value, I could not find their CPU extension files to use as a starting point.

Where is the CPU implementation for non-extension layers (in particular, where is the CPU file for dldt\model-optimizer\mo\ops\Activation.py)? Also, could you/the inference engine developer take a look at the revised files? I've zipped and attached them below.

- James

Update 2: Updated absval.py (forgot to inherit from Op). The model now converts, though it still crashes when running the facial detection demo. 06/27 3:25PM.
Update 1: Wrong absval_test.py attached; updated with the new one. Also added the test Caffe model. 06/27 2:53PM.

Attachment: absval_custom.zip


shubha-ramani commented Jun 27, 2019

@jzhan299

Thanks for the update and the *.zip which you attached. I will take a look and will consult IE developers as needed.

But http://caffe.berkeleyvision.org/tutorial/layers/absval.html is not an Activation Layer! Are you actually using it as an activation layer in your model? Usually when we think of activation layers we think of ReLU, tanh, exp, sigmoid, etc.

An activation layer is distinct from an Input or Output layer, though I suppose you can use any function as an activation layer.

To see where Activation.py is implemented for CPU, take a look at

C:\Intel\dldt\inference-engine\src\mkldnn_plugin\nodes\mkldnn_activation_node.cpp
C:\Intel\dldt\inference-engine\src\mkldnn_plugin\mkldnn_node.h
C:\Intel\dldt\inference-engine\src\mkldnn_plugin\nodes\mkldnn_activation_node.h

Hope it helps,

Thanks,

Shubha


jzhan299 commented Jun 27, 2019

Dear Shubha:

Yes, I believe that I am using it as an activation layer; Caffe lists it as one as well. You can check the facemark detection model attached in the zip to confirm, though feel free to correct me if I am wrong. I will try implementing it in Activation.py and report back. Thank you for looking at my zip!

Best,
James

@shubha-ramani

Dear @jzhan299
Wow, I learned something new. Indeed, Caffe does consider AbsVal an Activation Layer! I bet once you code absval up on both the MO end and the IE/mkldnn end as an Activation Layer, things should start working for you.

I already see this code in C:\Intel\dldt\inference-engine\src\mkldnn_plugin\nodes\mkldnn_activation_node.cpp, but it's eltwise, which may be different from what you're trying to do.

{"abs", [](GenericLayer* activationLayer, mkldnn::algorithm& algorithm, float& alpha, float& beta) {
            alpha = 0.0f;
            beta = 0.0f;
            algorithm = eltwise_abs;
        }},
`'`

Shubha


jzhan299 commented Jun 27, 2019

Dear Shubha:

Yes, element-wise is exactly what I am trying to do. As a result, I have added a quick if statement in
void MKLDNNActivationNode::initValues():

...
    if (comparator(type, "activation"))
        type = activationLayer->GetParamAsString("type");
    if (comparator(type, "sigmoid"))
        type = "logistic";
    if (comparator(type, "absval")) //AbsVal custom layer goes to already implemented abs case.
        type = "abs";
...

In .\mo\ops\activation.py, I have also added 'absval': lambda x: np.abs(x) to the operations dict.
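
Concretely, that entry looks roughly like this (the neighboring entries are abbreviated from memory, so treat everything except the new 'absval' line as an assumption):

operations = {
    # ... existing entries such as 'sigmoid', 'relu6', etc. ...
    'tanh': lambda x: np.tanh(x),
    'absval': lambda x: np.abs(x),  # new entry for the Caffe AbsVal layer
}
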
In .\mo\front\caffe\extractor, I have added absval.py. It is essentially tanh.py but with the op switched out:

from mo.front.extractor import FrontExtractorOp
from mo.ops.activation import Activation
class AbsValFrontExtractor(FrontExtractorOp):
    op = 'AbsVal'
    enabled = True

    @staticmethod
    def extract(node):
        Activation.update_node_stat(node, {'operation': 'absval'})
        return __class__.enabled

I have removed absval.py, absval_ext.py, and ext_absval.py from the extensions folder. I believe these would be redundant, and have opted instead to follow the structure of the other activation functions, but correct me if I am wrong.

Reconverting the model and running the demo still yields:
[ ERROR ] Error reading network: Unsupported Activation layer type: absval

I also noticed that there is an ie_*_layer.hpp (e.g. ie_tanh_layer.hpp) and a *.cpp file (e.g. tanh.cpp) for each activation layer. Do we also need to implement these files before the layer is recognized? Are there other files I must create or edit as well?

[Update: I have implemented my own .\inference-engine\src\vpu\graph_transformer\src\stages\absval.cpp, added the necessary methods and enums in stage.hpp and frontend.hpp, and rebuilt the vpu project. The same error still shows up. Now implementing ie_absval_layer.hpp.]

[Update 2 (06/28): Implemented ie_absval_layer.cpp and ie_absval_layer.hpp. Did so by taking the corresponding files for Tanh and replacing all tanh references with absval references. Still getting the unsupported activation layer type error. Will try rebuilding the entire Inference Engine instead of individual projects.]

Thanks,
James


knandanan commented Sep 20, 2019

Hi James, were you able to make the AbsVal layer work in OpenVINO? We are also trying to add it as a custom layer for our Siamese network model but are confused about how to go about it.

@lazarevevgeny lazarevevgeny added the category: MO Model Optimizer label May 25, 2020
@lazarevevgeny

@jzhan299 were you able to finish this task? As of now, OpenVINO supports the eltwise operation Abs, so all you need to do is create an extractor for this op in the Model Optimizer (no need to write any CPU extension).
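
A minimal sketch of such an extractor (assuming the Abs operation class is available from extensions/ops/activation_ops.py and that extract is a classmethod, as in recent Model Optimizer versions; please check both against your release):

from extensions.ops.activation_ops import Abs
from mo.front.extractor import FrontExtractorOp


class AbsValFrontExtractor(FrontExtractorOp):
    op = 'AbsVal'
    enabled = True

    @classmethod
    def extract(cls, node):
        Abs.update_node_stat(node)
        return cls.enabled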

@lazarevevgeny lazarevevgeny added the feature New feature request label May 25, 2020
@AnastasiaKazantaeva

It seems that this issue is no longer relevant, as there has been no response. Closing it. Feel free to reopen it or create a new one.
