
Mask_RCNN in tensorflow c++ Running model failed: Not found: FeedInputs: #222

hxw111 opened this issue Jan 31, 2018 · 39 comments

@hxw111 commented Jan 31, 2018

I have saved the Mask R-CNN model as a .pb file, but there are two inputs in the Keras code: the image and the image meta. I couldn't find how to feed both of them to the model in the TensorFlow C++ API.
This is my C++ code:

             Status run_status = session->Run(
                         {{"input_image", image_tensor},{"input_image_meta",meta_tensor}},
                         {"output_node0"}, {}, &outputs
             );

And I got the error "Running model failed: Not found: FeedInputs: unable to find feed output input_image_meta". Are there any tricks to solve this problem?
Thanks!

@ivshli commented Feb 27, 2018

Hi,
Did you solve your problem? I'm facing the same issue.

@jmtatsch (Contributor) commented Feb 28, 2018

Try to visualize the graph in your .pb file with
https://github.com/tensorflow/tensorflow/blob/r1.4/tensorflow/python/tools/import_pb_to_tensorboard.py .
There is a good post about it at https://medium.com/@daj/how-to-inspect-a-pre-trained-tensorflow-model-5fd2ee79ced0. Probably the name "input_image_meta" has changed somehow, or the .pb is not complete?
I'm also very keen on getting this to run...
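
For reference, a minimal C++ sketch that lists the feedable inputs (Placeholder ops) of a frozen graph, so you can verify what "input_image_meta" is actually called in your .pb (the model path here is a placeholder, not the original poster's file):

    #include "tensorflow/core/framework/graph.pb.h"
    #include "tensorflow/core/lib/core/status.h"
    #include "tensorflow/core/platform/env.h"
    #include <iostream>

    int main() {
        tensorflow::GraphDef graph_def;
        // replace "mask_rcnn.pb" with the path to your exported graph
        TF_CHECK_OK(tensorflow::ReadBinaryProto(tensorflow::Env::Default(),
                                                "mask_rcnn.pb", &graph_def));
        for (const auto& node : graph_def.node()) {
            // Placeholders are the only nodes a frozen graph expects as feeds
            if (node.op() == "Placeholder") {
                std::cout << "feedable input: " << node.name() << std::endl;
            }
        }
        return 0;
    }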

@moorage (Contributor) commented Mar 12, 2018

@hxw111 @ivshli @jmtatsch I'm curious how you built your image_tensor and meta_tensor.
I can get my session to run, but I'm having trouble getting the outputs to work (all zeros in output_detections:0).

UPDATE: now working; posted the full code in a follow-up comment: #222 (comment)

Here's how I'm running it

    std::vector<tensorflow::Tensor> outputs;
    tensorflow::Status run_status = session->Run({{"input_image:0", inputTensor}, {"input_image_meta:0", inputMetadataTensor}},
                                                 {"output_detections:0", "output_mrcnn_class:0", "output_mrcnn_bbox:0", "output_mrcnn_mask:0", "output_rois:0", "output_rpn_class:0", "output_rpn_bbox:0"},
                                                 {},
                                                 &outputs);

@ivshli commented Mar 14, 2018

@moorage I'm working on it, trying to recode the Python code in C++. You can check the utils.resize_image, mold_image, and compose_image_meta functions.
If anyone knows how, or has already done it, you are very welcome to share your experience :)

@moorage (Contributor) commented Mar 14, 2018

@ivshli I was indeed able to implement all of this, including unmold. Kind of painful. But in the spirit of sharing and good karma I'll post it below :) . It's not particularly efficient; happy to take feedback in that department.

    // given inputMat of type RGB (not BGR) / CV_8UC3 (possibly from an imread + cvtColor)
    // also given dest of type cv::Mat(inputMat.size(), CV_8UC1)
    // we trained on 256x256 , so TF_MASKRCNN_IMG_WIDTHHEIGHT = 256
    // we copied MEAN_PIXEL configs, so cv::Scalar TF_MASKRCNN_MEAN_PIXEL(123.7, 116.8, 103.9);
    // we statically defined float TF_MASKRCNN_IMAGE_METADATA[10] = {  0 ,TF_MASKRCNN_IMG_WIDTHHEIGHT ,TF_MASKRCNN_IMG_WIDTHHEIGHT , 3 , 0 , 0 ,TF_MASKRCNN_IMG_WIDTHHEIGHT ,TF_MASKRCNN_IMG_WIDTHHEIGHT , 0 , 0 }; 

    // Resize to square with max dim, so we can then scale it to TF_MASKRCNN_IMG_WIDTHHEIGHT x TF_MASKRCNN_IMG_WIDTHHEIGHT
    int largestDim = inputMat.size().height > inputMat.size().width ? inputMat.size().height : inputMat.size().width;
    cv::Mat squareInputMat(cv::Size(largestDim, largestDim), CV_8UC3);
    int leftBorder = (largestDim - inputMat.size().width) / 2;
    int topBorder = (largestDim - inputMat.size().height) / 2;
    cv::copyMakeBorder(inputMat, squareInputMat, topBorder, largestDim - (inputMat.size().height + topBorder), leftBorder, largestDim - (inputMat.size().width + leftBorder), cv::BORDER_CONSTANT, cv::Scalar(0));
    cv::Mat resizedInputMat(cv::Size(TF_MASKRCNN_IMG_WIDTHHEIGHT, TF_MASKRCNN_IMG_WIDTHHEIGHT), CV_8UC3);
    cv::resize(squareInputMat, resizedInputMat, resizedInputMat.size(), 0, 0);
    
    // Need to "mold_image" like in mask rcnn
    cv::Mat moldedInput(resizedInputMat.size(), CV_32FC3);
    resizedInputMat.convertTo(moldedInput, CV_32FC3);
    cv::subtract(moldedInput, TF_MASKRCNN_MEAN_PIXEL, moldedInput);
    
    // Move the data into the input tensor
    // remove memory copies by using code at https://github.com/tensorflow/tensorflow/issues/8033#issuecomment-332029092
    // allocate a Tensor and get pointer to memory for that Tensor, allocate a "fake" cv::Mat from it to use as a  basis to convert
    tensorflow::Tensor inputTensor(tensorflow::DT_FLOAT, {1, moldedInput.size().height, moldedInput.size().width, 3}); // single image instance with 3 channels
    float_t *p = inputTensor.flat<float_t>().data();
    cv::Mat inputTensorMat(moldedInput.size(), CV_32FC3, p);
    moldedInput.convertTo(inputTensorMat, CV_32FC3);
    
    // Copy the TF_MASKRCNN_IMAGE_METADATA data into a tensor
    tensorflow::Tensor inputMetadataTensor(tensorflow::DT_FLOAT, {1, TF_MASKRCNN_IMAGE_METADATA_LENGTH});
    auto inputMetadataTensorMap = inputMetadataTensor.tensor<float, 2>();
    for (int i = 0; i < TF_MASKRCNN_IMAGE_METADATA_LENGTH; ++i) {
        inputMetadataTensorMap(0, i) = TF_MASKRCNN_IMAGE_METADATA[i];
    }
    
    // Run tensorflow
    cv::TickMeter tm;
    tm.start();
    std::vector<tensorflow::Tensor> outputs;
    tensorflow::Status run_status = tfSession->Run({{"input_image", inputTensor}, {"input_image_meta", inputMetadataTensor}},
                                                      {"output_detections", "output_mrcnn_class", "output_mrcnn_bbox", "output_mrcnn_mask",
                                                          "output_rois", "output_rpn_class", "output_rpn_bbox"},
                                                       {},
                                                       &outputs);
    if (!run_status.ok()) {
        std::cerr << "tfSession->Run failed: " << run_status << std::endl;
    }
    tm.stop();
    std::cout << "Inference time, ms: " << tm.getTimeMilli()  << std::endl;
    
    if (outputs[3].shape().dims() != 5 || outputs[3].shape().dim_size(4) != 2) {
        throw std::runtime_error("Expected mask dimensions to be [1,100,28,28,2] but got: " + outputs[3].shape().DebugString());
    }
    
    auto detectionsMap = outputs[0].tensor<float, 3>();

    for (int i = 0; i < outputs[3].shape().dim_size(1); ++i) {
        auto scoreAtI = detectionsMap(0, i, 5);
        auto detectedClass = detectionsMap(0, i, 4);
        auto y1 = detectionsMap(0, i, 0), x1 = detectionsMap(0, i, 1), y2 = detectionsMap(0, i, 2), x2 = detectionsMap(0, i, 3);
        auto maskHeight = y2 - y1, maskWidth = x2 - x1;

        if (maskHeight != 0 && maskWidth != 0) {
            // Pointer arithmetic
            const int i0 = 0, /* size0 = (int)outputs[3].shape().dim_size(1), */ i1 = i, size1 = (int)outputs[3].shape().dim_size(1), size2 = (int)outputs[3].shape().dim_size(2), size3 = (int)outputs[3].shape().dim_size(3), i4 = (int)detectedClass /*, size4 = 2 */;
            int pointerLocationOfI = (i0*size1 + i1)*size2;
            float_t *maskPointer = outputs[3].flat<float_t>().data();
        
            // The shape of the detection is [28,28,2], where the last index is the class of interest.
            // We'll extract index 1 because it's the toilet seat.
            cv::Mat initialMask(cv::Size(size2, size3), CV_32FC2, &maskPointer[pointerLocationOfI]); // CV_32FC2 because I know size4 is 2
            cv::Mat detectedMask(initialMask.size(), CV_32FC1);
            cv::extractChannel(initialMask, detectedMask, i4);
        
            // Convert to B&W
            cv::Mat binaryMask(detectedMask.size(), CV_8UC1);
            cv::threshold(detectedMask, binaryMask, 0.5, 255, cv::THRESH_BINARY);
        
            // First scale and offset in relation to TF_MASKRCNN_IMG_WIDTHHEIGHT
            cv::Mat scaledDetectionMat(maskHeight, maskWidth, CV_8UC1);
            cv::resize(binaryMask, scaledDetectionMat, scaledDetectionMat.size(), 0, 0);
            cv::Mat scaledOffsetMat(moldedInput.size(), CV_8UC1, cv::Scalar(0));
            scaledDetectionMat.copyTo(scaledOffsetMat(cv::Rect(x1, y1, maskWidth, maskHeight)));
        
            // Second, scale and offset in relation to our original inputMat
            cv::Mat detectionScaledToSquare(squareInputMat.size(), CV_8UC1);
            cv::resize(scaledOffsetMat, detectionScaledToSquare, detectionScaledToSquare.size(), 0, 0);
           detectionScaledToSquare(cv::Rect(leftBorder, topBorder, inputMat.size().width, inputMat.size().height)).copyTo(dest);
        }
    }

@samhodge

This is really useful, thanks a lot.

@luoshanwei

@moorage hello, thanks for your code, but I think

    int pointerLocationOfI = (i0*size1 + i1)*size2;

should be

    int pointerLocationOfI = (i0*size1 + i1)*size2*size3*size4;

What do you think? I don't know much about outputs[3].flat<float_t>().data()
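
To spell out the row-major arithmetic (a sketch; the names follow the variables in the code above): the flat index of element [i0, i1, i2, i3, i4] in a [size0, size1, size2, size3, size4] tensor is (((i0*size1 + i1)*size2 + i2)*size3 + i3)*size4 + i4, so the offset of the start of detection i1's size2 x size3 x size4 block is:

    // offset of the first element of detection i1's mask block
    // (obtained by setting i2 = i3 = i4 = 0 in the flat-index formula)
    int maskBlockOffset(int i1, int size1, int size2, int size3, int size4) {
        const int i0 = 0;  // single image in the batch
        return (i0 * size1 + i1) * size2 * size3 * size4;
    }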

@moorage (Contributor) commented Apr 2, 2018

My version worked for me @luoshanwei :)

@ypflll commented Apr 3, 2018

@moorage Hi, do you run your code on CPU or GPU? I tried to run the code on a single CPU (to test the time cost) by setting the device:

    GraphDef graph_def;
    SessionOptions opts;
    TF_CHECK_OK(ReadBinaryProto(Env::Default(), graph_definition, &graph_def));
    graph::SetDefaultDevice("/cpu:0", &graph_def);

However, it doesn't work; the program still occupies the other CPUs.
Did you meet this problem?
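
If it helps: graph::SetDefaultDevice only pins ops to a device; it does not shrink TensorFlow's thread pools. A sketch of limiting the session to single-threaded execution through the session ConfigProto, which is usually what keeps it off the other cores:

    tensorflow::SessionOptions opts;
    // one thread for each of the two executor thread pools
    opts.config.set_intra_op_parallelism_threads(1);
    opts.config.set_inter_op_parallelism_threads(1);
    std::unique_ptr<tensorflow::Session> session(tensorflow::NewSession(opts));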

@moorage (Contributor) commented Apr 3, 2018

@ypflll never tried that, sorry. I ran on a single i7 laptop, but didn't check CPU usage.

@samhodge

@luoshanwei did you get the pointer math sorted?

@samhodge

I am looking at this and going mildly cross-eyed: https://eli.thegreenplace.net/2015/memory-layout-of-multi-dimensional-arrays/ but short of addressing each pixel by hand in a five-deep for loop via the tensor math, I cannot be sure how else to do it.

@samhodge

OpenCV leaves a lot to be desired when it comes to multichannel images: this should make short work of the problem: https://github.com/OpenImageIO/oiio/blob/master/src/libOpenImageIO/imagebuf_test.cpp

@Masahiro1002 commented May 31, 2018

Hi, I have a question about the C++ implementation.
I implemented it with reference to the #222 comment above, but the inputs and outputs are now different from that comment. I think this is because of a recent update.
The error message is:

tfSession->Run failed: Invalid argument: You must feed a value for placeholder tensor 'input_anchors' with dtype float and shape [?,?,4]

But I am not sure how to build 'input_anchors'. Does anybody know how to build it in C++?

Thank you
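
Not authoritative, but one approach that should work: compute (or dump) the anchors once and feed them as a [1, num_anchors, 4] float tensor. A sketch, assuming anchors is a flat std::vector<float> of size num_anchors*4 holding (y1, x1, y2, x2) rows, e.g. saved from the Python side via model.get_anchors(image_shape) or ported from utils.generate_pyramid_anchors:

    const int numAnchors = static_cast<int>(anchors.size() / 4);
    tensorflow::Tensor anchorsTensor(tensorflow::DT_FLOAT, {1, numAnchors, 4});
    std::copy(anchors.begin(), anchors.end(),
              anchorsTensor.flat<float>().data());
    // then feed it alongside the other two inputs:
    //   {"input_anchors:0", anchorsTensor}

Note the graph expects the anchors in normalized coordinates if the Python side normalizes them before feeding (get_anchors does in the Matterport code), so dump them after that step.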

@samhodge commented Jul 17, 2018

@Masahiro1002 did you get to the bottom of this?

It seems we need to convert some Python from here:

parai@6289c1b

@marcown commented Jul 18, 2018

@moorage thank you! Could you tell me how you converted the Keras model to .pb in the first place?

@samhodge

@marcown you can find a multitude of guides here: #218

@marcown commented Jul 19, 2018

thanks!

@msr-peng

> (quoting @moorage's full code from the earlier comment)

Hey man, could you share your .pb model file with me? Your code does not work on the latest .pb model of Mask R-CNN. Here is my e-mail: peterpeng9723@gmail.com
I really need to run Mask R-CNN inference in a C++ environment, thanks!

@95xueqian

@Masahiro1002 I have the same issue with TensorFlow in C++. Have you solved this problem?
Why doesn't @moorage's code have 'input_anchors'?

@ChauncyFr

@moorage Can you provide the complete code for calling the Mask R-CNN model in C++?
The Mask R-CNN model input should have three parameters; why is the input_anchors parameter missing from your code?

@gyp2448565528

> (quoting @ivshli's earlier comment)

I am working on it too. Did you finish it?

@kongjibai

> My version worked for me @luoshanwei :)

This Mask R-CNN .pb model should have 3 inputs (input_image:0, input_image_meta:0, input_anchors:0) and 7 outputs, but why do you only have two inputs? Where is the third one?

@gyp2448565528

> (quoting the exchange above)

Version 2.1 has 3 inputs; v2.0 has 2 inputs. You can find them in the source code, mrcnn/model.py.

@kongjibai commented Feb 28, 2019

> (quoting the exchange above)

Hi! Do you know how to generate the mask on the image from the model output? With the code @moorage shared above, I find that maskHeight < 1 and maskWidth < 1, which makes this resize fail:

    cv::resize(binaryMask, scaledDetectionMat, scaledDetectionMat.size(), 0, 0);

How did you solve it? Could you help me?
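
One possible cause (an assumption on my part, not verified against your graph): newer exports emit the detection boxes in coordinates normalized to [0, 1], which the Python side rescales in unmold_detections, so the raw values have to be multiplied back to pixel units before computing the mask size:

    // if the boxes come out normalized, scale them to the molded image size
    float y1 = detectionsMap(0, i, 0) * TF_MASKRCNN_IMG_WIDTHHEIGHT;
    float x1 = detectionsMap(0, i, 1) * TF_MASKRCNN_IMG_WIDTHHEIGHT;
    float y2 = detectionsMap(0, i, 2) * TF_MASKRCNN_IMG_WIDTHHEIGHT;
    float x2 = detectionsMap(0, i, 3) * TF_MASKRCNN_IMG_WIDTHHEIGHT;
    int maskWidth  = static_cast<int>(x2 - x1);  // now >= 1 for real boxes
    int maskHeight = static_cast<int>(y2 - y1);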

@gyp2448565528

> (quoting the exchange above)

My colleague modified the original code, but I don't know how to upload the file. Give me your email and I will send it to you.

@kongjibai

> My colleague modified the original code, but I don't know how to upload the file. Give me your email and I will send it to you.

My email: mengxs221@163.com, thank you very much!

@121649982

> (quoting @moorage's full code from the earlier comment)

Hello, my dear friend, can you send your tensorflow.dll, tensorflow.lib, and include files to me? I can't compile the DLL from the TensorFlow sources. I really need it. My email: 121649982@qq.com. Thanks a lot.

@buaacarzp

> (quoting the original post)

Hey, why are my x and y 0? No mask or box is shown. Please help me, thanks.

@buaacarzp

[image]

Why is my detectionsMap all zeros?
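
One thing worth checking (an assumption, not a confirmed diagnosis): the image_meta layout. Newer versions of compose_image_meta use a longer layout than the 10-float array in the code above, and a mismatched meta can yield all-zero detections. A sketch of the newer layout, where kNumClasses, origH, origW, and scale are placeholders for your own values:

    // newer compose_image_meta layout (mrcnn/model.py):
    // [image_id(1), original_image_shape(3), image_shape(3),
    //  window(4: y1, x1, y2, x2), scale(1), active_class_ids(num_classes)]
    const int kNumClasses = 2;                   // assumption: your class count
    std::vector<float> meta = {
        0.0f,                                    // image_id
        (float)origH, (float)origW, 3.0f,        // original_image_shape
        256.0f, 256.0f, 3.0f,                    // molded image_shape
        0.0f, 0.0f, 256.0f, 256.0f,              // window in the molded image
        scale,                                   // resize scale
        0.0f, 0.0f                               // active_class_ids (kNumClasses entries)
    };
    // meta.size() must equal 1 + 3 + 3 + 4 + 1 + kNumClasses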

@CasonTsai

@MennoK commented Jul 8, 2019

@caishengzao Could you push it to a repository? My Chinese is not the best ;), which makes it difficult to read.

@CasonTsai

@MennoK OK, I will try to translate it into English and then post all the resources on GitHub.

@CasonTsai

@MennoK This is my code implementation: https://github.com/CasonTsai/MaskRcnn_tensorflow_cpp_inference
Sorry it's late.

@yiakwy commented Jul 27, 2020

@moorage @CasonTsai I implemented a TensorFlow C++11 inference engine and ran Mask R-CNN inference successfully, except that the output "detection" tensor is full of zeros (I can see non-zero values in the "mrcnn mask" tensor). I see that you encountered similar issues.

[screenshots: debug-tensorflow-cpp, debug_eigen, debug_tensorflow, output]

The model is exported using modern TF 2.0 methods with the tf 1.x compatible interface (you can easily see this from the code).

[image: mrcnn]

I wonder how you solved the all-zeros "detection" tensor problem when you got zeros from the output tensors before. This is preventing me from moving forward; any help is appreciated. @moorage

Here is a snippet of the inference code:

        // In c++ it is also possible for tensorflow to create a reader operator to automatically read images from an image
        // path, where image tensor is built automatically and graph_def is finally converted from a variable of type tf::Scope.
        // In tensorflow, see codes defined in "tensorflow/core/framework/tensor_types.h" and "tensorflow/core/framework/tensor.h"
        // that users are able to use Eigen::TensorMap to extract values from the container for reading and assignment. (Lei (yiak.wy@gmail.com) 2020.7)
        tfe::Tensor _molded_images(tf::DT_FLOAT, tf::TensorShape({1, molded_shape(0), molded_shape(1), 3}));
        auto _molded_images_mapped = _molded_images.tensor<float, 4>();
        // @todo TODO using Eigen::TensorMap to optimize the copy operation, e.g.: float* data_mapped = _molded_images.flat<float>().data();  copy to the buf using memcpy
        //   ref: 1. discussion Tensorflow Github repo issue#8033
        //        2. opencv2 :
        //          2.1. grab buf: Type* buf = mat.ptr<Type>();
        //          2.2  memcpy to the buf
        //        3. Eigen::Tensor buffer :
        //          3.1 grab buf in RowMajor/ColMajor layout: tensor.data();
        //          3.2 convert using Eigen::TensorMap : Eigen::TensorMap<Eigen::Tensor<Type, NUM_DIMS>>(buf)
        //  _molded_images_mapped = Eigen::TensorMap<Eigen::Tensor<float, 4, Eigen::RowMajor>>(&data[0], 1, molded_shape_H, molded_shape_W, 3);
        for (int h=0; h < molded_shape(1); h++) {
            for (int w=0; w < molded_shape(2); w++) {
                _molded_images_mapped(0, h, w, 0) = molded_images(0, h, w, 0);
                _molded_images_mapped(0, h, w, 1) = molded_images(0, h, w, 1);
                _molded_images_mapped(0, h, w, 2) = molded_images(0, h, w, 2);
            }
        }
        inputs->emplace_back("input_image:0", _molded_images);

        tfe::Tensor _images_metas(tf::DT_FLOAT, tf::TensorShape({1, images_metas.cols() } ) );
        auto _images_metas_mapped = _images_metas.tensor<float, 2>();
        for (int i=0; i < images_metas.cols(); i++)
        {
            _images_metas_mapped(0, i) = images_metas(0, i);
        }
        inputs->emplace_back("input_image_meta:0", _images_metas);

        tfe::Tensor _anchors(tf::DT_FLOAT, tf::TensorShape({1, anchors.rows(), anchors.cols()}));
        auto _anchors_mapped = _anchors.tensor<float, 3>();
        for (int i=0; i < anchors.rows(); i++)
        {
            for (int j=0; j < anchors.cols(); j++)
            {
                 _anchors_mapped(0,i,j) = anchors(i,j);
            }
        }
        inputs->emplace_back("input_anchors:0", _anchors);

        // @todo : TODO
        // run base_engine_ detection
        // see examples from main.cpp, usage of TensorFlowEngine

        // load saved model
//        tfe::FutureType fut = base_engine_->Run(*inputs, *outputs,
//                                                {"mrcnn_detection/Reshape_1:0", "mrcnn_class/Reshape_1:0", "mrcnn_bbox/Reshape:0", "mrcnn_mask/Reshape_1:0", "ROI/packed_2:0", "rpn_class/concat:0", "rpn_bbox/concat:0"}, {});
        // load saved graph
        tfe::FutureType fut = base_engine_->Run(*inputs, *outputs,
                                                {"output_detections:0", "output_mrcnn_class:0", "output_mrcnn_bbox:0", "output_mrcnn_mask:0", "output_rois:0", "output_rpn_class:0", "output_rpn_bbox:0"}, {});
        // pass the fut object to another thread by value to avoid undefined behaviors
        std::shared_future<tfe::ReturnType>  fut_ref( std::move(fut) );

        // wrap fut with a new future object and pass local variables in
        std::future<ReturnType> wrapped_fut = std::async(std::launch::async, [=, &rets]() -> ReturnType {
            LOG(INFO) << "enter into sfe TF handler ...";

            // fetch result
            fut_ref.wait();

            tf::Status status = fut_ref.get();
            std::string graph_def = base_model_dir_;
            if (status.ok()) {

                if (outputs->size() == 0) {
                    LOG(INFO) << format("[Main] Found no output: %s!", graph_def.c_str(), status.ToString().c_str());
                    return status;
                }
                LOG(INFO) << format("[Main] Success: infer through <%s>!", graph_def.c_str());
                // @todo : TODO fill out the detectron result

                tfe::Tensor detections = (*outputs)[0];
                tfe::Tensor mrcnn_mask = (*outputs)[3];

                // @todo : TODO convert tf::Tensor to eigen matrix/tensor
                auto detections_mapped = detections.tensor<float, 3>();
                auto mrcnn_mask_mapped = mrcnn_mask.tensor<float, 5>();
#ifndef NDEBUG
                LOG(INFO) << format("detections(shape:(%d,%d,%d)):",
                        detections_mapped.dimension(0),
                        detections_mapped.dimension(1),
                        detections_mapped.dimension(2))
                << std::endl << detections_mapped;
                // LOG(INFO) << "mask:" << std::endl << mrcnn_mask_mapped;
#endif
                for (int i=0; i < images.size(); i++) {
                    // Eigen::Tensor is default ColMajor layout, which is different from c/c++ matrix layout.
                    // Note only column layout is fully supported for the moment (v3.3.9)
//                    Eigen::Tensor<float, 2> detection = Eigen::TensorLayoutSwapOp<Eigen::Tensor<float, 2, Eigen::RowMajor>>
//                    (detections_mapped.chip(i, 0));
                    Eigen::Tensor<float, 2, Eigen::RowMajor> detection = detections_mapped.chip(i, 0);
                    // Generate mask using a threshold
//                    Eigen::Tensor<float, 4> mask = Eigen::TensorLayoutSwapOp<Eigen::Tensor<float, 4, Eigen::RowMajor>>
//                    (mrcnn_mask_mapped.chip(i, 0));
                    Eigen::Tensor<float, 4, Eigen::RowMajor> mask = mrcnn_mask_mapped.chip(i, 0);


                    DetectronResult ret;
                    Eigen::MatrixXi window = windows.row(i);

                    unmold_detections(detection, mask, image_shape, molded_shape, window, ret);
                    rets.push_back( std::move(ret) );
                }
            } // end if (status.ok())
            return status;
        }); // end of the async lambda (closing braces restored; the snippet was truncated in the original post)

@CasonTsai

@yiakwy you may need to check these steps:
1. Keep the same config (such as input size and batch_size) when you save the Keras model, convert the Keras model to a TF model, and run inference in C++.
2. Check the input names, and guarantee the image data has actually flowed into the tensors correctly (see the sketch below).
3. Check the preprocessing, such as generating the anchors.
4. You can reference these links: https://github.com/CasonTsai/MaskRcnn_tensorflow_cpp_inference and https://blog.csdn.net/qq_33671888/article/details/89254537
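
For step 2, a minimal sanity check (a sketch; it assumes inputs is the usual vector of (name, Tensor) feed pairs, as in the code above):

    // print each feed's name, shape, and a short value summary before Run()
    for (const auto& feed : *inputs) {
        std::cout << feed.first << " -> "
                  << feed.second.shape().DebugString() << " "
                  << feed.second.DebugString() << std::endl;
    }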

@yiakwy commented Aug 8, 2020

@CasonTsai Thanks for the suggestions. Could you help check the following code? You can also check out the code here:
https://github.com/yiakwy/SEMANTIC_VISUAL_SUPPORTED_ODEMETRY/blob/master/modules/models/sfe.h
https://github.com/yiakwy/SEMANTIC_VISUAL_SUPPORTED_ODEMETRY/blob/master/modules/models/simple_mrcnn_infer.cpp

  1. Code for exporting models from Keras to TensorFlow, either as a pure protobuf file with constant variables or in the SavedModel format (the Google Cloud team introduced this method in 2017 for TensorFlow Serving), can be found in https://github.com/yiakwy/SEMANTIC_VISUAL_SUPPORTED_ODEMETRY/blob/master/python/pysvso/models/sfe.py

  2. Both input tests and output tests are included in the C++ source to compare against results from the Python backend.

Some steps have already been taken to ensure that:

  1. the input tensor tests are covered with exactly or nearly the same quantities;
  2. the TensorFlow graph is generated in two ways: a protobuf file for tf 1.x and the SavedModel format for tf 2.x (documentation: https://github.com/yiakwy/SEMANTIC_VISUAL_SUPPORTED_ODEMETRY/blob/master/docs/tensorflow_models/tensorflow_cpp_inference.md);
  3. the export signature on the Python side is consistent with the C++ inference engine.

Have you encountered the same issue before?

@yiakwy commented Aug 24, 2020

@hxw111 @CasonTsai @121649982 @moorage

Ultimate solution.

The bug has been fixed, with TensorFlow inference tests both in Python and C++ (fixed a typo: a wrong index in image_meta). Here is an example of the output:

[screenshot: resolved_bug]

Recently I gave a talk at a Google Developer Group (GDG) meetup about inference on end devices. Welcome to have a look at it!

Close the issue.

@yiakwy commented Aug 24, 2020

@waleedka @MennoK I want to add a pull request with the solution to this problem. I also used Mask_RCNN in my POC project svso for real-time depth estimation (a semantic SLAM project) and introduced it to the public at GDG.
