
DNN/ONNX: outputs registration regression, feature request for new version of Clip operator #21698

Closed
cesarpgouveia opened this issue Mar 7, 2022 · 8 comments

@cesarpgouveia
Contributor

System information (version)
  • OpenCV => 4.5.5
  • Operating System / Platform => Windows 64 Bit
  • Compiler => Visual Studio 2019
Detailed description

I tried to run inference with a Selfie Segmenter ONNX model (you can find the model here: https://github.com/PINTO0309/PINTO_model_zoo/tree/main/109_Selfie_Segmentation); however, I get NaN for all output values.

Steps to reproduce

You can reproduce this issue by running the following script with OpenCV 4.5.5:

#include <iostream>

#include <opencv2/imgcodecs.hpp>
#include <opencv2/imgproc.hpp>
#include <opencv2/dnn.hpp>
#include <opencv2/core.hpp>

int main()
{
    cv::Size inputSizeNewBarracuda = cv::Size(256, 256);

    std::string imagefilename = "C:/Lixo/SantaNoel.jpg";
    std::string newBarracuda = "C:/Users/cesar.gouveia/Downloads/saved_model_openvino/model_float32.onnx";

    cv::dnn::Net net = cv::dnn::readNetFromONNX(newBarracuda);
    net.setPreferableBackend(cv::dnn::DNN_BACKEND_OPENCV);
    net.setPreferableTarget(cv::dnn::DNN_TARGET_CPU);

    cv::Mat img = cv::imread(imagefilename);
    cv::Mat resized;
    cv::resize(img, resized, inputSizeNewBarracuda);

    std::vector<cv::Mat> imgBatch = { resized };
    bool swapRBChannels = false;
    cv::Mat blob = cv::dnn::blobFromImages(imgBatch, 1.0, cv::Size(), cv::Scalar(), swapRBChannels, false, CV_32F);
    blob = blob.reshape(1, { 1, inputSizeNewBarracuda.height, inputSizeNewBarracuda.width, 3 }); // because the model has input in channels last

    net.setInput(blob);

    std::vector<cv::Mat> outputs;
    outputs.clear();

    std::vector<cv::String> unconnectedOutLayerNames = net.getUnconnectedOutLayersNames();
    net.forward(outputs, unconnectedOutLayerNames);

    const cv::Mat& targetMat = outputs[0];
    const float* targetBuffer = (float*)targetMat.data;

    std::cout << targetMat.size[0] << std::endl; // 1
    std::cout << targetMat.size[1] << std::endl; // 256
    std::cout << targetMat.size[2] << std::endl; // 256
    std::cout << targetMat.size[3] << std::endl; // 1

    // Access the contiguous output buffer
    for (size_t i = 0; i < 256 * 256; i++)
    {
        std::cout << targetBuffer[i] << std::endl;
    }

    // Access the output through a 3D Mat (reshape returns a new header, it does not modify targetMat in place)
    cv::Mat target3d = targetMat.reshape(1, { 1, 256, 256 });
    for (int o = 0; o < target3d.size[0]; o++)
    {
        for (int i = 0; i < target3d.size[1]; i++)
        {
            for (int j = 0; j < target3d.size[2]; j++)
            {
                std::cout << target3d.at<float>(o, i, j);
            }
        }
    }
    
    std::cout << "Finished" << std::endl;
}

The model can be downloaded from the link I provided. This is the first channels-last model I have used with OpenCV DNN; all my other models are channels-first and I have never seen this behavior before. I tried accessing both the Mat and the contiguous memory buffer, but neither worked.
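For reference, rather than eyeballing the printed values, here is a small sketch to count the NaNs in the first output (this is just an illustrative helper, not part of the failing path; it only assumes the output is a continuous CV_32F Mat, which DNN outputs are):

#include <cmath>

// Counts NaN elements in a continuous CV_32F blob (e.g. outputs[0] from the script above).
static size_t countNaNs(const cv::Mat& blob)
{
    const float* p = blob.ptr<float>();
    size_t nans = 0;
    for (size_t i = 0; i < blob.total(); i++)
        if (std::isnan(p[i])) nans++;
    return nans;
}

With the model above this would report all 256 * 256 values as NaN, matching what the printouts show.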

Thanks,
César.

@berak
Contributor

berak commented Mar 8, 2022

"channels last" model

are you sure about this?
the pdf here says so, but it might simply not be accurate;
it also states there should be 2 output layers, but i can find only one.
also, if you look at the IE xml files in the zip, those clearly have BCHW order.

i checked your code using the model_float32.pb and got decent results without reordering the blob channels

however, i could NOT import the model_float32.onnx in 4.5.5-dev:

[ERROR:0@0.693] global C:\p\opencv\modules\dnn\src\onnx\onnx_importer.cpp (909) handleNode DNN/ONNX: ERROR during processing node with 3 inputs and 1 outputs: [Clip]:(Relu6:0) from domain='ai.onnx'
OpenCV: terminate handler is called! The last OpenCV error is:
OpenCV(4.5.5-dev) Error: Unspecified error (> Node [Clip@ai.onnx]:(Relu6:0) parse error: OpenCV(4.5.5-dev) C:\p\opencv\modules\dnn\src\onnx\onnx_importer.cpp:1613: error: (-2:Unspecified error) in function 'void cv::dnn::dnn4_v20211220::ONNXImporter::parseClip(cv::dnn::dnn4_v20211220::LayerParams&, const opencv_onnx::NodeProto&)'
> >  (expected: 'node_proto.input_size() == 1'), where
> >     'node_proto.input_size()' is 3
> > must be equal to
> >     '1' is 1
> ) in handleNode, file C:\p\opencv\modules\dnn\src\onnx\onnx_importer.cpp, line 928

while this works ok with a previous one (pip install opencv-python==4.5.4.60):

>>> import cv2
>>> print(cv2.__version__)
4.5.4
>>> cv2.dnn.readNet("model_float32.onnx")
<dnn_Net 0x7f99b27e99b0>

there seems to be some regression here

@cesarpgouveia, can you check the version again? are you sure it's 4.5.5?

@cesarpgouveia
Contributor Author

are you sure about this?
the pdf here says so, but it might simply not be accurate;
it also states there should be 2 output layers, but i can find only one.
also, if you look at the IE xml files in the zip, those clearly have BCHW order.

Yes, if you load the ONNX model into Netron, for example, you will see that the input is channels-last; you can check it in the image below:
[Netron screenshot showing the model's channels-last input]

i checked your code using the model_float32.pb and got decent results without reordering the blob channels

I haven't tried the TensorFlow model yet, but that might be a good option! I will try it today and get back to you.

@cesarpgouveia, can you check the version again? are you sure it's 4.5.5?

Yes, it's release 4.5.5 with OpenVINO.

@berak
Contributor

berak commented Mar 8, 2022

yea, seems you're right about the onnx; i would not have expected such fundamental differences between different exports of the same model.

however, this seems wrong:

blob = blob.reshape(1, { 1, inputSizeNewBarracuda.height, inputSizeNewBarracuda.width, 3 }); // because the model has input in channels last

you can't simply reshape from BCHW to BHWC; the memory needs to be reshuffled (like a transpose or permute op).
maybe you can just avoid blobFromImages() and set up your blob like:

resized.convertTo( resized, CV_32F );
int sz[] = {1, inputSizeNewBarracuda.height, inputSizeNewBarracuda.width, 3};
Mat blob(sz, 4, CV_32F, resized.ptr<float>(0));
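
an equivalent sketch that skips the raw Mat constructor: since resized is already stored HWC in memory, converting to float and reshaping gives a 4D NHWC view over the same data (whether the model wants raw 0..255 values or a 1.0/255 scale is an assumption to double-check):

cv::Mat f32;
resized.convertTo(f32, CV_32F);                                  // HxWx3, interleaved channels
cv::Mat nhwcBlob = f32.reshape(1, { 1, f32.rows, f32.cols, 3 }); // same memory, NHWC layout
net.setInput(nhwcBlob);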

@cesarpgouveia
Contributor Author

cesarpgouveia commented Mar 8, 2022

you can't simply reshape from BCHW to BHWC; the memory needs to be reshuffled (like a transpose or permute op).

Yes, you are right, the memory needs to be reshuffled, of course. However, the worst that could happen in that case is that the results would not match; instead I get only NaN, which suggests the problem is something else.

I tried using your approach:

Mat blob(sz, 4, CV_32F, resized.ptr(0));

I only swapped the dimensions argument and sz because I think they were in the wrong order.

Unfortunately I get the same output: a Mat filled with NaN.

Do you have any more ideas on why this could be happening? Thank you very much for the help you are providing!

@cesarpgouveia
Contributor Author

I haven't tried the TensorFlow model yet, but that might be a good option! I will try it today and get back to you.

I tried the .pb model and yes, inference runs with no problems.

@berak
Contributor

berak commented Mar 10, 2022

yep, just give it an input scale factor of 1.0/255, then it produces less noise (almost a perfect mask).
i also tried the onnx version on colab -- all fine using 4.5.4 --
it just cannot be loaded into a more recent cv2!
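
for reference, a minimal sketch of that with the .pb export (the 1.0/255 scale is the one mentioned above; the 256x256 input size and swapRB=false are assumptions carried over from the original script):

// blobFromImage produces an NCHW blob; per the comments above the .pb export
// works without reordering channels, so no manual NHWC handling is needed here.
cv::dnn::Net tfNet = cv::dnn::readNetFromTensorflow("model_float32.pb");
cv::Mat blob = cv::dnn::blobFromImage(resized, 1.0 / 255.0, cv::Size(256, 256),
                                      cv::Scalar(), /*swapRB=*/false, /*crop=*/false);
tfNet.setInput(blob);
cv::Mat mask = tfNet.forward();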

@asmorkalov asmorkalov added category: dnn (onnx) ONNX support issues in DNN module category: dnn labels Mar 11, 2022
@asmorkalov
Contributor

@rogday could you take a look at the issue?

@rogday
Member

rogday commented Mar 14, 2022

@cesarpgouveia, this network doesn't work on the current master for the following reasons:

  1. Something happened in the new output handling introduced by (3.4) dnn: support outputs registration under new names #21540; this needs further investigation.
  2. We never supported Clip with 3 inputs; now there is an assert for that, and the min/max attributes are set properly (as per the previous version of the Clip operator; see the sketch below for the operator semantics). It worked before because, somewhere down the road, our min/max defaults happened to be the same as yours.
  3. The assert was added after the fix for the default values, so your version of OpenCV doesn't contain it yet; you effectively have no Clip at all, hence the NaNs.
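
For context, a minimal sketch of the Clip semantics involved (the 0/6 bounds are inferred from the node name Relu6:0, not read from the model itself). Opset <= 6 carries min/max as attributes, while opset >= 11 carries them as optional 2nd/3rd inputs, which is the 3-input form the importer rejected above:

#include <opencv2/core.hpp>
#include <iostream>

int main()
{
    // ONNX Clip: y = min(max(x, minVal), maxVal); the bounds default to
    // -inf/+inf when absent. With no Clip at all, a ReLU6-style activation is
    // simply skipped, so nothing bounds the values flowing downstream.
    cv::Mat x = (cv::Mat_<float>(1, 5) << -3.f, 0.5f, 4.f, 6.f, 10.f);
    cv::Mat y;
    cv::max(x, 0.0, y); // lower bound (the ReLU part)
    cv::min(y, 6.0, y); // upper bound
    std::cout << y << std::endl; // prints [0, 0.5, 4, 6, 6]
    return 0;
}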

@rogday rogday added the confirmed There is stable reproducer / investigation complete label Mar 14, 2022
@rogday rogday changed the title Can't infer using Selfie Segmenter (MobileNetV3-like) ONNX using OpenCV 4.5.5 DNN/ONNX: outputs registration regression, feature request for new version of Clip operator Mar 14, 2022
@rogday rogday linked a pull request Apr 4, 2022 that will close this issue