Nary Elementwise Broadcast Failure Cases #23556

Closed · 4 tasks done
WanliZhong opened this issue Apr 28, 2023 · 1 comment
Labels: bug, category: dnn, category: gpu/cuda (contrib)

Comments

WanliZhong (Member) commented Apr 28, 2023

I wrote a brute-force test to find which elementwise broadcast cases go wrong; here is the test report:
report.md

I'm not sure whether I should put this brute-force test into test_layer.cpp: it runs about 3844 cases for each backend, and it cannot detect cases that fall back from GPU to CPU unless we add something to record that situation.
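
(With extents drawn from {1, 3} there are 2^d shapes for each dimensionality d, i.e. 2 + 4 + 8 + 16 + 32 = 62 shapes for d = 1…5; the loops below cover every ordered pair of shapes, hence 62 × 62 = 3844 cases per backend.)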

Steps to reproduce

  1. If you want to test on CUDA, modify nary_eltwise_layer.cpp: change every return Ptr<BackendNode>(); to throw std::logic_error("fallback"); so that each fallback to CPU is detected (see the sketch after the reproducer code below).
  2. Build OpenCV with CUDA in Release mode.
  3. Use the code below to generate the report:
    // reproducer (standalone): enumerate broadcast shape pairs and print failures
    #include <iostream>
    #include <sstream>
    #include <stdexcept>
    #include <string>
    #include <utility>
    #include <vector>

    #include <opencv2/dnn.hpp>

    using namespace cv;
    using namespace cv::dnn;

    // config
    std::vector<std::pair<int, int>> backend_target_list = {{DNN_BACKEND_OPENCV, DNN_TARGET_CPU}, {DNN_BACKEND_CUDA, DNN_TARGET_CUDA}};
    std::vector<int> dims_list = {1, 2, 3, 4, 5};

    struct util{
        // format a shape for printing, e.g. [1, 3, 1]
        static std::string toString(const std::vector<int> &shape){
            std::ostringstream ss;
            ss << "[";
            for(size_t i = 0; i < shape.size(); i++)
                ss << (i ? ", " : "") << shape[i];
            ss << "]";
            return ss.str();
        }

        // generate all 2^n n-D arrays with entries 0 or 1
        static void get_all_arr(std::vector<std::vector<int>> &arr, int n){
            int total = 1 << n;
            arr.assign(total, std::vector<int>(n, -1));
            for(int i = 0; i < total; i++)
                for(int j = 0; j < n; j++)
                    arr[i][j] = (i >> (n - j - 1)) & 1;
        }
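
        // e.g. get_all_arr(arr, 2) yields {0,0}, {0,1}, {1,0}, {1,1}; after
        // util::replace(arr, 1, 3) these become the shapes {1,1}, {1,3}, {3,1}, {3,3}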

        // replace every 0 with `zero` and every 1 with `one`
        static void replace(std::vector<std::vector<int>> &arr, int zero, int one){
            for(size_t i = 0; i < arr.size(); i++)
                for(size_t j = 0; j < arr[i].size(); j++)
                    arr[i][j] = arr[i][j] ? one : zero;
        }

        // run one `a op b` forward pass on the given backend and classify the outcome
        static int test_bcast(const std::vector<int>& a_shape, const std::vector<int>& b_shape, const String& op, const std::pair<int, int> &backend_target)
        {
            Mat a = Mat::zeros((int) a_shape.size(), a_shape.data(), CV_32FC1);
            Mat b = Mat::ones((int) b_shape.size(), b_shape.data(), CV_32FC1);

            Net net;
            LayerParams lp;
            lp.type = "NaryEltwise";
            lp.name = "testLayer";
            lp.set("operation", op);
            int id = net.addLayerToPrev(lp.name, lp.type, lp);
            net.connect(0, 1, id, 1);

            std::vector<String> inpNames(2);
            inpNames[0] = "a";
            inpNames[1] = "b";
            net.setInputsNames(inpNames);
            net.setInput(a, inpNames[0]);
            net.setInput(b, inpNames[1]);

            net.setPreferableBackend(backend_target.first);
            net.setPreferableTarget(backend_target.second);
            try{
                Mat re = net.forward();
                auto ptr_re = (float *) re.data;
                // a is all zeros and b is all ones, so every "sum" output must be 1
                for(size_t i = 0; i < re.total(); i++)
                    if(op == "sum" && ptr_re[i] != 1)
                        return -2; // sum result is wrong
                return 1; // all right
            }catch(std::logic_error& e){
                if((std::string) e.what() == "fallback")
                    return -3; // fallback to cpu
                else
                    return -1; // other error
            }
            catch(...){
                return -1; // runtime error
            }
        }

        static void print_result(int type, const std::vector<int> &shp1, const std::vector<int> &shp2){
            std::string error_content;
            switch(type){
                case 1:
                    return;
                case -1:
                    error_content = "runtime error";
                    break;
                case -2:
                    error_content = "result wrong";
                    break;
                case -3:
                    error_content = "fallback to cpu";
                    break;
                default:
                    error_content = "";
                    break;
            }
            std::cout << toString(shp1) << " op " << toString(shp2) << ", fail reason is " << error_content << std::endl;
        }
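
        // example of a printed failure line (shapes illustrative):
        //   [1, 3] op [3, 1], fail reason is fallback to cpu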
    };


    // driver: for each backend, test all same-dimension shape pairs and all
    // cross-dimension pairs (in both orders)
    int main(){
        std::cout << "# Nary Elementwise Broadcast Failure Cases" << std::endl;
        for(auto backend_target: backend_target_list){
            // reset per backend so shapes do not carry over from the previous run
            std::vector<std::vector<int>> dim_shape_list;
            std::vector<std::vector<int>> sub_shape_list;
            std::cout << "## BackendID: " << backend_target.first << ", TargetID: " << backend_target.second << std::endl;
            for (int dim: dims_list)
            {
                std::cout << "### Dimension: " << dim << std::endl;
                // sub_shape_list accumulates all shapes of lower dimensions
                sub_shape_list.insert(sub_shape_list.end(), dim_shape_list.begin(), dim_shape_list.end());
                util::get_all_arr(dim_shape_list, dim);
                util::replace(dim_shape_list, 1, 3); // every extent becomes 1 or 3
                // same number of dimensions
                std::cout << "- **Same Shape**" << std::endl;
                for (size_t i = 0; i < dim_shape_list.size(); i++)
                    for (size_t j = 0; j < dim_shape_list.size(); j++)
                        util::print_result(util::test_bcast(dim_shape_list[i], dim_shape_list[j], "sum", backend_target), dim_shape_list[i], dim_shape_list[j]);

                // current dimension vs lower dimensions
                std::cout << "- **Different Shape**" << std::endl;
                for (const auto & shp1 : dim_shape_list)
                    for (const auto & shp2 : sub_shape_list)
                        util::print_result(util::test_bcast(shp1, shp2, "sum", backend_target), shp1, shp2);

                // lower dimensions vs current dimension
                for (const auto & shp1 : sub_shape_list)
                    for (const auto & shp2 : dim_shape_list)
                        util::print_result(util::test_bcast(shp1, shp2, "sum", backend_target), shp1, shp2);
            }
        }
        return 0;
    }
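
For reference, the modification in step 1 looks roughly like this. This is only a sketch: the exact return sites inside nary_eltwise_layer.cpp depend on the OpenCV revision, and supportedCase is a hypothetical name for whatever guard precedes each early return.

    // inside the NaryEltwise layer's backend initialization (nary_eltwise_layer.cpp)
    if (!supportedCase)                      // hypothetical guard for an unsupported pattern
    {
        // return Ptr<BackendNode>();        // before: silently falls back to the CPU path
        throw std::logic_error("fallback");  // after: test_bcast() catches this and reports -3
    }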

Issue submission checklist

  • I report the issue, it's not a question
  • I checked the problem with documentation, FAQ, open issues, forum.opencv.org, Stack Overflow, etc and have not found any solution
  • I updated to the latest OpenCV version and the issue is still there
  • There is reproducer code and related data files (videos, images, onnx, etc)
WanliZhong added the bug, category: gpu/cuda (contrib), and category: dnn labels on Apr 28, 2023
WanliZhong referenced this issue on Apr 28, 2023: DNN/CUDA: make 'abcd op 1b11' broadcast eltwise operator support cuda

WanliZhong (Member, Author) commented Apr 28, 2023

I will re-implement this test and put it into the regular tests after these 2 related PRs are merged.
