Why are there so many reorders in a binary network IR trained from nncf? #5817
Comments
@dmitry-gorokhov Could you have a look?
@dmitry-gorokhov Here are the model ONNX files for your reference: https://drive.google.com/drive/folders/1EzkohZOjGg-Cul0IoLok9QJu1NO0GMQ2?usp=sharing
Any updates on this?
Hello @gj-raza. Thank you for your patience. The developer is investigating the case and this might take some time. Ref. 57585 |
Hi @gj-raza, I apologize for the delay in our response. The development team has confirmed that binary model support is currently experimental in OpenVINO. Additional optimizations still need to be implemented in order to speed up such binary models. I will convert this issue into a feature request. Regards,
Hi @gj-raza, thank you for your feedback on this case,
System information (version)
Detailed description
I trained a SqueezeNet classification model using the training pipeline script provided in the nncf examples, with XNOR binarization as the compression method, keeping the first conv layer in FP32 and the rest of the network binary. I converted it to ONNX and then to IR via Model Optimizer, and ran benchmarks with OpenVINO's benchmark tool. I found that, in addition to FakeQuantize, reorders happen before every binary conv layer execution. These reorders are costly and essentially take away the speed-up that binarization could provide; as a result, the FPS is even lower than that of a full FP32 model (benchmark report attached).
As you can see below, before every binary conv a reorder happens from nchw8c to nhwc, even though, per the binary convolution docs here, the binary conv layer can accept NCHW format.
My concern is: why do the Model Optimizer / Inference Engine add reorders before every binary conv, and is there a way to avoid them?
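For context on why these reorders are expensive: nchw8c is an internal blocked CPU layout, and converting activations to another layout means a full permute-and-copy of the tensor before every binary conv. A minimal NumPy sketch of a plain NCHW→NHWC reorder (the tensor shape here is made up for illustration; the real nchw8c format additionally blocks the channel dimension in groups of 8):

```python
import numpy as np

# Hypothetical activation tensor in plain NCHW layout:
# (batch=2, channels=8, height=4, width=4).
x_nchw = np.arange(2 * 8 * 4 * 4, dtype=np.float32).reshape(2, 8, 4, 4)

# A reorder to NHWC is a permutation of axes followed by a physical
# copy to make the data contiguous in the new layout -- this copy is
# the per-layer cost that shows up as "Reorder" in the benchmark report.
x_nhwc = np.ascontiguousarray(x_nchw.transpose(0, 2, 3, 1))

# Same elements, different memory layout.
assert x_nhwc.shape == (2, 4, 4, 8)
assert x_nhwc[0, 1, 2, 3] == x_nchw[0, 3, 1, 2]
```

Because this copy touches every element of every activation tensor, inserting one before each binary conv can easily outweigh the arithmetic savings of binarized weights.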
Detailed benchmark counters for full FP32 model and binary model are attached for reference.
benchmark_detailed_counters_report_squeezenet1_1_imagenet_binary_xnor.csv
benchmark_detailed_counters_report_squeezenet_1_1_fp32.csv
benchmark_report_squeezenet1_1_imagenet_binary_xnor.csv
benchmark_report_squeezenet_1_1_fp32.csv
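To quantify how much of the total inference time the reorders consume, the per-layer times in the detailed counters report can be summed by layer type. A small sketch, assuming a semicolon-separated CSV with `layerType` and `realTime (ms)` columns (these column names and the delimiter are assumptions; adjust them to match the actual header of the attached reports):

```python
import csv
from collections import defaultdict

def time_by_layer_type(csv_path, type_col="layerType", time_col="realTime (ms)"):
    """Sum the reported execution time per layer type.

    Column names and the ';' delimiter are assumptions about the
    benchmark tool's detailed-counters report format; change them
    to match the real header if it differs.
    """
    totals = defaultdict(float)
    with open(csv_path, newline="") as f:
        reader = csv.DictReader(f, delimiter=";")
        for row in reader:
            try:
                totals[row[type_col]] += float(row[time_col])
            except (KeyError, TypeError, ValueError):
                # Skip summary/malformed rows without parsable timing info.
                continue
    return dict(totals)
```

Comparing `totals["Reorder"]` against the convolution time in both reports should make the overhead visible directly.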