Can't parse layer for node='layer_normalization/strided_slice' of type='StridedSlice' #23872
Comments
We do support TensorFlow models, but the support isn't perfect. We do support layer norm, but currently we only parse it when the model is an ONNX model. Since your exported model breaks layer norm into several operators, the problem lies in the implementation of each operator rather than in layer norm itself. If you are looking for a quick fix, I suggest trying https://github.com/onnx/tensorflow-onnx to do a tf-to-onnx conversion, then loading the converted ONNX model with OpenCV. Will update soon once I locate the bug.
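For context, a fused LayerNormalization node is mathematically equivalent to a short chain of primitive ops (mean, squared difference, rsqrt, multiply, add), which is roughly what the TF exporter emits one node at a time instead of a single fused layer. A minimal NumPy sketch of that decomposed arithmetic (names and epsilon are illustrative, not the actual TF op names):

```python
import numpy as np

def layer_norm_decomposed(x, gamma, beta, eps=1e-3):
    # What one fused LayerNorm node computes, written as the chain of
    # primitive ops a decomposed TF graph performs step by step.
    mean = x.mean(axis=-1, keepdims=True)                  # Mean
    var = ((x - mean) ** 2).mean(axis=-1, keepdims=True)   # SquaredDifference + Mean
    inv_std = 1.0 / np.sqrt(var + eps)                     # Rsqrt
    return (x - mean) * inv_std * gamma + beta             # Sub / Mul / AddV2

rng = np.random.default_rng(0)
x = rng.normal(size=(2, 4, 8)).astype(np.float32)
y = layer_norm_decomposed(x, gamma=np.ones(8, np.float32),
                          beta=np.zeros(8, np.float32))
# The normalized output has near-zero mean along the last axis.
assert np.allclose(y.mean(axis=-1), 0.0, atol=1e-4)
```

Because the importer sees only the individual primitive nodes, each of them (e.g. `StridedSlice` used for shape handling) must parse correctly on its own, which is where the reported error originates.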
Hello @charvey2718 , I located the problems and there is a chance that they can be fixed. Problems are:
opencv/modules/dnn/src/tensorflow/tf_graph_simplifier.cpp Lines 625 to 636 in c982be3
Commenting out the lines above leads to the following issue.
Thanks for the suggestion of a quick fix. I did try a tf2onnx conversion, but the model still wouldn't load into OpenCV when LayerNormalization is used, though this time with a different error (the one below). Just as with the MWE, skipping the LayerNormalization lets it load.
As far as I can tell from the ONNX model, the LayerNormalization layer is still decomposed into sub-operators, e.g. strided_slice. I understood from another issue (sorry, I lost the link) that converting an untrained architecture to ONNX (as in my MWE) can be problematic. Interestingly, when I convert my trained model (containing LayerNormalization) to ONNX, the error changes to
From here, this seems to come back to the dynamic shape, i.e. not specifying a batch size. Unfortunately, specifying a batch size of 1 then gives yet another error when I try to load it:
Unfortunately, as much as I'd like a quick fix, using the ONNX format hasn't worked and may be a digression from the original issue. Following your last post, am I right to think that commenting out the line in opencv/modules/dnn/src/tensorflow/tf_graph_simplifier.cpp as per item 1 in your list, and changing the batch size to 1 in my model, should resolve my issues? (Item 2 in your list did not seem to require any fix.)
Another quick try on your side: once the model is converted to ONNX, use onnxsim to simplify it before loading.
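A typical invocation might look like the following (the directory and file names here are illustrative, chosen to match the attachments later in the thread):

```shell
# Convert the frozen TF SavedModel to ONNX with tf2onnx,
# then run onnx-simplifier over the result.
python -m tf2onnx.convert --saved-model saved_model_dir --output tfmodel.onnx
python -m onnxsim tfmodel.onnx tfmodelsim.onnx
```

onnxsim folds constant subexpressions and can collapse shape-inference chains, which sometimes removes the dynamic-shape nodes that trip up importers.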
I added the following to my Python MWE above, and called tf2onnx() instead of save_weights():
Running this outputs the following
As you can see, onnxsim didn't help with readNet. I've attached a zip, tfmodel.zip, containing tfmodel.onnx and tfmodelsim.onnx. Thank you for the work on fixing the TF importer. I will download and test it later and let you know how I get on.
Another option for you to try is to either fake-train your model a bit or set it to eval mode, then export it.
I thought the same, so I also applied the ONNX + onnxsim conversion process to my trained full architecture. Doing so gives the same error as before when trying to load an untrained ONNX model containing LayerNormalization. The trained onnxsim version of the full architecture is here (it's a much bigger file, obviously, as the architecture is much more complex).
I was just wondering if you still think this bug is fixable, and if so, on what timescale? Please don't misunderstand me: I'm not being impatient, and I'm very grateful for the time you've spent on this already. I'm just trying to get an idea of whether I should wait for a fix or work on other possible solutions, since the ONNX import workaround you suggested seems to have a similar problem. I could, for instance, try the TensorFlow C++ API or ONNX Runtime. Neither is desirable, especially as my wider project is considerably invested in OpenCV, but it may be necessary.
Hello @charvey2718, sorry for the late response. I was deeply involved in other projects. The fix for your TF model may need a lot of effort, and it is therefore marked as lower priority on my side. I am not sure when I can finish the patch, so I think you can try another inference framework to get started.
No worries. Thanks for your efforts. In the meantime, cppflow looks promising. But I’ll enthusiastically await OpenCV DNN developments all the same! |
System Information
C++ version details
OpenCV version: 4.6.0 (but I also checked the issue exists in the latest OpenCV Python release version 4.7.0 as below)
Operating System / Platform: Windows 10
Compiler & compiler version: GCC 8.1.0
Python version details
OpenCV python version: 4.7.0
Operating System / Platform: Windows 10
Python version: 3.9.0
Detailed description
According to here, cv::dnn::LayerNormLayer is a supported layer.
However, loading a Tensorflow net containing a LayerNormalization layer into OpenCV using cv::dnn::readNet (C++) or cv2.dnn.readNet (Python) generates the following error:
The mention of strided_slice led me to this; however, I was unable to get optimize_for_inference.py to work. It may be related, but I'm not sure it's the same issue.
Steps to reproduce
This Python code creates a Keras model (I'm using version 2.11.0) containing a Conv2D layer and a LayerNormalization layer; freezes and saves the model; and then attempts to load it into OpenCV, generating an error.
If you change the 'outputs' parameter to 'convolved', as per the comment in the create_model(...) function, the problem goes away, indicating that LayerNormalization is the issue.
I actually want to load my .pb model into the C++ version of OpenCV, but it was easier to demonstrate the issue as an MWE in a single Python script. The same issue applies to both cv::dnn::readNet (C++) and cv2.dnn.readNet (Python); it all comes down to the readNet() call in either version.
I've also uploaded the model here in mwe.zip, so you can skip the TensorFlow / Keras part of the code above; the relevant MWE code is then just the last line.
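Since the original code block did not survive formatting, here is a minimal sketch of the reproduction steps described above, under stated assumptions: the input size, filter count, and file name are illustrative, and the final readNet call is expected to raise the reported cv::Exception.

```python
import tensorflow as tf
import cv2
from tensorflow.python.framework.convert_to_constants import (
    convert_variables_to_constants_v2)

def create_model():
    inputs = tf.keras.Input(shape=(64, 64, 3))
    convolved = tf.keras.layers.Conv2D(8, 3, padding='same')(inputs)
    normalized = tf.keras.layers.LayerNormalization()(convolved)
    # Swap outputs=normalized for outputs=convolved and readNet succeeds,
    # implicating LayerNormalization.
    return tf.keras.Model(inputs=inputs, outputs=normalized)

model = create_model()

# Freeze: wrap the model in a ConcreteFunction and convert its
# variables to constants, then write the frozen GraphDef as a .pb file.
func = tf.function(lambda x: model(x)).get_concrete_function(
    tf.TensorSpec(model.inputs[0].shape, model.inputs[0].dtype))
frozen = convert_variables_to_constants_v2(func)
tf.io.write_graph(frozen.graph, '.', 'mwe.pb', as_text=False)

# Fails with: Can't parse layer for
#   node='layer_normalization/strided_slice' of type='StridedSlice'
net = cv2.dnn.readNet('mwe.pb')
```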
Issue submission checklist