Issue with Barracuda inference results using tf2onnx converted BlazePose model. #91

Closed
InternetSalmon opened this issue Sep 28, 2020 · 16 comments

Comments

@InternetSalmon

InternetSalmon commented Sep 28, 2020

Hi,

I have a model that I've converted from TensorFlow to ONNX with tf2onnx. Running the same test data through TensorFlow and ONNX Runtime produces matching output.

TensorFlow model

import tensorflow as tf

# Load the TF SavedModel and fetch its serving signature.
loaded = tf.saved_model.load("saved_model_full_pose_detection")
inference_func = loaded.signatures["serving_default"]
# Run the model on the test image (test_data is defined elsewhere).
test_data_tensor = tf.convert_to_tensor(test_data)
saved_model_output_classificators = inference_func(test_data_tensor)['classificators'].numpy()

ONNX model

import numpy as np
import onnxruntime as rt

# Load the ONNX model generated with tf2onnx 1.6.3.
sess = rt.InferenceSession("posedetector_full.onnx")
input_name = sess.get_inputs()[0].name
output_name = sess.get_outputs()[0].name
# Run the same test data through ONNX Runtime.
res = sess.run([output_name], {input_name: test_data})
onnx_output_classificators = np.array(res[0])

Using the same ONNX model and test data with Barracuda, I get drastically different inference results.

inference_results

I believe I'm creating the Barracuda input tensor correctly, and that the error occurs during inference.

barracuda_test

I've attached my Unity Barracuda test project, as well as Jupyter notebook and tensorflow + onnx models.
Unity:
InferenceTest.zip
Jupyter + Models:
Jupyter test and models

Any insight into the different output when using Barracuda would be greatly appreciated!

@InternetSalmon
Author

InternetSalmon commented Sep 29, 2020

Stepping through the outputs of each node in Barracuda and ONNX Runtime, the results appear to diverge at the Padding layer (layer 17).
intermediate node output data
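
A minimal sketch of this kind of node-by-node comparison, assuming the per-node outputs of both runtimes have already been dumped into dictionaries keyed by node name. The helper `first_divergence`, the tolerance, and the toy data are hypothetical and only illustrate the idea; they are not part of Barracuda or ONNX Runtime.

```python
import numpy as np

def first_divergence(reference, candidate, atol=1e-4):
    """Return the first node name (in reference order) whose outputs differ, else None."""
    for name, ref in reference.items():
        out = candidate.get(name)
        if out is None:
            continue  # node has no counterpart in the other runtime; skip it
        if ref.shape != out.shape or not np.allclose(ref, out, atol=atol):
            return name
    return None

# Toy data standing in for per-node dumps from the two runtimes.
x = np.arange(8, dtype=np.float32).reshape(1, 2, 2, 2)
reference = {"conv": x, "pad": np.pad(x, [(0, 0), (0, 1), (0, 0), (0, 0)])}
candidate = {"conv": x.copy(), "pad": x.copy()}  # candidate ignored the channel padding
print(first_divergence(reference, candidate))  # pad
```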

Barracuda also displays a warning that only spatial padding is supported.

I'm not sure how to interpret the padding attributes. Could the padding be for non-spatial dimensions, which is unsupported in Barracuda?

If so, is support for Pad layers with non-spatial padding on the roadmap?

@AlexRibard
Collaborator

Hi @InternetSalmon! I checked your model. Your debugging was correct: the errors start appearing at the Pad node.

The padding attributes are interpreted as follows
(cf. https://github.com/onnx/onnx/blob/master/docs/Operators.md#Pad):
the first 4 values are the padding added at the start of each axis, in NCHW order (0,0,0,0)
the second 4 values are the padding added at the end of each axis, in NCHW order (0,32,0,0)
so 32 zeros are added to the end of C
input = (1, 64, 64, 64) => (1, 96, 64, 64)
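
As an illustration of that interpretation, here is a small NumPy sketch that splits the pads list into per-axis (begin, end) pairs and applies it. The helper name `onnx_pads_to_pad_width` is hypothetical; the pads values and the input shape are the ones quoted above.

```python
import numpy as np

def onnx_pads_to_pad_width(pads):
    """Split ONNX-style pads [b1..bn, e1..en] into NumPy's [(b1, e1), ..., (bn, en)]."""
    rank = len(pads) // 2
    return [(pads[i], pads[i + rank]) for i in range(rank)]

pads = [0, 0, 0, 0, 0, 32, 0, 0]  # begins for N, C, H, W, then ends for N, C, H, W
x = np.ones((1, 64, 64, 64), dtype=np.float32)  # NCHW input

pad_width = onnx_pads_to_pad_width(pads)
y = np.pad(x, pad_width, mode="constant", constant_values=0)

print(pad_width)  # [(0, 0), (0, 32), (0, 0), (0, 0)]
print(y.shape)    # (1, 96, 64, 64)
```

The 32 appended channels are all zeros, while the original 64 channels are left untouched.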

As of yet, we do not support this operation. It doesn't seem too hard to do, so we will add it to the list.
In the meantime, let me try to think of a potential workaround...

@InternetSalmon
Author

Thanks @AlexRibard, appreciate it!

I had a look at Pad.compute. Unfortunately I don't know much HLSL, but would the aim be to have a CHW version of Border2D, e.g. a Border3D, corresponding to ONNX mode = constant?

The three supported modes are (similar to corresponding modes supported by numpy.pad):

constant (default) - pads with a given constant value as specified by constant_value (which defaults to 0)

reflect - pads with the reflection of the vector mirrored on the first and last values of the vector along each axis

edge - pads with the edge values of array
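
Since the quoted description compares these modes to numpy.pad, a quick NumPy illustration on a 1-D vector may help. This only shows the NumPy behavior the quote refers to; it is not Barracuda or ONNX Runtime code.

```python
import numpy as np

v = np.array([1, 2, 3])

# constant: pad with a fixed value (0 here)
constant = np.pad(v, (2, 2), mode="constant", constant_values=0)
# reflect: mirror around the first and last values, without repeating them
reflect = np.pad(v, (2, 2), mode="reflect")
# edge: repeat the first and last values
edge = np.pad(v, (2, 2), mode="edge")

print(constant)  # [0 0 1 2 3 0 0]
print(reflect)   # [3 2 1 2 3 2 1]
print(edge)      # [1 1 1 2 3 3 3]
```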

@AlexRibard
Collaborator

Unfortunately, I can't give you a precise estimate of when we will implement it.
Could you implement that operation with a Concat, followed by a Mul to zero out the channels you want?
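
One way to read this suggestion, sketched in NumPy rather than as an actual ONNX graph edit: concatenate the tensor with some of its own channels to reach the target channel count, then multiply by a mask that zeroes the appended channels. The choice of filler channels and the mask shape are assumptions made for illustration, and a smaller spatial size than the real model is used to keep the example light.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.standard_normal((1, 64, 8, 8)).astype(np.float32)  # NCHW
extra = 32  # number of zero channels to append, as in the Pad node above

# Concat: append 32 existing channels as filler to reach 96 channels.
concat = np.concatenate([x, x[:, :extra]], axis=1)

# Mul: a per-channel mask keeps the first 64 channels and zeroes the filler.
mask = np.ones((1, 64 + extra, 1, 1), dtype=np.float32)
mask[:, 64:] = 0.0
padded = concat * mask

# Reference result: constant zero padding on the channel axis.
expected = np.pad(x, [(0, 0), (0, extra), (0, 0), (0, 0)])
print(padded.shape, np.array_equal(padded, expected))
```

Note that multiplying by zero only yields zeros when the filler values are finite; NaN or infinite activations would survive the mask.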

@InternetSalmon
Author

Thanks Alex, I'll keep watch on the release logs and give the Concat and Mul approach a go in the meantime!

@InternetSalmon
Author

InternetSalmon commented Oct 25, 2020

I've been working on this with the latest version of Barracuda, 1.1.2 preview. The model no longer imports and fails with the error "Only tensors of rank 4 or less are supported, but got rank 8", although I don't believe this error is correct.
Inspecting the Identity_1:0 layer, it appears to be rank 3?

model.zip

@AlexRibard
Collaborator

Hi @InternetSalmon
I tested your model on default and on the upcoming 1.2.0 release, and I'm happy to say that it imports correctly :)
This version should be coming out this week.
I did try to validate the model with onnx-runtime, but it is giving me an error.

@InternetSalmon
Author

Thanks @AlexRibard, that is exciting! I appreciate all the work the team has been doing and look forward to testing out 1.2.0!
Does that mean non-spatial padding support is also included in 1.2.0?

I did try to validate the model with onnx-runtime but it is giving me an error

Was the onnx-runtime error a failure to load, or different inference results? I had uploaded the model converted with opset 13, which may have been the cause; I have attached the opset 9 version.
model_opset9.zip

I've been working with onnxruntime v1.1.0

Name: onnxruntime
Version: 1.1.0
Summary: ONNX Runtime Python bindings

@AlexRibard
Collaborator

Ah indeed, we didn't implement non-spatial padding for 1.2.0.
Your model doesn't work correctly because of that.
I have managed to get it to run with onnx_runtime, which is good.

Let me spend some time getting the non-spatial padding working for you. It won't make it into 1.2.0, but I can send you the CS.

@InternetSalmon
Author

Hey Alex, if that is possible, it would be amazing and highly appreciated!

@InternetSalmon
Author

I was able to get padding working and to get the correct inference result, although the changes are a bit of a hack...
InternetSalmon@98ec1be

Looking forward to an official implementation to see how it is done properly! I was unsure how to approach TensorExtensions.ApplyBorder(): it seems to iterate TensorShape.DataFeatures for the indexes at which to apply the start/end padding, but there is no index available for the batch and channel dimensions.

@JoeProgram

JoeProgram commented Mar 11, 2021

Hi @AlexRibard - have there been any updates on support for non-spatial padding?

I believe I'm seeing the same issue as the original poster:

Only spatial (H and W) padding is currently supported. Non spatial padding (N and C) will be ignored and default to 0.

Thanks!

@AlexRibard
Collaborator

@JoeProgram, can you share the model please?
Yes, this issue with non-spatial padding is something we should address. I'll add it to the list, as it's not hard. I hope we have time for it in the next release.

@InternetSalmon
Author

@JoeProgram I was able to produce correct output from the model in my project after making some changes to the padding compute shader. An official fix would be greatly appreciated.

@danieltanfh95

What changes are needed actually?

@AlexRibard
Collaborator

AlexRibard commented Sep 2, 2021

Commenting on this thread to say that official support for non-spatial padding has been added.
It will not make it into version 2.2.0, but it will be in the one after that.

4 participants