TypeError: Shape Translator missing for OP of type Pad. #147

Closed
gi097 opened this issue Mar 23, 2018 · 41 comments

gi097 commented Mar 23, 2018

Hi,

When I try to convert a DeepLab model (http://download.tensorflow.org/models/deeplabv3_mnv2_cityscapes_train_2018_02_05.tar.gz) using the following code:

import tfcoreml as tf_converter
tf_converter.convert(
    tf_model_path = 'frozen_inference_graph.pb',
    mlmodel_path = 'cityscapes.mlmodel',
    output_feature_names = ['SemanticPredictions:0'],
    input_name_shape_dict={"ImageTensor:0":[1,513,513,3]},
    image_input_names=["import/ImageTensor"]
)

I get the following error:

TypeError: Shape Translator missing for OP of type Pad.

But the README.md says that Pad is supported. What is going wrong?

aseemw (Collaborator) commented Mar 23, 2018

Yes, Pad is supported:

'Pad' : _layers.pad,

Did you update your package to point to the latest master?

gi097 (Author) commented Mar 23, 2018

Yes, but it’s not in the dictionary:

https://github.com/tf-coreml/tf-coreml/blob/master/tfcoreml/_interpret_shapes.py

aseemw (Collaborator) commented Mar 23, 2018

Ah yes, can you check whether adding the entry 'Pad': _identity to that dictionary fixes the issue?

gi097 (Author) commented Mar 23, 2018 via email

aseemw (Collaborator) commented Mar 23, 2018

Let me know which ops these are.
Using "op_name: _terminate" will not corrupt the code.
"op_name: _identity" can be used when the op does not change the dimensionality of the input blob (i.e. when output and input have the same rank); a sketch of the idea follows below.
I'll put up a PR adding more ops to this dictionary.
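For example (a minimal sketch; the real translator functions live in tfcoreml/_interpret_shapes.py, and these stub definitions and the dictionary name are illustrative only):

def _identity(shape):
    # The op does not change the rank of the input blob, so the output
    # inherits the input's shape interpretation unchanged.
    return shape

def _terminate(shape):
    # Stop interpreting shapes past this op.
    return None

# Entries of the kind being discussed:
_SHAPE_TRANSLATORS = {
    'Pad': _identity,
    'ResizeBilinear': _identity,
}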

aseemw added the bug label and self-assigned this on Mar 23, 2018
gi097 (Author) commented Mar 23, 2018

Okay, Pad and ResizeBilinear are needed by this graph and were missing. I added entries for them, using both _terminate and _identity, to the dictionary.

Now it gives me the error Reshape interpret shapes: Case not handled currently, so more must be done to make this work.

aseemw (Collaborator) commented Mar 24, 2018

PR #148 fixes these errors and includes some other fixes as well.
However, the graph you linked cannot be converted; converting with #148 will at least give the correct error message.
Currently, CoreML supports only a limited version of the ResizeBilinear op: if the resize is equivalent to upsampling by an integer factor, it works, but fractional upsampling is not supported.
You can either change the parameters of that layer so that the ratio of output to input height/width becomes an integer, or convert the graph only up to that layer, which in any case appears towards the end of the graph.
The custom layer functionality of CoreML could also be used, but that support hasn't been added to the tfcoreml converter yet (issue #120).
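To illustrate the integer-factor constraint with the shapes that come up later in this thread (a quick sketch):

h_in, w_in = 65, 65       # input feature map of the resize op
h_out, w_out = 513, 513   # requested output size
print(h_out / h_in)       # 7.892..., fractional, so not convertible
print(h_out % h_in == 0)  # False; an integer factor (e.g. 65 -> 520) would work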

A snapshot of the end portion of the TF graph (the unsupported ResizeBilinear op marked in red):

[screenshot]

gi097 (Author) commented Mar 24, 2018 via email

aseemw (Collaborator) commented Mar 24, 2018

Quantization would not affect the shape of the blobs or the parameters of the resize layer.

gi097 (Author) commented Mar 24, 2018

I understand the shapes now. I tweaked things a bit, but it did not work out. It would be really helpful if you could help me convert this model to Core ML, since I really need it for my project. Converting only up to that layer is not a good option, I guess.

aseemw (Collaborator) commented Mar 24, 2018

How did you tweak it? Wherever you define the TF model architecture, you should make sure that the spatial resize of the image feature array happens by an integer multiple rather than a fractional factor.

Yes, converting up to that layer is not ideal at all, but since fractional upsampling (resize bilinear) is not supported in CoreML version 1, there is no other option. You should be able to convert the model before and after that layer, and then implement the resize bilinear op yourself in your Xcode app.

gi097 (Author) commented Mar 24, 2018 via email

seantempesta commented
@gi097: Any progress? I'm trying to convert the same model (well, the MobileNet version).
@aseemw: Can you elaborate on how to convert the model before and after the layer? I'm trying to convert the MobileNet version of the DeepLab model and am stuck at:

AssertionError: Reshape interpret shapes: Case not handled currently

The values for the reshape are input=1 and output=3. That doesn't seem too hard to handle either?

gi097 (Author) commented Mar 28, 2018

Sorry, no progress yet. I am now trying to compile caffe instead.

aseemw removed the bug label on Mar 28, 2018
aseemw (Collaborator) commented Mar 28, 2018

@seantempesta Hopefully the ssd_example notebook in the examples folder explains the process. Basically, you cut two chunks out of the TF model, before and after the unsupported op, and then convert them separately (see the sketch below). You can visualize the TF graph to figure out the correct op and the input and output tensor names, or use the inspect_pb.py script in the utils folder.
Hopefully this process will become automatic and easier once issue #120 is resolved, which is in my pipeline.
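For the cutting step, something along these lines can work (a sketch using TF 1.x graph utilities; the node name is illustrative, taken from the graphs discussed in this thread):

import tensorflow as tf
from tensorflow.python.framework import graph_util

# Load the frozen graph.
graph_def = tf.GraphDef()
with open('frozen_inference_graph.pb', 'rb') as f:
    graph_def.ParseFromString(f.read())

# Keep only the nodes needed to compute the tensor that feeds the
# unsupported op, then save the smaller graph for tfcoreml to convert.
sub_graph = graph_util.extract_sub_graph(graph_def, ['ResizeBilinear_2'])
with open('frozen_sub.pb', 'wb') as f:
    f.write(sub_graph.SerializeToString())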

The reshape error you are getting can maybe be removed; you can try adding another condition here. Or you can share the model and I can take a look at it.

gi097 (Author) commented Mar 28, 2018

@seantempesta shall we take a look and try to fix this together?

seantempesta commented
@aseemw: This script will generate the checkpoint and the frozen graph for the MobileNet version of the model I'm trying to convert.

https://github.com/tensorflow/models/blob/master/research/deeplab/local_test_mobilenetv2.sh

I tried adding some code to handle the reshape case, but then all sorts of strange errors started happening, so I likely didn't do it right. #120 does seem like the better solution.

@gi097: Sure. I've got a LinkNet model converted and trained for my client, though, so I'm not sure when I'll have more time to dedicate to this. Why are you compiling caffe? Have you tried converting this Keras implementation using coremltools?

https://github.com/xiaochus/MobileNetV2

gi097 (Author) commented Mar 29, 2018

@seantempesta Great! Could you send me that converted model, if you don't mind? See my email on GitHub.

I am trying to port caffe because I want to try SegNet, which seems to be very fast.

aseemw (Collaborator) commented Mar 29, 2018

@seantempesta you won't get the reshape error if you use the latest master of tfcoreml.

Both models can be converted up to the input of the last ResizeBilinear op ("ResizeBilinear_3"), which is not supported.
So use the output of the "ResizeBilinear_2" op as the final output for the converter (circled in red below).

(Visualized here using Netron)

[screenshot]

Here is what you can do for the mobilenet model:

import tfcoreml

H_in, W_in = 224, 224  # or whatever input size, must be less than 513
tfcoreml.convert(tf_model_path = 'model.pb',
                 mlmodel_path = 'model.mlmodel',
                 output_feature_names = ['ResizeBilinear_2:0'],
                 input_name_shape_dict = {'ImageTensor:0': (1, H_in, W_in, 3)})

This will return a multiarray output of shape [21, 65, 65] (i.e. C=21, H=65, W=65).
After that, you can do the remaining operations that the TF graph performs:

  • resize bilinear the [21,65,65] array to get a [21,513,513] array (code for bilinear interpolation should be easy to find)
  • take the argmax along the first axis to get a [513,513] array (say X)
  • slice from index 0 to get the [H_in, W_in] array, which is the final output
    (i.e. out = X[:H_in, :W_in])

For the other model it's the same thing, though perhaps with different shape values (you can find them by running the TF graph and evaluating the last few tensors). A sketch of this post-processing follows below.
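Here is a minimal sketch of those three steps in NumPy, assuming the [21, 65, 65] multiarray above; scipy's zoom with order=1 stands in for TF's bilinear resize, so edge behaviour may differ slightly:

import numpy as np
from scipy.ndimage import zoom

H_in, W_in = 224, 224
logits = np.random.rand(21, 65, 65).astype(np.float32)  # CoreML multiarray output

# 1. Bilinear resize [21, 65, 65] -> [21, 513, 513] (order=1 is bilinear).
resized = zoom(logits, (1, 513 / 65, 513 / 65), order=1)

# 2. Argmax along the first (channel) axis -> [513, 513] label map X.
X = np.argmax(resized, axis=0)

# 3. Slice from index 0 to get the final [H_in, W_in] output.
out = X[:H_in, :W_in]
print(out.shape)  # (224, 224)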

aseemw closed this as completed on Mar 29, 2018
leonidas002 commented
@aseemw did you manage to convert the deeplab model successfully? I ran into the same problem and couldn't convert it.

FrozenGene (Contributor) commented

Did you convert it successfully? I also need this TF model.

gi097 (Author) commented May 7, 2018

Not for me, I just stopped trying.

FrozenGene (Contributor) commented

@aseemw Could you explain why ResizeBilinear_3 is not supported? Also, for the ImageTensor input, why don't we need to drop the preprocessing the way we do for SSD-MobileNet?

FrozenGene (Contributor) commented

I get this error: AssertionError: Resize Bilinear: height upsampling factor must be an integer (input height = 65, output height = 513)

aseemw (Collaborator) commented May 17, 2018

Yes, that is an expected error.
The CoreML framework does not have a general resize bilinear layer (lists of supported layers are here and here).

There is support for bilinear upsampling, which is equivalent to resize bilinear when the height/width changes by an integer factor (not the case here).

I don't remember the SSD architecture exactly; does it have a resize bilinear at the end? If it does, maybe the factor is an integer there.

Currently there are two workarounds for this:

  • specify as output an op before this layer, and then implement the rest as post-processing in your app
  • insert a custom layer into the model; you would still need to provide the code in your app, but the layer becomes part of the mlmodel. The ability to add custom layers easily will land in master soon (see PR Option to add CoreML custom layers during conversion #179), sketched below.
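Once #179 lands, the conversion call is expected to look roughly like this (a sketch based on the PR's direction; the keyword names and the conversion function are assumptions, not the final API):

import tfcoreml

def _convert_resize_bilinear(**kwargs):
    # Build and return a CoreML custom-layer spec for ResizeBilinear here,
    # carrying the target output size as a layer parameter.
    ...

tfcoreml.convert(
    tf_model_path = 'frozen_inference_graph.pb',
    mlmodel_path = 'deeplab.mlmodel',
    output_feature_names = ['SemanticPredictions:0'],
    input_name_shape_dict = {'ImageTensor:0': (1, 513, 513, 3)},
    add_custom_layers = True,  # assumed flag from PR #179
    custom_conversion_functions = {'ResizeBilinear': _convert_resize_bilinear})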

FrozenGene (Contributor) commented May 18, 2018

@aseemw No, SSD-MobileNet doesn't have a resize bilinear. Like DeepLab, it has an input named ImageTensor, but there we cut off the preprocessing step:

Here we will extract the FeatureExtractor from the model and strip off the other subgraphs, as these subgraphs contain structures not currently supported in CoreML. The tasks in Preprocessor, MultipleGridAnchorGenerator and Postprocessor subgraphs can be achieved by other means, although they are non-trivial.

I asked because the two inputs have the same type and name (ImageTensor), yet for SSD we need to cut the graph while for DeepLab we don't.

These workarounds mean that we cannot handle it inside the converter, right? We have to stop at the ResizeBilinear layer and add code to finish the remaining work. I am writing a translator from CoreML models to another format; could I handle this custom layer in my translator too (recognize the custom layer, then implement it in code inside my model translator)? I don't want to leave that code for our customers to write.

Moreover, could you help explain the mode:

    enum InterpolationMode {
        NN = 0;
        BILINEAR = 1;
    }

I don't see any difference explained in the doc. Since I am writing the model translator, I need to understand it. Thanks.

FrozenGene (Contributor) commented May 18, 2018

@aseemw I think I just found a bug in our resize_bilinear handling: we don't handle resize_bilinear in combination with squeeze.
This is the TensorFlow output:

[screenshot]

And after squeeze:

[screenshot]

CoreML: we use upsampling to handle it:

[screenshot]
[screenshot]

The output is [1, 3, 513, 513] and the loadConstant is [3, 1, 1], so we cannot elementwise-add them.

I think it is similar to #69, because we ignore squeeze by default.

aseemw reopened this on May 18, 2018
aseemw (Collaborator) commented May 18, 2018

Check out PR #179, which should make adding the custom layer easier.

NN stands for nearest-neighbor interpolation; the other one is bilinear interpolation.
Conversion from a CoreML model should be straightforward, even when it has a custom layer, since the description of that layer can hold all the parameters (see the sketch below).
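For instance, a custom layer's class name and parameters can be read back out of the model spec with coremltools (a sketch; the model path is illustrative):

import coremltools

spec = coremltools.utils.load_spec('model_with_custom_layer.mlmodel')
for layer in spec.neuralNetwork.layers:
    if layer.WhichOneof('layer') == 'custom':
        # className identifies the op; parameters carries its attributes.
        print(layer.name, layer.custom.className, layer.custom.parameters)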

FrozenGene (Contributor) commented

Thanks for answering. So, could we fix the bug with ResizeBilinear followed by Squeeze?

aseemw (Collaborator) commented May 21, 2018

"The output is [1, 3,513, 513] and loadConstant is [3, 1, 1] We can not elementwise_add it."

I think this should not be an issue. CoreML is able to add tensors [3,513,513] and [3,1,1].
Do you have a Coreml model that shows an error? A reproducible case with specific steps will be useful. If you do, please open a new issue as well. This one was originally about the pad op, which has been solved.
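CoreML's add layer broadcasts a restricted set of shapes (listed further below in this thread); for these particular shapes the effect is the same as NumPy broadcasting, as in this sketch:

import numpy as np

x = np.ones((3, 513, 513), dtype=np.float32)         # upsampled output (C, H, W)
c = np.arange(3, dtype=np.float32).reshape(3, 1, 1)  # loadConstant (C, 1, 1)

y = x + c       # the per-channel constant broadcasts across H and W
print(y.shape)  # (3, 513, 513)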

FrozenGene (Contributor) commented

Yes, here is the error I see:
Input name(s) and shape(s):
ImageTensor__0 : (C,H,W) = (3, 513, 513)
Neural Network compiler 0: 210 , name = ResizeBilinear:0, output shape : (C,H,W) = (3, 513, 513)
Neural Network compiler 1: 290 , name = negated_Reshape:0_sub_2:0, output shape : (C,H,W) = (3, 1, 1)
Neural Network compiler 2: 230 , name = sub_2:0, output shape : (C,H,W) = (3, 513, 513)
Traceback (most recent call last):
File "test_tvm_coreml.py", line 43, in
coreml_out = mlmodel.predict(coreml_input, useCPUOnly = True)[coreml_output_name]
File "/usr/local/lib/python3.6/site-packages/coremltools/models/model.py", line 264, in predict
return self.proxy.predict(data,useCPUOnly)
RuntimeError: value type not convertible

Converted code:

import tfcoreml as tf_converter
import coremltools
import numpy as np
from PIL import Image

H_in, W_in = 513, 513  # or whatever input size, must be less than or equal to 513
tf_converter.convert(tf_model_path = 'frozen_inference_graph.pb',
                     mlmodel_path = 'deeplab.mlmodel',
                     output_feature_names = ['sub_2:0'],
                     input_name_shape_dict = {'ImageTensor:0': (1, H_in, W_in, 3)})

model_file = 'deeplab.mlmodel'
mlmodel = coremltools.models.MLModel(model_file)
img = Image.open('cat.png').resize((513, 513))
image = np.asarray(img)
img_tf = np.expand_dims(image, axis = 0)  # NHWC input for the TF comparison run
image = image.transpose((2, 0, 1))        # CHW input for CoreML

# Evaluate CoreML
coreml_output_name = 'sub_2__0'
coreml_input_name = 'ImageTensor__0'
coreml_input = {coreml_input_name: image}

# Test the default CoreML evaluation
coreml_out = mlmodel.predict(coreml_input, useCPUOnly = True)[coreml_output_name]
print(coreml_out)

After your explanation I am not sure whether this is a new issue. If you can help me confirm, I will open one.

aseemw (Collaborator) commented May 21, 2018

It's giving an error because the input type is int, whereas it should be float:

image = np.asarray(img).astype(np.float32)

FrozenGene (Contributor) commented

Thanks @aseemw, that gives the same output as TensorFlow. I will continue testing this DeepLab v3 model, and if I find any issues, I will open them.

By the way, could we state explicitly in the AddLayerParams doc that add supports broadcasting? TensorFlow's doc (https://www.tensorflow.org/api_docs/python/tf/add) says:

NOTE: Add supports broadcasting

But our doc doesn't say anything about it, so I misunderstood it and translated it into an elementwise add without broadcast support.

aseemw (Collaborator) commented May 21, 2018

Doesn't it mention there that the set of broadcastable shapes is {[1], [C], [1, H, W], [C, H, W]}?

FrozenGene (Contributor) commented

You are right, I just saw it. If I had seen it earlier, it would have saved me a lot of time :-(

aseemw closed this as completed on May 22, 2018
gi097 (Author) commented Jul 27, 2018

If anybody is still looking for DeepLab support on CoreML, I converted the Cityscapes MobileNetV2 model with DeepLabv3+.

https://github.com/gi097/blindassist-scripts

Based on the work of @seantempesta

xiangcong commented
@gi097 Nice job!
But is there any way to deal with the ArgMax? Looping over the multiarray to search for the max index on the CPU is time-consuming.

gi097 (Author) commented Sep 4, 2018

xiangcong commented
I get a result like this:

[screenshot]

The original image and the mask image are not perfectly aligned; I'm not sure whether it is because the bilinear upsampling was replaced with normal upsampling?
@gi097 @seantempesta

gi097 (Author) commented Sep 5, 2018

The CoreML pixel input size differs from the actual view's size, so it might not be calibrated properly. You can take a look at my Storyboard, which aligns the views perfectly.

xiangcong commented
OK, I'll check that.
Thanks a lot! @gi097
