
Debugging Tensorflow Lite Model #514

Closed
mm7721 opened this issue Jun 10, 2020 · 12 comments

mm7721 commented Jun 10, 2020

Hi there,

First off, just wanted to say thanks for creating such a great tool - Netron is very useful.

I'm having an issue that likely stems from TensorFlow rather than from Netron, but thought you might have some insights. In my flow, I use TF 1.15 to go from .ckpt --> frozen .pb --> .tflite. Normally it works reasonably smoothly, but a recent run shows an issue with the .tflite file: it is created without errors and it runs, but it performs poorly. Opening it with Netron shows that the activation functions (relu6 in this case) have been removed from every layer. Opening the equivalent .pb file in Netron shows the relu6 functions are present.
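
For context, the .pb --> .tflite step is the standard TF 1.x converter flow. A minimal sketch is below; the tensor names and input shape are placeholders rather than the real model's values:

import tensorflow as tf  # TF 1.15

# Convert a frozen graph to a TFLite flatbuffer. "input"/"output" and the
# shape are placeholders, not the actual tensor names in the model.
converter = tf.lite.TFLiteConverter.from_frozen_graph(
    graph_def_file="frozen_graph.pb",
    input_arrays=["input"],
    output_arrays=["output"],
    input_shapes={"input": [1, 300, 300, 3]},
)
tflite_model = converter.convert()
with open("model.tflite", "wb") as f:
    f.write(tflite_model)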

Have you seen any cases in which Netron struggled with a TF Lite model (perhaps it can open the file but doesn't display it correctly)? Also, how did you figure out the format of .tflite files (knowing this would let me debug more deeply)?

Thanks in advance.

lutzroeder (Owner) commented Jun 10, 2020

Can you share the files to reproduce the issue?


mm7721 commented Jun 10, 2020

Definitely, I can share the files. But before I do, here's an extra piece of info: I went back and looked at previous versions of the TF Lite model, and they have a mix of layers with relu6 and without, and many of those models work fine. So the missing relu6 activations are unlikely to be the cause of the issue with the current model. My guess is that Toco automatically drops relu6 for layers whose output already stays within the numeric bounds 0 <= y <= 6.

This ties back to one of my questions: how did you figure out the format of the .tflite files? Would be great to walk through it line by line and see if my guess is correct.
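
As a quick back-of-the-envelope check of that guess (assuming uint8 quantization with zero_point 0):

# If the output tensor is uint8 with zero_point 0 and a scale of about 6/255,
# its representable range is already [0, 6], so an explicit relu6 clamp
# would be a no-op.
scale, zero_point = 6.0 / 255, 0
lo = (0 - zero_point) * scale      # real value of quantized 0   -> 0.0
hi = (255 - zero_point) * scale    # real value of quantized 255 -> 6.0
print(lo, hi)                      # clamping to [0, 6] changes nothing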

lutzroeder (Owner) commented Jun 10, 2020

If this is a TensorFlow Lite question, maybe start a thread in the tensorflow repo?


mm7721 commented Jun 10, 2020

Yeah, it really does look like a TF Lite question, so I can ask there. They're just really slow to respond, and I thought you might have been through the .tflite file decomposition before. But no problem, I'll ask there. Thanks for the quick replies. And once again, great tool :)


lutzroeder commented Jun 10, 2020

Duplicate of #151

.tflite files use this schema. To convert one to JSON, you can run the command listed in #487.

tensorflow/tensorflow#40363
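
(For reference, the conversion is a plain FlatBuffers-to-JSON dump; it typically looks something like flatc -t --strict-json --defaults-json schema.fbs -- model.tflite, where schema.fbs is the TensorFlow Lite schema linked above. The exact command is in #487.)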

lutzroeder marked this as a duplicate of #151 Jun 10, 2020

mm7721 commented Jun 11, 2020

Awesome, thanks. Exactly what I was looking for.


mm7721 commented Jun 11, 2020

OK, got flatc working. I notice that there are relu6 activations present in the JSON, but these don't show up in the Netron display. Is this expected (perhaps related to the issue you mentioned about fused activations)?

For example:
"type": "UINT8",
"buffer": 227,
"name": "FeatureExtractor/MobilenetV2/expanded_conv_12/expand/Relu6",
"quantization": {
"min": [
0.0
],
"max": [
5.970459
],
"scale": [
0.023414
],
"zero_point": [
0
],
"details_type": "NONE",
"quantized_dimension": 0
},
"is_variable": false
},

is from the JSON, but not displayed in Netron.
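
For reference, here is a rough way to cross-check this against the flatc output: dump each operator's fused activation next to its output tensor's quantization range. The field names assume the layout flatc emits for this schema version and may differ between versions:

import json

# model.json produced by, e.g.:
#   flatc -t --strict-json --defaults-json schema.fbs -- model.tflite
with open("model.json") as f:
    model = json.load(f)

opcodes = model["operator_codes"]
for subgraph in model["subgraphs"]:
    tensors = subgraph["tensors"]
    for op in subgraph["operators"]:
        builtin = opcodes[op.get("opcode_index", 0)].get("builtin_code", "?")
        options = op.get("builtin_options", {})
        fused = options.get("fused_activation_function", "NONE")
        out = tensors[op["outputs"][0]]
        quant = out.get("quantization", {})
        print(builtin, fused, quant.get("min"), quant.get("max"), out.get("name"))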


lutzroeder commented Jun 11, 2020

Can you share a repro file?


mm7721 commented Jun 11, 2020

Sure, here's a sample .tflite file that shows the 'issue.' Will that suffice?

model.zip


lutzroeder commented Jun 11, 2020

Ctrl+F or ⌘+F and search for FeatureExtractor/MobilenetV2/expanded_conv_12/expand/Relu6. It will show that this is a connection between two nodes with quantization min/max at 0...6.
screenshot


mm7721 commented Jun 12, 2020

Ahhh, so it is there, just not explicitly displayed. I've got other builds of the same model (but with different param values), and sometimes the activation function is displayed visually within the Conv2D or DepthwiseConv2D box. What determines whether it's displayed explicitly or embedded in an arrow between layers?

relu6_display


lutzroeder commented Jun 12, 2020

The activation layer gets stripped from the model and replaced with quantization min/max when the model is converted to use quantization? The TensorFlow repo is probably a better place to ask these questions.
