
[TFLite] Support TFLite FP32 Relay frontend. #2365

Merged
merged 9 commits into from Jan 23, 2019

Conversation


@FrozenGene FrozenGene commented Jan 3, 2019

This is the first PR of #2351 to support importing existing quantized int8 TFLite models. The base version of Tensorflow / TFLite is 1.12.

This PR supports the TFLite FP32 Relay frontend. The input layout is NCHW, which is not the same as TFLite's NHWC layout, because NCHW is TVM's default layout and the one TVM supports best (e.g., AutoTVM on ARM CPU). Don't worry, I have completed the layout translation work in the TFLite frontend. Currently, it can run Mobilenet V1 successfully.
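As a minimal numpy sketch of the layout difference described above (illustrative only, not the frontend code itself):

```python
import numpy as np

# A TFLite model consumes NHWC input, e.g. one 224x224 RGB image.
nhwc = np.random.rand(1, 224, 224, 3).astype("float32")

# The converted Relay module expects NCHW, so callers transpose first.
nchw = np.transpose(nhwc, (0, 3, 1, 2))

print(nchw.shape)  # (1, 3, 224, 224)
```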

Prerequisite of Python packages:

  • flatbuffers
  • tflite

You can install flatbuffers using pip3 install flatbuffers

For the tflite package, as far as I know, there is no pip package yet. You can install it in one of two ways:

  1. git clone my generated tflite package: https://github.com/FrozenGene/tflite
  2. generate it yourself. You can find instructions for how to do this in the README of the repo above.

@tqchen To run the TFLite frontend tests, we need the flatbuffers and tflite packages on our CI machine. Please let me know if you have any questions.

After this PR is merged, I will write a tutorial on how to use the TFLite frontend, as was done for the previous frontends.


FrozenGene commented Jan 3, 2019

It seems our CI TensorFlow version is too low (i.e., older than 1.12). It reports this error:
AttributeError: module 'tensorflow.contrib.lite.python.lite' has no attribute 'TFLiteConverter'
@tqchen Could we upgrade our Tensorflow version to 1.12? I think this shouldn't have side effects on our Tensorflow frontend. If I am wrong, please correct me @srkreddy1238.

Additionally, the flatbuffers / tflite packages should be installed on our CI machine, as mentioned before.
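For illustration, the version dependency behind this error can be expressed as a tiny gate (a sketch inferred from this thread, not code from the PR; the helper name is hypothetical, and the 1.12 cutoff is what the thread reports — 1.10 lacks the attribute, 1.12 has it):

```python
def has_tflite_converter(tf_version: str) -> bool:
    # Per this thread, tf.contrib.lite.TFLiteConverter is present in
    # TF 1.12 but missing in TF 1.10 (illustrative version gate only).
    major, minor = (int(p) for p in tf_version.split(".")[:2])
    return (major, minor) >= (1, 12)

print(has_tflite_converter("1.12.0"))  # True
print(has_tflite_converter("1.10.1"))  # False
```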

@srkreddy1238 (Contributor):

@FrozenGene no issues to upgrade with TF 1.12.
With these dependencies, you could propose another PR for the CI recipe to bring in the TF upgrade and other dependencies while this PR is under review.

@FrozenGene (Member Author):

> @FrozenGene no issues to upgrade with TF 1.12.
> With these dependencies, you could propose another PR for the CI recipe to bring in the TF upgrade and other dependencies while this PR is under review.

@srkreddy1238 Do you mean I could make a PR to install / upgrade the python packages on our CI machine? Can I even git clone and set the PYTHONPATH environment variable? Could you give me an example of how to do it?

@srkreddy1238 (Contributor):

I mean a PR for docker changes to accommodate the dependencies for TFLite.
Ref. https://github.com/dmlc/tvm/blob/master/docker/Dockerfile.ci_cpu for CPU docker changes.
Ref. https://github.com/dmlc/tvm/blob/master/docker/install/ubuntu_install_keras.sh for the script that installs the tensorflow package.
You need to build the docker image locally with the changes, verify it, and then propose it for merging and deployment.

@FrozenGene I can help with docker changes if you have difficulty.

@FrozenGene (Member Author):

> I mean a PR for docker changes to accommodate the dependencies for TFLite.
> Ref. https://github.com/dmlc/tvm/blob/master/docker/Dockerfile.ci_cpu for CPU docker changes.
> Ref. https://github.com/dmlc/tvm/blob/master/docker/install/ubuntu_install_keras.sh for the script that installs the tensorflow package.
> You need to build the docker image locally with the changes, verify it, and then propose it for merging and deployment.
>
> @FrozenGene I can help with docker changes if you have difficulty.

Yes. I am not familiar with docker. If you could help install the TFLite dependencies, that would be very nice! Thanks in advance.

@srkreddy1238 (Contributor):

@FrozenGene I see the tensorflow version is already 1.12.
Or do we have to rebuild the CI docker image, @tqchen?

 ---> Using cache
 ---> f448a5f23a1e
Step 53/54 : ENV LD_LIBRARY_PATH=${LD_LIBRARY_PATH}:${VULKAN_SDK}/lib
 ---> Using cache
 ---> 2fdae98744fc
Step 54/54 : ENV VK_LAYER_PATH=${VULKAN_SDK}/etc/explicit_layer.d
 ---> Using cache
 ---> 8c35d2ddd51e
Successfully built 8c35d2ddd51e
Successfully tagged tvm.ci_gpu:latest
Running 'pip3 search tensorflow' inside tvm.ci_gpu...
nvidia-docker
mesg: ttyname failed: Inappropriate ioctl for device
Adding group `srk' (GID 1002) ...
Done.
tensorflow (1.12.0)                                      - TensorFlow is an open source machine learning framework for everyone.
  INSTALLED: 1.12.0 (latest)
tensorflow-qndex (0.0.22)                                - tensorflow-qnd x tensorflow-extenteten
tensorflow-plot (0.2.0)                                  - TensorFlow Plot
tensorflow-estimator (1.10.12)                           - TensorFlow Estimator.


FrozenGene commented Jan 4, 2019

@srkreddy1238 I think we should rebuild the docker image, because in the latest commit (e8d6d9a) the tensorflow version is 1.10. I have tested tensorflow 1.10, which raises the same error: AttributeError: module 'tensorflow.contrib.lite.python.lite' has no attribute 'TFLiteConverter'. Tensorflow 1.12 does not.

@srkreddy1238 (Contributor):

@FrozenGene
Ref. #2371 & dmlc/web-data#148
Building new docker image with these changes should resolve tflite dependencies.

I simplified the TVM CI recipe by making a python package out of your tflite repository and uploading it under web-data.
I will send a PR to your branch with changes I added in there.

@FrozenGene (Member Author):

Thanks @srkreddy1238, I have seen your PRs, which help a lot in resolving the dependencies. Thanks again!

@FrozenGene (Member Author):

@srkreddy1238 I saw that your PR #2371 was merged, and I rebased onto the latest code. However, it raises the same errors:

  • module 'tensorflow.contrib.lite.python.lite' has no attribute 'TFLiteConverter'

  • ImportError: The tflite package must be installed

Has the upgrade succeeded, or do we need to wait for the docker image to be updated?

@srkreddy1238 (Contributor):

I think the docker images should be rebuilt with the new changes. cc @tqchen

@FrozenGene (Member Author):

@tqchen could you help rebuild the docker image? Then I can update the related code. Thanks in advance.


tqchen commented Jan 15, 2019

I have updated the CI docker image; please check again.

@FrozenGene (Member Author):

@tqchen Thanks, it works now.

@srkreddy1238 could you help to review this PR?


tqchen commented Jan 18, 2019

@srkreddy1238 @ajtulloch @junrushao1994 @zhreshold @xqdan please help to review this PR when you have time

-------
tensor name in UTF-8 encoding
"""
return subgraph.Tensors(tensor_idx).Name().decode("utf-8")
Member:
@FrozenGene I am a little bit surprised that this passes the Python 3 CI, because str.decode shouldn't exist in Python 3... Any ideas? Is the resulting type of Name() str or bytearray in Python 3?

Member Author:

The resulting type of Name() is not str; it is bytes in TFLite. See: https://github.com/FrozenGene/tflite/blob/master/tflite/Tensor.py#L58

Member:

Gotcha. That sounds reasonable to me. Thanks!
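The point above can be checked in plain Python (the bytes literal is a hypothetical stand-in for what the flatbuffers-generated Name() accessor returns):

```python
# Stand-in for subgraph.Tensors(tensor_idx).Name(), which returns bytes
# from the flatbuffers-generated accessor in Python 3.
raw_name = b"MobilenetV1/Conv2d_0/Relu6"

assert isinstance(raw_name, bytes)      # not str, so .decode is valid
tensor_name = raw_name.decode("utf-8")
assert isinstance(tensor_name, str)
print(tensor_name)
```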

# 3: N H W C, reshape to N H*W C, transpose to N C H*W
# 4: N H W C, transpose to N C H W
# add more if we need target shapes in future
if output_shape_length == 1 or output_shape_length == 2:
Member:

OK, I understand the logic here, but I do think it introduces an underlying assumption.

The assumption is that the input of the squeeze operator comes out of conv2d with a specific layout; otherwise it will insert transposes all over the place.

It doesn't look safe to me, but for now, for specific models like MobileNet, it is fine. @tqchen please advise.

Member Author:

Yes. When we handle reshape or squeeze ops, we must be very careful when converting from input layout NHWC to NCHW. So this follows the channel-first rule, i.e., C comes before H and W but after N. Currently we only handle 1-4D tensors, not N-D. This issue is not specific to us; other converters face the same problem. See the Tensorflow to CoreML converter, which has similar logic: https://github.com/tf-coreml/tf-coreml/blob/master/tfcoreml/_shape_sensitive_layers.py#L275
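The 1-4D rules quoted in the diff context can be sketched with numpy (illustrative shapes only; this mirrors the intent of the code comment, not the frontend implementation):

```python
import numpy as np

nhwc = np.arange(1 * 2 * 2 * 3).reshape(1, 2, 2, 3)  # N H W C

# Target rank 4: N H W C -> transpose to N C H W
r4 = np.transpose(nhwc, (0, 3, 1, 2))

# Target rank 3: N H W C -> reshape to N, H*W, C -> transpose to N, C, H*W
r3 = np.transpose(nhwc.reshape(1, 2 * 2, 3), (0, 2, 1))

print(r4.shape, r3.shape)  # (1, 3, 2, 2) (1, 3, 4)
```

In both cases C ends up immediately after N, which is the channel-first rule the frontend enforces.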

Member:

I feel it is OK for now, but could we document it somewhere? DL models in other fields like NLP heavily use squeeze, which may violate our rules and crash everything.

Member Author:

This is why I documented the rule in a code comment. Another place we could document it is the TFLite frontend tutorial. Any other suggestions?

Member:

Okay, it looks good to me. @tqchen, let us know if you have any suggestions.

Contributor:

Layout is specific to vision operators; bringing layout-specific customization into basic transforms will not work for all models. Here it's only MobileNet, hence it works, but I suspect it will get more complex as more models are added. A simple NLP model with one LSTM layer would break these assumptions.

@junrushao (Member) left a comment:

Looks good. Future improvements may include:

  1. multi-subgraph support
  2. documenting how squeeze is handled (see our discussion for details)

Within the scope of this PR, I think everything looks good.

@tqchen tqchen assigned srkreddy1238 and unassigned ZihengJiang Jan 18, 2019

tqchen commented Jan 18, 2019

Thanks for the reviews, I will leave this PR's management to @srkreddy1238 :)

python/tvm/relay/frontend/tflite.py — review threads resolved (one outdated)
return tflite_output


def compare_tflite_with_tvm(tflite_in_data, tvm_in_data, in_name, input_tensors,
Contributor:

Different input data for TVM and tflite?
I understand this conversion is to facilitate NCHW across the model, but it imposes preprocessing.
Again, non-vision models may have challenges.

cc @tqchen please advise.

Member Author:

This was discussed with @junrushao1994 before. As discussed, the issue is not specific to us. Converters like TF-CoreML (https://github.com/tf-coreml/tf-coreml/blob/master/tfcoreml/_shape_sensitive_layers.py#L275), Intel OpenVINO, TensorRT and so on do the same thing and meet the same problem. Yes, I agree there will be challenges, but for now let us start with small steps, such as the 1-4D transformations and vision models we limit ourselves to here (in internal development we have tested many other models and found no problems so far). When we want to support general N-D reshape and NLP models, we can work out how to handle them better in the future.
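Why the test helper takes two inputs can be sketched like this (hypothetical helper name; the real signature lives in test_forward.py): both frameworks see the same data, just in their own layouts.

```python
import numpy as np

def make_inputs(shape_nhwc):
    """Produce matching inputs: NHWC for TFLite, NCHW for the TVM module."""
    tflite_in = np.random.uniform(size=shape_nhwc).astype("float32")
    tvm_in = np.transpose(tflite_in, (0, 3, 1, 2))  # same data, NCHW layout
    return tflite_in, tvm_in

tflite_in, tvm_in = make_inputs((1, 8, 8, 3))
# The two arrays are identical up to layout:
assert np.allclose(np.transpose(tvm_in, (0, 2, 3, 1)), tflite_in)
```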

Contributor:

> (in internal development we have tested many other models and found no problems so far)

Have you tested Inception V1 & V3 together?

Member Author:

Yes. I tested Inception V3 just now (https://www.tensorflow.org/lite/models) with no problem; I can't find an official Inception V1 TFLite model. Maybe I can convert the Tensorflow Inception V1 model to TFLite. The Inception V3 model has one op, concatenation, that is not included in this PR; I will add more ops later. If someone needs Inception V3 support, I can provide the concatenation op implementation without any problem. This PR's main purpose is to provide the infrastructure of the TFLite frontend.

tests/python/frontend/tflite/test_forward.py — outdated review threads resolved
@srkreddy1238 (Contributor):

@FrozenGene
We need to add a tutorial under frontend for tflite too.

@FrozenGene (Member Author):

> @FrozenGene
> We need to add a tutorial under frontend for tflite too.

Yes, I will add the tutorial in a follow-up PR after this one is merged, as we do for the other frontends.

@srkreddy1238 (Contributor) left a comment:

LGTM, with the assumption that the tflite frontend is applicable only to vision models, with NCHW as the default input layout.

@srkreddy1238 (Contributor):

@FrozenGene please bring in Inception V1/3/4 support soon, to provide evidence that this layout conversion generalizes to all model types.

@srkreddy1238 srkreddy1238 merged commit 10df78a into apache:master Jan 23, 2019
@FrozenGene (Member Author):

> @FrozenGene please bring in Inception V1/3/4 support soon, to provide evidence that this layout conversion generalizes to all model types.

Thanks @srkreddy1238, I will add the related ops and Inception model tests soon.

zhiics pushed a commit to zhiics/tvm that referenced this pull request Jan 23, 2019
* Support TFLite FP32 Relay frontend.

* Fix lint issue

* Remove unnecessary variables and packages

* Add method doc string

* Fix capital letter of method doc string

* Add TFLite FP32 Relay frontend based on latest code

* Merge the latest code

* Solve cython issue with relay type inference

* Modify code based on suggestions
This was referenced Jan 24, 2019
Anthony-Mai pushed a commit to Anthony-Mai/tvm that referenced this pull request Jan 25, 2019
merrymercy pushed a commit to merrymercy/tvm that referenced this pull request Feb 18, 2019
wweic pushed a commit to neo-ai/tvm that referenced this pull request Feb 20, 2019
@FrozenGene FrozenGene deleted the tflite_fe branch September 10, 2019 13:42