
Argmax error on deeplab quantized model with tflite hexagon delegate #37871

Closed
anilsathyan7 opened this issue Mar 24, 2020 · 3 comments
Assignees
Labels
comp:lite TF Lite related issues type:bug Bug

Comments


anilsathyan7 commented Mar 24, 2020

  • OS Platform and Distribution (e.g., Linux Ubuntu 16.04): Ubuntu 18.04.4 LTS
  • TensorFlow installed from (source or binary): Source
  • TensorFlow version: built from source (git clone --recurse-submodules https://github.com/tensorflow/tensorflow.git)
  • Python version: 3.6.9
  • Bazel version (if compiling from source): 2.0.0, 1.14
  • GCC/Compiler version (if compiling from source): gcc (Ubuntu 7.4.0-1ubuntu1~18.04.1) 7.4.0
  • Device: Redmi Note 7 Pro (Hexagon 685 DSP), Android 10.0; MIUI 11

Describe the current behavior
When I try to run the official DeepLab model (trained with quantization-aware training) in the TFLite benchmark tool with the Hexagon delegate, it fails with the following error:

```
adb shell /data/local/tmp/benchmark_model --graph=/data/local/tmp/frozen_inference_graph_dm05_5.tflite --use_hexagon=true hexagon_profiling=true
adb: /opt/intel/intelpython27/lib/libcrypto.so.1.0.0: no version information available (required by adb)
STARTING!
Min num runs: [50]
Min runs duration (seconds): [1]
Max runs duration (seconds): [150]
Inter-run delay (seconds): [-1]
Num threads: [1]
Benchmark name: []
Output prefix: []
Min warmup runs: [1]
Min warmup runs duration (seconds): [0.5]
Graph: [/data/local/tmp/frozen_inference_graph_dm05_5.tflite]
Input layers: []
Input shapes: []
Input value ranges: []
Input layer values files: []
Use legacy nnapi : [0]
Allow fp16 : [0]
Require full delegation : [0]
Enable op profiling: [0]
Max profiling buffer entries: [1024]
CSV File to export profiling data to: []
Max number of delegated partitions : [0]
Enable platform-wide tracing: [0]
Use gpu : [0]
Allow lower precision in gpu : [1]
Use Hexagon : [1]
Hexagon lib path : [/data/local/tmp]
Hexagon Profiling : [0]
External delegate path : []
External delegate options : []
Use nnapi : [0]
Use xnnpack : [0]
Loaded model /data/local/tmp/frozen_inference_graph_dm05_5.tflite
INFO: Initialized TensorFlow Lite runtime.
loaded libcdsprpc.so
INFO: Created TensorFlow Lite delegate for Hexagon.
INFO: Hexagon delegate: 71 nodes delegated out of 71 nodes.

Applied Hexagon delegate, and the model graph will be completely executed w/ the delegate.
The input model file size (MB): 0.746232
Initialized session in 315.521ms.
Running benchmark for at least 1 iterations and at least 0.5 seconds but terminate if exceeding 150 seconds.

Timestamp: Tue Mar 24 18:13:02 2020

Log
hexagon/ops/src/op_argminmax_8_d32.c:119:argminmax_8_d32 out too small
hexagon/src/execute.c:142:execute() failed on node id=37b err=-1
hexagon/src/interface.c:1174:fail in execute_inner()

ERROR: Failed: Failed to execute graph.. STATE: FAILED_TO_EXECUTE_GRAPH
ERROR: Node number 71 (TfLiteHexagonDelegate) failed to invoke.

[the same "argminmax_8_d32 out too small" log block repeats for each of the 7 benchmark runs]

count=7 first=84641 curr=81095 min=79276 max=84641 avg=80591 std=1766

Benchmarking failed.
```

When I removed the argmax, the model worked with nearest-neighbor resizing. The model does not seem to work with an int32 or int64 argmax as the final layer, even though the operator is listed as supported by the Hexagon delegate. The final resize has output shape [513, 513], so I tried replacing the bilinear resize with nearest-neighbor resizing, but the benchmark still failed. I also verified the setup by successfully running a quantized mobilenet_v2 model. Models with both int32 and int64 argmax give the same error: argminmax_8_d32 out too small.
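The workaround described above (stripping the final ArgMax from the graph so the rest still runs on the DSP) implies computing the per-pixel argmax on the host from the model's logits output. A minimal NumPy sketch of that step; the helper name and the toy shapes are illustrative, not taken from the actual model:

```python
import numpy as np

def host_argmax(logits):
    """Compute the per-pixel class map on the CPU.

    logits: array of shape [1, H, W, num_classes], e.g. the DeepLab
    head's output once the final ArgMax op has been removed.
    Returns an int32 array of shape [1, H, W].
    """
    return np.argmax(logits, axis=-1).astype(np.int32)

# Tiny 2x2 "image" with 3 classes.
logits = np.array([[[[0.1, 0.7, 0.2],
                     [0.9, 0.05, 0.05]],
                    [[0.2, 0.2, 0.6],
                     [0.3, 0.4, 0.3]]]], dtype=np.float32)
seg_map = host_argmax(logits)
print(seg_map[0].tolist())  # [[1, 0], [2, 1]]
```

For quantized outputs the same call works on the raw uint8 logits, since argmax is invariant under the affine (scale/zero-point) dequantization.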

Hexagon library: v1.14

Describe the expected behavior
The benchmark model should run the quantized deeplab model without any problems.

Other info / logs
Android NDK: 20, Benchmark tool built from latest source with bazel 2.0

Here are the three models that I've tried:
quant_aware_deeplab_dm05_513.zip

@ravikyram ravikyram added the comp:lite TF Lite related issues label Mar 26, 2020
@ravikyram ravikyram assigned ymodak and unassigned ravikyram Mar 26, 2020
@anilsathyan7
Author

What does this error mean: argminmax_8_d32 out too small?
Is there any way I can make it work, e.g. by changing the shape or operator type?

@karimnosseir karimnosseir assigned karimnosseir and unassigned ymodak Mar 31, 2020
@karimnosseir
Contributor

I have a fix for the issue. Sorry for the trouble. Will be pushed soon.

Thanks


tensorflow-bot bot commented Apr 1, 2020

Are you satisfied with the resolution of your issue?

andrewxcav added a commit to andrewxcav/tensorflow that referenced this issue Apr 23, 2020
* 'master' of github.com:tensorflow/tensorflow: (125 commits, including)
  Fix incorrect type in argmin/max in hexagon delegate. Fixes tensorflow#37871
  ...
4 participants