
Argmax error on deeplab quantized model with tflite hexagon delegate #37871

Closed
anilsathyan7 opened this issue Mar 24, 2020 · 3 comments
Assignees
Labels
comp:lite TF Lite related issues type:bug Bug

Comments


anilsathyan7 commented Mar 24, 2020

  • OS Platform and Distribution (e.g., Linux Ubuntu 16.04): Ubuntu 18.04.4 LTS
  • TensorFlow installed from (source or binary): Source
  • TensorFlow version: built from source (git clone --recurse-submodules https://github.com/tensorflow/tensorflow.git)
  • Python version: 3.6.9
  • Bazel version (if compiling from source): 2.0.0, 1.14
  • GCC/Compiler version (if compiling from source): gcc (Ubuntu 7.4.0-1ubuntu1~18.04.1) 7.4.0
  • Device: Redmi Note 7 Pro (Hexagon 685 DSP), Android 10.0; MIUI 11

Describe the current behavior
When I try to run the official DeepLab model (trained with quantization-aware training) in the TFLite benchmark tool with the Hexagon delegate, it fails with the following error:

```
adb shell /data/local/tmp/benchmark_model --graph=/data/local/tmp/frozen_inference_graph_dm05_5.tflite --use_hexagon=true hexagon_profiling=true
adb: /opt/intel/intelpython27/lib/libcrypto.so.1.0.0: no version information available (required by adb)
STARTING!
Min num runs: [50]
Min runs duration (seconds): [1]
Max runs duration (seconds): [150]
Inter-run delay (seconds): [-1]
Num threads: [1]
Benchmark name: []
Output prefix: []
Min warmup runs: [1]
Min warmup runs duration (seconds): [0.5]
Graph: [/data/local/tmp/frozen_inference_graph_dm05_5.tflite]
Input layers: []
Input shapes: []
Input value ranges: []
Input layer values files: []
Use legacy nnapi : [0]
Allow fp16 : [0]
Require full delegation : [0]
Enable op profiling: [0]
Max profiling buffer entries: [1024]
CSV File to export profiling data to: []
Max number of delegated partitions : [0]
Enable platform-wide tracing: [0]
Use gpu : [0]
Allow lower precision in gpu : [1]
Use Hexagon : [1]
Hexagon lib path : [/data/local/tmp]
Hexagon Profiling : [0]
External delegate path : []
External delegate options : []
Use nnapi : [0]
Use xnnpack : [0]
Loaded model /data/local/tmp/frozen_inference_graph_dm05_5.tflite
INFO: Initialized TensorFlow Lite runtime.
loaded libcdsprpc.so
INFO: Created TensorFlow Lite delegate for Hexagon.
INFO: Hexagon delegate: 71 nodes delegated out of 71 nodes.

Applied Hexagon delegate, and the model graph will be completely executed w/ the delegate.
The input model file size (MB): 0.746232
Initialized session in 315.521ms.
Running benchmark for at least 1 iterations and at least 0.5 seconds but terminate if exceeding 150 seconds.

Timestamp: Tue Mar 24 18:13:02 2020

Log
hexagon/ops/src/op_argminmax_8_d32.c:119:argminmax_8_d32 out too small
hexagon/src/execute.c:142:execute() failed on node id=37b err=-1
hexagon/src/interface.c:1174:fail in execute_inner()

ERROR: Failed: Failed to execute graph.. STATE: FAILED_TO_EXECUTE_GRAPH
ERROR: Node number 71 (TfLiteHexagonDelegate) failed to invoke.

[the same "argminmax_8_d32 out too small" log block repeats for each of the 7 benchmark runs]

count=7 first=84641 curr=81095 min=79276 max=84641 avg=80591 std=1766

Benchmarking failed.
```

When I removed the argmax, the model worked with nearest-neighbor resizing. The model does not seem to work with an int32 or int64 argmax as the final layer, even though the operator is listed as supported by the Hexagon delegate. The final resize has output shape [513, 513], so I tried replacing the bilinear resize with nearest-neighbor resizing, but the benchmark still failed. I also verified the setup by successfully running a quantized mobilenet_v2 model. Models with both int32 and int64 argmax give the same error: argminmax_8_d32 out too small.
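The workaround described above (stripping the final ArgMax from the graph so the rest still runs on the DSP) implies computing the per-pixel argmax on the host from the model's logits output. A minimal NumPy sketch of that step; the helper name and the toy shapes are illustrative, not taken from the actual model:

```python
import numpy as np

def host_argmax(logits):
    """Compute the per-pixel class map on the CPU.

    logits: array of shape [1, H, W, num_classes], e.g. the DeepLab
    head's output once the final ArgMax op has been removed.
    Returns an int32 array of shape [1, H, W].
    """
    return np.argmax(logits, axis=-1).astype(np.int32)

# Tiny 2x2 "image" with 3 classes.
logits = np.array([[[[0.1, 0.7, 0.2],
                     [0.9, 0.05, 0.05]],
                    [[0.2, 0.2, 0.6],
                     [0.3, 0.4, 0.3]]]], dtype=np.float32)
seg_map = host_argmax(logits)
print(seg_map[0].tolist())  # [[1, 0], [2, 1]]
```

For quantized outputs the same call works on the raw uint8 logits, since argmax is invariant under the affine (scale/zero-point) dequantization.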

Hexagon library: v1.14

Describe the expected behavior
The benchmark model should run the quantized deeplab model without any problems.

Other info / logs
Android NDK: 20, Benchmark tool built from latest source with bazel 2.0

Here are the three models that I've tried:
quant_aware_deeplab_dm05_513.zip

@ravikyram ravikyram added the comp:lite TF Lite related issues label Mar 26, 2020
@ravikyram ravikyram assigned ymodak and unassigned ravikyram Mar 26, 2020
@anilsathyan7
Author

What does this error mean: argminmax_8_d32 out too small?
Is there any way I can make it work, e.g. by changing the shape or operator type?

@karimnosseir karimnosseir assigned karimnosseir and unassigned ymodak Mar 31, 2020
@karimnosseir
Contributor

I have a fix for the issue. Sorry for the trouble. Will be pushed soon.

Thanks


tensorflow-bot bot commented Apr 1, 2020

Are you satisfied with the resolution of your issue?

andrewxcav added a commit to andrewxcav/tensorflow that referenced this issue Apr 23, 2020
* 'master' of github.com:tensorflow/tensorflow: (125 commits, including)
  Fix incorrect type in argmin/max in hexagon delegate. Fixes tensorflow#37871
  ...
4 participants