Description
System information
- Have I written custom code (as opposed to using a stock example script provided in TensorFlow): Yes
- OS Platform and Distribution: Linux Ubuntu 18.04
- Mobile device if the issue happens on mobile device: Pixel 3 XL
- TensorFlow installed from (source or binary): source
- TensorFlow version (use command below): r2.2.0 and master
- Python version: 3.6
- Bazel version (if compiling from source): 3.0.0
- GCC/Compiler version (if compiling from source): default
- CUDA/cuDNN version: 10.0
- GPU model and memory:
Describe the current behavior
The GPU delegate gives a very different result from the CPU.
I hard-coded the source (in tensorflow/lite/delegates/gpu/common/model_builder.cc) so that only some operations are delegated to the GPU. Out of 270+ operations:
- Delegating just one Conv_2d produces a result very similar to the CPU-only result.
- Delegating a few more operations seems to produce a bigger difference.
- Delegating just the first MUL operation produces a very different result.
Describe the expected behavior
The GPU delegate result should be close to the CPU result.
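To make "close" measurable, here is a minimal sketch of one way to compare the two output vectors. It is not the code used for this report. The function name `compare_outputs` and the sample values are hypothetical, and it assumes the CPU and GPU outputs have already been dumped as flat lists of floats of equal length. Since the model produces a face embedding, cosine similarity is reported alongside the maximum absolute difference.

```python
import math


def compare_outputs(cpu, gpu):
    """Return (max absolute difference, cosine similarity) for two vectors."""
    if len(cpu) != len(gpu):
        raise ValueError("output vectors must have the same length")
    if not cpu:
        raise ValueError("output vectors must not be empty")

    # Largest element-wise deviation between the two runs.
    max_abs_diff = max(abs(c - g) for c, g in zip(cpu, gpu))

    # Cosine similarity: 1.0 means the embeddings point the same way.
    dot = sum(c * g for c, g in zip(cpu, gpu))
    norm_cpu = math.sqrt(sum(c * c for c in cpu))
    norm_gpu = math.sqrt(sum(g * g for g in gpu))
    if norm_cpu == 0.0 or norm_gpu == 0.0:
        cosine = float("nan")  # undefined for an all-zero vector
    else:
        cosine = dot / (norm_cpu * norm_gpu)

    return max_abs_diff, cosine


if __name__ == "__main__":
    # Placeholder values only, not real outputs from the model.
    cpu_out = [0.10, -0.20, 0.30, 0.40]
    gpu_out = [0.11, -0.19, 0.29, 0.41]
    max_diff, cos = compare_outputs(cpu_out, gpu_out)
    print(f"max abs diff = {max_diff:.4f}, cosine similarity = {cos:.4f}")
```

An acceptable threshold depends on the precision the GPU delegate runs at, so none is hard-coded here.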
Standalone code to reproduce the issue
I modified the TFLite tools/benchmark code to feed a sample image as deterministic input instead of random input.
I would be happy to provide my code changes if that helps.
Other info / logs
The TFLite model was converted from the InsightFace/ArcFace MXNet model:
https://github.com/deepinsight/insightface/wiki/Model-Zoo (3.2 model)
Link to download the TFLite model:
https://drive.google.com/file/d/1pJX2I8btskVy-QHiF-mcUF7HF6ZFaQuf/view?usp=sharing
The graph of the above model is also attached:
visualized_official_arcface_no_sub.zip