
TFLite of ssd_mobilenet_v1_fpn produces a mess while the pb version infers well #7475

Closed
Jumpool opened this issue Aug 20, 2019 · 8 comments
Labels: models:research (models that come under research directory)

@Jumpool

Jumpool commented Aug 20, 2019

System information
What is the top-level directory of the model you are using: research/object_detection
Have I written custom code (as opposed to using a stock example script provided in TensorFlow): no
OS Platform and Distribution (e.g., Linux Ubuntu 16.04): Ubuntu 16.04.1 LTS
TensorFlow installed from (source or binary): source
TensorFlow version (use command below): 1.14.0
Bazel version (if compiling from source): 0.28.0
CUDA/cuDNN version: 10.0.130/NO
GPU model and memory: Tesla V100-PCIE 16G
Exact command to reproduce:
I followed the official TensorFlow instructions:
Step 1: generate tflite_graph.pb:
cd models/research
python export_tflite_ssd_graph.py \
  --pipeline_config_path=object_detection/fpn_training/ssd_mobilenet_v1_fpn_coco.config \
  --trained_checkpoint_prefix=object_detection/fpn_training/model.ckpt-25040 \
  --output_directory=object_detection/fpn_training/output \
  --add_postprocessing_op=true
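
As a quick sanity check between the two steps, the exported graph can be inspected for the expected input and post-processing nodes (a minimal sketch, assuming TF 1.x; the path follows --output_directory above):

import tensorflow as tf

graph_def = tf.GraphDef()
with tf.gfile.GFile("object_detection/fpn_training/output/tflite_graph.pb", "rb") as f:
    graph_def.ParseFromString(f.read())
# The export should contain the input tensor and the custom NMS op.
print([n.name for n in graph_def.node
       if n.name == "normalized_input_image_tensor"
       or "TFLite_Detection_PostProcess" in n.name])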

Step 2: generate detect.tflite:
toco --graph_def_file=tflite_graph.pb \
  --output_file=detect.tflite \
  --input_shapes=1,640,640,3 \
  --input_arrays=normalized_input_image_tensor \
  --output_arrays='TFLite_Detection_PostProcess','TFLite_Detection_PostProcess:1','TFLite_Detection_PostProcess:2','TFLite_Detection_PostProcess:3' \
  --inference_type=QUANTIZED_UINT8 \
  --mean_values=128 \
  --std_dev_values=128 \
  --change_concat_input_ranges=false \
  --default_ranges_min=0 \
  --default_ranges_max=6 \
  --allow_custom_ops
Describe the problem
I trained an ssd_mobilenet_v1 model and converted it to TFLite, and it infers well; each inference takes less than 100 milliseconds.
I trained ssd_mobilenet_v1_fpn and generated a pb model with export_inference_graph.py, and it also infers well. The only problem is that it takes around 5 seconds to process an image.
I then tried to generate a TFLite model with the two steps above. The .tflite file was generated successfully, but the inference results are a mess.
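
For reference, the logs below were produced by a script along these lines (a minimal sketch, assuming TF 1.14; the model and image paths are placeholders):

import numpy as np
import tensorflow as tf
from PIL import Image

interpreter = tf.lite.Interpreter(model_path="detect.tflite")
interpreter.allocate_tensors()
input_details = interpreter.get_input_details()
output_details = interpreter.get_output_details()
print("input_details:", input_details)
print("output_details:", output_details)

# The quantized model expects uint8 input of shape [1, 640, 640, 3].
image = Image.open("test.jpg").resize((640, 640))
input_data = np.expand_dims(np.asarray(image, dtype=np.uint8), axis=0)
interpreter.set_tensor(input_details[0]["index"], input_data)
interpreter.invoke()

# TFLite_Detection_PostProcess outputs: boxes, classes, scores, num_detections.
boxes = interpreter.get_tensor(output_details[0]["index"])
classes = interpreter.get_tensor(output_details[1]["index"])
scores = interpreter.get_tensor(output_details[2]["index"])
num = interpreter.get_tensor(output_details[3]["index"])
print("detection_boxes:", boxes)
print("detection_scores:", scores)
print("detection_classes:", classes.astype(np.int64))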
Source code / logs
config:
model {
ssd {
num_classes: 5
image_resizer {
fixed_shape_resizer {
height: 640
width: 640
}
}
feature_extractor {
type: "ssd_mobilenet_v1_fpn"
depth_multiplier: 1.0
min_depth: 16
conv_hyperparams {
regularizer {
l2_regularizer {
weight: 3.99999989895e-05
}
}
initializer {
random_normal_initializer {
mean: 0.0
stddev: 0.00999999977648
}
}
activation: RELU_6
batch_norm {
decay: 0.996999979019
scale: true
epsilon: 0.0010000000475
}
}
override_base_feature_extractor_hyperparams: true
}
box_coder {
faster_rcnn_box_coder {
y_scale: 10.0
x_scale: 10.0
height_scale: 5.0
width_scale: 5.0
}
}
matcher {
argmax_matcher {
matched_threshold: 0.5
unmatched_threshold: 0.5
ignore_thresholds: false
negatives_lower_than_unmatched: true
force_match_for_each_row: true
use_matmul_gather: true
}
}
similarity_calculator {
iou_similarity {
}
}
box_predictor {
weight_shared_convolutional_box_predictor {
conv_hyperparams {
regularizer {
l2_regularizer {
weight: 3.99999989895e-05
}
}
initializer {
random_normal_initializer {
mean: 0.0
stddev: 0.00999999977648
}
}
activation: RELU_6
batch_norm {
decay: 0.996999979019
scale: true
epsilon: 0.0010000000475
}
}
depth: 256
num_layers_before_predictor: 4
kernel_size: 3
class_prediction_bias_init: -4.59999990463
}
}
anchor_generator {
multiscale_anchor_generator {
min_level: 3
max_level: 7
anchor_scale: 4.0
aspect_ratios: 1.0
aspect_ratios: 2.0
aspect_ratios: 0.5
scales_per_octave: 2
}
}
post_processing {
batch_non_max_suppression {
score_threshold: 0.300000011921
iou_threshold: 0.600000023842
max_detections_per_class: 100
max_total_detections: 100
}
score_converter: SIGMOID
}
normalize_loss_by_num_matches: true
loss {
localization_loss {
weighted_smooth_l1 {
}
}
classification_loss {
weighted_sigmoid_focal {
gamma: 2.0
alpha: 0.25
}
}
classification_weight: 1.0
localization_weight: 1.0
}
encode_background_as_zeros: true
normalize_loc_loss_by_codesize: true
inplace_batchnorm_update: true
freeze_batchnorm: false
}
}
train_config {
batch_size: 24
data_augmentation_options {
random_horizontal_flip {
}
}
data_augmentation_options {
random_crop_image {
min_object_covered: 0.0
min_aspect_ratio: 0.75
max_aspect_ratio: 3.0
min_area: 0.75
max_area: 1.0
overlap_thresh: 0.0
}
}
sync_replicas: true
optimizer {
momentum_optimizer {
learning_rate {
cosine_decay_learning_rate {
learning_rate_base: 0.0799999982119
total_steps: 12500
warmup_learning_rate: 0.0266660004854
warmup_steps: 1000
}
}
momentum_optimizer_value: 0.899999976158
}
use_moving_average: false
}
fine_tune_checkpoint: "object_detection/fpn_training/model.ckpt-22174"
from_detection_checkpoint: true
num_steps: 30000
startup_delay_steps: 0.0
replicas_to_aggregate: 8
max_number_of_boxes: 100
unpad_groundtruth_tensors: false
}
train_input_reader {
label_map_path: "object_detection/data/label_map.pbtxt"
tf_record_input_reader {
input_path: "object_detection/data/train.record"
}
}
eval_config {
num_examples: 490
metrics_set: "coco_detection_metrics"
use_moving_averages: false
}
eval_input_reader {
label_map_path: "object_detection/data/label_map.pbtxt"
shuffle: false
num_readers: 1
tf_record_input_reader {
input_path: "object_detection/data/val.record"
}
}

@Jumpool
Author

Jumpool commented Aug 20, 2019

Visualization of the ssd_mobilenet_v1_fpn TFLite model's detections:

https://ibb.co/fXkrtr7

@Jumpool
Author

Jumpool commented Aug 21, 2019

Log of SSD_Mobilenet_v1 inference

INFO: Initialized TensorFlow Lite runtime.
input_details: [{'name': 'normalized_input_image_tensor', 'index': 175, 'shape': array([ 1, 300, 300, 3]), 'dtype': <class 'numpy.uint8'>, 'quantization': (0.0078125, 128)}]
output_details: [{'name': 'TFLite_Detection_PostProcess', 'index': 167, 'shape': array([ 1, 10, 4]), 'dtype': <class 'numpy.float32'>, 'quantization': (0.0, 0)}, {'name': 'TFLite_Detection_PostProcess:1', 'index': 168, 'shape': array([ 1, 10]), 'dtype': <class 'numpy.float32'>, 'quantization': (0.0, 0)}, {'name': 'TFLite_Detection_PostProcess:2', 'index': 169, 'shape': array([ 1, 10]), 'dtype': <class 'numpy.float32'>, 'quantization': (0.0, 0)}, {'name': 'TFLite_Detection_PostProcess:3', 'index': 170, 'shape': array([1]), 'dtype': <class 'numpy.float32'>, 'quantization': (0.0, 0)}]
enter inference: 2019-08-21 07:43:02.878540
real_num_detection:10
detection_boxes:[[ 0.11452022 0.16612452 0.8792511 0.8888673 ]
[-0.17423266 0.18975034 1.168004 0.8608687 ]
[ 0.026932 -0.04624186 0.2779865 0.45064378]
[ 0.44972825 0.11615509 0.7661195 0.2581224 ]
[ 0.21767046 -0.04448819 0.4792601 0.5620995 ]
[-0.0215684 -0.12464276 0.17847647 0.48194495]
[ 0.48397958 -0.04588143 0.6259469 0.24618444]
[ 0.00209561 0.03837673 0.55059004 0.45782566]
[ 0.24198976 0.54873335 0.33878568 0.6858511 ]
[ 0.26620382 0.1350857 0.6372247 0.63667 ]]
detection_scores:[0.99609375 0.9140625 0.5 0.5 0.5 0.5
 0.5 0.5 0.5 0.5 ]
detection_classes:[1 1 0 0 0 0 0 0 0 0]
Exit inference: 2019-08-21 07:43:03.503393

@Jumpool
Author

Jumpool commented Aug 21, 2019

"Log of SSD_Mobilenet_v1_fpn inference for the same image, but result is a mess"
detection_classes output is always 0. (should be 1);
Inference time is much longer than SSD_Mobilenet_v1. It takes around 23 seconds :-(

INFO: Initialized TensorFlow Lite runtime.
input_details: [{'name': 'normalized_input_image_tensor', 'index': 374, 'shape': array([ 1, 640, 640, 3]), 'dtype': <class 'numpy.uint8'>, 'quantization': (0.0078125, 128)}]
output_details: [{'name': 'TFLite_Detection_PostProcess', 'index': 116, 'shape': array([ 1, 10, 4]), 'dtype': <class 'numpy.float32'>, 'quantization': (0.0, 0)}, {'name': 'TFLite_Detection_PostProcess:1', 'index': 117, 'shape': array([ 1, 10]), 'dtype': <class 'numpy.float32'>, 'quantization': (0.0, 0)}, {'name': 'TFLite_Detection_PostProcess:2', 'index': 118, 'shape': array([ 1, 10]), 'dtype': <class 'numpy.float32'>, 'quantization': (0.0, 0)}, {'name': 'TFLite_Detection_PostProcess:3', 'index': 119, 'shape': array([1]), 'dtype': <class 'numpy.float32'>, 'quantization': (0.0, 0)}]
enter inference: 2019-08-21 07:45:03.126329
real_num_detection:10
detection_boxes:[[0.22905463 0.59362036 0.30417538 0.73793405]
[0.35495275 0.5050981 0.6373057 0.6494118 ]
[0.6169615 0.17568628 0.68598115 0.21333335]
[0.9244125 0.7780393 0.99343216 0.8156863 ]
[0.7266659 0.69960785 0.7686164 0.7686275 ]
[0.50823534 0.64627457 0.5584314 0.6964706 ]
[0.64 0.69019616 0.7403922 0.79058826]
[0.29052457 0.57725495 0.3595442 0.61490196]
[0.6713726 0.44549024 0.72156864 0.4956863 ]
[0.34509808 0.8407844 0.39529413 0.8909804 ]]
detection_scores:[0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5]
detection_classes:[0 0 0 0 0 0 0 0 0 0]
Exit inference: 2019-08-21 07:45:26.710481

@gowthamkpr added the models:research (models that come under research directory) label on Sep 10, 2019
@srjoglekar246
Contributor

I think the fpn model you are trying to convert is floating-point, but the command you are running assumes quantized inference. Can you try toco with params like these:

bazel run --config=opt tensorflow/lite/toco:toco -- \
--input_file=$DIR/tflite_graph.pb \
--output_file=$DIR/detect.tflite \
--input_shapes=1,300,300,3 \
--input_arrays=normalized_input_image_tensor \
--output_arrays='TFLite_Detection_PostProcess','TFLite_Detection_PostProcess:1','TFLite_Detection_PostProcess:2','TFLite_Detection_PostProcess:3'  \
--inference_type=FLOAT \
--allow_custom_ops
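
For reference, a roughly equivalent conversion in Python (a sketch, assuming a TF 1.x installation; the paths are placeholders, and the input shape follows the 640x640 fixed_shape_resizer in the config above):

import tensorflow as tf

converter = tf.lite.TFLiteConverter.from_frozen_graph(
    graph_def_file="tflite_graph.pb",
    input_arrays=["normalized_input_image_tensor"],
    output_arrays=[
        "TFLite_Detection_PostProcess",
        "TFLite_Detection_PostProcess:1",
        "TFLite_Detection_PostProcess:2",
        "TFLite_Detection_PostProcess:3",
    ],
    input_shapes={"normalized_input_image_tensor": [1, 640, 640, 3]},
)
converter.allow_custom_ops = True  # keep the custom NMS post-processing op
with open("detect.tflite", "wb") as f:
    f.write(converter.convert())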

@kangkang59812

Why are there negative values in the bounding boxes?

@srjoglekar246
Contributor

@Jumpool were you able to convert with the floating-point command? I am closing this for now, feel free to re-open if you are still facing errors.
@kangkang59812 The negative values might signify rounding errors; you can just treat them as 0.0.
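
For example (a minimal NumPy sketch; detection_boxes stands for the [1, 10, 4] boxes tensor from the interpreter, holding normalized [ymin, xmin, ymax, xmax] coordinates):

import numpy as np

# Rounding can push normalized coordinates slightly outside [0, 1];
# clamping them back into range is enough.
detection_boxes = np.clip(detection_boxes, 0.0, 1.0)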

@kangkang59812

kangkang59812 commented Nov 21, 2019

@srjoglekar246 Sorry to bother you again. When I convert the .pb to .tflite with a command like yours, I get a warning that the post_processing operation is unsupported during conversion. The .pb model works well on PC, but the .tflite results on Android are different: apart from the top-1 result, the bounding boxes differ from those on PC. Only the top-1 is right. How did you deal with this post-processing?

@srjoglekar246
Contributor

If you ran the following command:

python export_tflite_ssd_graph.py \
  --pipeline_config_path=object_detection/fpn_training/ssd_mobilenet_v1_fpn_coco.config \
  --trained_checkpoint_prefix=object_detection/fpn_training/model.ckpt-25040 \
  --output_directory=object_detection/fpn_training/output \
  --add_postprocessing_op=true

Then the graph should contain the post-processing op as expected.
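
One way to confirm the op survived conversion is to list the outputs of the resulting .tflite (a minimal sketch, assuming a TF 1.x TFLite runtime; the model path is a placeholder):

import tensorflow as tf

interpreter = tf.lite.Interpreter(model_path="detect.tflite")
interpreter.allocate_tensors()
# With --add_postprocessing_op=true and --allow_custom_ops, the four
# TFLite_Detection_PostProcess tensors should appear among the outputs.
for detail in interpreter.get_output_details():
    print(detail["name"], detail["shape"])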
