object detection api toco model conversion problem #5298

Open
FreestylePocker opened this issue Sep 12, 2018 · 28 comments

@FreestylePocker
commented Sep 12, 2018

System information

  • What is the top-level directory of the model you are using: research/object_detection
  • Have I written custom code (as opposed to using a stock example script provided in TensorFlow): no
  • OS Platform and Distribution (e.g., Linux Ubuntu 16.04): Ubuntu 18.04
  • TensorFlow installed from (source or binary): source
  • TensorFlow version (use command below): 1.10.0
  • Bazel version (if compiling from source): 0.16.1
  • CUDA/cuDNN version: 9.2/7.2
  • GPU model and memory: GeForce GTX 970 4G
  • Exact command to reproduce:
bazel run --config=opt tensorflow/contrib/lite/toco:toco -- \
--input_file=$OUTPUT_DIR/tflite_graph.pb \
--output_file=$OUTPUT_DIR/detect.tflite \
--input_shapes=1,640,640,3 \
--input_arrays=normalized_input_image_tensor \
--output_arrays='TFLite_Detection_PostProcess','TFLite_Detection_PostProcess:1','TFLite_Detection_PostProcess:2','TFLite_Detection_PostProcess:3' \
--inference_type=QUANTIZED_UINT8 \
--mean_values=128 \
--std_values=128 \
--change_concat_input_ranges=false \
--allow_custom_ops
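
For reference, with --inference_type=QUANTIZED_UINT8 the --mean_values and --std_values flags describe how uint8 input values map to real values: real = (quantized - mean) / std. A minimal Python sketch of that mapping for the values used in the command above (the function name is only illustrative):

```python
def dequantize_input(quantized, mean=128.0, std=128.0):
    """Map a uint8 input value to the real value the model sees."""
    return (quantized - mean) / std


# Smallest, middle and largest uint8 values.
for q in (0, 128, 255):
    print(q, "->", dequantize_input(q))
```

With mean 128 and std 128, the uint8 range 0..255 corresponds to inputs normalized to roughly [-1, 1).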

Describe the problem

I am trying to create a fully quantized tflite model for inference.

While training this model from scratch on a custom dataset, I hit a problem related to #5139. As a workaround I increased the eval delay and restarted the process a few times, so that problem only slowed the training down.

The model finally finished training and works fine with the .pb file created by export_inference_graph.py.

To create the tflite file I followed these instructions: https://github.com/tensorflow/models/blob/master/research/object_detection/g3doc/running_on_mobile_tensorflowlite.md

object_detection/export_tflite_ssd_graph.py \
--pipeline_config_path=$CONFIG_FILE \
--trained_checkpoint_prefix=$CHECKPOINT_PATH \
--output_directory=$OUTPUT_DIR \
--add_postprocessing_op=true

This exported tflite_graph.pb without a problem.

But converting it to tflite with toco crashes:

bazel run --config=opt tensorflow/contrib/lite/toco:toco -- \
--input_file=$OUTPUT_DIR/tflite_graph.pb \
--output_file=$OUTPUT_DIR/detect.tflite \
--input_shapes=1,640,640,3 \
--input_arrays=normalized_input_image_tensor \
--output_arrays='TFLite_Detection_PostProcess','TFLite_Detection_PostProcess:1','TFLite_Detection_PostProcess:2','TFLite_Detection_PostProcess:3' \
--inference_type=QUANTIZED_UINT8 \
--mean_values=128 \
--std_values=128 \
--change_concat_input_ranges=false \
--allow_custom_ops

results in
tensorflow/contrib/lite/toco/graph_transformations/propagate_fixed_sizes.cc:116] Check failed: dim_x == dim_y (256 vs. 24)Dimensions must match

Source code / logs

toco log:

2018-09-13 05:07:46.154129: I tensorflow/contrib/lite/toco/import_tensorflow.cc:1055] Converting unsupported operation: TFLite_Detection_PostProcess                                                                                                                                      
2018-09-13 05:07:46.361651: I tensorflow/contrib/lite/toco/graph_transformations/graph_transformations.cc:39] Before Removing unused ops: 1992 operators, 2969 arrays (0 quantized)                                                                                                       
2018-09-13 05:07:46.429271: I tensorflow/contrib/lite/toco/graph_transformations/graph_transformations.cc:39] Before general graph transformations: 1992 operators, 2969 arrays (0 quantized)                                                                                             
2018-09-13 05:07:50.926104: F tensorflow/contrib/lite/toco/graph_transformations/propagate_fixed_sizes.cc:116] Check failed: dim_x == dim_y (256 vs. 24)Dimensions must match                                                                                                             
Emergency stop (memory stack is flushed to disk)
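
The failed check reads like a broadcasting rule: when two operand shapes are combined element-wise, each pair of dimensions must be equal unless one of them is 1. The sketch below is a simplified Python illustration of that rule, not toco's actual code, and the operand shapes are hypothetical because the log does not say which arrays were involved.

```python
def broadcast_shape(shape_x, shape_y):
    """Return the broadcast shape of two shapes, or raise ValueError."""
    rank = max(len(shape_x), len(shape_y))
    # Left-pad the shorter shape with 1s so both have the same rank.
    x = [1] * (rank - len(shape_x)) + list(shape_x)
    y = [1] * (rank - len(shape_y)) + list(shape_y)
    out = []
    for dim_x, dim_y in zip(x, y):
        if dim_x == 1:
            out.append(dim_y)
        elif dim_y == 1 or dim_x == dim_y:
            out.append(dim_x)
        else:
            raise ValueError(
                "Dimensions must match (%d vs. %d)" % (dim_x, dim_y))
    return out


# Hypothetical shapes chosen only to reproduce the numbers in the log.
print(broadcast_shape([1, 80, 80, 256], [256]))
try:
    broadcast_shape([1, 80, 80, 256], [24])
except ValueError as err:
    print("error:", err)
```

A trailing dimension of 256 can be combined with 256 or with 1, but not with 24, which is the situation the log reports.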

export_tflite_ssd_graph.py log:

2018-09-13 05:03:51.397799: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:964] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero                                                               
2018-09-13 05:03:51.398192: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1411] Found device 0 with properties:                                                                                                                                                                      
name: GeForce GTX 970 major: 5 minor: 2 memoryClockRate(GHz): 1.367                                                                                                                                                                                                                       
pciBusID: 0000:04:00.0                                                                                                                                                                                                                                                                    
totalMemory: 3.95GiB freeMemory: 3.88GiB                                                                                                                                                                                                                                                  
2018-09-13 05:03:51.398208: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1490] Adding visible gpu devices: 0                                                                                                                                                                        
2018-09-13 05:03:51.637059: I tensorflow/core/common_runtime/gpu/gpu_device.cc:971] Device interconnect StreamExecutor with strength 1 edge matrix:                                                                                                                                       
2018-09-13 05:03:51.637107: I tensorflow/core/common_runtime/gpu/gpu_device.cc:977]      0                                                                                                                                                                                                
2018-09-13 05:03:51.637115: I tensorflow/core/common_runtime/gpu/gpu_device.cc:990] 0:   N                                                                                                                                                                                                
2018-09-13 05:03:51.637295: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1103] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 3607 MB memory) -> physical GPU (device: 0, name: GeForce GTX 970, pci bus id: 0000:04:00.0, compute capability: 5.2)   
2018-09-13 05:03:55.605093: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1490] Adding visible gpu devices: 0                                                                                                                                                                        
2018-09-13 05:03:55.605145: I tensorflow/core/common_runtime/gpu/gpu_device.cc:971] Device interconnect StreamExecutor with strength 1 edge matrix:                                                                                                                                       
2018-09-13 05:03:55.605162: I tensorflow/core/common_runtime/gpu/gpu_device.cc:977]      0                                                                                                                                                                                                
2018-09-13 05:03:55.605168: I tensorflow/core/common_runtime/gpu/gpu_device.cc:990] 0:   N                                                                                                                                                                                                
2018-09-13 05:03:55.605283: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1103] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 3607 MB memory) -> physical GPU (device: 0, name: GeForce GTX 970, pci bus id: 0000:04:00.0, compute capability: 5.2)   
2018-09-13 05:03:57.218987: I tensorflow/tools/graph_transforms/transform_graph.cc:317] Applying strip_unused_nodes

config:

# SSD with Resnet 50 v1 FPN feature extractor, shared box predictor and focal
model {
  ssd {
    inplace_batchnorm_update: true
    freeze_batchnorm: false
    num_classes: 2
    box_coder {
      faster_rcnn_box_coder {
        y_scale: 10.0
        x_scale: 10.0
        height_scale: 5.0
        width_scale: 5.0
      }
    }
    matcher {
      argmax_matcher {
        matched_threshold: 0.5
        unmatched_threshold: 0.5
        ignore_thresholds: false
        negatives_lower_than_unmatched: true
        force_match_for_each_row: true
        use_matmul_gather: true
      }
    }
    similarity_calculator {
      iou_similarity {
      }
    }
    encode_background_as_zeros: true
    anchor_generator {
      multiscale_anchor_generator {
        min_level: 3
        max_level: 7
        anchor_scale: 4.0
        aspect_ratios: [1.0, 2.0, 0.5]
        scales_per_octave: 2
      }
    }
    image_resizer {
      fixed_shape_resizer {
        height: 640
        width: 640
      }
    }
    box_predictor {
      weight_shared_convolutional_box_predictor {
        depth: 256
        class_prediction_bias_init: -4.6
        conv_hyperparams {
          activation: RELU_6,
          regularizer {
            l2_regularizer {
              weight: 0.00004
            }
          }
          initializer {
            random_normal_initializer {
              stddev: 0.01
              mean: 0.0
            }
          }
          batch_norm {
            scale: true,
            center: true,
            train: true,
            decay: 0.97,
            epsilon: 0.001,
          }
        }
        num_layers_before_predictor: 4
        kernel_size: 3
      }
    }
    feature_extractor {
      type: 'ssd_resnet50_v1_fpn'
      fpn {
        min_level: 3
        max_level: 7
      }
      min_depth: 16
      depth_multiplier: 1.0
      conv_hyperparams {
        activation: RELU_6,
        regularizer {
          l2_regularizer {
            weight: 0.00004
          }
        }
        initializer {
          truncated_normal_initializer {
            stddev: 0.03
            mean: 0.0
          }
        }
        batch_norm {
          scale: true,
          center: true,
          decay: 0.97,
          epsilon: 0.001,
        }
      }
      override_base_feature_extractor_hyperparams: true
    }
    loss {
      classification_loss {
        weighted_sigmoid_focal {
          alpha: 0.25
          gamma: 2.0
        }
      }
      localization_loss {
        weighted_smooth_l1 {
        }
      }
      classification_weight: 1.0
      localization_weight: 1.0
    }
    normalize_loss_by_num_matches: true
    normalize_loc_loss_by_codesize: true
    post_processing {
      batch_non_max_suppression {
        score_threshold: 1e-8
        iou_threshold: 0.6
        max_detections_per_class: 100
        max_total_detections: 100
      }
      score_converter: SIGMOID
    }
  }
}

train_config: {
  batch_size: 1
  sync_replicas: true
  startup_delay_steps: 0
  replicas_to_aggregate: 1
  num_steps: 400000
  data_augmentation_options {
    random_rgb_to_gray {
      probability: 0.75
    }
  }
  data_augmentation_options {
    random_adjust_brightness {
    }
  }
  data_augmentation_options {
    random_adjust_contrast {
    }
  }
  optimizer {
    momentum_optimizer: {
      learning_rate: {
        cosine_decay_learning_rate {
          learning_rate_base: .005
          total_steps: 400000
          warmup_learning_rate: .0001
          warmup_steps: 1000
        }
      }
      momentum_optimizer_value: 0.9
    }
    use_moving_average: false
  }
  max_number_of_boxes: 100
  unpad_groundtruth_tensors: false
}

train_input_reader: {
  tf_record_input_reader {
    input_path: train.record
  }
  label_map_path: label_map.pbtxt
}

eval_config: {
  use_moving_averages: false
  num_examples: 213
  metrics_set: "coco_detection_metrics"
  eval_interval_secs: 300
  max_evals: 100
}

eval_input_reader: {
  tf_record_input_reader {
    input_path: eval.record
  }
  label_map_path: label_map.pbtxt
  shuffle: false
  num_readers: 1
}

graph_rewriter {
  quantization {
    delay: 0
    activation_bits: 8
    weight_bits: 8
  }
}
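
As a rough cross-check of the config above, the number of anchors can be estimated from the image size and the multiscale_anchor_generator settings. This sketch assumes that each level l has a feature map with stride 2**l and that every cell gets one anchor per (aspect ratio, scale) pair; the function name is illustrative.

```python
import math


def count_anchors(image_size=640, min_level=3, max_level=7,
                  num_aspect_ratios=3, scales_per_octave=2):
    """Return (anchors per cell, total anchors) under the stated assumptions."""
    per_cell = num_aspect_ratios * scales_per_octave
    total = 0
    for level in range(min_level, max_level + 1):
        side = math.ceil(image_size / 2 ** level)  # feature map side length
        total += side * side * per_cell
    return per_cell, total


per_cell, total = count_anchors()
print("anchors per cell:", per_cell)
print("box values per cell (4 per anchor):", per_cell * 4)
print("total anchors:", total)
```

Under these assumptions there are 6 anchors per cell, so a box head would emit 24 values per cell, and 256 is the box predictor depth. Those numbers match the ones in the failed check, but whether that is the actual cause is not confirmed anywhere in this thread.
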
@karmel

Member

commented Sep 14, 2018

@gargn -- can you take a look at the toco conversion error detailed above?

@gargn

Member

commented Sep 14, 2018

Adding @achowdhery who works on the Object Detection model.

@achowdhery

Contributor

commented Sep 14, 2018

FPN model support is still pending on our end. We have noted this feature request and will keep you updated on adding support for it in the next 4 weeks.

@jackweiwang

commented Oct 9, 2018

Hi @achowdhery, has the problem been solved?

@SiddhantKapil

Contributor

commented Oct 31, 2018

Hi @achowdhery, does FPN support toco conversion now?

@Vandmoon

commented Nov 11, 2018

Hi, @achowdhery! Does toco conversion support FPN now?
Actually, I am more concerned about the quantization of the upsampling operation. Every time I tried to quantize nearest_neighbor_upsampling, I got an error saying that mul is lacking min/max data.

@maxcrous

commented Jan 22, 2019

Any update on FPN model support @achowdhery?

When I train ssd_mobilenet_v1_fpn

  • I reach a low loss.
  • I am able to export_tflite_ssd_graph successfully
  • I am able to tflite_convert successfully

But when invoking the tflite model on mobile or in Python, I receive a Fatal signal 6 (SIGABRT).

The same happens when using the frozen_inference_graph supplied with the download from the model zoo.

Everything works when I use ssd_mobilenet_v1_coco instead of ssd_mobilenet_v1_fpn.

@achowdhery

Contributor

commented Jan 22, 2019

@maxcrous If you are able to visualize the TF Lite model after exporting, that would be extremely helpful in understanding and debugging the problem in the open source version. Please share the visualization (a TF Lite file can be visualized in the Netron app), or you can use this tool: https://github.com/tensorflow/tensorflow/blob/master/tensorflow/lite/tools/visualize.py

@maxcrous

commented Jan 22, 2019

@achowdhery, thank you for the quick reply.
Yes, I am able to visualize it with Netron.
Visually, it is very similar to ssd_mobilenet_v1_coco.

Link to Netron image for ssd_mobilenet_v1_coco:
https://ibb.co/Vxd2Xqf

Link to Netron image for ssd_mobilenet_v1_fpn:
https://ibb.co/JQPDRwj

@maxcrous

commented Jan 22, 2019

This is my system information:

OS Platform and Distribution: MacOS Mojave 10.14.1
TensorFlow installed from (source or binary): binary
TensorFlow version (use command below): 1.12.0
All processes run on CPU.

@achowdhery

Contributor

commented Jan 22, 2019

In the export script, can you please try turning off the addition of the postprocessing op in the FPN model, to see whether the SIGABRT is in the main graph or in the postprocessing op?

@maxcrous

commented Jan 22, 2019

When setting add_postprocessing_op to False, export_tflite_ssd_graph.py succeeds.
This is on the model.ckpt and pipeline.config supplied with the ssd_mobilenet_v1_fpn from the model zoo.

The model is then successfully converted to a tflite model with the following command.

tflite_convert \
--graph_def_file=tflite_graph.pb \
--output_file=detect.tflite \
--input_shapes=1,640,640,3 \
--input_arrays=normalized_input_image_tensor \
--output_arrays='concat_1' \
--inference_type=FLOAT \
--mean_values=128 \
--std_dev_values=128 \
--allow_custom_ops

The resulting tflite model still produces a Fatal signal 6 (SIGABRT).
The code I use to test the model can be found here:
https://bit.ly/2HqRPMi

The Netron image for the tflite model can be found here:
https://ibb.co/5LRwpV6

@maxcrous

commented Jan 22, 2019

When using tensorflow 1.11.0 (same result as tensorflow 1.12.0)

  • export_tflite_ssd_graph is successful
  • tflite_convert is successful
  • SIGABRT on invoke

As stated in issue #4826, when using tensorflow 1.10.0

  • export_tflite_ssd_graph is successful
  • tflite_convert throws the error:
RuntimeError: TOCO failed see console for info.
b'2019-01-22 16:34:31.145095: F tensorflow/contrib/lite/toco/import_tensorflow.cc:218] Check failed: input_shape.dim_size() <= 4 (6 vs. 4)\n'
None
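
One possible source of the 6-D shape in this error is the way nearest-neighbor upsampling is built in FPN models. Assuming (this is not confirmed in this thread) that the Object Detection API reshapes the 4-D feature map to 6-D, multiplies it by a tensor of ones so that broadcasting repeats each pixel, and then reshapes back to 4-D, the intermediate shapes would look like this shape-only Python sketch:

```python
def upsampling_shapes(input_shape, scale):
    """Shapes produced by a reshape -> multiply-by-ones -> reshape upsampling."""
    batch, height, width, channels = input_shape
    reshaped = [batch, height, 1, width, 1, channels]
    ones = [1, 1, scale, 1, scale, 1]
    # In every position at least one side is 1, so the broadcast
    # result is simply the larger of the two dimensions.
    product = [max(a, b) for a, b in zip(reshaped, ones)]
    output = [batch, height * scale, width * scale, channels]
    return reshaped, ones, product, output


# Hypothetical FPN feature map: 20x20 with 256 channels, upsampled 2x.
names = ("reshaped", "ones", "product", "output")
for name, shape in zip(names, upsampling_shapes([1, 20, 20, 256], 2)):
    print(name, shape, "rank", len(shape))
```

The operands of the multiplication have rank 6, which would trip a converter check that allows at most 4 dimensions.
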
@achowdhery

Contributor

commented Jan 22, 2019

Yes, it was probably not possible to convert this until v1.12. I am still trying to understand why there is a SIGABRT if it converts. Please attach or email the frozen graph and tflite file. The converted tflite file should run.

@maxcrous

commented Jan 22, 2019

Here is the model straight out of the model zoo after export_tflite_ssd_graph and tflite_convert, run with the arguments mentioned in previous posts.
https://drive.google.com/file/d/1TLGi9KqpYdAp86sc01c1GNBSA1C2a1VP/view?usp=sharing

Thanks again for the help.

@achowdhery

Contributor

commented Jan 22, 2019

Please also add the tflite_convert command you used. I will try to repro the bug. And please add the stack trace for SIGABRT.
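
One way to capture more detail for a crash like this from Python is to run the test script in a child process with the standard faulthandler module enabled, so that the parent survives the abort and can report the signal and whatever was printed to stderr. A minimal sketch for POSIX systems; the script name run_tflite.py in the usage comment is hypothetical:

```python
import signal
import subprocess
import sys


def run_and_report(child_args):
    """Run `python -X faulthandler <child_args>` and describe how it ended."""
    proc = subprocess.run(
        [sys.executable, "-X", "faulthandler"] + list(child_args),
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        universal_newlines=True,
    )
    killed_by = None
    if proc.returncode < 0:
        # On POSIX a negative return code means "terminated by that signal".
        killed_by = signal.Signals(-proc.returncode).name
    return {
        "returncode": proc.returncode,
        "signal": killed_by,
        "stderr": proc.stderr,
    }


# Usage (hypothetical script that loads the model and calls invoke()):
#   report = run_and_report(["run_tflite.py"])
#   print(report["signal"])
#   print(report["stderr"])
```

faulthandler only prints the Python-level traceback; a native stack trace of the TF Lite runtime still needs a debugger such as gdb or lldb.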

@maxcrous

commented Jan 22, 2019

For export_tflite_ssd_graph I use:

export mobilenet_fpn=ssd_mobilenet_v1_fpn_shared_box_predictor_640x640_coco14_sync_2018_07_03

python3 object_detection/export_tflite_ssd_graph.py    \
--pipeline_config_path /..../$mobilenet_fpn/pipeline.config    \
--trained_checkpoint_prefix /..../$mobilenet_fpn/model.ckpt    \
--output_directory /..../$mobilenet_fpn/output \
--add_postprocessing_op=False

Do note that the postprocessing operations have been left out.

Then I cd into the previous command's output directory, and for tflite_convert I use:

tflite_convert \
--graph_def_file=tflite_graph.pb \
--output_file=detect.tflite \
--input_shapes=1,640,640,3 \
--input_arrays=normalized_input_image_tensor \
--output_arrays='concat_1' \
--inference_type=FLOAT \
--mean_values=128 \
--std_dev_values=128 \
--allow_custom_ops

The stack trace for the SIGABRT can be found here:
https://drive.google.com/open?id=1RdE5861tXiWBF3lxaGgwfWss3lJK-W_3

@achowdhery

Contributor

commented Jan 22, 2019

The bug seems to be with the Mul op. We will look into this in the next few days. We sincerely appreciate you reporting it.

@hxtkyne

commented Jan 28, 2019

Did you solve the problem, @maxcrous?

@maxcrous

commented Jan 28, 2019

Hey @hxtkyne, I don't have any knowledge of the tflite conversion process, so we will have to wait for the good people at Tensorflow to fix this one.
In the meantime I'm using the ssd_mobilenet_v1_coco for mobile deployment, which works ok for my problem.

@oopsodd

commented Feb 11, 2019

I used the TF object detection API to train ssd_resnet_50_fpn_coco with a 50-class dataset.
Everything is okay with the frozen model.
The checkpoint was converted successfully using this command:

bazel run -c opt tensorflow/lite/toco:toco -- \
  --input_file=$OUTPUT_DIR/tflite_graph.pb \
  --output_file=$OUTPUT_DIR/detect.tflite \
  --input_shapes=1,640,640,3 \
  --input_arrays=normalized_input_image_tensor \
  --output_arrays='TFLite_Detection_PostProcess','TFLite_Detection_PostProcess:1','TFLite_Detection_PostProcess:2','TFLite_Detection_PostProcess:3'  \
  --inference_type=FLOAT \
  --mean_values=128 \
  --std_values=128 \
  --change_concat_input_ranges=false \
  --allow_custom_ops
(ubuntu 16, latest tensorflow, models repo, tf-nightly)

But the tflite model detects the wrong class and bbox. All the output classes are the same (1 class).
The tflite model takes 3s per image for inference on a Galaxy S9 (same as the frozen model).
Does TFLite support ssd_resnet_50_fpn_coco?

@joeyM1997

commented Feb 14, 2019

Ran into this yesterday. Anybody know the progress on this?

@holyhao

commented Feb 15, 2019

@oopsodd I used the weight_shared_convolutional_box_predictor in ppnnet, and the tflite model also detects the wrong class and bbox. All the output classes are the same (1 class) too. I wonder whether the convert tool supports weight_shared_convolutional_box_predictor well?

@sarmadidrees

commented Feb 27, 2019

@achowdhery any update on the FPN model support?

@AliceDinh

commented Mar 15, 2019

Still waiting for good news on FPN model support. Has anyone got an update?

@dkashkin

commented Apr 29, 2019

Please make it a priority to add FPN support! Everybody needs this.

@yjfncu

commented May 10, 2019

@oopsodd I also used the TF object detection API to train an ssd-mobilenet_v2 model, and used export_tflite_ssd_graph.py to convert the checkpoint to a .pb file, which works well. But when I use bazel run --config=opt tensorflow/lite to convert the .pb to .tflite, I get some errors. Do I need to compile TensorFlow from source with Bazel before I can use this command to convert the .pb file to tflite? And how did you compile TensorFlow with Bazel? Thank you.

@thusinh1969

commented May 19, 2019

> Any update on FPN model support @achowdhery?
>
> When I train ssd_mobilenet_v1_fpn
>
>   • I reach a low loss.
>   • I am able to export_tflite_ssd_graph successfully
>   • I am able to tflite_convert successfully
>
> But when invoking the tflite model on mobile or in Python, I receive a Fatal signal 6 (SIGABRT).
>
> The same happens when using the frozen_inference_graph supplied with the download from the model zoo.
>
> Everything works when I use ssd_mobilenet_v1_coco instead of ssd_mobilenet_v1_fpn.

I have exactly the same problem with ssd fpn mobile: same setup, same command line. I use Android Studio and it keeps crashing with the same SIGABRT error, while ssd_mobile_v2 and ssd_inception_v2 both work fine in both FLOAT and QUANTIZED_UINT8 modes.
