
Exported Inference model has too many outputs for serving  #9366

@dietermaes

Description

Prerequisites

Please answer the following questions for yourself before submitting an issue.

  • [x] I am using the latest TensorFlow Model Garden release and TensorFlow 2.
  • [x] I am reporting the issue to the correct repository. (Model Garden official or research directory)
  • [x] I checked to make sure that this issue has not already been filed.

1. The entire URL of the file you are using

https://github.com/tensorflow/models/blob/master/research/object_detection/exporter_main_v2.py

2. Describe the bug

I've trained a model on a custom dataset using the Mask R-CNN with Inception ResNet v2 (no atrous) config, which gave good results.
After that I used exporter_main_v2.py to export the model for inference on TensorFlow Serving. The export itself worked, and the model server is able to load the model.
The problem occurs when I run inference against the served model: when I send an image of roughly 250 KB, it returns a payload of over 150 MB.

After investigation, it appears that exporting the model with exporter_main_v2.py produces many outputs that are not needed in my case:
"outputs": {
"refined_box_encodings": {
},
"mask_predictions": {
},
"final_anchors": {
},
"detection_classes": {
},
"num_detections": {
},
"class_predictions_with_background": {
},
"raw_detection_boxes": {
},
"proposal_boxes": {
},
"rpn_box_encodings": {
},
"box_classifier_features": {
},
"raw_detection_scores": {
},
"proposal_boxes_normalized": {
},
"detection_multiclass_scores": {
},
"num_proposals": {
},
"anchors": {
},
"detection_boxes": {
},
"image_shape": {
},
"rpn_objectness_predictions_with_background": {
},
"detection_scores": {
},
"detection_masks": {
},
"detection_anchor_indices": {
}
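
For reference, the signature can be confirmed directly from the exported SavedModel. A minimal sketch, assuming the default directory layout produced by exporter_main_v2.py:

import tensorflow as tf

# Assumed path: the saved_model directory created by exporter_main_v2.py.
loaded = tf.saved_model.load("exported/saved_model")
infer = loaded.signatures["serving_default"]

# Print every output key and its symbolic tensor, confirming the full set above.
for name, tensor in infer.structured_outputs.items():
    print(name, tensor)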

The docstring of exporter_main_v2.py states:

and the following output nodes returned by the model.postprocess(..):
  * `num_detections`: Outputs float32 tensors of the form [batch]
      that specifies the number of valid boxes per image in the batch.
  * `detection_boxes`: Outputs float32 tensors of the form
      [batch, num_boxes, 4] containing detected boxes.
  * `detection_scores`: Outputs float32 tensors of the form
      [batch, num_boxes] containing class scores for the detections.
  * `detection_classes`: Outputs float32 tensors of the form
      [batch, num_boxes] containing classes for the detections.

Is there a way to export the model so that it returns only these four outputs? Or, even better, could a list of desired outputs be supplied when exporting the model?

I found a similar question on Stack Overflow, but it has no answers yet:
https://stackoverflow.com/questions/64200782/tf-object-detection-return-subset-of-inference-payload
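
In case it helps others, one workaround I'm considering is to load the exported SavedModel, wrap it in a tf.Module whose serving function returns only the four tensors, and re-save it. A minimal, untested sketch; the paths, the input key input_tensor, and the uint8 input signature are assumptions based on the image_tensor input type:

import tensorflow as tf

loaded = tf.saved_model.load("exported/saved_model")  # assumed export path

class TrimmedDetector(tf.Module):
    """Wraps the full detection signature and exposes only four outputs."""

    def __init__(self, loaded_model):
        super().__init__()
        # Keep a reference to the loaded model so its variables stay tracked.
        self._model = loaded_model
        self._infer = loaded_model.signatures["serving_default"]

    @tf.function(input_signature=[tf.TensorSpec([1, None, None, 3], tf.uint8)])
    def serve(self, input_tensor):
        outputs = self._infer(input_tensor=input_tensor)
        keep = ("num_detections", "detection_boxes",
                "detection_scores", "detection_classes")
        return {k: outputs[k] for k in keep}

trimmed = TrimmedDetector(loaded)
tf.saved_model.save(trimmed, "exported_trimmed/saved_model",
                    signatures={"serving_default": trimmed.serve})

The re-saved model could then be served instead of the original export, which should shrink the response to just the four tensors.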

3. Steps to reproduce

Exporting a trained Mask R-CNN with Inception Resnet v2 (no atrous) model with the command:
python "${WORK_DIR}"/exporter_main_v2.py
--input_type image_tensor
--pipeline_config_path ${PIPELINE_CONFIG_PATH}
--trained_checkpoint_dir ${TRAIN_DIR}
--output_directory ${EXPORT_DIR} \

4. Expected behavior

Export a model that returns only the following outputs at inference time:

  • num_detections: Outputs float32 tensors of the form [batch]
    that specifies the number of valid boxes per image in the batch.
  • detection_boxes: Outputs float32 tensors of the form
    [batch, num_boxes, 4] containing detected boxes.
  • detection_scores: Outputs float32 tensors of the form
    [batch, num_boxes] containing class scores for the detections.
  • detection_classes: Outputs float32 tensors of the form
    [batch, num_boxes] containing classes for the detections.

5. Additional context

When calling the model over gRPC with default settings, I get a RESOURCE_EXHAUSTED error due to the large payload that is returned:
<_Rendezvous of RPC that terminated with:
status = StatusCode.RESOURCE_EXHAUSTED
details = "Received message larger than max (151678381 vs. 4194304)"
debug_error_string = "{"created":"@1602496041.618226876","description":"Received message larger than max (151678381 vs. 104857600)","file":"src/core/ext/filters/message_size/message_size_filter.cc","file_line":190,"grpc_status":8}"

This can be worked around by increasing the maximum message size on the gRPC channel, but then serving takes ages to respond because of the large payload.
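
For completeness, this is how the limit can be raised on the client side. A minimal sketch; the host, port, and model name are assumptions:

import grpc
from tensorflow_serving.apis import predict_pb2
from tensorflow_serving.apis import prediction_service_pb2_grpc

MAX_MESSAGE_BYTES = 200 * 1024 * 1024  # leave room for the ~150 MB response

channel = grpc.insecure_channel(
    "localhost:8500",  # assumed TensorFlow Serving gRPC endpoint
    options=[
        ("grpc.max_send_message_length", MAX_MESSAGE_BYTES),
        ("grpc.max_receive_message_length", MAX_MESSAGE_BYTES),
    ],
)
stub = prediction_service_pb2_grpc.PredictionServiceStub(channel)

request = predict_pb2.PredictRequest()
request.model_spec.name = "my_model"  # assumed model name
request.model_spec.signature_name = "serving_default"
# ... fill request.inputs["input_tensor"] with the image and call
# stub.Predict(request, timeout=60.0).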

6. System information

  • OS Platform and Distribution (e.g., Linux Ubuntu 16.04): Ubuntu 18.04 LTS
  • Mobile device name if the issue happens on a mobile device:
  • TensorFlow installed from (source or binary): source
  • TensorFlow version (use command below): 2.3.0
  • Python version: Python 3 (latest, via the OD Docker image)
  • Bazel version (if compiling from source): (via the OD Docker image)
  • GCC/Compiler version (if compiling from source): (via the OD Docker image)
  • CUDA/cuDNN version: (via the OD Docker image)
  • GPU model and memory: Nvidia K80
