
Exported Inference model has too many outputs for serving  #9366

@dietermaes

Description

Prerequisites

Please answer the following questions for yourself before submitting an issue.

  • [x] I am using the latest TensorFlow Model Garden release and TensorFlow 2.
  • [x] I am reporting the issue to the correct repository. (Model Garden official or research directory)
  • [x] I checked to make sure that this issue has not already been filed.

1. The entire URL of the file you are using

https://github.com/tensorflow/models/blob/master/research/object_detection/exporter_main_v2.py

2. Describe the bug

I've trained a model on a custom dataset using the Mask R-CNN with Inception ResNet v2 (no atrous) config, which gave good results.
After that I used exporter_main_v2.py to export the model for inference on TensorFlow Serving. The export itself worked, and the model server is able to load the model.
The problem occurs when I run inference against the served model: when I send an image of roughly 250 KB, it returns a payload of over 150 MB.

After investigation, it appears that exporting the model with exporter_main_v2.py produces many outputs that are not needed in my case:
"outputs": {
"refined_box_encodings": {
},
"mask_predictions": {
},
"final_anchors": {
},
"detection_classes": {
},
"num_detections": {
},
"class_predictions_with_background": {
},
"raw_detection_boxes": {
},
"proposal_boxes": {
},
"rpn_box_encodings": {
},
"box_classifier_features": {
},
"raw_detection_scores": {
},
"proposal_boxes_normalized": {
},
"detection_multiclass_scores": {
},
"num_proposals": {
},
"anchors": {
},
"detection_boxes": {
},
"image_shape": {
},
"rpn_objectness_predictions_with_background": {
},
"detection_scores": {
},
"detection_masks": {
},
"detection_anchor_indices": {
}
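
For reference, the signature can be confirmed directly from the exported SavedModel. A minimal sketch, assuming the default directory layout produced by exporter_main_v2.py:

import tensorflow as tf

# Assumed path: the saved_model directory created by exporter_main_v2.py.
loaded = tf.saved_model.load("exported/saved_model")
infer = loaded.signatures["serving_default"]

# Print every output key and its symbolic tensor, confirming the full set above.
for name, tensor in infer.structured_outputs.items():
    print(name, tensor)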

The docstring of exporter_main_v2.py states:

and the following output nodes returned by the model.postprocess(..):
  * `num_detections`: Outputs float32 tensors of the form [batch]
      that specifies the number of valid boxes per image in the batch.
  * `detection_boxes`: Outputs float32 tensors of the form
      [batch, num_boxes, 4] containing detected boxes.
  * `detection_scores`: Outputs float32 tensors of the form
      [batch, num_boxes] containing class scores for the detections.
  * `detection_classes`: Outputs float32 tensors of the form
      [batch, num_boxes] containing classes for the detections.

Is there a way to export the model so that it returns only these four outputs? Or, even better, could a list of desired outputs be supplied when exporting the model?

I found a similar question on Stack Overflow, but it has no answers yet:
https://stackoverflow.com/questions/64200782/tf-object-detection-return-subset-of-inference-payload
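
In case it helps others, one workaround I'm considering is to load the exported SavedModel, wrap it in a tf.Module whose serving function returns only the four tensors, and re-save it. A minimal, untested sketch; the paths, the input key input_tensor, and the uint8 input signature are assumptions based on the image_tensor input type:

import tensorflow as tf

loaded = tf.saved_model.load("exported/saved_model")  # assumed export path

class TrimmedDetector(tf.Module):
    """Wraps the full detection signature and exposes only four outputs."""

    def __init__(self, loaded_model):
        super().__init__()
        # Keep a reference to the loaded model so its variables stay tracked.
        self._model = loaded_model
        self._infer = loaded_model.signatures["serving_default"]

    @tf.function(input_signature=[tf.TensorSpec([1, None, None, 3], tf.uint8)])
    def serve(self, input_tensor):
        outputs = self._infer(input_tensor=input_tensor)
        keep = ("num_detections", "detection_boxes",
                "detection_scores", "detection_classes")
        return {k: outputs[k] for k in keep}

trimmed = TrimmedDetector(loaded)
tf.saved_model.save(trimmed, "exported_trimmed/saved_model",
                    signatures={"serving_default": trimmed.serve})

The re-saved model could then be served instead of the original export, which should shrink the response to just the four tensors.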

3. Steps to reproduce

Exporting a trained Mask R-CNN with Inception Resnet v2 (no atrous) model with the command:
python "${WORK_DIR}"/exporter_main_v2.py
--input_type image_tensor
--pipeline_config_path ${PIPELINE_CONFIG_PATH}
--trained_checkpoint_dir ${TRAIN_DIR}
--output_directory ${EXPORT_DIR} \

4. Expected behavior

Export a model that returns only the following outputs at inference time:

  • num_detections: Outputs float32 tensors of the form [batch]
    that specifies the number of valid boxes per image in the batch.
  • detection_boxes: Outputs float32 tensors of the form
    [batch, num_boxes, 4] containing detected boxes.
  • detection_scores: Outputs float32 tensors of the form
    [batch, num_boxes] containing class scores for the detections.
  • detection_classes: Outputs float32 tensors of the form
    [batch, num_boxes] containing classes for the detections.

5. Additional context

When calling the model over gRPC with default settings, I get a RESOURCE_EXHAUSTED error due to the large payload that is returned:
<_Rendezvous of RPC that terminated with:
status = StatusCode.RESOURCE_EXHAUSTED
details = "Received message larger than max (151678381 vs. 4194304)"
debug_error_string = "{"created":"@1602496041.618226876","description":"Received message larger than max (151678381 vs. 104857600)","file":"src/core/ext/filters/message_size/message_size_filter.cc","file_line":190,"grpc_status":8}"

This can be worked around by increasing the maximum message size on the gRPC channel, but then serving takes ages to respond because of the large payload.
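
For completeness, this is how the limit can be raised on the client side. A minimal sketch; the host, port, and model name are assumptions:

import grpc
from tensorflow_serving.apis import predict_pb2
from tensorflow_serving.apis import prediction_service_pb2_grpc

MAX_MESSAGE_BYTES = 200 * 1024 * 1024  # leave room for the ~150 MB response

channel = grpc.insecure_channel(
    "localhost:8500",  # assumed TensorFlow Serving gRPC endpoint
    options=[
        ("grpc.max_send_message_length", MAX_MESSAGE_BYTES),
        ("grpc.max_receive_message_length", MAX_MESSAGE_BYTES),
    ],
)
stub = prediction_service_pb2_grpc.PredictionServiceStub(channel)

request = predict_pb2.PredictRequest()
request.model_spec.name = "my_model"  # assumed model name
request.model_spec.signature_name = "serving_default"
# ... fill request.inputs["input_tensor"] with the image and call
# stub.Predict(request, timeout=60.0).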

6. System information

  • OS Platform and Distribution (e.g., Linux Ubuntu 16.04): Ubuntu 18.04 LTS
  • Mobile device name if the issue happens on a mobile device:
  • TensorFlow installed from (source or binary): source
  • TensorFlow version (use command below): 2.3.0
  • Python version: Python 3 (latest, via the OD Docker image)
  • Bazel version (if compiling from source): (via the OD Docker image)
  • GCC/Compiler version (if compiling from source): (via the OD Docker image)
  • CUDA/cuDNN version: (via the OD Docker image)
  • GPU model and memory: Nvidia K80
