
How to deploy models where the shape of output tensor is not known #5

Closed
srihari-humbarwadi opened this issue Nov 29, 2018 · 6 comments

srihari-humbarwadi commented Nov 29, 2018

I have a TensorFlow frozen graph of an object detection model. I am unclear about how to create a config.pbtxt file for this model, since I cannot determine the output shapes beforehand, and I cannot start the inference server without the "dims" specified. I wanted to know how I can create a config file for this:

name: "NF1"
    platform: "tensorflow_graphdef"
    max_batch_size: 16
    
    input [
      {
        name: "image_tensor"
        data_type: TYPE_UINT8
        format: FORMAT_NHWC
        dims: [ 1024, 800, 3 ]
      }
    ]
    
    output [
      {
        name: "num_detections"
        data_type: TYPE_FP32
        dims: [ 300 ]
      },

      {
        name: "detection_boxes"
        data_type: TYPE_FP32
        dims: [ 300, 4  ]
      },

      {
        name: "detection_scores"
        data_type: TYPE_FP32
        dims: [ 300 ]        
      },

      {
        name: "detection_classes"
        data_type: TYPE_FP32
        dims: [ 300 ]        
      }
    ]
    instance_group [    
      {
        gpus: [ 0 ]
      },
      {
        gpus: [ 1 ]
      },
      {
        gpus: [ 2 ]
      },
      {
        gpus: [ 3 ]
      }                  
    ]    
    dynamic_batching {
      preferred_batch_size: [ 16 ]
      max_queue_delay_microseconds: 100
    }

This is my config, which does not work. I tried fixing the shape to the maximum number of proposals, i.e. 300, which I knew wouldn't work.

dcyoung commented Nov 29, 2018

Did you solve this issue? And if so, could you share your solution?

I am also interested in how to serve models with variably sized outputs.

srihari-humbarwadi (Author)

I was wrong; the output shape is not variable, as there is an upper bound on the number of objects detected. So just set the dims to this upper bound. That should work fine.
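
For what it's worth, a minimal client-side sketch of consuming such padded outputs, assuming the usual TF Object Detection API convention where num_detections reports how many of the fixed-size rows are valid (the placeholder arrays here stand in for whatever the inference client actually returns):

    import numpy as np

    # Placeholder outputs for one image, padded to the fixed upper bound (300)
    # and shaped per the config above; in practice these come from the client.
    num_detections = np.array([7.0], dtype=np.float32)
    detection_boxes = np.zeros((300, 4), dtype=np.float32)
    detection_scores = np.zeros((300,), dtype=np.float32)
    detection_classes = np.zeros((300,), dtype=np.float32)

    # Everything past num_detections is padding, so slice it off.
    n = int(num_detections[0])
    boxes = detection_boxes[:n]
    scores = detection_scores[:n]
    classes = detection_classes[:n].astype(np.int64)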

dcyoung commented Nov 30, 2018

Thanks for getting back, @srihari-humbarwadi. Defining an upper bound seems fine because your model type returns a fixed-size tensor, but I'm still curious whether variable-sized outputs are supported in tensorrt-inference-server. Perhaps a dev can point me to the relevant docs?

For context:
I'm assuming your model (possibly from here?) outputs tensors of fixed size, with the intention that boxes be ignored based on the associated score.

However, returning a fixed-size output is not ideal for performance reasons. While it doesn't matter much for simple result types, consider the case where the served model is a Mask R-CNN and the return type includes a pixel mask for each detected object. Without an output signature with variable-sized tensors, the payload size would be worst-case for every response. I like to support variable outputs to reduce the payload for the common case (where fewer than the maximum number of objects are detected). For tf-serving, this involved modifying the output before exporting a saved model, such that the return type only includes results for objects whose score exceeds some threshold, as sketched below.
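
A minimal sketch of that export-time filtering, assuming TF1-style graph-mode tensors with the usual detection output shapes (the threshold value and the function name are illustrative, not part of any exported model):

    import tensorflow as tf

    def filter_by_score(boxes, scores, classes, threshold=0.5):
        # boxes: [N, 4], scores: [N], classes: [N]; N is the padded upper bound.
        # tf.where on a boolean vector yields the indices of the True entries.
        keep = tf.where(scores > threshold)[:, 0]
        # Gather only the surviving rows, producing variable-length outputs,
        # so the exported signature no longer pads to the worst case.
        return (tf.gather(boxes, keep),
                tf.gather(scores, keep),
                tf.gather(classes, keep))

These filtered tensors would then be wired into the SavedModel signature in place of the padded ones before export.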

Is this behavior supported in tensorrt-inference-server?

deadeyegoodwin (Contributor)

TRTIS only supports a variable-sized dimension for batching, but this is a common request, so we are planning on fixing it. Issue #8 is tracking this request; add upvotes there to indicate that you are interested in it.
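
(For reference: a sketch of what the feature tracked in #8 looks like once supported, with -1 as a wildcard dimension in config.pbtxt. This is how later releases of the server express it, not something the server accepted at the time of this thread:

    output [
      {
        name: "detection_boxes"
        data_type: TYPE_FP32
        dims: [ -1, 4 ]
      }
    ]
)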

dcyoung commented Nov 30, 2018

Thanks, @deadeyegoodwin!

tilaba commented Dec 25, 2018

Hello, have you solved it?
