TensorFlow Serving performance and optimization tips #89

@viksit

Would someone share performance numbers from their TensorFlow Serving systems in production? I'm interested in latency percentiles (tp50/tp90/tp99), QPS, request and response sizes, and how the numbers compare for traffic within a data center versus over the open web.

Also, what are some best practices for squeezing out more performance? For instance, streaming or batching?
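To keep reported numbers comparable, here is a minimal sketch of how the latency percentiles and QPS asked about above could be computed from raw per-request timings. It is not tied to TensorFlow Serving: the function name `latency_summary`, its parameters, and the choice of the nearest-rank percentile method are all assumptions made for illustration, and other percentile definitions will give slightly different values.

```python
import math


def latency_summary(latencies_ms, window_seconds):
    """Summarize per-request latencies (hypothetical helper, not a TF Serving API).

    latencies_ms: one latency sample, in milliseconds, per completed request.
    window_seconds: length of the measurement window the samples cover.
    """
    if not latencies_ms:
        raise ValueError("no latency samples")
    if window_seconds <= 0:
        raise ValueError("window_seconds must be positive")

    ordered = sorted(latencies_ms)

    def percentile(p):
        # Nearest-rank method (an assumption): the smallest sample such that
        # at least p percent of all samples are less than or equal to it.
        rank = max(1, math.ceil(p * len(ordered) / 100))
        return ordered[rank - 1]

    return {
        "tp50_ms": percentile(50),
        "tp90_ms": percentile(90),
        "tp99_ms": percentile(99),
        # Average throughput over the window, in requests per second.
        "qps": len(ordered) / window_seconds,
    }
```

When posting results, it would also help to state whether latency was measured at the client or at the server, since that is what separates in-data-center numbers from open-web numbers.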
