Could someone share performance numbers from their TensorFlow Serving systems in production? I'm curious about latency percentiles (tp50/tp90/tp99), QPS, request and response sizes, and how the numbers compare for traffic within a data center versus over the open web.
Also, what are some best practices for squeezing out more performance, for instance streaming or batching requests?
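
To keep reported numbers comparable, here is a rough sketch of how the latency percentiles and QPS asked about above could be computed from per-request latencies collected over a fixed window. It assumes the nearest-rank percentile definition; the names `percentile` and `summarize` and the millisecond units are only illustrative and not part of any TensorFlow Serving API. Monitoring systems often use interpolated or histogram-based percentiles instead, so it helps to say which one you used.

```python
import math


def percentile(samples, p):
    """Nearest-rank percentile: the smallest sample such that at least
    p percent of all samples are less than or equal to it."""
    if not samples:
        raise ValueError("no samples")
    ordered = sorted(samples)
    # Multiply before dividing to keep the rank computation exact for whole-number p.
    rank = max(1, math.ceil(p * len(ordered) / 100))
    return ordered[rank - 1]


def summarize(latencies_ms, window_seconds):
    """Summarize one measurement window (hypothetical helper).

    latencies_ms: per-request latencies in milliseconds, one entry per request.
    window_seconds: length of the measurement window in seconds.
    """
    return {
        "tp50_ms": percentile(latencies_ms, 50),
        "tp90_ms": percentile(latencies_ms, 90),
        "tp99_ms": percentile(latencies_ms, 99),
        "qps": len(latencies_ms) / window_seconds,
    }
```

If you post numbers, it would also help to note whether latency was measured at the client (including network time) or at the server.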