How to load model when the server starts? #453
It's surprising that the model server is adding so much overhead. The server is not reloading the model from disk with each request; it only does that once for each new version that comes in, and then keeps it in memory. Are you querying a remote server? Could it be a network issue? One useful experiment is to run the model_server and the client on the same machine and issue requests there, so that the latency numbers only include evaluation time. How did you measure 1.6ms and 100ms? The model server does very little on top of TF's session.run call, so if evaluating the model actually takes 1.6ms, model_server requests should take very close to that.
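One way to get comparable numbers is to time both paths with the same harness. A minimal sketch (the `mean_latency_ms` helper and the `predict` callable are hypothetical, not part of TF Serving):

```python
import time

def mean_latency_ms(predict, payload, warmup=10, runs=100):
    """Average wall-clock latency of predict(payload) in milliseconds."""
    for _ in range(warmup):            # warm up caches and connections first
        predict(payload)
    start = time.perf_counter()
    for _ in range(runs):
        predict(payload)
    return (time.perf_counter() - start) * 1000.0 / runs
```

Pass a closure that calls `session.run` locally for one measurement, and a closure that issues the gRPC `Predict` request for the other; running both from the same machine as the server means the gap between the two averages reflects serving overhead rather than network latency.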
I also noticed TF Serving overhead (#456) on the same machine when comparing it to the JNI interface.
Closing due to inactivity; please reopen if required.
@kirilg We finally found that this was caused by the different Linux systems we used. Initially, tf-serving was compiled on CentOS 6 without CPU optimizations. After recompiling tf-serving on CentOS 7, the speeds match.
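For reference, TF Serving can be built with CPU-specific instruction sets enabled by passing `--copt` flags to Bazel; the exact flags depend on what the host CPU supports (a sketch, assuming an AVX2/FMA-capable machine):

```shell
# Enable SIMD instruction sets that a default build may leave out
bazel build -c opt \
    --copt=-msse4.2 --copt=-mavx --copt=-mavx2 --copt=-mfma \
    //tensorflow_serving/model_servers:tensorflow_model_server
```

A binary built without these flags on an older system (such as CentOS 6) will run, but can be substantially slower for the same model.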
We have trained a neural network and want to use it for real-time inference.
We know that the average prediction time of our model on CPU is 1.6ms per instance, but TensorFlow Serving takes much longer, up to 100ms per instance.
I suspect this is because the model is loaded from disk on every gRPC call. Is there a solution for this, or do we have to keep a long-lived connection between the client and the server?
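As the maintainer's comment above describes, the server actually reads a model from disk once per version and serves every subsequent request from memory. A toy sketch of that load-once behavior (all names here are hypothetical, for illustration only):

```python
class ToyModelServer:
    """Illustrates load-once, serve-from-memory behavior (hypothetical names)."""

    def __init__(self):
        self._models = {}      # version -> in-memory model
        self.disk_loads = 0    # counts how often we touch disk

    def _load_from_disk(self, version):
        self.disk_loads += 1
        return f"model-v{version}"           # stand-in for a SavedModel

    def predict(self, version, x):
        if version not in self._models:      # only happens for a new version
            self._models[version] = self._load_from_disk(version)
        return (self._models[version], x)    # served from memory

server = ToyModelServer()
for _ in range(1000):
    server.predict(1, [0.5])
print(server.disk_loads)  # → 1: a thousand requests, one disk load
```

So per-request disk loading is not where the latency comes from; reusing a single gRPC channel and stub across requests on the client side is still good practice, but the gap is more likely network round-trips or a suboptimal server build.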