Yet another problem about high latency when loading a new model #910

@weberxie

I have noticed #385, and have read https://ai.googleblog.com/2017/11/latest-innovations-in-tensorflow-serving.html.

Even though new models are now loaded in an isolated thread pool, the main work of a load is the restore ops (RestoreV2), which run in the thread pool used by session run. This means the load operation does not run entirely in a separate thread pool, so serving queries may be blocked by a load.
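
To make the suspected contention concrete, here is a minimal toy sketch in plain Python. It is not TensorFlow Serving code: `inference_pool`, `load_pool`, `restore_op`, and `query_blocked_by_load` are hypothetical names, the pools have a single worker each to make the effect obvious, and the slow HDFS read is simulated with an event. It only illustrates that a load started from a separate pool can still stall queries if its heavy step is dispatched onto the pool that serves them.

```python
import threading
from concurrent.futures import ThreadPoolExecutor
from concurrent.futures import TimeoutError as FutureTimeout


def query_blocked_by_load(restore_on_inference_pool: bool) -> bool:
    """Return True if a query had to wait behind a simulated restore op."""
    # Hypothetical stand-ins: one pool for session-run work, one for loading.
    inference_pool = ThreadPoolExecutor(max_workers=1)
    load_pool = ThreadPoolExecutor(max_workers=1)
    restore_started = threading.Event()
    release_restore = threading.Event()

    def restore_op():
        # Simulates a slow read of model files (e.g. from remote storage).
        restore_started.set()
        release_restore.wait()

    def load_model():
        if restore_on_inference_pool:
            # The load is driven from load_pool, but the heavy step is
            # scheduled on the pool that also serves queries.
            inference_pool.submit(restore_op).result()
        else:
            # Fully isolated: the heavy step stays on the load thread.
            restore_op()

    load_future = load_pool.submit(load_model)
    restore_started.wait()  # make sure the restore is in flight

    # A serving query arrives while the restore is still running.
    query_future = inference_pool.submit(lambda: "ok")
    try:
        query_future.result(timeout=0.5)
        blocked = False
    except FutureTimeout:
        blocked = True

    # Let the simulated restore finish and clean up.
    release_restore.set()
    load_future.result()
    query_future.result()
    inference_pool.shutdown()
    load_pool.shutdown()
    return blocked


if __name__ == "__main__":
    print(query_blocked_by_load(True))   # True: query waits behind the restore
    print(query_blocked_by_load(False))  # False: query is served immediately
```

In a real server the pools have many workers, so the effect would show up as added queueing delay rather than a hard stall, but the mechanism sketched here is the same one described above.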

In our environment, the model files are stored on HDFS. While a new model is loading, serving latency rises from a few milliseconds to thousands of milliseconds. I have confirmed that this does not include model warm-up time.
