Yet another problem about high latency when loading a new model

I have seen https://github.com/tensorflow/serving/issues/385 and have read https://ai.googleblog.com/2017/11/latest-innovations-in-tensorflow-serving.html.

Even though new models are now loaded in an isolated thread pool, the bulk of the load work is done by RestoreOpsV2, which runs in the thread pool used by session run. This means the load operation does not run entirely in a separate thread pool. **So serving queries may be blocked by the load operation.**
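To make the suspected contention concrete, here is a minimal Python sketch. It is not TensorFlow Serving code: `session_pool`, `restore_chunk`, `serve_query`, and the timings are all hypothetical stand-ins. It only assumes the behavior described above, namely that the load is started from its own thread but its restore work is scheduled on the same pool that serves queries.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def restore_chunk(seconds):
    """Hypothetical stand-in for one slow restore op (e.g. a remote read)."""
    time.sleep(seconds)


def serve_query():
    """Hypothetical stand-in for a fast inference request."""
    return "ok"


def query_latency(session_pool, load_in_progress,
                  restore_ops=4, restore_seconds=0.2):
    """Measure the latency of one query submitted to the shared session pool."""
    # The load is driven from its own single-thread pool (the "isolated" part).
    load_pool = ThreadPoolExecutor(max_workers=1)

    def load_model():
        # Assumption: the restore work itself is queued on the session pool.
        futures = [session_pool.submit(restore_chunk, restore_seconds)
                   for _ in range(restore_ops)]
        for future in futures:
            future.result()

    load_future = None
    if load_in_progress:
        load_future = load_pool.submit(load_model)
        time.sleep(0.05)  # give the restore ops time to fill the shared pool

    start = time.perf_counter()
    session_pool.submit(serve_query).result()
    latency = time.perf_counter() - start

    if load_future is not None:
        load_future.result()
    load_pool.shutdown()
    return latency


if __name__ == "__main__":
    with ThreadPoolExecutor(max_workers=2) as session_pool:
        idle = query_latency(session_pool, load_in_progress=False)
        busy = query_latency(session_pool, load_in_progress=True)
    print(f"query latency without load: {idle * 1000:.1f} ms")
    print(f"query latency during load:  {busy * 1000:.1f} ms")
```

In this toy setup the query has to wait behind the queued restore tasks, so its latency grows with the time each restore task takes, even though the load was started from a separate thread.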

In our environment, the model files are stored on HDFS. While a new model is loading, query latency rises from a few milliseconds to thousands of milliseconds. I have confirmed that this does not include model warm-up time.


Yet another problem about high latency when loading a new model #910
