Lazily load models and on insufficient resources for a load, look to unload idle models  #1403

@kimjuny

Feature Request

Describe the problem the feature is intended to solve

I have thousands of models to serve, but most of them are requested infrequently; only a few receive regular traffic. Loading all of them into memory at the same time consumes a lot of resources and takes a long time.

Describe the solution

Is there an option to lazily load models and keep only the most frequently requested ones cached in memory?
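
To make the request concrete, here is a rough sketch of the behavior I have in mind. It is not based on any existing server API: `LazyModelCache`, `loader`, `unloader`, and `capacity` are all hypothetical names, and it uses least-recently-used eviction as a simple stand-in for "most frequently requested". A real implementation would presumably track memory usage rather than a fixed model count.

```python
from collections import OrderedDict
from typing import Any, Callable, List, Optional


class LazyModelCache:
    """Hypothetical sketch: load a model on first request and unload the
    least recently used one when the cache is full."""

    def __init__(
        self,
        loader: Callable[[str], Any],
        capacity: int,
        unloader: Optional[Callable[[str, Any], None]] = None,
    ) -> None:
        if capacity < 1:
            raise ValueError("capacity must be at least 1")
        self._loader = loader      # assumed: turns a model name into a loaded model
        self._unloader = unloader  # assumed: releases a model's resources
        self._capacity = capacity  # assumed: max number of models kept in memory
        self._models: "OrderedDict[str, Any]" = OrderedDict()

    def get(self, name: str) -> Any:
        # Cache hit: mark the model as most recently used and return it.
        if name in self._models:
            self._models.move_to_end(name)
            return self._models[name]
        # Cache miss: make room by unloading the most idle model(s) first.
        while len(self._models) >= self._capacity:
            idle_name, idle_model = self._models.popitem(last=False)
            if self._unloader is not None:
                self._unloader(idle_name, idle_model)
        # Load lazily, only now that the model has actually been requested.
        model = self._loader(name)
        self._models[name] = model
        return model

    def loaded(self) -> List[str]:
        # Names of models currently in memory, most idle first.
        return list(self._models)


if __name__ == "__main__":
    cache = LazyModelCache(loader=lambda n: f"model:{n}", capacity=2)
    for requested in ["a", "b", "a", "c"]:
        cache.get(requested)
    print(cache.loaded())
```

With a capacity of 2, requesting `a`, `b`, `a`, `c` leaves `a` and `c` loaded: `b` was the most idle model when `c` needed room, so it is the one unloaded.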

Describe alternatives you've considered

None yet.

Additional context

I'm not sure whether this is a feature request or whether this feature already exists.
