Status: Open
Labels: enhancement (New feature or request)
Description
Feature Summary
The ability to load models one by one (only load one model at a time when the calculation needs it) to reduce memory usage.
Detailed Description
I'm working on an Android library that leverages llama.cpp and stablediffusion.cpp for easy on-device inference, but I'm memory limited for some models where the text encoder, VAE, and UNet are all loaded at the same time. Would it be possible to add sequential model loading, where a model is loaded only when the current stage of the calculation needs it, so that at most one model is resident at a time? This would drastically reduce the memory footprint.
Alternatives you considered
No response
Additional context
No response