Closed
Description
Is your feature request related to a problem? Please describe.
We have a huge matrix of Docker images, and the images that bundle Python backends are quite big.
Describe the solution you'd like
A way to pull backends at runtime, ideally the same way models are installed (from the web UI and the API), but for backends such as parler-tts.
This would allow us to release binaries or smaller container images without any embedded backends. It would also allow backends to be pulled while a model is being installed.
We could serve backends as container images that are pulled with the same code that pulls models, but that simply overlays the backend files into a defined folder.
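To make the overlay idea concrete, here is a minimal sketch of the extraction step only. It assumes the image layer has already been downloaded as a tar archive; the function names (`overlay_layer`, `make_layer`) and the folder layout `<backends_dir>/<backend>` are hypothetical and not taken from the existing code base. It also ignores OCI whiteout files and symlinks, which a real implementation would have to handle.

```python
import io
import tarfile
import tempfile
from pathlib import Path


def overlay_layer(layer: tarfile.TarFile, backends_dir: Path, backend: str) -> list[str]:
    """Extract one image layer into <backends_dir>/<backend> (hypothetical layout)."""
    target = (backends_dir / backend).resolve()
    target.mkdir(parents=True, exist_ok=True)
    written = []
    for member in layer.getmembers():
        # This sketch only handles regular files and directories.
        if not (member.isfile() or member.isdir()):
            continue
        dest = (target / member.name).resolve()
        # Refuse entries that would land outside the backend folder (e.g. "../x").
        if not dest.is_relative_to(target):
            raise ValueError(f"unsafe path in layer: {member.name}")
        if member.isdir():
            dest.mkdir(parents=True, exist_ok=True)
            continue
        dest.parent.mkdir(parents=True, exist_ok=True)
        dest.write_bytes(layer.extractfile(member).read())
        dest.chmod(member.mode & 0o777)  # keep executables executable
        written.append(member.name)
    return written


def make_layer(files: dict[str, bytes]) -> tarfile.TarFile:
    """Build an in-memory tar archive that stands in for a downloaded layer."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w") as tar:
        for name, data in files.items():
            info = tarfile.TarInfo(name)
            info.size = len(data)
            info.mode = 0o755
            tar.addfile(info, io.BytesIO(data))
    buf.seek(0)
    return tarfile.open(fileobj=buf, mode="r")


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        layer = make_layer({"run.sh": b"#!/bin/sh\necho parler-tts\n"})
        print(overlay_layer(layer, Path(tmp), "parler-tts"))
```

Writing each backend into its own subfolder would keep removal simple (delete the folder), though that layout is a design assumption of this sketch rather than part of the proposal.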
There are challenges to this:
- container images would still depend on the underlying drivers (e.g. nvidia, hipblas)
- CI changes to the release process (build each backend in a container image and publish it separately)
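One possible way to handle both challenges is to publish one image per backend and driver, and to pick the matching one at pull time. The sketch below is only an illustration: the registry URL, the `<version>-<accelerator>` tag scheme, and the function names are all assumptions, not an existing convention.

```python
# Hypothetical registry and tag scheme, for illustration only.
REGISTRY = "registry.example.com/backends"
KNOWN_ACCELERATORS = {"cpu", "nvidia", "hipblas"}


def backend_image_ref(backend: str, version: str, accelerator: str = "cpu") -> str:
    """Return the image reference for one backend built against one driver stack."""
    if accelerator not in KNOWN_ACCELERATORS:
        raise ValueError(f"unknown accelerator: {accelerator}")
    return f"{REGISTRY}/{backend}:{version}-{accelerator}"


def release_matrix(backends: list[str], version: str) -> list[str]:
    """List every image a CI release would have to build and publish."""
    return [
        backend_image_ref(backend, version, accelerator)
        for backend in backends
        for accelerator in sorted(KNOWN_ACCELERATORS)
    ]


if __name__ == "__main__":
    for ref in release_matrix(["parler-tts"], "v1"):
        print(ref)
```

The runtime would call something like `backend_image_ref` with the accelerator it detects on the host, while CI would iterate over `release_matrix` to build and publish each backend separately.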
Describe alternatives you've considered
Keep things as-is
Additional context