All current inference engines run primarily within the browser host. We now need to refactor them to run on the backend. With this change, UI components can send requests with only minimal model knowledge, such as the model's ID and the messages.
Without needing to know how to run the model, the client can simply send an OpenAI-compatible request with no additional parameters. Models will be loaded on the server side using default settings read from `model.json`.
See: #2758
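For illustration, a minimal client-side sketch of such a request is below. The base URL, port, and model ID are assumptions made for the example, not decisions from this issue; any OpenAI-compatible chat-completions endpoint would look similar.

```ts
// Minimal sketch: the UI sends only a model ID and messages.
// Everything else (context length, GPU layers, sampling defaults, ...)
// is resolved server-side from the model's model.json.
// The base URL and model ID below are illustrative assumptions.
const BASE_URL = "http://localhost:1337/v1";

async function chat(modelId: string, content: string): Promise<unknown> {
  const res = await fetch(`${BASE_URL}/chat/completions`, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({
      model: modelId, // the only model knowledge the client needs
      messages: [{ role: "user", content }],
    }),
  });
  if (!res.ok) throw new Error(`Request failed: ${res.status}`);
  return res.json();
}

chat("example-model-id", "Hello!").then(console.log);
```

Because the request carries no engine-specific parameters, the backend is free to swap or update inference engines without any client-side change.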
This approach also helps the model hub scale: clients can simply fetch the latest supported model list from the backend, which can be updated dynamically.
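As a sketch, the model list could be served through the OpenAI-compatible `GET /models` route; again, the URL and response shape are assumptions for illustration.

```ts
// Sketch: the client discovers supported models from the backend
// instead of bundling a static list. Assumes the OpenAI-compatible
// list shape: { object: "list", data: [{ id: string, ... }] }.
async function listModels(): Promise<string[]> {
  const res = await fetch("http://localhost:1337/v1/models");
  if (!res.ok) throw new Error(`Request failed: ${res.status}`);
  const body = (await res.json()) as { data: { id: string }[] };
  return body.data.map((m) => m.id);
}

listModels().then((ids) => console.log("Supported models:", ids));
```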