Description
Some applications may not want to proxy all LLM calls through a backend server, which is a limitation of the current design.
Specifically, the useChat hook in React assumes a fetch call to a server endpoint that returns a specific response format.
One option we've attempted is passing the model provider's API URL directly into the useChat config. However, this falls short because the decoding step is still missing, since that is assumed to happen on the server.
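For concreteness, here is a minimal sketch of that workaround, assuming useChat is the hook from `ai/react` and that its `api` and `headers` options behave as documented; the endpoint URL and key handling are purely illustrative:

```tsx
import { useChat } from 'ai/react';

// Illustrative only: in practice this would come from user input, not be hard-coded.
declare const clientSideApiKey: string;

export function Chat() {
  const { messages, input, handleInputChange, handleSubmit } = useChat({
    // Point the hook at the provider's endpoint instead of a backend route.
    api: 'https://api.openai.com/v1/chat/completions',
    headers: { Authorization: `Bearer ${clientSideApiKey}` },
    // The provider responds with its own streaming payload, not the format the
    // hook expects, so the decoding that would normally happen on the server
    // never runs and `messages` is never populated with the completion.
  });

  return (
    <form onSubmit={handleSubmit}>
      <ul>
        {messages.map((m) => (
          <li key={m.id}>
            {m.role}: {m.content}
          </li>
        ))}
      </ul>
      <input value={input} onChange={handleInputChange} />
    </form>
  );
}
```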
It would be very nice if the same abstractions around models could be used on the client, e.g. passing in a config that produces a given stream.
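Something along these lines, purely as an illustration; `createClientModel` and the `model` option are invented names for the sake of the sketch, not existing APIs:

```tsx
import { useChat } from 'ai/react';

// Hypothetical helper: wraps a provider config so the stream can be produced
// and decoded entirely in the browser (name and shape are invented here).
declare function createClientModel(config: {
  provider: 'openai';
  apiKey: string;
}): unknown;

declare const userProvidedKey: string; // e.g. entered by the end user at runtime

const model = createClientModel({ provider: 'openai', apiKey: userProvidedKey });

export function Chat() {
  // Hypothetical `model` option (would not type-check today; this is the ask):
  // the hook would call the model and decode its stream entirely on the client,
  // with no backend proxy involved.
  const { messages, input, handleInputChange, handleSubmit } = useChat({ model });

  return null; // same UI as today, omitted
}
```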
Thank you for building this!