Add support for HF Inference endpoints via text-generation-inference / EasyLLM #190
Hi Gregor, thanks for the suggestions. Additional backends are always welcome. We recently added initial support for replicate.com, which does allow you to run open models in the cloud (@charles-dyfis-net is working on this). After a brief look, EasyLLM seems promising and could be a way to swap out OpenAI models for other models while mocking the same interface. However, looking at https://github.com/philschmid/easyllm/blob/main/easyllm/schema/openai.py#L60, EasyLLM seems to lack support for the most crucial feature LMQL requires to operate at its full feature set, i.e. the […]
Now, since the OpenAI APIs change a lot, deprecate quickly, and are also increasingly proprietary and opaque (vendor lock-in), we decided to move away from relying too heavily on them and instead build our own open protocol for efficient language model streaming. For this, we implement the language model transport protocol. All non-OpenAI backends in LMQL have since moved to this protocol, and I would suggest it as a good starting point for building new backends (consider the […])
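To make the idea of a streaming transport protocol concrete, here is a minimal sketch of the general shape such a backend takes: the client sends a prompt plus decoding parameters and receives tokens one at a time, each with enough metadata for the caller to react during decoding. All names here (StreamingBackend, generate, the event fields) are illustrative assumptions, not LMQL's actual protocol or API:

```python
# Hypothetical sketch of a token-streaming backend interface.
# This only mirrors the general shape of such a protocol; it is NOT
# LMQL's actual language model transport protocol implementation.
from typing import Dict, Iterator


class StreamingBackend:
    """A backend that streams tokens one by one, each annotated with
    metadata (position, finish reason) so the caller can apply
    constraints or stop early while decoding is still in progress."""

    def generate(self, prompt: str, max_tokens: int = 16) -> Iterator[Dict]:
        # A real backend would forward the request to a model server
        # here; this stub just echoes the prompt's words as "tokens".
        for i, word in enumerate(prompt.split()[:max_tokens]):
            yield {"token": word, "index": i, "finish_reason": None}
        # A final event signals the end of the stream.
        yield {"token": "", "index": -1, "finish_reason": "stop"}


backend = StreamingBackend()
events = list(backend.generate("hello streaming world"))
```

The key design point, compared to a request/response completion API, is that each token arrives as its own event, so a client can enforce constraints mid-generation rather than validating a finished completion.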
Thanks for the detailed explanation, Luca. I understand. So let's wait for […]
Hi Luca! It's excellent that one can use 🔥LMQL with locally served open models. Now, it would be handy if we could use open LLMs running in the cloud as easily as OpenAI's models. E.g.: […]
@philschmid has this EasyLLM project, and I wonder whether it might be fairly straightforward to implement that feature with easyllm? If you think it's worthwhile too, some pointers on where to start and a short plan for implementing the feature would be very welcome.
Cheers, Gregor
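The appeal of EasyLLM here is that it mimics the OpenAI chat interface for open models hosted on Hugging Face. A rough sketch of what such a call looks like follows; the model name is a hypothetical choice, and the actual network call is commented out since it requires the easyllm package and a reachable inference endpoint:

```python
# An OpenAI-style chat request, as mimicked by EasyLLM for open models.
# The model name below is an illustrative assumption, not a recommendation.
payload = {
    "model": "meta-llama/Llama-2-70b-chat-hf",  # hypothetical HF model id
    "messages": [
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Write a haiku about open models."},
    ],
    "temperature": 0.7,
}

# With easyllm installed and a Hugging Face endpoint available, the
# request would be sent roughly like this (not executed here):
#
#   from easyllm.clients import huggingface
#   response = huggingface.ChatCompletion.create(**payload)
```

Because the request shape matches OpenAI's chat format, code written against OpenAI's client can, in principle, switch backends by changing only the client import and the model identifier.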