
Is there a current way to run lm-eval against a self-hosted inference server? #1072

Closed
sfriedowitz opened this issue Dec 6, 2023 · 3 comments
Labels
feature request: A feature that isn't implemented yet.
help wanted: Contributors and extra help welcome.

Comments


sfriedowitz commented Dec 6, 2023

We are interested in trying to run lm-eval on a low-resource machine and have it talk to models on a self-hosted inference server. We are not bound to any specific inference server, but some that we are interested in are vLLM, TGI, and ray-llm.

Is there a current way to do this out of the box? It looks like the big-refactor branch has support for loading models with vLLM, but only in the same process as the evaluation runner (which would require GPU resources on the machine).

One option I was thinking about was to extend the LM base class and implement bindings to one of our self-hosted inference servers. But I'm not sure that is necessary if the library already supports this capability.

Thanks for the help!
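If the subclassing route turns out to be needed, the server-facing piece might look roughly like the sketch below. Everything here is illustrative: the class name `RemoteCompletionsClient`, the injected `post_fn`, and the `echo`/`max_tokens=0` trick assume an OpenAI-compatible `/v1/completions` server (which vLLM can expose); none of it is lm-eval's actual API.

```python
# Hypothetical sketch of a thin HTTP client that a custom LM subclass
# could delegate to. The payload shape follows the OpenAI completions
# format that vLLM's server mimics; names here are illustrative only.
import json
from typing import Callable


class RemoteCompletionsClient:
    def __init__(self, base_url: str, model: str, post_fn: Callable):
        self.base_url = base_url.rstrip("/")
        self.model = model
        # post_fn is injected (e.g. requests.post) so the client can be
        # exercised without a live server.
        self.post_fn = post_fn

    def build_payload(self, prompt: str, max_tokens: int = 0,
                      logprobs: int = 1, echo: bool = True) -> dict:
        # echo=True with max_tokens=0 asks the server to return logprobs
        # for the prompt tokens themselves, which is what loglikelihood
        # scoring needs (as opposed to generating new tokens).
        return {
            "model": self.model,
            "prompt": prompt,
            "max_tokens": max_tokens,
            "logprobs": logprobs,
            "echo": echo,
        }

    def complete(self, prompt: str, **kwargs) -> dict:
        payload = self.build_payload(prompt, **kwargs)
        return self.post_fn(
            f"{self.base_url}/v1/completions",
            data=json.dumps(payload),
            headers={"Content-Type": "application/json"},
        )
```

An LM subclass would then implement its scoring and generation methods in terms of calls like `client.complete(prompt)`, batching requests as needed.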

Contributor

baberabb commented Dec 7, 2023

Hi! vLLM does have a drop-in replacement for the OpenAI API server, which should work with minimal modifications by passing in the base_url and setting the tokenizer to match. You could also use it as a template for supporting any custom server.

Would be interested to see how well this works!
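Concretely, the setup suggested above might look something like this sketch. The exact flags and model-arg names depend on the vLLM and lm-eval versions in use, and the model name and host are placeholders:

```shell
# Sketch only: flag and backend names may differ across versions.
# 1) On the GPU machine, launch vLLM's OpenAI-compatible server:
python -m vllm.entrypoints.openai.api_server \
    --model mistralai/Mistral-7B-v0.1 --port 8000

# 2) On the low-resource machine, point lm-eval at it via base_url
#    (gpu-host and the tokenizer setting are illustrative):
lm_eval --model local-completions \
    --model_args model=mistralai/Mistral-7B-v0.1,base_url=http://gpu-host:8000/v1/completions,tokenizer=mistralai/Mistral-7B-v0.1 \
    --tasks hellaswag
```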

@haileyschoelkopf added the help wanted and feature request labels Dec 7, 2023
@veekaybee
Contributor

I have a PR, in several steps, to address this. The first step is making sure the completions API works. It's here: #1141

@veekaybee
Contributor

Merged and resolved! #1174
