Is there a current way to run lm-eval against a self-hosted inference server? #1072
Labels
feature request
A feature that isn't implemented yet.
help wanted
Contributors and extra help welcome.
We are interested in trying to run lm-eval on a low-resource machine and have it talk to models on a self-hosted inference server. We are not bound to any specific inference server, but some that we are interested in are vLLM, TGI, and ray-llm.
Is there a current way to do this out of the box? It looks like the `big-refactor` branch has support for loading models with vLLM, but only in the same process as the evaluation runner (which would require GPU resources on the machine).

One option I was thinking about was to extend the `LLM` base class and implement bindings to one of our self-hosted inference servers (rough sketch of what I have in mind below), but I'm not sure that is necessary if the library already supports this capability.

Thanks for the help!
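For context, here is roughly what I was picturing: a subclass that forwards each request to an HTTP endpoint on the inference server. This is a minimal, untested sketch; the base-class import path, the method names (`generate_until` / `loglikelihood`), and the TGI-style `/generate` route are all assumptions on my part and may not match the `big-refactor` API exactly.

```python
# Untested sketch -- assumes the big-refactor base class lives at
# lm_eval.api.model.LM and expects generate_until / loglikelihood /
# loglikelihood_rolling (names may differ by branch/version).
import requests

from lm_eval.api.model import LM


class HTTPServerLM(LM):
    """Hypothetical adapter that forwards eval requests to a self-hosted server."""

    def __init__(self, base_url="http://localhost:8080", **kwargs):
        super().__init__()
        self.base_url = base_url

    def generate_until(self, instances):
        # One HTTP call per request; payload shape assumes a TGI-style /generate route.
        outputs = []
        for inst in instances:
            context, gen_kwargs = inst.args
            resp = requests.post(
                f"{self.base_url}/generate",
                json={"inputs": context, "parameters": gen_kwargs},
                timeout=300,
            )
            resp.raise_for_status()
            outputs.append(resp.json().get("generated_text", ""))
        return outputs

    def loglikelihood(self, instances):
        # Would need the server to return token logprobs for (context, continuation) pairs.
        raise NotImplementedError

    def loglikelihood_rolling(self, instances):
        raise NotImplementedError
```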