
[Feature]: Dynamic reloading config file #964

Closed · krrishdholakia opened this issue Nov 30, 2023 · 14 comments
Assignees: krrishdholakia
Labels: enhancement (New feature or request), proxy

Comments

@krrishdholakia (Contributor)

The Feature

"I'm searching for a way to perform dynamic model and endpoint discovery, as it's crucial for our ecosystem. It would be beneficial if LiteLLM allowed the dynamic reloading of the config file, for example. That would help a lot."

Motivation, pitch

user request

Twitter / LinkedIn details

No response

@krrishdholakia (Contributor, Author)

cc: @chanwit

@krrishdholakia (Contributor, Author)

"Our environment uses Kubernetes, and LLM instances are served by the LM-controller. When the LM-controller starts a new LLM instance, we want it to be automatically registered with a proxy. To achieve this, we plan to write a Kubernetes informer that monitors the creation and deletion of LLM instances, captures their endpoint information, and sends it to register the endpoint with the proxy, in this case, LiteLLM.

For LiteLLM to support this dynamic endpoint discovery, it needs to implement endpoint registration and deregistration, such as POST and DELETE at /endpoints, allowing for on-the-fly alias and endpoint registration without restarting the proxy process. From a security standpoint, we would limit this endpoint to local access only, or possibly use a Unix socket."
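
For illustration, here is a minimal sketch of that informer idea. The namespace, label selector, pod port, litellm_params, and proxy URL are placeholder assumptions, and it targets the /model/new and /model/delete routes discussed later in this thread rather than a generic /endpoints route:

# Sketch of a Kubernetes informer that registers/deregisters LLM endpoints with the proxy.
import requests
from kubernetes import client, config, watch

PROXY_URL = "http://localhost:8000"  # assumed LiteLLM proxy address

def run_informer():
    config.load_incluster_config()  # use config.load_kube_config() when run outside the cluster
    v1 = client.CoreV1Api()
    for event in watch.Watch().stream(
        v1.list_namespaced_pod, namespace="llm", label_selector="app=llm-instance"
    ):
        pod = event["object"]
        if event["type"] == "ADDED" and pod.status.pod_ip:
            # Register the new instance with the proxy.
            requests.post(f"{PROXY_URL}/model/new", json={
                "model_name": pod.metadata.name,
                "litellm_params": {
                    "model": "ollama/llama2",  # placeholder model
                    "api_base": f"http://{pod.status.pod_ip}:8080",  # assumed serving port
                },
                "model_info": {"id": pod.metadata.name},
            })
        elif event["type"] == "DELETED":
            # Remove it from the proxy when the pod goes away.
            requests.post(f"{PROXY_URL}/model/delete", json={"id": pod.metadata.name})

if __name__ == "__main__":
    run_informer()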

@krrishdholakia (Contributor, Author)

@chanwit just pushed a new /model/new endpoint - 72381c3

[Two screenshots attached, taken 2023-12-01 at 8:36 PM and 8:37 PM]
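
For reference, a registration call against the new endpoint might look like the following; the model name, parameters, and id here are placeholders, and the body shape mirrors the config entries and the /model/delete example further down this thread:

curl -X POST http://0.0.0.0:8000/model/new \
-H 'Content-Type: application/json' \
-d '{
  "model_name": "openai-model",
  "litellm_params": {"model": "gpt-3.5-turbo"},
  "model_info": {"id": "my-model-id"}
}'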

@krrishdholakia (Contributor, Author)

I'll update the thread once this is in prod.

@krrishdholakia krrishdholakia self-assigned this Dec 2, 2023
@krrishdholakia (Contributor, Author)

in prod in v1.10.0

@chanwit commented Dec 4, 2023

Thank you for the ping @krrishdholakia

Here's my take on the acceptance criteria for this feature.

  • There is the POST /model/new endpoint in LiteLLM for registering a new LLM instance. The new LLM instance can be used right away without restarting the current LiteLLM server. This AC has been addressed by 72381c3
  • There is the DELETE /models/{:id} endpoint in LiteLLM for removing an existing LLM instance. The instance is removed from the server without restarting.

@krrishdholakia (Contributor, Author)

@chanwit /model/delete endpoint added - 92b2cbc

Here's how to use it:

Step 1: Add a model with an id to your config.yaml

model_list:
  - model_name: "openai-model"
    litellm_params:
      model: "gpt-3.5-turbo"
    model_info:
      id: "my-model-id"

Step 2: Start your proxy

$ litellm --config /path/to/config.yaml

Step 3: Make a POST /model/delete call to remove this model from your proxy

curl -X POST http://0.0.0.0:8000/model/delete \
-H 'Content-Type: application/json' \
-d '{"id": "my-model-id"}'

@krrishdholakia (Contributor, Author)

Let me know if this looks good to you

Will update this ticket once the release is in prod.

@krrishdholakia (Contributor, Author)

closing as this is now out

@ericblue commented Dec 9, 2023

Hi @krrishdholakia, sorry for the delayed response; I'm just testing this out now.

Can you give a valid example of the POST request for a new model? Following the Swagger example, posting the following model:

{
  "model_name": "ollama-custom",
  "litellm_params": {
    "model": "ollama/llama2",
    "api_base": "http://localhost:11434",
    "host": "localhost",
    "port": 8000
  },
  "model_info": {
    "id": "my-test-model",
    "mode": "embedding",
    "input_cost_per_token": 0,
    "output_cost_per_token": 0,
    "max_tokens": 4096,
    "base_model": "gpt-3.5-turbo"
  }
}

Results in a server error:

{
  "detail": "Internal Server Error: could not determine a constructor for the tag 'tag:yaml.org,2002:python/object:litellm.proxy._types.ModelInfo'\n in \"config_1702158172.732068.yaml\", line 7, column 15"
}

It looks like a file is initially created, but model_info is serialized as a Python object:

model_list:
- litellm_params:
    api_base: http://localhost:11434
    host: localhost
    model: ollama/llama2
    port: 8000
  model_info: !!python/object:litellm.proxy._types.ModelInfo
    __dict__:
      base_model: gpt-3.5-turbo
      id: test
      input_cost_per_token: 0.0
      max_tokens: 4096
      mode: embedding
      output_cost_per_token: 0.0
    __pydantic_extra__:
      additionalProp1: {}
    __pydantic_fields_set__: !!set
      additionalProp1: null
      base_model: null
      id: null
      input_cost_per_token: null
      max_tokens: null
      mode: null
      output_cost_per_token: null
    __pydantic_private__: null
  model_name: ollama-custom

Interestingly enough, litellm_params is persisted OK, and both are passed into yaml.dump the same way in the config object.
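
For anyone hitting the same error, here is a minimal sketch (not LiteLLM's actual code or fix) of the failure mode: PyYAML's default Dumper serializes an arbitrary Python object, such as a pydantic model, with a python/object tag, which yaml.safe_load and the full loader then refuse to construct; dumping a plain dict round-trips fine. The ModelInfo class below is a stand-in, not the real litellm.proxy._types.ModelInfo.

from typing import Optional
import yaml
from pydantic import BaseModel

class ModelInfo(BaseModel):  # stand-in for litellm.proxy._types.ModelInfo
    id: Optional[str] = None
    base_model: Optional[str] = None

entry = {"model_name": "ollama-custom", "model_info": ModelInfo(id="test")}

# The default Dumper happily serializes the object, but tags it as a Python object,
# which safe loaders later reject with a ConstructorError.
print(yaml.dump({"model_list": [entry]}))

# Converting to a plain dict before dumping keeps the config loadable:
entry["model_info"] = entry["model_info"].dict()  # .model_dump() on pydantic v2
clean = yaml.dump({"model_list": [entry]})
yaml.safe_load(clean)  # round-trips without errors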

@krrishdholakia (Contributor, Author) commented Dec 9, 2023

That's concerning. Here's how I was testing it:

def test_add_new_model(client):

I had to skip it on CircleCI due to CI/FastAPI issues. Thanks for flagging this, I'll have it fixed today and add better CI testing.
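
For context, a rough sketch of what such a test could look like with FastAPI's TestClient; the app import path, fixture, and payload are assumptions for illustration, not LiteLLM's actual test suite:

import pytest
from fastapi.testclient import TestClient

from litellm.proxy.proxy_server import app  # assumed location of the proxy's FastAPI app

@pytest.fixture
def client():
    # Depending on proxy config, an Authorization header with the master key may also be needed.
    return TestClient(app)

def test_add_new_model(client):
    payload = {
        "model_name": "test-model",
        "litellm_params": {"model": "gpt-3.5-turbo"},
        "model_info": {"id": "test-model-id"},
    }
    response = client.post("/model/new", json=payload)
    assert response.status_code == 200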

@krrishdholakia krrishdholakia reopened this Dec 9, 2023
@krrishdholakia (Contributor, Author)

fix pushed + testing added @ericblue

@krrishdholakia (Contributor, Author)

will update this ticket once the fix is in prod.

@krrishdholakia (Contributor, Author)

Fix live from v1.12.3 onwards.
