Add documentation on pointing llmengine client to self-hosted infrastructure #200
Conversation
docs/guides/self_hosting.md (Outdated):

> ### Pointing LLM Engine client to use self-hosted infrastructure
>
> The `llmengine` client makes requests to Scale AI's hosted infrastructure by default. You can have `llmengine` client make requests to your own self-hosted infrastructure by setting the `LLM_ENGINE_BASE_PATH` environment variable to the URL of the `llm-engine` pod. The exact URL of `llm-engine` pod depends on your Kubernetes cluster networking setup.
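For concreteness, a minimal sketch of what the quoted section describes, assuming a hypothetical self-hosted URL of `https://llm-engine.domain.com` and the client's public `Completion.create` API (any API key your deployment requires would also need to be configured):

```python
import os

# Hypothetical URL; the actual value depends on your cluster networking setup.
# Set the variable before importing llmengine, since the client may read
# LLM_ENGINE_BASE_PATH at import time.
os.environ["LLM_ENGINE_BASE_PATH"] = "https://llm-engine.domain.com"

from llmengine import Completion

# The call itself is unchanged; only the base path differs from the
# Scale-hosted default.
response = Completion.create(
    model="llama-2-7b",
    prompt="why is the sky blue?",
    max_new_tokens=10,
)
print(response.output.text)
```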
I'm not 100% sure this would just work out of the box, as the Spellbook API might be different from the API exposed by the llm-engine gateway.
Would having people use https://github.com/scaleapi/launch-python-client and point the gateway_endpoint to the self-hosted llm-engine be better?
I'd have thought we'd want the llmengine client to be able to point at the llmengine server directly, in the sense that if this doesn't work OOB, we should make it work OOB.
IMO, having people use launch-python-client seems kinda ugly.
Let's not mention launch-python-client here. FWIW, the EGP APIs do have an `/llm`-prefixed set of APIs that are at parity with the LLM Engine APIs, so having the LLM Engine client point to either the self-hosted or the EGP-hosted one should still work. There is a separate set of EGP-specific Completion APIs, which is not in play here. cc @felixs8696
Ok! Then I'll keep the documentation as it is and only use the llmengine client.
> the Spellbook API might be different from the API exposed by the llm-engine gateway

We intentionally keep them the same.
docs/guides/self_hosting.md (Outdated):

> ### Pointing LLM Engine client to use self-hosted infrastructure
>
> The `llmengine` client makes requests to Scale AI's hosted infrastructure by default. You can have `llmengine` client make requests to your own self-hosted infrastructure by setting the `LLM_ENGINE_BASE_PATH` environment variable to the URL of the `llm-engine` pod. The exact URL of `llm-engine` pod depends on your Kubernetes cluster networking setup.
> The exact URL of `llm-engine` pod depends on your Kubernetes cluster networking setup.
This might be true, but I think we have some default in our helm chart? @song-william @phil-scale
Also, let's add a code snippet example, which assumes this default.
@yixu34 yep, you're right! The default cluster domain is specified here:

> `dns_host_domain: domain.llm-engine.com`
But do requests sent to the k8s cluster domain get properly routed to the llm-engine gateway? I'm not 100% sure, but I feel like there needs to be some networking config set up to route requests to the gateway?
cc @yunfeng-scale
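One way to answer that question empirically, as a sketch: assuming the `values_sample.yaml` default domain, and assuming the gateway exposes a `/healthcheck` route (an assumption; the exact path may differ per deployment), a 200 response would confirm the domain routes to the gateway:

```python
import os

import requests

# Assumes the helm chart default from values_sample.yaml; whether this hostname
# actually reaches the llm-engine gateway depends on your ingress/DNS setup,
# which is exactly the open question above.
base_path = "https://domain.llm-engine.com"

# /healthcheck is an assumption about the gateway's route; adjust if your
# deployment exposes a different health endpoint.
resp = requests.get(f"{base_path}/healthcheck", timeout=5)
print(resp.status_code)  # 200 means requests to this domain reach the gateway

os.environ["LLM_ENGINE_BASE_PATH"] = base_path
```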
docs/guides/self_hosting.md (Outdated):

> ### Pointing LLM Engine client to use self-hosted infrastructure
>
> The `llmengine` client makes requests to Scale AI's hosted infrastructure by default. You can have `llmengine` client make requests to your own self-hosted infrastructure by setting the `LLM_ENGINE_BASE_PATH` environment variable to the URL of the `llm-engine` pod.
Would suggest replacing "pod" with "service". "Pod" is too specific and low-level; in fact, there may be many pods for a given service.
docs/guides/self_hosting.md (Outdated):

> ### Pointing LLM Engine client to use self-hosted infrastructure
>
> The `llmengine` client makes requests to Scale AI's hosted infrastructure by default. You can have `llmengine` client make requests to your own self-hosted infrastructure by setting the `LLM_ENGINE_BASE_PATH` environment variable to the URL of the `llm-engine` pod.
>
> The exact URL of `llm-engine` pod depends on your Kubernetes cluster networking setup. The domain is specified at `config.values.infra.dns_host_domain` in the helm chart values config file. Using `charts/llm-engine/values_sample.yaml` as an example, you would
Same here, pod -> service.
Also, finish this sentence, e.g. "you would do:"
docs/guides/self_hosting.md (Outdated):

> The exact URL of `llm-engine` service depends on your Kubernetes cluster networking setup. The domain is specified at `config.values.infra.dns_host_domain` in the helm chart values config file. Using `charts/llm-engine/values_sample.yaml` as an example, you would do:
>
> ```bash
> export LLM_ENGINE_BASE_PATH=https://domain.llm-engine.com
> ```
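As a follow-up sanity check, a sketch of verifying that the client actually talks to the self-hosted gateway once the variable is exported (assumes the same shell environment, plus whatever API key the deployment expects):

```python
from llmengine import Model

# Lists the model endpoints the self-hosted gateway knows about; getting any
# well-formed response back confirms the client reached your infrastructure
# rather than the Scale-hosted default.
response = Model.list()
print(response.json())
```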
nit: should this be https://llm-engine.domain.com (in addition to changing the value inside values_sample.yaml)? It feels like llm-engine should be some subdomain of domain.com, not the other way around, if users are self-hosting at domain.com.
Makes sense! Updated!
Add documentation on pointing `llmengine` client to self-hosted infrastructure.