
Conversation

@ruizehung-scale (Contributor):

Add documentation on pointing llmengine client to self-hosted infrastructure.

```
### Pointing LLM Engine client to use self-hosted infrastructure
The `llmengine` client makes requests to Scale AI's hosted infrastructure by default. You can have `llmengine` client make requests to your own self-hosted infrastructure by setting the `LLM_ENGINE_BASE_PATH` environment variable to the URL of the `llm-engine` pod. The exact URL of `llm-engine` pod depends on your Kubernetes cluster networking setup.
```
@ruizehung-scale (Contributor, author):

I'm not 100% sure this would just work out of the box, as the Spellbook API might be different from the API exposed by the llm-engine gateway?

@ruizehung-scale (Contributor, author):

Would having people use https://github.com/scaleapi/launch-python-client and point the `gateway_endpoint` to the self-hosted llm-engine be better?

Contributor:

I'd have thought we'd want the llmengine client to be able to point at the llmengine server directly? In the sense that if this doesn't work out of the box, we should make it work out of the box.

IMO having people use launch-python-client seems kind of ugly.

Member:

Let's not mention launch-python-client here. FWIW, the EGP APIs do have a /llm-prefixed set of APIs that are at parity with the LLM Engine APIs, so having the LLM Engine client point to either the self-hosted or EGP-hosted one should still work. There is a separate set of EGP-specific Completion APIs, which is not at play here. cc @felixs8696

@ruizehung-scale (Contributor, author):

Ok! Then I'll keep the documentation as it is and only use the llmengine client.

Contributor:

> Spellbook API might be different from the API exposed by llm-engine gateway

We intentionally keep them the same.

ruizehung-scale self-assigned this Aug 7, 2023
```
### Pointing LLM Engine client to use self-hosted infrastructure
The `llmengine` client makes requests to Scale AI's hosted infrastructure by default. You can have `llmengine` client make requests to your own self-hosted infrastructure by setting the `LLM_ENGINE_BASE_PATH` environment variable to the URL of the `llm-engine` pod. The exact URL of `llm-engine` pod depends on your Kubernetes cluster networking setup.
```
@yixu34 (Member):

> The exact URL of `llm-engine` pod depends on your Kubernetes cluster networking setup.

This might be true, but I think we have some default in our helm chart? @song-william @phil-scale

Also, let's add a code snippet example, which assumes this default.

@ruizehung-scale (Contributor, author), Aug 7, 2023:

@yixu34 yep you're right! The default cluster domain is specified here:

```yaml
dns_host_domain: domain.llm-engine.com
```

But do requests sent to the k8s cluster domain get properly routed to the llm-engine gateway? I'm not 100% sure, but I feel like there needs to be some networking config set up to route requests to the gateway?
cc @yunfeng-scale
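For context, one common way to route requests on the cluster domain to the gateway is a Kubernetes Ingress. A minimal sketch follows, assuming an Ingress controller is installed in the cluster; the Service name, namespace, and port are hypothetical placeholders, and the helm chart may already template an equivalent resource:

```yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: llm-engine          # hypothetical Ingress name
  namespace: llm-engine     # hypothetical namespace
spec:
  rules:
    - host: llm-engine.domain.com   # the dns_host_domain from the helm values
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: llm-engine    # hypothetical gateway Service name
                port:
                  number: 80
```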

```
### Pointing LLM Engine client to use self-hosted infrastructure
The `llmengine` client makes requests to Scale AI's hosted infrastructure by default. You can have `llmengine` client make requests to your own self-hosted infrastructure by setting the `LLM_ENGINE_BASE_PATH` environment variable to the URL of the `llm-engine` pod.
```
Member:

Would suggest replacing "pod" with "service". pod is too specific and low-level - in fact, there may be many pods for a given service.

```
### Pointing LLM Engine client to use self-hosted infrastructure
The `llmengine` client makes requests to Scale AI's hosted infrastructure by default. You can have `llmengine` client make requests to your own self-hosted infrastructure by setting the `LLM_ENGINE_BASE_PATH` environment variable to the URL of the `llm-engine` pod.

The exact URL of `llm-engine` pod depends on your Kubernetes cluster networking setup. The domain is specified at `config.values.infra.dns_host_domain` in the helm chart values config file. Using `charts/llm-engine/values_sample.yaml` as an example, you would
```
Member:

Same here, pod -> service.

Member:

Also finish this sentence, e.g. you would do:

ruizehung-scale requested a review from yixu34 August 8, 2023 00:27

The exact URL of `llm-engine` service depends on your Kubernetes cluster networking setup. The domain is specified at `config.values.infra.dns_host_domain` in the helm chart values config file. Using `charts/llm-engine/values_sample.yaml` as an example, you would do:
```bash
export LLM_ENGINE_BASE_PATH=https://domain.llm-engine.com
```
Contributor:

nit: should this be https://llm-engine.domain.com? (in addition to changing the value inside values_sample.yaml)

since it feels like llm-engine should be some subdomain of domain.com, not the other way around, if users are self-hosting at domain.com.
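For illustration, the override behavior this documentation describes can be sketched as follows. This is not the actual llmengine client internals: the `resolve_base_path` helper and the default URL are hypothetical placeholders, standing in for whatever hosted endpoint the client ships with.

```python
import os

# Point the client at a self-hosted gateway (example URL from the docs above).
os.environ["LLM_ENGINE_BASE_PATH"] = "https://llm-engine.domain.com"

def resolve_base_path(default: str = "https://hosted.example.com") -> str:
    """Return the base path a client like llmengine might use: the
    LLM_ENGINE_BASE_PATH environment variable, if set, overrides the
    hosted default. (`default` here is a placeholder, not Scale's
    real endpoint.)"""
    return os.environ.get("LLM_ENGINE_BASE_PATH", default)

print(resolve_base_path())  # -> https://llm-engine.domain.com
```

After the `export` in the docs above, every request the client builds would be issued against the self-hosted URL instead of the hosted default.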

@ruizehung-scale (Contributor, author):

Makes sense! Updated!

ruizehung-scale merged commit 9a1d567 into main Aug 8, 2023
ruizehung-scale deleted the point-client-to-self-hosted branch August 8, 2023 20:32