Merged
2 changes: 1 addition & 1 deletion charts/llm-engine/values_sample.yaml
@@ -96,7 +96,7 @@ config:
# k8s_cluster_name [required] is the name of the k8s cluster
k8s_cluster_name: main_cluster
# dns_host_domain [required] is the domain name of the k8s cluster
-dns_host_domain: domain.llm-engine.com
+dns_host_domain: llm-engine.domain.com
# default_region [required] is the default AWS region for various resources (e.g ECR)
default_region: us-east-1
# aws_account_id [required] is the AWS account ID for various resources (e.g ECR)
8 changes: 8 additions & 0 deletions docs/guides/self_hosting.md
@@ -200,4 +200,12 @@ $ curl -X POST 'http://localhost:5000/v1/llm/completions-sync?model_endpoint_nam
You should get a response similar to:
```
{"status":"SUCCESS","outputs":[{"text":". Tell me a joke about AI. Tell me a joke about AI. Tell me a joke about AI. Tell me","num_completion_tokens":30}],"traceback":null}
```

### Pointing the LLM Engine client to self-hosted infrastructure
By default, the `llmengine` client sends requests to Scale AI's hosted infrastructure. To have it send requests to your own self-hosted infrastructure instead, set the `LLM_ENGINE_BASE_PATH` environment variable to the URL of the `llm-engine` service.

The exact URL of the `llm-engine` service depends on your Kubernetes cluster's networking setup. The domain is set by `config.values.infra.dns_host_domain` in the Helm chart values file. Using `charts/llm-engine/values_sample.yaml` as an example, you would run:
```bash
export LLM_ENGINE_BASE_PATH=https://llm-engine.domain.com
```
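
If you prefer to configure this from Python rather than the shell, the same setting can be sketched as below. This is a minimal illustration, not part of the `llmengine` API: `base_path_from_domain` is a hypothetical helper, and the `https` scheme is an assumption carried over from the example above. The guide does not say when the client reads the variable, so the sketch sets it before the client is imported.

```python
import os


def base_path_from_domain(dns_host_domain: str, scheme: str = "https") -> str:
    """Build the value for LLM_ENGINE_BASE_PATH from the chart's dns_host_domain.

    Hypothetical helper for illustration; the scheme is an assumption and
    depends on how your cluster exposes the llm-engine service.
    """
    return f"{scheme}://{dns_host_domain.strip().rstrip('/')}"


# Domain taken from dns_host_domain in charts/llm-engine/values_sample.yaml.
os.environ["LLM_ENGINE_BASE_PATH"] = base_path_from_domain("llm-engine.domain.com")

# Import and use the llmengine client only after this point, so the
# variable is already set whenever the client looks it up.
```

Replace `llm-engine.domain.com` with the `dns_host_domain` value from your own values file.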