
Conversation

@ruizehung-scale ruizehung-scale (Contributor) commented Jul 19, 2023

Summary

  • Add the missing a100 and t4 devices to values_sample.yaml; their absence prevented llm-engine-cacher from starting successfully (a rough sketch of the kind of entries involved follows below).
  • Update the http-forwarder container's Python command path for the streaming endpoint.
  • Fix the TGI repository name and the forwarder config path.
  • Fix the forwarder AWS config mount path.
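
For illustration only, the device entries the cacher was missing might look roughly like the following; the exact key names, node-selector labels, and tolerations are assumptions and are not copied from this PR:

# Hypothetical sketch of the added a100 / t4 device entries; key names and
# label values are illustrative, not taken verbatim from values_sample.yaml.
imageCache:
  devices:
    - name: a100
      nodeSelector:
        k8s.amazonaws.com/accelerator: nvidia-ampere-a100
      tolerations:
        - key: "nvidia.com/gpu"
          operator: "Exists"
          effect: "NoSchedule"
    - name: t4
      nodeSelector:
        k8s.amazonaws.com/accelerator: nvidia-tesla-t4
      tolerations:
        - key: "nvidia.com/gpu"
          operator: "Exists"
          effect: "NoSchedule"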

Test Plan

import json

import requests

# Create a streaming llama-7b endpoint backed by text-generation-inference
data = {
    "name": "llama-7b",
    "model_name": "llama-7b",
    "source": "hugging_face",
    "inference_framework": "text_generation_inference",
    "inference_framework_image_tag": "0.9.1",
    "num_shards": 4,
    "endpoint_type": "streaming",
    "cpus": 32,
    "gpus": 4,
    "memory": "40Gi",
    "storage": "40Gi",
    "gpu_type": "nvidia-ampere-a10",
    "min_workers": 1,
    "max_workers": 12,
    "per_worker": 1,
    "labels": {"team": "infra", "product": "llm_model_zoo"},
    "metadata": {}
}
headers = {'Content-Type': 'application/json'}
response = requests.post("http://localhost:5000/v1/llm/model-endpoints", headers=headers, data=json.dumps(data), auth=('test_user_id', ''))
print(response.status_code)
print(response.json())
------------------------
200
{'endpoint_creation_task_id': 'e9a7a99b-6fe6-4e69-97e9-4b007ab2e7bb'}
# Send a synchronous completion request to the new endpoint
data = {
    "prompts": ["hi"],
    "max_new_tokens": 10,
    "temperature": 0.1
}

headers = {'Content-Type': 'application/json'}
response = requests.post("http://localhost:5000/v1/llm/completions-sync?model_endpoint_name=llama-7b", headers=headers, data=json.dumps(data), auth=('test-user-id', ''))
print(response.status_code)
print(response.json())
------------------------
200
{'status': 'SUCCESS', 'outputs': [{'text': 'hi hi hi2 hi2 hi2 hi2', 'num_completion_tokens': 10}], 'traceback': None}
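
Since the endpoint was created with endpoint_type "streaming" and this PR fixes the forwarder's streaming command path, a streaming call could in principle be exercised too. The sketch below is only an assumption: the completions-stream route, payload shape, and response framing are not verified against this version of the server.

# Hypothetical streaming request; the /v1/llm/completions-stream route and the
# payload fields are assumptions, not confirmed by this PR.
data = {
    "prompts": ["hi"],
    "max_new_tokens": 10,
    "temperature": 0.1
}
headers = {'Content-Type': 'application/json'}
with requests.post(
    "http://localhost:5000/v1/llm/completions-stream?model_endpoint_name=llama-7b",
    headers=headers,
    data=json.dumps(data),
    auth=('test_user_id', ''),
    stream=True,
) as response:
    print(response.status_code)
    # Print server-sent chunks line by line as they arrive
    for line in response.iter_lines():
        if line:
            print(line.decode())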

@ruizehung-scale ruizehung-scale marked this pull request as ready for review July 19, 2023 04:01
@ruizehung-scale ruizehung-scale marked this pull request as draft July 19, 2023 04:22
@ruizehung-scale ruizehung-scale changed the title from "Add missing a100 and t4 devices in values_sample.yaml" to "Can create LLM endpoint" Jul 19, 2023
  flavor=StreamingEnhancedRunnableImageFlavor(
      flavor=ModelBundleFlavorType.STREAMING_ENHANCED_RUNNABLE_IMAGE,
-     repository="text-generation-inference",  # TODO: let user choose repo
+     repository="ghcr.io/huggingface/text-generation-inference",  # TODO: let user choose repo
Contributor Author

@yunfeng-scale It turns out I need to update the TGI repo name in order to skip the image-existence check against the ECR repo, given the logic here:

and self.docker_repository.is_repo_name(request.flavor.repository)

Is this change reasonable? Should we propagate it back to hmi as well?
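
For context, a minimal sketch of the kind of guard being referenced is below; only is_repo_name appears in the quoted server code, and everything else (function names, the repo-name heuristic) is an assumption for illustration:

# Hypothetical sketch of the image-existence guard; not the actual server code.
def is_repo_name(repository: str) -> bool:
    # Assume bare ECR repo names contain no registry host or path separators.
    return "/" not in repository and ":" not in repository

def should_check_image_in_ecr(repository: str) -> bool:
    # A bare name like "text-generation-inference" is treated as an internal
    # ECR repository, so the requested image tag is checked for existence.
    # A fully qualified repository like
    # "ghcr.io/huggingface/text-generation-inference" is not a bare repo name,
    # so the existence check is skipped and the image is pulled as-is.
    return is_repo_name(repository)

print(should_check_image_in_ecr("text-generation-inference"))  # True
print(should_check_image_in_ecr("ghcr.io/huggingface/text-generation-inference"))  # False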

Contributor

We shouldn't hardcode this since it diverges between the internal and OSS code. Can you add this as a parameter?

Contributor

Yes, this change makes sense.

@ruizehung-scale ruizehung-scale marked this pull request as ready for review July 19, 2023 22:36
  volumeMounts:
    - name: config-volume
-     mountPath: /root/.aws/config
+     mountPath: /home/user/.aws/config
Contributor

Note: we may need to parameterize this entirely.
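
Purely as a hypothetical sketch of that parameterization (the value name below is invented for illustration and does not come from this PR):

# values.yaml (hypothetical value name)
forwarder:
  awsConfigMountPath: /home/user/.aws/config

# deployment template (hypothetical)
volumeMounts:
  - name: config-volume
    mountPath: {{ .Values.forwarder.awsConfigMountPath }}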

  - run-service
  - --config
- - /workspace/llm_engine/llm_engine/inference/configs/${FORWARDER_CONFIG_FILE_NAME}
+ - /workspace/server/llm_engine_server/inference/configs/${FORWARDER_CONFIG_FILE_NAME}
Contributor

Note that this may need to be parameterized as well.
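
The same approach could apply here, again purely as a hypothetical illustration (the value name is invented, not from this PR):

# values.yaml (hypothetical value name)
forwarder:
  configDir: /workspace/server/llm_engine_server/inference/configs

# container args in the template (hypothetical)
- --config
- {{ .Values.forwarder.configDir }}/${FORWARDER_CONFIG_FILE_NAME}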

@song-william song-william (Contributor) left a comment

Discussed offline that we will parameterize these values in a later PR. Merging as is.

@song-william song-william merged commit 767cbc4 into main Jul 19, 2023
@song-william song-william deleted the fix-llm-engine-image-cache-startup-failure branch July 19, 2023 22:46
@song-william song-william changed the title from "Can create LLM endpoint" to "Can create LLM endpoints" Jul 19, 2023