
[Examples] vLLM example for SkyServe + Mixtral #2948

Merged 4 commits into master on Jan 11, 2024
Conversation

@cblmemo (Collaborator) commented Jan 6, 2024

Added the example from #2922 to llm/vllm.

Tested (run the relevant ones):

  • Code formatting: bash format.sh
  • Any manual or new tests for this PR (please specify below)
  • All smoke tests: pytest tests/test_smoke.py
  • Relevant individual smoke tests: pytest tests/test_smoke.py::test_fill_in_the_name
  • Backward compatibility tests: bash tests/backward_compatibility_tests.sh

@cblmemo cblmemo mentioned this pull request Jan 6, 2024
@@ -126,3 +126,61 @@ curl http://$IP:8000/v1/chat/completions \
}
}
```

## Serving Mixtral 8x7b model with vLLM and SkyServe
@Michaelvll (Collaborator) commented Jan 8, 2024
We already have the Mixtral 8x7b + vLLM and SkyServe example in llm/mixtral. Should we just make the example above launchable with sky serve and add a link referring to llm/mixtral?
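The workflow the reviewer suggests would look roughly like this. This is a sketch, not the PR's final commands: the service file name `serve-openai-api.yaml` and the service name are illustrative, and the commands assume the SkyServe CLI as of early 2024.

```shell
# Launch the vLLM example as a SkyServe service.
# The YAML path and -n (service name) below are illustrative placeholders.
sky serve up -n vllm-service llm/vllm/serve-openai-api.yaml

# Check replica status and find the service endpoint.
sky serve status vllm-service
```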

@cblmemo (Collaborator, Author) replied:

Makes sense! Changed. PTAL again 🫡

@Michaelvll (Collaborator) left a review:

Thanks for updating the example @cblmemo! Left several comments.

llm/vllm/README.md (outdated; comments resolved)
HF_TOKEN: <your-huggingface-token> # Change to your own huggingface token

resources:
  accelerators: L4:1
@Michaelvll (Collaborator) commented:
Let's use multiple accelerators for this and the original YAML files, so that a user without GCP credentials can use the YAML out of the box, e.g., {L4:1, A10G:1, A10:1, A100:1, A100-80GB:1}.
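The suggestion above corresponds to SkyPilot's set syntax for candidate accelerators, where the scheduler picks whichever listed GPU is available. A minimal sketch of the resulting `resources` section (field names follow the SkyPilot task YAML schema; the exact surrounding file is from the PR, not shown here):

```yaml
# List several candidate accelerators; SkyPilot will provision
# the first one that is available across the enabled clouds.
resources:
  accelerators: {L4:1, A10G:1, A10:1, A100:1, A100-80GB:1}
```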

@cblmemo (Collaborator, Author) replied:
Done. Thanks!

cblmemo and others added 2 commits January 11, 2024 10:43
Co-authored-by: Zhanghao Wu <zhanghao.wu@outlook.com>
@cblmemo cblmemo merged commit 7ac091f into master Jan 11, 2024
19 checks passed
@cblmemo cblmemo deleted the vllm-mixtral-example branch January 11, 2024 03:32