Update to latest vLLM upstream and Support vLLM on CPU #149

Open · xwu99 wants to merge 11 commits into master
Conversation

@xwu99 commented Apr 23, 2024

• Update models to pydantic v2, as the latest vllm has adopted v2 models instead of v1 (a minimal sketch of this kind of change is shown below)
• Fix the AutoscalingConfig model, since it comes from Ray Serve, which is still based on pydantic v1
• Add CPU model yaml files for Llama 2 7B
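
For readers new to the migration, here is a minimal sketch of the kind of pydantic v1 → v2 change the first bullet refers to. The `SamplingParams` model below is hypothetical and only illustrates the API differences; it is not code from this PR.

```python
from typing import Optional

from pydantic import BaseModel, field_validator  # pydantic v2 imports


# Hypothetical model for illustration only; the actual rayllm models differ.
class SamplingParams(BaseModel):
    temperature: float = 1.0
    max_tokens: Optional[int] = None

    # pydantic v1's @validator becomes @field_validator in v2
    @field_validator("temperature")
    @classmethod
    def check_temperature(cls, v: float) -> float:
        if v < 0:
            raise ValueError("temperature must be non-negative")
        return v


# v1's .parse_obj() / .dict() become .model_validate() / .model_dump() in v2
params = SamplingParams.model_validate({"temperature": 0.7})
print(params.model_dump())
```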

@xwu99 marked this pull request as a draft on April 23, 2024 07:40
@XBeg9 commented Apr 24, 2024

Were you able to run this locally? Does it work? I'm looking forward to seeing how to update this project to support the latest vLLM.

@xwu99 (Author) commented Apr 25, 2024

> Were you able to run this locally? Does it work? I'm looking forward to seeing how to update this project to support the latest vLLM.

I am working on this. Several packages (ray, vllm, pydantic, openai, etc.) have been updated since the last release of RayLLM. Hopefully it will be working soon.

@lynkz-matt-psaltis commented:

Hey all,

I also have similar updates on a fork; however, I've struggled to get feedback from the maintainers to work out how to proceed here. I similarly updated rayllm to pydantic v2 due to the vllm migration to v2 proper (not using the v1 back-compat layer). The challenge this introduces is that it makes these changes incompatible with ray, because ray is still using the v1 compat layer. See: ray-project/ray#43908 (I haven't had a chance to go back and provide the further specifics requested to help convince the core ray team to reconsider the pydantic upgrade.)

There are numerous other signature changes from the tight coupling of ray and vllm, so while you may get rayllm working directly with vllm, I wonder what the mileage will be here on getting this contribution accepted if it excludes ray support.

Just food for thought. :)

@xwu99 (Author) commented Apr 28, 2024

> Hey all,
>
> I also have similar updates on a fork; however, I've struggled to get feedback from the maintainers to work out how to proceed here. I similarly updated rayllm to pydantic v2 due to the vllm migration to v2 proper (not using the v1 back-compat layer). The challenge this introduces is that it makes these changes incompatible with ray, because ray is still using the v1 compat layer. See: ray-project/ray#43908
>
> There are numerous other signature changes from the tight coupling of ray and vllm, so while you may get rayllm working directly with vllm, I wonder what the mileage will be here on getting this contribution accepted if it excludes ray support.
>
> Just food for thought. :)

No need for ray to upgrade. I just upgraded AutoscalingConfig to v2 here.
Previously I used pydantic.v1, but found that the latest fastapi has issues supporting pydantic.v1.
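
For context, a minimal sketch of what such an upgrade can look like. This is an assumption about the approach, not the PR's exact code: keep a pydantic v2 mirror of a few fields from Ray Serve's v1-based AutoscalingConfig on the rayllm side, and hand Ray Serve a plain dict when deploying, so fastapi only ever sees v2 models.

```python
from typing import Optional

from pydantic import BaseModel  # pydantic v2


# Hypothetical v2 mirror of part of Ray Serve's v1-based AutoscalingConfig;
# field names are a subset shown for illustration.
class AutoscalingConfig(BaseModel):
    min_replicas: int = 1
    initial_replicas: Optional[int] = None
    max_replicas: int = 1
    target_num_ongoing_requests_per_replica: float = 1.0


def to_serve_autoscaling_config(cfg: AutoscalingConfig) -> dict:
    # Ray Serve also accepts autoscaling_config as a plain dict, so no
    # pydantic v1 object needs to be constructed here.
    return cfg.model_dump(exclude_none=True)
```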

@xwu99 marked this pull request as ready for review on April 28, 2024 07:39
@marov commented May 4, 2024

@xwu99 The comment says vllm is installed separately from source for now, but I don't see it being installed anywhere?

@xwu99 (Author) commented May 5, 2024

> @xwu99 The comment says vllm is installed separately from source for now, but I don't see it being installed anywhere?

You just need to follow the official vLLM installation guide.

@depenglee1707 commented:

@xwu99 I saw worker_use_ray=False. Does that mean your implementation cannot support model parallelism, i.e. world_size > 1?

@xwu99 (Author) commented May 6, 2024

> @xwu99 I saw worker_use_ray=False. Does that mean your implementation cannot support model parallelism, i.e. world_size > 1?

vLLM for CPU does not support tensor parallelism yet. This PR should be revised later to support both CPU and GPU. Right now it's just adapted for CPU.
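
For reference, a hedged sketch of how an engine can be constructed with worker_use_ray=False on the vLLM versions this PR targets; treat it as illustrative rather than exact, since engine arguments have shifted between releases.

```python
from vllm.engine.arg_utils import AsyncEngineArgs
from vllm.engine.async_llm_engine import AsyncLLMEngine

# Illustrative only: check the vLLM version this PR pins before copying.
# With a CPU-only vLLM build, the CPU backend is selected automatically.
engine_args = AsyncEngineArgs(
    model="meta-llama/Llama-2-7b-chat-hf",
    tensor_parallel_size=1,   # the CPU backend has no tensor parallelism yet
    worker_use_ray=False,     # run the worker in-process instead of via Ray
)
engine = AsyncLLMEngine.from_engine_args(engine_args)
```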

@depenglee1707 commented:

> @xwu99 I saw worker_use_ray=False. Does that mean your implementation cannot support model parallelism, i.e. world_size > 1?
>
> vLLM for CPU does not support tensor parallelism yet. This PR should be revised later to support both CPU and GPU. Right now it's just adapted for CPU.

Great, thanks for the clarification. I also tried to upgrade vllm to the latest version for GPU, but found it's not easy work. The main problem is that vllm requires the driver process to also have GPU capability.

@xwu99 changed the title from "Support vLLM on CPU from vllm upstream" to "Update to latest vLLM upstream and Support vLLM on CPU" on May 6, 2024
xwu99 added 4 commits May 6, 2024 03:08